Automation & Robotics Research Institute (ARRI), The University of Texas at Arlington
F.L. Lewis, Fellow IEEE, Fellow U.K. Inst. Meas. & Control, Moncrief-O’Donnell Endowed Chair
Head, Controls & Sensors Group
http://ARRI.uta.edu/acs, [email protected]
MEMS Modeling and Control
Thermal Model
[Thermal model transfer function; plot of total thermal resistance R_tot [K/W] vs. drive voltage V [V]]
m d²x/dt² + c dx/dt + k x = f
Mechanical Model
Electrical Model
Equivalent R-L-C circuit:
Z(s)=R+sL+1/sC
FEA Experiment
[Electrical transfer function I(s)/V(s) for the R-L-C model]
Optical Model
[Gaussian beam / fiber coupling model U(s, t; w)]
Bruno Borovic
With Ai Qun Liu, NTU Singapore
Electrostatic Comb Drive Actuator for Optical Switch Control
Optical fibre
Size of a human hair: 150 microns wide
Micro Electro Mechanical Optical Systems - MEMS
Electrostatic Parallel Plate Actuator
Use feedback linearization to solve the pull-in instability problem
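As a concrete illustration of that idea, here is a minimal Python sketch of the feedback-linearizing voltage for a 1-DOF parallel-plate model; the model form and all parameter names (g, eps_A, F_des) are illustrative assumptions, not taken from the slides.

```python
import numpy as np

# Hedged sketch: feedback linearization of a 1-DOF parallel-plate actuator.
# Assumed model: m*x'' + b*x' + k*x = eps_A * V**2 / (2*(g - x)**2),
# where g is the nominal gap and eps_A = permittivity * plate area.
def linearizing_voltage(F_des, x, g, eps_A):
    """Voltage that realizes a desired (attractive) electrostatic force at gap
    position x, cancelling the 1/(g - x)^2 nonlinearity responsible for pull-in."""
    F_des = max(F_des, 0.0)                      # the electrostatic force can only pull
    return np.sqrt(2.0 * F_des * (g - x) ** 2 / eps_A)
```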
Objective: Develop new control algorithms for faster, more precise control of DoD and industrial systems. Confront complex systems with backlash, deadzone, flexible modes, vibration, unknown disturbances and dynamics.
- Three patents on neural network learning control
- Internet-based remote site control and monitoring
- Rigorous proofs and design algorithms
- Applications to tank gun barrels, vehicle active suspension, industrial machine control
- SBIR contracts for industry implementation
- Numerous books and papers
- Award-winning PhD students
Neural Network Learning Controllers for Vibratory Systems - Gun barrel, HMMWV suspension
Linguistics - Fuzzy Logic: input membership functions, fuzzy logic rule base, output membership functions
Nervous System - Neural Network (NN): NN input, NN output
Input x, Output u
(Includes Adaptive Control)
Intelligent Control Tools
[Block diagram: neural-network backstepping controller - tracking loop with gain Kr and robust control term v_i(t); nonlinear FB linearization loop with NN#1 estimating F1(x); backstepping loop with NN#2 estimating F2(x) and gain 1/KB1; desired trajectory qd, tracking error e, filtered error r, motor current i and desired current id, control u_e, robot system]
PID loop
Nonlinear learning loops
Nonlinear Intelligent Control
Relevance - Machine Feedback Control
[Diagram: gun barrel / turret system - desired trajectory qd; azimuth/elevation commands qr1, qr2 (Az, El); barrel flexible modes qf; compliant coupling; moving tank platform; turret with backlash and compliant drive train; terrain and vehicle vibration disturbances d(t); barrel tip position]
[Diagram: single-wheel suspension - vehicle mass m; parallel damper; active damping force uc (if used); damper mass mc with stiffness/damping kc, cc; vibratory modes qf(t); forward speed y(t); vertical motion z(t); surface roughness w(t); series damper + suspension + wheel with stiffness/damping k, c]
Single-Wheel/Terrain System with Nonlinearities
High-Speed Precision Motion Control with unmodeled dynamics, vibration suppression, disturbance rejection, friction compensation, deadzone/backlash control
Vehicle Suspension
Industrial Machines
Military Land Systems
Aerospace
Aydin Yesildirek- Neural networks for control
S. Jagannathan- DT NN control
Sesh Commuri- CMAC, fuzzy
Rafael Fierro- robotic nonholonomic systems. Hybrid systems
Rastko Selmic- actuator nonlinearities
Javier Campos- fuzzy control, actuator nonlinearities
Ognjen Kuljaca- implementation of NN control
The Perpetrators
Neural Network Robot Controller
[Block diagram: PD tracking loop with gain Kv and robust control term v(t); NN nonlinear inner loop estimating f(x); feedforward loop; desired trajectory qd and its derivatives; tracking error e, filtered error r; robot system]
Universal Approximation Property
Problem - nonlinear in the NN weights, so standard proof techniques do not work
Feedback linearization
- Easy to implement with a few more lines of code
- Learning feature allows on-line updates to NN memory as dynamics change
- Handles unmodelled dynamics, disturbances, actuator problems such as friction
- NN universal basis property means no regression matrix is needed
- Nonlinear controller allows faster & more precise motion
Theorem 1 (NN Weight Tuning for Stability)
Let the desired trajectory q_d(t) and its derivatives be bounded. Let the initial tracking error be within a certain allowable set U. Let Z_M be a known upper bound on the Frobenius norm of the unknown ideal weights Z.

Take the control input as

τ = Ŵᵀ σ(V̂ᵀ x) + K_v r - v,  with robustifying term v(t) = -K_Z ( ||Ẑ||_F + Z_M ) r.

Let weight tuning be provided by

dŴ/dt = F σ̂ rᵀ - F σ̂' V̂ᵀ x rᵀ - κ F ||r|| Ŵ
dV̂/dt = G x ( σ̂'ᵀ Ŵ r )ᵀ - κ G ||r|| V̂

with any constant matrices F = Fᵀ > 0, G = Gᵀ > 0, and scalar tuning parameter κ > 0. Initialize the weight estimates as Ŵ = 0, V̂ = random.

Then the filtered tracking error r(t) and NN weight estimates Ŵ, V̂ are uniformly ultimately bounded. Moreover, arbitrarily small tracking error may be achieved by selecting large control gains K_v.

Backprop terms - Werbos
Extra robustifying terms - Narendra's e-mod extended to NLIP systems
Forward prop term?
Can also use simplified tuning - Hebbian
Extension of adaptive control to nonlinear-in-parameters (NLIP) systems; no regression matrix needed
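A minimal simulation-style sketch of the Theorem 1 tuning laws follows (Python; the tanh activation, Euler update, gains and dimensions are illustrative assumptions, and the robustifying term v(t) is omitted for brevity).

```python
import numpy as np

# Hedged sketch of the Theorem 1 weight-tuning laws.
# x: NN input vector (n,), r: filtered tracking error (m,),
# W: (L, m) outer-layer weights, V: (n, L) inner-layer weights.
def nn_controller_step(x, r, W, V, F, G, kappa, Kv, dt):
    sig = np.tanh(V.T @ x)                        # sigma(V'x), hidden-layer output
    dsig = np.diag(1.0 - sig ** 2)                # sigma'(V'x) for tanh units
    tau = W.T @ sig + Kv @ r                      # control (robustifying term omitted)
    # dW/dt = F sig r' - F sig' V'x r' - kappa F ||r|| W
    Wdot = (F @ np.outer(sig, r)
            - F @ dsig @ np.outer(V.T @ x, r)
            - kappa * np.linalg.norm(r) * F @ W)
    # dV/dt = G x (sig' W r)' - kappa G ||r|| V
    Vdot = G @ np.outer(x, dsig @ W @ r) - kappa * np.linalg.norm(r) * G @ V
    return tau, W + dt * Wdot, V + dt * Vdot      # simple Euler update of the weights
```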
Neural network backstepping controller for flexible-joint robot arm
[Block diagram as above: tracking loop (gain Kr), robust control term v_i(t), nonlinear FB linearization loop with NN#1 (F1(x)), backstepping loop with NN#2 (F2(x)), robot system]
Backstepping
Advantages over traditional Backstepping- no regression functions needed
- Add an extra feedback loop
- Two NNs needed
- Use passivity to show stability
DT backstepping- noncausal- Javier Campos patent
DT Systems- Jagannathan
[Block diagram: nonlinear system with backlash at the input; NN estimate of the nonlinear function f(x); filter v2; backstepping loop; NN compensator; desired trajectory x_d and filtered error r; gains K, Kb; control u; NN output y_nn]
Dynamic inversion NN compensator for system with Backlash
U.S. patent- Selmic, Lewis, Calise, McFarland
Force Control
Flexible pointing systems
Vehicle active suspension - SBIR Contracts
Andy Lowe, Scott Ikenaga, Javier Campos
ARRI Research Roadmap in Neural Networks

1. Neural Networks for Feedback Control – 1995-2002
- Based on FB control approach; unknown system dynamics; on-line tuning
- NN for FB linearization, singular perturbation, backstepping, force control, dynamic inversion, etc.
- Extended adaptive control to NLIP systems; no regression matrix

2. Neural Network Solution of Optimal Design Equations – 2002-2006
- Nearly optimal control based on HJ optimal design equations
- Known system dynamics; preliminary off-line tuning
- Nearly optimal solution of controls design equations; no canonical form needed

3. Approximate Dynamic Programming – 2006-
- Nearly optimal control based on recursive equation for the optimal value
- Usually known system dynamics (except Q-learning); the goal – unknown dynamics
- On-line tuning; optimal adaptive control
- Extend adaptive control to yield OPTIMAL controllers; no canonical form needed
L2 Gain Problem

System:
dx/dt = f(x) + g(x)u + k(x)d
y = x
z = z(x, u)
u = l(y)

d = disturbance, u = control, z = performance output, y = measured output,

where ||z||² = hᵀh + ||u||².

Find control u(t) so that

∫₀^∞ ||z(t)||² dt / ∫₀^∞ ||d(t)||² dt = ∫₀^∞ ( hᵀh + ||u||² ) dt / ∫₀^∞ ||d(t)||² dt ≤ γ²

for all L2 disturbances and a prescribed gain γ².
H-Infinity Control Using Neural Networks
Zero-Sum differential game
Murad Abu Khalaf
Cannot solve HJI !!
CT Policy Iteration for H-Infinity Control--- c.f. Howard
Consistency equation for the Value Function.

Successive Solution - Algorithm 1: Let γ be prescribed and fixed, and let u₀ be a stabilizing control with region of asymptotic stability Ω₀.

1. Outer loop - update control. Initial disturbance d₀ = 0.
2. Inner loop - update disturbance. Solve the Value Equation

0 = (∂V_j^i/∂x)ᵀ ( f + g u_j + k d_i ) + hᵀh + ||u_j||² - γ² ||d_i||²,  V_j^i(0) = 0

Inner loop disturbance update:

d_{i+1} = (1/(2γ²)) kᵀ(x) ∂V_j^i/∂x

Go to 2. Iterate i until convergence to V_j, d_∞, with RAS Ω_j.

Outer loop control action update:

u_{j+1} = -(1/2) gᵀ(x) ∂V_j/∂x

Go to 1. Iterate j until convergence to V_∞, u_∞, with RAS Ω_∞.
Murad Abu Khalaf
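For the linear-quadratic special case (V(x) = x'Px, u = -Lx, d = Kx) the two-loop algorithm above reduces to nested Lyapunov solves. A hedged Python sketch under that assumption (the matrices A, B, D, Q and the gain gamma are illustrative):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Hedged LQ specialization of the two-loop H-infinity policy iteration:
# the Value Equation becomes a Lyapunov equation in P, the inner loop updates
# the disturbance gain K = (1/gamma^2) D'P, the outer loop updates L = B'P.
def hinf_policy_iteration(A, B, D, Q, gamma, L0, n_outer=20, n_inner=50):
    L = L0                                            # initial stabilizing control gain
    for _ in range(n_outer):
        K = np.zeros((D.shape[1], A.shape[0]))        # initial disturbance d0 = 0
        for _ in range(n_inner):                      # inner loop on the disturbance
            Acl = A - B @ L + D @ K
            P = solve_continuous_lyapunov(Acl.T, -(Q + L.T @ L - gamma ** 2 * K.T @ K))
            K = (D.T @ P) / gamma ** 2
        L = B.T @ P                                   # outer-loop control update
    return P, L, K
```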
Results for this Algorithm
The algorithm converges to V*_{Ω₀}, u*_{Ω₀}, d*_{Ω₀}, the optimal solution on the RAS Ω₀. For this to occur it is required that Ω* ⊆ Ω₀. Sometimes the algorithm converges to the optimal HJI solution V*, Ω*, u*, d*.

For every iteration on the disturbance d_i one has
  V_j^{i+1} ≥ V_j^i - the value function increases
  Ω_j^{i+1} ⊆ Ω_j^i - the RAS decreases

For every iteration on the control u_j one has
  V_{j+1} ≤ V_j - the value function decreases
  Ω_{j+1} ⊇ Ω_j - the RAS does not decrease
Neural Network Approximation for Computational Technique

Problem - the Value Equation cannot be solved exactly!

Neural network to approximate V^(i)(x):

V_L^i(x) = Σ_{j=1..L} w_j^i σ_j(x) = (W_L^i)ᵀ σ_L(x)

The value function gradient approximation is

∂V_L^i/∂x = ( ∂σ_L(x)/∂x )ᵀ W_L^i = ∇σ_Lᵀ(x) W_L^i

Substitute into the Value Equation to get

0 = (W_L^i)ᵀ ∇σ_L(x) ( f(x) + g(x) u_j + k(x) d_i ) + hᵀh + ||u_j||² - γ² ||d_i||²

Therefore, one may solve for the NN weights at iteration (i, j).
Murad Abu Khalaf
Neural Network Optimal Feedback Controller

Optimal solution - a NN feedback controller with nearly optimal weights:

u = -(1/2) gᵀ(x) ∇σ_Lᵀ(x) W_L
d = (1/(2γ²)) kᵀ(x) ∇σ_Lᵀ(x) W_L
Murad Abu Khalaf
Optimal cost:

-∂V*(x,t)/∂t = min_u [ L(x, u) + (∂V*(x,t)/∂x)ᵀ ( f(x) + g(x) u ) ]

Optimal control:

u*(x,t) = -(1/2) R⁻¹ gᵀ(x) ∂V*(x,t)/∂x

This yields the time-varying Hamilton-Jacobi-Bellman (HJB) equation

0 = ∂V*(x,t)/∂t + (∂V*(x,t)/∂x)ᵀ f(x) + Q(x) - (1/4) (∂V*(x,t)/∂x)ᵀ g(x) R⁻¹ gᵀ(x) ∂V*(x,t)/∂x
Fixed-Final-Time HJB Optimal Control
Finite Horizon Control Cheng Tao
Note that V_L(x, t) = w_Lᵀ(t) σ_L(x), so

∂V_L(x,t)/∂x = ∇σ_Lᵀ(x) w_L(t), where ∇σ_L(x) = ∂σ_L(x)/∂x is the Jacobian, and

∂V_L(x,t)/∂t = (dw_L(t)/dt)ᵀ σ_L(x).
HJB Solution by NN Value Function Approximation Cheng Tao
Approximating in the HJB equation gives an ODE in the NN weights txV ,
(dw_L(t)/dt)ᵀ σ_L(x) = -w_Lᵀ(t) ∇σ_L(x) f(x) - Q(x) + (1/4) w_Lᵀ(t) ∇σ_L(x) g(x) R⁻¹ gᵀ(x) ∇σ_Lᵀ(x) w_L(t) + e_L(x, t)
Solve by least-squares – simply integrate backwards to find NN weights
Control is  u*(t) = -(1/2) R⁻¹ gᵀ(x) ∇σ_Lᵀ(x) w_L(t)
Time-varying weights
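A hedged Python sketch of that backward least-squares integration over a set of sample states follows; the basis functions, dynamics, and state cost are placeholder callables (assumptions), not quantities from the slides.

```python
import numpy as np

# Hedged sketch: backward least-squares integration of the NN-weight ODE above.
# sigma(x) -> (L,) basis values, dsigma(x) -> (L, n) basis Jacobian,
# f(x) -> (n,), g(x) -> (n, m), Q(x) -> scalar; Rinv is R^{-1}.
def integrate_weights_backward(w_T, t_grid, xs, sigma, dsigma, f, g, Q, Rinv):
    w = np.array(w_T, dtype=float)                    # terminal weights at t = T
    Phi = np.array([sigma(x) for x in xs])            # (N, L) regression matrix
    weights = [w.copy()]
    for k in range(len(t_grid) - 1, 0, -1):
        dt = t_grid[k] - t_grid[k - 1]
        rhs = []
        for x in xs:
            ds = dsigma(x)
            grad = ds.T @ w                           # approximate dV/dx at x
            rhs.append(-w @ (ds @ f(x)) - Q(x)
                       + 0.25 * grad @ g(x) @ Rinv @ g(x).T @ grad)
        wdot = np.linalg.lstsq(Phi, np.array(rhs), rcond=None)[0]
        w = w - dt * wdot                             # step backward in time
        weights.append(w.copy())
    return list(reversed(weights))                    # weights on t_grid, forward order
```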
Necessary and Sufficient Conditions for Bounded L2 Gain OPFB Control:
H-infinity OPFB control Theorem
For a given γ*, there exists an OPFB gain K such that A_o = A - BKC is asymptotically stable with L2 gain bounded by γ*, if and only if

(i) (A, C) is detectable, and
(ii) there exist matrices K and L, and a matrix P ≥ 0, such that

KC = R⁻¹ ( Bᵀ P + L )
0 = Aᵀ P + P A + Q + (1/γ²) P D Dᵀ P - P B R⁻¹ Bᵀ P + Lᵀ R⁻¹ L
Jyotirmay Gadewadikar, V. Kucera, Lihua Xie
H-Infinity Static Output-Feedback Control for Rotorcraft
J. Gadewadikar*, F. L. Lewis*, K. Subbarao$, K. Peng+, B. Chen+,
*Automation & Robotics Research Institute (ARRI),University of Texas at Arlington
+Department of Electrical and Computer Engineering, National University of Singapore
Discrete-Time Optimal Control

Cost:  V_h(x_k) = Σ_{i=k}^∞ r(x_i, u_i),  with u_k = h(x_k) the prescribed control input function (policy)

Value function recursion:  V_h(x_k) = r(x_k, h(x_k)) + V_h(x_{k+1})

Hamiltonian:  H(x_k, V_h, h) = r(x_k, h(x_k)) + V_h(x_{k+1}) - V_h(x_k)

Optimal cost (Bellman's Principle):  V*(x_k) = min_h [ r(x_k, h(x_k)) + V*(x_{k+1}) ] = min_{u_k} [ r(x_k, u_k) + V*(x_{k+1}) ]

Optimal control:  h*(x_k) = arg min_{u_k} [ r(x_k, u_k) + V*(x_{k+1}) ]

Note that the system dynamics does not appear in the value function recursion.
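Because the recursion uses only the one-step cost and the value at the observed next state, it can be sketched for a finite state/action grid as below (a hedged Python sketch; next_state and cost are placeholder callables, and a finite number of sweeps with an absorbing, zero-cost goal state is assumed).

```python
# Hedged sketch of the DT value recursion / Bellman principle on a finite
# state-action grid.  next_state(s, a) and cost(s, a) are placeholder callables;
# the update itself only needs r(x_k, u_k) and V at x_{k+1}.
def value_iteration(states, actions, next_state, cost, n_sweeps=100):
    V = {s: 0.0 for s in states}
    for _ in range(n_sweeps):
        V = {s: min(cost(s, a) + V[next_state(s, a)] for a in actions) for s in states}
    # greedy policy h*(x_k) = argmin_u [ r(x_k, u) + V*(x_{k+1}) ]
    policy = {s: min(actions, key=lambda a, s=s: cost(s, a) + V[next_state(s, a)])
              for s in states}
    return V, policy
```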
Solutions by Comp. Intelligence Community
Adaptive Dynamic Programming
ADP for H∞ Optimal Control Systems
System:
dx/dt = f(x) + g(x)u + k(x)d
y = x,  z = z(x, u)
u = l(y)

d = disturbance, u = control, z = penalty output, y = measured output,

where ||z||² = hᵀh + ||u||².

Find control u(t) so that

∫₀^∞ ||z(t)||² dt / ∫₀^∞ ||d(t)||² dt = ∫₀^∞ ( hᵀh + ||u||² ) dt / ∫₀^∞ ||d(t)||² dt ≤ γ²

for all L2 disturbances and a prescribed gain γ², when the system is at rest, x₀ = 0.
Asma Al-Tamimi
Discrete-Time Game Dynamic Programming: Backward-in-time Formulation
• Consider the following discrete-time dynamical system with continuous state and action spaces
• The zero-sum game problem can be formulated as follows
• The goal is to find the optimal strategies (state feedback) for this multi-agent problem
• Use the Bellman optimality principle "Dynamic Programming"
x_{k+1} = A x_k + B u_k + E w_k,  y_k = x_k

V(x_k) = min_u max_w Σ_{i=k}^∞ ( x_iᵀ Q x_i + u_iᵀ u_i - γ² w_iᵀ w_i )

u*(x) = L* x,  w*(x) = K* x

V(x_k) = min_u max_w [ r(x_k, u_k) + V(x_{k+1}) ]
       = min_u max_w [ x_kᵀ Q x_k + u_kᵀ u_k - γ² w_kᵀ w_k + x_{k+1}ᵀ P x_{k+1} ]
L_i = ( I + Bᵀ P_i B - Bᵀ P_i E ( Eᵀ P_i E - γ² I )⁻¹ Eᵀ P_i B )⁻¹ ( Bᵀ P_i E ( Eᵀ P_i E - γ² I )⁻¹ Eᵀ P_i A - Bᵀ P_i A )

K_i = ( Eᵀ P_i E - γ² I - Eᵀ P_i B ( I + Bᵀ P_i B )⁻¹ Bᵀ P_i E )⁻¹ ( Eᵀ P_i B ( I + Bᵀ P_i B )⁻¹ Bᵀ P_i A - Eᵀ P_i A )

Value function approximation V̂(x, p_i) = p_iᵀ x̄, with quadratic basis vector
x̄ = ( x₁², x₁x₂, ..., x₁xₙ, x₂², x₂x₃, ..., xₙ² )

Value function update:
p_{i+1}ᵀ x̄_k = x_kᵀ Q x_k + ( L_i x_k )ᵀ ( L_i x_k ) - γ² ( K_i x_k )ᵀ ( K_i x_k ) + p_iᵀ x̄_{k+1}
HDP- Linear System Case
Value function update
Control update
Control gain
Disturbance gain
Showed that this is equivalent to iteration on the Underlying Game Riccati equation
P_{i+1} = Aᵀ P_i A + Q
          - [ Aᵀ P_i B   Aᵀ P_i E ] [ I + Bᵀ P_i B      Bᵀ P_i E      ]⁻¹ [ Bᵀ P_i A ]
                                    [ Eᵀ P_i B      Eᵀ P_i E - γ² I  ]    [ Eᵀ P_i A ]

Which is known to converge - Stoorvogel, Basar
Solve by batch LS or RLS
û(x, L_i) = L_i x,  ŵ(x, K_i) = K_i x
A, B, E needed
Asma Al-Tamimi
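In the linear-quadratic case the HDP value update above collapses to one iteration of the underlying game Riccati equation; a hedged model-based Python sketch of that step (A, B, E, Q, gamma are illustrative assumptions):

```python
import numpy as np

# Hedged sketch: one HDP step written directly as the underlying game Riccati
# iteration (a model-based check of the batch-LS/RLS value-function update).
def game_riccati_step(A, B, E, Q, gamma, P):
    m, q = B.shape[1], E.shape[1]
    M = np.block([[np.eye(m) + B.T @ P @ B, B.T @ P @ E],
                  [E.T @ P @ B, E.T @ P @ E - gamma ** 2 * np.eye(q)]])
    N = np.vstack([B.T @ P @ A, E.T @ P @ A])
    W = np.hstack([A.T @ P @ B, A.T @ P @ E])
    return A.T @ P @ A + Q - W @ np.linalg.solve(M, N)
```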
Linear quadratic case - V and Q are quadratic:

V(x_k) = x_kᵀ P x_k

Q(x_k, u_k, w_k) = r(x_k, u_k, w_k) + V(x_{k+1}) = [ x_kᵀ u_kᵀ w_kᵀ ] H [ x_k ; u_k ; w_k ]

with H partitioned as

H = [ H_xx  H_xu  H_xw
      H_ux  H_uu  H_uw
      H_wx  H_wu  H_ww ]

Q-function update:

[ x_kᵀ u_kᵀ w_kᵀ ] H_{i+1} [ x_k ; u_k ; w_k ] = x_kᵀ R x_k + u_kᵀ u_k - γ² w_kᵀ w_k + [ x_{k+1}ᵀ u_{k+1}ᵀ w_{k+1}ᵀ ] H_i [ x_{k+1} ; u_{k+1} ; w_{k+1} ]

Control action and disturbance updates:  u_i(x_k) = L_i x_k,  w_i(x_k) = K_i x_k,  with

L_i = ( H_uu^i - H_uw^i (H_ww^i)⁻¹ H_wu^i )⁻¹ ( H_uw^i (H_ww^i)⁻¹ H_wx^i - H_ux^i )
K_i = ( H_ww^i - H_wu^i (H_uu^i)⁻¹ H_uw^i )⁻¹ ( H_wu^i (H_uu^i)⁻¹ H_ux^i - H_wx^i )
A, B, E NOT needed
Asma Al-Tamimi
Q learning for H-inf
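A hedged Python sketch of the policy-extraction step from an identified game Q-function matrix H (blocks ordered x, u, w); the partition sizes n, m, q are assumptions, and the least-squares identification of H from data is not shown.

```python
import numpy as np

# Hedged sketch: extract the control gain L and disturbance gain K from a
# partitioned game Q-function matrix H, following the update formulas above.
def policies_from_H(H, n, m, q):
    Hux, Huu, Huw = H[n:n+m, :n], H[n:n+m, n:n+m], H[n:n+m, n+m:n+m+q]
    Hwx, Hwu, Hww = H[n+m:n+m+q, :n], H[n+m:n+m+q, n:n+m], H[n+m:n+m+q, n+m:n+m+q]
    L = np.linalg.solve(Huu - Huw @ np.linalg.solve(Hww, Hwu),
                        Huw @ np.linalg.solve(Hww, Hwx) - Hux)
    K = np.linalg.solve(Hww - Hwu @ np.linalg.solve(Huu, Huw),
                        Hwu @ np.linalg.solve(Huu, Hux) - Hwx)
    return L, K      # u_k = L x_k, w_k = K x_k; note A, B, E are not needed
```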
Asma Al-Tamimi
Continuous-Time Optimal Control

System:  dx/dt = f(x, u)

Cost:  V(x(t)) = ∫_t^∞ r(x, u) dτ = ∫_t^∞ ( Q(x) + uᵀ R u ) dτ

Hamiltonian:  0 = r(x, u) + (∂V/∂x)ᵀ f(x, u) ≡ H(x, ∂V/∂x, u),  V(0) = 0

Optimal cost:  0 = min_u [ r(x, u) + (∂V*/∂x)ᵀ f(x, u) ]

Optimal control:  h(x(t)) = -(1/2) R⁻¹ gᵀ(x) ∂V*/∂x

HJB equation:  0 = (∂V*/∂x)ᵀ f(x) + Q(x) - (1/4) (∂V*/∂x)ᵀ g(x) R⁻¹ gᵀ(x) ∂V*/∂x,  V*(0) = 0
Bellman
CT:  0 = min_u [ r(x, u) + (∂V/∂x)ᵀ f(x, u) ]

c.f. the DT value recursion, where f(·), g(·) do not appear:

V_h(x_k) = r(x_k, h(x_k)) + V_h(x_{k+1})

CT Hamiltonian:  0 = r(x, u) + (∂V/∂x)ᵀ f(x, u) ≡ H(x, ∂V/∂x, u)
0 = (∂V_j/∂x)ᵀ f(x, h_j(x)) + r(x, h_j(x)),  V_j(0) = 0

h_{j+1}(x) = -(1/2) R⁻¹ gᵀ(x) ∂V_j/∂x
CT Policy Iteration
• Convergence proved by Saridis 1979 if Lyapunov eq. solved exactly
• Beard & Saridis used complicated Galerkin Integrals to solve Lyapunov eq.
• Abu Khalaf & Lewis used NN to approx. V for nonlinear systems and proved convergence
Utility:  r(x, u) = Q(x) + uᵀ R u
Cost for any given u(t)
Lyapunov equation
Iterative solution
Pick stabilizing initial control
Find cost
Update control
Full system dynamics must be known
LQR Policy iteration = Kleinman algorithm
1. For a given control policy solve for the cost:
2. Improve policy:
If started with a stabilizing control policy L₀, the matrix P_k monotonically converges to the unique positive definite solution of the Riccati equation.
Every iteration step will return a stabilizing controller. The system has to be known.
u = -L_k x

Lyapunov eq.:  0 = A_kᵀ P_k + P_k A_k + Cᵀ C + L_kᵀ R L_k,  where A_k = A - B L_k

L_{k+1} = R⁻¹ Bᵀ P_k

(Kleinman 1968)
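A minimal Python sketch of the Kleinman iteration using a standard Lyapunov solver (the matrices and the initial stabilizing gain L0 are assumptions):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Hedged sketch of Kleinman's policy iteration for the LQR problem.
def kleinman(A, B, C, R, L0, n_iter=20):
    L = L0                                            # must be stabilizing
    for _ in range(n_iter):
        Ak = A - B @ L
        # Lyapunov eq.:  Ak' Pk + Pk Ak + C'C + Lk' R Lk = 0
        P = solve_continuous_lyapunov(Ak.T, -(C.T @ C + L.T @ R @ L))
        L = np.linalg.solve(R, B.T @ P)               # L_{k+1} = R^{-1} B' P_k
    return P, L
```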
Policy Iteration Solution
Ric(P) = Aᵀ P + P A + Q - P B R⁻¹ Bᵀ P

Policy iteration:  ( A - B Bᵀ P_i )ᵀ P_{i+1} + P_{i+1} ( A - B Bᵀ P_i ) + P_i B Bᵀ P_i + Q = 0

This is in fact a Newton's method. With Ric'_P the Frechet derivative of Ric at P, policy iteration is

P_{i+1} = P_i - ( Ric'_{P_i} )⁻¹ Ric( P_i ),  i = 0, 1, ...
1. For a given control policy update the cost:
2. Improve policy (discrete update of a continuous time controller):
Solving for the cost - our approach (ADP greedy iteration):

V_{k+1}( x(t₀) ) = x(t₀)ᵀ P_{k+1} x(t₀) = ∫_{t₀}^{t₀+T} ( xᵀ Q x + u_kᵀ R u_k ) dt + x(t₀+T)ᵀ P_k x(t₀+T) = ∫_{t₀}^{t₀+T} ( xᵀ Q x + u_kᵀ R u_k ) dt + V_k( x(t₀+T) )

Policy improvement:  u_k = -L_k x,  L_{k+1} = R⁻¹ Bᵀ P_{k+1}
- No initial stabilizing control needed
- The cost converges to the optimal cost
- The controller stabilizes the plant after a number of iterations
- Any initial P₀ ≥ 0
Now Greedy ADP can be defined for CT Systems - Draguna Vrabie
V_k( x(t) ) = W_kᵀ x̄( x(t) ),  with x̄ the quadratic basis vector
u(t+T) in terms of x(t+T) - OK
Analysis of the algorithm
For a given control policy u_k = -L_k x, with L_k = R⁻¹ Bᵀ P_k, closed-loop matrix A_k = A - B R⁻¹ Bᵀ P_k, state x(t) = e^{A_k t} x(0), and plant dx/dt = A x + B u,

one can show that the greedy update

V_{k+1}( x(t) ) = ∫_t^{t+T} ( xᵀ Q x + u_kᵀ R u_k ) dτ + V_k( x(t+T) ),  V₀ = 0

is equivalent to

P_{k+1} = e^{A_kᵀ T} P_k e^{A_k T} + ∫₀^T e^{A_kᵀ t} ( Q + L_kᵀ R L_k ) e^{A_k t} dt

a strange pseudo-discretized RE.

When ADP converges, the resulting P satisfies the Continuous-Time ARE!
ADP solves the CT ARE without knowledge of the system dynamics f(x).
Draguna Vrabie
Lemma 2. CT HDP is equivalent to

P_{k+1} = e^{A_kᵀ T} P_k e^{A_k T} + ∫₀^T e^{A_kᵀ t} ( Q + L_kᵀ R L_k ) e^{A_k t} dt,  with A_k = A - B R⁻¹ Bᵀ P_k
Analysis of the algorithm - Draguna Vrabie

Equivalent to the underlying equation

P_{k+1} = e^{A_kᵀ T} P_k e^{A_k T} + ∫₀^T e^{A_kᵀ t} ( Q + L_kᵀ R L_k ) e^{A_k t} dt

a strange pseudo-discretized RE. The extra term e^{A_kᵀ T} P_k e^{A_k T} means the initial control action need not be stabilizing.
Direct OPTIMAL ADAPTIVE CONTROL
Solve the Riccati Equation WITHOUT knowing the plant dynamics
Model-free ADP
Works for Nonlinear Systems
Proofs? Robustness? Comparison with adaptive control methods?
DT ADP vs. Receding Horizon Optimal Control
Forward-in-time ADP:

P_{i+1} = Aᵀ P_i A + Q - Aᵀ P_i B ( I + Bᵀ P_i B )⁻¹ Bᵀ P_i A,  P₀ = 0

Backward-in-time optimization (1-step RHC):

P_k = Aᵀ P_{k+1} A + Q - Aᵀ P_{k+1} B ( I + Bᵀ P_{k+1} B )⁻¹ Bᵀ P_{k+1} A,  P_N = 0
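A short Python sketch contrasting the two recursions; both apply the same Riccati map, one iterated forward from P0 = 0 and one backward from the horizon with PN = 0 (A, B, Q are illustrative).

```python
import numpy as np

def riccati_map(P, A, B, Q):
    """One application of the DT Riccati map shared by both recursions."""
    S = np.eye(B.shape[1]) + B.T @ P @ B
    return A.T @ P @ A + Q - A.T @ P @ B @ np.linalg.solve(S, B.T @ P @ A)

def forward_adp(A, B, Q, n_iter):
    P = np.zeros_like(Q)                  # P_0 = 0, iterate forward in time
    for _ in range(n_iter):
        P = riccati_map(P, A, B, Q)
    return P

def one_step_rhc_value(A, B, Q, N):
    P = np.zeros_like(Q)                  # P_N = 0, iterate backward from the horizon
    for _ in range(N):
        P = riccati_map(P, A, B, Q)
    return P                              # same sequence of matrices, indexed in reverse
```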
New Books
J. Campos and F.L. Lewis, "Method for Backlash Compensation Using Discrete-Time Neural Networks," SN 60/237,580, The Univ. Texas at Arlington, awarded March 2006.
Recent Patent
Use filter to overcome DT backstepping noncausality problem
International Collaboration
Jie Huang
Sam Ge
Organized and invited by Professor Jie Huang, CUHK
SCUT / CUHK Lectures on Advances in Control, March 2005
F.L. Lewis, Assoc. Director for Research, Moncrief-O’Donnell Endowed Chair
Head, Controls, Sensors, MEMS Group
Automation & Robotics Research Institute (ARRI), The University of Texas at Arlington
Sponsored by IEEE Singapore SMC, R&A, and Control Chapters
Organized and invited by Professor Sam Ge, NUS
UTA / IEEE Singapore Short Course: Wireless Sensor Networks for Monitoring Machinery, Human Biofunctions, and Biochemical Agents
Mediterranean Control Association, founded 1992
Conferences held in Crete, Cyprus, Rhodes, Israel, Italy, Dubrovnik (Croatia), and Kusadasi, Turkey (2004)
Founding Members: M. Christodoulou, P. Ioannou, K. Valavanis, F.L. Lewis, P. Antsaklis, Ted Djaferis, P. Groumpos
Bringing the Mediterranean Together for Collaboration and Research