Automation & Robotics Research Institute (ARRI), The University of Texas at Arlington
F.L. Lewis, Fellow IEEE, Fellow U.K. Inst. Meas. & Control, Moncrief-O’Donnell Endowed Chair
Head, Controls & Sensors Group
http://ARRI.uta.edu/acs, [email protected]
MEMS Modeling and Control
Thermal Model
[Thermal model transfer function; plot of total thermal resistance R_tot [K/W] vs. drive voltage V [V]]
m d²x/dt² + c dx/dt + k x = f
Mechanical Model
Electrical Model
Equivalent R-L-C circuit:
Z(s)=R+sL+1/sC
FEA Experiment
[Electrical transfer function I(s)/V(s) for the R-L-C model]
Optical Model
[Gaussian beam / fiber coupling model U(s, t; w)]
Bruno Borovic
With Ai Qun Liu, NTU Singapore
Electrostatic Comb Drive Actuator for Optical Switch Control
Optical fibre
Size of a human hair: 150 microns wide
Micro Electro Mechanical Optical Systems - MEMS
Electrostatic Parallel Plate Actuator
Use feedback linearization to solve the pull-in instability problem
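As a concrete illustration of that idea, here is a minimal Python sketch of the feedback-linearizing voltage for a 1-DOF parallel-plate model; the model form and all parameter names (g, eps_A, F_des) are illustrative assumptions, not taken from the slides.

```python
import numpy as np

# Hedged sketch: feedback linearization of a 1-DOF parallel-plate actuator.
# Assumed model: m*x'' + b*x' + k*x = eps_A * V**2 / (2*(g - x)**2),
# where g is the nominal gap and eps_A = permittivity * plate area.
def linearizing_voltage(F_des, x, g, eps_A):
    """Voltage that realizes a desired (attractive) electrostatic force at gap
    position x, cancelling the 1/(g - x)^2 nonlinearity responsible for pull-in."""
    F_des = max(F_des, 0.0)                      # the electrostatic force can only pull
    return np.sqrt(2.0 * F_des * (g - x) ** 2 / eps_A)
```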
Objective: Develop new control algorithms for faster, more precise control of DoD and industrial systems. Confront complex systems with backlash, deadzone, flexible modes, vibration, unknown disturbances and dynamics.
- Three patents on neural network learning control
- Internet-based remote site control and monitoring
- Rigorous proofs and design algorithms
- Applications to tank gun barrels, vehicle active suspension, industrial machine control
- SBIR contracts for industry implementation
- Numerous books and papers
- Award-winning PhD students
Neural Network Learning Controllers for Vibratory Systems - Gun barrel, HMMWV suspension
Linguistics - Fuzzy Logic: input membership functions, fuzzy logic rule base, output membership functions
Nervous System - Neural Network (NN): NN input, NN output
Input x, Output u
(Includes Adaptive Control)
Intelligent Control Tools
[Block diagram: neural-network backstepping controller - tracking loop with gain Kr and robust control term v_i(t); nonlinear FB linearization loop with NN#1 estimating F1(x); backstepping loop with NN#2 estimating F2(x) and gain 1/KB1; desired trajectory qd, tracking error e, filtered error r, motor current i and desired current id, control u_e, robot system]
PID loop
Nonlinear learning loops
Nonlinear Intelligent Control
Relevance - Machine Feedback Control
[Diagram: gun barrel / turret system - desired trajectory qd; azimuth/elevation commands qr1, qr2 (Az, El); barrel flexible modes qf; compliant coupling; moving tank platform; turret with backlash and compliant drive train; terrain and vehicle vibration disturbances d(t); barrel tip position]
[Diagram: single-wheel suspension - vehicle mass m; parallel damper; active damping force uc (if used); damper mass mc with stiffness/damping kc, cc; vibratory modes qf(t); forward speed y(t); vertical motion z(t); surface roughness w(t); series damper + suspension + wheel with stiffness/damping k, c]
Single-Wheel/Terrain System with Nonlinearities
High-Speed Precision Motion Control with unmodeled dynamics, vibration suppression, disturbance rejection, friction compensation, deadzone/backlash control
Vehicle Suspension
Industrial Machines
Military Land Systems
Aerospace
Aydin Yesildirek- Neural networks for control
S. Jagannathan- DT NN control
Sesh Commuri- CMAC, fuzzy
Rafael Fierro- robotic nonholonomic systems. Hybrid systems
Rastko Selmic- actuator nonlinearities
Javier Campos- fuzzy control, actuator nonlinearities
Ognjen Kuljaca- implementation of NN control
The Perpetrators
Neural Network Robot Controller
[Block diagram: PD tracking loop with gain Kv and robust control term v(t); NN nonlinear inner loop estimating f(x); feedforward loop; desired trajectory qd and its derivatives; tracking error e, filtered error r; robot system]
Universal Approximation Property
Problem - nonlinear in the NN weights, so standard proof techniques do not work
Feedback linearization
- Easy to implement with a few more lines of code
- Learning feature allows on-line updates to NN memory as dynamics change
- Handles unmodelled dynamics, disturbances, actuator problems such as friction
- NN universal basis property means no regression matrix is needed
- Nonlinear controller allows faster & more precise motion
Theorem 1 (NN Weight Tuning for Stability)
Let the desired trajectory q_d(t) and its derivatives be bounded. Let the initial tracking error be within a certain allowable set U. Let Z_M be a known upper bound on the Frobenius norm of the unknown ideal weights Z.

Take the control input as

τ = Ŵᵀ σ(V̂ᵀ x) + K_v r - v,  with robustifying term v(t) = -K_Z ( ||Ẑ||_F + Z_M ) r.

Let weight tuning be provided by

dŴ/dt = F σ̂ rᵀ - F σ̂' V̂ᵀ x rᵀ - κ F ||r|| Ŵ
dV̂/dt = G x ( σ̂'ᵀ Ŵ r )ᵀ - κ G ||r|| V̂

with any constant matrices F = Fᵀ > 0, G = Gᵀ > 0, and scalar tuning parameter κ > 0. Initialize the weight estimates as Ŵ = 0, V̂ = random.

Then the filtered tracking error r(t) and NN weight estimates Ŵ, V̂ are uniformly ultimately bounded. Moreover, arbitrarily small tracking error may be achieved by selecting large control gains K_v.

Backprop terms - Werbos
Extra robustifying terms - Narendra's e-mod extended to NLIP systems
Forward prop term?
Can also use simplified tuning - Hebbian
Extension of adaptive control to nonlinear-in-parameters (NLIP) systems; no regression matrix needed
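A minimal simulation-style sketch of the Theorem 1 tuning laws follows (Python; the tanh activation, Euler update, gains and dimensions are illustrative assumptions, and the robustifying term v(t) is omitted for brevity).

```python
import numpy as np

# Hedged sketch of the Theorem 1 weight-tuning laws.
# x: NN input vector (n,), r: filtered tracking error (m,),
# W: (L, m) outer-layer weights, V: (n, L) inner-layer weights.
def nn_controller_step(x, r, W, V, F, G, kappa, Kv, dt):
    sig = np.tanh(V.T @ x)                        # sigma(V'x), hidden-layer output
    dsig = np.diag(1.0 - sig ** 2)                # sigma'(V'x) for tanh units
    tau = W.T @ sig + Kv @ r                      # control (robustifying term omitted)
    # dW/dt = F sig r' - F sig' V'x r' - kappa F ||r|| W
    Wdot = (F @ np.outer(sig, r)
            - F @ dsig @ np.outer(V.T @ x, r)
            - kappa * np.linalg.norm(r) * F @ W)
    # dV/dt = G x (sig' W r)' - kappa G ||r|| V
    Vdot = G @ np.outer(x, dsig @ W @ r) - kappa * np.linalg.norm(r) * G @ V
    return tau, W + dt * Wdot, V + dt * Vdot      # simple Euler update of the weights
```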
Neural network backstepping controller for flexible-joint robot arm
[Block diagram as above: tracking loop (gain Kr), robust control term v_i(t), nonlinear FB linearization loop with NN#1 (F1(x)), backstepping loop with NN#2 (F2(x)), robot system]
Backstepping
Advantages over traditional Backstepping- no regression functions needed
- Add an extra feedback loop
- Two NNs needed
- Use passivity to show stability
DT backstepping- noncausal- Javier Campos patent
DT Systems- Jagannathan
[Block diagram: nonlinear system with backlash at the input; NN estimate of the nonlinear function f(x); filter v2; backstepping loop; NN compensator; desired trajectory x_d and filtered error r; gains K, Kb; control u; NN output y_nn]
Dynamic inversion NN compensator for system with Backlash
U.S. patent- Selmic, Lewis, Calise, McFarland
Force Control
Flexible pointing systems
Vehicle active suspension - SBIR Contracts
Andy Lowe, Scott Ikenaga, Javier Campos
ARRI Research Roadmap in Neural Networks

1. Neural Networks for Feedback Control – 1995-2002
- Based on FB control approach; unknown system dynamics; on-line tuning
- NN for FB linearization, singular perturbation, backstepping, force control, dynamic inversion, etc.
- Extended adaptive control to NLIP systems; no regression matrix

2. Neural Network Solution of Optimal Design Equations – 2002-2006
- Nearly optimal control based on HJ optimal design equations
- Known system dynamics; preliminary off-line tuning
- Nearly optimal solution of controls design equations; no canonical form needed

3. Approximate Dynamic Programming – 2006-
- Nearly optimal control based on recursive equation for the optimal value
- Usually known system dynamics (except Q-learning); the goal – unknown dynamics
- On-line tuning; optimal adaptive control
- Extend adaptive control to yield OPTIMAL controllers; no canonical form needed
L2 Gain Problem

System:
dx/dt = f(x) + g(x)u + k(x)d
y = x
z = z(x, u)
u = l(y)

d = disturbance, u = control, z = performance output, y = measured output,

where ||z||² = hᵀh + ||u||².

Find control u(t) so that

∫₀^∞ ||z(t)||² dt / ∫₀^∞ ||d(t)||² dt = ∫₀^∞ ( hᵀh + ||u||² ) dt / ∫₀^∞ ||d(t)||² dt ≤ γ²

for all L2 disturbances and a prescribed gain γ².
H-Infinity Control Using Neural Networks
Zero-Sum differential game
Murad Abu Khalaf
Cannot solve HJI !!
CT Policy Iteration for H-Infinity Control--- c.f. Howard
Consistency equation for the Value Function.

Successive Solution - Algorithm 1: Let γ be prescribed and fixed, and let u₀ be a stabilizing control with region of asymptotic stability Ω₀.

1. Outer loop - update control. Initial disturbance d₀ = 0.
2. Inner loop - update disturbance. Solve the Value Equation

0 = (∂V_j^i/∂x)ᵀ ( f + g u_j + k d_i ) + hᵀh + ||u_j||² - γ² ||d_i||²,  V_j^i(0) = 0

Inner loop disturbance update:

d_{i+1} = (1/(2γ²)) kᵀ(x) ∂V_j^i/∂x

Go to 2. Iterate i until convergence to V_j, d_∞, with RAS Ω_j.

Outer loop control action update:

u_{j+1} = -(1/2) gᵀ(x) ∂V_j/∂x

Go to 1. Iterate j until convergence to V_∞, u_∞, with RAS Ω_∞.
Murad Abu Khalaf
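For the linear-quadratic special case (V(x) = x'Px, u = -Lx, d = Kx) the two-loop algorithm above reduces to nested Lyapunov solves. A hedged Python sketch under that assumption (the matrices A, B, D, Q and the gain gamma are illustrative):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Hedged LQ specialization of the two-loop H-infinity policy iteration:
# the Value Equation becomes a Lyapunov equation in P, the inner loop updates
# the disturbance gain K = (1/gamma^2) D'P, the outer loop updates L = B'P.
def hinf_policy_iteration(A, B, D, Q, gamma, L0, n_outer=20, n_inner=50):
    L = L0                                            # initial stabilizing control gain
    for _ in range(n_outer):
        K = np.zeros((D.shape[1], A.shape[0]))        # initial disturbance d0 = 0
        for _ in range(n_inner):                      # inner loop on the disturbance
            Acl = A - B @ L + D @ K
            P = solve_continuous_lyapunov(Acl.T, -(Q + L.T @ L - gamma ** 2 * K.T @ K))
            K = (D.T @ P) / gamma ** 2
        L = B.T @ P                                   # outer-loop control update
    return P, L, K
```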
Results for this Algorithm
The algorithm converges to V*_{Ω₀}, u*_{Ω₀}, d*_{Ω₀}, the optimal solution on the RAS Ω₀. For this to occur it is required that Ω* ⊆ Ω₀. Sometimes the algorithm converges to the optimal HJI solution V*, Ω*, u*, d*.

For every iteration on the disturbance d_i one has
  V_j^{i+1} ≥ V_j^i - the value function increases
  Ω_j^{i+1} ⊆ Ω_j^i - the RAS decreases

For every iteration on the control u_j one has
  V_{j+1} ≤ V_j - the value function decreases
  Ω_{j+1} ⊇ Ω_j - the RAS does not decrease
Neural Network Approximation for Computational Technique

Problem - the Value Equation cannot be solved exactly!

Neural network to approximate V^(i)(x):

V_L^i(x) = Σ_{j=1..L} w_j^i σ_j(x) = (W_L^i)ᵀ σ_L(x)

The value function gradient approximation is

∂V_L^i/∂x = ( ∂σ_L(x)/∂x )ᵀ W_L^i = ∇σ_Lᵀ(x) W_L^i

Substitute into the Value Equation to get

0 = (W_L^i)ᵀ ∇σ_L(x) ( f(x) + g(x) u_j + k(x) d_i ) + hᵀh + ||u_j||² - γ² ||d_i||²

Therefore, one may solve for the NN weights at iteration (i, j).
Murad Abu Khalaf
Neural Network Optimal Feedback Controller

Optimal solution - a NN feedback controller with nearly optimal weights:

u = -(1/2) gᵀ(x) ∇σ_Lᵀ(x) W_L
d = (1/(2γ²)) kᵀ(x) ∇σ_Lᵀ(x) W_L
Murad Abu Khalaf
Optimal cost:

-∂V*(x,t)/∂t = min_u [ L(x, u) + (∂V*(x,t)/∂x)ᵀ ( f(x) + g(x) u ) ]

Optimal control:

u*(x,t) = -(1/2) R⁻¹ gᵀ(x) ∂V*(x,t)/∂x

This yields the time-varying Hamilton-Jacobi-Bellman (HJB) equation

0 = ∂V*(x,t)/∂t + (∂V*(x,t)/∂x)ᵀ f(x) + Q(x) - (1/4) (∂V*(x,t)/∂x)ᵀ g(x) R⁻¹ gᵀ(x) ∂V*(x,t)/∂x
Fixed-Final-Time HJB Optimal Control
Finite Horizon Control Cheng Tao
Note that V_L(x, t) = w_Lᵀ(t) σ_L(x), so

∂V_L(x,t)/∂x = ∇σ_Lᵀ(x) w_L(t), where ∇σ_L(x) = ∂σ_L(x)/∂x is the Jacobian, and

∂V_L(x,t)/∂t = (dw_L(t)/dt)ᵀ σ_L(x).
HJB Solution by NN Value Function Approximation Cheng Tao
Approximating in the HJB equation gives an ODE in the NN weights txV ,
(dw_L(t)/dt)ᵀ σ_L(x) = -w_Lᵀ(t) ∇σ_L(x) f(x) - Q(x) + (1/4) w_Lᵀ(t) ∇σ_L(x) g(x) R⁻¹ gᵀ(x) ∇σ_Lᵀ(x) w_L(t) + e_L(x, t)
Solve by least-squares – simply integrate backwards to find NN weights
Control is  u*(t) = -(1/2) R⁻¹ gᵀ(x) ∇σ_Lᵀ(x) w_L(t)
Time-varying weights
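A hedged Python sketch of that backward least-squares integration over a set of sample states follows; the basis functions, dynamics, and state cost are placeholder callables (assumptions), not quantities from the slides.

```python
import numpy as np

# Hedged sketch: backward least-squares integration of the NN-weight ODE above.
# sigma(x) -> (L,) basis values, dsigma(x) -> (L, n) basis Jacobian,
# f(x) -> (n,), g(x) -> (n, m), Q(x) -> scalar; Rinv is R^{-1}.
def integrate_weights_backward(w_T, t_grid, xs, sigma, dsigma, f, g, Q, Rinv):
    w = np.array(w_T, dtype=float)                    # terminal weights at t = T
    Phi = np.array([sigma(x) for x in xs])            # (N, L) regression matrix
    weights = [w.copy()]
    for k in range(len(t_grid) - 1, 0, -1):
        dt = t_grid[k] - t_grid[k - 1]
        rhs = []
        for x in xs:
            ds = dsigma(x)
            grad = ds.T @ w                           # approximate dV/dx at x
            rhs.append(-w @ (ds @ f(x)) - Q(x)
                       + 0.25 * grad @ g(x) @ Rinv @ g(x).T @ grad)
        wdot = np.linalg.lstsq(Phi, np.array(rhs), rcond=None)[0]
        w = w - dt * wdot                             # step backward in time
        weights.append(w.copy())
    return list(reversed(weights))                    # weights on t_grid, forward order
```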
Necessary and Sufficient Conditions for Bounded L2 Gain OPFB Control:
H-infinity OPFB control Theorem
For a given γ*, there exists an OPFB gain K such that A_o = A - BKC is asymptotically stable with L2 gain bounded by γ*, if and only if

(i) (A, C) is detectable, and
(ii) there exist matrices K and L, and a matrix P ≥ 0, such that

KC = R⁻¹ ( Bᵀ P + L )
0 = Aᵀ P + P A + Q + (1/γ²) P D Dᵀ P - P B R⁻¹ Bᵀ P + Lᵀ R⁻¹ L
Jyotirmay Gadewadikar, V. Kucera, Lihua Xie
H-Infinity Static Output-Feedback Control for Rotorcraft
J. Gadewadikar*, F. L. Lewis*, K. Subbarao$, K. Peng+, B. Chen+,
*Automation & Robotics Research Institute (ARRI),University of Texas at Arlington
+Department of Electrical and Computer Engineering, National University of Singapore
Discrete-Time Optimal Control

Cost:  V_h(x_k) = Σ_{i=k}^∞ r(x_i, u_i),  with u_k = h(x_k) the prescribed control input function (policy)

Value function recursion:  V_h(x_k) = r(x_k, h(x_k)) + V_h(x_{k+1})

Hamiltonian:  H(x_k, V_h, h) = r(x_k, h(x_k)) + V_h(x_{k+1}) - V_h(x_k)

Optimal cost (Bellman's Principle):  V*(x_k) = min_h [ r(x_k, h(x_k)) + V*(x_{k+1}) ] = min_{u_k} [ r(x_k, u_k) + V*(x_{k+1}) ]

Optimal control:  h*(x_k) = arg min_{u_k} [ r(x_k, u_k) + V*(x_{k+1}) ]

Note that the system dynamics does not appear in the value function recursion.
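Because the recursion uses only the one-step cost and the value at the observed next state, it can be sketched for a finite state/action grid as below (a hedged Python sketch; next_state and cost are placeholder callables, and a finite number of sweeps with an absorbing, zero-cost goal state is assumed).

```python
# Hedged sketch of the DT value recursion / Bellman principle on a finite
# state-action grid.  next_state(s, a) and cost(s, a) are placeholder callables;
# the update itself only needs r(x_k, u_k) and V at x_{k+1}.
def value_iteration(states, actions, next_state, cost, n_sweeps=100):
    V = {s: 0.0 for s in states}
    for _ in range(n_sweeps):
        V = {s: min(cost(s, a) + V[next_state(s, a)] for a in actions) for s in states}
    # greedy policy h*(x_k) = argmin_u [ r(x_k, u) + V*(x_{k+1}) ]
    policy = {s: min(actions, key=lambda a, s=s: cost(s, a) + V[next_state(s, a)])
              for s in states}
    return V, policy
```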
Solutions by Comp. Intelligence Community
Adaptive Dynamic Programming
ADP for H∞ Optimal Control Systems
System:
dx/dt = f(x) + g(x)u + k(x)d
y = x,  z = z(x, u)
u = l(y)

d = disturbance, u = control, z = penalty output, y = measured output,

where ||z||² = hᵀh + ||u||².

Find control u(t) so that

∫₀^∞ ||z(t)||² dt / ∫₀^∞ ||d(t)||² dt = ∫₀^∞ ( hᵀh + ||u||² ) dt / ∫₀^∞ ||d(t)||² dt ≤ γ²

for all L2 disturbances and a prescribed gain γ², when the system is at rest, x₀ = 0.
Asma Al-Tamimi
Discrete-Time Game Dynamic Programming: Backward-in-time Formulation
• Consider the following discrete-time dynamical system with continuous state and action spaces
• The zero-sum game problem can be formulated as follows
• The goal is to find the optimal strategies (state feedback) for this multi-agent problem
• Use the Bellman optimality principle "Dynamic Programming"
x_{k+1} = A x_k + B u_k + E w_k,  y_k = x_k

V(x_k) = min_u max_w Σ_{i=k}^∞ ( x_iᵀ Q x_i + u_iᵀ u_i - γ² w_iᵀ w_i )

u*(x) = L* x,  w*(x) = K* x

V(x_k) = min_u max_w [ r(x_k, u_k) + V(x_{k+1}) ]
       = min_u max_w [ x_kᵀ Q x_k + u_kᵀ u_k - γ² w_kᵀ w_k + x_{k+1}ᵀ P x_{k+1} ]
L_i = ( I + Bᵀ P_i B - Bᵀ P_i E ( Eᵀ P_i E - γ² I )⁻¹ Eᵀ P_i B )⁻¹ ( Bᵀ P_i E ( Eᵀ P_i E - γ² I )⁻¹ Eᵀ P_i A - Bᵀ P_i A )

K_i = ( Eᵀ P_i E - γ² I - Eᵀ P_i B ( I + Bᵀ P_i B )⁻¹ Bᵀ P_i E )⁻¹ ( Eᵀ P_i B ( I + Bᵀ P_i B )⁻¹ Bᵀ P_i A - Eᵀ P_i A )

Value function approximation V̂(x, p_i) = p_iᵀ x̄, with quadratic basis vector
x̄ = ( x₁², x₁x₂, ..., x₁xₙ, x₂², x₂x₃, ..., xₙ² )

Value function update:
p_{i+1}ᵀ x̄_k = x_kᵀ Q x_k + ( L_i x_k )ᵀ ( L_i x_k ) - γ² ( K_i x_k )ᵀ ( K_i x_k ) + p_iᵀ x̄_{k+1}
HDP- Linear System Case
Value function update
Control update
Control gain
Disturbance gain
Showed that this is equivalent to iteration on the Underlying Game Riccati equation
P_{i+1} = Aᵀ P_i A + Q
          - [ Aᵀ P_i B   Aᵀ P_i E ] [ I + Bᵀ P_i B      Bᵀ P_i E      ]⁻¹ [ Bᵀ P_i A ]
                                    [ Eᵀ P_i B      Eᵀ P_i E - γ² I  ]    [ Eᵀ P_i A ]

Which is known to converge - Stoorvogel, Basar
Solve by batch LS or RLS
û(x, L_i) = L_i x,  ŵ(x, K_i) = K_i x
A, B, E needed
Asma Al-Tamimi
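In the linear-quadratic case the HDP value update above collapses to one iteration of the underlying game Riccati equation; a hedged model-based Python sketch of that step (A, B, E, Q, gamma are illustrative assumptions):

```python
import numpy as np

# Hedged sketch: one HDP step written directly as the underlying game Riccati
# iteration (a model-based check of the batch-LS/RLS value-function update).
def game_riccati_step(A, B, E, Q, gamma, P):
    m, q = B.shape[1], E.shape[1]
    M = np.block([[np.eye(m) + B.T @ P @ B, B.T @ P @ E],
                  [E.T @ P @ B, E.T @ P @ E - gamma ** 2 * np.eye(q)]])
    N = np.vstack([B.T @ P @ A, E.T @ P @ A])
    W = np.hstack([A.T @ P @ B, A.T @ P @ E])
    return A.T @ P @ A + Q - W @ np.linalg.solve(M, N)
```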
Linear quadratic case - V and Q are quadratic:

V(x_k) = x_kᵀ P x_k

Q(x_k, u_k, w_k) = r(x_k, u_k, w_k) + V(x_{k+1}) = [ x_kᵀ u_kᵀ w_kᵀ ] H [ x_k ; u_k ; w_k ]

with H partitioned as

H = [ H_xx  H_xu  H_xw
      H_ux  H_uu  H_uw
      H_wx  H_wu  H_ww ]

Q-function update:

[ x_kᵀ u_kᵀ w_kᵀ ] H_{i+1} [ x_k ; u_k ; w_k ] = x_kᵀ R x_k + u_kᵀ u_k - γ² w_kᵀ w_k + [ x_{k+1}ᵀ u_{k+1}ᵀ w_{k+1}ᵀ ] H_i [ x_{k+1} ; u_{k+1} ; w_{k+1} ]

Control action and disturbance updates:  u_i(x_k) = L_i x_k,  w_i(x_k) = K_i x_k,  with

L_i = ( H_uu^i - H_uw^i (H_ww^i)⁻¹ H_wu^i )⁻¹ ( H_uw^i (H_ww^i)⁻¹ H_wx^i - H_ux^i )
K_i = ( H_ww^i - H_wu^i (H_uu^i)⁻¹ H_uw^i )⁻¹ ( H_wu^i (H_uu^i)⁻¹ H_ux^i - H_wx^i )
A, B, E NOT needed
Asma Al-Tamimi
Q learning for H-inf
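A hedged Python sketch of the policy-extraction step from an identified game Q-function matrix H (blocks ordered x, u, w); the partition sizes n, m, q are assumptions, and the least-squares identification of H from data is not shown.

```python
import numpy as np

# Hedged sketch: extract the control gain L and disturbance gain K from a
# partitioned game Q-function matrix H, following the update formulas above.
def policies_from_H(H, n, m, q):
    Hux, Huu, Huw = H[n:n+m, :n], H[n:n+m, n:n+m], H[n:n+m, n+m:n+m+q]
    Hwx, Hwu, Hww = H[n+m:n+m+q, :n], H[n+m:n+m+q, n:n+m], H[n+m:n+m+q, n+m:n+m+q]
    L = np.linalg.solve(Huu - Huw @ np.linalg.solve(Hww, Hwu),
                        Huw @ np.linalg.solve(Hww, Hwx) - Hux)
    K = np.linalg.solve(Hww - Hwu @ np.linalg.solve(Huu, Huw),
                        Hwu @ np.linalg.solve(Huu, Hux) - Hwx)
    return L, K      # u_k = L x_k, w_k = K x_k; note A, B, E are not needed
```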
Asma Al-Tamimi
Continuous-Time Optimal Control

System:  dx/dt = f(x, u)

Cost:  V(x(t)) = ∫_t^∞ r(x, u) dτ = ∫_t^∞ ( Q(x) + uᵀ R u ) dτ

Hamiltonian:  0 = r(x, u) + (∂V/∂x)ᵀ f(x, u) ≡ H(x, ∂V/∂x, u),  V(0) = 0

Optimal cost:  0 = min_u [ r(x, u) + (∂V*/∂x)ᵀ f(x, u) ]

Optimal control:  h(x(t)) = -(1/2) R⁻¹ gᵀ(x) ∂V*/∂x

HJB equation:  0 = (∂V*/∂x)ᵀ f(x) + Q(x) - (1/4) (∂V*/∂x)ᵀ g(x) R⁻¹ gᵀ(x) ∂V*/∂x,  V*(0) = 0
Bellman
CT:  0 = min_u [ r(x, u) + (∂V/∂x)ᵀ f(x, u) ]

c.f. the DT value recursion, where f(·), g(·) do not appear:

V_h(x_k) = r(x_k, h(x_k)) + V_h(x_{k+1})

CT Hamiltonian:  0 = r(x, u) + (∂V/∂x)ᵀ f(x, u) ≡ H(x, ∂V/∂x, u)
0 = (∂V_j/∂x)ᵀ f(x, h_j(x)) + r(x, h_j(x)),  V_j(0) = 0

h_{j+1}(x) = -(1/2) R⁻¹ gᵀ(x) ∂V_j/∂x
CT Policy Iteration
• Convergence proved by Saridis 1979 if Lyapunov eq. solved exactly
• Beard & Saridis used complicated Galerkin Integrals to solve Lyapunov eq.
• Abu Khalaf & Lewis used NN to approx. V for nonlinear systems and proved convergence
Utility:  r(x, u) = Q(x) + uᵀ R u
Cost for any given u(t)
Lyapunov equation
Iterative solution
Pick stabilizing initial control
Find cost
Update control
Full system dynamics must be known
LQR Policy iteration = Kleinman algorithm
1. For a given control policy solve for the cost:
2. Improve policy:
If started with a stabilizing control policy L₀, the matrix P_k monotonically converges to the unique positive definite solution of the Riccati equation.
Every iteration step will return a stabilizing controller. The system has to be known.
u = -L_k x

Lyapunov eq.:  0 = A_kᵀ P_k + P_k A_k + Cᵀ C + L_kᵀ R L_k,  where A_k = A - B L_k

L_{k+1} = R⁻¹ Bᵀ P_k

(Kleinman 1968)
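A minimal Python sketch of the Kleinman iteration using a standard Lyapunov solver (the matrices and the initial stabilizing gain L0 are assumptions):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Hedged sketch of Kleinman's policy iteration for the LQR problem.
def kleinman(A, B, C, R, L0, n_iter=20):
    L = L0                                            # must be stabilizing
    for _ in range(n_iter):
        Ak = A - B @ L
        # Lyapunov eq.:  Ak' Pk + Pk Ak + C'C + Lk' R Lk = 0
        P = solve_continuous_lyapunov(Ak.T, -(C.T @ C + L.T @ R @ L))
        L = np.linalg.solve(R, B.T @ P)               # L_{k+1} = R^{-1} B' P_k
    return P, L
```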
Policy Iteration Solution
Ric(P) = Aᵀ P + P A + Q - P B R⁻¹ Bᵀ P

Policy iteration:  ( A - B Bᵀ P_i )ᵀ P_{i+1} + P_{i+1} ( A - B Bᵀ P_i ) + P_i B Bᵀ P_i + Q = 0

This is in fact a Newton's method. With Ric'_P the Frechet derivative of Ric at P, policy iteration is

P_{i+1} = P_i - ( Ric'_{P_i} )⁻¹ Ric( P_i ),  i = 0, 1, ...
1. For a given control policy update the cost:
2. Improve policy (discrete update of a continuous time controller):
Solving for the cost - our approach (ADP greedy iteration):

V_{k+1}( x(t₀) ) = x(t₀)ᵀ P_{k+1} x(t₀) = ∫_{t₀}^{t₀+T} ( xᵀ Q x + u_kᵀ R u_k ) dt + x(t₀+T)ᵀ P_k x(t₀+T) = ∫_{t₀}^{t₀+T} ( xᵀ Q x + u_kᵀ R u_k ) dt + V_k( x(t₀+T) )

Policy improvement:  u_k = -L_k x,  L_{k+1} = R⁻¹ Bᵀ P_{k+1}
- No initial stabilizing control needed
- The cost converges to the optimal cost
- The controller stabilizes the plant after a number of iterations
- Any initial P₀ ≥ 0
Now Greedy ADP can be defined for CT Systems - Draguna Vrabie
V_k( x(t) ) = W_kᵀ x̄( x(t) ),  with x̄ the quadratic basis vector
u(t+T) in terms of x(t+T) - OK
Analysis of the algorithm
For a given control policy u_k = -L_k x, with L_k = R⁻¹ Bᵀ P_k, closed-loop matrix A_k = A - B R⁻¹ Bᵀ P_k, state x(t) = e^{A_k t} x(0), and plant dx/dt = A x + B u,

one can show that the greedy update

V_{k+1}( x(t) ) = ∫_t^{t+T} ( xᵀ Q x + u_kᵀ R u_k ) dτ + V_k( x(t+T) ),  V₀ = 0

is equivalent to

P_{k+1} = e^{A_kᵀ T} P_k e^{A_k T} + ∫₀^T e^{A_kᵀ t} ( Q + L_kᵀ R L_k ) e^{A_k t} dt

a strange pseudo-discretized RE.

When ADP converges, the resulting P satisfies the Continuous-Time ARE!
ADP solves the CT ARE without knowledge of the system dynamics f(x).
Draguna Vrabie
Lemma 2. CT HDP is equivalent to

P_{k+1} = e^{A_kᵀ T} P_k e^{A_k T} + ∫₀^T e^{A_kᵀ t} ( Q + L_kᵀ R L_k ) e^{A_k t} dt,  with A_k = A - B R⁻¹ Bᵀ P_k
Analysis of the algorithm - Draguna Vrabie

Equivalent to the underlying equation

P_{k+1} = e^{A_kᵀ T} P_k e^{A_k T} + ∫₀^T e^{A_kᵀ t} ( Q + L_kᵀ R L_k ) e^{A_k t} dt

a strange pseudo-discretized RE. The extra term e^{A_kᵀ T} P_k e^{A_k T} means the initial control action need not be stabilizing.
Direct OPTIMAL ADAPTIVE CONTROL
Solve the Riccati Equation WITHOUT knowing the plant dynamics
Model-free ADP
Works for Nonlinear Systems
Proofs? Robustness? Comparison with adaptive control methods?
DT ADP vs. Receding Horizon Optimal Control
Forward-in-time ADP:

P_{i+1} = Aᵀ P_i A + Q - Aᵀ P_i B ( I + Bᵀ P_i B )⁻¹ Bᵀ P_i A,  P₀ = 0

Backward-in-time optimization (1-step RHC):

P_k = Aᵀ P_{k+1} A + Q - Aᵀ P_{k+1} B ( I + Bᵀ P_{k+1} B )⁻¹ Bᵀ P_{k+1} A,  P_N = 0
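A short Python sketch contrasting the two recursions; both apply the same Riccati map, one iterated forward from P0 = 0 and one backward from the horizon with PN = 0 (A, B, Q are illustrative).

```python
import numpy as np

def riccati_map(P, A, B, Q):
    """One application of the DT Riccati map shared by both recursions."""
    S = np.eye(B.shape[1]) + B.T @ P @ B
    return A.T @ P @ A + Q - A.T @ P @ B @ np.linalg.solve(S, B.T @ P @ A)

def forward_adp(A, B, Q, n_iter):
    P = np.zeros_like(Q)                  # P_0 = 0, iterate forward in time
    for _ in range(n_iter):
        P = riccati_map(P, A, B, Q)
    return P

def one_step_rhc_value(A, B, Q, N):
    P = np.zeros_like(Q)                  # P_N = 0, iterate backward from the horizon
    for _ in range(N):
        P = riccati_map(P, A, B, Q)
    return P                              # same sequence of matrices, indexed in reverse
```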
New Books
J. Campos and F.L. Lewis, "Method for Backlash Compensation Using Discrete-Time Neural Networks," SN 60/237,580, The Univ. Texas at Arlington, awarded March 2006.
Recent Patent
Use filter to overcome DT backstepping noncausality problem
International Collaboration
Jie Huang
Sam Ge
Organized and invited by Professor Jie Huang, CUHK
SCUT / CUHK Lectures on Advances in Control, March 2005
F.L. Lewis, Assoc. Director for Research, Moncrief-O’Donnell Endowed Chair
Head, Controls, Sensors, MEMS Group
Automation & Robotics Research Institute (ARRI), The University of Texas at Arlington
Sponsored by IEEE Singapore SMC, R&A, and Control Chapters
Organized and invited by Professor Sam Ge, NUS
UTA / IEEE Singapore Short Course: Wireless Sensor Networks for Monitoring Machinery, Human Biofunctions, and Biochemical Agents
Mediterranean Control Association, founded 1992
Conferences held in Crete, Cyprus, Rhodes, Israel, Italy, Dubrovnik (Croatia), and Kusadasi, Turkey (2004)
Founding Members: M. Christodoulou, P. Ioannou, K. Valavanis, F.L. Lewis, P. Antsaklis, Ted Djaferis, P. Groumpos
Bringing the Mediterranean Together for Collaboration and Research