Automation & Robotics Research Institute (ARRI) The University of Texas at Arlington F.L. Lewis Moncrief-O’Donnell Endowed Chair Head, Controls & Sensors Group http://ARRI.uta.edu/acs Nonlinear Network Structures for Feedback Control


Post on 24-Feb-2021


Page 1: Automation & Robotics Research Institute (ARRI) The ... talks/montreal plenary.pdfUnknown disturbances High performance specifications for nonlinear systems Actuator problems such

Automation & Robotics Research Institute (ARRI)
The University of Texas at Arlington

F.L. Lewis
Moncrief-O’Donnell Endowed Chair

Head, Controls & Sensors Group

http://ARRI.uta.edu/acs

Nonlinear Network Structures for Feedback Control

Page 2

Organized and invited by Professor Jie Huang, CUHK

SCUT / CUHK Lectures on Advances in Control
March 2005

Page 3

Relevance- Machine Feedback Control

[Figure: gun-pointing system on a moving tank platform. Labels: desired position qd; rigid coordinates qr1, qr2 (Az, El); barrel flexible modes qf; compliant coupling; turret with backlash and compliant drive train; terrain and vehicle vibration disturbances d(t); barrel tip position.]

[Figure: Single-Wheel/Terrain System with Nonlinearities. Labels: vehicle mass m; parallel damper; mass mc with active damping uc (if used), kc, cc; vibratory modes qf(t); forward speed y(t); vertical motion z(t); surface roughness ρ(t); k, c; w(t); series damper + suspension + wheel.]

High-Speed Precision Motion Control with unmodeled dynamics, vibration suppression, disturbance rejection, friction compensation, deadzone/backlash control

Vehicle Suspension

Industrial Machines

Military Land Systems

Aerospace

Page 4

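Newton’s Law

$$F = ma = m\ddot{x}, \qquad \ddot{x} = \frac{F(t)}{m} \equiv u(t)$$

[Figure: mass m driven by force F(t), with position p(t) and velocity v(t).]

Mechanical Motion Systems (Vehicles, Robots)

Lagrange’s Eqs. of Motion

$$M(q)\ddot{q} + V_m(q,\dot{q})\dot{q} + F(\dot{q}) + G(q) + \tau_d = B(q)\tau$$

Terms: inertia $M(q)\ddot{q}$; Coriolis/centripetal force $V_m(q,\dot{q})\dot{q}$; friction $F(\dot{q})$; gravity $G(q)$; disturbances $\tau_d$; actuator problems $B(q)$; control input $\tau$.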

Page 5

Darwinian Selection & Population Dynamics

Volterra’s fishes: x1 = prey, x2 = predator

$$\dot{x}_1 = a x_1 - b x_1 x_2, \qquad \dot{x}_2 = -c x_2 + d x_1 x_2$$

Stable Limit Cycle

Effects of overcrowding: limited food and resources

$$\dot{x}_1 = a x_1 - b x_1 x_2 - e x_1^2, \qquad \dot{x}_2 = -c x_2 + d x_1 x_2$$

Stable Equilibrium POINT

Favorable to Prey!
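The two predator-prey models on this slide can be compared numerically. The following is a minimal sketch, not part of the original talk: the parameter values, initial condition, step size and the helper names `volterra_rhs`, `simulate` and `invariant` are all assumptions chosen for illustration. With e = 0 the trajectory keeps circulating (the quantity returned by `invariant` stays constant along it); with e > 0 it settles at the equilibrium x1 = c/d, x2 = (a - e c/d)/b.

```python
import math

def volterra_rhs(x1, x2, a, b, c, d, e=0.0):
    """Right-hand side of the Volterra model; e > 0 adds the overcrowding term."""
    dx1 = a * x1 - b * x1 * x2 - e * x1 * x1
    dx2 = -c * x2 + d * x1 * x2
    return dx1, dx2

def simulate(x1, x2, params, e=0.0, dt=0.01, steps=6000):
    """Integrate with a fixed-step fourth-order Runge-Kutta scheme."""
    a, b, c, d = params
    for _ in range(steps):
        k1 = volterra_rhs(x1, x2, a, b, c, d, e)
        k2 = volterra_rhs(x1 + 0.5 * dt * k1[0], x2 + 0.5 * dt * k1[1], a, b, c, d, e)
        k3 = volterra_rhs(x1 + 0.5 * dt * k2[0], x2 + 0.5 * dt * k2[1], a, b, c, d, e)
        k4 = volterra_rhs(x1 + dt * k3[0], x2 + dt * k3[1], a, b, c, d, e)
        x1 += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        x2 += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return x1, x2

def invariant(x1, x2, params):
    """Quantity conserved by the model without overcrowding (e = 0)."""
    a, b, c, d = params
    return d * x1 - c * math.log(x1) + b * x2 - a * math.log(x2)

params = (1.0, 1.0, 1.0, 1.0)          # assumed a, b, c, d
x_cycle = simulate(1.5, 1.0, params, e=0.0)
x_crowded = simulate(1.5, 1.0, params, e=0.5)
print("no overcrowding, state at t=60:", x_cycle)
print("overcrowding,    state at t=60:", x_crowded)
```

For the assumed values the overcrowded model has its equilibrium at (1, 0.5), which is where the second run ends up.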

Page 6

Dynamical System Models

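Continuous-Time Systems

Nonlinear system: $\dot{x} = f(x) + g(x)u, \quad y = h(x)$

Linear system: $\dot{x} = Ax + Bu, \quad y = Cx$

Discrete-Time Systems

Nonlinear system: $x_{k+1} = f(x_k) + g(x_k)u_k, \quad y_k = h(x_k)$

Linear system: $x_{k+1} = Ax_k + Bu_k, \quad y_k = Cx_k$

[Figure: block diagram with input gain g(x), drift f(x), output map h(x), and an integrator 1/s (continuous time) or delay z^-1 (discrete time); signals u, \dot{x}, x, y.]

Control Inputs u, Internal States x, Measured Outputs y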

Page 7

Issues in Feedback Control

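[Figure: feedback control loop. Desired trajectories drive a feedforward controller and a feedback controller, which generate the control inputs to the system; measured outputs are fed back; disturbances act on the system and sensor noise on the measurements.]

Stability
Tracking
Boundedness
Robustness to disturbances, to unknown dynamics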

Page 8

Definitions of System Stability

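$$\dot{x} = f(x), \qquad x_{k+1} = f(x_k)$$

[Figure: sketches of x(t) versus t about an equilibrium xe, labeled Asymptotic Stability, Marginal Stability and Uniform Ultimate Boundedness. In the UUB sketch, x(t) enters the band between xe - B and xe + B (constant bound B) within time T after t0, that is, by t0 + T. A further sketch is labeled d and B(d).]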

Page 9

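Controller Topologies

Indirect Scheme
[Figure: the controller produces the control u(t) for the plant with output y(t); a system identifier produces the estimated output \hat{y}(t); the identification error and the desired output y_d(t) are used by the controller.]

Direct Scheme
[Figure: the controller produces the control u(t) for the plant with output y(t); the tracking error between the desired output y_d(t) and y(t) drives the controller.]

Feedback/Feedforward Scheme
[Figure: controller #1 and controller #2 together produce the control u(t) for the plant with output y(t), using the desired output y_d(t) and the tracking error.]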

Page 10

Cell Homeostasis

The individual cell is a complex feedback control system. It pumps ions across the cell membrane to maintain homeostasis, and has only limited energy to do so.

Cellular Metabolism

Permeability control of the cell membrane

http://www.accessexcellence.org/RC/VL/GG/index.html

Optimality in Biological Systems

Page 11

Optimality in Control Systems Design
R. Kalman 1960

Rocket Orbit Injection

http://microsat.sm.bmstu.ru/e-library/Launch/Dnepr_GEO.pdf

Dynamics

$$\dot{r} = w$$
$$\dot{w} = \frac{v^2}{r} - \frac{\mu}{r^2} + \frac{F}{m}\sin\phi$$
$$\dot{v} = -\frac{wv}{r} + \frac{F}{m}\cos\phi$$
$$\dot{m} = -Fm$$

Objectives
Get to orbit in minimum time
Use minimum fuel

Page 12

Performance Index, Cost, or Value function

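Continuous time (CT), with strategic utility J and utility r:

$$J = \int_0^T [Q(x) + R(u)]\,dt = \int_0^T r(x,u)\,dt$$

Discrete time (DT):

$$J = \sum_{k=0}^{N} r(x_k,u_k)$$

Minimum energy: $r(x,u) = x^T Q x + u^T R u$

Minimum fuel: $r(x,u) = |u|$

Minimum time: $r(x,u) = 1$. Then $J = \int_0^T r(x,u)\,dt = T$

Discounting: $J = \sum_{k=0}^{N} \gamma^k r(x_k,u_k)$, $\qquad J = \int_0^T e^{-\gamma t}\, r(x,u)\,dt$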
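The discrete-time performance indices above can be evaluated along a simulated trajectory. The sketch below is illustrative only: the scalar plant x_{k+1} = a x_k + b u_k, the feedback gain, the horizon and the function names `rollout` and `cost` are assumptions, not taken from the slides. It computes the minimum-energy cost with Q = R = 1, the same cost with discounting, and the minimum-time cost with r = 1.

```python
def rollout(x0, gain, n_steps, a=0.9, b=1.0):
    """Simulate x_{k+1} = a x_k + b u_k under u_k = -gain * x_k for k = 0..n_steps."""
    xs, us = [], []
    x = x0
    for _ in range(n_steps + 1):
        u = -gain * x
        xs.append(x)
        us.append(u)
        x = a * x + b * u
    return xs, us

def cost(utilities, gamma=1.0):
    """J = sum_k gamma^k r_k; gamma = 1 gives the undiscounted index."""
    return sum((gamma ** k) * r for k, r in enumerate(utilities))

N = 50                                            # assumed horizon
xs, us = rollout(x0=1.0, gain=0.5, n_steps=N)
energy = [x * x + u * u for x, u in zip(xs, us)]  # r = x'Qx + u'Ru with Q = R = 1
J_energy = cost(energy)
J_discounted = cost(energy, gamma=0.9)
J_time = cost([1.0] * (N + 1))                    # r = 1 counts the steps taken
print(J_energy, J_discounted, J_time)
```

With these assumed numbers the closed loop is x_{k+1} = 0.4 x_k, so the energy cost is a geometric series and can be checked by hand.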

Page 13

INTELLIGENT CONTROL TOOLS

Fuzzy Associative Memory (FAM): Input x, Input Membership Fns., Fuzzy Logic Rule Base, Output Membership Fns., Output u

Neural Network (NN): Input x, NN Input, NN Output, Output u

Both FAM and NN define a function u = f(x) from inputs to outputs

FAM and NN can both be used for:
1. Classification and Decision-Making
2. Control

NN Includes Adaptive Control (Adaptive control is a 1-layer NN)

Page 14

Neural Network Properties

Learning

Recall

Function approximation

Generalization

Classification

Association

Pattern recognition

Clustering

Robustness to single node failure

Repair and reconfiguration

Nervous system cell. http://www.sirinet.net/~jgjohnso/index.html

Page 15

First groups working on NN Feedback Control in CS community, c. 1995

Werbos
Narendra
Sanner & Slotine
F.C. Chen & Khalil
Lewis
Polycarpou & Ioannou
Christodoulou & Rovithakis
A.J. Calise, McFarland, Naira Hovakimyan
Edgar Sanchez & Poznyak
Sam Ge, Zhang, et al.
Jun Wang, Chinese Univ. Hong Kong

Page 16

Industry Standard- PD Controller

[Figure: unity-gain tracking loop. The desired trajectory qd and the actual trajectory q form the error e, which passes through [Λ I] to give r; the gain Kv produces the control input τ to the robot system.]

$$r(t) = \dot{e}(t) + \Lambda e(t)$$

Easy to implement with COTS controllers
Fast
Can be implemented with a few lines of code, e.g. MATLAB

But cannot handle:
High-order unmodeled dynamics
Unknown disturbances
High performance specifications for nonlinear systems
Actuator problems such as friction, deadzones, backlash
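As a hedged illustration of the "few lines of code" remark, the PD tracking loop can be sketched in Python with NumPy. The error convention e = qd - q follows the tracking error definition used later in the talk; the gain values and the function names `filtered_error` and `pd_torque` are assumptions for this example.

```python
import numpy as np

def filtered_error(q_d, q_d_dot, q, q_dot, Lam):
    """Return e = q_d - q and the filtered error r = e_dot + Lam e."""
    e = q_d - q
    e_dot = q_d_dot - q_dot
    return e, e_dot + Lam @ e

def pd_torque(q_d, q_d_dot, q, q_dot, Lam, Kv):
    """Outer PD tracking loop only: tau = Kv r."""
    _, r = filtered_error(q_d, q_d_dot, q, q_dot, Lam)
    return Kv @ r

Lam = np.diag([2.0, 2.0])      # assumed Lambda
Kv = np.diag([10.0, 10.0])     # assumed Kv
tau = pd_torque(np.array([1.0, 0.0]), np.zeros(2),
                np.zeros(2), np.zeros(2), Lam, Kv)
print(tau)
```

Since tau = Kv e_dot + Kv Lam e, the effective derivative gain is Kv and the effective proportional gain is Kv Lam.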

Page 17

Two-layer feedforward static neural network (NN)

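[Figure: two-layer network with inputs x1, x2, ..., xn, a hidden layer of neurons 1, 2, 3, ..., L with activation σ(.), first-layer weights V^T, second-layer weights W^T, and outputs y1, y2, ..., ym.]

Summation eqs

$$y_i = \sigma\!\left( \sum_{k=1}^{K} w_{ik}\, \sigma\!\left( \sum_{j=1}^{n} v_{kj} x_j + v_{k0} \right) + w_{i0} \right)$$

Matrix eqs

$$y = W^T \sigma(V^T x)$$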
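The matrix form y = W^T σ(V^T x) can be checked against an explicit double sum. This is a minimal sketch under stated assumptions: the thresholds are stored as the first row of V and W and the input and hidden vectors are augmented with a leading 1, the activation is a logistic sigmoid, and the output layer is linear as in the matrix equation (the summation form on the slide also wraps the output in σ). The function names and dimensions are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nn_matrix(x, V, W):
    """y = W^T sigma(V^T x) with thresholds folded in as the first rows of V and W."""
    x_aug = np.concatenate(([1.0], x))                      # prepend 1 for v_k0
    h_aug = np.concatenate(([1.0], sigmoid(V.T @ x_aug)))   # prepend 1 for w_i0
    return W.T @ h_aug

def nn_summation(x, V, W):
    """Same network written as explicit sums over hidden units k and inputs j."""
    n, L, m = len(x), V.shape[1], W.shape[1]
    y = np.zeros(m)
    for i in range(m):
        acc = W[0, i]                        # output threshold w_i0
        for k in range(L):
            inner = V[0, k]                  # hidden threshold v_k0
            for j in range(n):
                inner += V[j + 1, k] * x[j]
            acc += W[k + 1, i] * sigmoid(inner)
        y[i] = acc
    return y

rng = np.random.default_rng(0)
n, L, m = 2, 3, 2                            # assumed layer sizes
V = rng.normal(size=(n + 1, L))
W = rng.normal(size=(L + 1, m))
x = np.array([0.3, -0.7])
print(nn_matrix(x, V, W), nn_summation(x, V, W))
```

The two functions return the same vector, which is the point of the "Summation eqs / Matrix eqs" pairing.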

Page 18

Control System Design Approach

Robot dynamics

$$M(q)\ddot{q} + V_m(q,\dot{q})\dot{q} + G(q) + F(\dot{q}) + \tau_d = \tau$$

Tracking Error definition

$$e(t) = q_d(t) - q(t), \qquad r = \dot{e} + \Lambda e$$

Error dynamics

$$M\dot{r} = -V_m r + f(x) + \tau_d - \tau$$

Page 19

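[Figure: PD tracking loop. The desired trajectory qd and the measured q form the error e, which passes through [Λ I] to give r; a controller block marked "?" produces the torque τ applied to the robot system.]

Robot dynamics: $M(q)\ddot{q} + V_m(q,\dot{q})\dot{q} + G(q) + F(\dot{q}) + \tau_d = \tau$

Tracking error: $e(t) = q_d(t) - q(t)$

Sliding variable: $r = \dot{e} + \Lambda e$

The equations give the FB controller structure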

Page 20

Control System Design Approach

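Robot dynamics

$$M(q)\ddot{q} + V_m(q,\dot{q})\dot{q} + G(q) + F(\dot{q}) + \tau_d = \tau$$

Tracking Error definition

$$e(t) = q_d(t) - q(t), \qquad r = \dot{e} + \Lambda e$$

Error dynamics

$$M\dot{r} = -V_m r + f(x) + \tau_d - \tau$$

Approx. unknown function by NN (Universal Approximation Property), where f(x) is the UNKNOWN FN.

$$f(x) = W^T\sigma(V^T x) + \varepsilon$$

Define control input

$$\tau = \hat{W}^T\sigma(\hat{V}^T x) + K_v r - v$$

Closed-loop dynamics

$$M\dot{r} = -V_m r - K_v r + W^T\sigma(V^T x) + \varepsilon - \hat{W}^T\sigma(\hat{V}^T x) + \tau_d + v(t)$$

$$M\dot{r} = -V_m r - K_v r + \tilde{f} + \tau_d + v(t)$$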

Page 21

Neural Network Robot Controller

[Figure: PD tracking loop (error e through [Λ I] to r, gain Kv, torque τ to the robot system) augmented with a nonlinear inner loop containing the NN estimate \hat{f}(x), a feedforward loop from qd and \ddot{q}_d, and a robust control term v(t).]

Feedback linearization
Universal Approximation Property

Problem- Nonlinear in the NN weights so that standard proof techniques do not work

Easy to implement with a few more lines of code
Learning feature allows for on-line updates to NN memory as dynamics change
Handles unmodelled dynamics, disturbances, actuator problems such as friction
NN universal basis property means no regression matrix is needed
Nonlinear controller allows faster & more precise motion

Page 22

Stability Proof based on Lyapunov Extension

Define a Lyapunov Energy Function

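$$L = \tfrac{1}{2} r^T M r + \tfrac{1}{2}\mathrm{tr}(\tilde{W}^T\tilde{W}) + \tfrac{1}{2}\mathrm{tr}(\tilde{V}^T\tilde{V})$$

Differentiate

$$\dot{L} = -r^T K_v r + \tfrac{1}{2} r^T(\dot{M} - 2V_m) r + \mathrm{tr}\,\tilde{W}^T\!\left(\dot{\tilde{W}} + \hat{\sigma} r^T - \hat{\sigma}'\hat{V}^T x r^T\right) + \mathrm{tr}\,\tilde{V}^T\!\left(\dot{\tilde{V}} + x r^T \hat{W}^T \hat{\sigma}'\right) + r^T(w + v)$$

Using certain special tuning rules, one can show that the energy derivative is negative outside a compact set.

[Figure: plane with axes r(t) and \tilde{W}(t); \dot{L} is negative outside a compact set.]

This proves that all signals are bounded

Problems:
1. How to characterize the NN weight errors as ‘small’? Use the Frobenius norm.
2. Nonlinearity in the parameters requires extra care in the proof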

Page 23

Theorem 1 (NN Weight Tuning for Stability)

Let the desired trajectory $q_d(t)$ and its derivatives be bounded. Let the initial tracking error be within a certain allowable set $U$. Let $Z_M$ be a known upper bound on the Frobenius norm of the unknown ideal weights $Z$. Take the control input as

$$\tau = \hat{W}^T\sigma(\hat{V}^T x) + K_v r - v \quad \text{with} \quad v(t) = -K_Z(\|\hat{Z}\|_F + Z_M)\, r .$$

Let weight tuning be provided by

$$\dot{\hat{W}} = F\hat{\sigma} r^T - F\hat{\sigma}'\hat{V}^T x r^T - \kappa F\|r\|\hat{W}, \qquad \dot{\hat{V}} = Gx(\hat{\sigma}'^T\hat{W} r)^T - \kappa G\|r\|\hat{V}$$

with any constant matrices $F = F^T > 0$, $G = G^T > 0$, and scalar tuning parameter $\kappa > 0$. Initialize the weight estimates as $\hat{W} = 0$, $\hat{V} = \text{random}$.

Then the filtered tracking error $r(t)$ and NN weight estimates $\hat{W}, \hat{V}$ are uniformly ultimately bounded. Moreover, arbitrarily small tracking error may be achieved by selecting large control gains $K_v$.

Backprop terms- Werbos
Extra robustifying terms- Narendra’s e-mod extended to NLIP systems
Forward Prop term?

Can also use simplified tuning- Hebbian
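One way to see what the tuning laws of Theorem 1 involve computationally is a forward-Euler update of the two weight matrices. This is a sketch under assumptions, not the talk's implementation: the Euler discretization and step `dt`, the logistic activation, the omission of threshold augmentation, and the name `tuning_step` are all choices made for illustration. Shapes are x in R^n, V_hat in R^{n x L}, W_hat in R^{L x m}, r in R^m.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tuning_step(W_hat, V_hat, x, r, F, G, kappa, dt):
    """One Euler step of the backprop-plus-e-mod tuning laws."""
    s = sigmoid(V_hat.T @ x)                  # sigma_hat, shape (L,)
    s_prime = np.diag(s * (1.0 - s))          # Jacobian sigma_hat', shape (L, L)
    r_norm = np.linalg.norm(r)
    W_dot = (F @ np.outer(s, r)
             - F @ s_prime @ np.outer(V_hat.T @ x, r)
             - kappa * r_norm * (F @ W_hat))
    V_dot = (G @ np.outer(x, s_prime.T @ W_hat @ r)
             - kappa * r_norm * (G @ V_hat))
    return W_hat + dt * W_dot, V_hat + dt * V_dot

n, L, m = 3, 4, 2                             # assumed sizes
rng = np.random.default_rng(1)
W_hat = np.zeros((L, m))                      # W_hat initialized to zero
V_hat = rng.normal(size=(n, L))               # V_hat initialized randomly
x = np.array([1.0, -2.0, 0.5])
r = np.array([0.2, -0.1])
W_hat, V_hat = tuning_step(W_hat, V_hat, x, r, np.eye(L), np.eye(n), kappa=0.1, dt=0.01)
print(W_hat.shape, V_hat.shape)
```

Note that when r = 0 every term vanishes, so the weights stop changing once the filtered tracking error is zero.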

Page 24

[Figure: W2 weights versus time for the desired trajectory x_d = [0.5sin(t) 0.5cos(t)]^T.]

NN weights converge to the best learned values for the given system

Page 25

NN Friction Compensator

Trajectory Tracking Controller

[Figure: desired trajectory, position and velocity, versus Time (second).]

[Figure: tracking errors, (a) position and (b) velocity, in Length (meter) versus Time (second); solid = fixed gain controller, dashed = NN controller.]

Page 26

Dynamic NN and Passivity

Static NN => Dynamic NN Feedback Controller

[Figure: NN robot controller with tracking loop (e through [Λ I] to r, gain Kv, torque τ), nonlinear inner loop with NN estimate \hat{f}(x), feedforward loop from qd and \ddot{q}_d, and robust control term v(t).]

Page 27

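Closed-Loop System wrt Neural Network is a Dynamic (Recursive NN)

[Figure: block diagram with blocks H(s), B, 1/s, A, C and signals u1, u2, \dot{x}, x.]

Discrete time case

$$x_{k+1} = A x_k + W^T\sigma(V^T x_k) + u_k$$

The backprop tuning algorithms

$$\dot{\hat{W}} = F\hat{\sigma} r^T - F\hat{\sigma}'\hat{V}^T x r^T, \qquad \dot{\hat{V}} = Gx(\hat{\sigma}'^T\hat{W} r)^T$$

make the closed-loop system passive

The enhanced tuning algorithms

$$\dot{\hat{W}} = F\hat{\sigma} r^T - F\hat{\sigma}'\hat{V}^T x r^T - \kappa F\|r\|\hat{W}, \qquad \dot{\hat{V}} = Gx(\hat{\sigma}'^T\hat{W} r)^T - \kappa G\|r\|\hat{V}$$

make the closed-loop system state-strict passive

SSP gives extra robustness properties to disturbances and HF dynamics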

Page 28

What about practical Systems?

Force Control
Flexible pointing systems
Vehicle active suspension

SBIR Contracts

Page 29

Flexible Systems with Vibratory Modes

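Flexible link pointing system

$$\begin{bmatrix} M_{rr} & M_{rf} \\ M_{fr} & M_{ff} \end{bmatrix}\begin{bmatrix} \ddot{q}_r \\ \ddot{q}_f \end{bmatrix} + \begin{bmatrix} V_{rr} & V_{rf} \\ V_{fr} & V_{ff} \end{bmatrix}\begin{bmatrix} \dot{q}_r \\ \dot{q}_f \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & K_{ff} \end{bmatrix}\begin{bmatrix} q_r \\ q_f \end{bmatrix} + \begin{bmatrix} F_r \\ 0 \end{bmatrix} + \begin{bmatrix} G_r \\ 0 \end{bmatrix} = \begin{bmatrix} B_r \\ B_f \end{bmatrix}\tau$$

Top row: rigid dynamics. Bottom row: flexible dynamics. Terms: acceleration, velocity, position, flex. modes.

Problem- only one control input !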

Page 30

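Neural network controller for Flexible-Link robot arm

[Figure: tracking loop (errors e, \dot{e} through [Λ I] to r, gain Kv, robust control term v(t)); nonlinear inner loop with NN estimate \hat{f}(x) driven by qd, \dot{q}_d, \ddot{q}_d and qr, \dot{q}_r; fast vibration suppression loop with fast PD gains on the flexible modes qf, \dot{q}_f, the manifold equation, Br^-1, and signals τ, τF, ξ; all acting on the robot system.]

Singular Perturbations
Add an extra feedback loop
Use passivity to show stability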

Page 31

Coupled Systems

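Robot mechanical dynamics

$$M(q)\ddot{q} + V_m(q,\dot{q})\dot{q} + F(\dot{q}) + G(q) + \tau_d = K_T i$$

Motor electrical dynamics

$$L\dot{i} + R(i,\dot{q}) + \tau_e = u_e$$

Problem- only one control input !

Vehicle Active Suspension control

[Figure: quarter-car model. Sprung mass (car body) ms with displacement zs; unsprung mass (tire) mu with displacement zu; actuator force F acting between them (+F, -F); tire stiffness Kt; terrain input zr.]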

Page 32

Neural network backstepping controller for Flexible-Joint robot arm

[Figure: tracking loop (errors e, \dot{e} through [Λ I] to r, gain Kr, robust control term vi(t)); nonlinear FB linearization loop with NN#1 estimate \hat{F}_1(x) driven by qd, \dot{q}_d, \ddot{q}_d and qr, \dot{q}_r; gain 1/KB1 producing the desired current id; backstepping loop with error η, gain Kη and NN#2 estimate \hat{F}_2(x) producing the input ue to the robot system with current i.]

Backstepping

Advantages over traditional Backstepping- no regression functions needed

Add an extra feedback loop
Two NN needed
Use passivity to show stability

Page 33

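Actuator Nonlinearities

[Figure: Deadzone characteristic τ = D(u) with breakpoints d+ and -d- and slopes m+ and m-. Backlash characteristic between u and τ with widths d+, d- and slope m.]

[Figure: the feedback controller issues the applied control inputs, which pass through the actuator nonlinearity to become the actual control inputs to the system; the system outputs are fed back.]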
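The deadzone characteristic τ = D(u) can be written as a piecewise-linear function. The sketch below assumes the usual nonsymmetric deadzone model with breakpoints d+ and -d- and slopes m+ and m-, and it uses an exact inverse built from known parameters to show what a precompensator has to achieve; the numeric parameter values and function names are hypothetical. In the talk the parameters are unknown and an NN learns the compensation instead.

```python
def deadzone(u, d_plus, d_minus, m_plus, m_minus):
    """Nonsymmetric deadzone: zero output for -d_minus < u < d_plus."""
    if u >= d_plus:
        return m_plus * (u - d_plus)
    if u <= -d_minus:
        return m_minus * (u + d_minus)
    return 0.0

def deadzone_inverse(w, d_plus, d_minus, m_plus, m_minus):
    """Ideal precompensator (known parameters): returns u such that D(u) = w."""
    if w > 0.0:
        return w / m_plus + d_plus
    if w < 0.0:
        return w / m_minus - d_minus
    return 0.0

params = dict(d_plus=0.5, d_minus=0.3, m_plus=1.0, m_minus=2.0)   # assumed values
for w in (-2.0, -0.1, 0.3, 1.5):
    u = deadzone_inverse(w, **params)
    print(w, "->", u, "->", deadzone(u, **params))
```

Small commands inside the band produce no torque at all, which is the "chops out the middle" effect shown in the performance plots.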

Page 34

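NN in Feedforward Loop- Deadzone Compensation

[Figure: tracking loop (qd, e, [Λ^T I], r, gain Kv, robust term v) with an estimate of the nonlinear function \hat{f}(x) and feedforward \ddot{q}_d producing w; an NN deadzone precompensator with two networks, I and II, produces u, which passes through the deadzone D(u) to give the torque τ applied to the mechanical system with output q.]

$$\dot{\hat{W}}_i = T\sigma_i(U_i^T w)\, r^T \hat{W}^T \sigma'(U^T u)\, U^T - k_1 T\|r\|\hat{W}_i - k_2 T\|r\|\,\|\hat{W}_i\|\,\hat{W}_i$$

$$\dot{\hat{W}} = -S\sigma'(U^T u)\, U^T \hat{W}_i \sigma_i(U_i^T w)\, r^T - k_1 S\|r\|\hat{W}$$

Acts like a 2-layer NN with enhanced backprop tuning!

little critic network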

Page 35

Performance Results

PD control- deadzone chops out the middle
[Figure: x2(k) and e2(k) versus time, 0 to 15.]

NN control fixes the problem
[Figure: x2(k) and e2(k) versus time, 0 to 15.]

Page 36

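Dynamic inversion NN compensator for system with Backlash

[Figure: tracking loop (xd, e, [Λ^T I], r, gain Kv, robust term v1) with an estimate of the nonlinear function \hat{f}(x), feedforward yd^(n) through [0 Λ^T], and desired torque τ_des; backstepping loop with filter, \dot{τ}_des, gain Kb, robust term v2, NN compensator output y_nn, integrator 1/s, and the backlash element producing τ for the nonlinear system with state x.]

U.S. patent- Selmic, Lewis, Calise, McFarland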

Page 37

Performance Results

PD control- backlash chops off tops & bottoms
[Figure: PD controller with backlash. Position x1(t) and velocity x2(t) versus time, 0 to 10, with errors e1(t) and e2(t).]

NN control fixes the problem
[Figure: PD controller with NN backlash compensation. Position x1(t) and tracking error e1(t) versus time, 0 to 10.]

Page 38

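NN Observers
Needed when all states are not measured

[Figure: a neural network observer with estimate \hat{h}_o(\hat{x}_1, \hat{x}_2), gains kD, kp and M^-1(.), driven by τ(t) and the output error \tilde{x}_1, produces \hat{x}_1, \hat{x}_2 (via \hat{z}_2); a neural network controller with estimate \hat{h}_c(\hat{x}_1, \hat{x}_2), gains Kv and [λ I] and robust term vc uses \hat{r}(t) to drive the ROBOT with output x1.]

Neural Network Observer

$$\dot{\hat{z}}_1 = \hat{x}_2 + k_D\tilde{x}_1$$

$$\dot{\hat{z}}_2 = \hat{W}_o^T\sigma_o(\hat{x}_1,\hat{x}_2) + M^{-1}(x_1)\tau(t) + K\tilde{x}_1$$

$$\dot{\hat{W}}_o = -k_D F_o\sigma_o(\hat{x})\tilde{x}_1^T - \kappa_o F_o\|\tilde{x}_1\|\hat{W}_o - \kappa_o F_o\hat{W}_o$$

Neural Network Controller

$$\tau(t) = \hat{W}_c^T\sigma_c(\hat{x}_1,\hat{x}_2) + K_v\hat{r}(t) - v_c(t)$$

$$\dot{\hat{W}}_c = F_c\sigma_c(\hat{x}_1,\hat{x}_2)\hat{r}^T - \kappa_c F_c\|\hat{r}\|\hat{W}_c$$

$$\hat{r}(t) = \dot{\hat{e}}(t) + \Lambda e(t)$$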

Page 39

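NN Control for Discrete Time Systems

dynamics

$$x(k+1) = f(x(k)) + g(x(k))\,u(k)$$

NN Tuning

$$\hat{W}_i(k+1) = \hat{W}_i(k) - \alpha_i\hat{\varphi}_i(k)\hat{y}_i^T(k) - \Gamma\,\|I - \alpha_i\hat{\varphi}_i(k)\hat{\varphi}_i^T(k)\|\,\hat{W}_i(k)$$

$$\hat{y}_i(k) \equiv \hat{W}_i^T(k)\hat{\varphi}_i(k) + K_v r(k) \ \text{ for } i = 1,\ldots,N-1, \qquad \hat{y}_N(k) \equiv r(k+1) \ \text{ for last layer}$$

Error-based tuning

Gradient descent with momentum

Extra robust term

U.S. Patent- Jagannathan, Lewis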

Page 40

Neural Network Properties

Learning

Recall

Function approximation

Generalization

Classification

Association

Pattern recognition

Clustering

Robustness to single node failure

Repair and reconfiguration

Nervous system cell. http://www.sirinet.net/~jgjohnso/index.html

USED

???

Page 41

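Relation Between Fuzzy Systems and Neural Networks

FL Membership Functions for 2-D Input Vector x

[Figure: membership functions, ranging between 0 and 1, along the x1 axis (sets X1^i, X1^{i+1} centered at x1^i, x1^{i+1}) and along the x2 axis (sets X2^j, X2^{j+1} centered at x2^j, x2^{j+1}).]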

Page 42

Separable Gaussian activation functions for RBF NN

Separable triangular activation functions for CMAC NN

Page 43

Two-layer NN as FL System

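[Figure: two-layer NN with inputs x1, x2, ..., xn, a hidden layer of neurons 1, 2, 3, ..., L with activation σ(.), weights V^T and W^T, and outputs y1, y2, ..., ym. Standard thresholds θ1, θ2 are shown alongside vector thresholds θ11, θ12, ..., θ1n.]

FL system = NN with VECTOR thresholds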

Page 44

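Fuzzy Logic Controllers

[Figure: the desired trajectory xd(t) and the state x(t) form the error e(t), which passes through [Λ^T I] to give r(t); the gain Kv and the fuzzy system (input membership functions, fuzzy rule base, output membership functions) computing \hat{g}(x, xd) produce the input to the controlled plant.]

Gaussian membership function

$$\mu_{A_i^l}(z, a_i^l, b_i^l) = e^{-(a_i^l)^2 (z - b_i^l)^2}$$

Tuning laws

$$\dot{\hat{W}} = K_W(\hat{\Phi} - A\hat{a} - B\hat{b})\, r^T - k_W K_W \hat{W}\|r\|$$

$$\dot{\hat{a}} = K_a A^T \hat{W} r - k_a K_a \hat{a}\|r\|$$

$$\dot{\hat{b}} = K_b B^T \hat{W} r - k_b K_b \hat{b}\|r\|$$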

Page 45

Dynamic Focusing of Awareness

Initial MFs

Final MFs

Page 46

Elastic Fuzzy Logic- c.f. P. Werbos
Weights importance of factors in the rules

$$\phi_B(z,a,b,c) = \phi(z,a,b)^{c^2}$$

$$\phi(z,a,b,c) = \left[\frac{\cos^2(a(z-b))}{1 + a^2(z-b)^2}\right]^{c^2}$$

[Figure: Effect of change of membership function spread "a".]

[Figure: Effect of change of membership function elasticities "c".]

Page 47

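Elastic Fuzzy Logic Control

[Figure: the desired trajectory xd(t) and the state x(t) form the error e(t), which passes through [Λ^T I] to give r(t); the gain Kv and the fuzzy system (input membership functions, fuzzy rule base, output membership functions) computing \hat{g}(x, xd) produce the input to the controlled plant.]

Control

$$u(t) = -K_v r - \hat{g}(x, x_d)$$

Tune Membership Functions

$$\dot{\hat{a}} = K_a A^T \hat{W} r - k_a K_a \hat{a}\|r\|$$

$$\dot{\hat{b}} = K_b B^T \hat{W} r - k_b K_b \hat{b}\|r\|$$

$$\dot{\hat{c}} = K_c C^T \hat{W} r - k_c K_c \hat{c}\|r\|$$

Tune Control Rep. Values

$$\dot{\hat{W}} = K_W(\hat{\Phi} - A\hat{a} - B\hat{b} - C\hat{c})\, r^T - k_W K_W \hat{W}\|r\|$$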

Page 48

Better Performance

Page 49

Fuzzy Logic Critic NN controller

[Figure: an action generating NN with estimate \hat{f}(x) produces the control u(t) for the unknown plant with state x(t) and disturbance d(t); a performance evaluator compares x(t) with the desired trajectory and outputs the instantaneous utility r(t); the FL critic converts r(t) into the signal R(t) used for tuning the action generating NN.]

Page 50

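Learning FL Critic Controller

[Figure: an action generating NN (two-layer network with inputs x1, ..., xn, hidden layer 1, ..., L, weights V^T and W^T, outputs y1, ..., ym) computes \hat{g}(x), which is combined with the gain Kv term and robust term v(t) to give u(t) for the unknown plant with state x(t) and disturbance d(t); a performance evaluator forms r from x(t) and the reference xd(t); the FL critic (input membership functions, fuzzy rule base, output membership functions, weights \hat{W}_1; \hat{V}_1, internal signal ρ with integrator) produces R. Equation labels (6-14) and (6-15) appear in the diagram.]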

,ˆ)ˆ('ˆˆ1111 VrVWrHV TTT Φμ −−=

&

,ˆ)ˆ(ˆ1111 WRrVW TT Γμ −−=

&

2211122222ˆˆ)ˆ('ˆ)()(ˆ WVrVWRrW TTTTT ΓμχσΓχσΓ −−=

&

Tune Action generating NN (controller)

Tune Fuzzy Logic Critic

FL Critic

Action generating NN

Critic requires MEMORY

Page 51

Reinforcement Learning NN Controller

[Figure: the user input (reference signal qd(t)) and the plant output q(t) enter a performance measurement mechanism that outputs the utility r(t); a critic element converts r(t) into the reinforcement signal R(t); the action generating neural net (input preprocessing, input layer x1, ..., xn, hidden layer of σ(.) units z1 = 1, z2, ..., zN, weights W, output layer y1, ..., ym) computes \hat{g}(x), which together with the Kv term and the robust term v(t) forms the control action u(t) applied to the PLANT with disturbance d(t).]

Page 52

High-Level NN Controllers Need Exotic Lyapunov Fns.

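Reinforcement NN control

Simplified critic signal

$$R(t) = \mathrm{sgn}(r(t)) = \pm 1$$

Tuning Law only contains R(t)

$$\dot{\hat{W}} = F\sigma(x)R^T - \kappa F\hat{W}$$

Lyapunov Fn

$$L(t) = \sum_{i=1}^{n} |r_i| + \tfrac{1}{2}\mathrm{tr}(\tilde{W}^T F^{-1}\tilde{W})$$

Lyap. Deriv. contains R(t) !!

$$\dot{L} = \mathrm{sgn}(r)^T\dot{r} + \mathrm{tr}(\tilde{W}^T F^{-1}\dot{\tilde{W}})$$

A smooth alternative:

$$L(t) = \ln(1 + e^{\alpha r(t)}) + \ln(1 + e^{-\alpha r(t)}) + \tfrac{1}{2}\mathrm{tr}(\tilde{W}^T F^{-1}\tilde{W})$$

$$\dot{L} = \dot{r}^T\!\left(\frac{\alpha}{1 + e^{-\alpha r(t)}} - \frac{\alpha}{1 + e^{\alpha r(t)}}\right) + \mathrm{tr}(\tilde{W}^T F^{-1}\dot{\tilde{W}})$$

Adaptive Reinforcement Learning

Critic is output of NN #1

$$R = \hat{W}_1^T\sigma(\chi_1) + \rho$$

Action is output of second NN

$$\hat{g}(x, x_d) = \hat{W}_2^T\sigma(\chi_2)$$

$$\dot{\hat{W}}_1 = -\sigma(\chi_1)R^T - \hat{W}_1$$

$$\dot{\hat{W}}_2 = \Gamma\,\sigma(\chi_2)\left(r + V_1\sigma'(\chi_1)\hat{W}_1 R\right)^T - \Gamma\hat{W}_2$$

The tuning algorithm treats this as a SINGLE 2-layer NN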

Page 53

Principe- Entropy

0000 ),(ln),())(),,(,( dxduuxpuxpuptxuxH ∫ ∫−=

Renyi’s entropy
Correntropy

Brockett- Minimum-Attention Control
awareness & effort (partial derivatives in PM)

$$V(x,u) = \int_0 r(x,u)\,dt + \iint \left[ a\left(\frac{\partial u}{\partial t}\right)^2 + b\left(\frac{\partial u}{\partial x}\right)^2 \right] dx\,dt$$

Encode Information into the Value Function

Information-Theoretic Learning

Page 54

1. Neural Networks for Feedback Control (Before)

Based on FB Control Approach
Unknown system dynamics
On-line tuning

2. Neural Network Solution of Optimal Design Equations

Nearly Optimal Control
Based on HJ Optimal Design Equations
Known system dynamics
Preliminary Off-line tuning

Page 55

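H-Infinity Control Using Neural Networks

L2 Gain Problem

[Figure: system with inputs disturbance d and control u, outputs performance output z and measured output y, closed by the feedback u = l(y).]

System

$$\dot{x} = f(x) + g(x)u + k(x)d, \qquad y = x, \qquad z = \psi(x,u)$$

where

$$\|z\|^2 = h^T h + \|u\|^2$$

Find control u(t) so that

$$\frac{\int_0^\infty \|z(t)\|^2\,dt}{\int_0^\infty \|d(t)\|^2\,dt} = \frac{\int_0^\infty (h^T h + \|u\|^2)\,dt}{\int_0^\infty \|d(t)\|^2\,dt} \le \gamma^2$$

for all L2 disturbances and a prescribed gain $\gamma^2$.

Zero-Sum differential game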

Page 56

Standard Bounded L2 Gain Problem

Take $\|u\|^2 = u^T R u$ and $\|d\|^2 = d^T d$

Game theory value function

$$J(u,d) = \int_0^\infty \left( h^T h + \|u\|^2 - \gamma^2\|d\|^2 \right) dt$$

Hamilton-Jacobi Isaacs (HJI) equation

$$0 = V_x^T f + h^T h - \tfrac{1}{4} V_x^T g R^{-1} g^T V_x + \tfrac{1}{4\gamma^2} V_x^T k k^T V_x$$

Stationary Point

Optimal control: $u^* = -\tfrac{1}{2} R^{-1} g^T(x) V_x$

Worst-case disturbance: $d^* = \tfrac{1}{2\gamma^2} k^T(x) V_x$

If HJI has a positive definite solution V and the associated closed-loop system is AS, then the L2 gain is bounded by $\gamma^2$

Problems to solve HJI
Beard proposed a successive solution method using Galerkin approx.

Viscosity Solution
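For a scalar linear system the HJI equation reduces to a quadratic that can be solved by hand, which gives a quick sanity check on the formulas above. The following sketch assumes the plant \dot{x} = a x + b u + k d with h = c x, R = 1 and a quadratic value function V = p x^2; the numbers and the function names are illustrative assumptions. Substituting V_x = 2 p x into the HJI equation gives 0 = 2 a p + c^2 - p^2 (b^2 - k^2/γ^2).

```python
import math

def solve_scalar_hji(a, b, k, c, gamma):
    """Positive root p of 0 = 2 a p + c^2 - p^2 (b^2 - k^2/gamma^2), with V = p x^2."""
    s = b * b - (k * k) / (gamma * gamma)
    if s <= 0.0:
        raise ValueError("gamma too small: b^2 - k^2/gamma^2 must be positive here")
    return (a + math.sqrt(a * a + s * c * c)) / s

def hji_residual(p, a, b, k, c, gamma):
    return 2.0 * a * p + c * c - p * p * (b * b - (k * k) / (gamma * gamma))

def closed_loop_pole(p, a, b, k, gamma):
    """Pole with u* = -b p x and worst-case d* = k p x / gamma^2 both applied."""
    return a - b * b * p + (k * k) * p / (gamma * gamma)

a, b, k, c, gamma = 1.0, 1.0, 1.0, 1.0, 2.0      # assumed values
p = solve_scalar_hji(a, b, k, c, gamma)
print("p =", p)
print("residual =", hji_residual(p, a, b, k, c, gamma))
print("closed-loop pole =", closed_loop_pole(p, a, b, k, gamma))
```

In this scalar case the closed-loop pole works out to -sqrt(a^2 + (b^2 - k^2/γ^2) c^2), so the loop is stable whenever the positive solution exists.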

Page 57

Bounded L2 Gain Problem for Constrained Input Systems

This is a quasi-norm

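Control constrained by saturation function φ(.)

[Figure: saturation function tanh(p) versus p, bounded between -1 and 1.]

$$\|u\|_q^2 = 2\int_0^u \phi^{-T}(\nu)\,d\nu$$

This is a quasi-norm

Weaker than a norm: the homogeneity property is replaced by the weaker symmetry property $\|x\|_q = \|-x\|_q$

(Used by Lyshevsky for H2 control)

Encode constraint into Value function

$$J(u,d) = \int_0^\infty \left( h^T h + 2\int_0^u \phi^{-T}(\nu)\,d\nu - \gamma^2\|d\|^2 \right) dt$$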
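For a scalar control with φ = tanh, the quasi-norm integral has the closed form 2u atanh(u) + ln(1 - u^2), which follows from integrating atanh by parts. The sketch below, with hypothetical function names and an assumed midpoint-rule quadrature, checks that closed form numerically and illustrates the two properties named on the slide: symmetry holds, homogeneity does not.

```python
import math

def quasi_norm_sq(u, n=20000):
    """2 * integral_0^u atanh(v) dv by the midpoint rule, valid for |u| < 1."""
    h = u / n
    return 2.0 * h * sum(math.atanh((i + 0.5) * h) for i in range(n))

def quasi_norm_sq_closed(u):
    """Closed form of the same integral for phi = tanh."""
    return 2.0 * u * math.atanh(u) + math.log(1.0 - u * u)

print(quasi_norm_sq(0.9), quasi_norm_sq_closed(0.9))
print("symmetry:", quasi_norm_sq_closed(0.5), quasi_norm_sq_closed(-0.5))
print("not homogeneous:", quasi_norm_sq_closed(0.8), 4.0 * quasi_norm_sq_closed(0.4))
```

The penalty grows without bound as |u| approaches 1, which is how the saturation limit gets encoded in the value function.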

Page 58

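Hamiltonian

$$H(x, V_x, u, d) \equiv V_x^T (f + g u + k d) + h^T h + 2\int_0^u \phi^{-T}(\nu)\,d\nu - \gamma^2 d^T d$$

Stationarity conditions (the first follows from Leibniz's formula):

$$0 = \frac{\partial H}{\partial u} = g^T V_x + 2\phi^{-1}(u)$$

$$0 = \frac{\partial H}{\partial d} = k^T V_x - 2\gamma^2 d$$

Optimal inputs (solve for u(t)):

$$u^* = -\phi\left( \tfrac{1}{2} g^T(x) V_x \right)$$

Note u(t) is bounded!

$$d^* = \frac{1}{2\gamma^2} k^T(x) V_x$$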
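A small numerical check of the stationarity condition in u may help. This sketch assumes a scalar input and the saturation function tanh; the symbol s stands in for the scalar g^T(x) V_x and is a placeholder, not a quantity defined in the slides.

```python
import math

def u_star(s):
    """Constrained control u* = -tanh(s/2), where s stands for g^T(x) V_x (scalar)."""
    return -math.tanh(0.5 * s)

def stationarity_residual(s):
    """Residual of 0 = g^T V_x + 2*phi^{-1}(u), evaluated at u = u*."""
    return s + 2.0 * math.atanh(u_star(s))

for s in (-6.0, -1.0, 0.0, 0.5, 6.0):
    print(s, u_star(s), stationarity_residual(s))
```

However large s becomes, the control stays inside [-1, 1], which is the boundedness noted above.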


Cannot solve HJI !!

Successive Solution: Algorithm 1
CT Policy Iteration for H-Infinity Control (c.f. Howard)

Let $\gamma$ be prescribed and fixed.
Let $u_0$ be a stabilizing control with region of asymptotic stability (RAS) $\Omega_0$.

1. Outer loop: update control.
Initial disturbance $d^0 = 0$.

2. Inner loop: update disturbance.
Solve the Value Equation (consistency equation)

$$\left(\frac{\partial V_j^i}{\partial x}\right)^T (f + g u_j + k d^i) + h^T h + 2\int_0^{u_j} \phi^{-T}(\nu)\,d\nu - \gamma^2 (d^i)^T d^i = 0$$

Inner loop update:

$$d^{i+1} = \frac{1}{2\gamma^2} k^T(x) \frac{\partial V_j^i}{\partial x}$$

Go to 2. Iterate i until convergence to $d^\infty$, $V_j^\infty$ with RAS $\Omega_j^\infty$.

Outer loop update:

$$u_{j+1} = -\phi\left( \frac{1}{2} g^T(x) \frac{\partial V_j}{\partial x} \right)$$

Go to 1. Iterate j until convergence to $u_\infty$, $V_\infty^\infty$ with RAS $\Omega_\infty^\infty$.


Results for this Algorithm

For every iteration on the disturbance $d^i$ one has
$V_j^i \le V_j^{i+1}$: the value function increases
$\Omega_j^i \supseteq \Omega_j^{i+1}$: the RAS decreases

For every iteration on the control $u_j$ one has
$V_j^\infty \ge V_{j+1}^\infty$: the value function decreases
$\Omega_j^\infty \subseteq \Omega_{j+1}^\infty$: the RAS does not decrease

The algorithm converges to $V^*(\Omega_0)$, $\Omega_0$, $u^*(\Omega_0)$, $d^*(\Omega_0)$, the optimal solution on the RAS $\Omega_0$.

Sometimes the algorithm converges to the optimal HJI solution $V^*$, $\Omega^*$, $u^*$, $d^*$.
For this to occur it is required that $\Omega^* \subseteq \Omega_0$.


Neural Network Approximation for Computational Technique

Problem: Cannot solve the Value Equation!

Neural Network to approximate $V^{(i)}(x)$:

$$V_L^{(i)}(x) = \sum_{j=1}^{L} w_j^{(i)} \sigma_j(x) = \left(W_L^{(i)}\right)^T \sigma_L(x)$$

Value function gradient approximation is

$$\frac{\partial V_L^{(i)}}{\partial x} = \left(\frac{\partial \sigma_L(x)}{\partial x}\right)^T W_L^{(i)} = \nabla\sigma_L^T(x)\, W_L^{(i)}$$

Substitute into the Value Equation to get

$$0 = (w_j^i)^T \nabla\sigma(x)\,\dot{x} + r(x,u_j,d^i) = (w_j^i)^T \nabla\sigma(x)\, f(x,u_j,d^i) + h^T h + \|u_j\|^2 - \gamma^2 \|d^i\|^2$$

Therefore, one may solve for the NN weights at iteration (i, j).
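One way to solve for the weights at a given iteration is least squares over sampled states. The sketch below is only an illustration under stated assumptions: a scalar closed loop x' = a_c*x with the policies already substituted, a cost rate r = q*x^2, and a two-term basis sigma(x) = [x^2, x^4]. None of these choices come from the slides.

```python
def fit_value_weights(a_c, q, xs):
    """Least-squares weights w for V(x) = w[0]*x^2 + w[1]*x^4.

    Assumed scalar closed loop (illustrative): x' = a_c*x, cost rate r = q*x^2.
    Each sample x contributes one row of 0 = w^T grad_sigma(x) * f(x) + r(x).
    """
    # Accumulate the 2x2 normal equations (A^T A) w = A^T b.
    s11 = s12 = s22 = t1 = t2 = 0.0
    for x in xs:
        f = a_c * x
        row = (2.0 * x * f, 4.0 * x ** 3 * f)   # grad_sigma(x) * f(x)
        rhs = -q * x * x                        # r(x) moved to the right-hand side
        s11 += row[0] * row[0]
        s12 += row[0] * row[1]
        s22 += row[1] * row[1]
        t1 += row[0] * rhs
        t2 += row[1] * rhs
    det = s11 * s22 - s12 * s12
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)

xs = [i / 10.0 for i in range(-10, 11) if i != 0]
w = fit_value_weights(a_c=-2.0, q=2.0, xs=xs)
print(w)
```

For this linear case the exact value is quadratic, V(x) = -q/(2*a_c) * x^2, so the fit should return a first weight near 0.5 and a second weight near zero.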


Neural Network Feedback Controller

Optimal Solution

$$d = \frac{1}{2\gamma^2} k^T(x)\, \nabla\sigma_L^T W_L$$

$$u = -\phi\left( \frac{1}{2} g^T(x)\, \nabla\sigma_L^T W_L \right)$$

A NN feedback controller with nearly optimal weights


Example: Linear system

$$\begin{bmatrix}\dot{x}_1\\ \dot{x}_2\end{bmatrix} = \begin{bmatrix}0 & -0.5\\ 1 & 1.5\end{bmatrix}\begin{bmatrix}x_1\\ x_2\end{bmatrix} + \begin{bmatrix}0\\ 1\end{bmatrix}u, \qquad |u| \le 1$$

$$\begin{aligned}
V_{15}(x_1,x_2) ={}& w_1 x_1^2 + w_2 x_2^2 + w_3 x_1 x_2 + w_4 x_1^4 \\
&+ w_5 x_2^4 + w_6 x_1^3 x_2 + w_7 x_1^2 x_2^2 + w_8 x_1 x_2^3 + w_9 x_1^6 + w_{10} x_2^6 \\
&+ w_{11} x_1^5 x_2 + w_{12} x_1^4 x_2^2 + w_{13} x_1^3 x_2^3 + w_{14} x_1^2 x_2^4 + w_{15} x_1 x_2^5
\end{aligned}$$

Activation functions = even polynomial basis up to order 6


RAS found by integrating $\dot{x} = -f(x)$, that is, in reverse time $dt = -d\tau$.

[Figure panels: Initial gain found by LQR; Optimal NN solution]


Rotational-Translational Actuator Benchmark Problem

Control input is torque N; F is a disturbance


Rotational-Translational Actuator Benchmark Problem

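$$\dot{x} = f(x) + g(x)u + k(x)d$$

$$f(x) = \begin{bmatrix} x_2 \\[4pt] \dfrac{-x_1 + \varepsilon x_4^2 \sin x_3}{1 - \varepsilon^2 \cos^2 x_3} \\[8pt] x_4 \\[4pt] \dfrac{\varepsilon \cos x_3\,(x_1 - \varepsilon x_4^2 \sin x_3)}{1 - \varepsilon^2 \cos^2 x_3} \end{bmatrix}, \qquad g(x) = \begin{bmatrix} 0 \\[4pt] \dfrac{-\varepsilon \cos x_3}{1 - \varepsilon^2 \cos^2 x_3} \\[8pt] 0 \\[4pt] \dfrac{1}{1 - \varepsilon^2 \cos^2 x_3} \end{bmatrix}$$

$$\varepsilon = 0.2$$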


Minimum-Time Control

[Figure: State evolution for both controllers, plotted as x2 against x1. Curves: switching surface method; nonquadratic functionals method.]

Encode into the value function:

$$V = \int_0^\infty \left[ \tanh(x^T Q x) + 2\int_0^u \left(\phi^{-1}(\mu)\right)^T R\, d\mu \right] dt$$


Before:

1. Neural Networks for Feedback Control
Based on FB control approach
Unknown system dynamics
On-line tuning

2. Neural Network Solution of Optimal Design Equations
Nearly optimal control
Based on HJ optimal design equations
Known system dynamics
Preliminary off-line tuning

3. Approximate Dynamic Programming
Nearly optimal control
Based on recursive equation for the optimal value
Usually known system dynamics (except Q learning)

The Goal: unknown dynamics, on-line tuning


IEEE Trans. Neural Networks
Special Issue on Neural Networks for Feedback Control

Lewis, Wunsch, Prokhorov, Jie Huang, Parisini

Due date 1 December

Bring together:
Feedback control system community
Approximate Dynamic Programming community
Neural Network community


Discrete-Time Systems

$$x_{k+1} = f(x_k, u_k)$$

Value:

$$V(x_k) = \sum_{i=k}^{N} \gamma^{\,i-k}\, r(x_i, u_i)$$

Value in difference form (recursive form, consistency equation):

$$V_h(x_k) = r(x_k, h(x_k)) + \gamma V_h(x_{k+1})$$

Howard Policy Iteration: iterate the following until convergence.

1. Find the value for the prescribed policy (solve completely):

$$V_j(x_k) = r(x_k, h_j(x_k)) + \gamma V_j(x_{k+1})$$

2. Policy improvement:

$$h_{j+1}(x_k) = \arg\min_{u_k} \left( r(x_k, u_k) + \gamma V_j(x_{k+1}) \right)$$
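The two steps of Howard policy iteration can be sketched on a toy problem. Everything specific here is an assumption for illustration: five integer states, three actions, deterministic dynamics clamped to the state range, utility r(x, u) = x^2 + u^2, and discount 0.9. Step 1 is carried out by sweeping the consistency equation until it has effectively converged.

```python
def howard_policy_iteration(n_states=5, actions=(-1, 0, 1), gamma=0.9, sweeps=500):
    """Howard policy iteration on a small deterministic system (illustrative).

    Assumed dynamics: x_{k+1} = clamp(x_k + u_k, 0, n_states - 1).
    Assumed utility: r(x, u) = x^2 + u^2.
    """
    def step(x, u):
        return min(max(x + u, 0), n_states - 1)

    def r(x, u):
        return float(x * x + u * u)

    policy = [0] * n_states                 # initial policy h_0: stay put
    while True:
        # 1. Value of the current policy: iterate the consistency equation.
        V = [0.0] * n_states
        for _ in range(sweeps):
            V = [r(x, policy[x]) + gamma * V[step(x, policy[x])]
                 for x in range(n_states)]
        # 2. Policy improvement: greedy action with respect to V.
        improved = [min(actions, key=lambda u: r(x, u) + gamma * V[step(x, u)])
                    for x in range(n_states)]
        if improved == policy:
            return policy, V
        policy = improved

policy, V = howard_policy_iteration()
print(policy, V)
```

On this toy problem the iteration settles after one improvement step on the policy that moves toward state 0 and then stays there.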


Four ADP Methods proposed by Werbos

Critic NN to approximate:
Value $V(x_k)$: Heuristic dynamic programming
Gradient $\partial V / \partial x$: Dual heuristic programming
Q function $Q(x_k, u_k)$: AD (action dependent) Heuristic dynamic programming (Watkins Q Learning)
Gradients $\partial Q / \partial x$, $\partial Q / \partial u$: AD Dual heuristic programming

Action NN to approximate the Control

Bertsekas: Neurodynamic Programming

Barto & Bradtke: Q-learning proof (imposed a settling time)


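Continuous-Time Systems

$$\dot{x} = f(x) + g(x)u + k(x)d, \qquad y = x, \qquad z = \psi(x,u), \qquad u = l(y)$$

Value:

$$V(x(t)) = \int_t^T r(x,u,d)\,dt$$

Value in differential form (consistency equation):

$$0 = \left(\frac{\partial V}{\partial x}\right)^T (f + g u + k d) + r(x,u,d) \equiv H\!\left(x, \frac{\partial V}{\partial x}, u, d\right)$$

Compare the discrete-time consistency equation $V_h(x_k) = r(x_k, h(x_k)) + \gamma V_h(x_{k+1})$.

Optimal control and worst-case disturbance:

$$u^*(x(t)) = -\frac{1}{2} g^T(x) \frac{\partial V^*}{\partial x}, \qquad d^*(x(t)) = \frac{1}{2\gamma^2} k^T(x) \frac{\partial V^*}{\partial x}$$

HJB equation:

$$0 = \left(\frac{dV^*}{dx}\right)^T f + h^T h - \frac{1}{4} \left(\frac{dV^*}{dx}\right)^T g g^T \left(\frac{dV^*}{dx}\right) + \frac{1}{4\gamma^2} \left(\frac{dV^*}{dx}\right)^T k k^T \left(\frac{dV^*}{dx}\right)$$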


Continuous Time Policy Iteration

Select a stabilizing initial control.

1. Outer loop: update control.
Initial disturbance set to zero.

2. Inner loop: update disturbance.
Solve the Lyapunov equation

$$\left(\frac{\partial V_j^i}{\partial x}\right)^T (f + g u_j + k d^i) + h^T h + \|u_j\|^2 - \gamma^2 \|d^i\|^2 = 0$$

Inner loop disturbance update:

$$d^{i+1} = \frac{1}{2\gamma^2} k^T(x) \frac{\partial V_j^i}{\partial x}$$

Go to 2 until convergence.

Outer loop update:

$$u_{j+1} = -\frac{1}{2} g^T(x) \frac{\partial V_j^i}{\partial x}$$

Go to 1 until convergence.

Abu-Khalaf and Lewis: H-infinity
Saridis: H2
c.f. Howard work in DT systems
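The two nested loops can be traced on a scalar linear case, where the Lyapunov equation reduces to one division. The sketch assumes x' = a*x + b*u + k*d, h = c*x, R = 1, linear policies u = -K*x and d = L*x, and a value function V(x) = p*x^2; the numbers are illustrative and not taken from the slides.

```python
def ct_policy_iteration(a, b, k, c, gamma, K0, outer_iters=20, inner_iters=60):
    """Two-loop policy iteration for V(x) = p*x**2 on a scalar linear system.

    Assumed system (illustrative): x' = a*x + b*u + k*d, h = c*x, with
    u = -K*x, d = L*x and R = 1. K0 must make a - b*K0 negative.
    """
    K, p, L = K0, 0.0, 0.0
    for _ in range(outer_iters):          # outer loop: update control
        L = 0.0                           # initial disturbance set to zero
        for _ in range(inner_iters):      # inner loop: update disturbance
            a_cl = a - b * K + k * L
            # Lyapunov equation: 2*p*a_cl + c^2 + K^2 - gamma^2*L^2 = 0
            p = (c * c + K * K - gamma ** 2 * L * L) / (-2.0 * a_cl)
            L = k * p / gamma ** 2        # d = (1/(2 gamma^2)) k^T dV/dx
        K = b * p                         # u = -(1/2) g^T dV/dx = -b*p*x
    return p, K, L

p, K, L = ct_policy_iteration(a=-1.0, b=1.0, k=1.0, c=1.0, gamma=2.0, K0=0.0)
m = 1.0 - 1.0 / 4.0                       # b^2 - k^2/gamma^2 for these values
p_hji = (-1.0 + (1.0 + m) ** 0.5) / m     # positive root of 0 = 2*a*p + c^2 - m*p^2
print(p, p_hji)
```

For these values the iterate p should agree with the positive root of the scalar HJI equation, which is what convergence of both loops would imply.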


Neural Network Approximation of Value Function

CT Approx Policy Iteration (Abu-Khalaf & Lewis)

$$\hat{V}_j^i(x, w_j^i) = (w_j^i)^T \sigma(x)$$

Lyapunov equation becomes

$$0 = (w_j^i)^T \nabla\sigma(x)\,\dot{x} + r(x,u_j,d^i) = (w_j^i)^T \nabla\sigma(x)\, f(x,u_j,d^i) + h^T h + \|u_j\|^2 - \gamma^2 \|d^i\|^2$$

Control action (CT nearly optimal NN feedback):

$$u^*(x) = -\frac{1}{2} R^{-1} g^T(x)\, \nabla\sigma^T(x)\, w^*$$

Nearly optimal FB control
Off-line tuning
Known dynamics


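Continuous-time adaptive critic
Abu-Khalaf & Lewis (c.f. Doya)

Hamiltonian (CT consistency check):

$$H\!\left(x, \frac{\partial V}{\partial x}, u\right) = \dot{V} + r(x,u) = \left(\frac{\partial V}{\partial x}\right)^T \dot{x} + r(x,u) = \left(\frac{\partial V}{\partial x}\right)^T f(x,u) + r(x,u) = 0$$

Critic NN (on-line tuning):

$$V(x) = w^T \sigma(x)$$

Residual equation error (the target value is zero):

$$\delta = \frac{d\,(w^T \sigma)}{dt} + r(x,u) = w^T \nabla\sigma(x)\,\dot{x} + r(x,u) = w^T \nabla\sigma(x)\, f(x,u) + r(x,u)$$

$$E = \tfrac{1}{2}\delta^2$$

Gradient:

$$\frac{\partial E}{\partial w} = \delta\,\frac{\partial \delta}{\partial w} = \nabla\sigma(x)\, f(x,u)\,\delta$$

Update weights using, e.g., gradient descent (or RLS):

$$\dot{w} = -\alpha\, \nabla\sigma(x)\, f(x,u)\,\delta$$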
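The gradient-descent critic update can be tried in a discretized form. This sketch replaces the continuous-time law with Euler-style steps taken at a few fixed sample states, and it assumes a scalar closed loop f(x,u) = a_c*x, a cost rate r = q*x^2, and a one-term critic V(x) = w*x^2. The sample states and step size are illustrative choices.

```python
def tune_critic(a_c, q, alpha, epochs=200):
    """Discrete-step gradient descent on E = 0.5*delta^2 for V(x) = w*x^2.

    Assumed scalar closed loop (illustrative): f(x,u) = a_c*x, r(x,u) = q*x^2.
    """
    w = 0.0
    samples = (0.25, 0.5, 0.75, 1.0)      # states at which the residual is evaluated
    for _ in range(epochs):
        for x in samples:
            grad_sigma_f = 2.0 * x * (a_c * x)      # grad_sigma(x) * f(x,u)
            delta = w * grad_sigma_f + q * x * x    # residual equation error
            w -= alpha * grad_sigma_f * delta       # step of w' = -alpha*grad_sigma*f*delta
    return w

w = tune_critic(a_c=-2.0, q=2.0, alpha=0.05)
print(w)
```

With these values the residual vanishes at w = -q/(2*a_c) = 0.5, so the tuned weight should approach 0.5. Along a single decaying trajectory the update would stall as x goes to zero, which is why the sketch revisits fixed sample states.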


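Action NN

Target action (w are the critic weights):

$$Y_2 = -\tfrac{1}{2} R^{-1} g^T(x)\, \nabla\sigma^T(x)\, w = \phi^T(x)\, w$$

$$\phi^T(x) = -\tfrac{1}{2} R^{-1} g^T(x)\, \nabla\sigma^T(x)$$

Activation functions depend on the system dynamics.

Action NN:

$$\hat{Y}_2 = \phi^T(x)\, v$$

$$e_2(x) = \hat{Y}_2 - Y_2 = -\tfrac{1}{2} R^{-1} g^T(x)\, \nabla\sigma^T(x)\,[v - w] = \phi^T(x)\,[v - w]$$

Update weights by gradient descent:

$$\dot{v} = -\beta\, \phi(x)\, e_2(x)$$

Alternative: simply set

$$u(x) = Y_2 = -\tfrac{1}{2} R^{-1} g^T(x)\, \nabla\sigma^T(x)\, w = \phi^T(x)\, w$$

This does not work. Proof development so far indicates that the critic NN must be tuned faster than the action NN, i.e. $\alpha > \beta$.

c.f. Bradtke & Barto DT Q learning work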


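Small Time-Step Approximate Tuning for Continuous-Time Adaptive Critics

Sampled data systems

$$H\!\left(x, \frac{\partial V}{\partial x}, u\right) = \dot{V} + r(x,u) \approx \frac{V_{t+1} - V_t}{\Delta t} + r(x_t,u_t) \approx \frac{V_{t+1} - V_t}{\Delta t} + \frac{r_D(x_t,u_t)}{\Delta t}$$

Baird's Advantage function:

$$A(x_t,u_t) = \frac{r_D(x_t,u_t) + V^*(x_{t+1}) - V^*(x_t)}{\Delta t}$$

This is not in the standard DT form $V_h(x_k) = r(x_k, h(x_k)) + \gamma V_h(x_{k+1})$.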
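The small-time-step estimate can be checked numerically against a case where the Hamiltonian is exactly zero. The sketch assumes a scalar closed loop x' = a_c*x with cost rate r = q*x^2, uses the exact value V(x) = p*x^2 with p = -q/(2*a_c), and takes r_D to be r times the step. That reading of r_D is an assumption, since the slides do not define it.

```python
import math

def advantage(x, dt, a_c=-2.0, q=2.0):
    """Small-time-step Hamiltonian estimate (r_D + V(x_next) - V(x)) / dt.

    Assumed scalar closed loop (illustrative): x' = a_c*x, r = q*x^2, with the
    exact value V(x) = p*x^2, p = -q/(2*a_c), and r_D taken as r*dt.
    """
    p = -q / (2.0 * a_c)
    x_next = x * math.exp(a_c * dt)       # exact sampled-data state update
    r_d = q * x * x * dt
    return (r_d + p * x_next ** 2 - p * x ** 2) / dt

for dt in (1e-1, 1e-2, 1e-3):
    print(dt, advantage(1.0, dt))
```

The estimate shrinks toward zero as the step decreases, consistent with the continuous-time consistency equation holding for the exact value function.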


Optimal Control
Lewis & Syrmos 1995

For More Information
Journal papers on http://arri.uta.edu/acs


In Progress: M. Abu-Khalaf, Jie Huang, F.L. Lewis
Nearly Optimal Control by HJ Equation Solution Using Neural Networks


Theorem 1. Necessary and Sufficient Conditions for H-infinity Static OPFB Control

Assume that Q > 0. Then system (1) is output-feedback stabilizable with L2 gain bounded by $\gamma$ if and only if:

i. (A, C) is detectable

ii. There exist matrices K* and L such that

$$K^* C = R^{-1}(B^T P + L)$$

where $P > 0$, $P^T = P$, is a solution of

$$PA + A^T P + Q + \frac{1}{\gamma^2} P D D^T P - P B R^{-1} B^T P + L^T R^{-1} L = 0$$

ONLY TWO COUPLED EQUATIONS

c.f. results by Kucera and De Souza

Note there is an (A, B) stabilizability condition hidden in the existence of a solution to the Riccati equation.


Solution Algorithm 1 (c.f. Geromel)

1. Initialize:
Set n = 0, $L_0 = 0$, and select $\gamma$, Q, R.

2. n-th iteration:
Solve for $P_n$ in the ARE

$$P_n A + A^T P_n + Q + \frac{1}{\gamma^2} P_n D D^T P_n - P_n B R^{-1} B^T P_n + L_n^T R^{-1} L_n = 0$$

Evaluate gain and update L:

$$K_{n+1} = R^{-1}(B^T P_n + L_n)\, C^T (C C^T)^{-1}$$

$$L_{n+1} = R K_{n+1} C - B^T P_n$$

Until convergence.

Based on the ARE, so no initial stabilizing gain is needed!!

Tries to project the gain onto the nullspace perp. of C using the degrees of freedom in L.


Aircraft Autopilot Design


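F-16 Normal Acceleration Regulator Design

[Figure: G Command System block diagram. Reference r and error e feed an integrator 1/s with output $\varepsilon$; feedback gains $k_\alpha$, $k_q$, $k_e$, $k_I$ form the control u; actuator dynamics 20.2/(s+20.2) drive the elevator $\delta_e$; the aircraft (system dynamics) outputs q and $\alpha$ with $z = n_z$; sensor dynamics 10/(s+10) give $\alpha_F$.]

$$y = [\alpha_F \;\; q \;\; e \;\; \varepsilon]^T$$

$$u = -Ky = -[k_\alpha \;\; k_q \;\; k_e \;\; k_I]\, y$$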


Theorem 2 (new work). Parametrization of all H-infinity Static SVFB Controls

Assume that Q > 0. Then K is a stabilizing SVFB with L2 gain bounded by $\gamma$ if and only if:

i. (A, B) is stabilizable

ii. There exists a matrix L such that

$$K = R^{-1}(B^T P + L)$$

where $P > 0$, $P^T = P$, is a solution of

$$PA + A^T P + Q + \frac{1}{\gamma^2} P D D^T P - P B R^{-1} B^T P + L^T R^{-1} L = 0$$

OPFB is a special case


Chaos in Dynamic Neural Networks
c.f. Ron Chen

[Figure: block diagram with integrator 1/s, blocks A, B, C and H(s), state x and its derivative, and inputs u1, u2.]

$$x_{k+1} = A x_k + W^T \sigma(V^T x_k) + u_k$$


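Jun Wang

$$z_{k+1} = (1 - \beta)\, z_k$$

$$y_{k+1} = \alpha y_k + g - z_k \left( \frac{1}{1 + e^{-y_k/\rho}} - I \right)$$

% MATLAB file for chaotic NN from Jun Wang's paper
% In the code, a, b, e and Io play the roles of alpha, beta, rho and I above.

function [ki,x,y,z] = tcnn(N)
    % Initial conditions
    y(1) = rand; ki(1) = 1; z(1) = 0.08;
    % Parameters
    a = 0.9; e = 1/250; Io = 0.65;
    g = 0.0001; b = 0.001;

    for k = 1:N-1
        ki(k+1) = k+1;                          % time index
        x(k) = 1/(1+exp(-y(k)/e));              % neuron output (sigmoid)
        y(k+1) = a*y(k) + g - z(k)*(x(k) - Io); % internal state update
        z(k+1) = (1-b)*z(k);                    % decaying self-feedback gain
    end
    x(N) = 1/(1+exp(-y(N)/e));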