for control of and systems - uta 03...he who exerts his mind to the utmost knows nature’s pattern....

Moncrief-O’Donnell Chair, UTA Research Institute (UTARI)The University of Texas at Arlington, USA

and

F.L. Lewis, NAI

Talk available online at http://www.UTA.edu/UTARI/acs

Neural Networks for Control of Nonlinear Processes and Systems

Supported by :China Qian Ren Program, NEUChina Education Ministry Project 111 (No.B08015)NSF, ONR

Qian Ren Consulting Professor, State Key Laboratory of Synthetical Automation for Process Industries,

Northeastern University, Shenyang, China

Y.D. Song

Bringing Together Control Theory and Computational Intelligence

He who exerts his mind to the utmost knows nature’s pattern.

The way of learning is none other than finding the lost mind.

Meng Tz500 BC

Man’s task is to understand patterns innature and society.

Mencius

Importance of Feedback Control

Darwin 1850- FB and natural selectionVito Volterra 1890- FB and fish population balanceAdam Smith 1760- FB and international economyJames Watt 1780- FB and the steam engineFB and cell homeostasis

The resources available to most species for their survival are meager and limited

Nature uses Optimal control

Feedback Control Systems

Aircraft autopilotsCar engine controlsShip controllersCompute Hard disk drive controllersIndustry process control – chemical, manufacturingRobot control

Industrial Revolution –Windmill control, British millwrights - 1600sSteam engine and prime movers

James Watt 1769SteamshipSteam Locomotive boiler control

Sputnik 1957Aerospace systems

Relevance- Machine Feedback Control

qd

qr1 qr2

AzEl

barrel flexiblemodes qf

compliant coupling

moving tank platform

turret with backlashand compliant drive train

terrain andvehicle vibrationdisturbances d(t)

Barrel tipposition

qd

qr1 qr2

AzEl

barrel flexiblemodes qf

compliant coupling

moving tank platform

turret with backlashand compliant drive train

terrain andvehicle vibrationdisturbances d(t)

Barrel tipposition

Vehicle mass m

ParallelDamper

mc

activedamping

uc(if used)

kc cc

vibratory modesqf(t)

forward speedy(t)

vertical motionz(t)

surface roughness(t)

k c

w(t)

Series Damper+

suspension+

wheel

Single-Wheel/Terrain System with Nonlinearities

High-Speed Precision Motion Control with unmodeled dynamics, vibration suppression, disturbance rejection, friction compensation, deadzone/backlash control

VehicleSuspension

IndustrialMachines

Satellite pointing Land Systems

Aerospace

Relevance- Industrial Process ControlPrecision Process Control with unmodeled dynamics, disturbance rejection, time-varying parameters, deadzone/backlash control

ChemicalVaporDeposition

IndustrialMachines

XY Table

Autoclave

Newton’s Law

v(t)

x(t)

F(t)m

)()( tum

tFx

xmmaF

Industrial Process and Motion Systems (Vehicles, Robots)

τqB)+τq+G(q)+F(q)q(q,+Vq)qM( dm )(

Coriolis/centripetalforce

gravity frictiondisturbances

Actuatorproblems

inertia

Control Input

Lagrange’s Eqs. Of Motion

Lagrange Dynamical systems

Dynamical System Models

)()()(

xhyuxgxfx

Nonlinear system

Continuous-Time Systems Discrete-Time Systems

)()()(1

kk

kkkk

xhyuxgxfx

Linear system

CxyBuAxx

kk

kkk

CxyBAxx

1

1/s

f(x)

h(x)g(x)

z-1

xx yuControl Inputs Internal States Measured Outputs

Issues in Feedback Control

system

Feedbackcontroller

Feedforwardcontroller

Measured outputs

Control inputs

Desired trajectories

Sensornoise

Disturbances

StabilityTracking BoundednessRobustness

to disturbancesto unknown dynamics

Unknown Process dynamicsProcess NonlinearitiesUnknown Disturbances

plant

controlu(t)

outputy(t)

controller

systemidentifier estimated

output

)(ˆ ty

identificationerror

desiredoutput

)(tyd

plant

controlu(t)

outputy(t)

controllerdesiredoutput

)(tyd

trackingerror

plant

controlu(t)

outputy(t)

controller #1

controller #2

desiredoutput

)(tyd

trackingerror

Indirect Scheme

Controller Topologies

Direct Scheme

Feedback/FeedforwardScheme

Feedback LinearizationAdaptive ControlNeural Networks for ControlNeural-adaptive Control

6 US PatentsSponsored by:China Qian RenChina Project 111US NSF, ARO, ONR, AFOSR

Definitions of System Stability

xe

xe+B

xe-B

Const Bound B

tt0 t0+T

T

x(t)

x(t)

t

x(t)

t

Asymptotic StabilityMarginal or Bounded Stability- SISL

Uniform Ultimate Boundedness

)()(

1 kk xfxxfx

d

B(d)

For every B there exists a d

There exists bound B that cannot be prescribed

21 2

1y us a s a

Example 1. Linear System

1 2y a y a y u

desired to track a reference input ( )dy t

de y y Tracking error

1 2de y a y a y u

r e e Sliding variable1 2dr e e y e a y a y u

du v y e Auxiliary input

1 2r a y a y v Error dynamics

Of the form ( )r f x v

1 2 1 2( ) ( )Ty

f x a y a y a a W xy

Unknown function

Feedback Linearization

Unknown parameters

Known Regression Vector

Example 2. Nonlinear Lagrange System( , ) ( )y d y y k y u

unknown nonlinear damping term unknown nonlinear friction

( )r f x v

( ) ( , ) ( )f x d y y k y with

Assume Linear in the Parameters (LIP)

11

( , )( ) ( )

( , )Td y yf x D K W x

k y y

Known possibly nonlinear regression function

Unknown parameters

desired to track a reference input ( )dy t

de y y Tracking error

( , ) ( )de y d y y k y u

r e e Sliding variable

Error dynamicsdu v y e Auxiliary input

Feedback Linearization

Lagrangian System Appears in:Process controlMechanical systemsRobots

plantr(t)

Feedback Linearization Controller

du v y e

u

yy d

d

yy e

e

I

r e e

e v?

controller

The equations give the FB controller structure

Feedforward terms

Tracking Loop

plantr(t)

Feedback Linearization Controller

du v y e

u

yy d

d

yy

ee

I

r e e

e

v


( ) vv f x K r

vK

( )f x

Tracking Loop

Feedback terms

( )r f x v Error dynamics

If f(x) is known, use

Adaptive Control

Error Dynamics

( )r f x v r(t)= control errorControl input

Unknown nonlinearities

( )Tr W x v

Error Dynamics

Assume: f(x) is known to be of the structure

( ) ( )Tf x W x

LINEAR-IN-THE-PARAMETERS (LIP)

Known basis set= regression vector – DEPENDS ON THE SYSTEM

Unknown parameter vector

Controller

ˆ ˆ( ) ( ) ( )Tv vv f x K r W t x K r

ˆ( ) ( )W t W W t Parameter estimation error

ˆ( ) ( ) ( )T T T vr W x v W x W x K r ( )T vr W x K r

closed-loop system becomes

Est. error drives the control error

ˆ ˆ ( ) TdW W F x rdt

Parameter estimate is updated (tuned) using the adaptive tuning law

Adaptive Control

Pos. def. control gain

ESTIMATE of unknown parameters

plantr(t)

Feedback Linearization Adaptive ControllerA dynamic controller

du v y e

u

yy d

d

yy

ee

I

r e e

e

v



vK

ˆ ( ) TW F x rˆ ( ) ( )TW t x

ˆ ( )f x

Tunable inner loop

Tracking Loop

Feedback terms

Error dynamics

f(x)

Kv

^

Adaptive ControlTuning

Dynamics

r(t)

Adaptive Controller

Adaptive Controller

ˆ ( ) TW F x r

A dynamic controller

Adaptive ControlPerformance of Adaptive Controller: Using the adaptive controller, the closed-loop

system is asymptotically stable, i.e. the control error r(t) goes to zero.

If an additional Persistence of Excitation (PE) condition holds, the parameter estimates converge to the actual unknown parameters.

11 12 2 { }

T TL r r tr W F W Proof:

1{ }T TL r r tr W F W

1 1( ) { } { ( ) }T T T T T Tv vL r W x K r tr W F W r K r tr W F W x r

ˆ ( ) TW F x r ( ) TW F x r

( )T vr W x K r Error dynamics

Parameter tuning law

TvL r K r

Therefore Lyapunov shows that

0

0

( ), ( )r t W t are bounded( ), ( )r t W t

or

Typical Behavior of Adaptive Controllers

Control errors go to zero and the parameter estimates converge.

This assumes that ( ) ( )Tf x W x holds exactly, and that there are no disturbances in the system.

( )r f x Error dynamics

Control input

Unknown nonlinearities

Assume

know a fixed nominal value or estimate ˆ ( )f x for unknown f(x),

ˆ( ) ( )f f x f x

( ) ( )f x F x

estimation error is bounded like

Known bounding function, maybe nonlinear

ˆ ( ) v rf x K r v Controller

( ) ,

( ) ,r

F xr rr

vF xr r

Robust Control

Error dynamics

Robust ControlTerm vr(t)

f(x)

Kv

^

Robust Controller

FixedNominal ControlTerm

r(t)

Robust Controller

A NON Dynamic controller

Closed-loop Error dynamics

ˆ( ) ( ) ( ) v rr f x f x f x K r v ( ) v rr f x K r v

r

Performance of Robust Controller:With this control, the closed-loop system is bounded stable with

bounded with a magnitude near

Proof:12

TL r r

TL r r

( ) ( )T T Tv r v rL r f x K r v r K r r f x v 2

min ( ) ( )T

v rL K r r F x r v

2 2min

2min

( ) ( ) ( ) /

( )

v

v

L K r r F x r F x r

K r

r Case 1:

0

2 2min

2min

( ) ( ) ( ) /

( ) ( ) 1 /

v

v

L K r r F x r F x

K r r F x r

Case 2: r

indefinite

( ) ,

( ) ,r

F xr rr

vF xr r

Typical Behavior of Robust Controllers

Control error does not go to zero but does indeed stay small.

Robust ControlAdaptive Control

Robot System[I]

Robust ControlTerm

v(t)

qd

Tracking Loop

f(x)

rKv

Nonlinear Inner Loop

..

^

Multiloop Nonlinear Controller Structure

q = qq.e =

ee.

qd =qdqd.

Adaptive ControlTerm

Adaptive plus robust control

( ) ,

( ) ,r

F xr rr

vF xr r

ˆ ( ) TW F x r

Neural Networks for Control

F.L. Lewis, S. Jagannathan, and A.Yesildirek,Neural Network Control of RobotManipulators and Nonlinear Systems,Taylor and Francis, London, 1999.

NN control in Chapter 4

Neural Network Properties

Learning

Recall

Function approximation

Generalization

Classification

Association

Pattern recognition

Clustering

Robustness to single node failure

Repair and reconfiguration

Nervous system cell. http://www.sirinet.net/~jgjohnso/index.html

Two-layer feedforward static neural network (NN)

(.)

(.)

(.)

(.)

x1

x2

y1

y2

VT WT

Inputs

Hidden layer

Outputs

xn ym

1

2

3

L

(.)

(.)

(.)

Summation eqs Matrix eqs( )( )

T T

T

y W V xW x

K

ki

n

jkjkjiki wvxvwy

10

10

Extra freedom to select basis set by tuning first-layer weights V.

Neural Network Universal Approximation Property

Let f(x) be any smooth nonlinear function

Then f(x) can be approximated by

1

( ) ( ) ( )L

i ii

f x w x x

For appropriate choice of the basis functions ( ), 1,i x i L

Moreover, the approximation error goes uniformly to zero as ( )x L

This was shown by Weierstrass for polynomial bases functions (Taylor series)

Hornik and Stinchcomb, Sandberg showed that is bounded on a compact set For a large class of approximating functions

( )x

Need to find the unknown weights

Do this by NN learning – tuning the NN weights

iw

Common activation functions (.)

Nonlinear System[I]

Unity-Gain Tracking Loop

rKv

Industry Standard- PD Controller

)()()( tetetr Desiredtrajectory

Actualtrajectory

Easy to implement with COTS controllersFastCan be implemented with a few lines of code- e.g. MATLAB

But -- Cannot handle-High-order unmodeled dynamics Unknown disturbancesHigh performance specifications for nonlinear systemsActuator problems such as friction, deadzones, backlash

Controlinpute

e

d

d

yy y

y

u

plantr(t)

Feedback Linearization Adaptive ControllerA dynamic controller

du v y e

u

yy d

d

yy

ee

I

r e e

e

v



vK

ˆ ( ) TW F x rˆ ( ) ( )TW t x

ˆ ( )f x

Control System Design Approach

( ) ( , ) ( ) ( )m dM q q V q q q F q G q

( ) ( ) ( )de t q t q t

r e e

dm xfrVrM )(

Robot dynamics

Tracking Error definition

Error dynamics

Siding variable

( ) M( ( ) ( ) )dMr M e e q t q t e

( ) ( )( ) ( , )( ) ( ) ( )d m df x M q q e V q q q e F q G q

Where the unknown function is

qdRobot System[I]

qe

PD Tracking Loop

r

dm qFqGqqqVqqM )()(),()( Robot dynamics

?controller

)()()( tqtqte d Tracking error

eer Sliding variable


qdRobot System[I]

qe

PD Tracking Loop

r

dm xfrVrM )(

( ) ( )( ) ( , )( ) ( ) ( )d m df x M q q e V q q q e F q G q Where

Error dynamics is

If f(x) is known use the control ( ) vf x K r Then ( )m v dMr V K r

vK

( )compute f x

Computed Torque Control

Control System Design Approach

dm qFqGqqqVqqM )()(),()(

)()()( tqtqte d eer

dm xfrVrM )(

Robot dynamics

Tracking Error definition

Error dynamics

vrKxVW vTT )ˆ(ˆ Define control input

)()( xVWxf TTApprox. unknown function by NN

Universal Approximation Property UNKNOWN FN.

)()ˆ(ˆ)( tvxVWxVWrKrVrM dTTTT

vm

Closed-loop dynamics

)(~ tvfrKrVrM dvm

qdRobot System[I]

Robust ControlTerm

q

v(t)

e

PD Tracking Loop

rKv

Neural Network Robot Controller

^

qd

f(x)


..

Feedforward Loop

Universal Approximation Property

Problem- Nonlinear in the NN weights sothat standard proof techniques do not work

Feedback linearization

Easy to implement with a few more lines of codeLearning feature allows for on-line updates to NN memory as dynamics changeHandles unmodelled dynamics, disturbances, actuator problems such as frictionNN universal basis property means no regression matrix is neededNonlinear controller allows faster & more precise motion

Theorem 1 (NN Weight Tuning for Stability)

Let the desired trajectory )(tqd and its derivatives be bounded. Let the initial tracking error bewithin a certain allowable set U . Let MZ be a known upper bound on the Frobenius norm of theunknown ideal weights Z . Take the control input as

vrKxVW vTT )ˆ(ˆ with rZZKtv MFZ )()( .

Let weight tuning be provided by

WrFxrVFrFW TTT ˆˆ'ˆˆˆ

, VrGrWGxVTT ˆ)ˆ'ˆ(ˆ

with any constant matrices 0,0 TT GGFF , and scalar tuning parameter 0 . Initialize the weight estimates as randomVW ˆ,0ˆ .

Then the filtered tracking error )(tr and NN weight estimates VW ˆ,ˆ are uniformly ultimately bounded. Moreover, arbitrarily small tracking error may be achieved by selecting large controlgains vK . Backprop terms-

WerbosExtra robustifying terms-Narendra’s e-mod extended to NLIP systems

Adaptive part Robust part

Stability Proof based on Lyapunov Extension

Define a Lyapunov Energy Function

)~~()~~( 212121 VVtrWWtrMrrLTTT

Differentiate

)()'ˆˆ~(~)ˆ'ˆˆ~(~

)2(21

vwrWxrVVtr

xrVrWWtr

rVMrrKrL

TTTT

TTTT

mT

vT

Using certain special tuning rules, one can show that the energy derivative is negative outside a compact set.

Lnegative

)(tr

)(~ tW This proves that all signals are bounded

Problems—1. How to characterize the NN weight errors as ‘small’?- use Frobenius Norm2. Nonlinearity in the parameters requires extra care in the proof

UUB- uniform ultimate boundednes

010

2030

4050

0

12

3

4

5

6

-20

-15

-10

-5

0

5

10

15

weights

W2 weights, x

d=[0.5sin(t) 0.5cos(t)]T

time

W2 w

eigh

ts

NN weights converge to the best learned values for the given system

τqB)+τq+G(q)+F(q)q(q,+Vq)qM( dm )(Structured Control NN

x

q,

2,, qq

q

q

1ˆ M

2ˆ mV

Ĝ

F̂

+)(ˆ xf

Partitioned neural network

Non

linea

r pre

proc

essi

ng

sin( ), cos( )i iq q

i jq q

2iq

coriolis

centripetal

sgn( )iqiqe

1s C

A

xx.Bu2

H(s)

u1

1s C

A

xx.Bu2

H(s)

u1

kkTT

kk uxVWAxx )(1

Chaos in Dynamic Neural Networks

c.f. Ron Chen

Recurrent or Dynamic NN

%MATLAB file for chaotic NN from Jun Wang's paper

function [ki,x,y,z]=tcnn(N);y(1)= rand; ki(1)=1; z(1)= 0.08;a=0.9; e= 1/250; Io=0.65;g= 0.0001; b=0.001;

for k=1: N-1;ki(k+1)= k+1;x(k)= 1/(1+exp(-y(k)/e));y(k+1)= a*y(k) + g -

z(k)*(x(k) - Io);z(k+1)= (1-b)*z(k);

endx(N)= 1/(1+exp(-y(N)/e));

Ie

zgyy

zz

kykkk

kk

/1

1

11

Jun Wang

Kung Tz 500 BCConfucius

ArcheryChariot driving

MusicRites and Rituals

PoetryMathematics

孔子Man’s relations to

FamilyFriendsSocietyNationEmperorAncestors

Tian xia da tongHarmony under heaven

124 BC - Han Imperial University in Chang-an

Actuator Dynamics

=D(u)

ud+

-d-

m+

m-

.

ud+

d-

mu

BacklashDeadzone

System

Feedbackcontroller

Outputs

ActualControl inputsActuator

nonlinearity

AppliedControl inputs

Actuator Nonlinearities

Friction

Friction Compensation

dm qFqGqqqVqqM )()(),()(

Lagrangian System Dynamics – e.g. Robot

FrictionUse the standard NN from Lecture 1

qdRobot System[I]

Robust ControlTerm

q

v(t)

e

PD Tracking Loop

rKv

^

qd

f(x)


.

.Feedforward Loop

Friction compensation

0 2 4 6 8 10 12 14 16-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

Time(second)

(a)

0 2 4 6 8 10 12 14 16-0.04

-0.03

-0.02

-0.01

0

0.01

0.02

0.03

0.04

Time(second)

Leng

th(m

eter

)

(a)

0 2 4 6 8 10 12 14 16-0.2

-0.15

-0.1

-0.05

0

0.05

0.1

0.15

0.2

0.25

Time(second)

Leng

th(m

eter

)

(b)

NN Friction Compensator

Desired trajectory

Tracking errors- solid = fixed gain controller, dashed= NN controller

Trajectory Tracking Controller

Position Velocity

position

velocity

Fixed gain

NN

Deadzone

=D(u)

ud+

-d-

m+

m-

u

u

Sat(u)

d+-d-

u

=

-

Key fact

NN Approximation of Functions with Jumps

-3 -2 -1 0 1 2 3 4 50

0.2

0.4

0.6

0.8

1

-3 -2 -1 0 1 2 3 4 50

0.2

0.4

0.6

0.8

1

0(x) 1(x)

-3 -2 -1 0 1 2 3 4 50

0.2

0.4

0.6

0.8

1

-3 -2 -1 0 1 2 3 4 50

0.2

0.4

0.6

0.8

1

2(x) 3(x)

Jump Function Basis Set

1 1

x y

1

2

L

V1T

V2T

1cc

c

W1T

W2T

....

...

2

1

N

Augmented NN

Standard NN

Extra Nodes withJump Activation Functions

MechanicalSystem

Kv[

v

reqd

Estimateof NonlinearFunction

w

--

D(u)u

NN DeadzonePrecompensator

I

II

( )f x

q

dq

NN in Feedforward Loop- Deadzone Compensation

iiiTTTTT

iii WWrTkWrTkUuUWrwUTW ˆˆˆ)('ˆ)(ˆ 21

WrSkrwUWUuUSW TTiiiTT ˆ)(ˆ)('ˆ 1

Acts like a 2-layer NNWith enhanced backprop tuning !

little critic network

0 5 10 15-1

-0.5

0

0.5

1

x2(k)

time

0 5 10 15-1

-0.5

0

0.5

e2(k)

time

0 5 10 15-2

-1

0

1

2

x2(k)

time

0 5 10 15-1

-0.5

0

0.5

e2(k)

time

Performance Results

PD control-deadzone chops out the middle

NN control fixes the problem

ud+

d-

mu

Backlash

( , , )B u u

A dynamic nonlinearity

u

w

d+

d-

u

w

1/m

+=

Nominal feedforward path + Unknown dynamic nonlinearity

( , , )B u u

Nonlinear

SystemK

v[

v1

rexd

Estimate

of Nonlinear

Function

--

x

( )f x

des

[0

--

yd

(n)

-

Backlash

-

1/s

Filter v2

Backstepping loop

des

NN Compensator

-

dx r

Kb

û̂

nny

FẐ

Dynamic Inversion NN compensator for system with Backlash

U.S. patent- Selmic, Lewis, Calise, McFarland

Performance Results

PD control-backlash chops off tops & bottoms

NN control fixes the problem

0 1 2 3 4 5 6 7 8 9 10-1.5

-1

-0.5

0

0.5

1

time

x 1(t)

PD controller with backlash

0 1 2 3 4 5 6 7 8 9 10-0.1

-0.05

0

0.05

0.1

0.15

time

e 1(t)

0 1 2 3 4 5 6 7 8 9 10-2

-1

0

1

2

time

x 2(t)

PD controller with backlash

0 1 2 3 4 5 6 7 8 9 10-1

-0.5

0

0.5

1

time

e 2(t)

position

velocity

error

0 1 2 3 4 5 6 7 8 9 10-1.5

-1

-0.5

0

0.5

1

time

x 1(t)

PD controller with NN backlash compensation

0 1 2 3 4 5 6 7 8 9 10-0.04

-0.03

-0.02

-0.01

0

0.01

timee 1

(t)

position

Tracking error

~x 1

( , )h x xo 1 2

( , )h x xc 1 2

ROBOT

Kv

[vc

kD

KvkpM-1(.)

qxx

1

2

eee

qqqdd

d

x1

x2

z2

1x)(t)(ˆ tr

Neural Network Observer

Neural Network Controller

1~x

111

212

121

~)()()ˆ,ˆ(ˆˆ

~ˆˆ

xKxMxxWz

xxz

t

k

oTo

D

NN Observers Needed when all states are not measuredi.e. Output feedback

TooDo k 1

~)ˆ(ˆ xxFW

oooooo WFWxF ˆˆ~1

Tccc rxxFW ˆ)ˆ,ˆ(ˆ 21

ccc WrF ˆˆ

Tune NN observer -

Tune Action NN -

Recurrent NN Observer

Fuzzy-Neural Control

Neural Network Properties

Learning

Recall

Function approximation

Generalization

Classification

Association

Pattern recognition

Clustering

Robustness to single node failure

Repair and reconfiguration

Nervous system cell. http://www.sirinet.net/~jgjohnso/index.html

USED

???

Input Membership Fns. Output Membership Fns.

Fuzzy Logic Rule Base

NN

Input

NN

Output

Fuzzy Associative Memory (FAM) Neural Network (NN)

INTELLIGENT CONTROL TOOLS

Input x Output u

Input x Output u

Both FAM and NN define a function u= f(x) from inputs to outputs

FAM and NN can both be used for: 1. Classification and Decision-Making2. Control

(Includes Adaptive Control)

NN Includes Adaptive Control (Adaptive control is a 1-layer NN)

x 2

x1

FL Membership Functions for 2-D Input Vector x

1

0

1 0

X1i X1i+1

X2j

X2j

+1x1i x1i+1

x 2j

x 2j+

1

Relation Between Fuzzy Systems and Neural Networks

Separable Gaussian activation functions for RBF NN

Separable triangular activation functions for CMAC NN

Two-layer NN as FL System

(.)

(.)

(.)

(.)

x1

x2

y1

y2

VT WT

inputs

hidden layer

outputs

xn ym

1

2

3

L

(.)

(.)

(.)

Standard thresholds

n

FL system = NN with VECTOR thresholds

.e)b,a,z(2l

ii2l

i

li

)bz(ali

liiA

rŴKkr)b̂BâAˆ(KŴ WWT

W

râKkrŴAKâ aaT

a

rb̂KkrŴBKb̂ bbT

b

Gaussian membership function

Tuning laws

ControlledPlantKv[ I]

r(t)

-

Input MembershipFunctions

Fuzzy Rule Base

Output MembershipFunctions

xd(t)

e(t)

-

-)x,x(ĝ d

x(t)

Fuzzy Logic Controllers

Dynamic Focusing of Awareness

Initial MFs

Final MFs

Depend on desired reference trajectory

Effect of change of membership function spread "a"

Effect of change of membership function elasticities "c"

2cB )b,a,z()c,b,a,z( 2

22

2

1

c

)bz(a))bz(a(cos)c,b,a,z(

Elastic Fuzzy Logic- c.f. P. WerbosWeights importance of factors in the rules

ControlledPlantKv[ I]

r(t)

-

Input MembershipFunctions

Fuzzy Rule Base

Output MembershipFunctions

xd(t)

e(t)

-

-)x,x(ĝ d

x(t)

)x,x(ĝrK)t(u dv râKkrŴAKâ aa

Ta

rb̂KkrŴBKb̂ bbT

b

rŴKkr)ĉCb̂BâAˆ(KŴ WWT

W rĉKkrŴCKĉ cc

Tc

Elastic Fuzzy Logic ControlControl Tune Membership Functions

Tune Control Rep. Values

Better Performance

Initial MFs

Final MFs

UnknownPlant

PerformanceEvaluator

InstantaneousUtility

r(t)

DesiredTrajectory

Action Generating NN

x(t)u(t)

tuning

d(t)

R(t)

)(ˆ xfUnknown

Plant


InstantaneousUtility

r(t)

DesiredTrajectory


x(t)u(t)

FL Critic

tuning

d(t)

R(t)

)(ˆ xf

Fuzzy Logic Critic NN controller

UnknownPlant


d(t)

)(^ xg u(t) x(t)

.

R

r

+

v(t)


xd(t)Kv

Kv

( 6-15 )

( 6-14 )

+

+

11ˆ;ˆ VW

-(.)

(.)

(.)

(.)

x1

x2

y1

y2

VT WT

inputs

hidden layer

outputs

xn ym

1

2

3

L

REFERENCE

input membershipfunctions

fuzzy rulle base

output membershipfunctions

+

R

Learning FL Critic Controller

,ˆ)ˆ('ˆˆ 1111 VrVWrHVTTT

,ˆ)ˆ(ˆ 1111 WRrVWTT

2211122222ˆˆ)ˆ('ˆ)()(ˆ WVrVWRrW TTTTT

Tune Action generating NN (controller)

Tune Fuzzy Logic Critic

FL Critic

Action generating NN

Critic requires MEMORY

User input:Reference Signal Performance

MeasurementMechanism

Reinforcement Signal

r(t)

Action Generating Neural Net

PLANT

RobustTerm

Kv

q(t)u(t)

qd(t)

v(t)

g(x)-

-+

Utility

Critic Element

R(t)

d(t)fr(t)

Control Action

( )×

( )×

( )×

( )×

y1

ym-1

ym

Input Layer Hidden Layer

Output Layer

z2

zN-1

zN

Inpu

t Pre

-pro

cess

ing W

x1

xn-1

xn

1z1=1q

d(t)

Reinforcement Learning NN Controller

High-Level NN Controllers Need Exotic Lyapunov Fns.

1))(sgn()( trtR

)~~(21)( 1

1WFWtrrtL T

n

ii

WFRxFW T )(ˆ

)~~(21)1ln()1ln()( 1)()( WFWtreetL Ttrtr

( ) ( ~ ~ )L trT T sgn +r r W F W1

)~~()(11

1)()(

WFW

T

trtrtrtr

eeL

Reinforcement NN control

Simplified critic signal

Lyapunov Fn

Lyap. Deriv. contains R(t) !!

Tuning Law only contains R(t)

Adaptive Reinforcement Learning

,)(ˆ 11 TWR

Critic is output of NN #1

)(ˆ),(ˆ 22 T

d Wxxg

Action is output of second NN

,ˆ)(ˆ 111 WRWT

,ˆˆ)(')(ˆ 211122 WRWVrW TT

The tuning algorithm treats this as a SINGLE 2-layer NN

Energy-based control

for control of and systems - uta 03...he who exerts his mind to the utmost knows nature’s pattern....

Documents