UNIVERSITY OF CALGARY
Robust Control Design for Teleoperation Systems with Haptic Feedback
using Neural-Adaptive Backstepping
by
Dean Matthew Richert
A THESIS
SUBMITTED TO THE FACULTY OF GRADUATE STUDIES
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE
DEGREE OF MASTER OF SCIENCE
DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING
CALGARY, ALBERTA
APRIL, 2010
© Dean Matthew Richert 2010
Abstract
Teleoperation holds a promising future as humans push the limits of technology by allowing human presence in otherwise hostile or remote environments. This thesis specifically examines teleoperation with force-reflecting haptic devices as it pertains to robotic surgery. Neural networks operate online to learn the unknown system dynamics and provide a completely adaptive control design. Three novel contributions are made in this thesis. First, a unique error definition allows for force control in constrained motion while permitting position control in unconstrained motion. Second, the backstepping technique smooths out control signals and ensures that high-frequency vibrations of the robot/environment dynamics are not excited by the proposed controller. Finally, a novel supervisory neural network update law ensures fast convergence of the neural network weights and improves robustness. The entire system is shown to be globally Lyapunov stable. Using the Lyapunov redesign method, a robust control law is also derived.
Acknowledgements
Thanks must first go to my wife, Jane, who patiently (and sometimes even atten-
tively) listened to me explain my daily findings. I’m sure she is the only student in the
International Development department who knows the difference between Lyapunov
stability and stability by passivity. If you ever talk to her about your research she’ll
be sure to ask you, “Is your model linear or nonlinear?” Our relationship has been
a blessing throughout my studies and has brought joy and laughter to me even in
stressful and discouraging times. She has undeserved belief in me and the motivation
for all I do comes from her and God.
From a research perspective, my supervisor Dr. Chris Macnab and co-supervisor
Dr. Jeff Pieper deserve many thanks. They have both given me invaluable direction
and supplied me with the resources I needed to complete this work. Indeed, the
vision and many of the preliminary details of this work were inspired by Dr. Macnab
and his foresight into this topic ensured me a gentle road to graduation. I would
also like to thank Dr. Macnab for particular opportunities he made available to me
including attending conferences and teaching tutorial periods.
Finally, I would like to thank my office mates: Javad, Sanaz, Tayyab, and Khalid.
They have always respected me and kept our lab a functional place to do work. In
particular, Tayyab and Javad have been instrumental in my attaining this degree.
I’ve been thankful that Tayyab and I were able to take all of the same courses and
our discussions immensely helped me understand the course material. Javad always
asked difficult questions that would challenge me to explore the very foundations of
control systems, and his work ethic inspired me every day.
Table of Contents
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
Table of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 Introduction to Teleoperation Vocabulary . . . . . . . . . . . 4
1.2 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.3 State of the Art in Teleoperation (literature review) . . . . . . . . . . 8
1.3.1 Sensor choice . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.3.2 Choice of control system . . . . . . . . . . . . . . . . . . . . . 10
1.4 Introduction to the Proposed Solution . . . . . . . . . . . . . . . . . 13
2 Background Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.1 Lyapunov Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.1.1 Control Lyapunov Function . . . . . . . . . . . . . . . . . . . 17
2.1.2 Lyapunov Redesign (robust design) . . . . . . . . . . . . . . . 18
2.1.3 Backstepping . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.1.4 Adaptive Backstepping and Tuning Functions . . . . . . . . . 24
2.2 Radial Basis Functions Networks . . . . . . . . . . . . . . . . . . . . 26
2.2.1 RBFN in controller design . . . . . . . . . . . . . . . . . . . . 28
3 Proposed Controller Design . . . . . . . . . . . . . . . . . . . . . . . 32
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.2 System description . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.3 Proposed Control Law . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.3.1 Design choice 1: Error definition . . . . . . . . . . . . . . . . . 34
3.3.2 Design choice 2: Backstepping . . . . . . . . . . . . . . . . . . 36
3.3.3 RBFNs used in controller . . . . . . . . . . . . . . . . . . . . 40
3.3.4 Control Law . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4 Stability Proof . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
5 Alternative Control Designs for Comparison . . . . . . . . . . . . . . 51
5.1 H2 Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
5.1.1 Discussion on H2 controller . . . . . . . . . . . . . . . . . . . 54
5.2 Output Feedback Control . . . . . . . . . . . . . . . . . . . . . . . . 55
5.2.1 Discussion on output feedback control . . . . . . . . . . . . . 56
6 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
6.1 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
6.1.1 Master device . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
6.1.2 Virtual components . . . . . . . . . . . . . . . . . . . . . . . . 62
6.1.3 Implementation Considerations . . . . . . . . . . . . . . . . . 64
6.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
6.2.1 Stiff Contact Test and Loss of Contact Test (comparison with
H2 and output feedback controller) . . . . . . . . . . . . . . . 65
6.2.2 Proposed controller performance . . . . . . . . . . . . . . . . . 77
6.2.2.1 Filtering properties of Backstepping . . . . . . . . . 77
6.2.2.2 Neural Network Outputs, Boundedness of Neural Net-
work Weights, Evolution of System States . . . . . . 82
6.2.2.3 Time delay . . . . . . . . . . . . . . . . . . . . . . . 90
7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
7.0.3 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
A Stability Analysis Including Disturbances . . . . . . . . . . . . . . . . 106
B Robust control for scaled tuning functions . . . . . . . . . . . . . . . 110
List of Tables
6.1 Experiment Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . 64
6.2 Average maximum measured force at the slave end effector due to the
proposed controller, H2 controller, and the output feedback controller 68
A.1 Bounds which contribute to V2,d being negative definite . . . . . . . . 109
List of Figures
1.1 neuroArm haptic hand controllers . . . . . . . . . . . . . . . . . . . . 3
1.2 neuroArm surgical (slave) robot manipulators . . . . . . . . . . . . . 4
1.3 Teleoperation setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.4 Impedance interpretation of a teleoperation system . . . . . . . . . . 6
1.5 Linear system based control architecture . . . . . . . . . . . . . . . . 11
2.1 General shape of the robust control . . . . . . . . . . . . . . . . . . . 21
2.2 General spatial derivative of robust control . . . . . . . . . . . . . . . 22
3.1 Mechanical model of a 1 DOF surgical slave robot in contact with an
environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
5.1 Output feedback control architecture . . . . . . . . . . . . . . . . . . 55
6.1 Master haptic device used for experiments. . . . . . . . . . . . . . . . 59
6.2 Screen capture of the Simulink model used for experiments. . . . . . . 62
6.3 Force profile of the simulated remote environment . . . . . . . . . . . 63
6.4 Comparison of the three controllers for the stiff contact test. Proposed
controller hits the wall with 118N less force than the H2 controller and
21N less force than the output feedback controller. . . . . . . . . . . 69
6.5 Zoomed in version of Fig. 6.4 to emphasize the performance benefit
of the proposed controller . . . . . . . . . . . . . . . . . . . . . . . . 70
6.6 A version of Fig. 6.5 with only the proposed controller performance.
Axes are the same as in Fig. 6.5. . . . . . . . . . . . . . . . . . . . . 71
6.7 Comparison between the proposed controller and an output feedback
controller for the loss of contact test. The proposed controller has
less positional overshoot than the output feedback controller, but a
greater negative velocity. . . . . . . . . . . . . . . . . . . . . . . . . . 72
6.8 Comparison of the three controllers for the stiff contact test using the
PI human model. The controller response is quite similar to those
shown in Fig. 6.4. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.9 Zoomed in version of Fig. 6.8 . . . . . . . . . . . . . . . . . . . . . . 74
6.10 A version of Fig. 6.9 with only the proposed controller performance.
Axes are the same as in Fig. 6.9. . . . . . . . . . . . . . . . . . . . . 75
6.11 Comparison between the proposed controller and an output feedback
controller for the loss of contact test using the PI human model. Again,
the controller response is quite similar to the results shown in Fig. 6.7. 76
6.12 Filtered control signal, Fc, without backstepping for various filter
break frequencies. Contact with wall is made when Fc is approxi-
mately 2. Higher break frequencies allow excitation of the system’s
normal modes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
6.13 Frequency spectrum of the filtered control signal, Fc, without back-
stepping for various filter break frequencies. Higher frequency compo-
nents tend to excite the natural modes of the system. . . . . . . . . . 80
6.14 Showing the filtering properties of the backstepping method. The
backstepping technique attenuates high frequency control signals and
thus allows stable operation. . . . . . . . . . . . . . . . . . . . . . . . 81
6.15 Neural Network outputs in stiff contact. A wall is hit at around 3.5s
and the neural network outputs react accordingly. . . . . . . . . . . . 84
6.16 Neural Network outputs in loss of contact. The puncture occurs at
around 2.5s and neural network outputs react quickly. . . . . . . . . . 85
6.17 Root mean square neural network weights for 200 trials. Weight con-
vergence is achieved. . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
6.18 Maximum neural network weights for 200 trials. Weight convergence
is achieved. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
6.19 Root mean square neural network weight when there is no supervised
learning in ṗ. Instability occurs after 28 trials. . . . . . . . . . . . . . 88
6.20 Convergence of states s and z over 200 trials. . . . . . . . . . . . . . 89
6.21 Force response of the proposed controller in the presence of time delay
for a stiff contact test. Impact force remains the same for arbitrary
time delays. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
6.22 Slave position in the presence of time delay for a loss of contact test
(contact lost at x = 0.1m). Positional overshoot increases with in-
creased time delay. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
Chapter 1
Introduction
1.1 Motivation
Teleoperation of robot manipulators provides a means for humans to manipulate
remote, virtual, or otherwise hostile environments. Applications vary widely, from backhoe operation in the construction industry, to video games, to deep sea exploration. Humans by nature interact in many different ways with our environment.
Humans are equipped with many sensing devices that help us effectively manipulate
objects to our desire. Sight, smell, touch, hearing, and taste are famously referred to
as our ’five senses’ and are the primary methods by which we attain information from
our surroundings. In teleoperation terms, we are multi-modal in our interactions [1].
The sense of sight tends to be of utmost importance in teleoperation applications [2].
Millan and Colgate [3] have stated that 70% of our sensory input as humans is visual
information. As a result, significant technological advances have been made in past
years in trying to give humans a sense of physical presence in remote scenes. Cam-
eras, video recorders, and television are but a few such examples. Research also
suggests that touch vastly improves the performance of human operators in teleop-
eration applications [4–8]. Though the sense of touch only accounts for 5% of our
sensory input, its role in teleoperation is invaluable [3]. Indeed, current advances in
force-reflecting devices echo this demand to be able to feel the remote environment
we are manipulating. In the 1950s the world saw its first force-reflecting applica-
tion in a large modern aircraft. The use of servomechanisms to operate mechanical
control surfaces in aircraft effectively eliminated any sense of force applied at these
control surfaces. High frequency buffeting at the brink of a stall no longer warned
the pilot of the impending danger. As a result and out of necessity, a force reflecting
device was required. This thesis examines another application in which the sense of
touch is paramount: robot assisted surgery. Because surgeons must exhibit extreme
delicacy and precision, visual cues alone are inadequate.
Let’s first examine the motivation for robot assisted surgery. A recent project
labeled “Project neuroArm” at the Foothills Medical Research Center is, in more ways than one, the first of its kind in terms of robot assisted surgery (detailed below)
and is a primary motivator for this work [9]. The ultimate goal of neuroArm is
to provide the means for neuro-surgeons to perform surgery on patients while the
patient is being monitored by Magnetic Resonance Imaging (MRI), an environment
otherwise hostile to surgeons. Allowing the surgeon access to near real-time imagery
of the patient’s brain can ensure that a surgeon removes all traces of cancerous tumors
at the surgery’s conclusion. The result is fewer follow-up surgeries and hence reduced risk to patients and less strain on already over-booked surgical
equipment. The neuroArm system also employs two haptic (force-feedback) hand
controllers (Fig. 1.1) that the surgeon uses to control the surgical robot. In its current
configuration, manipulating the hand controllers provides positional commands to
each of the surgical robot arms (Fig. 1.2) while force sensor measurements at the
robot/environment interface provide force commands to the haptic devices. Other
uses for robots in surgery include minimally invasive surgery (in which only small incisions are made to gain access to a desired organ; the commercially available Da Vinci system [10] eliminates the need for large abdominal incisions in gynecological
surgeries), telesurgery [11] (in which a centrally located surgeon can perform surgery
Figure 1.1: neuroArm haptic hand controllers
in remote locations), and tremor reduction or improved performance at the microscale level [12].
A large body of research agrees that force-reflecting haptic devices enhance
task performance in robot assisted surgery [13]. However, the addition of haptic
devices certainly complicates the system as well. For instance, there is a necessity for
haptic devices to reproduce forces at a certain level of accuracy. Indeed, inaccurate
or misleading forces reflected by the haptic device could be detrimental, causing the
surgeon to apply excessive force as an example. Accordingly, research in haptics
is dominated by engineers as they use their skills to improve force transparency
between the surgeon and the remote environment. Progress on this front is being made from both a mechanical approach and a control systems approach. Haptic devices pose some unique challenges, addressed in this thesis, that a mechanical approach cannot solve.
Figure 1.2: neuroArm surgical (slave) robot manipulators
1.1.1 Introduction to Teleoperation Vocabulary
In order to properly discuss the problems at hand and their associated proposed
solutions, an understanding of certain terms is necessary. Consider a typical teleop-
eration setup found in Fig. 1.3. Starting at the left is the human. From a block
diagram perspective, the human operator is part of the system architecture. The
human receives information about the remote environment from the master device,
processes this information, and based on their intended task sends information back
to the master device. In this thesis, the master device is a robot manipulator, re-
ferred to as the haptic device, outfitted with actuators (to send information, such
as a force, to the human) and encoders and force sensors (to receive information
from the human). The interaction between the human and the master device is a
[Block diagram omitted: Human Operator — Master Device — Controller — Communications (T) — Controller — Slave Device — Remote Environment, split into master side and slave side]
Figure 1.3: Teleoperation setup
physical connection between the human’s hand and the master device. Both the
human and master device reside in a central control room, physically removed from
the remote environment. Information received by the master is then sent through a
communications channel which distorts the information through time delays, noise,
or otherwise. A controller at the slave (or remote environment) side receives the
human commanded information, processes it, and generates a control signal sent to
the actuators of the slave device. Again, in this thesis the slave device is a robot
manipulator equipped with actuators (to manipulate the environment) and encoders
and force sensors (to receive information from the environment). From here, the
process is reversed back to the human.
By construction a teleoperation architecture operates in feedback, with the human
and environment closing the loop. Thus, any additional internal feedback from the
slave/environment to the controller (not shown in Fig. 1.3) complicates the control
design.
To further understand the following discussion, formal definitions of impedance,
admittance, and transparency are given below. These terms follow from electrical
network theory. The analogy between electrical and mechanical systems gives rise to
a relation between the effort/flow pair in mechanical systems and the voltage/current
pair in electrical systems.
[Block diagram omitted: Human Operator — Teleoperator — Environment]
Figure 1.4: Impedance interpretation of a teleoperation system
Definition 1. The impedance Z ∈ ℝ^(n×n) of an n-port mechanical system maps velocity v ∈ ℝ^(n×1) to force f ∈ ℝ^(n×1):

f = Zv. (1.1)

Definition 2. Conversely, the admittance Y ∈ ℝ^(n×n) of an n-port mechanical system maps force f ∈ ℝ^(n×1) to velocity v ∈ ℝ^(n×1):

v = Y f. (1.2)

It follows that

Y = Z⁻¹. (1.3)

Definition 3. Consider the teleoperation setup as in Fig. 1.4. A system is said to be transparent if the impedance perceived by the human, Z_t, is the same as the impedance of the environment:

Z_t = Z_e. (1.4)

The same can be said for the admittances. In this case it is said that the teleoperator achieves perfect transparency.
Note that these definitions do not determine which quantities are inputs and
outputs, but only relate system states to each other.
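These definitions can be checked numerically. The following sketch (illustrative values only, not from the thesis) builds an arbitrary 2-port impedance matrix and verifies the relations in Eqs. (1.1)–(1.4):

```python
import numpy as np

# Hypothetical 2-port impedance matrix (N·s/m), chosen only for illustration.
Z = np.array([[4.0, 1.0],
              [1.0, 3.0]])

# Definition 1 / Eq. (1.1): impedance maps velocity to force, f = Z v.
v = np.array([0.5, -0.2])
f = Z @ v

# Definition 2 / Eq. (1.3): the admittance is the inverse mapping, Y = Z^{-1}.
Y = np.linalg.inv(Z)

# Mapping the force back through the admittance recovers the velocity, v = Y f.
v_recovered = Y @ f
print(np.allclose(v_recovered, v))  # True

# Definition 3 / Eq. (1.4): perfect transparency means the impedance felt by
# the human equals the environment impedance, Z_t = Z_e.
Z_t, Z_e = Z.copy(), Z.copy()
print(np.allclose(Z_t, Z_e))  # True in this idealized case
```

The check confirms that the impedance and admittance descriptions are two views of the same velocity/force relation.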
1.2 Problem Statement
The biggest obstacle that control engineers must address in haptic systems is the
fact that the human and, more troublesome, the environment enter into the stability
analysis. When interacting with arbitrary environments, the environment can exhibit
zero stiffness (slave robot motion in free space), near infinite stiffness (slave motion in
constrained motion, such as pressing against a wall), and anywhere in between. The environment stiffness enters directly into the stability analysis and affects the gain margin of the system. Thus, it is difficult to design a controller that maintains stability and desired performance over a variety of environment interactions.
This thesis aims at designing a controller which maintains stability and desired
performance in all possible scenarios, without any kind of switching or gain-scheduled control, which, as any seasoned control designer understands, is difficult to implement in reality. In particular, the controller is tested in the two most challenging scenarios, transitions between extreme environment stiffnesses:
• sudden loss of contact between the slave and the environment. Real-life situations that reflect this case are puncturing through tissue or sliding off the edge of an object;
• sudden contact with a near-rigid environment as, for example, when the slave hits a wall, table, or bone.
This second scenario also gives rise to a couple of other issues surrounding haptic
control design [14]. When a slave robot comes into contact with a stiff environment
it tends to have a large impact force which can cause damage to expensive force
sensors, the slave robot in general, and can be most detrimental to the environment
(i.e. surgical patient). Also, following the initial impact the slave robot has trouble
settling onto the hard surface and repeatedly bounces on and off the surface.
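The stiffness extremes described above can be illustrated with a simple unilateral spring-wall model (a hypothetical sketch for intuition only; it is not the environment model used in this thesis):

```python
def environment_force(x, x_wall=0.1, k_e=1.0e4):
    """Unilateral spring contact (illustrative only): zero stiffness in
    free space, stiffness k_e once the slave penetrates the wall at x_wall.
    Increasing k_e mimics a near-rigid surface such as bone or a table."""
    penetration = x - x_wall
    return k_e * penetration if penetration > 0.0 else 0.0

# Free space: the environment pushes back with zero force (zero stiffness).
print(environment_force(0.05))   # 0.0
# Contact: even 1 cm of overshoot past a stiff wall produces a large force,
# which is exactly the impact problem described above.
print(environment_force(0.11))   # ~100 N for k_e = 1e4
```

The discontinuous jump in effective stiffness at x_wall is what makes a single fixed-gain controller hard to design for both regimes.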
1.3 State of the Art in Teleoperation (literature review)
This section will introduce some of the most advanced and common setups of haptic
systems. Differences between haptic systems arise from
• which sensors are mounted on the slave/master. Typical choices are encoders
and/or force sensors.
• what type of control system is used.
• what properties/behavior does the remote environment exhibit.
1.3.1 Sensor choice
Considering the variety and number of sensors potentially appropriate for the setup in Fig. 1.3, there are many possible configurations of which states to control. Generally, the system states of most interest are force and position. The literature categorizes these possibilities into 2-, 3-, and 4-channel architectures.
4-channel architecture refers to master positions and human commanded force
measurements being sent to the slave-side controller while slave-position and envi-
ronment force measurements are sent to the master-side controller. An advantage to
this type of architecture is that both position and force tracking occur at both ends of
the system. Papers that examine this type of architecture [15,16] show that perfect
transparency can be achieved. However, this claim is made only theoretically. Sys-
tem transparency is analyzed using linear system theory and the assumption is that
the environment behavior (impedance or stiffness) is known. Performance degrades
as the environment deviates from the estimated impedance and more importantly
the robustness margin may vanish. For systems operating in highly defined task
spaces, this architecture is appropriate.
2-channel architecture sends one of master-position or human commanded force
to the slave-side controller, while either one of slave-position or environment mea-
sured force is sent to the master-side controller. Literature characterizes these ar-
chitectures as position-position [15, 17] position-force [15, 18–20], force-position [21],
and force-force [22, 23]. An advantage to this architecture is that fewer sensors are
required and a relatively good sense of transparency is maintained. Controller design
tends to be simpler in 2-channel architectures. An assumption that makes 2-channel
architecture valid is that the human operator and the environment behave either as
an admittance or impedance. For example, a system under position control at the
slave-side will interpret the remote environment as an impedance, receiving position
inputs (from the slave) and producing a force output (to the slave). Thus, force com-
mands would be redundant for the slave controller. The difficulty comes when trying
to generate slave positions in order to produce a desired force when the environment
impedance is unknown. Fortunately, humans are able to do this naturally and 2-
channel architecture can exploit this fact because the human is part of the system.
Indeed, the human varies his/her apparent admittance (or impedance, whatever the
case may be) based on their interpretation of the environment's impedance (or admittance). Thus, rather than having a controller “guess” the environment impedance,
2-channel architecture lets the human do this while the controllers are only concerned
with controlling a single state. Our human brains are quite adept at producing hand
positions in order to apply a certain force. Alternatively, a certain force can be
applied with our hand to achieve a desired position. We switch between the two
methods [21] based on the task at hand (in fact, more accurately we probably em-
ploy a mixture of both approaches). We also do this switching unconsciously. The
ultimate implication is that 2-channel architecture works well because our brains
perform the necessary impedance transformation (or admittance transformation, as
it may be). Another way of looking at this point is to consider the argument that
humans are capable of optimally producing desired position (or force) trajectories
based on given force (or position) and visual information. For this reason, I believe
that 2-channel architecture is superior to 4-channel architecture, which unnecessarily “over-controls” the system.
3-channel architecture, as the name suggests, sends 2 desired measurements to the
slave-side (or master-side) controller and 1 desired measurement to the master-side
(or slave-side) controller.
1.3.2 Choice of control system
Rather than an examination of specific controllers, this section gives an overview of
various approaches to controller design in haptic systems. The literature is dominated by two methods: linear system based and passivity based.
Linear system based designs typically utilize the structure displayed in Fig. 1.5
and originally proposed by [15] and seen in many others [16, 24, 25]. The C-blocks are the controller blocks to be designed, and 2-, 3-, or 4-channel architecture designs are achieved by setting appropriate C-blocks to 0 (i.e., for position-force 2-channel architecture, set C3 = C4 = 0). Stability of the overall system is guaranteed through
[Block diagram omitted: Human Operator — Master — Controllers — Slave — Remote Environment]
Figure 1.5: Linear system based control architecture
linear stability theory (Routh-Hurwitz criteria [26], scattering conditions [27], or
other frequency domain methods). Additionally, the C-blocks are designed using a
priori knowledge of Zh and Ze (human and environment impedances). Under perfect
knowledge of these impedances, the system achieves perfect transparency. However,
in a general robot assisted surgery application, the environment impedances can
range from free space Ze = 0 to contact with an ideally rigid environment Ze = ∞.
In [16] it is shown that a rigid (or even near rigid) environment easily causes the
system to go unstable. This issue is circumvented in [24] by using scaling factors
in some of the C-block controllers or by imposing restrictions on C-blocks which
both trade off system performance for stability as well as reduce flexibility in control
design. Even with these scaling factors, the controller is only guaranteed stable for a
limited range of environment impedances. Additionally, practical use of the scaling
factor method would require online adaptation of the controller based on changes
in the environment. It is also unreasonable to assume that an environment behaves
linearly.
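The linear stability checks cited above (e.g. the Routh–Hurwitz criterion) can be made concrete with a small sketch. This is a generic implementation of the regular Routh array test; the example polynomials are arbitrary and not taken from any of the cited designs:

```python
def routh_hurwitz_stable(coeffs):
    """Return True if the polynomial with the given coefficients (highest
    degree first) has all roots in the open left half-plane, by building
    the Routh array and checking the first column for sign changes.
    Handles only the regular case (no zero in the first column)."""
    n = len(coeffs) - 1                  # polynomial degree
    width = n // 2 + 2                   # zero-padded column count
    rows = [list(coeffs[0::2]), list(coeffs[1::2])]
    for r in rows:
        r.extend([0.0] * (width - len(r)))
    for _ in range(2, n + 1):
        pprev, prev = rows[-2], rows[-1]
        if prev[0] == 0:
            return False                 # singular case: not strictly stable
        new = [(prev[0] * pprev[j + 1] - pprev[0] * prev[j + 1]) / prev[0]
               for j in range(width - 1)] + [0.0]
        rows.append(new)
    first_col = [r[0] for r in rows[: n + 1]]
    return all(c > 0 for c in first_col) or all(c < 0 for c in first_col)

# s^3 + 2s^2 + 3s + 4: stable (the cubic Hurwitz condition 2*3 > 1*4 holds).
print(routh_hurwitz_stable([1.0, 2.0, 3.0, 4.0]))  # True
# s^3 + s^2 + 2s + 8: unstable (1*2 < 1*8).
print(routh_hurwitz_stable([1.0, 1.0, 2.0, 8.0]))  # False
```

In the linear designs above, such a test must be repeated for every assumed environment impedance, which is precisely why stability over the full stiffness range is hard to guarantee.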
Passivity based controllers have gained popularity among haptic researchers for
many reasons. The primary goal of a passivity based controller is to ensure stability
for any environment impedance. A key result of passivity theory is that the inter-
connection of passive systems is also passive. The implication of this result is that
the system can be split into multiple components, each proven passive, and then
the entire system is guaranteed passive. Looking at Fig. 1.3, both the human
and environment blocks are considered passive (that is, neither the human nor the
environment behave to destabilize the system). Additionally, the master and slave
devices can be considered passive. The blocks that require design are the controllers
and the communications. In any system, time delay can cause instability; the amount of allowable time delay is the delay margin, which is roughly related to the phase margin.
Papers such as [28, 29] have addressed the issue of stabilizing the communications
channel in terms of passivity. In fact, the results are powerful and are able to ensure
stability under an arbitrary time delay. The real problem in passivity based controls
is designing the controller block. In fact, most passivity based systems do not have a
controller at all. This is because to prove that a controller is passive can be difficult.
Indeed, the passivity condition is as follows:
Theorem 1. An n-port system with input u ∈ ℝⁿ and output y ∈ ℝⁿ is passive, and thus stable, iff

⟨u_τ, y_τ⟩ ≥ 0, (1.5)

where the inner product is defined in the normal way:

⟨u_τ, y_τ⟩ := ∫₀^τ u(t)ᵀ y(t) dt. (1.6)
If u and y represent a power pair, an interpretation of passivity is that the system
must always be dissipating energy. Passivity is a generalization of bounded-input-
bounded-output (BIBO) stability. Yet the whole advantage of passivity based anal-
ysis is to analyze the controller separately from the rest of the system. The difficulty
in designing a controller based on passivity is that the entire set of possible inputs
is probably unknown. This is why stability by passivity is referred to as a type of
unconditional stability.
As such, there is no general constructive method for proving passivity. Hannaford et al. [30] have come the closest to developing a general control design by
“observing” the passivity of the system and appropriately injecting any shortage of
passivity. In effect, this controller adds damping to the system until it stabilizes.
In Khalil’s text [31], this control is called output-feedback control and is discussed
in more detail in Section 5.2.
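The passivity condition of Theorem 1 can be sanity-checked numerically. For a toy example (assumed here, not from the thesis), take a pure damper with force input u and velocity output y = u/b: the truncated inner product of Eq. (1.6) becomes (1/b)∫u² dt ≥ 0 for any input, so the damper is passive:

```python
import numpy as np

def inner_product(u, y, dt):
    """Truncated inner product <u_tau, y_tau> of Eq. (1.6), approximated
    by a Riemann sum over uniformly sampled signals."""
    return float(np.sum(u * y) * dt)

b = 5.0                                        # damping coefficient (assumed), N·s/m
dt = 1.0e-3
t = np.arange(0.0, 2.0, dt)
u = np.sin(3.0 * t) + 0.5 * np.sin(17.0 * t)   # arbitrary test force input
y = u / b                                      # damper: velocity proportional to force

# The inner product is non-negative, consistent with passivity (Eq. 1.5).
print(inner_product(u, y, dt) >= 0.0)          # True
```

Note that a proof of passivity requires Eq. (1.5) to hold for every admissible input, which is exactly the difficulty noted above; a numerical check only verifies one candidate input.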
1.4 Introduction to the Proposed Solution
A deviation from both of the above mentioned control system designs is proposed.
Using the theory of Lyapunov stability and neural networks, stability is guaranteed
for all possible environment impedances while maintaining good and consistent trans-
parency between the human and the remote environment. The proposed controller
has no prior knowledge of the remote environment and is fully adaptive, without
the need for switching controllers or gain scheduling. By sending force commands
to the slave and receiving force information from the environment, a 2-channel control architecture is designed. By doing so, it is shown that the resulting force in the case of stiff contact between the slave and environment is reduced, thus protecting the
14
hardware and patient. Three main contributions can be identified in this work:
• A unique auxiliary error definition reduces force errors while providing position
control when the slave moves in free space,
• Neural-adaptive backstepping ensures stable control when contact with a stiff
environment is made,
• A novel neural network update law ensures stable and robust control.
Chapter 2
Background Theory
There are seven key theorems and concepts that will prove necessary in subsequent
chapters of this thesis. First, Lyapunov theory is examined; then the control
Lyapunov function, the Lyapunov redesign method, and backstepping are presented as
methods for nonlinear stability analysis. Then, radial basis function neural networks
(RBFNs) are introduced and their ability to model uncertain functions is discussed.
Finally, this section discusses the integration of Lyapunov theory and RBFN theory
from a control systems perspective. The theory introduced in this section is not
complete; rather, the important results pertaining to the following work are presented.
2.1 Lyapunov Theory
The dominant method for ensuring the stability of nonlinear systems stems from
Lyapunov theory. Lyapunov theory examines primarily the stability of equilibrium
points of dynamical systems, so a definition of stability in terms of equilibrium points
is a logical place to start.
Generally speaking, an equilibrium point is stable if solutions of the system
starting near the equilibrium point stay near the equilibrium point. The set of
starting points for which solutions stay near the equilibrium point defines a
region in which the system is stable. If solutions diverge away from the equilibrium,
the point is called an unstable equilibrium point. From a stability analysis
perspective it is sufficient to prove that solutions stay near the equilibrium point, but
a stronger and more desirable condition is that, as time approaches infinity, solutions
of the system actually converge to the equilibrium point. This scenario, like
its linear counterpart, is called asymptotic stability. Formalized in a mathematical
definition [32]:

Definition 4. A point xe is said to be an equilibrium point of the system
ẋ = f(x), (2.1)
if f(xe) = 0, ∀t. In addition, xe is a stable equilibrium point if for any given t0
and positive ε ∈ ℝ⁺ there exists a δ = δ(t0, ε) ∈ ℝ⁺ such that if ||x(t0) − xe|| < δ, then
||x(t; t0, x0) − xe|| ≤ ε for all t ≥ t0.
Lyapunov theory bridges the gap between understanding this definition and analyzing
stability. Given an arbitrary dynamic system, Lyapunov theory allows us to
determine the stability of an equilibrium point by the following theorem.

Theorem 2. Let D ⊂ ℝⁿ contain the equilibrium point x = 0 of the system
ẋ = f(x). (2.2)
x = 0 is a stable equilibrium if there exists a continuously differentiable function of the
system states V : D → ℝ such that
V(0) = 0, (2.3)
V(x) > 0 ∀x ∈ D \ {0}, and (2.4)
V̇(x) ≤ 0 ∀x ∈ D. (2.5)
The reader is referred to [31] for a detailed proof of this theorem. Note that
there are few restrictions on the function V , and choosing an appropriate V is an
art rather than a science. On the other hand, this lack of restrictions allows great
flexibility in the analysis of the system and this freedom permits a control system
designer to achieve desired results. Additionally, it is important to realize that
an appropriate coordinate transformation can always make the point x = 0 an
equilibrium point without affecting system behavior. Finally, though the above theory
pertains to an autonomous system, it can be extended to non-autonomous systems
provided the control is a function of the system states and is thus absorbed into f(x).
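As a concrete sketch of Theorem 2 (the system and Lyapunov function below are textbook choices for illustration, not examples from the thesis), consider ẋ = −x³ with V = x²/2. The three conditions can be checked directly, and V̇ = x·f(x) = −x⁴ ≤ 0:

```python
import numpy as np

def f(x):
    # Autonomous system x_dot = f(x) with equilibrium at x = 0.
    return -x**3

def V(x):
    # Candidate Lyapunov function: V(0) = 0 and V(x) > 0 for x != 0.
    return 0.5 * x**2

def V_dot(x):
    # Along trajectories: V_dot = (dV/dx) * f(x) = x * (-x^3) = -x^4 <= 0.
    return x * f(x)

xs = np.linspace(-2.0, 2.0, 401)
assert V(0.0) == 0.0
assert np.all(V(xs[xs != 0.0]) > 0.0)
assert np.all(V_dot(xs) <= 0.0)

# Simulation confirms V decreases along a trajectory (forward Euler).
x, dt = 1.5, 1e-3
history = [V(x)]
for _ in range(5000):
    x += dt * f(x)
    history.append(V(x))
assert history[-1] < history[0]
```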
2.1.1 Control Lyapunov Function
Due to the descriptive rather than constructive nature of Theorem 2, discussion of
the control Lyapunov function method is necessary [33]. Lyapunov-based control
design follows a straightforward process and allows a control system designer great
flexibility in their design. This freedom can be exploited to achieve desired
performance, robustness, and control cost. First, a candidate Lyapunov control function
V satisfying the conditions
• V(0) = 0,
• V = V(x),
• V(x) > 0 ∀x ∈ D \ {0},
is chosen. Then, the time derivative of V is determined. Finally, a control is designed
that forces V̇ to be negative semi-definite. Such a process ensures stability of the
equilibrium point. The major leap made here is that a priori knowledge of the
derivative of V is not necessary in order to design a stabilizing control.
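The three-step process above can be sketched on a scalar example (the drift term x² and the gain are arbitrary illustrative choices, not from the thesis): with V = x²/2, V̇ = x(f(x) + u), and choosing u = −f(x) − Gx forces V̇ = −Gx² ≤ 0:

```python
import numpy as np

G = 2.0

def f(x):
    return x**2          # open-loop drift (unstable for x > 0)

def u(x):
    # Control designed so that V_dot = x*(f(x) + u(x)) = -G*x^2 <= 0.
    return -f(x) - G * x

def V_dot(x):
    return x * (f(x) + u(x))

xs = np.linspace(-3.0, 3.0, 601)
assert np.all(V_dot(xs) <= 1e-12)   # negative semi-definite on this grid

# Closed-loop simulation from an initial condition the open loop diverges from.
x, dt = 2.5, 1e-3
for _ in range(10000):
    x += dt * (f(x) + u(x))
assert abs(x) < 1e-2                # state driven to the equilibrium
```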
2.1.2 Lyapunov Redesign (robust design)
The method of Lyapunov redesign [34] follows from the Lyapunov control function
method and provides a means of robust control design. Consider the system
ẋ = f(x) + g u + h(x)δ(x, t), (2.6)
where f is a known function, g ≠ 0 is a parameter, u is the control, and h(x) is some
function that acts on the uncertain but bounded function δ(x, t),
|δ(x, t)| ≤ ρ(x) ≤ δmax. (2.7)
To begin the robust design process, a control unom(x) is found that stabilizes the
nominal system
ẋ = f(x) + g unom. (2.8)
The final control designed will be u = unom + urob where urob is the robustifying
component of the controller. If unom stabilizes Eq. (2.8), then ∃ some Lyapunov
function V (x) for the nominal system. Using the same Lyapunov control function as
for the nominal system, the derivative is found to be
V̇(x) = (∂V/∂x)[ f(x) + g(unom(x) + urob(x)) + h(x)δ(x, t) ]. (2.9)
urob can be designed at this stage and there are many possible designs, two of which
will be discussed. Choose the robust control
urob = −( |h(x)|ρ(x)/g ) (∂V/∂x)ᵀ/|∂V/∂x|, (2.10)
then
V̇(x) ≤ (∂V/∂x)[ f(x) + g unom(x) ] + |∂V/∂x| [ −|h(x)|ρ(x) + |h(x)||δ(x, t)| ], (2.11)
which is known to be negative because of the nominal design and the fact that
|δ(x, t)| ≤ ρ(x); the system is therefore asymptotically stable. Granted, this design of
urob causes the control law to be discontinuous, thus violating a smoothness
requirement of Lyapunov theory. However, there are methods for smoothing out this
control, which in return define a region of convergence rather than asymptotic
stability. One such example is to use tanh(∂V/∂x) rather than (∂V/∂x)ᵀ/|∂V/∂x|.
Another such design for urob, which is employed in this thesis, is
urob = −(µ/g) |h(x)|^r1 |∂V/∂x|^r2, (2.12)
where r1 > 1 ∈ ℝ and r2, µ ∈ ℝ⁺. If this robust control is being designed for a virtual
control of a backstepping design, this robust control law and its n time-derivatives
must have an analytic solution at the equilibrium x = 0, where n is the number of
stages remaining in the backstepping procedure. This is because a robust control
is designed for each virtual control and will thus be differentiated at the following
step of backstepping. This constraint directly affects the parameters r1 and r2. For
example, if there are two stages of backstepping in a particular design, r1 > 2 and
r2 > 1 must be satisfied for the robust control designed in the first stage. Usually,
however, r1 = r2 = 1.1 is chosen.
To see the design reasoning for this control it is helpful to examine what Eq.
(2.12) does to the Lyapunov function,
V̇(x) = (∂V/∂x)[ f(x) + g unom(x) ] + (∂V/∂x)[ −µ|h(x)|^r1 |∂V/∂x|^r2 + h(x)δ(x, t) ]. (2.13)
Consider the bounds on V̇(x) due to the robust control (the nominal component is
already known to be asymptotically stable),
V̇rob(x) ≤ |∂V/∂x| [ −µ|h(x)|^r1 |∂V/∂x|^r2 + |h(x)|δmax ], (2.14)
and it can be seen that V̇rob(x) < 0 when
|∂V/∂x| |h(x)|^((r1−1)/r2) > (δmax/µ)^(1/r2). (2.15)
The restrictions mentioned above on r1 and r2 are now clear. In order to explicitly
find the bound of the error signal, assume that
V = (1/2)x², (2.16)
and that the nominal control design results in the asymptotically stable system
ẋ = −Gx. The Lyapunov derivative for the disturbance and robust terms is found to be
V̇rob(x) = x(−µ|h(x)|^r1 |x|^r2 + h(x)δ(x, t)), (2.17)
which is bounded by
V̇rob(x) ≤ |x|(−µ|h(x)|^r1 |x|^r2 + |h(x)|δmax). (2.18)
With certainty it can be said that |h(x)|^r1 ≥ |h(x)| for r1 > 1. Therefore, taking the
worst case |h(x)|^r1 = |h(x)|,
V̇rob(x) ≤ |x||h(x)|(−µ|x|^r2 + δmax), (2.19)
and it can be said that
V̇rob(x) < 0 when |x| > (δmax/µ)^(1/r2). (2.20)
Because of the original control Lyapunov function, this is also the ultimate bound
on the signal. A point of interest is that this ultimate bound occurs when |h(x)|r1 =
|h(x)|, which implies h(x) = 0. Thus, the ultimate bound exists only when distur-
bances are absent and we are guaranteed a smaller bound otherwise.
Figure 2.1: General shape of the robust control
At first it seems desirable to design µ large but in reality it is desirable to achieve
µ ≤ δmax. This way, the robust control does not dominate over the nominal control.
The robust control laws are designed in this way for a few reasons. First, knowledge
of ρ(x) is not needed. Second, the control is smooth and its derivative is also smooth
and analytic near the origin. Fig. 2.1 shows the general shape of the robust control
and its spatial derivative is shown in Fig. 2.2. Thus, the requirements for Lyapunov
stability are met.
Figure 2.2: General spatial derivative of the robust control

Third, a robust control is desired to bound the disturbance Lyapunov function,
but in general it is not desired for this robust control to dominate the actual
control. The nonlinear damping technique [35] would use powers of 2 or 3 (versus
1.1 in the proposed robust control), which shrinks the bounds and ensures fast
convergence. However, these high powers tend to dominate the nominal control and
affect system performance. Analyzing the bounds on V̇rob in a conservative sense and
using maximum disturbances to derive ultimate bounds gives deceptive predictions
of the actual performance; it is a more useful tool for simply guaranteeing
stability. Experiments show that using powers of 1.1 ensures sufficient robustness
while allowing the nominal control to behave as desired.
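The ultimate bound (2.20) can be sketched numerically (all values below are arbitrary illustrative choices, not from the thesis). One scalar-case interpretation of Eq. (2.12) with V = x²/2 is urob = −µ|h|^r1 |x|^r2 sign(x), where the sign factor makes x·urob ≤ 0, matching the bound used in Eq. (2.18):

```python
import numpy as np

G, mu, r1, r2 = 2.0, 1.0, 1.1, 1.1
delta_max = 0.5
h = 1.0

def delta(t):
    # Unknown but bounded disturbance, |delta| <= delta_max.
    return delta_max * np.sin(5.0 * t)

def u(x):
    u_nom = -G * x                                      # nominal stabilizer
    u_rob = -mu * abs(h)**r1 * abs(x)**r2 * np.sign(x)  # smooth robust term
    return u_nom + u_rob

# Simulate x_dot = u(x) + h*delta(t) and check the bound |x| <= (delta_max/mu)^(1/r2).
x, dt, t = 1.5, 1e-3, 0.0
for _ in range(20000):          # 20 s of forward Euler
    x += dt * (u(x) + h * delta(t))
    t += dt

bound = (delta_max / mu)**(1.0 / r2)
assert abs(x) < bound           # state ends inside the ultimate bound
```

With the low power 1.1 the robust term stays comparable in size to the nominal control near the origin, which is exactly the design trade-off discussed above.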
2.1.3 Backstepping
For systems in strict feedback form [31]
ẋ1 = f1(x1) + g1x2, (2.21)
ẋ2 = f2(x2) + g2x3, (2.22)
⋮
ẋn−1 = fn−1(xn−1) + gn−1xn, (2.23)
ẋn = fn(xn) + gnu, (2.24)
with x1, …, xn and g1, …, gn ∈ ℝ, the backstepping technique allows us to control states
x1 through xn−1 “directly” by stepping the integrator for state xn back through the system.
It is possible to do this by assuming that there exists some control α(x), called a
virtual control, that can stabilize the system described by states x1 through xn−1. Then
the actual control u is designed in hopes of attaining α. The backstepping technique
does so by introducing an additional state which is the difference between the actual
control and the desired virtual control,
z = u − α. (2.25)
Again, the best way to visualize what is happening is by example. Consider the
two-state system
ẋ1 = f1(x1) + g1x2, (2.26)
ẋ2 = f2(x2) + g2u. (2.27)
In the first stage of backstepping, some control α is designed that can stabilize the
one-state virtual system ẋ1 = f1(x1) + g1α by assuming a Lyapunov control function
V1 = (1/2)x1²,
V̇1 = x1(f1(x1) + g1α) + g1x1z. (2.28)
It is seen that α = −g1⁻¹(f1(x1) + G1x1), where G1 > 0 ∈ ℝ, is desired. However,
in reality the state x2 controls this subsystem, so u must be designed such that x2
approaches the desired virtual control α. Introducing the error z = x2 − α and
designing u to drive z → 0 will achieve this goal. For the second stage, use a
Lyapunov control function V2 = V1 + (1/2)z²,
V̇2 = −G1x1² + g1x1z + z(ẋ2 − α̇), (2.29)
   = −G1x1² + g1x1z + z(f2(x2) + g2u − α̇). (2.30)
Choosing
u = g2⁻¹(α̇ − f2(x2) − g1x1 − G2z), (2.31)
where G2 > 0 ∈ ℝ, stabilizes the system asymptotically. G2 in this case determines
the control effort spent to reduce the virtual control error z.
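The two-stage design of Eqs. (2.26)–(2.31) can be simulated directly (the particular f1, f2, gains, and initial conditions below are arbitrary illustrative choices, not from the thesis):

```python
import numpy as np

G1, G2 = 2.0, 5.0
g1, g2 = 1.0, 1.0
f1 = lambda x1: x1**2          # stage-1 drift
f2 = lambda x2: np.sin(x2)     # stage-2 drift

def control(x1, x2):
    alpha = -(f1(x1) + G1 * x1) / g1             # stage-1 virtual control
    z = x2 - alpha                               # virtual control error
    x1_dot = f1(x1) + g1 * x2
    alpha_dot = -(2.0 * x1 + G1) * x1_dot / g1   # analytic derivative of alpha
    u = (alpha_dot - f2(x2) - g1 * x1 - G2 * z) / g2   # Eq. (2.31)
    return u, z

x1, x2, dt = 0.8, 0.0, 1e-4
for _ in range(100000):                          # 10 s of forward Euler
    u, z = control(x1, x2)
    x1 += dt * (f1(x1) + g1 * x2)
    x2 += dt * (f2(x2) + g2 * u)

_, z = control(x1, x2)
assert abs(x1) < 1e-3    # x1 driven to the equilibrium
assert abs(z) < 1e-3     # x2 has converged to the virtual control alpha
```

Note how α̇ is computed analytically from ẋ1, as the discussion of adaptive backstepping below requires.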
2.1.4 Adaptive Backstepping and Tuning Functions
Note that the final control designed using the backstepping approach requires
knowledge of α̇. It turns out that this term must be evaluated analytically in order to
ensure good system performance. However, it may not always be possible to do this.
In the above example,
α̇ = −g1⁻¹( (∂f1(x1)/∂x1) ẋ1 + G1ẋ1 ). (2.32)
It is possible that some of the derivatives in Eq. (2.32) are unknown, in which case it
is common to model these unknown terms using a universal approximator (denoted
UA and described below in detail). If the unknown terms are lumped into µ(x1, ẋ1)
and the known terms into γ(x1, ẋ1),
α̇ = γ(x1, ẋ1) + φµ(x1, ẋ1)wµ, (2.33)
where φµ(x1, ẋ1)wµ models the unknown term
µ(x1, ẋ1) = φµ(x1, ẋ1)wµ. (2.34)
This approach is often referred to as a kind of adaptive backstepping and the learning
rule is based on the virtual control error z. It is important to note that the UA should
only be used to model terms that are unknown and any known derivatives should
be calculated analytically. The derivative of α must be calculated as analytically as
possible to ensure good performance and robustness.
Another possibility is that some of the terms in the first stage of backstepping were
unknown and also modelled using a UA. If this is the case, these unknown terms will
appear in the second stage of backstepping (for instance, ẋ1 appears in Eq. (2.32)).
One could use another UA to model all unknown functions in Eq. (2.32), yielding what
is called an over-parameterized system. Indeed, this must be done if the learning rule
is designed in the first stage of backstepping. However, a much more robust method
is to postpone the learning rule design to the second stage of backstepping. This
allows the original UA from the first stage to model the same terms in the second
stage. Doing so introduces additional derivative terms into the learning rule, which
are called “tuning functions” [36]. These tuning functions improve robustness by
providing the learning rule with information about additional system dynamics.
Another way of looking at the increased robustness is to see that the tuning function
method allows a more analytical model of α̇. There is also the freedom to scale the
tuning functions to achieve a desired performance; a proof is shown in Appendix B.
Note that there may still be additional unknown terms in Eq. (2.32) that will be
modeled using a UA.
2.2 Radial Basis Function Networks
A radial basis function network (RBFN) [37] is a type of neural network and is a
powerful and useful tool for modelling unknown and nonlinear functions. RBFNs are
a special case of a universal approximator; thus they can not only model a function
but are guaranteed to model the function well enough that the modeling error is
bounded. This is seen from the following definition and theorem.

Definition 5. A family of functions g : D → ℝ is of class G if
• the constant function g(x) = 1, x ∈ D, belongs to G,
• the sum ag1 + bg2 is of class G for a, b ∈ ℝ and g1, g2 ∈ G,
• the product g1g2 is of class G for g1, g2 ∈ G,
• g(x1) ≠ g(x2) for x1 ≠ x2 with x1, x2 ∈ D,

and

Theorem 3. Given a continuous function f : D → ℝ, there exists for each ε > 0 a
function f̂ ∈ G, f̂ : D → ℝ, such that
||f(x) − f̂(x)||∞ < ε, x ∈ D, (2.35)
[38].

The above definition and theorem encompass the Stone–Weierstrass approximation
theorem and provide the foundation for proving system stability using universal
approximators [39].
For a single-output RBFN there is an m-element row vector of n-dimensional
radial basis functions (kernels) φ(q) as well as an m-element column vector of weights
w. q is the input to the network and the output is given by
o = φ(q)w = Σ(i=1…m) [ Π(j=1…n) φi(qj) ] wi. (2.36)
The following corollary pertaining to RBFNs, resulting from Theorem 3, can be stated.

Corollary 1. Let f(q) : ℝⁿ → ℝ be a Lipschitz function and let φ(q) ⊂ G and
φ(q) ⊂ D be integral-bounded kernel functions. The function f can be expressed as
f(q) = Σ(i=1…m) [ Π(j=1…n) φi((qj − cj)/σ) ] wi + d(q), (2.37)
on D, where ci > 0, σ > 0, and ||d(q)|| ≤ δ is bounded.
Lipschitz functions are defined as follows:

Definition 6. A function f : X → Y is Lipschitz continuous if there exists a K ∈ ℝ⁺
such that
|f(x1) − f(x2)| ≤ K|x1 − x2|, ∀x1, x2 ∈ X. (2.38)
Here K is the Lipschitz constant.
That is, a RBFN is able to approximate any smooth real-valued function f(q) on
a certain domain D, and it has associated with it a set of ideal weights w which result
in a modeling error expressed by the function d(q). Of course, in reality access to
these ideal weights is impossible, but a set of actual weights ŵ is available and a
weight error can be defined by
w̃ = w − ŵ. (2.39)
A RBFN can be structured in many ways, though for the sake of this thesis each of
the m basis functions is centered at a unique point ci in the domain D and all have a
width σ. By width it is meant, in a statistical sense, the radius from the center ci
at which one standard deviation of the area under the basis function lies. All neural
networks in this thesis are implemented as RBFNs using Gaussian basis functions
φi(qj) = exp( −(qj − ci)² / (2σ²) ), i = 1, …, m. (2.40)
Examples of Gaussian basis functions in RBFN adaptive control design are numerous
and well documented in the literature [40–43]. It can be seen that φi(qj) is of class
G since
φi(qj) = 1 with σ = ∞. (2.41)
The other conditions are trivial to prove.
Finally, from a performance perspective, the minimum number of basis functions
required to approximate f(q) is 2n, where n is the dimension of the input vector q.
Requiring 2n basis functions allows only a very coarse approximation, with one
basis function for each dimension in each direction. Often this requirement is
referred to as the “curse of dimensionality” because RBFNs with many inputs may
require too much computer memory to implement.
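As a sketch of Eqs. (2.37) and (2.40) in one dimension (the target function, number of kernels, and width below are arbitrary choices for illustration, not from the thesis), a Gaussian RBFN fit by least squares yields a small, bounded modeling error d(q) on the domain:

```python
import numpy as np

m = 15                                   # number of basis functions
centers = np.linspace(-3.0, 3.0, m)      # unique centers c_i in the domain D
sigma = 0.6                              # common width

def phi(q):
    """m Gaussian kernels (Eq. 2.40) evaluated at each scalar input in q."""
    q = np.atleast_1d(q)
    return np.exp(-(q[:, None] - centers[None, :])**2 / (2.0 * sigma**2))

f = lambda q: np.sin(2.0 * q) + 0.3 * q  # the "unknown" target function

# Offline least-squares fit stands in for the ideal weights w of Corollary 1.
q_train = np.linspace(-3.0, 3.0, 200)
w = np.linalg.lstsq(phi(q_train), f(q_train), rcond=None)[0]

q_test = np.linspace(-2.5, 2.5, 101)
d = np.abs(phi(q_test) @ w - f(q_test))  # modeling error d(q)
assert d.max() < 0.05                    # error is bounded on D
```

In the adaptive controllers below the weights are not fit offline like this, but updated online by a learning rule.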
2.2.1 RBFN in controller design
The use of neural networks in controller design was first formalized and justified
in [44]. The easiest way to see how RBFNs are used in controller design is by
example. Consider a system with single state x, an equilibrium point at x = 0, and
dynamics
ẋ = f(x) + u, (2.42)
where f(x) is a real-valued but unknown function which can thus be modeled by a
RBFN
f(x) = φ(x)w + d(x), |d(x)| < dmax ∀x ∈ D. (2.43)

Theorem 4. A stabilizing control for Eq. (2.42) is
u = −φ(x)ŵ − Gx, (2.44)
where G > 0 ∈ ℝ and the weights ŵ are updated online according to ˙ŵ = φᵀx − νŵ.
Proof. Using the positive definite Lyapunov control function
V(x, w̃) = (1/2)(x² + w̃ᵀw̃), (2.45)
V̇ will be shown to be bounded:
V̇ = xẋ + w̃ᵀ (d/dt)(w − ŵ). (2.46)
Using Eqs. (2.42), (2.43), (2.39), (2.44), and rearranging algebraically,
V̇ = −Gx² + x d(x) + νw̃ᵀw − νw̃ᵀw̃, (2.47)
  ≤ −G||x||² + dmax||x|| + ν||w̃|| ||w|| − ν||w̃||², (2.48)
  ≤ −G[ (||x|| − dmax/(2G))² − dmax²/(4G²) ] − ν[ (||w̃|| − ||w||/2)² − ||w||²/4 ]. (2.49)
Recognizing this as the general equation of an ellipse, it can be said that V̇ is negative
when
||x|| > dmax/(2G) + √( dmax²/(4G²) + ν||w||²/(4G) ) = δx, (2.50)
or
||w̃|| > ||w||/2 + √( dmax²/(4νG) + ||w||²/4 ) = δw. (2.51)
An ultimate bound xb on the system state x is found explicitly using
Vb = (1/2)(||x||² + ||w̃||²):
Vb(||x|| = xb, ||w̃|| = 0) = Vb(||x|| = δx, ||w̃|| = δw), (2.52)
(1/2)xb² = (1/2)(δx² + δw²), (2.53)
xb = √(δx² + δw²). (2.54)
There are some points of interest arising from the above proof that are worth
discussing. First of all, the states are bounded by an elliptic region centered at
the point (dmax/(2G), ||w||/2) in the (||x||, ||w̃||) plane, whose maximum size depends on
dmax, ν, and ||w||. It is desired to have this region as small as possible so that the states
are forced close to equilibrium. However, control system designers tend to be
restricted as to how small they can make this region, since dmax is a property of the
RBFN that depends on the number, widths, and centers of the basis functions. Thus,
decreasing dmax tends to be an art rather than a science. Factors that affect dmax
include the dispersion of basis functions within the domain D, the widths of the basis
functions, the number of basis functions, and the types of basis functions.
||w|| is primarily a property of the unknown function f(x).
ν is a coefficient used to bound the RBFN weights, and the term νŵ is called
a leakage term [45]. The leakage term is used to bound ŵ since, without it, the
time derivative of the Lyapunov control function would not be a function of all the
system states. There are other methods of doing this besides leakage, such as
projection [32], deadzone [46], and e-modification [47]. Again, choosing a value for
ν is largely heuristic rather than prescriptive.
This thesis will also use projection, which is examined now. Projection is a
robust weight update law defined by
˙ŵ = 0, if ŵ ≥ ||w||max and φᵀx > 0,
˙ŵ = 0, if ŵ ≤ ||w||min and φᵀx < 0,
˙ŵ = φᵀx, otherwise. (2.55)
Such an update law ensures that V̇ is bounded, by noting that a bound on w̃ exists
by construction,
||w̃|| ≤ max(||w||max − w, w − ||w||min), (2.56)
and V̇ is negative when
||x|| > dmax/G, if ˙ŵ ≠ 0,
||x|| > (dmax + φ||w̃||)/G, otherwise. (2.57)
Projection is a weight update law used when certain limits of the neural network
output could cause the system to become unstable. It is commonly used when the
neural network output is inverted, in which case the output must remain a certain
distance from 0.
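A scalar sketch of the projection law (2.55) follows (the bounds are arbitrary illustrative values, not from the thesis); the estimate is frozen whenever the nominal update would push it past a prescribed bound:

```python
# Keep the weight estimate inside [w_min, w_max], e.g. away from 0 so that
# the network output can safely be inverted.
w_min, w_max = 0.2, 2.0

def projected_update(w_hat, phi_x):
    """Return the update w_hat_dot under projection; phi_x plays the role of phi^T x."""
    if w_hat >= w_max and phi_x > 0.0:
        return 0.0           # outward update at the upper bound: cancelled
    if w_hat <= w_min and phi_x < 0.0:
        return 0.0           # outward update at the lower bound: cancelled
    return phi_x             # nominal update otherwise

# Inside the bounds the nominal update passes through...
assert projected_update(1.0, 0.5) == 0.5
# ...at the upper bound, outward updates are cancelled but inward ones are not...
assert projected_update(2.0, 0.5) == 0.0
assert projected_update(2.0, -0.5) == -0.5
# ...and symmetrically at the lower bound.
assert projected_update(0.2, -0.5) == 0.0
assert projected_update(0.2, 0.5) == 0.5
```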
Another point to note is that the RBFN weights are updated based on the system
state x, not on w̃. As a result, it is incorrect to say that the RBFN “learns” f(x).
Rather, it is common to state that the RBFN “models” f(x); in fact, the RBFN
only aims at driving the system states to equilibrium. As such, the effectiveness of
a RBFN in terms of controller design stems from its ability to adapt quickly rather
than to produce a highly accurate model. Although this has its disadvantages, a key
advantage of updating weights based on system states is that RBFNs tend to be
quite robust to unmodeled uncertainties and disturbances.
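The whole of Theorem 4 can be sketched in simulation (the unknown dynamics, gains, and kernel layout below are arbitrary illustrative choices, not from the thesis). Note how the weights are driven by the state x, as discussed above, and how the state converges only to a small ultimate bound rather than exactly to the equilibrium:

```python
import numpy as np

# x_dot = f(x) + u with f unknown to the controller; control u = -phi(x)@w_hat - G*x
# and leakage update w_hat_dot = phi^T * x - nu * w_hat (Theorem 4).
G, nu, dt = 5.0, 0.1, 1e-3
centers = np.linspace(-2.0, 2.0, 9)
sigma = 0.5

def phi(x):
    return np.exp(-(x - centers)**2 / (2.0 * sigma**2))   # Gaussian kernels, Eq. (2.40)

f = lambda x: 1.0 + np.sin(3.0 * x)    # "unknown" dynamics, nonzero at x = 0

x = 1.0
w_hat = np.zeros_like(centers)
for _ in range(20000):                 # 20 s of forward Euler
    u = -phi(x) @ w_hat - G * x
    w_hat += dt * (phi(x) * x - nu * w_hat)   # driven by the state, not by w tilde
    x += dt * (f(x) + u)

# Ultimately bounded near (not at) the equilibrium, as Eqs. (2.50)-(2.54) predict.
assert abs(x) < 0.2
```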
Chapter 3
Proposed Controller Design
3.1 Introduction
Beginning with an overview of the system dynamics and block diagrams, error defini-
tions, and design choices, this section ultimately proposes a controller for a surgical
teleoperation haptic system. It is written with constant consideration of two
extreme scenarios: when contact with the environment is suddenly lost, and when
the slave robot comes into contact with a wall.
3.2 System description
First of all, this thesis examines only a one-degree of freedom (DOF) translational
slave robot with the assumption that all of the theory mentioned hereafter can be
extended to the multi-DOF case through appropriate kinematic transformations. In
light of this, the controller design begins with the mass-spring-damper model shown
in Fig. 3.1. Dynamically, the model of Fig. 3.1 is represented by
Mẍ = −Drẋ − Dmẋ − K(x)x + Fc, (3.1)
where x(t) ∈ ℝ. M represents the combined inertia of the robot and the surgical
tool. Dr contains damping coefficients arising from the robot joints. The terms K(x)
and Dm are properties of the tissue/material that the surgical robot is in contact
with. The approach of modeling tissue with a spring is well established in the
literature [48], [49]; however, to be precise, human tissue tends to exhibit a visco-elastic
Figure 3.1: Mechanical model of a 1 DOF surgical slave robot in contact with an environment
property, meaning K = K(x, t). For design purposes, a shortcut is made by assuming
that some of this visco-elastic behavior can be captured in a spring-damper model.
As will be seen, the proposed controller is highly robust to time-varying properties of
the tissue due to fast adaptation in the neural-network control law. Tissue also has
mass, but this is not modeled explicitly in Fig. 3.1 because any inertial properties
of the tissue can be included in M (the mass of the slave robot). K(x) also contains
the end-effector force sensor dynamics.
The slave robot position, velocity, and acceleration are x, ẋ, and ẍ respectively.
Finally, Fc is the controller output and is what is designed in this chapter.
From the above model it is implied that the force sensor on the end-effector
of the slave robot outputs
Fm = Dmẋ + K(x)x. (3.2)
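The model of Eqs. (3.1)–(3.2) can be checked in a short simulation (all parameter values below, including a constant K instead of K(x), are arbitrary choices for illustration, not from the thesis). A constant actuator force pressed into the tissue should settle at the static deflection Fc/K, at which point the measured force equals the applied force:

```python
import numpy as np

# 1-DOF slave model: M*x_ddot = -(Dr + Dm)*x_dot - K*x + Fc,  Fm = Dm*x_dot + K*x.
M, Dr, Dm, K = 1.0, 2.0, 4.0, 100.0
Fc = 5.0                       # constant actuator force
x, x_dot, dt = 0.0, 0.0, 1e-4

for _ in range(100000):        # 10 s of forward Euler
    x_ddot = (-Dr * x_dot - Dm * x_dot - K * x + Fc) / M
    x_dot += dt * x_ddot
    x += dt * x_dot

Fm = Dm * x_dot + K * x        # measured contact force, Eq. (3.2)
assert abs(x - Fc / K) < 1e-3  # settles at the static deflection Fc/K
assert abs(Fm - Fc) < 1e-2     # measured force matches applied force at rest
```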
3.3 Proposed Control Law
This section will introduce the proposed control law and justify design decisions. An
in-depth stability analysis follows in Chapter 4.
The proposed design focuses on attempting to improve three limitations of current
haptic controllers:
• When the slave robot comes into contact with stiff environments (such as a
wall, bone, or table), controllers tend to apply excess force to the environment,
potentially causing damage to the robot or the environment as well as causing
instability.
• If the slave robot suddenly loses contact with the environment or punctures
through a layer of tissue, there tends to be tool overshoot.
• Control commands are required to be filtered, which introduces a possible source
of instability.
These items are discussed in more detail below.
3.3.1 Design choice 1: Error definition
A primary objective of the proposed controller is to have the slave robot track hu-
man commanded force rather than position. Under a time delayed communications
channel, this will decrease the impact force when the slave hits a wall. For instance,
say the human operator commands a force Fd(t) and a time delay of T seconds exists
in the communications channel. A force tracking controller will aim for
Fm(t) = Fd(t− T ). (3.3)
That is, if the slave robot hits a wall and measures a large contact force the con-
troller will pull the slave back, at least initially. This is contrasted to a system under
position control in which the human may have commanded a slave position beyond
the physical constraint of the wall. A controller designed for position tracking will
push “through” the wall and may apply a dangerously large force to the slave and/or
environment before the human can react to the collision. Such high contact forces are
particularly dangerous in surgical applications where the “environment” is a human
patient and the robots are very expensive.
Although performance improves in the case of stiff contact, force control will com-
promise performance when a sudden loss of contact with the environment occurs. In
this case, the controller will cause the slave to overshoot as it tries to achieve the last
commanded force. Thus, another objective of the proposed controller is to reduce
tool overshoot in the case that contact is suddenly lost with the environment. Such
a scenario could represent a puncture through tissue or the sliding of the end-effector
off of a surface.
For these reasons, an auxiliary error is defined for the controller,
s = Λε + ẋ, (3.4)
where ε is the force error defined by ε = Fm − Fd and Λ ∈ ℝ is a positive tuning
parameter. A larger value of Λ emphasizes force tracking, whereas a small Λ
slows down the slave and behaves like a damper. Admittedly, a slowly responding
robot is undesirable; however, the proposed controller allows the surgeon to tune
the value of Λ to their liking. This damping also acts to maintain desirable control
in two ways: one, the robot is slowed down after a puncture or loss of contact, and
two, if the slave bounces off a hard surface the controller will dampen any vibrations
that arise (which often occur in teleoperation). Achieving s ≡ 0 is the ultimate goal
of the proposed controller. Doing so implies
ẋ = −Λε. (3.5)
Let us examine the implications of achieving s ≡ 0 in the case of stiff contact.
Substituting in the expression for Fm results in
ẋ = −ΛFm + ΛFd, (3.6)
  = −ΛK(x)x − ΛDmẋ + ΛFd, (3.7)
ẋ = −( ΛK(x)/(1 + ΛDm) ) x + ( Λ/(1 + ΛDm) ) Fd, (3.8)
which constitutes a stable (nonlinear) system with input Fd and state x, under the
reasonable assumption that K(x) is positive semi-definite. Thus, if a controller attains
s ≡ 0, the system behaves very much like a sliding mode controller in which the state
trajectory asymptotically approaches the origin along the line ẋ = −Λε.
In the case of unconstrained, free motion, where there is no contact with material
and Fm = 0,
ẋ = ΛFd, (3.9)
and the system comes under velocity control in which the slave velocity is directly
proportional to the human commanded force. Again, achieving s ≡ 0 will cause the
state trajectory to converge asymptotically to the origin along the line ẋ = ΛFd.
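The two regimes implied by s ≡ 0 can be sketched numerically (the values of Λ, K, Dm, and Fd below are arbitrary illustrative choices, not from the thesis): in contact, integrating Eq. (3.8) drives the measured force to the commanded force; in free space, Eq. (3.9) gives a constant velocity proportional to the commanded force:

```python
# Contact regime: x_dot = (-Lam*K*x + Lam*Fd) / (1 + Lam*Dm), Eq. (3.8).
Lam, K, Dm, Fd = 0.05, 200.0, 2.0, 3.0
x, dt = 0.0, 1e-3
for _ in range(20000):                 # 20 s of forward Euler
    x += dt * (-Lam * K * x + Lam * Fd) / (1.0 + Lam * Dm)

x_dot = (-Lam * K * x + Lam * Fd) / (1.0 + Lam * Dm)
Fm = Dm * x_dot + K * x                # measured force, Eq. (3.2)
assert abs(Fm - Fd) < 1e-3             # force tracking: Fm -> Fd

# Free-space regime: x_dot = Lam*Fd, Eq. (3.9) -- constant commanded velocity.
x_free = 0.0
for _ in range(1000):                  # 1 s of forward Euler
    x_free += dt * (Lam * Fd)
assert abs(x_free - Lam * Fd * 1.0) < 1e-9
```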
3.3.2 Design choice 2: Backstepping
As mentioned previously, control commands must be filtered before being sent to
the slave to protect the actuators from damage. In linear (and even some simple
non linear) control applications the effect of this filter can be accounted for and
37
stability can be ensured. For example, an ideal relay controller has well known
properties and its describing function is used to ensure stability when the control
signal is filtered. However, if a more complex and nonlinear controller is desired,
the describing function method is not only laborious (or impossible) but can also
become an inaccurate model, unable to guarantee stability. As a result, filters are
added heuristically and the stability analysis is performed assuming their absence.
The reason why a control signal must be filtered before it is sent to the actuators
is that high frequency components in the signal can cause chattering, limit cycles,
undue stress on the actuator hardware, and, in general, unexpected motor dynamics
in the high frequency range.
A novel solution is proposed by using the backstepping technique as a tool to filter
the control signal. This approach has the advantage that the desired filtering
properties are built into the controller and the entire system is analyzed for
stability holistically. Using the backstepping technique in this way is certainly novel,
as it is not being used for its originally intended purpose; rather, a resulting
characteristic of systems designed using backstepping is being exploited. An extension
of the analysis of backstepping in Section 2.1.3 reveals the filtering properties of the
backstepping technique. Consider the Lyapunov candidate
V2(x1, z) = V1(x1) + (1/2)z², (3.10)
where z is the virtual control error u(t) − α(t) and V1 is the Lyapunov function
designed in the previous step. The derivative becomes
V̇2 = V̇1 + zż, (3.11)
and in general, after virtual control design, one ends up with
V̇2 = −G1x1² − G2z², (3.12)
implying that z has dynamics
ż = −G2z, (3.13)
which is an asymptotically stable system: as t → ∞ the virtual control error z → 0,
and there is no filtering effect beyond the transient response. If, however, as is likely,
an exact expression for α̇ is unknown but a “best estimate” derivative of the virtual
control, denoted ˙α̂, is attainable, then, using the same control as in Eq. (2.31) but
with the “best estimate” derivative,
V̇2 = −G1x1² − G2z² + ( ˙α̂(t) − α̇(t) )z. (3.14)
Thus, z has dynamics
ż = −G2z + ˙α̂(t) − α̇(t). (3.15)
The frequency domain solution for z, assuming zero initial conditions, is
L{ż} = L{ −G2z + ˙α̂(t) − α̇(t) }, (3.16)
sZ(s) = −G2Z(s) + sÂ(s) − sA(s), (3.17)
(s + G2)(U(s) − A(s)) = sÂ(s) − sA(s), (3.18)
U(s) = sÂ(s)/(s + G2) + G2A(s)/(s + G2). (3.19)
Here L{α} = A(s) and L{α̂} = Â(s) are used. It can be seen that the actual control
u(t) is a sum of a filtered ˙α̂(t) and a filtered α(t). Thus, any high frequency components
of either of these signals do not appear in the actual control. Additionally, the break
frequency at which the control is filtered can be fully determined through G2. A small
G2 will smooth out the control more than a large G2.
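The low-pass character of the z-dynamics can be verified numerically (the break frequency, disturbance frequency, and step size below are arbitrary illustrative choices, not from the thesis). The error obeys ż = −G2·z + e(t), a first-order low-pass with break frequency G2 rad/s, so a high-frequency component of e is strongly attenuated:

```python
import numpy as np

G2 = 10.0          # break frequency of the implicit filter, rad/s
omega = 200.0      # disturbance frequency well above the break

z, t, dt = 0.0, 0.0, 1e-5
amps = []
for k in range(300000):            # 3 s; keep only post-transient samples
    e = np.sin(omega * t)          # high-frequency "derivative error" input
    z += dt * (-G2 * z + e)
    t += dt
    if k > 250000:
        amps.append(abs(z))

gain = max(amps)                            # observed steady-state amplitude of z
expected = 1.0 / np.hypot(G2, omega)        # first-order magnitude 1/sqrt(G2^2 + w^2)
assert abs(gain - expected) / expected < 0.05   # far below the low-frequency gain 1/G2
```

Raising G2 moves the break frequency up and lets more of the high-frequency content through, matching the tuning remark above.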
In the proposed design, the virtual control error turns out to be
z = Fc − α, (3.20)
where
α = αnom + αrob, (3.21)
and
αnom = Fm + p⁻¹(−φ1w1 + ΛFd − G1s), (3.22)
and αrob is a Lyapunov redesign robust control defined as
αrob = −|s|^1.1 ( µ2|−Fm + Fc|^1.1 + µ1 ). (3.23)
Because p and φ1w1 are neural approximations, the terms that could potentially
introduce high frequency components in α are Fm and s. As will be seen, the derivatives
of these terms are modeled using neural networks, thus smoothing them and justifying
a backstepping design.
Originally, the backstepping technique was believed to filter out large spikes in
the term Fm, especially when the slave came into contact with a wall or suddenly lost
contact with the environment. However, testing has shown that the true advantage
of using the backstepping technique is when the slave robot is already in contact
with a wall. Consider the natural mode of vibration for a mass-spring model with
stiffness K and mass M having an undamped natural frequency of
f_n = \frac{1}{2\pi}\sqrt{\frac{K}{M}}. \quad (3.24)
As K increases the frequency of this natural mode also increases. In the case of
a surgical slave robot, a large K may contain surgical tool flexibilities. It is well
documented that these high frequency natural modes are easily excited by a control
system [50]. Many have proposed various filter designs to try and dampen the high
frequency components of the control signal so that the controller does not excite the
natural modes of the system [51]. Indeed, controlling these high frequency modes is often unnecessary since they tend not only to have small amplitudes (when not excited) but are also lightly damped (a property of the physical system). In the case of this thesis, the damping Dm ensures that the natural modes of vibration dampen naturally without the controller; thus the control of lower frequencies is more important. The proposed use of the backstepping technique provides an easily tunable and effective filter and is certainly a viable alternative to other approaches recorded in the literature. Additionally, the stability proof is strong and potential causes of instability are identified more easily.
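For a feel of the numbers, the snippet below evaluates Eq. (3.24) using the slave mass and the two environment stiffnesses that later appear in Table 6.1; pairing these particular values is only illustrative.

```python
import math

def natural_frequency(K, M):
    """Undamped natural frequency f_n = (1/2*pi)*sqrt(K/M) of a mass-spring mode."""
    return math.sqrt(K / M) / (2.0 * math.pi)

# Slave mass M = 0.1 kg; compliant medium vs. stiff wall (Table 6.1 values)
f_soft = natural_frequency(K=30.0, M=0.1)     # ~2.8 Hz
f_stiff = natural_frequency(K=3000.0, M=0.1)  # ~27.6 Hz
# A 100x increase in stiffness raises the natural mode by a factor of 10
```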
3.3.3 RBFNs used in controller
The proposed controller will make use of three neural networks to model unknown
terms in the system. No a priori knowledge of slave and environment dynamics is
assumed. The first neural network models
\phi_1(x, F_d, s)\,w_1 + d_1 = \Lambda\frac{d}{dt}(K(x)x) - M^{-1}(\Lambda D_m + 1)D_r\dot{x}. \quad (3.25)
Explicitly, the terms on the right hand side depend on x and \dot{x}, but providing the neural network with x, F_d, and s gives it unique state information, since s = s(x, \dot{x}, F_d). For implementation purposes it is easier to provide x, F_d, and s, and there is no effect on stability. Additionally, looking ahead to the backstepping technique, providing x, F_d, and s will make implementing the derivative terms of \phi_1 easier.
\phi_1 w_1 has no intuitive physical significance, and it is helpful to go through the stability proof in Chapter 4 to justify its design. The weights are updated according to
\dot{\hat{w}}_1 = \beta_1\left(\tau_1 + \gamma\,\phi_1^T\left(\frac{\partial\phi_1}{\partial s}\hat{w}_1 + G_1\right)\hat{p}^{-1}z - \nu_1\hat{w}_1\right), \quad (3.26)
where \beta_1 > 0 \in \mathbb{R} determines the learning rate of the RBFN and \tau_1 is the tuning function
\tau_1 = \phi_1^T s, \quad (3.27)
implemented for robust control purposes. \gamma > 0 \in \mathbb{R} is a scaling factor, which is often needed because the derivative terms in Eq. (3.26) tend to dominate the tuning function in magnitude; as a result, \gamma tends to be much less than 1. It is shown in [52] that scaling the tuning functions by \gamma so that all terms are similar in magnitude allows the control system designer to balance robust control with desired system performance. \nu_1 > 0 \in \mathbb{R} is the coefficient of the robustifying leakage term used to ensure the weights are bounded, and G_1 > 0 \in \mathbb{R} is a positive control gain. Leakage (\sigma-modification) is used for this update law because it tends to stabilize the weight updates better than e-modification, which is particularly susceptible to the bursting phenomenon discussed in [53], [54]. Bursting in the case of a surgical slave robot is likely caused by the drastic changes in errors that occur at transitions in environment stiffness.
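The structure of such an update can be sketched as follows. This is a simplified, hypothetical illustration: it keeps only the tuning-function term \tau_1 = \phi_1^T s and the \sigma-modification leakage of Eq. (3.26), omits the \gamma-scaled backstepping derivative term, and all numerical choices (centers, widths, the stand-in target function) are assumptions.

```python
import numpy as np

class RBFN:
    """Gaussian radial-basis-function network: output = phi(x)^T w."""
    def __init__(self, centers, width, beta, nu):
        self.centers = np.asarray(centers, dtype=float)  # one center per neuron
        self.width = width    # common Gaussian width
        self.beta = beta      # learning rate (beta_1 in the text)
        self.nu = nu          # sigma-modification (leakage) coefficient (nu_1)
        self.w = np.zeros(len(self.centers))

    def phi(self, x):
        return np.exp(-((x - self.centers) ** 2) / (2.0 * self.width ** 2))

    def output(self, x):
        return float(self.phi(x) @ self.w)

    def update(self, x, s, dt):
        # Simplified Eq. (3.26): tuning term phi^T s plus leakage -nu*w.
        w_dot = self.beta * (self.phi(x) * s - self.nu * self.w)
        self.w += dt * w_dot

net = RBFN(centers=np.linspace(-1, 1, 9), width=0.3, beta=30.0, nu=0.001)
target = lambda x: np.sin(2 * x)      # stands in for the unknown dynamics
dt = 0.001
for k in range(20000):
    x = np.sin(0.01 * k)              # persistently exciting input sweep
    s = target(x) - net.output(x)     # error signal driving the update
    net.update(x, s, dt)

err = abs(target(0.4) - net.output(0.4))
```

The leakage term keeps the weights bounded even without persistent excitation, which is the role \nu_1 plays in the proof of Chapter 4.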
The second neural network models
p + d_p = M^{-1}(\Lambda D_m + 1). \quad (3.28)
To be precise, in the one-DOF case this is not a neural network but an adaptive parameter. Looking ahead to a multi-DOF case, I found it appropriate to include this adaptive parameter in this section, as the mass matrix will depend on both link angles and angular velocities. Likewise, the damping coefficient will become a matrix of Coriolis and centripetal forces, also dependent on link angles and angular velocities. The parameter \hat{p} is updated according to
\dot{\hat{p}} = \beta_p\,\mathrm{Proj}\left[\tau_p + \gamma(-F_m + F_c)\left(\frac{\partial\phi_1}{\partial s}\hat{w}_1 + G_1\right)\hat{p}^{-1}z + \zeta(\bar{p} - \hat{p})\right], \quad (3.29)
where \tau_p is the tuning function
\tau_p = (-F_m + F_c)s, \quad (3.30)
and the definition
\mathrm{Proj}[\cdot] = \begin{cases} 0 & \text{if } \hat{p} > \|p\|_{max} \text{ and } \cdot > 0, \\ 0 & \text{if } \hat{p} < \|p\|_{min} \text{ and } \cdot < 0, \\ \cdot & \text{otherwise}, \end{cases} \quad (3.31)
called projection, is used to ensure the boundedness of \hat{p}. Additionally, since \hat{p}^{-1} appears in the control law (shown below), the projection approach ensures that this term does not become too large. It is hard, if not impossible, to know exact values for \|p\|_{max} and \|p\|_{min}, so in implementation these values are estimated as the desired maximum and minimum values that are to appear in the control law. Because of this, a novel projection rule is used which includes the supervisory term
\zeta(\bar{p} - \hat{p}), \quad (3.32)
where \zeta \in \mathbb{R}^+ and \bar{p} is a number that the estimate \hat{p} is driven towards. \zeta is chosen small, such that the update law still has flexibility in its learning abilities yet \hat{p} does not deviate significantly from \bar{p}. \bar{p} is chosen to be a very rough estimate of p (but, more importantly, an initial value of \hat{p}). In fact, results obtained in this thesis suggest that performance has less to do with the choice of \bar{p} and depends more on the fact that \hat{p} stays within a certain range of its initial value. The results convincingly support this novel method.
There is an analogy between this novel supervisory method and the common leakage method. Leakage terms tend to drive the weights to their initial value (zero); similarly, the supervisory term drives the weights to their initial value. This analogy is further strengthened in the results section, where it was found that the ideal value of \zeta is also the ideal value of \nu_{1,2}.
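The projection rule and its supervisory term translate into a small guard function; in the sketch below the argument names and the sign convention of the supervisory term are assumptions consistent with Eqs. (3.29)-(3.32).

```python
def proj(p_hat, update, p_max, p_min, zeta=0.0, p_bar=None):
    """Projection rule of Eq. (3.31) with the supervisory term zeta*(p_bar - p_hat).

    Returns the admissible update rate for p_hat. p_max/p_min are the
    designer's estimated bounds; p_bar is a rough a-priori estimate of p."""
    if zeta and p_bar is not None:
        update = update + zeta * (p_bar - p_hat)  # gently pull p_hat toward p_bar
    if p_hat > p_max and update > 0:
        return 0.0   # at the upper bound and still moving up: freeze
    if p_hat < p_min and update < 0:
        return 0.0   # at the lower bound and still moving down: freeze
    return update

# The rate is zeroed only when it would push p_hat further out of range
assert proj(11.0, +2.0, p_max=10.0, p_min=1.0) == 0.0
assert proj(11.0, -2.0, p_max=10.0, p_min=1.0) == -2.0
assert proj(0.5, -1.0, p_max=10.0, p_min=1.0) == 0.0
```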
A final neural network models the unknown term
\phi_2(x, \dot{x}, F_c)\,w_2 + d_2 = -\dot{F}_m(x, \dot{x}, F_c), \quad (3.33)
and is updated according to
\dot{\hat{w}}_2 = \beta_2(\phi_2^T z - \nu_2\hat{w}_2). \quad (3.34)
For stability purposes, the reasonable assumption is made that all uncertainties d_1, d_p, and d_2 are bounded functions of the system states (in accordance with Theorem 3).
3.3.4 Control Law
The actual control applied to the slave is
F_c(t) = \int_0^t \dot{F}_c(\tau)\,d\tau, \quad (3.35)
where
\dot{F}_c = \dot{F}_{c,nom} + \dot{F}_{c,rob}. \quad (3.36)
The nominal component is defined as
\dot{F}_{c,nom} = \dot{\hat{\alpha}} - s\hat{p} - G_2 z, \quad (3.37)
and the robustifying term, found using the Lyapunov redesign approach, is
\dot{F}_{c,rob} = -\left(\mu_3 + \mu_4|\kappa_1|^{1.1} + \mu_5|\kappa_1(-F_m + F_c)|^{1.1} + \mu_6|\kappa_2|^{1.1} + \mu_7|\kappa_3|^{1.1} + \mu_8|\kappa_3(-F_m + F_c)|^{1.1}\right)|z|^{1.1}, \quad (3.38)
with the following definitions:
\kappa_1 = -\hat{p}^{-1}\left(\frac{\partial\phi_1}{\partial s}\hat{w}_1 + G_1\right), \quad (3.39)
\kappa_2 = \hat{p}^{-1}|s|^{1.1}\mu_2|{-F_m + F_c}|^{0.1}\,\mathrm{sgn}(-F_m + F_c), \quad (3.40)
\kappa_3 = \hat{p}^{-1}|s|^{0.1}\left(\mu_2|{-F_m + F_c}|^{1.1} + \mu_1\right)\mathrm{sgn}(s). \quad (3.41)
It is important to design a robust control because of the auxiliary error definition: since it behaves like a sliding mode control, the robust terms help keep the system on the sliding mode. The virtual control error z was previously defined in Eq. (3.20).
\dot{\hat{\alpha}} = \dot{\hat{\alpha}}_{nom} + \dot{\hat{\alpha}}_{rob}, \quad (3.42)
where
\dot{\hat{\alpha}}_{nom} = -\phi_2\hat{w}_2 - \dot{\hat{p}}\,\hat{p}^{-2}(-\phi_1\hat{w}_1 + \Lambda\dot{F}_d - G_1 s) + \hat{p}^{-1}\left(-\phi_1\dot{\hat{w}}_1 - \left(\frac{\partial\phi_1}{\partial x}\dot{x} + \frac{\partial\phi_1}{\partial F_d}\dot{F}_d\right)\hat{w}_1 + \Lambda\ddot{F}_d - \left(\frac{\partial\phi_1}{\partial s}\hat{w}_1 + G_1\right)\dot{s}\right), \quad (3.43)
and
\dot{s} = -\Lambda\dot{F}_d + \phi_1\hat{w}_1 + \hat{p}(-F_m + F_c). \quad (3.44)
A Lyapunov redesign robustifying term is
\dot{\hat{\alpha}}_{rob} = -\hat{p}^{-1}|s|^{1.1}\left(1.1\,\mu_2|{-F_m + F_c}|^{0.1}\,[\dot{F}_c + \phi_2\hat{w}_2]\,\mathrm{sgn}(-F_m + F_c)\right) - 1.1\,\hat{p}^{-1}|s|^{0.1}\,\dot{s}\,\mathrm{sgn}(s)\left(\mu_2|{-F_m + F_c}|^{1.1} + \mu_1\right) + \dot{\hat{p}}\,\hat{p}^{-2}|s|^{1.1}\left(\mu_2|{-F_m + F_c}|^{1.1} + \mu_1\right). \quad (3.45)
An additional switching control, used for stability purposes, is defined in Appendix B.
Chapter 4
Stability Proof
This section uses Lyapunov stability theory to prove the uniform ultimate boundedness of all system states under the control F_{c,nom} with no system disturbances (stability in the presence of disturbances is discussed in Appendix A). The proof is largely algebraic, and the subscripts "nom" are dropped since only the nominal control is considered. Begin with the Lyapunov candidate
V_1 = \frac{1}{2}s^2 + \frac{1}{2\beta_p}\tilde{p}^2 + \frac{1}{2\beta_1}\tilde{w}_1^T\tilde{w}_1. \quad (4.1)
Taking the derivative of this candidate,
\dot{V}_1 = s\dot{s} + \frac{1}{\beta_p}\tilde{p}\,\frac{d}{dt}(p - \hat{p}) + \frac{1}{\beta_1}\tilde{w}_1^T\frac{d}{dt}(w_1 - \hat{w}_1). \quad (4.2)
The ideal weights in the neural networks are assumed not to change with time, so their time derivatives are 0. Granted, the ideal weights will change with time, but it is assumed that the update rate of the estimated weights is much faster than the rate at which the ideal weights change. Thus it can be assumed that at any instant of time the time derivative of the ideal weights is dominated by the time derivative of the estimated weights.
Consider separately the term \dot{s}:
\dot{s} = \Lambda\dot{\epsilon} + \ddot{x} \quad (4.3)
= \Lambda(\dot{F}_m - \dot{F}_d) + \ddot{x} \quad (4.4)
= \Lambda\left(D_m\ddot{x} + \frac{d}{dt}(K(x)x) - \dot{F}_d\right) + \ddot{x} \quad (4.5)
= -\Lambda\dot{F}_d + \Lambda\frac{d}{dt}(K(x)x) + (\Lambda D_m + 1)\ddot{x}. \quad (4.6)
Substituting in the slave dynamics,
\dot{s} = -\Lambda\dot{F}_d + \Lambda\frac{d}{dt}(K(x)x) + M^{-1}(\Lambda D_m + 1)(-D_r\dot{x} - F_m + F_c) \quad (4.7)
= -\Lambda\dot{F}_d + \Lambda\frac{d}{dt}(K(x)x) - M^{-1}(\Lambda D_m + 1)D_r\dot{x} + M^{-1}(\Lambda D_m + 1)(-F_m + F_c). \quad (4.8)
Using the neural network models of Eqs. (3.25) and (3.28),
\dot{s} = -\Lambda\dot{F}_d + \phi_1 w_1 + p(-F_m + F_c). \quad (4.9)
Returning to the Lyapunov candidate,
\dot{V}_1 = s(-\Lambda\dot{F}_d + \phi_1 w_1 + p[-F_m + F_c]) - \frac{1}{\beta_p}\tilde{p}\dot{\hat{p}} - \frac{1}{\beta_1}\tilde{w}_1^T\dot{\hat{w}}_1. \quad (4.10)
At this stage it is useful to use the substitutions w_1 = \hat{w}_1 + \tilde{w}_1 and p = \hat{p} + \tilde{p} and combine the error terms together:
\dot{V}_1 = s(-\Lambda\dot{F}_d + \phi_1\hat{w}_1 + \phi_1\tilde{w}_1 + [\hat{p} + \tilde{p}][-F_m + F_c]) - \frac{1}{\beta_p}\tilde{p}\dot{\hat{p}} - \frac{1}{\beta_1}\tilde{w}_1^T\dot{\hat{w}}_1 \quad (4.11)
= s(-\Lambda\dot{F}_d + \phi_1\hat{w}_1 + \hat{p}[-F_m + F_c]) + \tilde{p}\left([-F_m + F_c]s - \frac{1}{\beta_p}\dot{\hat{p}}\right) + \tilde{w}_1^T\left(\phi_1^T s - \frac{1}{\beta_1}\dot{\hat{w}}_1\right). \quad (4.12)
Introducing the virtual control error of Eq. (3.20) and the virtual control of Eq. (3.22) into the Lyapunov function,
\dot{V}_1 = s(-\Lambda\dot{F}_d + \phi_1\hat{w}_1 + \hat{p}[-F_m + z + \alpha]) + \tilde{p}\left([-F_m + F_c]s - \frac{1}{\beta_p}\dot{\hat{p}}\right) + \tilde{w}_1^T\left(\phi_1^T s - \frac{1}{\beta_1}\dot{\hat{w}}_1\right) \quad (4.13)
= s(-\Lambda\dot{F}_d + \phi_1\hat{w}_1 + \hat{p}[-F_m + z + F_m + \hat{p}^{-1}(-\phi_1\hat{w}_1 + \Lambda\dot{F}_d - G_1 s)]) + \tilde{p}\left([-F_m + F_c]s - \frac{1}{\beta_p}\dot{\hat{p}}\right) + \tilde{w}_1^T\left(\phi_1^T s - \frac{1}{\beta_1}\dot{\hat{w}}_1\right) \quad (4.14)
= s\hat{p}z - G_1 s^2 + \tilde{p}\left([-F_m + F_c]s - \frac{1}{\beta_p}\dot{\hat{p}}\right) + \tilde{w}_1^T\left(\phi_1^T s - \frac{1}{\beta_1}\dot{\hat{w}}_1\right). \quad (4.15)
Using the expressions for the tuning functions,
\dot{V}_1 = s\hat{p}z - G_1 s^2 + \tilde{p}\left(\tau_p - \frac{1}{\beta_p}\dot{\hat{p}}\right) + \tilde{w}_1^T\left(\tau_1 - \frac{1}{\beta_1}\dot{\hat{w}}_1\right). \quad (4.16)
Now, to begin the second stage of backstepping, use the Lyapunov candidate
V_2 = V_1 + \frac{1}{2}z^2 + \frac{1}{2\beta_2}\tilde{w}_2^T\tilde{w}_2, \quad (4.17)
and differentiate:
\dot{V}_2 = \dot{V}_1 + z\dot{z} - \frac{1}{\beta_2}\tilde{w}_2^T\dot{\hat{w}}_2 \quad (4.18)
= s\hat{p}z - G_1 s^2 + z(\dot{F}_c - \dot{\alpha}) + \tilde{p}\left(\tau_p - \frac{1}{\beta_p}\dot{\hat{p}}\right) + \tilde{w}_1^T\left(\tau_1 - \frac{1}{\beta_1}\dot{\hat{w}}_1\right) - \frac{1}{\beta_2}\tilde{w}_2^T\dot{\hat{w}}_2. \quad (4.19)
Insert the proposed control law:
\dot{V}_2 = s\hat{p}z - G_1 s^2 + z(\dot{\hat{\alpha}} - s\hat{p} - G_2 z - \dot{\alpha}) + \tilde{p}\left(\tau_p - \frac{1}{\beta_p}\dot{\hat{p}}\right) + \tilde{w}_1^T\left(\tau_1 - \frac{1}{\beta_1}\dot{\hat{w}}_1\right) - \frac{1}{\beta_2}\tilde{w}_2^T\dot{\hat{w}}_2 \quad (4.20)
= -G_1 s^2 - G_2 z^2 + z(\dot{\hat{\alpha}} - \dot{\alpha}) + \tilde{p}\left(\tau_p - \frac{1}{\beta_p}\dot{\hat{p}}\right) + \tilde{w}_1^T\left(\tau_1 - \frac{1}{\beta_1}\dot{\hat{w}}_1\right) - \frac{1}{\beta_2}\tilde{w}_2^T\dot{\hat{w}}_2. \quad (4.21)
Looking separately at \dot{\alpha}, derived from Eq. (3.22),
\dot{\alpha} = \frac{d}{dt}(F_m) + \frac{d}{dt}(\hat{p}^{-1})(-\phi_1\hat{w}_1 + \Lambda\dot{F}_d - G_1 s) + \hat{p}^{-1}\left(-\frac{d}{dt}(\phi_1(x, F_d, s)\hat{w}_1) + \Lambda\frac{d}{dt}(\dot{F}_d) - G_1\frac{d}{dt}(s)\right) \quad (4.22)
= \dot{F}_m - \dot{\hat{p}}\,\hat{p}^{-2}(-\phi_1\hat{w}_1 + \Lambda\dot{F}_d - G_1 s) + \hat{p}^{-1}\left(-\left(\frac{\partial\phi_1}{\partial x}\dot{x} + \frac{\partial\phi_1}{\partial F_d}\dot{F}_d + \frac{\partial\phi_1}{\partial s}\dot{s}\right)\hat{w}_1 - \phi_1\dot{\hat{w}}_1 + \Lambda\ddot{F}_d - G_1\dot{s}\right) \quad (4.23)
= -\phi_2 w_2 - \dot{\hat{p}}\,\hat{p}^{-2}(-\phi_1\hat{w}_1 + \Lambda\dot{F}_d - G_1 s) + \hat{p}^{-1}\left(-\left(\frac{\partial\phi_1}{\partial x}\dot{x} + \frac{\partial\phi_1}{\partial F_d}\dot{F}_d\right)\hat{w}_1 - \phi_1\dot{\hat{w}}_1 + \Lambda\ddot{F}_d\right) - \hat{p}^{-1}\left(\frac{\partial\phi_1}{\partial s}\hat{w}_1 + G_1\right)\dot{s} \quad (4.24)
= -\phi_2 w_2 - \dot{\hat{p}}\,\hat{p}^{-2}(-\phi_1\hat{w}_1 + \Lambda\dot{F}_d - G_1 s) + \hat{p}^{-1}\left(-\left(\frac{\partial\phi_1}{\partial x}\dot{x} + \frac{\partial\phi_1}{\partial F_d}\dot{F}_d\right)\hat{w}_1 - \phi_1\dot{\hat{w}}_1 + \Lambda\ddot{F}_d\right) - \hat{p}^{-1}\left(\frac{\partial\phi_1}{\partial s}\hat{w}_1 + G_1\right)(-\Lambda\dot{F}_d + \phi_1 w_1 + p[-F_m + F_c]), \quad (4.25)
and subtracting this from \dot{\hat{\alpha}},
\dot{\hat{\alpha}} - \dot{\alpha} = \phi_2\tilde{w}_2 + \hat{p}^{-1}\left(\frac{\partial\phi_1}{\partial s}\hat{w}_1 + G_1\right)(\phi_1\tilde{w}_1 + \tilde{p}[-F_m + F_c]). \quad (4.26)
Substitute this back into the expression for \dot{V}_2:
\dot{V}_2 = -G_1 s^2 - G_2 z^2 + \tilde{p}\left(\tau_p + [-F_m + F_c]\left(\frac{\partial\phi_1}{\partial s}\hat{w}_1 + G_1\right)\hat{p}^{-1}z - \frac{1}{\beta_p}\dot{\hat{p}}\right) + \tilde{w}_1^T\left(\tau_1 + \phi_1^T\left(\frac{\partial\phi_1}{\partial s}\hat{w}_1 + G_1\right)\hat{p}^{-1}z - \frac{1}{\beta_1}\dot{\hat{w}}_1\right) + \tilde{w}_2^T\left(\phi_2^T z - \frac{1}{\beta_2}\dot{\hat{w}}_2\right). \quad (4.27)
As the final step in determining \dot{V}_2, insert the proposed weight update laws. For this stability proof, it is assumed that \gamma = 1; for a proof of stability for arbitrary values of this parameter see Appendix B. A simple example can convince the reader that the value of \gamma does not affect stability. For instance, if the weight update laws are designed in the first stage of backstepping, stable weight updates would not even include the tuning functions \tau (i.e., \gamma = 0); this approach creates an overparameterized system. On the other hand, as will be shown below, \gamma = 1 also provides stable weight updates. Hence the tuning functions are, as their name suggests, used to tune the weight updates and add robustness to the overall system by providing unique state information.
\dot{V}_2 = -G_1 s^2 - G_2 z^2 - \zeta\tilde{p}(\bar{p} - \hat{p}) + \nu_1\tilde{w}_1^T\hat{w}_1 + \nu_2\tilde{w}_2^T\hat{w}_2 \quad (4.28)
= -G_1 s^2 - G_2 z^2 - \zeta\tilde{p}^2 + \zeta(p - \bar{p})\tilde{p} - \nu_1\tilde{w}_1^T\tilde{w}_1 + \nu_1\tilde{w}_1^T w_1 - \nu_2\tilde{w}_2^T\tilde{w}_2 + \nu_2\tilde{w}_2^T w_2. \quad (4.29)
At this point, only consider the case when \tilde{p} \neq 0, because \hat{p} is guaranteed bounded otherwise by construction of the projection rule. Thus, it is more interesting to consider the region in which \tilde{p} \neq 0 and define bounds for performance considerations rather than stability considerations. Defining the vectors \xi = [s\;\; z]^T and w = [w_1^T\;\; w_2^T\;\; p]^T (and the associated \tilde{\cdot} and \hat{\cdot} conventions) transforms \dot{V}_2 into
\dot{V}_2 = -\xi^T\begin{bmatrix} G_1 & 0 \\ 0 & G_2 \end{bmatrix}\xi - \tilde{w}^T\begin{bmatrix} \nu_1 I & 0 & 0 \\ 0 & \nu_2 I & 0 \\ 0 & 0 & \zeta \end{bmatrix}\tilde{w} + \tilde{w}^T\begin{bmatrix} \nu_1 I & 0 & 0 \\ 0 & \nu_2 I & 0 \\ 0 & 0 & \zeta \end{bmatrix}w - \zeta\bar{p}\tilde{p}. \quad (4.30)
Thus an ultimate (conservative) bound can be defined:
\dot{V}_2 \leq -G\|\xi\|^2 - \nu\|\tilde{w}\|^2 + \nu\|\tilde{w}\|(\|w\| + \zeta|\bar{p}|), \quad (4.31)
which represents an ellipse on the (\|\xi\|, \|\tilde{w}\|) plane. In the above bound the definitions G = \min(G_1, G_2) and \nu = \min(\nu_1, \nu_2, \zeta) are used. The assertion of stability is an extension of standard Lyapunov theory, and the system states are said to be uniformly ultimately bounded. \dot{V}_2 < 0 when
\|\xi\| > \sqrt{\frac{\nu}{4G}}\,\|w\| = \delta_\xi, \quad (4.32)
or
\|\tilde{w}\| > \|w\| + \zeta\bar{p} = \delta_w, \quad (4.33)
and the ultimate bound, \xi_b, on the system error \xi is
\xi_b = \sqrt{\delta_\xi^2 + \delta_w^2}. \quad (4.34)
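As a rough numerical illustration of Eqs. (4.32)-(4.34): the gains below are taken from Table 6.1, but the ideal-weight norm and \bar{p} are unknown in practice, so the values used here are assumptions.

```python
import math

def ultimate_bound(G1, G2, nu1, nu2, zeta, w_norm, p_bar):
    """Conservative ultimate bound sketch per Eqs. (4.32)-(4.34).

    w_norm stands in for the (unknown) ideal weight norm ||w||; p_bar is the
    supervisory target. Larger tracking gains shrink the delta_xi term."""
    G = min(G1, G2)
    nu = min(nu1, nu2, zeta)
    delta_xi = math.sqrt(nu / (4.0 * G)) * w_norm   # Eq. (4.32)
    delta_w = w_norm + zeta * p_bar                  # Eq. (4.33)
    return math.hypot(delta_xi, delta_w)             # Eq. (4.34)

xi_b = ultimate_bound(10.0, 15.0, 0.001, 0.001, 0.001, w_norm=5.0, p_bar=10.0)
xi_b_highG = ultimate_bound(100.0, 150.0, 0.001, 0.001, 0.001,
                            w_norm=5.0, p_bar=10.0)
```

Note that with small \nu and \zeta the bound is dominated by \delta_w, which is why the leakage and supervisory coefficients are kept small in the experiments.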
Chapter 5
Alternative Control Designs for Comparison
A comparison will be made between the proposed controller of Chapter 3 and three other well-known and established approaches: a robust H_2 controller, an output feedback controller, and a direct force controller. As will become apparent, a robust design is challenging in each approach, and significant assumptions must be made which compromise stability.
5.1 H2 Control
Stability properties of H_2 control have been well understood for some time. Controllers designed using H_2 techniques are inherently robust even in the presence of modeling uncertainties, external disturbances, and measurement noise. This is because the formulation of the control law is based on disturbance rejection, aiming to "keep the size of the performance variable small in the presence of the exogenous signals" [55]. Additionally, the controller design is straightforward and systematic if the plant to be controlled is well defined in terms of its state space representation. As such, these robust and optimal controllers are frequently implemented in many industrial applications and are used confidently by engineers. For these reasons, it seems appropriate to compare the proposed design to an H_2 design. Begin by adopting the system dynamics
M\ddot{x} = -D_r\dot{x} - F_m + F_c, \quad (5.1)
with a requirement that the damping term in the end effector force measurement is excluded and, for now, that the environment stiffness is constant. That is, F_m = K_e x, K_e \in \mathbb{R}^+. As far as a controller is concerned, any damping present in the environment can be included in the term D_r\dot{x}, so excluding D_m in Eq. (5.1) implicitly includes it in D_r. To make a fair comparison to the proposed controller, force errors are treated as the system states. Defining
e = \begin{bmatrix} e_1 \\ e_2 \end{bmatrix} = \begin{bmatrix} F_m - F_d \\ \dot{F}_m - \dot{F}_d \end{bmatrix}, \quad (5.2)
results in
\dot{e} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}e + \begin{bmatrix} B_{11} \\ B_{12} \end{bmatrix}w(t) + \begin{bmatrix} 0 \\ K_e M^{-1}(-D_r\dot{x} - F_m + F_c) - \ddot{F}_d \end{bmatrix}, \quad (5.3)
where B_1 := [B_{11}\;\; B_{12}]^T and w(t) represents sensor noise and may also model some system disturbances. First, a feedback cancelling control is chosen to be
u_{FBC} = D_r\dot{x} + F_m + K_e^{-1}M\ddot{F}_d, \quad (5.4)
and the overall control law that will be implemented is
F_c = u_{FBC} + u_{H_2}. \quad (5.5)
The final state-space representation for H_2 synthesis becomes
\dot{e} := Ae + B_1 w(t) + B_2 u_{H_2} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}e + B_1 w(t) + \begin{bmatrix} 0 \\ K_e M^{-1} \end{bmatrix}u_{H_2}, \quad (5.6)
with the performance variable, z, having dynamics
z = \begin{bmatrix} C_{11} & 0 \\ 0 & C_{12} \end{bmatrix}e + \begin{bmatrix} 0 \\ D_{12} \end{bmatrix}u_{H_2}, \quad (5.7)
and the system has the measurable output
y(t) := C_2 e + D_{21}w(t) = [1\;\; 0]\,e + D_{21}w(t). \quad (5.8)
The control u_{H_2} is the output of the system K_2 (defined below) with input y(t). K_2 minimizes the H_2 norm of the w(t) \to z mapping (that is, it minimizes the influence of the system uncertainties on the performance variable):
K_2 := \begin{bmatrix} A + B_2 F_2 + L_2 C_2 & -L_2 \\ F_2 & 0 \end{bmatrix}, \quad (5.9)
where
F_2 = -D_{12}^{-2}\left([0\;\; C_{12}D_{12}] + B_2^T X_2\right), \quad (5.10)
and
L_2 = -D_{21}^{-2}\left(Y_2 C_2^T + [B_{11}D_{21}\;\; B_{12}D_{21}]^T\right). \quad (5.11)
X_2 and Y_2 are the positive semi-definite solutions to the algebraic Riccati equations
0 = X_2 A_r + A_r^T X_2 + \begin{bmatrix} C_{11}^2 & 0 \\ 0 & C_{12}^2 \end{bmatrix} - D_{12}^{-2}\begin{bmatrix} 0 & 0 \\ 0 & C_{12}^2 D_{12}^2 \end{bmatrix} - D_{12}^{-2} X_2 \begin{bmatrix} 0 & 0 \\ 0 & K_e^2 M^{-2} \end{bmatrix} X_2, \quad (5.12)
and
0 = A_e Y_2 + Y_2 A_e^T - D_{21}^{-2} Y_2 \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} Y_2, \quad (5.13)
with
A_r = A - D_{12}^{-2}\begin{bmatrix} 0 & 0 \\ 0 & K_e M^{-1} \end{bmatrix} C_{12}D_{12}, \quad (5.14)
and
A_e = A - D_{21}^{-1}\begin{bmatrix} B_{11} & 0 \\ B_{12} & 0 \end{bmatrix}. \quad (5.15)
5.1.1 Discussion on H2 controller
A few noteworthy results that arise from the above controller design require some discussion. First of all, this controller will only work when there is a non-zero force sensor measurement. The controller design verifies this fact, because \|F_2\|_2 \to \infty as K_e \to 0, meaning that when the slave is in unconstrained motion a switch must be made from the above H_2 control to another control. Realistically, such a switching-type control is hard to implement, and this fact highlights the superiority of the proposed control design over conventional control design.
Also remember that the H2 control law is designed based on a constant environ-
ment stiffness Ke. The control is implemented by calling a gain scheduler routine
which varies the control gains according to the current environment stiffness. The
environment stiffness is unknown, however an estimate of it can be made based on the
measured force. It turns out that it is reasonable for estimation purposes to assume
that Ke ∝ Fm for surgical applications. For instance, if a large force is encountered
it can be assumed that the environment stiffness is also large. If the environment
stiffness were small in this particular case it would be implied that there was a large
tool displacement, an unlikely scenario in surgical applications. Note also that the
estimator gains L2 do not depend on Ke. In fact, the only gain that depends on Ke
is the derivative control gain (the second element of F2). Thus, uncertainties in the
estimated Ke do not affect the controller performance or robustness significantly.
Another point of discussion is uFBC which aims at cancelling out dynamics asso-
ciated with robot damping, measured force, and commanded force derivatives. Of
particular difficulty is knowledge of the coefficients Dr and Ke in Eq. (5.4). Ke
must be estimated in the same manner as above. The damping of the robot can be
Figure 5.1: Output feedback control architecture (controller, passivity observer, and passivity controller acting on the slave in the remote environment)
quantified by experiment, but recall that Dr also contains damping effects present in
the environment. Nonetheless, as long as Dr is estimated to be less than the actual
Dr only controller performance is affected, not stability. Indeed, robustness actually
improves if the estimate of Dr is below the actual value by adding a damping effect
to the system.
In addition to the above control law, the actual control signal sent to the slave ac-
tuators is filtered such that the maximum actuator velocity does not exceed 1500mm/s.
Filtering the control signal is a common practice and is absolutely necessary to pro-
tect the actuators, other slave hardware, and ensure that high frequency natural
modes are not excited.
5.2 Output Feedback Control
The basis for this control comes from passivity theory and follows the design employed by [30] and [31]. For a system as in Fig. 5.1 the control design is fairly straightforward. Consider that the controller (which can be designed arbitrarily, without regard for stability) sends force commands to the slave robot actuators. The passivity observer observes the passivity of the slave, and the passivity controller adds any shortage of passivity to the control signal from the controller. The passivity observer
is designed as a direct extension of passivity theory. In continuous time a passive system obeys the inequality
\int_0^t f(\tau)v(\tau)\,d\tau \geq 0, \quad (5.16)
where f(t) is the controller output and v(t) is the slave velocity. Physically, this system can be interpreted as passive if the slave is absorbing energy. The passivity observer calculates and outputs the passivity, E, of the slave in discrete time by
E(n) = \Delta T \sum_{k=0}^{n} f(k)v(k), \quad (5.17)
where \Delta T is the sampling period and n is the current sample.
The passivity controller is designed to output
\alpha(n) = \begin{cases} -\dfrac{E(n)}{\Delta T\, v(n)^2} & \text{if } E(n) < 0, \\ 0 & \text{otherwise}, \end{cases} \quad (5.18)
thus injecting any shortage of passivity back into the system. A proof of stability is trivial and is given in [30] for the reader's reference.
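Eqs. (5.17)-(5.18) translate almost directly into code. The sketch below is illustrative; the guard against zero velocity is an added safety not stated in Eq. (5.18).

```python
def passivity_observer_controller(f, v, dT):
    """Discrete passivity observer (Eq. 5.17) and controller (Eq. 5.18).

    f: controller output samples, v: slave velocity samples, dT: sample
    period. Returns the corrective forces alpha(n) that inject any
    shortage of passivity back into the system."""
    E = 0.0
    alphas = []
    for fk, vk in zip(f, v):
        E += dT * fk * vk                       # running energy (passivity) sum
        if E < 0.0 and vk != 0.0:               # active: dissipate the shortfall
            alphas.append(-E / (dT * vk ** 2))
        else:
            alphas.append(0.0)
    return alphas

# A passive sample accumulates energy; the next active sample that drives
# E negative triggers a corrective force.
alphas = passivity_observer_controller([1.0, -2.0], [1.0, 1.0], dT=0.1)
```

Note how the accumulated positive energy in the first sample delays intervention, which is exactly the memory effect discussed in the next section.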
5.2.1 Discussion on output feedback control
Again, a short discussion on the output feedback control is constructive. First,
note that the observer contains a memory element. Thus, a system can accumulate
“passivity” for some time. If the slave becomes active at some point (adding energy
to the system), it will continue to behave actively until the accumulated “passivity”
is dissipated. Only at this point will the passivity controller intercede and stabilize
the system. Realistically, the passivity observer must be reset periodically, at the
cost of performance.
Additionally, if there is any time delay in the system it is difficult to accurately
observe the passivity in the system. Indeed, in the presence of a time delay the
passivity controller may even cause the system to destabilize faster (particularly
when the slave exhibits oscillations).
Nonetheless, in ideal conditions the output feedback controller is attractive in the sense that the controller can be designed with a small robustness margin, or to be only locally stable, and the passivity controller will compensate for any instances of instability and ensure global stability.
Chapter 6
Results
This chapter
• provides an overview of the experimental setup used to verify the controller
design
• compares the proposed control design to the controllers designed in Chapter 5
• gives experimental results of various tests performed on the proposed controller
6.1 Experimental Setup
The experimental setup consists of a combination of simulation and real hardware.
A real master device is used and the force trajectory is supplied by a real human
operator. The slave device and environment are completely simulated.
6.1.1 Master device
Project neuroArm from the Foothills Medical Research Center in Calgary has pro-
vided a PHANTOM Omni haptic device manufactured by SensAble (Fig. 6.1). The
Omni is a 3 degree-of-freedom (in force) haptic device providing force feedback along the x, y, z axes. Positional sensing is provided in 6 degrees (x, y, z and roll, pitch,
yaw) from digital encoders with a positional resolution of 0.055mm. A workspace
with dimensions of 160W × 120H × 70Dmm is available. An IEEE-1394 FireWire
port connects the device to a PC, allowing for fast communications between the PC
and the device. A stylus located at the end effector has an apparent mass of 45 g.
Figure 6.1: Master haptic device used for experiments.
The maximum force that can be continuously exerted by the Omni is 3.3N , and the
motors have a backdrive friction of 0.26N . The device exhibits a maximum stiffness
of 2310N/m.
Communication to the device is provided through Quanser’s QuaRC control soft-
ware solution. A PHANTOM Omni blockset is provided for use in MATLAB’s
Simulink environment. QuaRC fully supports Simulink’s external mode, includ-
ing scopes, online parameter tuning, and data logging directly to the MATLAB
workspace. The Omni Simulink block sends force commands to the Omni’s motors
and receives x, y, z encoder positions as well as roll, pitch, yaw angles. QuaRC allows
communication with the Omni at sample frequencies up to 1000Hz, as used for this
work.
The proposed controller is designed in such a way to ensure a straightforward
extension to a multiple degree-of-freedom haptic setup, however this thesis only tests
the one degree of freedom case. As such, throughout the experiment two proportional
controllers are used to lock the haptic device onto the (x, 0, 0) line in the usable
workspace.
Because there is no force sensor mounted on the Omni end effector, a virtual
force sensor is designed which models a spring. The output of the force sensor is the
human desired force defined by
Fd = Fm + Khaptic(xhaptic − x0). (6.1)
With this force sensor construction it appears that the force controller has been
recast as a scaled position control. However, there are two key differentiating points.
One, Fd is the force fed back to the Onmi such that the human can feel the force that
they are applying. Compare this to a slave under position control in which the force
fed into the haptic device would simply be Fm. Secondly, the addition of Fm in Eq.
(6.1) has profound influence on the system with time delay in terms of force error.
Consider a system with a communication delay of T seconds between the remote
environment and the human operator. The measured force takes T seconds to reach
the master device,
Fd(t) = Fm(t− T ) + Khaptic(xhaptic(t)− x0), (6.2)
and the human commanded force Fd takes an additional T seconds to reach the
controller on the slave side. Thus the force error at time t is
�(t) = Fd(t− T )− Fm(t), (6.3)
= Fm(t− 2T )− Fm(t) + Khaptic(xhaptic(t− T )− x0). (6.4)
Indeed, when there is no communications delay (T = 0) the force sensor construction
does transform the controller into a scaled position controller.
It is important to note that, although the maximum force exerted by the Omni is 3.3 N, a larger force can still be commanded because of the virtual force sensor. However, the force feedback felt by the human saturates at 3.3 N. This is a limitation of haptic systems in general and can pose a significant danger to the slave hardware and patient, because the human operator becomes unaware of the force they are exerting. This limitation can be remedied by scaling the human commanded force fed back to the haptic device by a constant 0 < K_s < 1 while sending the unscaled version to the controller. Teleoperation systems often do this to provide the human with increased fidelity in some situations.
Additionally, [56] has shown experimentally and quantitatively that humans have poor judgment when differentiating between impedances. This particularly well-cited article (and others [57]) argues that nonlinearities (high-frequency changes in force, or "edges") are much better indicators of perceptual "hardness" than a ratio of static position to force (i.e., the technical definition of stiffness). Thus, it is more important that the haptic device preserve the high-frequency change in force, and there is less concern with the haptic device's ability to reflect the appropriate force. In fact, [56] makes these claims specifically for the stiffness range of 1700 to 3200 N/m, which is precisely the stiffness range that the controllers are tested for. The experience of the human test subject used for the following experiments confirms this.
Figure 6.2: Screen capture of the Simulink model used for experiments.
The parameter Khaptic is a user-determined parameter based on their preference.
A large Khaptic means the master device will move little whereas a smaller value will
require larger displacements to produce the same force.
6.1.2 Virtual components
The remaining components of the experimental setup are written in software and
implemented in Simulink (see Fig. 6.2). A majority of the experiment is written in
the C language, integrated into Simulink using mex-file s-functions, and compiled using MATLAB's Real-Time Workshop toolbox (requiring additional Target Language Compiler files to be written). Any integration performed in the experiment is done using MATLAB's ode4 fixed-step integrator, which uses the 4th order Runge-Kutta method. One exception is the neural network weight updates, which employ a 5th order Boole's rule integrator because this integration is performed in C code written externally to the MATLAB environment.
Slave dynamics consist of a mass-damper system in one-dimension with driving
force Fc and opposing force Fm. Two nonlinear and challenging environments are
designed for the controller to be tested on. The general force profile for the virtual environment is shown in Fig. 6.3. In the case of stiff contact, the force profile is exactly as shown in Fig. 6.3. In the case of the loss of contact test, there is a jump discontinuity at x_t with F_m(x = x_t^+) = 0.
Figure 6.3: Force profile of the simulated remote environment
Communication delays are added to the system so that any information trans-
ported from the master to the slave (or vice-versa) undergoes a time delay of T
seconds.
Table 6.1 shows the nominal parameters used in the experiments. Unless other-
wise mentioned, these parameters were used to obtain all results.
Table 6.1: Experiment Parameters
β1 = 30, β2 = 10, βp = 10
G1 = 10, G2 = 15
ν = 0.001, γ = 0.1
T = 0.05 s, h = 0.01 s
M = 0.1 kg, Mhaptic = 45 g
Dr = 4 N·s/m, Dm = 1 N·s/m
Λ = 1, Khaptic = 100 N/m
µ = 0.1, m = 10, ||p||min = 1
xt = 10 cm
Ke,1 = 30 N/m (loss of contact), 20 N/m (stiff contact)
Ke,2 = 0 N/m (loss of contact), 3000 N/m (stiff contact)
6.1.3 Implementation Considerations
\dot{F}_d and \ddot{F}_d are used in the control law, and it is not obvious how these values are obtained. Looking analytically at these terms based on Eq. (6.2), it is seen that
\dot{F}_d(t) = \dot{F}_m(t - T) + K_{haptic}\dot{x}_{haptic}(t), \quad (6.5)
\ddot{F}_d(t) = \ddot{F}_m(t - T) + K_{haptic}\ddot{x}_{haptic}(t). \quad (6.6)
In this case, the time delay allows a calculation of \dot{F}_m(t - T). Usually it is difficult to obtain derivative values in real time, but past derivative values can be estimated fairly accurately from a delta rule:
\dot{F}_m(t - T) = \frac{F_m(t - T + h) - F_m(t - T - h)}{2h}, \quad (6.7)
\ddot{F}_m(t - T) = \frac{\dot{F}_m(t - T + h) - \dot{F}_m(t - T - h)}{2h}, \quad (6.8)
where 0 < h < T.
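In discrete time this delta rule amounts to a centered difference over logged samples: because of the T-second delay, samples on both sides of t − T are already available. The buffer layout and sampling period in the sketch below are assumptions.

```python
DT = 0.001   # sampling period (1 kHz, as used in the experiments)

def past_derivative(F, k, k_h):
    """Delta-rule estimate of a past derivative, per Eqs. (6.7)-(6.8):
    dF(t-T) ~= [F(t-T+h) - F(t-T-h)] / (2h).

    F is the buffer of logged samples, k the sample index of t-T, and
    k_h the index offset corresponding to h (with 0 < h < T)."""
    return (F[k + k_h] - F[k - k_h]) / (2.0 * k_h * DT)

# Logged force ramping at 0.5 N/s: the estimated past slope is 0.5
F_log = [0.5 * i * DT for i in range(1000)]
slope = past_derivative(F_log, k=900, k_h=10)
```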
Additionally, \dot{x}_{haptic} and \ddot{x}_{haptic} are obtained by implementing a Kalman filter at the master side. The observer gains are calculated using the state space model
\begin{bmatrix} \dot{x}_{haptic} \\ \ddot{x}_{haptic} \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ M_{haptic}^{-1}K_{haptic} & 0 \end{bmatrix}\begin{bmatrix} x_{haptic} - x_0 \\ \dot{x}_{haptic} \end{bmatrix} + \begin{bmatrix} 0 \\ M_{haptic}^{-1} \end{bmatrix}(F_d(t) - F_m(t - T)), \quad (6.9)
and classic optimal observer design theory with noise covariances of 1.
6.2 Results
This section is divided into two parts:
• First, the proposed controller is tested for the cases of stiff contact and loss
of contact. The results are also compared to an H2 controller and an output
feedback controller
• Secondly, the proposed controller is specifically tested to show
– the filtering properties of the backstepping technique
– the effect of time delay
– the evolution of errors s and z
– neural network outputs
– the boundedness of neural network weights
6.2.1 Stiff Contact Test and Loss of Contact Test (comparison with H2 and output
feedback controller)
A human operator (the author) is asked to push the slave robot through a medium
as shown in Fig. 6.3 and after 10cm the slave will encounter a stiff environment,
simulating contact with a wall. Because the force trajectory is determined by a
human user, each trial is somewhat different. Testing the controllers using a pre-
defined force trajectory would not provide a fair test because the human is expected
to change their response based on the controller’s performance. This is an artifact
of the human operator being part of the system dynamics and makes it difficult to
test.
Nonetheless, the human is instructed to respond as consistently as possible amongst
trials and is asked to contact the wall (or puncture point) between 3 and 4sec for
consistency. The test for each controller is repeated 11 times to ensure consistent
results and standard deviations are provided with data which confirm the validity of
the results. The time delay is set to 0.05sec for the H2 and proposed controller, and
0sec for the output feedback controller. The output feedback controller tends to go
unstable for even small time delays because small time delays significantly affect the
accuracy of the passivity observer, in particular if high frequency effects are present
(which they are when the slave bounces off the hard surface).
The collision is not completely elastic because the environment still has a finite
stiffness (that is, at xt in Fig. 6.3 the slave velocity is not perfectly reflected in the op-
posite direction). In reality there will always be some compliance in the event of stiff
contact whether it be from environment deformation or surgical tool deformation.
The human operator is provided with information of the slave position as well as
force feedback from the PHANTOM Omni haptic device.
To convince the reader that the human test subject is not biased, the same ex-
periment is also performed with a pre-defined ‘human’ response. The pre-defined
‘human’ behaves like a filtered proportional-integral controller for velocity. That is,
the pre-defined response attempts to maintain constant slave velocity. The commanded force is thus determined by
F_d = \frac{2 + 8s}{s}\cdot\frac{20}{s + 20}\,\epsilon_v + F_m, \quad (6.10)
where �v is the error between the slave velocity and the desired slave velocity. In
terms of Fig. 6.3 the desired slave velocity in the first region is set to be 0.03m/s.
For the stiff contact test, the desired velocity in the second region is set to 0m/s
once a force of 5N is reached. For the loss of contact test, the desired velocity in the
second region is set to 0m/s as soon as the puncture occurs.
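The pre-defined ‘human’ of Eq. (6.10) can be sketched as a discrete-time simulation. The gains (Kp = 8, Ki = 2) and the 20 rad/s break frequency follow the equation above; the 1 ms sample time and the forward-Euler discretization are assumptions for illustration only.

```python
# Sketch of the pre-defined 'human' response of Eq. (6.10): a PI controller on
# the slave velocity error whose output passes through a first-order low-pass
# filter (break frequency 20 rad/s) and is added to the measured force Fm.
# Sample time and discretization scheme are assumptions, not from the thesis.

def make_pi_human(kp=8.0, ki=2.0, wb=20.0, dt=1e-3):
    """Return a function mapping (velocity error, measured force) -> Fd."""
    state = {"integral": 0.0, "filtered": 0.0}

    def step(e_v, f_m):
        state["integral"] += ki * e_v * dt       # integral term of the PI law
        raw = kp * e_v + state["integral"]       # PI output before filtering
        # first-order low-pass: d(filtered)/dt = wb * (raw - filtered)
        state["filtered"] += wb * (raw - state["filtered"]) * dt
        return state["filtered"] + f_m

    return step

human = make_pi_human()
# constant 0.01 m/s velocity error, zero measured force, simulated for 5 s
fd = [human(0.01, 0.0) for _ in range(5000)]
```

With a persistent velocity error the integral term grows, so the commanded force ramps up until the operator (model) is satisfied, mimicking a human pushing harder when the slave lags its desired velocity.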
Results of the comparisons are shown in Fig. 6.4-6.7, and following those figures
the results for the same tests using the PI human model are shown in Fig. 6.8-6.11.
Comparing the results when the real human provides the desired force and when
the PI human model provides it reveals that the controller responses at stiff contact
(and loss of contact) are remarkably similar.
Fig. 6.4 shows the results of the 1st trial and confirms that the proposed controller
significantly outperforms both the H2 and output feedback controllers. Both the H2
and output feedback controllers apply significant force to the wall when contact is
made. Table 6.2 records the average maximum force exerted by the slave on the
wall over the 11 trials and the standard deviation of this maximum. Only the 0.5sec
range after contact with the wall is made is considered when calculating the values
in Table 6.2, after which time the human is able to react to the increased force.
The standard deviations show that there is no significant variation between trials,
justifying single tests from here on.
As can be seen, such forces would either damage force sensors (force sensors on
neuroArm have a sensing range of 32N in the x, y directions and 56N in the z
direction) or at least not reflect the human commanded behaviour, and these results
Table 6.2: Average maximum measured force at the slave end effector due to the proposed controller, H2 controller, and the output feedback controller

Controller        Average Maximum Measured Force   Standard Deviation
Proposed          4.15N                            0.24
H2                122.46N                          2.40
Output Feedback   25.96N                           2.83
confirm the danger when contact with a wall is encountered in teleoperation. Maxi-
mum contact force decreases for both the proposed controller and the H2 controller
as the time delay increases (not shown). This is because the controller is allowed
more time to pull back before the high contact force reaches the human operator.
Oscillations in the measured force when in contact with the material seem undesirable.
However, looking at the slave robot position when in contact with the wall, it
can be seen that these oscillations in force correspond to small oscillations in slave
position (for the proposed controller at least, these have an average magnitude of
around 0.2mm).
Fig. 6.7 shows the results of the loss of contact test. The H2 controller is not
appropriate for the loss of contact test, as discussed in Section 5. Note that the
primary goal of this results section is to show stability and improved performance
in the case of stiff contact while ensuring that performance is not compromised in
the loss of contact case. Fig. 6.7 confirms that we do not compromise performance
in the loss of contact test. In fact, there is less positional overshoot when using our
controller (though at the expense of increased slave speed).
Figure 6.4: Comparison of the three controllers for the stiff contact test. The proposed controller hits the wall with 118N less force than the H2 controller and 21N less force than the output feedback controller.
Figure 6.5: Zoomed in version of Fig. 6.4 to emphasize the performance benefit of the proposed controller.
Figure 6.6: A version of Fig. 6.5 with only the proposed controller performance. Axes are the same as in Fig. 6.5.
Figure 6.7: Comparison between the proposed controller and an output feedback controller for the loss of contact test. The proposed controller has less positional overshoot than the output feedback controller, but a greater negative velocity.
Figure 6.8: Comparison of the three controllers for the stiff contact test using the PI human model. The controller response is quite similar to those shown in Fig. 6.4.
Figure 6.9: Zoomed in version of Fig. 6.8.
Figure 6.10: A version of Fig. 6.9 with only the proposed controller performance. Axes are the same as in Fig. 6.9.
Figure 6.11: Comparison between the proposed controller and an output feedback controller for the loss of contact test using the PI human model. Again, the controller response is quite similar to the results shown in Fig. 6.7.
6.2.2 Proposed controller performance
The remainder of this section rigorously tests the performance of the proposed con-
troller, showing
• the filtering properties of the backstepping technique
• the effect of time delay
• the evolution of errors s and z
• neural network outputs
• the boundedness of neural network weights
6.2.2.1 Filtering properties of Backstepping
Fig. 6.12 shows the controller performance when the backstepping technique is not
used. In this case, the actual control sent to the slave is Eq. (3.22) and the neural
network weight update laws do not include the derivative terms resulting from the
tuning function approach (γ = 0). This controller is tested using the applicable
nominal gains from Table 6.1. In addition to the control law Eq. (3.22), the control
signal is filtered before it is sent to the slave. The controller
is then tested for various filter cutoff frequencies. A first order filter of the form

output/input = ωb/(s + ωb),   (6.11)

is used, where ωb represents the break frequency in rad/s.
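A minimal discrete realization of the filter in Eq. (6.11) is sketched below, assuming a zero-order-hold equivalent and a 1 ms sample time (the thesis does not state its simulation step size).

```python
import math

# Discrete implementation of the first-order filter of Eq. (6.11),
# output/input = wb/(s + wb), via its zero-order-hold equivalent
# y[k] = a*y[k-1] + (1 - a)*u[k] with a = exp(-wb*dt).
# The sample time dt is an assumption for illustration.

def first_order_filter(u, wb, dt):
    a = math.exp(-wb * dt)        # pole of the discretized filter
    y, out = 0.0, []
    for uk in u:
        y = a * y + (1.0 - a) * uk
        out.append(y)
    return out

# step response: output approaches 1 with time constant 1/wb
step = first_order_filter([1.0] * 1000, wb=15.0, dt=1e-3)
```

After one time constant (1/15 s, about 66 samples here) the step response reaches roughly 63% of its final value, the usual first-order behaviour.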
Results comparable to those of the proposed controller are achieved when ωb = 15.
For large ωb the controller tends to go unstable. After much testing, it became apparent
that this instability actually arises from the controller exciting natural modes of
vibration in stiff contact. In the stiff environment the natural frequency of the undamped
vibration is

ωn = √(Ke/M) = 173.2 rad/s,   (6.12)

which is quite high (the damped natural frequency will be slightly lower). The
controller tends to excite these natural modes and the system becomes unstable. In
addition to the force responses of Fig. 6.12, the frequency spectrum of the control
force is included in Fig. 6.13. As expected, high frequency components of the control signal
appear near the calculated natural frequency of undamped vibration. Because controlling
these high frequency modes is not of particular concern, there is justification from
a performance perspective for adding a low-pass filter to the controller output. A
first order filter with ωb = 15 sufficiently damps these natural vibrations and prevents the
controller from causing instability. Thus, it can be said that under the nominal
control gains in Table 6.1 the backstepping technique behaves most like a first order
filter with cutoff frequency ωc = 15. Not surprisingly, this is in exact agreement with
the analysis in Section 3.3.2 in which it was shown that the cutoff frequency of the
backstepping filter effect is exactly determined by the gain G2.
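The attenuation this provides at the natural frequency can be checked directly. Assuming the backstepping filter effect is approximated by the first order characteristic with ωb = 15 rad/s identified above:

```python
import math

# Magnitude of the first-order filter wb/(s + wb) evaluated at the undamped
# natural frequency wn = sqrt(Ke/M) = 173.2 rad/s of Eq. (6.12).
# |H(jw)| = wb / sqrt(w^2 + wb^2)

def first_order_gain(w, wb):
    return wb / math.sqrt(w**2 + wb**2)

wn = 173.2                              # rad/s, from Eq. (6.12)
gain = first_order_gain(wn, wb=15.0)    # attenuation at the natural mode
gain_db = 20 * math.log10(gain)
```

The natural mode is attenuated to less than a tenth of its amplitude (more than 20 dB), which is consistent with the observed suppression of the resonance when ωb = 15 is used.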
Fig. 6.14 displays the effect the gain G2 has on the filtering properties of the
control signal. Additional damping, likely due to the neural network φ2w2, shifts the
response of the controller so that its dominant frequency in stiff contact is further from
the natural mode. This decreases the risk of exciting the natural mode and causing
resonant behavior in the system.
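Spectra like those in Figs. 6.13-6.14 can be obtained by taking the Fourier transform magnitude of the sampled control force. The sketch below uses a synthetic control signal (a 173.2 rad/s oscillation on a constant force) purely to illustrate the method; in the thesis the signal is the recorded Fc.

```python
import cmath, math

# Evaluate the discrete Fourier transform magnitude of a sampled signal on a
# grid of angular frequencies, as in the |Fc(jw)| plots. The signal here is
# synthetic; the 173.2 rad/s component stands in for the excited natural mode.

def dft_magnitude(signal, omega, dt):
    """Normalized |sum of x[k] e^{-j*omega*k*dt}|."""
    acc = sum(x * cmath.exp(-1j * omega * k * dt) for k, x in enumerate(signal))
    return abs(acc) / len(signal)

dt = 1e-3
fc = [2.0 + 0.3 * math.sin(173.2 * k * dt) for k in range(5000)]  # 5 s of data

omegas = list(range(10, 500, 5))                   # rad/s grid, skipping DC
mags = [dft_magnitude(fc, w, dt) for w in omegas]
peak = omegas[mags.index(max(mags))]               # dominant nonzero frequency
```

The dominant nonzero-frequency peak lands at the grid point nearest 173.2 rad/s, which is how the high frequency content near the natural mode shows up in the measured spectra.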
Figure 6.12: Filtered control signal, Fc, without backstepping for filter break frequencies ωb = 100, 50, 25, and 15 rad/s. Contact with the wall is made when Fc is approximately 2N. Higher break frequencies allow excitation of the system’s normal modes.
Figure 6.13: Frequency spectrum of the filtered control signal, Fc, without backstepping for various filter break frequencies. Higher frequency components tend to excite the natural modes of the system.
Figure 6.14: Filtering properties of the backstepping method (control signal Fc and its frequency spectrum for G2 = 80 and G2 = 20). The backstepping technique attenuates high frequency control signals and thus allows stable operation.
6.2.2.2 Neural Network Outputs, Boundedness of Neural Network Weights, Evolution of System States
The neural network outputs for the stiff contact test are given in Fig. 6.15 and for
the loss of contact test in Fig. 6.16. As expected, the outputs change drastically
at transition points in order for the neural network to model the large changes in
environment stiffness.
The results in this section also examine the boundedness of the neural network
weights. A commanded force trajectory was designed and the controller was run for
200 trials using this same trajectory. It was not necessary to test this portion of the
controller using human-produced trajectories; additionally, it is easier to verify
that the weights are bounded when a consistent trajectory is applied. The
controller was simulated with
Fd = Fm + 0.2N. (6.13)
Figs. 6.17-6.18 show the results. As can be seen, the robust weight update laws
ensure the boundedness of the neural network weights. Of particular interest is
Fig. 6.19, in which the supervisory learning term is removed from the update law
of the parameter p. Under the same conditions, the system goes unstable without
the supervisory learning term. It is interesting to note that the error s drives the
parameter p to zero, yet as p approaches zero the system goes unstable.
Finally, Fig. 6.20 shows the evolution of the system states s and z as the desired
force trajectory is repeatedly imposed on the controller. The neural networks succeed
in reducing s and z. However, it is important to remember that the proposed
controller shows its advantage in its speed of adaptation rather than its ability to
learn over time. Decreasing the learning gains would give a smoother learning curve,
but it was decided that the speed of adaptation is more important. This is why
the error s reaches its steady state value in approximately five trials. Nevertheless,
controller performance does improve with use.
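The per-trial convergence metric used in Figs. 6.17-6.20 can be sketched as follows. Each trial produces a sampled time history of a weight (or of the errors s and z), and the root mean square over that history is plotted against the trial number; the weight histories below are synthetic placeholders, not thesis data.

```python
import math

# RMS-per-trial metric as used in the weight/error convergence plots: one RMS
# value per trial, plotted against trial number. The histories here are
# synthetic: a component that settles toward a constant over 200 trials.

def rms(samples):
    return math.sqrt(sum(x * x for x in samples) / len(samples))

trials = [[0.1 + 0.5 / (k + 1) * math.sin(0.01 * i) for i in range(1000)]
          for k in range(200)]                     # synthetic weight histories

w_rms = [rms(trial) for trial in trials]
# convergence shows up as w_rms flattening out over the 200 trials
```

Flattening of such a curve is what "weight convergence is achieved" refers to in the figure captions.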
Figure 6.15: Neural network outputs in stiff contact. A wall is hit at around 3.5s and the neural network outputs react accordingly. (Panels: Fd and Fm; output of neural network 1, φw1; output of neural network 2, φw2; adaptive parameter p.)
Figure 6.16: Neural network outputs in loss of contact. The puncture occurs at around 2.5s and the neural network outputs react quickly. (Panels: Fd and Fm; output of neural network 1, φw1; output of neural network 2, φw2; adaptive parameter p.)
Figure 6.17: Root mean square neural network weights for 200 trials. Weight convergence is achieved.
Figure 6.18: Maximum neural network weights for 200 trials. Weight convergence is achieved.
Figure 6.19: Root mean square neural network weights when there is no supervised learning in ṗ. Instability occurs after 28 trials.
Figure 6.20: Convergence of states s and z over 200 trials.
6.2.2.3 Time delay
The proposed controller is tested for time delays up to 1sec. Fig. 6.21 displays
the results. When the time delay becomes large it is hard for the user to control
the slave, but an important result is that the system does not go unstable. The
difficulty in controlling the slave is an artifact of the time delay itself and represents
a weakness in the human’s ability to account for the delay. The most important
result is that regardless of the time delay, the impact force of the slave on the wall is
roughly 3.5N and this displays the ultimate advantage of using force control in a time
delayed system. Under position control, the controller would apply excessive force
to the wall because the human commanded position may be beyond the physical
constraints defined by the wall.
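The constant communication delay used in these tests means the slave receives Fd(t − T) rather than Fd(t). A FIFO buffer of length T/dt is one way to sketch this; the 0.05 sec delay follows the text, while the sample time is an assumption.

```python
from collections import deque

# Constant transmission delay: the commanded force received at time t is
# Fd(t - T). A fixed-length FIFO buffer, pre-filled with zeros (no signal
# received yet), implements the delay line.

def make_delay(T, dt):
    n = max(1, round(T / dt))
    buf = deque([0.0] * n, maxlen=n)

    def step(fd):
        delayed = buf[0]     # oldest sample, i.e. Fd(t - T)
        buf.append(fd)       # newest sample pushes the oldest out
        return delayed

    return step

delay = make_delay(T=0.05, dt=1e-3)        # 50 samples of delay
out = [delay(float(k)) for k in range(100)]  # feed a ramp through the channel
```

The output ramp is the input ramp shifted by 50 samples, which is exactly why a force-tracking slave keeps chasing a stale command after a sudden loss of contact.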
The case of sudden loss of contact with the environment is then tested in the
presence of time delay. It is important to test the loss of contact scenario because it
is expected that a force controlled system will behave undesirably when
there is a time delay (large positional overshoot can occur because the controller
aims at tracking the last commanded force, Fd(t − T) ≠ Fm(t)). Fig. 6.22 shows
the results. The slave robot loses contact at x = 0.1m. As anticipated, there
is significant positional overshoot when the slave loses contact and this overshoot
increases with increased time delay (from 0.45cm for T = 0.01s up to 10cm for
T = 0.2s). Nevertheless, because of the auxiliary error definition the overshoot is
reduced due to the damping characteristics of s.
One would assume that decreasing Λ would decrease the position overshoot when
there is a sudden loss of contact. Although this is true initially, a larger Λ also
improves the response of the system when moving in free space (x is proportional to
Fd with proportionality constant Λ). Therefore, a large Λ actually allows the human
operator to react to the positional overshoot faster. In most cases, it was found to
be more beneficial to have Λ = 1.
This test also displays the position tracking abilities of the proposed controller
when in free space.
Figure 6.21: Force response of the proposed controller in the presence of time delay for a stiff contact test (T = 0.1, 0.25, 0.5, and 1s). Impact force remains the same for arbitrary time delays.
Figure 6.22: Slave position in the presence of time delay for a loss of contact test (contact lost at x = 0.1m; T = 0.01, 0.05, 0.1, and 0.2s). Positional overshoot increases with increased time delay.
Chapter 7
Conclusions
A feedback controller has been proposed for the control of a remote surgical robot.
A force feedback haptic device supplies the human operator with force information
at the remote site. Force sensor information at the master-human interface provides
a desired force trajectory for the slave. It is argued that a force tracking slave
is sufficient for high fidelity haptic control when the human operator is supplied
with force and visual feedback from the remote site. Force tracking has obvious
advantages for teleoperation systems, particularly in the presence of time delays and
when contact with a hard (constrained) surface is made.
The feedback controller aims at minimizing force errors between the commanded
force and the force measured at the patient. An auxiliary error definition also ensures
that the robot’s velocity is kept low as well as providing positional control in free mo-
tion. Reducing the robot’s velocity has distinct advantages when the slave suddenly
loses contact with the environment. A force tracking slave produces large positional
overshoot, a problem compounded when the time delay is large. The backstepping
technique filters out high frequency components of the control signal. Thus, high
frequency natural modes inherent in stiff environments are not excited while low
frequency control authority is sufficiently maintained. It is argued that this is valid for the
realistic cases where the high frequency natural modes of the system are naturally
slightly damped.
Two neural networks as well as an adaptive parameter model unknown system
dynamics. Indeed, the controller has no information about the environment or slave.
These adaptive components also add damping to the system and ensure that high
frequency spikes in force measurements are not transmitted to the slave actuators.
Weight updates aim at reducing state errors and are inherently robust to unmodeled
dynamics. Designing the weight update laws at the second stage of backstepping
allows for a tuning function design and robustifies the weight updates. A novel method
for updating the weights when using a projection rule is proposed. Using the supervised
learning method proposed in [58] forces the weights to remain close to their
initial values and allows for fast adaptation as well as weight convergence over time.
The true benefit of the adaptive control design is its ability to react quickly to changes
in the remote environment. Performance nonetheless improves over time.
A smooth robust control law designed using Lyapunov redesign methods ensures
uniform ultimate boundedness of all signals in the presence of modeling errors from
the neural networks. Stability of the overall system is proven using a Lyapunov
control function and a bound on all signals exists.
Experimental results verify the proposed controller’s performance. First, it is
compared to an optimal H2 control and a passivity based control and shown to
be superior in many ways. Next, the effect of the backstepping technique is shown.
Stability of the system is displayed by repeatedly hitting a hard surface 200 times and
recording the neural network weight convergence as well as state error convergence.
The novel proposed weight update method is tested for the adaptive parameter.
Finally, the effect of a communications time delay on the system is shown. In
particular, it is shown that the maximum force exerted by the slave on the patient
does not increase with time delay. As well, the effect of suddenly losing contact with
a surface is shown.
In summary, three main contributions were made and tested in this work:
• A unique auxiliary error definition that reduces force errors while providing
position control when the slave moves in free space,
• Neural-adaptive backstepping that ensures stable control when contact with
a stiff environment is made,
• A novel neural network update law that ensures stable and robust control.
7.0.3 Future Work
There is a multitude of additional future work that could potentially arise from this
research. From a theoretical point of view, additional analysis on the filtering prop-
erties of the backstepping technique would prove insightful. For instance, quantifying
the optimal filtering for various environments to ensure the controller does not excite
the high frequency modes of the environment would prove useful. Also, examining
various orders of filters induced by the backstepping technique would be interesting.
For example, using an additional step of backstepping would induce a second order
filter on the virtual controls, allowing for better filter performance with steeper cutoffs.
Also, the effect of the time delay on system stability was not quantified;
rather, it was only shown that the system appears stable for delays up to 1sec. Examining the case
of varying time delays would also be of importance in order to allow teleoperation
over time varying and unreliable communications channels such as the internet or
wireless communication to space. Finally, an extension from the 1 DOF case to a
multi-DOF case is a logical next step. Testing the combined effects of disturbances
and errors at each joint to the overall performance would rigorously test the valid-
ity of the control design. Also, I would be interested to test the novel supervised
projection update method for the multi-DOF case in which the adaptive parameter
becomes a matrix of neural networks.
From a validation stand point, it would be beneficial to perform experiments
with a real slave robot. Hidden dynamics and subtleties inevitably surface when a
controller is taken from the simulation stage to experiment. These arise from sensor
noise, actuator limitations and nonlinearities, and unmodeled dynamics. Also, the
assumption was made that a perfect master device is used, in which the
force measured at the slave is perfectly reflected to the human operator. Using a
real force sensor rather than the virtual force sensor would allow the consideration
of limitations on the master side.
Bibliography
[1] R. Aracil, M. Buss, S. Cobos, M. Ferre, S. Hirche, M. Kuschel, and A. Peer,
The Human Role in Telerobotics, pp. 11–24. Springer, 2007.
[2] R. Cole and D. Parker, “Stereo TV improves manipulator performance,” in Pro-
ceedings of the SPIE, (Bellingham, WA.), pp. 18–27, 1990.
[3] A. Meier, C. Rawn, and T. Krummel, “Virtual reality: Surgical application -
challenge for the new millennium,” Journal of the American College of Surgeons,
vol. 192, pp. 372–384, March 2001.
[4] C. Basdogan, C. Ho, M. Srinivasan, and M. Slater, “An experimental study on
the role of touch in shared virtual environments,” in ACM Transactions on CHI,
pp. 443–460, 2000.
[5] S. Brave and A. Dahley, “inTouch: A medium for haptic interpersonal commu-
nication,” in Proceedings of CHI, pp. 363–364, 1997.
[6] B. Fogg, L. Cutler, P. Arnold, and C. Eisbach, “Handjive: A device for in-
terpersonal haptic entertainment,” in Proceedings of CHI, (Los Angeles, CA.),
pp. 57–64, 1998.
[7] E. Sallnas, K. Rassmus-Grohn, and C. Sjostrom, “Supporting presence in col-
laborative environments by haptic force feedback,” in ACM Transactions on
CHI, pp. 461–476, 2000.
[8] I. Oakley, S. Brewster, and P. Gray, “Can you feel the force? an investigation
of haptic collaboration in shared editors,” in Proceedings of EuroHaptics, 2001.
[9] neuroArm, “neuroarm,” March 2010. http://www.neuroarm.org/.
[10] G. Ballantyne and F. Moll, “The da Vinci telerobotic surgical system: the virtual
operative field and telepresence surgery,” Surgical Clinics of North America,
vol. 83, pp. 1293–1304, 2003.
[11] F. Isgro, A. Kiessling, M. Blome, A. Lehmann, B. Kumle, and W. Saggau,
“Robotic surgery using ZEUS MicroWrist technology: the next generation,” Jour-
nal of Cardiac Surgery, vol. 18, pp. 1–5, 2003.
[12] F. Tendick and S. Sastry, Minimally Invasive Robotic Telesurgery, pp. 89–94.
Kluwer Academic Publishers, 2001.
[13] P. Fager and P. von Wowern, “The use of haptics in medical applications,”
The International Journal of Medical Robotics and Computer Assisted Surgery,
vol. 1, pp. 36–42, 2005.
[14] F. Seto, Y. Hirata, and K. Kosuge, “Real-time cooperating motion generation
for man-machine systems and its application to medical technology,” Technology
and Health Care, vol. 15, pp. 121–130, 2007.
[15] D. Lawrence, “Stability and transparency in bilateral teleoperation,” IEEE
Transactions on Robotics and Automation, vol. 9, pp. 624–637, October 1993.
[16] A. Aziminejad, M. Tavakoli, R. Patel, and M. Moallem, “Transparent time-
delayed bilateral teleoperation using wave variables,” IEEE Transactions on
Control Systems Technology, vol. 16, pp. 548–555, May 2008.
[17] G. Sankaranarayanan and B. Hannaford, “Virtual coupling schemes for position
coherency in networked haptic environments,” in Proceedings of the BioRob
Conference, (Pisa, Italy), 2006.
[18] M. Cavusoglu, A. Sherman, and F. Tendick, “Design of bilateral teleoperation
controllers for haptic exploration and telemanipulation of soft environments,”
IEEE Transactions on Robotics and Automation, vol. 20, pp. 1–7, August 2002.
[19] H. Lee and M. Chung, “Adaptive controller of a master-slave system for trans-
parent teleoperation,” Journal of Robotic Systems, vol. 15, pp. 465–475, 1998.
[20] R. Anderson and M. Spong, “Asymptotic stability for force reflecting teleop-
erators with time delay,” International Journal of Robotics Research, vol. 11,
pp. 135–149, April 1992.
[21] R. Anderson and M. Spong, “Bilateral control of teleoperators with time delay,”
IEEE Transactions on Automatic Control, vol. 34, pp. 494–501, May 1989.
[22] H. Kazerooni, T. Tsay, and C. Moore, “Telefunctioning: An approach to teler-
obotic manipulations,” in American Control Conference, (San Diego, CA),
pp. 2778–2783, 1990.
[23] Y. Strassberg, A. Goldenberg, and J. Mills, “A new control scheme for bilateral
teleoperating systems: Performance evaluation and comparison,” in Proceedings
of the IEEE/RSJ International Conference on Intelligent Robots and Systems,
(Raleigh, NC), pp. 865–872, July 1992.
[24] M. Tavakoli, A. Aziminejad, R. Patel, and M. Moallem, “High-fidelity bilateral
teleoperation systems and the effect of multimodal haptics,” IEEE Transactions
on Systems, Man, and Cybernetics, vol. 37, pp. 1512–1528, December 2007.
[25] Z. Hu, S. Salcudean, and P. Loewen, “Robust controller design for teleoperation
systems,” in IEEE Conference on Systems, Man, and Cybernetics Intelligent
Systems for the 21st Century, pp. 2127–2132, October 1995.
[26] J. Gil, A. Avello, A. Rubio, and J. Florez, “Stability analysis of a 1 DOF hap-
tic interface using the Routh-Hurwitz criterion,” IEEE Transactions on Control
Systems Technology, vol. 12, pp. 583–588, July 2004.
[27] Y. Yokokohji, E. V. Poorten, and T. Yoshikawa, “Haptic control architectures
based on scattering theory and wave-variables,” in Proceedings of the Virtual
Reality Society of Japan Annual Conference, (Japan), pp. 319–322, 2002.
[28] G. Niemeyer and J. Slotine, “Stable adaptive teleoperation,” IEEE Journal of
Oceanic Engineering, vol. 16, pp. 152–162, January 1991.
[29] S. Stramigioli, A. van der Schaft, B. Maschke, and C. Melchiorri, “Geometric
scattering in robotic telemanipulation,” IEEE Transactions on Robotics and
Automation, vol. 18, pp. 588–595, August 2002.
[30] J. Ryu, D. Kwon, and B. Hannaford, “Stable teleoperation with time-domain
passivity control,” IEEE Transactions on Robotics and Automation, vol. 20,
pp. 365–373, April 2004.
[31] H. Khalil, Nonlinear Systems. New Jersey: Prentice Hall, 2002.
[32] S. Zak, Systems and Control. New York: Oxford University Press, 2003.
[33] E. Sontag, “A Lyapunov-like characterization of asymptotic controllability,”
SIAM Journal of Control and Optimization, vol. 21, pp. 462–471, 1983.
[34] R. Freeman and P. Kokotovic, Lyapunov Design, pp. 932–940. IEEE Press,
1996.
[35] M. Krstic, P. Kokotovic, and I. Kanellakopoulos, Nonlinear and Adaptive Con-
trol Design. New York: John Wiley & Sons, Inc, 1995.
[36] M. Krstic, I. Kanellakopoulos, and P. Kokotovic, “Adaptive nonlinear control
without overparameterization,” Systems & Control Letters, vol. 19, pp. 177–
185, 1992.
[37] J. Park and I. Sandberg, “Universal approximation using radial-basis-function
networks,” Neural Computation, vol. 3, pp. 246–257, 1991.
[38] K. Hornik, “Approximation capabilities of multilayer feedforward networks,”
Neural Networks, vol. 4, pp. 251–257, 1991.
[39] E. Bishop, “A generalization of the Stone-Weierstrass theorem,” Pacific Journal
of Mathematics, vol. 11, pp. 777–783, 1961.
[40] S. Seshagiri and H. Khalil, “Output feedback control of nonlinear systems using
RBF neural networks,” IEEE Transactions on Neural Networks, vol. 11, pp. 69–
79, January 2000.
[41] Y. Li, S. Qiang, X. Zhuang, and O. Kaynak, “Robust and adaptive backstepping
control for nonlinear systems using RBF neural networks,” IEEE Transactions on
Neural Networks, vol. 15, pp. 693–701, May 2004.
[42] R. Sanner and J. Slotine, “Gaussian networks for direct adaptive control,” IEEE
Transactions on Neural Networks, vol. 3, pp. 837–863, November 1992.
[43] S. Ge and C. Wang, “Direct adaptive NN control of a class of nonlinear systems,”
IEEE Transactions on Neural Networks, vol. 13, pp. 214–221, January 2002.
[44] J. Slotine and W. Li, “Adaptive robot control: A new perspective,” in Proceedings
of the 26th Conference on Decision and Control, (Los Angeles, U.S.A.), pp. 192–
197, 1987.
[45] P. Ioannou and P. Kokotovic, “Instability analysis and improvement of robust-
ness of adaptive control,” Automatica, vol. 20, pp. 583–594, 1984.
[46] F. Chen and H. Khalil, “Adaptive control of nonlinear systems using neural
networks - a deadzone approach,” in American Control Conference, (Boston,
MA.), pp. 667–672, 1991.
[47] K. Narendra and A. Annaswamy, “A new adaptive law for robust adaptation
without persistent excitation,” in American Control Conference, (Seattle, WA.),
pp. 1067–1072, 1986.
[48] Y. Fung, Biomechanics: Mechanical Properties of Living Tissues. New York:
Springer-Verlag, 1993.
[49] U. Kuhnapfel, H. Cakmak, and H. Maab, “Endoscopic surgery training us-
ing virtual reality and deformable tissue simulation,” Computers and Graphics,
vol. 24, pp. 671–682, 2000.
[50] S. Eppinger and W. Seering, “Understanding bandwidth limitations in robot
force control,” in IEEE International Conference on Robotics and Automation,
1987.
[51] J. Chow and C. Hanley, “Singular perturbation analysis of high-frequency filter
design,” International Journal of Control, vol. 51, pp. 705–720, October 1990.
[52] C. Macnab, G. D’Eleuterio, and M. Meng, “CMAC adaptive control of flexible-joint
robots using backstepping with tuning functions,” in Proceedings of
the IEEE International Conference on Robotics and Automation, vol. 3, (New
Orleans, U.S.A.), pp. 2679–2686, 2004.
[53] L. Hsu and R. Costa, “Bursting phenomena in continuous-time adaptive systems
with a σ-modification,” IEEE Transactions on Automatic Control,
vol. 32, pp. 84–86, 1987.
[54] C. Macnab, “Preventing bursting in approximate-adaptive control when using
local basis functions,” Fuzzy Sets and Systems, vol. 160, pp. 439–462, 2009.
[55] L. Lubin, S. Grocott, and M. Athans, H2 (LQG) and H∞ Control, pp. 651–661.
IEEE Press, 1996.
[56] D. Lawrence, L. Pao, M. Salada, and M. Dougherty, “Quantitative experimen-
tal analysis of transparency and stability in haptic interfaces,” in Proceedings
of the ASME International Mechanical Engineering Congress and Exposition,
(Atlanta, GA.), pp. 441–449, November 1996.
[57] P. Millman and J. Colgate, “Effects of non-uniform environment damping on
haptic perception and performance of aimed movements,” in Proceedings of the
International Mechanical Engineering Congress and Exposition, (San Francisco,
CA.), 1995.
[58] D. Richert, A. Beirami, and C. Macnab, “Neuro-adaptive control of robotic
manipulators using a supervisor inertia matrix,” in Proceedings of the 4th
International Conference on Autonomous Robots and Agents, (Wellington, N.Z.),
pp. 634–639, 2009.
Appendix A
Stability Analysis Including Disturbances
The stability of the system is examined when disturbances are included in the system
model, with particular attention to the robustifying behavior of $F_{c,rob}$. Since a
uniform ultimate bound has already been established for the system in the absence of
disturbances, it is sufficient to show that the system with only disturbances is likewise
uniformly ultimately bounded, using the same Lyapunov functions of Chapter 4. Thus,
the analysis begins where disturbances appear in the derivative of the Lyapunov
candidate function of Eq. (4.9), and only the terms of importance are considered. In
the following analysis, only disturbances that arise from modeling errors are considered.
The analysis below is easiest to follow alongside Chapter 4, noting that a disturbance
term is added every time a neural network is used,
$$\dot{V}_{1,d} = s\bigl(d_1 + d_p[-F_m + F_c] + p\alpha_{rob}\bigr). \tag{A.1}$$
$\alpha_{rob}$ has been designed such that
$$\dot{V}_{1,d} = s\bigl(d_1 + d_p[-F_m+F_c] - |s|^{1.1}\bigl[\mu_2|{-F_m+F_c}|^{1.1} + \mu_1\bigr]\bigr). \tag{A.2}$$
The second stage of backstepping yields
$$\dot{V}_{2,d} = \dot{V}_{1,d} + z\bigl(F_{c,rob} + \dot{\hat{\alpha}} - \dot{\alpha}\bigr), \tag{A.3}$$
where $\dot{\hat{\alpha}}$ denotes the implemented approximation of $\dot{\alpha}$; considering only the disturbances,
$$\dot{V}_{2,d} = \dot{V}_{1,d} + z\bigl(F_{c,rob} - \alpha_{nom,d} - \alpha_{rob,d}\bigr). \tag{A.4}$$
Examining the disturbances that arise from $\alpha_{nom}$ (an extension of the analysis in Eq. (4.25)) yields
$$\alpha_{nom,d} = d_2 + p^{-1}\Bigl(\frac{\partial\phi_1}{\partial s}w_1 + G_1\Bigr)\bigl(d_1 + d_p[-F_m+F_c]\bigr), \tag{A.5}$$
and differentiating $\alpha_{rob}$,
$$\begin{aligned}
\dot{\alpha}_{rob} ={}& -p^{-1}|s|^{1.1}\frac{d}{dt}\bigl(\mu_2|{-F_m+F_c}|^{1.1} + \mu_1\bigr)
- p^{-1}\frac{d}{dt}\bigl(|s|^{1.1}\bigr)\bigl(\mu_2|{-F_m+F_c}|^{1.1} + \mu_1\bigr) \\
&- \frac{d}{dt}\bigl(p^{-1}\bigr)|s|^{1.1}\bigl(\mu_2|{-F_m+F_c}|^{1.1} + \mu_1\bigr),
\end{aligned} \tag{A.6}$$
$$\begin{aligned}
={}& -p^{-1}|s|^{1.1}\Bigl(1.1\mu_2|{-F_m+F_c}|^{0.1}\bigl[\dot{F}_c - \dot{F}_m\bigr]\mathrm{sgn}(-F_m+F_c)\Bigr) \\
&- 1.1\,p^{-1}|s|^{0.1}\dot{s}\,\mathrm{sgn}(s)\bigl(\mu_2|{-F_m+F_c}|^{1.1} + \mu_1\bigr) \\
&+ \dot{p}\,p^{-2}|s|^{1.1}\bigl(\mu_2|{-F_m+F_c}|^{1.1} + \mu_1\bigr).
\end{aligned} \tag{A.7}$$
Two of these terms, $\dot{s}$ and $\dot{F}_m$, give rise to disturbances in $\dot{\alpha}_{rob}$ because neural
network approximations must be used to implement them,
$$\begin{aligned}
\alpha_{rob,d} ={}& -p^{-1}|s|^{1.1}\Bigl(1.1\mu_2|{-F_m+F_c}|^{0.1}\,d_2\,\mathrm{sgn}(-F_m+F_c)\Bigr) \\
&- 1.1\,p^{-1}|s|^{0.1}\bigl(d_1 + d_p[-F_m+F_c]\bigr)\mathrm{sgn}(s)\bigl(\mu_2|{-F_m+F_c}|^{1.1} + \mu_1\bigr).
\end{aligned} \tag{A.8}$$
Renaming
$$\kappa_1 = -p^{-1}\Bigl(\frac{\partial\phi_1}{\partial s}w_1 + G_1\Bigr), \tag{A.9}$$
$$\kappa_2 = p^{-1}|s|^{1.1}\mu_2|{-F_m+F_c}|^{0.1}\,\mathrm{sgn}(-F_m+F_c), \tag{A.10}$$
$$\kappa_3 = p^{-1}|s|^{0.1}\bigl(\mu_2|{-F_m+F_c}|^{1.1} + \mu_1\bigr)\mathrm{sgn}(s), \tag{A.11}$$
and substituting these results back into Eq. (A.4),
$$\begin{aligned}
\dot{V}_{2,d} ={}& \dot{V}_{1,d} + z\bigl(F_{c,rob} - d_2 + \kappa_1[d_1 + d_p(-F_m+F_c)] \\
&+ \kappa_2 d_2 + \kappa_3[d_1 + d_p(-F_m+F_c)]\bigr),
\end{aligned} \tag{A.12}$$
$$\begin{aligned}
={}& s\bigl(d_1 + d_p[-F_m+F_c] - |s|^{1.1}\bigl[\mu_2|{-F_m+F_c}|^{1.1} + \mu_1\bigr]\bigr) \\
&+ z\bigl(-d_2 + \kappa_1 d_1 + \kappa_1[-F_m+F_c]d_p + \kappa_2 d_2 + \kappa_3 d_1 + \kappa_3[-F_m+F_c]d_p + F_{c,rob}\bigr),
\end{aligned} \tag{A.13}$$
and including the robust control $F_{c,rob}$,
$$\begin{aligned}
\dot{V}_{2,d} ={}& s\bigl(d_1 - \mu_1|s|^{1.1}\bigr) \\
&+ s\bigl(d_p[-F_m+F_c] - \mu_2|s|^{1.1}|{-F_m+F_c}|^{1.1}\bigr) \\
&+ z\bigl(-d_2 - \mu_3|z|^{1.1}\bigr) \\
&+ z\bigl(\kappa_1 d_1 - \mu_4|\kappa_1|^{1.1}|z|^{1.1}\bigr) \\
&+ z\bigl(\kappa_1[-F_m+F_c]d_p - \mu_5|\kappa_1[-F_m+F_c]|^{1.1}|z|^{1.1}\bigr) \\
&+ z\bigl(\kappa_2 d_2 - \mu_6|\kappa_2|^{1.1}|z|^{1.1}\bigr) \\
&+ z\bigl(\kappa_3 d_1 - \mu_7|\kappa_3|^{1.1}|z|^{1.1}\bigr) \\
&+ z\bigl(\kappa_3[-F_m+F_c]d_p - \mu_8|\kappa_3[-F_m+F_c]|^{1.1}|z|^{1.1}\bigr),
\end{aligned} \tag{A.14}$$
which defines eight regions. Each region contributes to $\dot{V}_{2,d}$ being negative definite,
and thus creates a bound for $\dot{V}_{2,d}$, according to the results in Table A.1. Using the
same approach as in Section 2.1.2, a conservative bound on $\dot{V}_{2,d}$ is found by noting
that $\dot{V}_{2,d}$ is certainly negative if
$$|s| > \left(\frac{d_{1,max}}{\mu_1}\right)^{0.91} \cup \left(\frac{d_{p,max}}{\mu_2}\right)^{0.91} = \delta_s, \tag{A.15}$$
Table A.1: Bounds which contribute to $\dot{V}_{2,d}$ being negative definite

Bound  Condition
  1    $|s| > (d_{1,max}/\mu_1)^{0.91}$
  2    $|s|\,|{-F_m+F_c}|^{0.091} > (d_{p,max}/\mu_2)^{0.91}$
  3    $|z| > (d_{2,max}/\mu_3)^{0.91}$
  4    $|z|\,|\kappa_1|^{0.091} > (d_{1,max}/\mu_4)^{0.91}$
  5    $|z|\,|\kappa_1(-F_m+F_c)|^{0.091} > (d_{p,max}/\mu_5)^{0.91}$
  6    $|z|\,|\kappa_2|^{0.091} > (d_{2,max}/\mu_6)^{0.91}$
  7    $|z|\,|\kappa_3|^{0.091} > (d_{1,max}/\mu_7)^{0.91}$
  8    $|z|\,|\kappa_3(-F_m+F_c)|^{0.091} > (d_{p,max}/\mu_8)^{0.91}$
or
$$|z| > \left(\frac{d_{2,max}}{\mu_3}\right)^{0.91} \cup \left(\frac{d_{1,max}}{\mu_4}\right)^{0.91} \cup \left(\frac{d_{p,max}}{\mu_5}\right)^{0.91} \cup \left(\frac{d_{2,max}}{\mu_6}\right)^{0.91} \cup \left(\frac{d_{1,max}}{\mu_7}\right)^{0.91} \cup \left(\frac{d_{p,max}}{\mu_8}\right)^{0.91} = \delta_z. \tag{A.16}$$
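To see where the exponent 0.91 comes from, consider the first region of Eq. (A.14) as a worked step (a sketch, assuming the robust term acts in the direction $-\mathrm{sgn}(s)$ so that its product with $s$ is negative):

```latex
s\bigl(d_1 - \mu_1|s|^{1.1}\,\mathrm{sgn}(s)\bigr)
\le |s|\bigl(d_{1,max} - \mu_1|s|^{1.1}\bigr) < 0
\quad\Longleftrightarrow\quad
|s| > \left(\frac{d_{1,max}}{\mu_1}\right)^{1/1.1}.
```

Since $1/1.1 \approx 0.91$ (and $0.1/1.1 \approx 0.091$), the remaining seven rows of Table A.1 follow the same pattern.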
Combining this result with that of Eq. (4.32) redefines the ultimate bound on the
system error to be
$$\xi_b = \sqrt{\,[\delta_\xi^2] \cup [\delta_s^2 + \delta_z^2] + \delta_w^2\,}. \tag{A.17}$$
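The bound computation of Eqs. (A.15)-(A.16) can be sketched numerically. In this sketch (not part of the thesis), the union of scalar thresholds is interpreted conservatively as their maximum, and the function name and argument layout are illustrative only:

```python
def ultimate_bounds(d1_max, dp_max, d2_max, mu):
    """Conservative radii (delta_s, delta_z) from Eqs. (A.15)-(A.16).

    mu maps the robust-gain index (1..8) to mu_i > 0; the exponent
    0.91 is 1/1.1, matching the |.|^{1.1} robust terms.
    """
    e = 1.0 / 1.1  # the 0.91 exponent
    delta_s = max((d1_max / mu[1]) ** e, (dp_max / mu[2]) ** e)
    delta_z = max((d2_max / mu[3]) ** e, (d1_max / mu[4]) ** e,
                  (dp_max / mu[5]) ** e, (d2_max / mu[6]) ** e,
                  (d1_max / mu[7]) ** e, (dp_max / mu[8]) ** e)
    return delta_s, delta_z
```

Increasing any robust gain $\mu_i$ shrinks the corresponding threshold, which is exactly the trade-off discussed in Appendix B: large robust gains give small bounds but expend control effort on rarely encountered worst-case disturbances.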
Appendix B
Robust Control for Scaled Tuning Functions
As discussed in Appendix A, it is sometimes undesirable to use robust control terms
to make the bounds due to disturbances small. Indeed, in doing so the control places
far too much effort into overcoming disturbance terms rather than driving state errors
to equilibrium. This problem is compounded because the state bounds are derived
from worst-case scenarios, which are rarely encountered.

Using a scaled tuning function approach, robust behavior is maintained while the
nominal control is still able to perform well. Nevertheless, it is necessary to design a
robustifying control term to ensure boundedness of the states when the tuning
functions are scaled.
The proof begins from Eq. (4.17) with a redefinition of the Lyapunov candidate,
$$V_2 = V_1 + \frac{\gamma}{2}z^2 + \frac{1}{2\beta_2}\tilde{w}_2^T\tilde{w}_2, \tag{B.1}$$
where the addition of $0 \le \gamma \le 1$ is paramount for the proof. Following the stability
proof in Chapter 4 from this point onwards, the final Lyapunov function derivative is
$$\begin{aligned}
\dot{V}_2 ={}& -G_1 s^2 - G_2 z^2 + (1-\gamma)spz + \gamma z u_{r,\tau} \\
&+ \tilde{p}\Bigl(\tau_p + \gamma(-F_m+F_c)\Bigl(\frac{\partial\phi_1}{\partial s}w_1 + G_1\Bigr)p^{-1}z - \frac{1}{\beta_p}\dot{\hat{p}}\Bigr) \\
&+ \tilde{w}_1^T\Bigl(\tau_1 + \gamma\phi_1^T\Bigl(\frac{\partial\phi_1}{\partial s}w_1 + G_1\Bigr)p^{-1}z - \frac{1}{\beta_1}\dot{\hat{w}}_1\Bigr)
+ \tilde{w}_2^T\Bigl(\phi_2^T z - \frac{1}{\beta_2}\dot{\hat{w}}_2\Bigr),
\end{aligned} \tag{B.2}$$
where an additional robust control $u_{r,\tau}$ is included for later design. Using the designed
weight update laws defined in Eqs. (3.26), (3.29), and (3.34) yields
$$\dot{V}_2 = -G_1 s^2 - G_2 z^2 + (1-\gamma)spz + \gamma z u_{r,\tau} - \zeta\tilde{p}(p - \tilde{p}) + \nu_1\tilde{w}_1^T w_1 + \nu_2\tilde{w}_2^T w_2. \tag{B.3}$$
By treating $(1-\gamma)spz$ as a system disturbance, now design
$$u_{r,\tau} = \frac{\gamma-1}{\gamma}\,sp, \tag{B.4}$$
to end up with the exact expression in Eq. (4.29). However, it is found that with
$\gamma \ll 1$, $u_{r,\tau}$ tends to become large. To ensure stability while still taking advantage
of the weighted tuning function method, the following approach is used. Assuming that
$\gamma \ll 1$, approximate $\dot{V}_2$ by
$$\dot{V}_{2,s,z} \approx -G_1 s^2 - G_2 z^2 + spz. \tag{B.5}$$
The states related to $w_{1,2}$ and $p$ have been excluded because they have already been
shown to be bounded. This Lyapunov derivative can therefore be evaluated online. A
switching criterion as proposed in [52] is used,
$$\gamma = \begin{cases} 1 & \text{if } -G_1 s^2 - G_2 z^2 + spz > 0, \\ \gamma_{design} & \text{otherwise}, \end{cases} \tag{B.6}$$
thus ensuring that $V$ is bounded. The authors of [52] also note that changing the
learning rates $\beta_{1,p}$ when $\gamma$ changes improves performance.
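The switching rule of Eq. (B.6) together with the robust term of Eq. (B.4) can be sketched as follows (the function names and scalar signatures are illustrative, not from the thesis):

```python
def gamma_switch(s, z, p, G1, G2, gamma_design):
    """Eq. (B.6): fall back to gamma = 1 whenever the simplified
    Lyapunov derivative estimate of Eq. (B.5) fails to be negative.
    G1, G2 > 0 are control gains; gamma_design is the scaled value."""
    V2_sz = -G1 * s ** 2 - G2 * z ** 2 + s * p * z  # Eq. (B.5)
    return 1.0 if V2_sz > 0 else gamma_design

def u_r_tau(s, p, gamma):
    """Eq. (B.4): robust term cancelling the (1 - gamma)*s*p*z cross term."""
    return (gamma - 1.0) / gamma * s * p
```

Note that `u_r_tau` vanishes when `gamma == 1`, so switching to $\gamma = 1$ simultaneously removes the potentially large robust term, consistent with the discussion following Eq. (B.4).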