UNIVERSITY OF CALGARY
Robust Control Design for Teleoperation Systems with Haptic Feedback
using Neural-Adaptive Backstepping
by
Dean Matthew Richert
A THESIS
SUBMITTED TO THE FACULTY OF GRADUATE STUDIES
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE
DEGREE OF MASTER OF SCIENCE
DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING
CALGARY, ALBERTA
APRIL, 2010
© Dean Matthew Richert 2010
Abstract
Teleoperation holds a promising future as humans push the limits of technology by allowing human presence in otherwise hostile or remote environments. This thesis specifically examines teleoperation with force-reflecting haptic devices as it pertains to robotic surgery. Neural networks operate online to learn the unknown system dynamics and provide a completely adaptive control design. Three novel contributions are made in this thesis. First, a unique error definition allows for force control in constrained motion while permitting position control in unconstrained motion. Second, the backstepping technique smooths out control signals and ensures that high-frequency vibrations of the robot/environment dynamics are not excited by the proposed controller. Finally, a novel supervisory neural network update law ensures fast convergence of the neural network weights and improves robustness. The entire system is shown to be globally Lyapunov stable. Using the Lyapunov redesign method, a robust control law is also derived.
Acknowledgements
Thanks must first go to my wife, Jane, who patiently (and sometimes even atten-
tively) listened to me explain my daily findings. I’m sure she is the only student in the
International Development department who knows the difference between Lyapunov
stability and stability by passivity. If you ever talk to her about your research she’ll
be sure to ask you, “Is your model linear or nonlinear?” Our relationship has been
a blessing throughout my studies and has brought joy and laughter to me even in
stressful and discouraging times. She has undeserved belief in me and the motivation
for all I do comes from her and God.
From a research perspective, my supervisor Dr. Chris Macnab and co-supervisor
Dr. Jeff Pieper deserve many thanks. They have both given me invaluable direction
and supplied me with the resources I needed to complete this work. Indeed, the
vision and many of the preliminary details of this work were inspired by Dr. Macnab
and his foresight into this topic ensured me a gentle road to graduation. I would
also like to thank Dr. Macnab for particular opportunities he made available to me
including attending conferences and teaching tutorial periods.
Finally, I would like to thank my office mates: Javad, Sanaz, Tayyab, and Khalid.
They have always respected me and kept our lab a functional place to do work. In
particular, Tayyab and Javad have been instrumental in my attaining this degree.
I’ve been thankful that Tayyab and I were able to take all of the same courses and
our discussions immensely helped me understand the course material. Javad always
asked difficult questions that would challenge me to explore the very foundations of
control systems, and his work ethic inspired me every day.
Table of Contents
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
Table of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 Introduction to Teleoperation Vocabulary . . . . . . . . . . . 4
1.2 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.3 State of the Art in Teleoperation (literature review) . . . . . . . . . . 8
1.3.1 Sensor choice . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.3.2 Choice of control system . . . . . . . . . . . . . . . . . . . . . 10
1.4 Introduction to the Proposed Solution . . . . . . . . . . . . . . . . . 13
2 Background Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.1 Lyapunov Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.1.1 Control Lyapunov Function . . . . . . . . . . . . . . . . . . . 17
2.1.2 Lyapunov Redesign (robust design) . . . . . . . . . . . . . . . 18
2.1.3 Backstepping . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.1.4 Adaptive Backstepping and Tuning Functions . . . . . . . . . 24
2.2 Radial Basis Functions Networks . . . . . . . . . . . . . . . . . . . . 26
2.2.1 RBFN in controller design . . . . . . . . . . . . . . . . . . . . 28
3 Proposed Controller Design . . . . . . . . . . . . . . . . . . . . . . . 32
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.2 System description . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.3 Proposed Control Law . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.3.1 Design choice 1: Error definition . . . . . . . . . . . . . . . . . 34
3.3.2 Design choice 2: Backstepping . . . . . . . . . . . . . . . . . . 36
3.3.3 RBFNs used in controller . . . . . . . . . . . . . . . . . . . . 40
3.3.4 Control Law . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4 Stability Proof . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
5 Alternative Control Designs for Comparison . . . . . . . . . . . . . . 51
5.1 H2 Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
5.1.1 Discussion on H2 controller . . . . . . . . . . . . . . . . . . . 54
5.2 Output Feedback Control . . . . . . . . . . . . . . . . . . . . . . . . 55
5.2.1 Discussion on output feedback control . . . . . . . . . . . . . 56
6 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
6.1 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
6.1.1 Master device . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
6.1.2 Virtual components . . . . . . . . . . . . . . . . . . . . . . . . 62
6.1.3 Implementation Considerations . . . . . . . . . . . . . . . . . 64
6.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
6.2.1 Stiff Contact Test and Loss of Contact Test (comparison with
H2 and output feedback controller) . . . . . . . . . . . . . . . 65
6.2.2 Proposed controller performance . . . . . . . . . . . . . . . . . 77
6.2.2.1 Filtering properties of Backstepping . . . . . . . . . 77
6.2.2.2 Neural Network Outputs, Boundedness of Neural Net-
work Weights, Evolution of System States . . . . . . 82
6.2.2.3 Time delay . . . . . . . . . . . . . . . . . . . . . . . 90
7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
7.0.3 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
A Stability Analysis Including Disturbances . . . . . . . . . . . . . . . . 106
B Robust control for scaled tuning functions . . . . . . . . . . . . . . . 110
List of Tables
6.1 Experiment Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . 64
6.2 Average maximum measured force at the slave end effector due to the
proposed controller, H2 controller, and the output feedback controller 68
A.1 Bounds which contribute to V2,d being negative definite . . . . . . . . 109
List of Figures
1.1 neuroArm haptic hand controllers . . . . . . . . . . . . . . . . . . . . 3
1.2 neuroArm surgical (slave) robot manipulators . . . . . . . . . . . . . 4
1.3 Teleoperation setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.4 Impedance interpretation of a teleoperation system . . . . . . . . . . 6
1.5 Linear system based control architecture . . . . . . . . . . . . . . . . 11
2.1 General shape of the robust control . . . . . . . . . . . . . . . . . . . 21
2.2 General spatial derivative of robust control . . . . . . . . . . . . . . . 22
3.1 Mechanical model of a 1 DOF surgical slave robot in contact with an
environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
5.1 Output feedback control architecture . . . . . . . . . . . . . . . . . . 55
6.1 Master haptic device used for experiments. . . . . . . . . . . . . . . . 59
6.2 Screen capture of the Simulink model used for experiments. . . . . . . 62
6.3 Force profile of the simulated remote environment . . . . . . . . . . . 63
6.4 Comparison of the three controllers for the stiff contact test. Proposed
controller hits the wall with 118N less force than the H2 controller and
21N less force than the output feedback controller. . . . . . . . . . . 69
6.5 Zoomed in version of Fig. 6.4 to emphasize the performance benefit
of the proposed controller . . . . . . . . . . . . . . . . . . . . . . . . 70
6.6 A version of Fig. 6.5 with only the proposed controller performance.
Axes are the same as in Fig. 6.5. . . . . . . . . . . . . . . . . . . . . 71
6.7 Comparison between the proposed controller and an output feedback
controller for the loss of contact test. The proposed controller has
less positional overshoot than the output feedback controller, but a
greater negative velocity. . . . . . . . . . . . . . . . . . . . . . . . . . 72
6.8 Comparison of the three controllers for the stiff contact test using the
PI human model. The controller response is quite similar to those
shown in Fig. 6.4. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.9 Zoomed in version of Fig. 6.8 . . . . . . . . . . . . . . . . . . . . . . 74
6.10 A version of Fig. 6.9 with only the proposed controller performance.
Axes are the same as in Fig. 6.9. . . . . . . . . . . . . . . . . . . . . 75
6.11 Comparison between the proposed controller and an output feedback
controller for the loss of contact test using the PI human model. Again,
the controller response is quite similar to the results shown in Fig. 6.7. 76
6.12 Filtered control signal, Fc, without backstepping for various filter
break frequencies. Contact with wall is made when Fc is approxi-
mately 2. Higher break frequencies allow excitation of the system’s
normal modes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
6.13 Frequency spectrum of the filtered control signal, Fc, without back-
stepping for various filter break frequencies. Higher frequency compo-
nents tend to excite the natural modes of the system. . . . . . . . . . 80
6.14 Showing the filtering properties of the backstepping method. The
backstepping technique attenuates high frequency control signals and
thus allows stable operation. . . . . . . . . . . . . . . . . . . . . . . . 81
6.15 Neural Network outputs in stiff contact. A wall is hit at around 3.5s
and the neural network outputs react accordingly. . . . . . . . . . . . 84
6.16 Neural Network outputs in loss of contact. The puncture occurs at
around 2.5s and neural network outputs react quickly. . . . . . . . . . 85
6.17 Root mean square neural network weights for 200 trials. Weight con-
vergence is achieved. . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
6.18 Maximum neural network weights for 200 trials. Weight convergence
is achieved. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
6.19 Root mean square neural network weight when there is no supervised
learning in ṗ. Instability occurs after 28 trials. . . . . . . . . . . . . . 88
6.20 Convergence of states s and z over 200 trials. . . . . . . . . . . . . . 89
6.21 Force response of the proposed controller in the presence of time delay
for a stiff contact test. Impact force remains the same for arbitrary
time delays. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
6.22 Slave position in the presence of time delay for a loss of contact test
(contact lost at x = 0.1m). Positional overshoot increases with in-
creased time delay. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
Chapter 1
Introduction
1.1 Motivation
Teleoperation of robot manipulators provides a means for humans to manipulate
remote, virtual, or otherwise hostile environments. Applications vary widely, from backhoe operation in the construction industry, to video games, to deep sea exploration. Humans by nature interact in many different ways with our environment.
Humans are equipped with many sensing devices that help us effectively manipulate
objects to our desire. Sight, smell, touch, hearing, and taste are famously referred to
as our ’five senses’ and are the primary methods by which we attain information from
our surroundings. In teleoperation terms, we are multi-modal in our interactions [1].
The sense of sight tends to be of utmost importance in teleoperation applications [2].
Millan and Colgate [3] have stated that 70% of our sensory input as humans is visual
information. As a result, significant technological advances have been made in past
years in trying to give humans a sense of physical presence in remote scenes. Cam-
eras, video recorders, and television are but a few such examples. Research also
suggests that touch vastly improves the performance of human operators in teleop-
eration applications [4–8]. Though the sense of touch only accounts for 5% of our
sensory input, its role in teleoperation is invaluable [3]. Indeed, current advances in
force-reflecting devices echo this demand to be able to feel the remote environment
we are manipulating. In the 1950s the world saw its first force-reflecting applica-
tion in a large modern aircraft. The use of servomechanisms to operate mechanical
control surfaces in aircraft effectively eliminated any sense of force applied at these
control surfaces. High frequency buffeting at the brink of a stall no longer warned
the pilot of the impending danger. As a result and out of necessity, a force reflecting
device was required. This thesis examines another application in which the sense of
touch is paramount: robot assisted surgery. Because surgeons must exhibit extreme
delicacy and precision, visual cues alone are inadequate.
Let’s first examine the motivation for robot assisted surgery. A recent project
labeled “Project neuroArm” at the Foothills Medical Research Center is, in more ways than one, the first of its kind in terms of robot assisted surgery (detailed below)
and is a primary motivator for this work [9]. The ultimate goal of neuroArm is
to provide the means for neuro-surgeons to perform surgery on patients while the
patient is being monitored by Magnetic Resonance Imaging (MRI), an environment
otherwise hostile to surgeons. Allowing the surgeon access to near real-time imagery
of the patient’s brain can ensure that a surgeon removes all traces of cancerous tumors
at the surgery’s conclusion. The result is fewer follow-up surgeries and hence reduced risk to patients and less strain on already over-booked surgical
equipment. The neuroArm system also employs two haptic (force-feedback) hand
controllers (Fig. 1.1) that the surgeon uses to control the surgical robot. In its current
configuration, manipulating the hand controllers provides positional commands to
each of the surgical robot arms (Fig. 1.2) while force sensor measurements at the
robot/environment interface provide force commands to the haptic devices. Other
uses for robots in surgery include minimally invasive surgery (in which only small incisions are made to gain access to a desired organ; the commercially available Da Vinci system [10] eliminates the need for large abdominal incisions in gynecological
surgeries), telesurgery [11] (in which a centrally located surgeon can perform surgery
Figure 1.1: neuroArm haptic hand controllers
in remote locations), and tremor reduction or improved performance at the microscale level [12].
A large body of research agrees that force-reflecting haptic devices enhance
task performance in robot assisted surgery [13]. However, the addition of haptic
devices certainly complicates the system as well. For instance, there is a necessity for
haptic devices to reproduce forces at a certain level of accuracy. Indeed, inaccurate
or misleading forces reflected by the haptic device could be detrimental, causing the
surgeon to apply excessive force as an example. Accordingly, research in haptics
is dominated by engineers as they use their skills to improve force transparency
between the surgeon and the remote environment. Progress on this front is being made from both a mechanical approach and a control systems approach. Haptic devices pose some unique challenges, addressed in this thesis, that a mechanical approach cannot solve.
Figure 1.2: neuroArm surgical (slave) robot manipulators
1.1.1 Introduction to Teleoperation Vocabulary
In order to properly discuss the problems at hand and their associated proposed
solutions, an understanding of certain terms is necessary. Consider a typical teleop-
eration setup found in Fig. 1.3. Starting at the left is the human. From a block
diagram perspective, the human operator is part of the system architecture. The
human receives information about the remote environment from the master device,
processes this information, and based on their intended task sends information back
to the master device. In this thesis, the master device is a robot manipulator, re-
ferred to as the haptic device, outfitted with actuators (to send information, such
as a force, to the human) and encoders and force sensors (to receive information
from the human). The interaction between the human and the master device is a
[Block diagram omitted: Human Operator — Master Device — Controller — Communications (T) — Controller — Slave Device — Remote Environment, split into master side and slave side]
Figure 1.3: Teleoperation setup
physical connection between the human’s hand and the master device. Both the
human and master device reside in a central control room, physically removed from
the remote environment. Information received by the master is then sent through a
communications channel which distorts the information through time delays, noise,
or otherwise. A controller at the slave (or remote environment) side receives the
human commanded information, processes it, and generates a control signal sent to
the actuators of the slave device. Again, in this thesis the slave device is a robot
manipulator equipped with actuators (to manipulate the environment) and encoders
and force sensors (to receive information from the environment). From here, the
process is reversed back to the human.
By construction a teleoperation architecture operates in feedback, with the human
and environment closing the loop. Thus, any additional internal feedback from the
slave/environment to the controller (not shown in Fig. 1.3) complicates the control
design.
To further understand the following discussion, formal definitions of impedance,
admittance, and transparency are given below. These terms follow from electrical
network theory. The analogy between electrical and mechanical systems gives rise to
a relation between the effort/flow pair in mechanical systems and the voltage/current
pair in electrical systems.
[Block diagram omitted: Human Operator — Teleoperator — Environment]
Figure 1.4: Impedance interpretation of a teleoperation system
Definition 1. The impedance Z ∈ ℝ^(n×n) of an n-port mechanical system maps velocity v ∈ ℝ^(n×1) to force f ∈ ℝ^(n×1):

f = Zv. (1.1)

Definition 2. Conversely, the admittance Y ∈ ℝ^(n×n) of an n-port mechanical system maps force f ∈ ℝ^(n×1) to velocity v ∈ ℝ^(n×1):

v = Y f. (1.2)

It follows that

Y = Z⁻¹. (1.3)

Definition 3. Consider the teleoperation setup as in Fig. 1.4. A system is said to be transparent if the impedance perceived by the human, Z_t, is the same as the impedance of the environment:

Z_t = Z_e. (1.4)

The same can be said for the admittances. In this case it is said that the teleoperator achieves perfect transparency.
Note that these definitions do not determine which quantities are inputs and
outputs, but only relate system states to each other.
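These definitions can be checked numerically. The following sketch (illustrative values only, not from the thesis) builds an arbitrary 2-port impedance matrix and verifies the relations in Eqs. (1.1)–(1.4):

```python
import numpy as np

# Hypothetical 2-port impedance matrix (N·s/m), chosen only for illustration.
Z = np.array([[4.0, 1.0],
              [1.0, 3.0]])

# Definition 1 / Eq. (1.1): impedance maps velocity to force, f = Z v.
v = np.array([0.5, -0.2])
f = Z @ v

# Definition 2 / Eq. (1.3): the admittance is the inverse mapping, Y = Z^{-1}.
Y = np.linalg.inv(Z)

# Mapping the force back through the admittance recovers the velocity, v = Y f.
v_recovered = Y @ f
print(np.allclose(v_recovered, v))  # True

# Definition 3 / Eq. (1.4): perfect transparency means the impedance felt by
# the human equals the environment impedance, Z_t = Z_e.
Z_t, Z_e = Z.copy(), Z.copy()
print(np.allclose(Z_t, Z_e))  # True in this idealized case
```

The check confirms that the impedance and admittance descriptions are two views of the same velocity/force relation.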
1.2 Problem Statement
The biggest obstacle that control engineers must address in haptic systems is the
fact that the human and, more troublesome, the environment enter into the stability
analysis. When interacting with arbitrary environments, the environment can exhibit
zero stiffness (slave robot motion in free space), near infinite stiffness (slave motion in
constrained motion, such as pressing against a wall), and anywhere in between. The environment stiffness enters directly into the stability analysis and affects the gain margin of the system. Thus, it is difficult to design a controller that maintains stability and desired performance over a variety of environment interactions.
This thesis aims at designing a controller which maintains stability and desired
performance in all possible scenarios, without any kind of switching or gain-scheduled control, which, as any seasoned control designer understands, is difficult to implement in reality. In particular, the controller is tested in the two most challenging scenarios, transitions between extreme environment stiffnesses:
• sudden loss of contact between the slave and the environment. Real-life situations that reflect this case are puncturing through tissue or sliding off the edge of an object;
• sudden contact with a near-rigid environment as, for example, when the slave hits a wall, table, or bone.
This second scenario also gives rise to a couple of other issues surrounding haptic
control design [14]. When a slave robot comes into contact with a stiff environment
it tends to have a large impact force which can cause damage to expensive force
sensors, the slave robot in general, and can be most detrimental to the environment
(i.e. surgical patient). Also, following the initial impact the slave robot has trouble
settling onto the hard surface and repeatedly bounces on and off the surface.
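The stiffness extremes described above can be illustrated with a simple unilateral spring-wall model (a hypothetical sketch for intuition only; it is not the environment model used in this thesis):

```python
def environment_force(x, x_wall=0.1, k_e=1.0e4):
    """Unilateral spring contact (illustrative only): zero stiffness in
    free space, stiffness k_e once the slave penetrates the wall at x_wall.
    Increasing k_e mimics a near-rigid surface such as bone or a table."""
    penetration = x - x_wall
    return k_e * penetration if penetration > 0.0 else 0.0

# Free space: the environment pushes back with zero force (zero stiffness).
print(environment_force(0.05))   # 0.0
# Contact: even 1 cm of overshoot past a stiff wall produces a large force,
# which is exactly the impact problem described above.
print(environment_force(0.11))   # ~100 N for k_e = 1e4
```

The discontinuous jump in effective stiffness at x_wall is what makes a single fixed-gain controller hard to design for both regimes.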
1.3 State of the Art in Teleoperation (literature review)
This section will introduce some of the most advanced and common setups of haptic
systems. Differences between haptic systems arise from
• which sensors are mounted on the slave/master. Typical choices are encoders
and/or force sensors.
• what type of control system is used.
• what properties/behavior does the remote environment exhibit.
1.3.1 Sensor choice
Considering the variety and number of sensors potentially appropriate for the setup in Fig. 1.3, there are many possible configurations of which states to control. Generally, the system states of most interest are force and position. The literature categorizes these possibilities into 2-, 3-, and 4-channel architectures.
4-channel architecture refers to master positions and human commanded force
measurements being sent to the slave-side controller while slave-position and envi-
ronment force measurements are sent to the master-side controller. An advantage to
this type of architecture is that both position and force tracking occur at both ends of
the system. Papers that examine this type of architecture [15,16] show that perfect
transparency can be achieved. However, this claim is made only theoretically. Sys-
tem transparency is analyzed using linear system theory and the assumption is that
the environment behavior (impedance or stiffness) is known. Performance degrades
as the environment deviates from the estimated impedance and more importantly
the robustness margin may vanish. For systems operating in highly defined task
spaces, this architecture is appropriate.
2-channel architecture sends one of master-position or human commanded force
to the slave-side controller, while either one of slave-position or environment mea-
sured force is sent to the master-side controller. Literature characterizes these ar-
chitectures as position-position [15, 17] position-force [15, 18–20], force-position [21],
and force-force [22, 23]. An advantage to this architecture is that fewer sensors are
required and a relatively good sense of transparency is maintained. Controller design
tends to be simpler in 2-channel architectures. An assumption that makes 2-channel
architecture valid is that the human operator and the environment behave either as
an admittance or impedance. For example, a system under position control at the
slave-side will interpret the remote environment as an impedance, receiving position
inputs (from the slave) and producing a force output (to the slave). Thus, force com-
mands would be redundant for the slave controller. The difficulty comes when trying
to generate slave positions in order to produce a desired force when the environment
impedance is unknown. Fortunately, humans are able to do this naturally and 2-
channel architecture can exploit this fact because the human is part of the system.
Indeed, the human varies his/her apparent admittance (or impedance, whatever the
case may be) based on their interpretation of the environment's impedance (or admittance). Thus, rather than having a controller “guess” the environment impedance,
2-channel architecture lets the human do this while the controllers are only concerned
with controlling a single state. Our human brains are quite adept at producing hand
positions in order to apply a certain force. Alternatively, a certain force can be
applied with our hand to achieve a desired position. We switch between the two
methods [21] based on the task at hand (in fact, more accurately we probably em-
ploy a mixture of both approaches). We also do this switching unconsciously. The
ultimate implication is that 2-channel architecture works well because our brains
perform the necessary impedance transformation (or admittance transformation, as
it may be). Another way of looking at this point is to consider the argument that
humans are capable of optimally producing desired position (or force) trajectories
based on given force (or position) and visual information. For this reason, I believe
that 2-channel architecture is superior to 4-channel architecture, which unnecessarily “over-controls” the system.
3-channel architecture, as the name suggests, sends 2 desired measurements to the
slave-side (or master-side) controller and 1 desired measurement to the master-side
(or slave-side) controller.
1.3.2 Choice of control system
Rather than an examination of specific controllers, this section gives an overview of
various approaches to controller design in haptic systems. The literature is dominated by two methods: linear system based and passivity based.
Linear system based designs typically utilize the structure displayed in Fig. 1.5
and originally proposed by [15] and seen in many others [16, 24, 25]. The C-blocks are the controller blocks to be designed, and 2-, 3-, or 4-channel architecture designs are achieved by setting appropriate C-blocks to 0 (i.e., for position-force 2-channel architecture, set C3 = C4 = 0). Stability of the overall system is guaranteed through
[Block diagram omitted: Human Operator — Master — Controllers — Slave — Remote Environment]
Figure 1.5: Linear system based control architecture
linear stability theory (Routh-Hurwitz criteria [26], scattering conditions [27], or
other frequency domain methods). Additionally, the C-blocks are designed using a
priori knowledge of Zh and Ze (human and environment impedances). Under perfect
knowledge of these impedances, the system achieves perfect transparency. However,
in a general robot assisted surgery application, the environment impedances can
range from free space Ze = 0 to contact with an ideally rigid environment Ze = ∞.
In [16] it is shown that a rigid (or even near rigid) environment easily causes the
system to go unstable. This issue is circumvented in [24] by using scaling factors
in some of the C-block controllers or by imposing restrictions on C-blocks which
both trade off system performance for stability as well as reduce flexibility in control
design. Even with these scaling factors, the controller is only guaranteed stable for a
limited range of environment impedances. Additionally, practical use of the scaling
factor method would require online adaptation of the controller based on changes
in the environment. It is also unreasonable to assume that an environment behaves
linearly.
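The linear stability checks cited above (e.g. the Routh–Hurwitz criterion) can be made concrete with a small sketch. This is a generic implementation of the regular Routh array test; the example polynomials are arbitrary and not taken from any of the cited designs:

```python
def routh_hurwitz_stable(coeffs):
    """Return True if the polynomial with the given coefficients (highest
    degree first) has all roots in the open left half-plane, by building
    the Routh array and checking the first column for sign changes.
    Handles only the regular case (no zero in the first column)."""
    n = len(coeffs) - 1                  # polynomial degree
    width = n // 2 + 2                   # zero-padded column count
    rows = [list(coeffs[0::2]), list(coeffs[1::2])]
    for r in rows:
        r.extend([0.0] * (width - len(r)))
    for _ in range(2, n + 1):
        pprev, prev = rows[-2], rows[-1]
        if prev[0] == 0:
            return False                 # singular case: not strictly stable
        new = [(prev[0] * pprev[j + 1] - pprev[0] * prev[j + 1]) / prev[0]
               for j in range(width - 1)] + [0.0]
        rows.append(new)
    first_col = [r[0] for r in rows[: n + 1]]
    return all(c > 0 for c in first_col) or all(c < 0 for c in first_col)

# s^3 + 2s^2 + 3s + 4: stable (the cubic Hurwitz condition 2*3 > 1*4 holds).
print(routh_hurwitz_stable([1.0, 2.0, 3.0, 4.0]))  # True
# s^3 + s^2 + 2s + 8: unstable (1*2 < 1*8).
print(routh_hurwitz_stable([1.0, 1.0, 2.0, 8.0]))  # False
```

In the linear designs above, such a test must be repeated for every assumed environment impedance, which is precisely why stability over the full stiffness range is hard to guarantee.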
Passivity based controllers have gained popularity among haptic researchers for
many reasons. The primary goal of a passivity based controller is to ensure stability
for any environment impedance. A key result of passivity theory is that the inter-
connection of passive systems is also passive. The implication of this result is that
the system can be split into multiple components, each proven passive, and then
the entire system is guaranteed passive. Looking at Fig. 1.3, both the human
and environment blocks are considered passive (that is, neither the human nor the
environment behave to destabilize the system). Additionally, the master and slave
devices can be considered passive. The blocks that require design are the controllers
and the communications. In any system, time delay can cause instability; the amount of allowable time delay is the delay margin, which is roughly related to the phase margin.
Papers such as [28, 29] have addressed the issue of stabilizing the communications
channel in terms of passivity. In fact, the results are powerful and are able to ensure
stability under an arbitrary time delay. The real problem in passivity based controls
is designing the controller block. In fact, most passivity based systems do not have a
controller at all. This is because to prove that a controller is passive can be difficult.
Indeed, the passivity condition is as follows:
Theorem 1. An n-port system with input u ∈ ℝⁿ and output y ∈ ℝⁿ is passive, and thus stable, iff

⟨u_τ, y_τ⟩ ≥ 0, (1.5)

where the inner product is defined in the normal way:

⟨u_τ, y_τ⟩ := ∫₀^τ u(t)ᵀ y(t) dt. (1.6)
If u and y represent a power pair, an interpretation of passivity is that the system
must always be dissipating energy. Passivity is a generalization of bounded-input-
bounded-output (BIBO) stability. Yet the whole advantage of passivity based anal-
ysis is to analyze the controller separately from the rest of the system. The difficulty
in designing a controller based on passivity is that the entire set of possible inputs
is probably unknown. This is why stability by passivity is referred to as a type of
unconditional stability.
As such, there is no general constructive method for proving passivity. Hannaford et al. [30] have come the closest to developing a general control design by
“observing” the passivity of the system and appropriately injecting any shortage of
passivity. In effect, this controller adds damping to the system until it stabilizes.
In Khalil’s text [31], this control is called output-feedback control and is discussed
in more detail in Section 5.2.
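The passivity condition of Theorem 1 can be sanity-checked numerically. For a toy example (assumed here, not from the thesis), take a pure damper with force input u and velocity output y = u/b: the truncated inner product of Eq. (1.6) becomes (1/b)∫u² dt ≥ 0 for any input, so the damper is passive:

```python
import numpy as np

def inner_product(u, y, dt):
    """Truncated inner product <u_tau, y_tau> of Eq. (1.6), approximated
    by a Riemann sum over uniformly sampled signals."""
    return float(np.sum(u * y) * dt)

b = 5.0                                        # damping coefficient (assumed), N·s/m
dt = 1.0e-3
t = np.arange(0.0, 2.0, dt)
u = np.sin(3.0 * t) + 0.5 * np.sin(17.0 * t)   # arbitrary test force input
y = u / b                                      # damper: velocity proportional to force

# The inner product is non-negative, consistent with passivity (Eq. 1.5).
print(inner_product(u, y, dt) >= 0.0)          # True
```

Note that a proof of passivity requires Eq. (1.5) to hold for every admissible input, which is exactly the difficulty noted above; a numerical check only verifies one candidate input.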
1.4 Introduction to the Proposed Solution
A deviation from both of the above mentioned control system designs is proposed.
Using the theory of Lyapunov stability and neural networks, stability is guaranteed
for all possible environment impedances while maintaining good and consistent trans-
parency between the human and the remote environment. The proposed controller
has no prior knowledge of the remote environment and is fully adaptive, without
the need for switching controllers or gain scheduling. By sending force commands
to the slave and receiving force information from the environment, a 2-channel control architecture is designed. By doing so, it is shown that the resulting force in the case of stiff contact between the slave and environment is reduced, thus protecting the
14
hardware and patient. Three main contributions can be identified in this work:
• A unique auxiliary error definition reduces force errors while providing position
control when the slave moves in free space,
• Neural-adaptive backstepping ensures stable control when contact with a stiff
environment is made,
• A novel neural network update law ensures stable and robust control.
Chapter 2
Background Theory
There are seven key theorems and concepts that will prove necessary in subsequent
chapters of this thesis. First, Lyapunov theory is examined; then the control
Lyapunov function, the Lyapunov redesign method, and backstepping are presented as
methods for nonlinear stability analysis. Then, radial basis function neural networks
(RBFNs) are introduced and their ability to model uncertain functions is discussed.
Finally, this section discusses the integration of Lyapunov theory and RBFN theory
from a control systems perspective. The theory introduced in this section is not
complete; rather, the important results pertaining to the following work are presented.
2.1 Lyapunov Theory
The dominant method for ensuring the stability of nonlinear systems stems from
Lyapunov theory. Lyapunov theory examines primarily the stability of equilibrium
points of dynamical systems, so a definition of stability in terms of equilibrium points
is a logical place to start.
Generally speaking, an equilibrium point is stable if solutions of the system
starting near the equilibrium point stay near the equilibrium point. The set of
starting points for which solutions stay near the equilibrium point defines a
region in which the system is stable. If solutions diverge away from the equilibrium,
the point is called an unstable equilibrium point. From a stability analysis
perspective it is sufficient to prove that solutions stay near the equilibrium point, but
a stronger and more desirable condition is that, as time approaches infinity, solutions
of the system actually converge to the equilibrium point. This scenario, like
its linear counterpart, is called asymptotic stability. Formalized in a mathematical
definition [32]:

Definition 4. A point xe is said to be an equilibrium point of the system
ẋ = f(x), (2.1)
if f(xe) = 0, ∀t. In addition, xe is a stable equilibrium point if for any given t0
and positive ε ∈ ℝ⁺ there exists a δ = δ(t0, ε) ∈ ℝ⁺ such that if ||x(t0) − xe|| < δ, then
||x(t; t0, x0) − xe|| ≤ ε for all t ≥ t0.
Lyapunov theory bridges the gap between understanding this definition and analyzing
stability. Given an arbitrary dynamic system, Lyapunov theory allows us to
determine the stability of an equilibrium point by the following theorem.

Theorem 2. Let D ⊂ ℝⁿ contain the equilibrium point x = 0 of the system
ẋ = f(x). (2.2)
x = 0 is a stable equilibrium if there exists a continuously differentiable function of the
system states V : D → ℝ such that
V(0) = 0, (2.3)
V(x) > 0 ∀x ∈ D \ {0}, and (2.4)
V̇(x) ≤ 0 ∀x ∈ D. (2.5)
The reader is referred to [31] for a detailed proof of this theorem. Note that
there are few restrictions on the function V , and choosing an appropriate V is an
art rather than a science. On the other hand, this lack of restrictions allows great
flexibility in the analysis of the system and this freedom permits a control system
designer to achieve desired results. Additionally, it is important to realize that
an appropriate coordinate transformation can always make the point x = 0 an
equilibrium point without affecting system behavior. Finally, though the above theory
pertains to an autonomous system, it can be extended to non-autonomous systems
provided the control is a function of the system states and is thus absorbed into f(x).
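As a concrete sketch of Theorem 2 (the system and Lyapunov function below are textbook choices for illustration, not examples from the thesis), consider ẋ = −x³ with V = x²/2. The three conditions can be checked directly, and V̇ = x·f(x) = −x⁴ ≤ 0:

```python
import numpy as np

def f(x):
    # Autonomous system x_dot = f(x) with equilibrium at x = 0.
    return -x**3

def V(x):
    # Candidate Lyapunov function: V(0) = 0 and V(x) > 0 for x != 0.
    return 0.5 * x**2

def V_dot(x):
    # Along trajectories: V_dot = (dV/dx) * f(x) = x * (-x^3) = -x^4 <= 0.
    return x * f(x)

xs = np.linspace(-2.0, 2.0, 401)
assert V(0.0) == 0.0
assert np.all(V(xs[xs != 0.0]) > 0.0)
assert np.all(V_dot(xs) <= 0.0)

# Simulation confirms V decreases along a trajectory (forward Euler).
x, dt = 1.5, 1e-3
history = [V(x)]
for _ in range(5000):
    x += dt * f(x)
    history.append(V(x))
assert history[-1] < history[0]
```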
2.1.1 Control Lyapunov Function
Due to the descriptive rather than constructive nature of Theorem 2, discussion of
the control Lyapunov function method is necessary [33]. Lyapunov-based control
design follows a straightforward process and allows a control system designer great
flexibility in their design. This freedom can be exploited to achieve desired
performance, robustness, and control cost. First, a candidate Lyapunov control function
V satisfying the conditions
• V(0) = 0,
• V = V(x),
• V(x) > 0 ∀x ∈ D \ {0},
is chosen. Then, the time derivative of V is determined. Finally, a control is designed
that forces V̇ to be negative semi-definite. Such a process ensures stability of the
equilibrium point. The major leap made here is that a priori knowledge of the
derivative of V is not necessary in order to design a stabilizing control.
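The three-step process above can be sketched on a scalar example (the drift term x² and the gain are arbitrary illustrative choices, not from the thesis): with V = x²/2, V̇ = x(f(x) + u), and choosing u = −f(x) − Gx forces V̇ = −Gx² ≤ 0:

```python
import numpy as np

G = 2.0

def f(x):
    return x**2          # open-loop drift (unstable for x > 0)

def u(x):
    # Control designed so that V_dot = x*(f(x) + u(x)) = -G*x^2 <= 0.
    return -f(x) - G * x

def V_dot(x):
    return x * (f(x) + u(x))

xs = np.linspace(-3.0, 3.0, 601)
assert np.all(V_dot(xs) <= 1e-12)   # negative semi-definite on this grid

# Closed-loop simulation from an initial condition the open loop diverges from.
x, dt = 2.5, 1e-3
for _ in range(10000):
    x += dt * (f(x) + u(x))
assert abs(x) < 1e-2                # state driven to the equilibrium
```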
2.1.2 Lyapunov Redesign (robust design)
The method of Lyapunov redesign [34] follows from the Lyapunov control function
method and provides a means of robust control design. Consider the system
ẋ = f(x) + g u + h(x)δ(x, t), (2.6)
where f is a known function, g ≠ 0 is a parameter, u is the control, and h(x) is some
function that acts on the uncertain but bounded function δ(x, t),
|δ(x, t)| ≤ ρ(x) ≤ δmax. (2.7)
To begin the robust design process, a control unom(x) is found that stabilizes the
nominal system
ẋ = f(x) + g unom. (2.8)
The final control designed will be u = unom + urob where urob is the robustifying
component of the controller. If unom stabilizes Eq. (2.8), then ∃ some Lyapunov
function V (x) for the nominal system. Using the same Lyapunov control function as
for the nominal system, the derivative is found to be
V̇(x) = (∂V/∂x)[ f(x) + g(unom(x) + urob(x)) + h(x)δ(x, t) ]. (2.9)
urob can be designed at this stage and there are many possible designs, two of which
will be discussed. Choose the robust control
urob = −( |h(x)|ρ(x)/g ) (∂V/∂x)ᵀ/|∂V/∂x|, (2.10)
then
V̇(x) ≤ (∂V/∂x)[ f(x) + g unom(x) ] + |∂V/∂x| [ −|h(x)|ρ(x) + |h(x)||δ(x, t)| ], (2.11)
which is known to be negative because of the nominal design and the fact that
|δ(x, t)| ≤ ρ(x); the system is therefore asymptotically stable. Granted, this design of
urob causes the control law to be discontinuous, thus violating a smoothness
requirement of Lyapunov theory. However, there are methods for smoothing out this
control, which in return define a region of convergence rather than asymptotic
stability. One such example is to use tanh(∂V/∂x) rather than (∂V/∂x)ᵀ/|∂V/∂x|.
Another such design for urob, which is employed in this thesis, is
urob = −(µ/g) |h(x)|^r1 |∂V/∂x|^r2, (2.12)
where r1 > 1 ∈ ℝ and r2, µ ∈ ℝ⁺. If this robust control is being designed for a virtual
control of a backstepping design, this robust control law and its n time-derivatives
must have an analytic solution at the equilibrium x = 0, where n is the number of
stages remaining in the backstepping procedure. This is because a robust control
is designed for each virtual control and will thus be differentiated at the following
step of backstepping. This constraint directly affects the parameters r1 and r2. For
example, if there are two stages of backstepping in a particular design, r1 > 2 and
r2 > 1 must be satisfied for the robust control designed in the first stage. Usually,
however, r1 = r2 = 1.1 is chosen.
To see the design reasoning for this control it is helpful to examine what Eq.
(2.12) does to the Lyapunov function,
V̇(x) = (∂V/∂x)[ f(x) + g unom(x) ] + (∂V/∂x)[ −µ|h(x)|^r1 |∂V/∂x|^r2 + h(x)δ(x, t) ]. (2.13)
Consider the bounds on V̇(x) due to the robust control (the nominal component is
already known to be asymptotically stable),
V̇rob(x) ≤ |∂V/∂x| [ −µ|h(x)|^r1 |∂V/∂x|^r2 + |h(x)|δmax ], (2.14)
and it can be seen that V̇rob(x) < 0 when
|∂V/∂x| |h(x)|^((r1−1)/r2) > (δmax/µ)^(1/r2). (2.15)
The restrictions mentioned above on r1 and r2 are now clear. In order to explicitly
find the bound of the error signal, assume that
V = (1/2)x², (2.16)
and that the nominal control design results in the asymptotically stable system
ẋ = −Gx. The Lyapunov derivative for the disturbance and robust terms is found to be
V̇rob(x) = x(−µ|h(x)|^r1 |x|^r2 + h(x)δ(x, t)), (2.17)
which is bounded by
V̇rob(x) ≤ |x|(−µ|h(x)|^r1 |x|^r2 + |h(x)|δmax). (2.18)
With certainty it can be said that |h(x)|^r1 ≥ |h(x)| for r1 > 1. Therefore, taking the
worst case |h(x)|^r1 = |h(x)|,
V̇rob(x) ≤ |x||h(x)|(−µ|x|^r2 + δmax), (2.19)
and it can be said that
V̇rob(x) < 0 when |x| > (δmax/µ)^(1/r2). (2.20)
Because of the original control Lyapunov function, this is also the ultimate bound
on the signal. A point of interest is that this ultimate bound occurs when |h(x)|r1 =
|h(x)|, which implies h(x) = 0. Thus, the ultimate bound exists only when distur-
bances are absent and we are guaranteed a smaller bound otherwise.
Figure 2.1: General shape of the robust control
At first it seems desirable to design µ large but in reality it is desirable to achieve
µ ≤ δmax. This way, the robust control does not dominate over the nominal control.
The robust control laws are designed in this way for a few reasons. First, knowledge
of ρ(x) is not needed. Second, the control is smooth and its derivative is also smooth
and analytic near the origin. Fig. 2.1 shows the general shape of the robust control
and its spatial derivative is shown in Fig. 2.2. Thus, the requirements for Lyapunov
stability are met.
Figure 2.2: General spatial derivative of the robust control

Third, a robust control is desired to bound the disturbance Lyapunov function,
but in general it is not desired for this robust control to dominate the actual
control. The nonlinear damping technique [35] would use powers of 2 or 3 (versus
1.1 in the proposed robust control), which shrinks the bounds and ensures fast
convergence. However, these high powers tend to dominate the nominal control and
affect system performance. Analyzing the bounds on V̇rob in a conservative sense and
using maximum disturbances to derive ultimate bounds gives deceptive predictions
of the actual performance; it is a more useful tool for simply guaranteeing
stability. Experiments show that using powers of 1.1 ensures sufficient robustness
while allowing the nominal control to behave as desired.
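The ultimate bound (2.20) can be sketched numerically (all values below are arbitrary illustrative choices, not from the thesis). One scalar-case interpretation of Eq. (2.12) with V = x²/2 is urob = −µ|h|^r1 |x|^r2 sign(x), where the sign factor makes x·urob ≤ 0, matching the bound used in Eq. (2.18):

```python
import numpy as np

G, mu, r1, r2 = 2.0, 1.0, 1.1, 1.1
delta_max = 0.5
h = 1.0

def delta(t):
    # Unknown but bounded disturbance, |delta| <= delta_max.
    return delta_max * np.sin(5.0 * t)

def u(x):
    u_nom = -G * x                                      # nominal stabilizer
    u_rob = -mu * abs(h)**r1 * abs(x)**r2 * np.sign(x)  # smooth robust term
    return u_nom + u_rob

# Simulate x_dot = u(x) + h*delta(t) and check the bound |x| <= (delta_max/mu)^(1/r2).
x, dt, t = 1.5, 1e-3, 0.0
for _ in range(20000):          # 20 s of forward Euler
    x += dt * (u(x) + h * delta(t))
    t += dt

bound = (delta_max / mu)**(1.0 / r2)
assert abs(x) < bound           # state ends inside the ultimate bound
```

With the low power 1.1 the robust term stays comparable in size to the nominal control near the origin, which is exactly the design trade-off discussed above.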
2.1.3 Backstepping
For systems in strict feedback form [31]
ẋ1 = f1(x1) + g1x2, (2.21)
ẋ2 = f2(x2) + g2x3, (2.22)
⋮
ẋn−1 = fn−1(xn−1) + gn−1xn, (2.23)
ẋn = fn(xn) + gnu, (2.24)
with x1, …, xn and g1, …, gn ∈ ℝ, the backstepping technique allows us to control states
x1 through xn−1 “directly” by stepping the integrator for state xn back through the system.
It is possible to do this by assuming that there exists some control α(x), called a
virtual control, that can stabilize the system described by states x1 through xn−1. Then
the actual control u is designed in hopes of attaining α. The backstepping technique
does so by introducing an additional state which is the difference between the actual
control and the desired virtual control,
z = u − α. (2.25)
Again, the best way to visualize what is happening is by example. Consider the
two-state system
ẋ1 = f1(x1) + g1x2, (2.26)
ẋ2 = f2(x2) + g2u. (2.27)
In the first stage of backstepping, some control α is designed that can stabilize the
one-state virtual system ẋ1 = f1(x1) + g1α by assuming a Lyapunov control function
V1 = (1/2)x1²,
V̇1 = x1(f1(x1) + g1α) + g1x1z. (2.28)
It is seen that α = −g1⁻¹(f1(x1) + G1x1), where G1 > 0 ∈ ℝ, is desired. However,
in reality the state x2 controls this subsystem, so u must be designed such that x2
approaches the desired virtual control α. Introducing the error z = x2 − α and
designing u to drive z → 0 will achieve this goal. For the second stage, use a
Lyapunov control function V2 = V1 + (1/2)z²,
V̇2 = −G1x1² + g1x1z + z(ẋ2 − α̇), (2.29)
   = −G1x1² + g1x1z + z(f2(x2) + g2u − α̇). (2.30)
Choosing
u = g2⁻¹(α̇ − f2(x2) − g1x1 − G2z), (2.31)
where G2 > 0 ∈ ℝ, stabilizes the system asymptotically. G2 in this case determines
the control effort spent to reduce the virtual control error z.
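The two-stage design of Eqs. (2.26)–(2.31) can be simulated directly (the particular f1, f2, gains, and initial conditions below are arbitrary illustrative choices, not from the thesis):

```python
import numpy as np

G1, G2 = 2.0, 5.0
g1, g2 = 1.0, 1.0
f1 = lambda x1: x1**2          # stage-1 drift
f2 = lambda x2: np.sin(x2)     # stage-2 drift

def control(x1, x2):
    alpha = -(f1(x1) + G1 * x1) / g1             # stage-1 virtual control
    z = x2 - alpha                               # virtual control error
    x1_dot = f1(x1) + g1 * x2
    alpha_dot = -(2.0 * x1 + G1) * x1_dot / g1   # analytic derivative of alpha
    u = (alpha_dot - f2(x2) - g1 * x1 - G2 * z) / g2   # Eq. (2.31)
    return u, z

x1, x2, dt = 0.8, 0.0, 1e-4
for _ in range(100000):                          # 10 s of forward Euler
    u, z = control(x1, x2)
    x1 += dt * (f1(x1) + g1 * x2)
    x2 += dt * (f2(x2) + g2 * u)

_, z = control(x1, x2)
assert abs(x1) < 1e-3    # x1 driven to the equilibrium
assert abs(z) < 1e-3     # x2 has converged to the virtual control alpha
```

Note how α̇ is computed analytically from ẋ1, as the discussion of adaptive backstepping below requires.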
2.1.4 Adaptive Backstepping and Tuning Functions
Note that the final control designed using the backstepping approach requires
knowledge of α̇. It turns out that this term must be evaluated analytically in order to
ensure good system performance. However, it may not always be possible to do this.
In the above example,
α̇ = −g1⁻¹( (∂f1(x1)/∂x1) ẋ1 + G1ẋ1 ). (2.32)
It is possible that some of the derivatives in Eq. (2.32) are unknown, in which case it
is common to model these unknown terms using a universal approximator (denoted
UA and described below in detail). If the unknown terms are lumped into µ(x1, ẋ1)
and the known terms into γ(x1, ẋ1),
α̇ = γ(x1, ẋ1) + φµ(x1, ẋ1)wµ, (2.33)
where φµ(x1, ẋ1)wµ models the unknown term
µ(x1, ẋ1) = φµ(x1, ẋ1)wµ. (2.34)
This approach is often referred to as a kind of adaptive backstepping and the learning
rule is based on the virtual control error z. It is important to note that the UA should
only be used to model terms that are unknown and any known derivatives should
be calculated analytically. The derivative of α must be calculated as analytically as
possible to ensure good performance and robustness.
Another possibility is that some of the terms in the first stage of backstepping were
unknown and also modelled using a UA. If this is the case, these unknown terms will
appear in the second stage of backstepping (for instance, ẋ1 appears in Eq. (2.32)).
One could use another UA to model all unknown functions in Eq. (2.32), yielding what
is called an over-parameterized system. Indeed, this must be done if the learning rule
is designed in the first stage of backstepping. However, a much more robust method
is to postpone the learning rule design to the second stage of backstepping. This
allows the original UA from the first stage to model the same terms in the second
stage. Doing so introduces additional derivative terms into the learning rule, which
are called “tuning functions” [36]. These tuning functions improve robustness by
providing the learning rule with information about additional system dynamics.
Another way of looking at the increased robustness is to see that the tuning function
method allows a more analytical model of α̇. There is also the freedom to scale the
tuning functions to achieve a desired performance; a proof is shown in Appendix B.
Note that there may still be additional unknown terms in Eq. (2.32) that will be
modeled using a UA.
2.2 Radial Basis Function Networks
A radial basis function network (RBFN) [37] is a type of neural network and is a
powerful and useful tool for modelling unknown and nonlinear functions. RBFNs are
a special case of a universal approximator; thus they can not only model a function
but are guaranteed to model the function well enough that the modeling error is
bounded. This is seen from the following definition and theorem.

Definition 5. A family of functions g : D → ℝ is of class G if
• the constant function g(x) = 1, x ∈ D, belongs to G,
• the sum ag1 + bg2 is of class G for a, b ∈ ℝ and g1, g2 ∈ G,
• the product g1g2 is of class G for g1, g2 ∈ G,
• g(x1) ≠ g(x2) for x1 ≠ x2 with x1, x2 ∈ D,

and

Theorem 3. Given a continuous function f : D → ℝ, there exists for each ε > 0 a
function f̂ ∈ G, f̂ : D → ℝ, such that
||f(x) − f̂(x)||∞ < ε, x ∈ D, (2.35)
[38].

The above definition and theorem encompass the Stone–Weierstrass approximation
theorem and provide the foundation for proving system stability using universal
approximators [39].
For a single-output RBFN there is an m-element row vector of n-dimensional
radial basis functions (kernels) φ(q) as well as an m-element column vector of weights
w. q is the input to the network and the output is given by
o = φ(q)w = Σ(i=1…m) [ Π(j=1…n) φi(qj) ] wi. (2.36)
The following corollary pertaining to RBFNs, resulting from Theorem 3, can be stated.

Corollary 1. Let f(q) : ℝⁿ → ℝ be a Lipschitz function and let φ(q) ⊂ G and
φ(q) ⊂ D be integral-bounded kernel functions. The function f can be expressed as
f(q) = Σ(i=1…m) [ Π(j=1…n) φi((qj − cj)/σ) ] wi + d(q), (2.37)
on D, where ci > 0, σ > 0, and ||d(q)|| ≤ δ is bounded.
Lipschitz functions are defined as follows:

Definition 6. A function f : X → Y is Lipschitz continuous if there exists a K ∈ ℝ⁺
such that
|f(x1) − f(x2)| ≤ K|x1 − x2|, ∀x1, x2 ∈ X. (2.38)
Here K is the Lipschitz constant.
That is, a RBFN is able to approximate any smooth real-valued function f(q) on
a certain domain D, and it has associated with it a set of ideal weights w which result
in a modeling error expressed by the function d(q). Of course, in reality access to
these ideal weights is impossible, but a set of actual weights ŵ is available and a
weight error can be defined by
w̃ = w − ŵ. (2.39)
A RBFN can be structured in many ways, though for the sake of this thesis each of
the m basis functions is centered at a unique point ci in the domain D and all have a
width σ. By width it is meant, in a statistical sense, the radius from the center ci
at which one standard deviation of the area under the basis function lies. All neural
networks in this thesis are implemented as RBFNs using Gaussian basis functions
φi(qj) = exp( −(qj − ci)² / (2σ²) ), i = 1, …, m. (2.40)
Examples of Gaussian basis functions in RBFN adaptive control design are numerous
and well documented in the literature [40–43]. It can be seen that φi(qj) is of class
G since
φi(qj) = 1 with σ = ∞. (2.41)
The other conditions are trivial to prove.
Finally, from a performance perspective, the minimum number of basis functions
required to approximate f(q) is 2n, where n is the dimension of the input vector q.
Requiring 2n basis functions allows only a very coarse approximation, with one
basis function for each dimension in each direction. Often this requirement is
referred to as the “curse of dimensionality” because RBFNs with many inputs may
require too much computer memory to implement.
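As a sketch of Eqs. (2.37) and (2.40) in one dimension (the target function, number of kernels, and width below are arbitrary choices for illustration, not from the thesis), a Gaussian RBFN fit by least squares yields a small, bounded modeling error d(q) on the domain:

```python
import numpy as np

m = 15                                   # number of basis functions
centers = np.linspace(-3.0, 3.0, m)      # unique centers c_i in the domain D
sigma = 0.6                              # common width

def phi(q):
    """m Gaussian kernels (Eq. 2.40) evaluated at each scalar input in q."""
    q = np.atleast_1d(q)
    return np.exp(-(q[:, None] - centers[None, :])**2 / (2.0 * sigma**2))

f = lambda q: np.sin(2.0 * q) + 0.3 * q  # the "unknown" target function

# Offline least-squares fit stands in for the ideal weights w of Corollary 1.
q_train = np.linspace(-3.0, 3.0, 200)
w = np.linalg.lstsq(phi(q_train), f(q_train), rcond=None)[0]

q_test = np.linspace(-2.5, 2.5, 101)
d = np.abs(phi(q_test) @ w - f(q_test))  # modeling error d(q)
assert d.max() < 0.05                    # error is bounded on D
```

In the adaptive controllers below the weights are not fit offline like this, but updated online by a learning rule.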
2.2.1 RBFN in controller design
The use of neural networks in controller design was first formalized and justified
in [44]. The easiest way to see how RBFNs are used in controller design is by
example. Consider a system with single state x, an equilibrium point at x = 0, and
dynamics
ẋ = f(x) + u, (2.42)
where f(x) is a real-valued but unknown function which can thus be modeled by a
RBFN
f(x) = φ(x)w + d(x), |d(x)| < dmax ∀x ∈ D. (2.43)

Theorem 4. A stabilizing control for Eq. (2.42) is
u = −φ(x)ŵ − Gx, (2.44)
where G > 0 ∈ ℝ and the weights ŵ are updated online according to ˙ŵ = φᵀx − νŵ.
Proof. Using the positive definite Lyapunov control function
V(x, w̃) = (1/2)(x² + w̃ᵀw̃), (2.45)
V̇ will be shown to be bounded:
V̇ = xẋ + w̃ᵀ (d/dt)(w − ŵ). (2.46)
Using Eqs. (2.42), (2.43), (2.39), (2.44), and rearranging algebraically,
V̇ = −Gx² + x d(x) + νw̃ᵀw − νw̃ᵀw̃, (2.47)
  ≤ −G||x||² + dmax||x|| + ν||w̃|| ||w|| − ν||w̃||², (2.48)
  ≤ −G[ (||x|| − dmax/(2G))² − dmax²/(4G²) ] − ν[ (||w̃|| − ||w||/2)² − ||w||²/4 ]. (2.49)
Recognizing this as the general equation of an ellipse, it can be said that V̇ is negative
when
||x|| > dmax/(2G) + √( dmax²/(4G²) + ν||w||²/(4G) ) = δx, (2.50)
or
||w̃|| > ||w||/2 + √( dmax²/(4νG) + ||w||²/4 ) = δw. (2.51)
An ultimate bound xb on the system state x is found explicitly using
Vb = (1/2)(||x||² + ||w̃||²):
Vb(||x|| = xb, ||w̃|| = 0) = Vb(||x|| = δx, ||w̃|| = δw), (2.52)
(1/2)xb² = (1/2)(δx² + δw²), (2.53)
xb = √(δx² + δw²). (2.54)
There are some points of interest arising from the above proof that are worth
discussing. First of all, the states are bounded by an elliptic region centered at
the point (dmax/(2G), ||w||/2) in the (||x||, ||w̃||) plane, whose maximum size depends on
dmax, ν, and ||w||. It is desired to have this region as small as possible so that the states
are forced close to equilibrium. However, control system designers tend to be
restricted as to how small they can make this region, since dmax is a property of the
RBFN that depends on the number, widths, and centers of the basis functions. Thus,
decreasing dmax tends to be an art rather than a science. Factors that affect dmax
include the dispersion of basis functions within the domain D, the widths of the basis
functions, the number of basis functions, and the types of basis functions.
||w|| is primarily a property of the unknown function f(x).
ν is a coefficient used to bound the RBFN weights, and the term νŵ is called
a leakage term [45]. The leakage term is used to bound ŵ since, without it, the
time derivative of the Lyapunov control function would not be a function of all the
system states. There are other methods of doing this besides leakage, such as
projection [32], deadzone [46], and e-modification [47]. Again, choosing a value for
ν is largely heuristic rather than prescriptive.
This thesis will also use projection, which is examined now. Projection is a
robust weight update law defined by
˙ŵ = 0, if ŵ ≥ ||w||max and φᵀx > 0,
˙ŵ = 0, if ŵ ≤ ||w||min and φᵀx < 0,
˙ŵ = φᵀx, otherwise. (2.55)
Such an update law ensures that V̇ is bounded, by noting that a bound on w̃ exists
by construction,
||w̃|| ≤ max(||w||max − w, w − ||w||min), (2.56)
and V̇ is negative when
||x|| > dmax/G, if ˙ŵ ≠ 0,
||x|| > (dmax + φ||w̃||)/G, otherwise. (2.57)
Projection is a weight update law used when certain limits of the neural network
output could cause the system to become unstable. It is commonly used when the
neural network output is inverted, in which case the output must remain a certain
distance from 0.
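A scalar sketch of the projection law (2.55) follows (the bounds are arbitrary illustrative values, not from the thesis); the estimate is frozen whenever the nominal update would push it past a prescribed bound:

```python
# Keep the weight estimate inside [w_min, w_max], e.g. away from 0 so that
# the network output can safely be inverted.
w_min, w_max = 0.2, 2.0

def projected_update(w_hat, phi_x):
    """Return the update w_hat_dot under projection; phi_x plays the role of phi^T x."""
    if w_hat >= w_max and phi_x > 0.0:
        return 0.0           # outward update at the upper bound: cancelled
    if w_hat <= w_min and phi_x < 0.0:
        return 0.0           # outward update at the lower bound: cancelled
    return phi_x             # nominal update otherwise

# Inside the bounds the nominal update passes through...
assert projected_update(1.0, 0.5) == 0.5
# ...at the upper bound, outward updates are cancelled but inward ones are not...
assert projected_update(2.0, 0.5) == 0.0
assert projected_update(2.0, -0.5) == -0.5
# ...and symmetrically at the lower bound.
assert projected_update(0.2, -0.5) == 0.0
assert projected_update(0.2, 0.5) == 0.5
```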
Another point to note is that the RBFN weights are updated based on the system
state x, not on w̃. As a result, it is incorrect to say that the RBFN “learns” f(x).
Rather, it is common to state that the RBFN “models” f(x); in fact, the RBFN
only aims at driving the system states to equilibrium. As such, the effectiveness of
a RBFN in terms of controller design stems from its ability to adapt quickly rather
than to produce a highly accurate model. Although this has its disadvantages, a key
advantage of updating weights based on system states is that RBFNs tend to be
quite robust to unmodeled uncertainties and disturbances.
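The whole of Theorem 4 can be sketched in simulation (the unknown dynamics, gains, and kernel layout below are arbitrary illustrative choices, not from the thesis). Note how the weights are driven by the state x, as discussed above, and how the state converges only to a small ultimate bound rather than exactly to the equilibrium:

```python
import numpy as np

# x_dot = f(x) + u with f unknown to the controller; control u = -phi(x)@w_hat - G*x
# and leakage update w_hat_dot = phi^T * x - nu * w_hat (Theorem 4).
G, nu, dt = 5.0, 0.1, 1e-3
centers = np.linspace(-2.0, 2.0, 9)
sigma = 0.5

def phi(x):
    return np.exp(-(x - centers)**2 / (2.0 * sigma**2))   # Gaussian kernels, Eq. (2.40)

f = lambda x: 1.0 + np.sin(3.0 * x)    # "unknown" dynamics, nonzero at x = 0

x = 1.0
w_hat = np.zeros_like(centers)
for _ in range(20000):                 # 20 s of forward Euler
    u = -phi(x) @ w_hat - G * x
    w_hat += dt * (phi(x) * x - nu * w_hat)   # driven by the state, not by w tilde
    x += dt * (f(x) + u)

# Ultimately bounded near (not at) the equilibrium, as Eqs. (2.50)-(2.54) predict.
assert abs(x) < 0.2
```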
Chapter 3
Proposed Controller Design
3.1 Introduction
Beginning with an overview of the system dynamics and block diagrams, error defini-
tions, and design choices, this section ultimately proposes a controller for a surgical
teleoperation haptic system. It is written with constant consideration of two
extreme scenarios: when contact with the environment is suddenly lost, and when
the slave robot comes into contact with a wall.
3.2 System description
First of all, this thesis examines only a one-degree of freedom (DOF) translational
slave robot with the assumption that all of the theory mentioned hereafter can be
extended to the multi-DOF case through appropriate kinematic transformations. In
light of this, the controller design begins with the mass-spring-damper model shown
in Fig. 3.1. Dynamically, the model of Fig. 3.1 is represented by
Mẍ = −Drẋ − Dmẋ − K(x)x + Fc, (3.1)
where x(t) ∈ ℝ. M represents the combined inertia of the robot and the surgical
tool. Dr contains damping coefficients arising from the robot joints. The terms K(x)
and Dm are properties of the tissue/material that the surgical robot is in contact
with. The approach of modeling tissue with a spring is well established in the
literature [48], [49]; however, to be precise, human tissue tends to exhibit a visco-elastic
Figure 3.1: Mechanical model of a 1 DOF surgical slave robot in contact with an environment
property, meaning K = K(x, t). For design purposes, a shortcut is made by assuming
that some of this visco-elastic behavior can be captured in a spring-damper model.
As will be seen, the proposed controller is highly robust to time-varying properties of
the tissue due to fast adaptation in the neural-network control law. Tissue also has
mass, but this is not modeled explicitly in Fig. 3.1 because any inertial properties
of the tissue can be included in M (the mass of the slave robot). K(x) also contains
the end-effector force sensor dynamics.
The slave robot position, velocity, and acceleration are x, ẋ, and ẍ respectively.
Finally, Fc is the controller output and is what is designed in this chapter.
From the above model it is implied that the force sensor on the end-effector
of the slave robot outputs
Fm = Dmẋ + K(x)x. (3.2)
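The model of Eqs. (3.1)–(3.2) can be checked in a short simulation (all parameter values below, including a constant K instead of K(x), are arbitrary choices for illustration, not from the thesis). A constant actuator force pressed into the tissue should settle at the static deflection Fc/K, at which point the measured force equals the applied force:

```python
import numpy as np

# 1-DOF slave model: M*x_ddot = -(Dr + Dm)*x_dot - K*x + Fc,  Fm = Dm*x_dot + K*x.
M, Dr, Dm, K = 1.0, 2.0, 4.0, 100.0
Fc = 5.0                       # constant actuator force
x, x_dot, dt = 0.0, 0.0, 1e-4

for _ in range(100000):        # 10 s of forward Euler
    x_ddot = (-Dr * x_dot - Dm * x_dot - K * x + Fc) / M
    x_dot += dt * x_ddot
    x += dt * x_dot

Fm = Dm * x_dot + K * x        # measured contact force, Eq. (3.2)
assert abs(x - Fc / K) < 1e-3  # settles at the static deflection Fc/K
assert abs(Fm - Fc) < 1e-2     # measured force matches applied force at rest
```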
3.3 Proposed Control Law
This section will introduce the proposed control law and justify design decisions. An
in-depth stability analysis follows in Chapter 4.
The proposed design focuses on attempting to improve three limitations of current
haptic controllers:
• When the slave robot comes into contact with stiff environments (such as a
wall, bone, or table), controllers tend to apply excess force to the environment,
potentially causing damage to the robot or the environment as well as causing
instability.
• If the slave robot suddenly loses contact with the environment or punctures
through a layer of tissue, there tends to be tool overshoot.
• Control commands are required to be filtered, which introduces a possible source
of instability.
These items are discussed in more detail below.
3.3.1 Design choice 1: Error definition
A primary objective of the proposed controller is to have the slave robot track hu-
man commanded force rather than position. Under a time delayed communications
channel, this will decrease the impact force when the slave hits a wall. For instance,
say the human operator commands a force Fd(t) and a time delay of T seconds exists
in the communications channel. A force tracking controller will aim for
Fm(t) = Fd(t− T ). (3.3)
That is, if the slave robot hits a wall and measures a large contact force the con-
troller will pull the slave back, at least initially. This is contrasted to a system under
position control in which the human may have commanded a slave position beyond
the physical constraint of the wall. A controller designed for position tracking will
push “through” the wall and may apply a dangerously large force to the slave and/or
environment before the human can react to the collision. Such high contact forces are
particularly dangerous in surgical applications where the “environment” is a human
patient and the robots are very expensive.
Although performance improves in the case of stiff contact, force control will com-
promise performance when a sudden loss of contact with the environment occurs. In
this case, the controller will cause the slave to overshoot as it tries to achieve the last
commanded force. Thus, another objective of the proposed controller is to reduce
tool overshoot in the case that contact is suddenly lost with the environment. Such
a scenario could represent a puncture through tissue or the sliding of the end-effector
off of a surface.
For these reasons, an auxiliary error is defined for the controller,
s = Λε + ẋ, (3.4)
where ε is the force error defined by ε = Fm − Fd and Λ ∈ ℝ is a positive tuning
parameter. A larger value of Λ emphasizes force tracking, whereas a small Λ
slows down the slave and behaves like a damper. Admittedly, a slowly responding
robot is undesirable; however, the proposed controller allows the surgeon to tune
the value of Λ to their liking. This damping also acts to maintain desirable control
in two ways: one, the robot is slowed down after a puncture or loss of contact, and
two, if the slave bounces off a hard surface the controller will dampen any vibrations
that arise (which often occur in teleoperation). Achieving s ≡ 0 is the ultimate goal
of the proposed controller. Doing so implies
ẋ = −Λε. (3.5)
Let us examine the implications of achieving s ≡ 0 in the case of stiff contact.
Substituting in the expression for Fm results in
ẋ = −ΛFm + ΛFd, (3.6)
  = −ΛK(x)x − ΛDmẋ + ΛFd, (3.7)
ẋ = −( ΛK(x)/(1 + ΛDm) ) x + ( Λ/(1 + ΛDm) ) Fd, (3.8)
which constitutes a stable (nonlinear) system with input Fd and state x, under the
reasonable assumption that K(x) is positive semi-definite. Thus, if a controller attains
s ≡ 0, the system behaves very much like a sliding mode controller in which the state
trajectory asymptotically approaches the origin along the line ẋ = −Λε.
In the case of unconstrained, free motion, where there is no contact with material
and Fm = 0,
ẋ = ΛFd, (3.9)
and the system comes under velocity control in which the slave velocity is directly
proportional to the human commanded force. Again, achieving s ≡ 0 will cause the
state trajectory to converge asymptotically to the origin along the line ẋ = ΛFd.
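The two regimes implied by s ≡ 0 can be sketched numerically (the values of Λ, K, Dm, and Fd below are arbitrary illustrative choices, not from the thesis): in contact, integrating Eq. (3.8) drives the measured force to the commanded force; in free space, Eq. (3.9) gives a constant velocity proportional to the commanded force:

```python
# Contact regime: x_dot = (-Lam*K*x + Lam*Fd) / (1 + Lam*Dm), Eq. (3.8).
Lam, K, Dm, Fd = 0.05, 200.0, 2.0, 3.0
x, dt = 0.0, 1e-3
for _ in range(20000):                 # 20 s of forward Euler
    x += dt * (-Lam * K * x + Lam * Fd) / (1.0 + Lam * Dm)

x_dot = (-Lam * K * x + Lam * Fd) / (1.0 + Lam * Dm)
Fm = Dm * x_dot + K * x                # measured force, Eq. (3.2)
assert abs(Fm - Fd) < 1e-3             # force tracking: Fm -> Fd

# Free-space regime: x_dot = Lam*Fd, Eq. (3.9) -- constant commanded velocity.
x_free = 0.0
for _ in range(1000):                  # 1 s of forward Euler
    x_free += dt * (Lam * Fd)
assert abs(x_free - Lam * Fd * 1.0) < 1e-9
```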
3.3.2 Design choice 2: Backstepping
As mentioned previously, control commands must be filtered before being sent to
the slave to protect the actuators from damage. In linear (and even some simple
non linear) control applications the effect of this filter can be accounted for and
37
stability can be ensured. For example, an ideal relay controller has well known
properties and its describing function is used to ensure stability when the control
signal is filtered. However, if a more complex and nonlinear controller is desired,
the describing function method is not only laborious (or impossible) but can also
become an inaccurate model, unable to guarantee stability. As a result, filters are
added heuristically and the stability analysis is performed assuming their absence.
The reason why a control signal must be filtered before it is sent to the actuators
is that high frequency components in the signal can cause chattering, limit cycles,
undue stress on the actuator hardware, and, in general, unexpected motor dynamics
in the high frequency range.
A novel solution is proposed by using the backstepping technique as a tool to filter
the control signal. This approach has the advantage that the desired filtering
properties are built into the controller and the entire system is analyzed for
stability holistically. Using the backstepping technique in this way is certainly novel,
as it is not being used for its originally intended purpose; rather, a resulting
characteristic of systems designed using backstepping is being exploited. An extension
of the analysis of backstepping in Section 2.1.3 reveals the filtering properties of the
backstepping technique. Consider the Lyapunov candidate
V2(x1, z) = V1(x1) + (1/2)z², (3.10)
where z is the virtual control error u(t) − α(t) and V1 is the Lyapunov function
designed in the previous step. The derivative becomes
V̇2 = V̇1 + zż, (3.11)
and in general, after virtual control design, one ends up with
V̇2 = −G1x1² − G2z², (3.12)
implying that z has dynamics
ż = −G2z, (3.13)
which is an asymptotically stable system: as t → ∞ the virtual control error z → 0,
and there is no filtering effect beyond the transient response. If, however, as is likely,
an exact expression for α̇ is unknown but a “best estimate” derivative of the virtual
control, denoted ˙α̂, is attainable, then, using the same control as in Eq. (2.31) but
with the “best estimate” derivative,
V̇2 = −G1x1² − G2z² + ( ˙α̂(t) − α̇(t) )z. (3.14)
Thus, z has dynamics
ż = −G2z + ˙α̂(t) − α̇(t). (3.15)
The frequency domain solution for z, assuming zero initial conditions, is
L{ż} = L{ −G2z + ˙α̂(t) − α̇(t) }, (3.16)
sZ(s) = −G2Z(s) + sÂ(s) − sA(s), (3.17)
(s + G2)(U(s) − A(s)) = sÂ(s) − sA(s), (3.18)
U(s) = sÂ(s)/(s + G2) + G2A(s)/(s + G2). (3.19)
Here L{α} = A(s) and L{α̂} = Â(s) are used. It can be seen that the actual control
u(t) is a sum of a filtered ˙α̂(t) and a filtered α(t). Thus, any high frequency components
of either of these signals do not appear in the actual control. Additionally, the break
frequency at which the control is filtered can be fully determined through G2. A small
G2 will smooth out the control more than a large G2.
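The low-pass character of the z-dynamics can be verified numerically (the break frequency, disturbance frequency, and step size below are arbitrary illustrative choices, not from the thesis). The error obeys ż = −G2·z + e(t), a first-order low-pass with break frequency G2 rad/s, so a high-frequency component of e is strongly attenuated:

```python
import numpy as np

G2 = 10.0          # break frequency of the implicit filter, rad/s
omega = 200.0      # disturbance frequency well above the break

z, t, dt = 0.0, 0.0, 1e-5
amps = []
for k in range(300000):            # 3 s; keep only post-transient samples
    e = np.sin(omega * t)          # high-frequency "derivative error" input
    z += dt * (-G2 * z + e)
    t += dt
    if k > 250000:
        amps.append(abs(z))

gain = max(amps)                            # observed steady-state amplitude of z
expected = 1.0 / np.hypot(G2, omega)        # first-order magnitude 1/sqrt(G2^2 + w^2)
assert abs(gain - expected) / expected < 0.05   # far below the low-frequency gain 1/G2
```

Raising G2 moves the break frequency up and lets more of the high-frequency content through, matching the tuning remark above.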
In the proposed design, the virtual control error turns out to be
z = Fc − α, (3.20)
where
α = αnom + αrob, (3.21)
and
αnom = Fm + p⁻¹(−φ1w1 + ΛFd − G1s), (3.22)
and αrob is a Lyapunov redesign robust control defined as
αrob = −|s|^1.1 ( µ2|−Fm + Fc|^1.1 + µ1 ). (3.23)
Because p and φ1w1 are neural approximations, the terms that could potentially
introduce high frequency components in α are Fm and s. As will be seen, the derivatives
of these terms are modeled using neural networks, thus smoothing them and justifying
a backstepping design.
Originally, the backstepping technique was believed to filter out large spikes in
the term Fm, especially when the slave came into contact with a wall or suddenly lost
contact with the environment. However, testing has shown that the true advantage
of using the backstepping technique is when the slave robot is already in contact
with a wall. Consider the natural mode of vibration for a mass-spring model with
stiffness K and mass M having an undamped natural frequency of
f_n = \frac{1}{2\pi}\sqrt{\frac{K}{M}}. \quad (3.24)
As K increases the frequency of this natural mode also increases. In the case of
a surgical slave robot, a large K may contain surgical tool flexibilities. It is well
documented that these high frequency natural modes are easily excited by a control
system [50]. Many have proposed various filter designs to try and dampen the high
frequency components of the control signal so that the controller does not excite the
natural modes of the system [51]. Indeed, controlling these high frequency modes is often unnecessary since they tend not only to have small amplitudes (when not excited) but are also lightly damped (a property of the physical system). In the case of this thesis, the damping Dm ensures that the natural modes of vibration dampen naturally without the controller; thus the control of lower frequencies is more important. The proposed use of the backstepping technique provides an easily tunable and effective filter and is certainly a viable alternative to other approaches recorded in the literature. Additionally, the stability proof is strong and potential causes of instability are identified more easily.
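For a feel of the numbers, the snippet below evaluates Eq. (3.24) using the slave mass and the two environment stiffnesses that later appear in Table 6.1; pairing these particular values is only illustrative.

```python
import math

def natural_frequency(K, M):
    """Undamped natural frequency f_n = (1/2*pi)*sqrt(K/M) of a mass-spring mode."""
    return math.sqrt(K / M) / (2.0 * math.pi)

# Slave mass M = 0.1 kg; compliant medium vs. stiff wall (Table 6.1 values)
f_soft = natural_frequency(K=30.0, M=0.1)     # ~2.8 Hz
f_stiff = natural_frequency(K=3000.0, M=0.1)  # ~27.6 Hz
# A 100x increase in stiffness raises the natural mode by a factor of 10
```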
3.3.3 RBFNs used in controller
The proposed controller will make use of three neural networks to model unknown
terms in the system. No a priori knowledge of slave and environment dynamics is
assumed. The first neural network models
\phi_1(x, F_d, s)\,w_1 + d_1 = \Lambda\frac{d}{dt}(K(x)x) - M^{-1}(\Lambda D_m + 1)D_r\dot{x}. \quad (3.25)
Explicitly, the terms on the right hand side depend on x and \dot{x}, but providing the neural network with x, F_d, and s gives it unique state information, since s = s(x, \dot{x}, F_d). For implementation purposes it is easier to provide x, F_d, and s, and there is no effect on stability. Additionally, looking ahead to the backstepping technique, providing x, F_d, and s will make implementing the derivative terms of \phi_1 easier.
\phi_1 w_1 has no intuitive physical significance, and it is helpful to go through the stability proof in Chapter 4 to justify its design. The weights are updated according to
\dot{\hat{w}}_1 = \beta_1\left(\tau_1 + \gamma\,\phi_1^T\left(\frac{\partial\phi_1}{\partial s}\hat{w}_1 + G_1\right)\hat{p}^{-1}z - \nu_1\hat{w}_1\right), \quad (3.26)
where \beta_1 > 0 \in \mathbb{R} determines the learning rate of the RBFN and \tau_1 is the tuning function
\tau_1 = \phi_1^T s, \quad (3.27)
implemented for robust control purposes. \gamma > 0 \in \mathbb{R} is a scaling factor, which is often needed because the derivative terms in Eq. (3.26) tend to dominate the tuning function in magnitude; as a result, \gamma tends to be much less than 1. It is shown in [52] that scaling the tuning functions by \gamma so that all terms are similar in magnitude allows the control system designer to balance robust control with desired system performance. \nu_1 > 0 \in \mathbb{R} is the coefficient of the robustifying leakage term used to ensure the weights are bounded, and G_1 > 0 \in \mathbb{R} is a positive control gain. Leakage (\sigma-modification) is used for this update law because it tends to stabilize the weight updates better than e-modification, which is particularly susceptible to the bursting phenomenon discussed in [53], [54]. Bursting in the case of a surgical slave robot is likely caused by the drastic changes in errors that occur at transitions in environment stiffness.
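The structure of such an update can be sketched as follows. This is a simplified, hypothetical illustration: it keeps only the tuning-function term \tau_1 = \phi_1^T s and the \sigma-modification leakage of Eq. (3.26), omits the \gamma-scaled backstepping derivative term, and all numerical choices (centers, widths, the stand-in target function) are assumptions.

```python
import numpy as np

class RBFN:
    """Gaussian radial-basis-function network: output = phi(x)^T w."""
    def __init__(self, centers, width, beta, nu):
        self.centers = np.asarray(centers, dtype=float)  # one center per neuron
        self.width = width    # common Gaussian width
        self.beta = beta      # learning rate (beta_1 in the text)
        self.nu = nu          # sigma-modification (leakage) coefficient (nu_1)
        self.w = np.zeros(len(self.centers))

    def phi(self, x):
        return np.exp(-((x - self.centers) ** 2) / (2.0 * self.width ** 2))

    def output(self, x):
        return float(self.phi(x) @ self.w)

    def update(self, x, s, dt):
        # Simplified Eq. (3.26): tuning term phi^T s plus leakage -nu*w.
        w_dot = self.beta * (self.phi(x) * s - self.nu * self.w)
        self.w += dt * w_dot

net = RBFN(centers=np.linspace(-1, 1, 9), width=0.3, beta=30.0, nu=0.001)
target = lambda x: np.sin(2 * x)      # stands in for the unknown dynamics
dt = 0.001
for k in range(20000):
    x = np.sin(0.01 * k)              # persistently exciting input sweep
    s = target(x) - net.output(x)     # error signal driving the update
    net.update(x, s, dt)

err = abs(target(0.4) - net.output(0.4))
```

The leakage term keeps the weights bounded even without persistent excitation, which is the role \nu_1 plays in the proof of Chapter 4.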
The second neural network models
p + d_p = M^{-1}(\Lambda D_m + 1). \quad (3.28)
To be precise, in the one-DOF case this is not a neural network but an adaptive parameter. Looking ahead to a multi-DOF case, I found it appropriate to include this adaptive parameter in this section, as the mass matrix will depend on both link angles and angular velocities. Likewise, the damping coefficient will become a matrix of Coriolis and centripetal forces, also dependent on link angles and angular velocities. The parameter \hat{p} is updated according to
\dot{\hat{p}} = \beta_p\,\mathrm{Proj}\left[\tau_p + \gamma(-F_m + F_c)\left(\frac{\partial\phi_1}{\partial s}\hat{w}_1 + G_1\right)\hat{p}^{-1}z + \zeta(\bar{p} - \hat{p})\right], \quad (3.29)
where \tau_p is the tuning function
\tau_p = (-F_m + F_c)s, \quad (3.30)
and the definition
\mathrm{Proj}[\cdot] = \begin{cases} 0 & \text{if } \hat{p} > \|p\|_{max} \text{ and } \cdot > 0, \\ 0 & \text{if } \hat{p} < \|p\|_{min} \text{ and } \cdot < 0, \\ \cdot & \text{otherwise}, \end{cases} \quad (3.31)
called projection, is used to ensure the boundedness of \hat{p}. Additionally, since \hat{p}^{-1} appears in the control law (shown below), the projection approach ensures that this term does not become too large. It is hard, if not impossible, to know exact values for \|p\|_{max} and \|p\|_{min}, so in implementation these values are estimated as the desired maximum and minimum values that are to appear in the control law. Because of this, a novel projection rule is used which includes the supervisory term
\zeta(\bar{p} - \hat{p}), \quad (3.32)
where \zeta \in \mathbb{R}^+ and \bar{p} is a number that the estimate \hat{p} is driven towards. \zeta is chosen small, such that the update law still has flexibility in its learning abilities yet \hat{p} does not deviate significantly from \bar{p}. \bar{p} is chosen to be a very rough estimate of p (but, more importantly, an initial value of \hat{p}). In fact, results obtained in this thesis suggest that performance has less to do with the choice of \bar{p} and depends more on the fact that \hat{p} stays within a certain range of its initial value. The results convincingly support this novel method.
There is an analogy between this novel supervisory method and the common leakage method. Leakage terms tend to drive the weights to their initial value (zero); similarly, the supervisory term drives the weights to their initial value. This analogy is further strengthened in the results section, where it was found that the ideal value of \zeta is also the ideal value of \nu_{1,2}.
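The projection rule and its supervisory term translate into a small guard function; in the sketch below the argument names and the sign convention of the supervisory term are assumptions consistent with Eqs. (3.29)-(3.32).

```python
def proj(p_hat, update, p_max, p_min, zeta=0.0, p_bar=None):
    """Projection rule of Eq. (3.31) with the supervisory term zeta*(p_bar - p_hat).

    Returns the admissible update rate for p_hat. p_max/p_min are the
    designer's estimated bounds; p_bar is a rough a-priori estimate of p."""
    if zeta and p_bar is not None:
        update = update + zeta * (p_bar - p_hat)  # gently pull p_hat toward p_bar
    if p_hat > p_max and update > 0:
        return 0.0   # at the upper bound and still moving up: freeze
    if p_hat < p_min and update < 0:
        return 0.0   # at the lower bound and still moving down: freeze
    return update

# The rate is zeroed only when it would push p_hat further out of range
assert proj(11.0, +2.0, p_max=10.0, p_min=1.0) == 0.0
assert proj(11.0, -2.0, p_max=10.0, p_min=1.0) == -2.0
assert proj(0.5, -1.0, p_max=10.0, p_min=1.0) == 0.0
```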
A final neural network models the unknown term
\phi_2(x, \dot{x}, F_c)\,w_2 + d_2 = -\dot{F}_m(x, \dot{x}, F_c), \quad (3.33)
and is updated according to
\dot{\hat{w}}_2 = \beta_2(\phi_2^T z - \nu_2\hat{w}_2). \quad (3.34)
For stability purposes, the reasonable assumption is made that all uncertainties d_1, d_p, and d_2 are bounded functions of the system states (in accordance with Theorem 3).
3.3.4 Control Law
The actual control applied to the slave is
F_c(t) = \int_0^t \dot{F}_c(\tau)\,d\tau, \quad (3.35)
where
\dot{F}_c = \dot{F}_{c,nom} + \dot{F}_{c,rob}. \quad (3.36)
The nominal component is defined as
\dot{F}_{c,nom} = \dot{\hat{\alpha}} - s\hat{p} - G_2 z, \quad (3.37)
and the robustifying term, found using the Lyapunov redesign approach, is
\dot{F}_{c,rob} = -\left(\mu_3 + \mu_4|\kappa_1|^{1.1} + \mu_5|\kappa_1(-F_m + F_c)|^{1.1} + \mu_6|\kappa_2|^{1.1} + \mu_7|\kappa_3|^{1.1} + \mu_8|\kappa_3(-F_m + F_c)|^{1.1}\right)|z|^{1.1}, \quad (3.38)
with the following definitions:
\kappa_1 = -\hat{p}^{-1}\left(\frac{\partial\phi_1}{\partial s}\hat{w}_1 + G_1\right), \quad (3.39)
\kappa_2 = \hat{p}^{-1}|s|^{1.1}\mu_2|{-F_m + F_c}|^{0.1}\,\mathrm{sgn}(-F_m + F_c), \quad (3.40)
\kappa_3 = \hat{p}^{-1}|s|^{0.1}\left(\mu_2|{-F_m + F_c}|^{1.1} + \mu_1\right)\mathrm{sgn}(s). \quad (3.41)
It is important to design a robust control because of the auxiliary error definition: since it behaves like a sliding mode control, the robust terms help keep the system on the sliding mode. The virtual control error z was previously defined in Eq. (3.20).
\dot{\hat{\alpha}} = \dot{\hat{\alpha}}_{nom} + \dot{\hat{\alpha}}_{rob}, \quad (3.42)
where
\dot{\hat{\alpha}}_{nom} = -\phi_2\hat{w}_2 - \dot{\hat{p}}\,\hat{p}^{-2}(-\phi_1\hat{w}_1 + \Lambda\dot{F}_d - G_1 s) + \hat{p}^{-1}\left(-\phi_1\dot{\hat{w}}_1 - \left(\frac{\partial\phi_1}{\partial x}\dot{x} + \frac{\partial\phi_1}{\partial F_d}\dot{F}_d\right)\hat{w}_1 + \Lambda\ddot{F}_d - \left(\frac{\partial\phi_1}{\partial s}\hat{w}_1 + G_1\right)\dot{s}\right), \quad (3.43)
and
\dot{s} = -\Lambda\dot{F}_d + \phi_1\hat{w}_1 + \hat{p}(-F_m + F_c). \quad (3.44)
A Lyapunov redesign robustifying term is
\dot{\hat{\alpha}}_{rob} = -\hat{p}^{-1}|s|^{1.1}\left(1.1\,\mu_2|{-F_m + F_c}|^{0.1}\,[\dot{F}_c + \phi_2\hat{w}_2]\,\mathrm{sgn}(-F_m + F_c)\right) - 1.1\,\hat{p}^{-1}|s|^{0.1}\,\dot{s}\,\mathrm{sgn}(s)\left(\mu_2|{-F_m + F_c}|^{1.1} + \mu_1\right) + \dot{\hat{p}}\,\hat{p}^{-2}|s|^{1.1}\left(\mu_2|{-F_m + F_c}|^{1.1} + \mu_1\right). \quad (3.45)
An additional switching control, used for stability purposes, is defined in Appendix B.
Chapter 4
Stability Proof
This section uses Lyapunov stability theory to prove the uniform ultimate boundedness of all system states under the control F_{c,nom} with no system disturbances (stability in the presence of disturbances is discussed in Appendix A). The proof is largely algebraic, and the subscripts "nom" are dropped since only the nominal control is considered. Begin with the Lyapunov candidate
V_1 = \frac{1}{2}s^2 + \frac{1}{2\beta_p}\tilde{p}^2 + \frac{1}{2\beta_1}\tilde{w}_1^T\tilde{w}_1. \quad (4.1)
Taking the derivative of this candidate,
\dot{V}_1 = s\dot{s} + \frac{1}{\beta_p}\tilde{p}\,\frac{d}{dt}(p - \hat{p}) + \frac{1}{\beta_1}\tilde{w}_1^T\frac{d}{dt}(w_1 - \hat{w}_1). \quad (4.2)
The ideal weights in the neural networks are assumed not to change with time, so their time derivatives are 0. Granted, the ideal weights will change with time, but it is assumed that the update rate of the estimated weights is much faster than the rate at which the ideal weights change. Thus it can be assumed that at any instant of time the time derivative of the ideal weights is dominated by the time derivative of the estimated weights.
Consider separately the term \dot{s}:
\dot{s} = \Lambda\dot{\epsilon} + \ddot{x} \quad (4.3)
= \Lambda(\dot{F}_m - \dot{F}_d) + \ddot{x} \quad (4.4)
= \Lambda\left(D_m\ddot{x} + \frac{d}{dt}(K(x)x) - \dot{F}_d\right) + \ddot{x} \quad (4.5)
= -\Lambda\dot{F}_d + \Lambda\frac{d}{dt}(K(x)x) + (\Lambda D_m + 1)\ddot{x}. \quad (4.6)
Substituting in the slave dynamics,
\dot{s} = -\Lambda\dot{F}_d + \Lambda\frac{d}{dt}(K(x)x) + M^{-1}(\Lambda D_m + 1)(-D_r\dot{x} - F_m + F_c) \quad (4.7)
= -\Lambda\dot{F}_d + \Lambda\frac{d}{dt}(K(x)x) - M^{-1}(\Lambda D_m + 1)D_r\dot{x} + M^{-1}(\Lambda D_m + 1)(-F_m + F_c). \quad (4.8)
Using the neural network models of Eqs. (3.25) and (3.28),
\dot{s} = -\Lambda\dot{F}_d + \phi_1 w_1 + p(-F_m + F_c). \quad (4.9)
Returning to the Lyapunov candidate,
\dot{V}_1 = s(-\Lambda\dot{F}_d + \phi_1 w_1 + p[-F_m + F_c]) - \frac{1}{\beta_p}\tilde{p}\dot{\hat{p}} - \frac{1}{\beta_1}\tilde{w}_1^T\dot{\hat{w}}_1. \quad (4.10)
At this stage it is useful to use the substitutions w_1 = \hat{w}_1 + \tilde{w}_1 and p = \hat{p} + \tilde{p} and combine the error terms together:
\dot{V}_1 = s(-\Lambda\dot{F}_d + \phi_1\hat{w}_1 + \phi_1\tilde{w}_1 + [\hat{p} + \tilde{p}][-F_m + F_c]) - \frac{1}{\beta_p}\tilde{p}\dot{\hat{p}} - \frac{1}{\beta_1}\tilde{w}_1^T\dot{\hat{w}}_1 \quad (4.11)
= s(-\Lambda\dot{F}_d + \phi_1\hat{w}_1 + \hat{p}[-F_m + F_c]) + \tilde{p}\left([-F_m + F_c]s - \frac{1}{\beta_p}\dot{\hat{p}}\right) + \tilde{w}_1^T\left(\phi_1^T s - \frac{1}{\beta_1}\dot{\hat{w}}_1\right). \quad (4.12)
Introducing the virtual control error of Eq. (3.20) and the virtual control of Eq. (3.22) into the Lyapunov function,
\dot{V}_1 = s(-\Lambda\dot{F}_d + \phi_1\hat{w}_1 + \hat{p}[-F_m + z + \alpha]) + \tilde{p}\left([-F_m + F_c]s - \frac{1}{\beta_p}\dot{\hat{p}}\right) + \tilde{w}_1^T\left(\phi_1^T s - \frac{1}{\beta_1}\dot{\hat{w}}_1\right) \quad (4.13)
= s(-\Lambda\dot{F}_d + \phi_1\hat{w}_1 + \hat{p}[-F_m + z + F_m + \hat{p}^{-1}(-\phi_1\hat{w}_1 + \Lambda\dot{F}_d - G_1 s)]) + \tilde{p}\left([-F_m + F_c]s - \frac{1}{\beta_p}\dot{\hat{p}}\right) + \tilde{w}_1^T\left(\phi_1^T s - \frac{1}{\beta_1}\dot{\hat{w}}_1\right) \quad (4.14)
= s\hat{p}z - G_1 s^2 + \tilde{p}\left([-F_m + F_c]s - \frac{1}{\beta_p}\dot{\hat{p}}\right) + \tilde{w}_1^T\left(\phi_1^T s - \frac{1}{\beta_1}\dot{\hat{w}}_1\right). \quad (4.15)
Using the expressions for the tuning functions,
\dot{V}_1 = s\hat{p}z - G_1 s^2 + \tilde{p}\left(\tau_p - \frac{1}{\beta_p}\dot{\hat{p}}\right) + \tilde{w}_1^T\left(\tau_1 - \frac{1}{\beta_1}\dot{\hat{w}}_1\right). \quad (4.16)
Now, to begin the second stage of backstepping, use the Lyapunov candidate
V_2 = V_1 + \frac{1}{2}z^2 + \frac{1}{2\beta_2}\tilde{w}_2^T\tilde{w}_2, \quad (4.17)
and differentiate:
\dot{V}_2 = \dot{V}_1 + z\dot{z} - \frac{1}{\beta_2}\tilde{w}_2^T\dot{\hat{w}}_2 \quad (4.18)
= s\hat{p}z - G_1 s^2 + z(\dot{F}_c - \dot{\alpha}) + \tilde{p}\left(\tau_p - \frac{1}{\beta_p}\dot{\hat{p}}\right) + \tilde{w}_1^T\left(\tau_1 - \frac{1}{\beta_1}\dot{\hat{w}}_1\right) - \frac{1}{\beta_2}\tilde{w}_2^T\dot{\hat{w}}_2. \quad (4.19)
Insert the proposed control law:
\dot{V}_2 = s\hat{p}z - G_1 s^2 + z(\dot{\hat{\alpha}} - s\hat{p} - G_2 z - \dot{\alpha}) + \tilde{p}\left(\tau_p - \frac{1}{\beta_p}\dot{\hat{p}}\right) + \tilde{w}_1^T\left(\tau_1 - \frac{1}{\beta_1}\dot{\hat{w}}_1\right) - \frac{1}{\beta_2}\tilde{w}_2^T\dot{\hat{w}}_2 \quad (4.20)
= -G_1 s^2 - G_2 z^2 + z(\dot{\hat{\alpha}} - \dot{\alpha}) + \tilde{p}\left(\tau_p - \frac{1}{\beta_p}\dot{\hat{p}}\right) + \tilde{w}_1^T\left(\tau_1 - \frac{1}{\beta_1}\dot{\hat{w}}_1\right) - \frac{1}{\beta_2}\tilde{w}_2^T\dot{\hat{w}}_2. \quad (4.21)
Looking separately at \dot{\alpha}, derived from Eq. (3.22),
\dot{\alpha} = \frac{d}{dt}(F_m) + \frac{d}{dt}(\hat{p}^{-1})(-\phi_1\hat{w}_1 + \Lambda\dot{F}_d - G_1 s) + \hat{p}^{-1}\left(-\frac{d}{dt}(\phi_1(x, F_d, s)\hat{w}_1) + \Lambda\frac{d}{dt}(\dot{F}_d) - G_1\frac{d}{dt}(s)\right) \quad (4.22)
= \dot{F}_m - \dot{\hat{p}}\,\hat{p}^{-2}(-\phi_1\hat{w}_1 + \Lambda\dot{F}_d - G_1 s) + \hat{p}^{-1}\left(-\left(\frac{\partial\phi_1}{\partial x}\dot{x} + \frac{\partial\phi_1}{\partial F_d}\dot{F}_d + \frac{\partial\phi_1}{\partial s}\dot{s}\right)\hat{w}_1 - \phi_1\dot{\hat{w}}_1 + \Lambda\ddot{F}_d - G_1\dot{s}\right) \quad (4.23)
= -\phi_2 w_2 - \dot{\hat{p}}\,\hat{p}^{-2}(-\phi_1\hat{w}_1 + \Lambda\dot{F}_d - G_1 s) + \hat{p}^{-1}\left(-\left(\frac{\partial\phi_1}{\partial x}\dot{x} + \frac{\partial\phi_1}{\partial F_d}\dot{F}_d\right)\hat{w}_1 - \phi_1\dot{\hat{w}}_1 + \Lambda\ddot{F}_d\right) - \hat{p}^{-1}\left(\frac{\partial\phi_1}{\partial s}\hat{w}_1 + G_1\right)\dot{s} \quad (4.24)
= -\phi_2 w_2 - \dot{\hat{p}}\,\hat{p}^{-2}(-\phi_1\hat{w}_1 + \Lambda\dot{F}_d - G_1 s) + \hat{p}^{-1}\left(-\left(\frac{\partial\phi_1}{\partial x}\dot{x} + \frac{\partial\phi_1}{\partial F_d}\dot{F}_d\right)\hat{w}_1 - \phi_1\dot{\hat{w}}_1 + \Lambda\ddot{F}_d\right) - \hat{p}^{-1}\left(\frac{\partial\phi_1}{\partial s}\hat{w}_1 + G_1\right)(-\Lambda\dot{F}_d + \phi_1 w_1 + p[-F_m + F_c]), \quad (4.25)
and subtracting this from \dot{\hat{\alpha}},
\dot{\hat{\alpha}} - \dot{\alpha} = \phi_2\tilde{w}_2 + \hat{p}^{-1}\left(\frac{\partial\phi_1}{\partial s}\hat{w}_1 + G_1\right)(\phi_1\tilde{w}_1 + \tilde{p}[-F_m + F_c]). \quad (4.26)
Substitute this back into the expression for \dot{V}_2:
\dot{V}_2 = -G_1 s^2 - G_2 z^2 + \tilde{p}\left(\tau_p + [-F_m + F_c]\left(\frac{\partial\phi_1}{\partial s}\hat{w}_1 + G_1\right)\hat{p}^{-1}z - \frac{1}{\beta_p}\dot{\hat{p}}\right) + \tilde{w}_1^T\left(\tau_1 + \phi_1^T\left(\frac{\partial\phi_1}{\partial s}\hat{w}_1 + G_1\right)\hat{p}^{-1}z - \frac{1}{\beta_1}\dot{\hat{w}}_1\right) + \tilde{w}_2^T\left(\phi_2^T z - \frac{1}{\beta_2}\dot{\hat{w}}_2\right). \quad (4.27)
As the final step in determining \dot{V}_2, insert the proposed weight update laws. For this stability proof, it is assumed that \gamma = 1; for a proof of stability for arbitrary values of this parameter see Appendix B. A simple example can convince the reader that the value of \gamma does not affect stability. For instance, if the weight update laws are designed in the first stage of backstepping, stable weight updates would not even include the tuning functions \tau (i.e., \gamma = 0); this approach creates an overparameterized system. On the other hand, as will be shown below, \gamma = 1 also provides stable weight updates. Hence the tuning functions are, as their name suggests, used to tune the weight updates and add robustness to the overall system by providing unique state information.
\dot{V}_2 = -G_1 s^2 - G_2 z^2 - \zeta\tilde{p}(\bar{p} - \hat{p}) + \nu_1\tilde{w}_1^T\hat{w}_1 + \nu_2\tilde{w}_2^T\hat{w}_2 \quad (4.28)
= -G_1 s^2 - G_2 z^2 - \zeta\tilde{p}^2 + \zeta(p - \bar{p})\tilde{p} - \nu_1\tilde{w}_1^T\tilde{w}_1 + \nu_1\tilde{w}_1^T w_1 - \nu_2\tilde{w}_2^T\tilde{w}_2 + \nu_2\tilde{w}_2^T w_2. \quad (4.29)
At this point, only consider the case when \tilde{p} \neq 0, because \hat{p} is guaranteed bounded otherwise by construction of the projection rule. Thus, it is more interesting to consider the region in which \tilde{p} \neq 0 and define bounds for performance considerations rather than stability considerations. Defining the vectors \xi = [s\;\; z]^T and w = [w_1^T\;\; w_2^T\;\; p]^T (and the associated \tilde{\cdot} and \hat{\cdot} conventions) transforms \dot{V}_2 into
\dot{V}_2 = -\xi^T\begin{bmatrix} G_1 & 0 \\ 0 & G_2 \end{bmatrix}\xi - \tilde{w}^T\begin{bmatrix} \nu_1 I & 0 & 0 \\ 0 & \nu_2 I & 0 \\ 0 & 0 & \zeta \end{bmatrix}\tilde{w} + \tilde{w}^T\begin{bmatrix} \nu_1 I & 0 & 0 \\ 0 & \nu_2 I & 0 \\ 0 & 0 & \zeta \end{bmatrix}w - \zeta\bar{p}\tilde{p}. \quad (4.30)
Thus an ultimate (conservative) bound can be defined:
\dot{V}_2 \leq -G\|\xi\|^2 - \nu\|\tilde{w}\|^2 + \nu\|\tilde{w}\|(\|w\| + \zeta|\bar{p}|), \quad (4.31)
which represents an ellipse on the (\|\xi\|, \|\tilde{w}\|) plane. In the above bound the definitions G = \min(G_1, G_2) and \nu = \min(\nu_1, \nu_2, \zeta) are used. The assertion of stability is an extension of standard Lyapunov theory, and the system states are said to be uniformly ultimately bounded. \dot{V}_2 < 0 when
\|\xi\| > \sqrt{\frac{\nu}{4G}}\,\|w\| = \delta_\xi, \quad (4.32)
or
\|\tilde{w}\| > \|w\| + \zeta\bar{p} = \delta_w, \quad (4.33)
and the ultimate bound, \xi_b, on the system error \xi is
\xi_b = \sqrt{\delta_\xi^2 + \delta_w^2}. \quad (4.34)
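As a rough numerical illustration of Eqs. (4.32)-(4.34): the gains below are taken from Table 6.1, but the ideal-weight norm and \bar{p} are unknown in practice, so the values used here are assumptions.

```python
import math

def ultimate_bound(G1, G2, nu1, nu2, zeta, w_norm, p_bar):
    """Conservative ultimate bound sketch per Eqs. (4.32)-(4.34).

    w_norm stands in for the (unknown) ideal weight norm ||w||; p_bar is the
    supervisory target. Larger tracking gains shrink the delta_xi term."""
    G = min(G1, G2)
    nu = min(nu1, nu2, zeta)
    delta_xi = math.sqrt(nu / (4.0 * G)) * w_norm   # Eq. (4.32)
    delta_w = w_norm + zeta * p_bar                  # Eq. (4.33)
    return math.hypot(delta_xi, delta_w)             # Eq. (4.34)

xi_b = ultimate_bound(10.0, 15.0, 0.001, 0.001, 0.001, w_norm=5.0, p_bar=10.0)
xi_b_highG = ultimate_bound(100.0, 150.0, 0.001, 0.001, 0.001,
                            w_norm=5.0, p_bar=10.0)
```

Note that with small \nu and \zeta the bound is dominated by \delta_w, which is why the leakage and supervisory coefficients are kept small in the experiments.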
Chapter 5
Alternative Control Designs for Comparison
A comparison will be made between the proposed controller of Chapter 3 and three other well-known and established approaches: a robust H_2 controller, an output feedback controller, and a direct force controller. As will become apparent, a robust design is challenging in each approach, and significant assumptions must be made which compromise stability.
5.1 H2 Control
Stability properties of H_2 control have been well understood for some time. Controllers designed using H_2 techniques are inherently robust even in the presence of modeling uncertainties, external disturbances, and measurement noise. This is because the formulation of the control law is based on disturbance rejection, aiming to "keep the size of the performance variable small in the presence of the exogenous signals" [55]. Additionally, the controller design is straightforward and systematic if the plant to be controlled is well defined in terms of its state space representation. As such, these robust and optimal controllers are frequently implemented in many industrial applications and are used confidently by engineers. For these reasons, it seems appropriate to compare the proposed design to an H_2 design. Begin by adopting the system dynamics
M\ddot{x} = -D_r\dot{x} - F_m + F_c, \quad (5.1)
with a requirement that the damping term in the end effector force measurement is excluded and, for now, that the environment stiffness is constant. That is, F_m = K_e x, K_e \in \mathbb{R}^+. As far as a controller is concerned, any damping present in the environment can be included in the term D_r\dot{x}, so excluding D_m in Eq. (5.1) implicitly includes it in D_r. To make a fair comparison to the proposed controller, force errors are treated as the system states. Defining
e = \begin{bmatrix} e_1 \\ e_2 \end{bmatrix} = \begin{bmatrix} F_m - F_d \\ \dot{F}_m - \dot{F}_d \end{bmatrix}, \quad (5.2)
results in
\dot{e} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}e + \begin{bmatrix} B_{11} \\ B_{12} \end{bmatrix}w(t) + \begin{bmatrix} 0 \\ K_e M^{-1}(-D_r\dot{x} - F_m + F_c) - \ddot{F}_d \end{bmatrix}, \quad (5.3)
where B_1 := [B_{11}\;\; B_{12}]^T and w(t) represents sensor noise and may also model some system disturbances. First, a feedback cancelling control is chosen to be
u_{FBC} = D_r\dot{x} + F_m + K_e^{-1}M\ddot{F}_d, \quad (5.4)
and the overall control law that will be implemented is
F_c = u_{FBC} + u_{H_2}. \quad (5.5)
The final state-space representation for H_2 synthesis becomes
\dot{e} := Ae + B_1 w(t) + B_2 u_{H_2} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}e + B_1 w(t) + \begin{bmatrix} 0 \\ K_e M^{-1} \end{bmatrix}u_{H_2}, \quad (5.6)
with the performance variable, z, having dynamics
z = \begin{bmatrix} C_{11} & 0 \\ 0 & C_{12} \end{bmatrix}e + \begin{bmatrix} 0 \\ D_{12} \end{bmatrix}u_{H_2}, \quad (5.7)
and the system has the measurable output
y(t) := C_2 e + D_{21}w(t) = [1\;\; 0]\,e + D_{21}w(t). \quad (5.8)
The control u_{H_2} is the output of the system K_2 (defined below) with input y(t). K_2 minimizes the H_2 norm of the w(t) \to z mapping (that is, it minimizes the influence of the system uncertainties on the performance variable):
K_2 := \begin{bmatrix} A + B_2 F_2 + L_2 C_2 & -L_2 \\ F_2 & 0 \end{bmatrix}, \quad (5.9)
where
F_2 = -D_{12}^{-2}\left([0\;\; C_{12}D_{12}] + B_2^T X_2\right), \quad (5.10)
and
L_2 = -D_{21}^{-2}\left(Y_2 C_2^T + [B_{11}D_{21}\;\; B_{12}D_{21}]^T\right). \quad (5.11)
X_2 and Y_2 are the positive semi-definite solutions to the algebraic Riccati equations
0 = X_2 A_r + A_r^T X_2 + \begin{bmatrix} C_{11}^2 & 0 \\ 0 & C_{12}^2 \end{bmatrix} - D_{12}^{-2}\begin{bmatrix} 0 & 0 \\ 0 & C_{12}^2 D_{12}^2 \end{bmatrix} - D_{12}^{-2} X_2 \begin{bmatrix} 0 & 0 \\ 0 & K_e^2 M^{-2} \end{bmatrix} X_2, \quad (5.12)
and
0 = A_e Y_2 + Y_2 A_e^T - D_{21}^{-2} Y_2 \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} Y_2, \quad (5.13)
with
A_r = A - D_{12}^{-2}\begin{bmatrix} 0 & 0 \\ 0 & K_e M^{-1} \end{bmatrix} C_{12}D_{12}, \quad (5.14)
and
A_e = A - D_{21}^{-1}\begin{bmatrix} B_{11} & 0 \\ B_{12} & 0 \end{bmatrix}. \quad (5.15)
5.1.1 Discussion on H2 controller
A few noteworthy results that arise from the above controller design require some discussion. First of all, this controller will only work when there is a non-zero force sensor measurement. The controller design verifies this fact, because \|F_2\|_2 \to \infty as K_e \to 0, meaning that when the slave is in unconstrained motion a switch must be made from the above H_2 control to another control. Realistically, such a switching-type control is hard to implement, and this fact highlights the superiority of the proposed control design over conventional control design.
Also remember that the H2 control law is designed based on a constant environ-
ment stiffness Ke. The control is implemented by calling a gain scheduler routine
which varies the control gains according to the current environment stiffness. The
environment stiffness is unknown, however an estimate of it can be made based on the
measured force. It turns out that it is reasonable for estimation purposes to assume
that Ke ∝ Fm for surgical applications. For instance, if a large force is encountered
it can be assumed that the environment stiffness is also large. If the environment
stiffness were small in this particular case it would be implied that there was a large
tool displacement, an unlikely scenario in surgical applications. Note also that the
estimator gains L2 do not depend on Ke. In fact, the only gain that depends on Ke
is the derivative control gain (the second element of F2). Thus, uncertainties in the
estimated Ke do not affect the controller performance or robustness significantly.
Another point of discussion is uFBC which aims at cancelling out dynamics asso-
ciated with robot damping, measured force, and commanded force derivatives. Of
particular difficulty is knowledge of the coefficients Dr and Ke in Eq. (5.4). Ke
must be estimated in the same manner as above. The damping of the robot can be
Figure 5.1: Output feedback control architecture (controller, passivity observer, and passivity controller acting on the slave in the remote environment)
quantified by experiment, but recall that Dr also contains damping effects present in
the environment. Nonetheless, as long as Dr is estimated to be less than the actual
Dr only controller performance is affected, not stability. Indeed, robustness actually
improves if the estimate of Dr is below the actual value by adding a damping effect
to the system.
In addition to the above control law, the actual control signal sent to the slave ac-
tuators is filtered such that the maximum actuator velocity does not exceed 1500mm/s.
Filtering the control signal is a common practice and is absolutely necessary to pro-
tect the actuators, other slave hardware, and ensure that high frequency natural
modes are not excited.
5.2 Output Feedback Control
The basis for this control comes from passivity theory and follows the design employed by [30] and [31]. For a system as in Fig. 5.1 the control design is fairly straightforward. Consider that the controller (which can be designed arbitrarily, without regard for stability) sends force commands to the slave robot actuators. The passivity observer observes the passivity of the slave, and the passivity controller adds any shortage of passivity to the control signal from the controller. The passivity observer
is designed as a direct extension of passivity theory. In continuous time a passive system obeys the inequality
\int_0^t f(\tau)v(\tau)\,d\tau \geq 0, \quad (5.16)
where f(t) is the controller output and v(t) is the slave velocity. Physically, this system can be interpreted as passive if the slave is absorbing energy. The passivity observer calculates and outputs the passivity, E, of the slave in discrete time by
E(n) = \Delta T \sum_{k=0}^{n} f(k)v(k), \quad (5.17)
where \Delta T is the sampling period and n is the current sample.
The passivity controller is designed to output
\alpha(n) = \begin{cases} -\dfrac{E(n)}{\Delta T\, v(n)^2} & \text{if } E(n) < 0, \\ 0 & \text{otherwise}, \end{cases} \quad (5.18)
thus injecting any shortage of passivity back into the system. A proof of stability is trivial and is given in [30] for the reader's reference.
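Eqs. (5.17)-(5.18) translate almost directly into code. The sketch below is illustrative; the guard against zero velocity is an added safety not stated in Eq. (5.18).

```python
def passivity_observer_controller(f, v, dT):
    """Discrete passivity observer (Eq. 5.17) and controller (Eq. 5.18).

    f: controller output samples, v: slave velocity samples, dT: sample
    period. Returns the corrective forces alpha(n) that inject any
    shortage of passivity back into the system."""
    E = 0.0
    alphas = []
    for fk, vk in zip(f, v):
        E += dT * fk * vk                       # running energy (passivity) sum
        if E < 0.0 and vk != 0.0:               # active: dissipate the shortfall
            alphas.append(-E / (dT * vk ** 2))
        else:
            alphas.append(0.0)
    return alphas

# A passive sample accumulates energy; the next active sample that drives
# E negative triggers a corrective force.
alphas = passivity_observer_controller([1.0, -2.0], [1.0, 1.0], dT=0.1)
```

Note how the accumulated positive energy in the first sample delays intervention, which is exactly the memory effect discussed in the next section.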
5.2.1 Discussion on output feedback control
Again, a short discussion on the output feedback control is constructive. First,
note that the observer contains a memory element. Thus, a system can accumulate
“passivity” for some time. If the slave becomes active at some point (adding energy
to the system), it will continue to behave actively until the accumulated “passivity”
is dissipated. Only at this point will the passivity controller intercede and stabilize
the system. Realistically, the passivity observer must be reset periodically, at the
cost of performance.
Additionally, if there is any time delay in the system it is difficult to accurately
observe the passivity in the system. Indeed, in the presence of a time delay the
passivity controller may even cause the system to destabilize faster (particularly
when the slave exhibits oscillations).
Nonetheless, in ideal conditions the output feedback controller is attractive in the sense that the controller can be designed with a small robustness margin, or to be only locally stable, and the passivity controller will compensate for any instances of instability and ensure global stability.
Chapter 6
Results
This chapter
• provides an overview of the experimental setup used to verify the controller
design
• compares the proposed control design to the controllers designed in Chapter 5
• gives experimental results of various tests performed on the proposed controller
6.1 Experimental Setup
The experimental setup consists of a combination of simulation and real hardware.
A real master device is used and the force trajectory is supplied by a real human
operator. The slave device and environment are completely simulated.
6.1.1 Master device
Project neuroArm from the Foothills Medical Research Center in Calgary has pro-
vided a PHANTOM Omni haptic device manufactured by SensAble (Fig. 6.1). The
Omni is a 3 degree-of-freedom (in force) haptic device providing force feedback along the x, y, z axes. Positional sensing is provided in 6 degrees (x, y, z and roll, pitch,
yaw) from digital encoders with a positional resolution of 0.055mm. A workspace
with dimensions of 160W × 120H × 70Dmm is available. An IEEE-1394 FireWire
port connects the device to a PC, allowing for fast communications between the PC
and the device. A stylus located at the end effector has an apparent mass of 45 g.
Figure 6.1: Master haptic device used for experiments.
The maximum force that can be continuously exerted by the Omni is 3.3N , and the
motors have a backdrive friction of 0.26N . The device exhibits a maximum stiffness
of 2310N/m.
Communication to the device is provided through Quanser’s QuaRC control soft-
ware solution. A PHANTOM Omni blockset is provided for use in MATLAB’s
Simulink environment. QuaRC fully supports Simulink’s external mode, includ-
ing scopes, online parameter tuning, and data logging directly to the MATLAB
workspace. The Omni Simulink block sends force commands to the Omni’s motors
and receives x, y, z encoder positions as well as roll, pitch, yaw angles. QuaRC allows
communication with the Omni at sample frequencies up to 1000Hz, as used for this
work.
The proposed controller is designed in such a way to ensure a straightforward
extension to a multiple degree-of-freedom haptic setup, however this thesis only tests
the one degree of freedom case. As such, throughout the experiment two proportional
controllers are used to lock the haptic device onto the (x, 0, 0) line in the usable
workspace.
Because there is no force sensor mounted on the Omni end effector, a virtual
force sensor is designed which models a spring. The output of the force sensor is the
human desired force defined by
Fd = Fm + Khaptic(xhaptic − x0). (6.1)
With this force sensor construction it appears that the force controller has been
recast as a scaled position control. However, there are two key differentiating points.
One, Fd is the force fed back to the Onmi such that the human can feel the force that
they are applying. Compare this to a slave under position control in which the force
fed into the haptic device would simply be Fm. Secondly, the addition of Fm in Eq.
(6.1) has profound influence on the system with time delay in terms of force error.
Consider a system with a communication delay of T seconds between the remote
environment and the human operator. The measured force takes T seconds to reach
the master device,
Fd(t) = Fm(t− T ) + Khaptic(xhaptic(t)− x0), (6.2)
and the human commanded force Fd takes an additional T seconds to reach the
controller on the slave side. Thus the force error at time t is
�(t) = Fd(t− T )− Fm(t), (6.3)
= Fm(t− 2T )− Fm(t) + Khaptic(xhaptic(t− T )− x0). (6.4)
Indeed, when there is no communications delay (T = 0) the force sensor construction
does transform the controller into a scaled position controller.
It is important to note that, although the maximum force exerted by the Omni is 3.3 N, a larger force can still be commanded because of the virtual force sensor. However, the force feedback felt by the human saturates at 3.3 N. This is a limitation of haptic systems in general and can pose a significant danger to the slave hardware and patient, because the human operator becomes unaware of the force they are exerting. This limitation can be remedied by scaling the human commanded force fed back to the haptic device by a constant 0 < K_s < 1 while sending the unscaled version to the controller. Teleoperation systems often do this to provide the human with increased fidelity in some situations.
Additionally, [56] has shown experimentally and quantitatively that humans have poor judgment when differentiating between impedances. This particularly well-cited article (and others [57]) argues that nonlinearities (high-frequency changes in force, or "edges") are much better indicators of perceptual "hardness" than a ratio of static position to force (i.e., the technical definition of stiffness). Thus, it is more important that the haptic device preserve the high-frequency change in force, and there is less concern with the haptic device's ability to reflect the appropriate force. In fact, [56] makes these claims specifically for the stiffness range of 1700 to 3200 N/m, which is precisely the stiffness range that the controllers are tested for. The experience of the human test subject used for the following experiments confirms this.
Figure 6.2: Screen capture of the Simulink model used for experiments.
The parameter Khaptic is a user-determined parameter based on their preference.
A large Khaptic means the master device will move little whereas a smaller value will
require larger displacements to produce the same force.
6.1.2 Virtual components
The remaining components of the experimental setup are written in software and
implemented in Simulink (see Fig. 6.2). A majority of the experiment is written in
the C language, integrated into Simulink using mex-file s-functions, and compiled using MATLAB's Real-Time Workshop toolbox (requiring additional Target Language Compiler files to be written). Any integration performed in the experiment is done using MATLAB's ode4 fixed-step integrator, which uses the 4th order Runge-Kutta method. One exception is the neural network weight updates, which employ a 5th order Boole's rule integrator because this integration is performed in C code written externally to the MATLAB environment.
Slave dynamics consist of a mass-damper system in one-dimension with driving
force Fc and opposing force Fm. Two nonlinear and challenging environments are
designed for the controller to be tested on. The general force profile for the virtual environment is shown in Fig. 6.3. In the case of stiff contact, the force profile is exactly as shown in Fig. 6.3. In the case of the loss of contact test, there is a jump discontinuity at x_t with F_m(x = x_t^+) = 0.
Figure 6.3: Force profile of the simulated remote environment
Communication delays are added to the system so that any information trans-
ported from the master to the slave (or vice-versa) undergoes a time delay of T
seconds.
Table 6.1 shows the nominal parameters used in the experiments. Unless other-
wise mentioned, these parameters were used to obtain all results.
Table 6.1: Experiment Parameters
β1 = 30, β2 = 10, βp = 10
G1 = 10, G2 = 15
ν = 0.001, γ = 0.1
T = 0.05 s, h = 0.01 s
M = 0.1 kg, Mhaptic = 45 g
Dr = 4 N·s/m, Dm = 1 N·s/m
Λ = 1, Khaptic = 100 N/m
µ = 0.1, m = 10, ||p||min = 1
xt = 10 cm
Ke,1 = 30 N/m (loss of contact), 20 N/m (stiff contact)
Ke,2 = 0 N/m (loss of contact), 3000 N/m (stiff contact)
6.1.3 Implementation Considerations
\dot{F}_d and \ddot{F}_d are used in the control law, and it is not obvious how these values are obtained. Looking analytically at these terms based on Eq. (6.2), it is seen that
\dot{F}_d(t) = \dot{F}_m(t - T) + K_{haptic}\dot{x}_{haptic}(t), \quad (6.5)
\ddot{F}_d(t) = \ddot{F}_m(t - T) + K_{haptic}\ddot{x}_{haptic}(t). \quad (6.6)
In this case, the time delay allows a calculation of \dot{F}_m(t - T). Usually it is difficult to obtain derivative values in real time, but past derivative values can be estimated fairly accurately from a delta rule:
\dot{F}_m(t - T) = \frac{F_m(t - T + h) - F_m(t - T - h)}{2h}, \quad (6.7)
\ddot{F}_m(t - T) = \frac{\dot{F}_m(t - T + h) - \dot{F}_m(t - T - h)}{2h}, \quad (6.8)
where 0 < h < T.
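In discrete time this delta rule amounts to a centered difference over logged samples: because of the T-second delay, samples on both sides of t − T are already available. The buffer layout and sampling period in the sketch below are assumptions.

```python
DT = 0.001   # sampling period (1 kHz, as used in the experiments)

def past_derivative(F, k, k_h):
    """Delta-rule estimate of a past derivative, per Eqs. (6.7)-(6.8):
    dF(t-T) ~= [F(t-T+h) - F(t-T-h)] / (2h).

    F is the buffer of logged samples, k the sample index of t-T, and
    k_h the index offset corresponding to h (with 0 < h < T)."""
    return (F[k + k_h] - F[k - k_h]) / (2.0 * k_h * DT)

# Logged force ramping at 0.5 N/s: the estimated past slope is 0.5
F_log = [0.5 * i * DT for i in range(1000)]
slope = past_derivative(F_log, k=900, k_h=10)
```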
Additionally, \dot{x}_{haptic} and \ddot{x}_{haptic} are obtained by implementing a Kalman filter at the master side. The observer gains are calculated using the state space model
\begin{bmatrix} \dot{x}_{haptic} \\ \ddot{x}_{haptic} \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ M_{haptic}^{-1}K_{haptic} & 0 \end{bmatrix}\begin{bmatrix} x_{haptic} - x_0 \\ \dot{x}_{haptic} \end{bmatrix} + \begin{bmatrix} 0 \\ M_{haptic}^{-1} \end{bmatrix}(F_d(t) - F_m(t - T)), \quad (6.9)
and classic optimal observer design theory with noise covariances of 1.
6.2 Results
This section is divided into two parts:
• First, the proposed controller is tested for the cases of stiff contact and loss
of contact. The results are also compared to an H2 controller and an output
feedback controller
• Secondly, the proposed controller is specifically tested to show
– the filtering properties of the backstepping technique
– the effect of time delay
– the evolution of errors s and z
– neural network outputs
– the boundedness of neural network weights
6.2.1 Stiff Contact Test and Loss of Contact Test (comparison with H2 and output
feedback controller)
A human operator (the author) is asked to push the slave robot through a medium
as shown in Fig. 6.3 and after 10cm the slave will encounter a stiff environment,
simulating contact with a wall. Because the force trajectory is determined by a
human user, each trial is somewhat different. Testing the controllers using a pre-
defined force trajectory would not provide a fair test because the human is expected
to change their response based on the controller’s performance. This is an artifact
of the human operator being part of the system dynamics and makes it difficult to
test.
Nonetheless, the human is instructed to respond as consistently as possible amongst
trials and is asked to contact the wall (or puncture point) between 3 and 4sec for
consistency. The test for each controller is repeated 11 times to ensure consistent
results and standard deviations are provided with data which confirm the validity of
the results. The time delay is set to 0.05sec for the H2 and proposed controller, and
0sec for the output feedback controller. The output feedback controller tends to go
unstable for even small time delays because small time delays significantly affect the
accuracy of the passivity observer, in particular if high frequency effects are present
(which they are when the slave bounces off the hard surface).
The collision is not completely elastic because the environment still has a finite
stiffness (that is, at xt in Fig. 6.3 the slave velocity is not perfectly reflected in the op-
posite direction). In reality there will always be some compliance in the event of stiff
contact whether it be from environment deformation or surgical tool deformation.
The human operator is provided with information of the slave position as well as
force feedback from the PHANTOM Omni haptic device.
To convince the reader that the human test subject is not biased, the same ex-
periment is also performed with a pre-defined ‘human’ response. The pre-defined
‘human’ behaves like a filtered proportional-integral controller for velocity. That is,
the pre-defined response attempts to maintain constant slave velocity. The commanded force is thus determined by
F_d = \frac{2 + 8s}{s}\cdot\frac{20}{s + 20}\,\epsilon_v + F_m, \quad (6.10)
where �v is the error between the slave velocity and the desired slave velocity. In
terms of Fig. 6.3 the desired slave velocity in the first region is set to be 0.03m/s.
For the stiff contact test, the desired velocity in the second region is set to 0m/s
once a force of 5N is reached. For the loss of contact test, the desired velocity in the
second region is set to 0m/s as soon as the puncture occurs.
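The pre-defined ‘human’ of Eq. (6.10) can be sketched as a discrete-time simulation. The gains (Kp = 8, Ki = 2) and the 20 rad/s break frequency follow the equation above; the 1 ms sample time and the forward-Euler discretization are assumptions for illustration only.

```python
# Sketch of the pre-defined 'human' response of Eq. (6.10): a PI controller on
# the slave velocity error whose output passes through a first-order low-pass
# filter (break frequency 20 rad/s) and is added to the measured force Fm.
# Sample time and discretization scheme are assumptions, not from the thesis.

def make_pi_human(kp=8.0, ki=2.0, wb=20.0, dt=1e-3):
    """Return a function mapping (velocity error, measured force) -> Fd."""
    state = {"integral": 0.0, "filtered": 0.0}

    def step(e_v, f_m):
        state["integral"] += ki * e_v * dt       # integral term of the PI law
        raw = kp * e_v + state["integral"]       # PI output before filtering
        # first-order low-pass: d(filtered)/dt = wb * (raw - filtered)
        state["filtered"] += wb * (raw - state["filtered"]) * dt
        return state["filtered"] + f_m

    return step

human = make_pi_human()
# constant 0.01 m/s velocity error, zero measured force, simulated for 5 s
fd = [human(0.01, 0.0) for _ in range(5000)]
```

With a persistent velocity error the integral term grows, so the commanded force ramps up until the operator (model) is satisfied, mimicking a human pushing harder when the slave lags its desired velocity.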
Results of the comparisons are shown in Fig. 6.4-6.7, and following those figures
the results for the same tests using the PI human model are shown in Fig. 6.8-6.11.
Comparing the results when the real human provides the desired force and when
the PI human model provides it reveals that the controller responses at stiff contact
(and loss of contact) are remarkably similar.
Fig. 6.4 shows the results of the 1st trial and confirms that the proposed controller
significantly outperforms both the H2 and output feedback controllers. Both the H2
and output feedback controllers apply significant force to the wall when contact is
made. Table 6.2 records the average maximum force exerted by the slave on the
wall over the 11 trials and the standard deviation of this maximum. Only the 0.5sec
range after contact with the wall is made is considered when calculating the values
in Table 6.2, after which time the human is able to react to the increased force.
The standard deviations show that there is no significant variation between trials,
justifying single tests from here on.
As can be seen, such forces would either damage force sensors (force sensors on
neuroArm have a sensing range of 32N in the x, y directions and 56N in the z
direction) or at least not reflect the human commanded behaviour, and these results
Table 6.2: Average maximum measured force at the slave end effector due to the proposed controller, H2 controller, and the output feedback controller

Controller        Average Maximum Measured Force   Standard Deviation
Proposed          4.15N                            0.24
H2                122.46N                          2.40
Output Feedback   25.96N                           2.83
confirm the danger when contact with a wall is encountered in teleoperation. Maxi-
mum contact force decreases for both the proposed controller and the H2 controller
as the time delay increases (not shown). This is because the controller is allowed
more time to pull back before the high contact force reaches the human operator.
Oscillations in the measured force when in contact with the material seem undesirable.
However, looking at the slave robot position when in contact with the wall, it
can be seen that these oscillations in force correspond to small oscillations in slave
position (for the proposed controller at least, these have an average magnitude of
around 0.2mm).
Fig. 6.7 shows the results of the loss of contact test. The H2 controller is not
appropriate for the loss of contact test, as discussed in Section 5. Note that the
primary goal of this results section is to show stability and improved performance
in the case of stiff contact while ensuring that performance is not compromised in
the loss of contact case. Fig. 6.7 confirms that we do not compromise performance
in the loss of contact test. In fact, there is less positional overshoot when using our
controller (though at the expense of increased slave speed).
Figure 6.4: Comparison of the three controllers for the stiff contact test. The proposed controller hits the wall with 118N less force than the H2 controller and 21N less force than the output feedback controller.
Figure 6.5: Zoomed in version of Fig. 6.4 to emphasize the performance benefit of the proposed controller.
Figure 6.6: A version of Fig. 6.5 with only the proposed controller performance. Axes are the same as in Fig. 6.5.
Figure 6.7: Comparison between the proposed controller and an output feedback controller for the loss of contact test. The proposed controller has less positional overshoot than the output feedback controller, but a greater negative velocity.
Figure 6.8: Comparison of the three controllers for the stiff contact test using the PI human model. The controller response is quite similar to those shown in Fig. 6.4.
Figure 6.9: Zoomed in version of Fig. 6.8.
Figure 6.10: A version of Fig. 6.9 with only the proposed controller performance. Axes are the same as in Fig. 6.9.
Figure 6.11: Comparison between the proposed controller and an output feedback controller for the loss of contact test using the PI human model. Again, the controller response is quite similar to the results shown in Fig. 6.7.
6.2.2 Proposed controller performance
The remainder of this section rigorously tests the performance of the proposed con-
troller, showing
• the filtering properties of the backstepping technique
• the effect of time delay
• the evolution of errors s and z
• neural network outputs
• the boundedness of neural network weights
6.2.2.1 Filtering properties of Backstepping
Fig. 6.12 shows the controller performance when the backstepping technique is not
used. In this case, the actual control sent to the slave is Eq. (3.22) and the neural
network weight update laws do not include the derivative terms resulting from the
tuning function approach (γ = 0). This controller is tested using the applicable
nominal gains from Table 6.1. In addition to the control law Eq. (3.22), the control
signal is filtered before it is sent to the slave. The controller
is then tested for various filter cutoff frequencies. A first order filter of the form

output/input = ωb/(s + ωb),   (6.11)

is used, where ωb represents the break frequency in rad/s.
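A minimal discrete realization of the filter in Eq. (6.11) is sketched below, assuming a zero-order-hold equivalent and a 1 ms sample time (the thesis does not state its simulation step size).

```python
import math

# Discrete implementation of the first-order filter of Eq. (6.11),
# output/input = wb/(s + wb), via its zero-order-hold equivalent
# y[k] = a*y[k-1] + (1 - a)*u[k] with a = exp(-wb*dt).
# The sample time dt is an assumption for illustration.

def first_order_filter(u, wb, dt):
    a = math.exp(-wb * dt)        # pole of the discretized filter
    y, out = 0.0, []
    for uk in u:
        y = a * y + (1.0 - a) * uk
        out.append(y)
    return out

# step response: output approaches 1 with time constant 1/wb
step = first_order_filter([1.0] * 1000, wb=15.0, dt=1e-3)
```

After one time constant (1/15 s, about 66 samples here) the step response reaches roughly 63% of its final value, the usual first-order behaviour.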
Results comparable to those of the proposed controller are achieved when ωb = 15.
For large ωb the controller tends to go unstable. After much testing, it became apparent
that this instability actually arises from the controller exciting natural modes of
vibration in stiff contact. In the stiff environment the natural frequency of the undamped
vibration is

ωn = √(Ke/M) = 173.2 rad/s,   (6.12)

which is quite high (the damped natural frequency will be slightly lower). The
controller tends to excite these natural modes and the system becomes unstable. In
addition to the force responses of Fig. 6.12, the frequency spectrum of the control
force is included in Fig. 6.13. As expected, high frequency components of the control signal
appear near the calculated natural frequency of undamped vibration. Because controlling
these high frequency modes is not of particular concern, there is justification from
a performance perspective for adding a low-pass filter to the controller output. A
first order filter with ωb = 15 sufficiently damps these natural vibrations and prevents the
controller from causing instability. Thus, it can be said that under the nominal
control gains in Table 6.1 the backstepping technique behaves most like a first order
filter with cutoff frequency ωc = 15. Not surprisingly, this is in exact agreement with
the analysis in Section 3.3.2 in which it was shown that the cutoff frequency of the
backstepping filter effect is exactly determined by the gain G2.
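The attenuation this provides at the natural frequency can be checked directly. Assuming the backstepping filter effect is approximated by the first order characteristic with ωb = 15 rad/s identified above:

```python
import math

# Magnitude of the first-order filter wb/(s + wb) evaluated at the undamped
# natural frequency wn = sqrt(Ke/M) = 173.2 rad/s of Eq. (6.12).
# |H(jw)| = wb / sqrt(w^2 + wb^2)

def first_order_gain(w, wb):
    return wb / math.sqrt(w**2 + wb**2)

wn = 173.2                              # rad/s, from Eq. (6.12)
gain = first_order_gain(wn, wb=15.0)    # attenuation at the natural mode
gain_db = 20 * math.log10(gain)
```

The natural mode is attenuated to less than a tenth of its amplitude (more than 20 dB), which is consistent with the observed suppression of the resonance when ωb = 15 is used.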
Fig. 6.14 displays the effect the gain G2 has on the filtering properties of the
control signal. Additional damping, likely due to the neural network φ2w2, shifts the
response of the controller so that its dominant frequency in stiff contact is further from
the natural mode. This decreases the risk of exciting the natural mode and causing
resonant behavior in the system.
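Spectra like those in Figs. 6.13-6.14 can be obtained by taking the Fourier transform magnitude of the sampled control force. The sketch below uses a synthetic control signal (a 173.2 rad/s oscillation on a constant force) purely to illustrate the method; in the thesis the signal is the recorded Fc.

```python
import cmath, math

# Evaluate the discrete Fourier transform magnitude of a sampled signal on a
# grid of angular frequencies, as in the |Fc(jw)| plots. The signal here is
# synthetic; the 173.2 rad/s component stands in for the excited natural mode.

def dft_magnitude(signal, omega, dt):
    """Normalized |sum of x[k] e^{-j*omega*k*dt}|."""
    acc = sum(x * cmath.exp(-1j * omega * k * dt) for k, x in enumerate(signal))
    return abs(acc) / len(signal)

dt = 1e-3
fc = [2.0 + 0.3 * math.sin(173.2 * k * dt) for k in range(5000)]  # 5 s of data

omegas = list(range(10, 500, 5))                   # rad/s grid, skipping DC
mags = [dft_magnitude(fc, w, dt) for w in omegas]
peak = omegas[mags.index(max(mags))]               # dominant nonzero frequency
```

The dominant nonzero-frequency peak lands at the grid point nearest 173.2 rad/s, which is how the high frequency content near the natural mode shows up in the measured spectra.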
Figure 6.12: Filtered control signal, Fc, without backstepping for filter break frequencies ωb = 100, 50, 25, and 15 rad/s. Contact with the wall is made when Fc is approximately 2N. Higher break frequencies allow excitation of the system’s normal modes.
Figure 6.13: Frequency spectrum of the filtered control signal, Fc, without backstepping for various filter break frequencies. Higher frequency components tend to excite the natural modes of the system.
Figure 6.14: Filtering properties of the backstepping method (control signal Fc and its frequency spectrum for G2 = 80 and G2 = 20). The backstepping technique attenuates high frequency control signals and thus allows stable operation.
6.2.2.2 Neural Network Outputs, Boundedness of Neural Network Weights, Evolution of System States
The neural network outputs for the stiff contact test are given in Fig. 6.15 and for
the loss of contact test in Fig. 6.16. As expected, the outputs change drastically
at transition points in order for the neural network to model the large changes in
environment stiffness.
The results in this section also examine the boundedness of the neural network
weights. A commanded force trajectory was designed and the controller was run for
200 trials using this same trajectory. It was not necessary to test this portion of the
controller using human-produced trajectories; additionally, it is easier to verify
that the weights are bounded when a consistent trajectory is applied. The
controller was simulated with
Fd = Fm + 0.2N. (6.13)
Figs. 6.17-6.18 show the results. As can be seen, the robust weight update laws
ensure the boundedness of the neural network weights. Of particular interest is
Fig. 6.19, in which the supervisory learning term is removed from the update law
of the parameter p. Under the same conditions, the system goes unstable without
the supervisory learning term. It is interesting to note that the error s drives the
parameter p to zero, yet as p approaches zero the system goes unstable.
Finally, Fig. 6.20 shows the evolution of the system states s and z as the desired
force trajectory is repeatedly imposed on the controller. The neural networks succeed
in reducing s and z. However, it is important to remember that the proposed
controller shows its advantage in its speed of adaptation rather than its ability to
learn over time. Decreasing the learning gains would give a smoother learning curve,
but it was decided that the speed of adaptation is more important. This is why
the error s reaches its steady state value in approximately five trials. Nevertheless,
controller performance does improve with use.
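The per-trial convergence metric used in Figs. 6.17-6.20 can be sketched as follows. Each trial produces a sampled time history of a weight (or of the errors s and z), and the root mean square over that history is plotted against the trial number; the weight histories below are synthetic placeholders, not thesis data.

```python
import math

# RMS-per-trial metric as used in the weight/error convergence plots: one RMS
# value per trial, plotted against trial number. The histories here are
# synthetic: a component that settles toward a constant over 200 trials.

def rms(samples):
    return math.sqrt(sum(x * x for x in samples) / len(samples))

trials = [[0.1 + 0.5 / (k + 1) * math.sin(0.01 * i) for i in range(1000)]
          for k in range(200)]                     # synthetic weight histories

w_rms = [rms(trial) for trial in trials]
# convergence shows up as w_rms flattening out over the 200 trials
```

Flattening of such a curve is what "weight convergence is achieved" refers to in the figure captions.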
Figure 6.15: Neural network outputs in stiff contact. A wall is hit at around 3.5s and the neural network outputs react accordingly. (Panels: Fd and Fm; output of neural network 1, φw1; output of neural network 2, φw2; adaptive parameter p.)
Figure 6.16: Neural network outputs in loss of contact. The puncture occurs at around 2.5s and the neural network outputs react quickly. (Panels: Fd and Fm; output of neural network 1, φw1; output of neural network 2, φw2; adaptive parameter p.)
Figure 6.17: Root mean square neural network weights for 200 trials. Weight convergence is achieved.
Figure 6.18: Maximum neural network weights for 200 trials. Weight convergence is achieved.
Figure 6.19: Root mean square neural network weights when there is no supervised learning in ṗ. Instability occurs after 28 trials.
Figure 6.20: Convergence of states s and z over 200 trials.
6.2.2.3 Time delay
The proposed controller is tested for time delays up to 1sec. Fig. 6.21 displays
the results. When the time delay becomes large it is hard for the user to control
the slave, but an important result is that the system does not go unstable. The
difficulty in controlling the slave is an artifact of the time delay itself and represents
a weakness in the human’s ability to account for the delay. The most important
result is that regardless of the time delay, the impact force of the slave on the wall is
roughly 3.5N and this displays the ultimate advantage of using force control in a time
delayed system. Under position control, the controller would apply excessive force
to the wall because the human commanded position may be beyond the physical
constraints defined by the wall.
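The constant communication delay used in these tests means the slave receives Fd(t − T) rather than Fd(t). A FIFO buffer of length T/dt is one way to sketch this; the 0.05 sec delay follows the text, while the sample time is an assumption.

```python
from collections import deque

# Constant transmission delay: the commanded force received at time t is
# Fd(t - T). A fixed-length FIFO buffer, pre-filled with zeros (no signal
# received yet), implements the delay line.

def make_delay(T, dt):
    n = max(1, round(T / dt))
    buf = deque([0.0] * n, maxlen=n)

    def step(fd):
        delayed = buf[0]     # oldest sample, i.e. Fd(t - T)
        buf.append(fd)       # newest sample pushes the oldest out
        return delayed

    return step

delay = make_delay(T=0.05, dt=1e-3)        # 50 samples of delay
out = [delay(float(k)) for k in range(100)]  # feed a ramp through the channel
```

The output ramp is the input ramp shifted by 50 samples, which is exactly why a force-tracking slave keeps chasing a stale command after a sudden loss of contact.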
The case of sudden loss of contact with the environment is then tested in the
presence of time delay. It is important to test the loss of contact scenario because it
is expected that a force controlled system will behave undesirably when
there is a time delay (large positional overshoot can occur because the controller
aims at tracking the last commanded force, Fd(t − T) ≠ Fm(t)). Fig. 6.22 shows
the results. The slave robot loses contact at x = 0.1m. As anticipated, there
is significant positional overshoot when the slave loses contact and this overshoot
increases with increased time delay (from 0.45cm for T = 0.01s up to 10cm for
T = 0.2s). Nevertheless, because of the auxiliary error definition the overshoot is
reduced due to the damping characteristics of s.
One would assume that decreasing Λ would decrease the position overshoot when
there is a sudden loss of contact. Although this is true initially, a larger Λ also
improves the response of the system when moving in free space (x is proportional to
Fd with proportionality constant Λ). Therefore, a large Λ actually allows the human
operator to react to the positional overshoot faster. In most cases, it was found to
be more beneficial to have Λ = 1.
This test also displays the position tracking abilities of the proposed controller
when in free space.
Figure 6.21: Force response of the proposed controller in the presence of time delay for a stiff contact test (T = 0.1, 0.25, 0.5, and 1s). Impact force remains the same for arbitrary time delays.
Figure 6.22: Slave position in the presence of time delay for a loss of contact test (contact lost at x = 0.1m; T = 0.01, 0.05, 0.1, and 0.2s). Positional overshoot increases with increased time delay.
Chapter 7
Conclusions
A feedback controller has been proposed for the control of a remote surgical robot.
A force feedback haptic device supplies the human operator with force information
at the remote site. Force sensor information at the master-human interface provides
a desired force trajectory for the slave. It is argued that a force tracking slave
is sufficient for high fidelity haptic control when the human operator is supplied
with force and visual feedback from the remote site. Force tracking has obvious
advantages for teleoperation systems, particularly in the presence of time delays and
when contact with a hard (constrained) surface is made.
The feedback controller aims at minimizing force errors between the commanded
force and the force measured at the patient. An auxiliary error definition also ensures
that the robot’s velocity is kept low as well as providing positional control in free mo-
tion. Reducing the robot’s velocity has distinct advantages when the slave suddenly
loses contact with the environment. A force tracking slave produces large positional
overshoot, a problem compounded when the time delay is large. The backstepping
technique filters out high frequency components of the control signal. Thus, high
frequency natural modes inherent in stiff environments are not excited while low
frequency control authority is sufficiently maintained. It is argued that this is valid for the
realistic cases where the high frequency natural modes of the system are naturally
slightly damped.
Two neural networks as well as an adaptive parameter model unknown system
dynamics. Indeed, the controller has no information about the environment or slave.
These adaptive components also add damping to the system and ensure that high
frequency spikes in force measurements are not transmitted to the slave actuators.
Weight updates aim at reducing state errors and are inherently robust to unmodeled
dynamics. Designing the weight update laws at the second stage of backstepping
allows for a tuning function design and robustifies the weight updates. A novel method
for updating the weights when using a projection rule is proposed. Using the supervised
learning method proposed in [58] forces the weights to remain close to their
initial values and allows for fast adaptation as well as weight convergence over time.
The true benefit of the adaptive control design is its ability to react quickly to changes
in the remote environment. Performance nonetheless improves over time.
A smooth robust control law designed using Lyapunov redesign methods ensures
uniform ultimate boundedness of all signals in the presence of modeling errors from
the neural networks. Stability of the overall system is proven using a Lyapunov
control function and a bound on all signals exists.
Experimental results verify the proposed controller’s performance. First, it is
compared to an optimal H2 control and a passivity based control and shown to
be superior in many ways. Next, the effect of the backstepping technique is shown.
Stability of the system is displayed by repeatedly hitting a hard surface 200 times and
recording the neural network weight convergence as well as state error convergence.
The novel proposed weight update method is tested for the adaptive parameter.
Finally, the effect of a communications time delay on the system is shown. In
particular, it is shown that the maximum force exerted by the slave on the patient
does not increase with time delay. As well, the effect of suddenly losing contact with
a surface is shown.
In summary, three main contributions were made and tested in this work:
• A unique auxiliary error definition that reduces force errors while providing
position control when the slave moves in free space,
• Neural-adaptive backstepping that ensures stable control when contact with
a stiff environment is made,
• A novel neural network update law that ensures stable and robust control.
7.0.3 Future Work
There is a multitude of additional future work that could potentially arise from this
research. From a theoretical point of view, additional analysis on the filtering prop-
erties of the backstepping technique would prove insightful. For instance, quantifying
the optimal filtering for various environments to ensure the controller does not excite
the high frequency modes of the environment would prove useful. Also, examining
various orders of filters induced by the backstepping technique would be interesting.
For example, using an additional step of backstepping would induce a second order
filter on the virtual controls, allowing for better filter performance with steeper cutoffs.
Also, the effect of the time delay on system stability was not quantified;
rather, it was only shown that the system appears stable for delays up to 1sec. Examining the case
of varying time delays would also be of importance in order to allow teleoperation
over time varying and unreliable communications channels such as the internet or
wireless communication to space. Finally, an extension from the 1 DOF case to a
multi-DOF case is a logical next step. Testing the combined effects of disturbances
and errors at each joint to the overall performance would rigorously test the valid-
ity of the control design. Also, I would be interested to test the novel supervised
projection update method for the multi-DOF case in which the adaptive parameter
becomes a matrix of neural networks.
From a validation stand point, it would be beneficial to perform experiments
with a real slave robot. Hidden dynamics and subtleties inevitably surface when a
controller is taken from the simulation stage to experiment. These arise from sensor
noise, actuator limitations and nonlinearities, and unmodeled dynamics. Also, the
assumption was made that a perfect master device is used, in which the
force measured at the slave is perfectly reflected to the human operator. Using a
real force sensor rather than the virtual force sensor would allow the consideration
of limitations on the master side.
Bibliography
[1] R. Aracil, M. Buss, S. Cobos, M. Ferre, S. Hirche, M. Kuschel, and A. Peer,
The Human Role in Telerobotics, pp. 11–24. Springer, 2007.
[2] R. Cole and D. Parker, “Stereo TV improves manipulator performance,” in Pro-
ceedings of the SPIE, (Bellingham, WA.), pp. 18–27, 1990.
[3] A. Meier, C. Rawn, and T. Krummel, “Virtual reality: Surgical application -
challenge for the new millennium,” Journal of the American College of Surgeons,
vol. 192, pp. 372–384, March 2001.
[4] C. Basdogan, C. Ho, M. Srinivasan, and M. Slater, “An experimental study on
the role of touch in shared virtual environments,” in ACM Transactions on CHI,
pp. 443–460, 2000.
[5] S. Brave and A. Dahley, “inTouch: A medium for haptic interpersonal commu-
nication,” in Proceedings of CHI, pp. 363–364, 1997.
[6] B. Fogg, L. Cutler, P. Arnold, and C. Eisbach, “Handjive: A device for in-
terpersonal haptic entertainment,” in Proceedings of CHI, (Los Angeles, CA.),
pp. 57–64, 1998.
[7] E. Sallnas, K. Rassmus-Grohn, and C. Sjostrom, “Supporting presence in col-
laborative environments by haptic force feedback,” in ACM Transactions on
CHI, pp. 461–476, 2000.
[8] I. Oakley, S. Brewster, and P. Gray, “Can you feel the force? an investigation
of haptic collaboration in shared editors,” in Proceedings of EuroHaptics, 2001.
[9] neuroArm, “neuroarm,” March 2010. http://www.neuroarm.org/.
[10] G. Ballantyne and F. Moll, “The da Vinci telerobotic surgical system: the virtual
operative field and telepresence surgery,” Surgical Clinics of North America,
vol. 83, pp. 1293–1304, 2003.
[11] F. Isgro, A. Kiessling, M. Blome, A. Lehmann, B. Kumle, and W. Saggau,
“Robotic surgery using ZEUS MicroWrist technology: the next generation,” Jour-
nal of Cardiac Surgery, vol. 18, pp. 1–5, 2003.
[12] F. Tendick and S. Sastry, Minimally Invasive Robotic Telesurgery, pp. 89–94.
Kluwer Academic Publishers, 2001.
[13] P. Fager and P. von Wowern, “The use of haptics in medical applications,”
The International Journal of Medical Robotics and Computer Assisted Surgery,
vol. 1, pp. 36–42, 2005.
[14] F. Seto, Y. Hirata, and K. Kosuge, “Real-time cooperating motion generation
for man-machine systems and its application to medical technology,” Technology
and Health Care, vol. 15, pp. 121–130, 2007.
[15] D. Lawrence, “Stability and transparency in bilateral teleoperation,” IEEE
Transactions on Robotics and Automation, vol. 9, pp. 624–637, October 1993.
[16] A. Aziminejad, M. Tavakoli, R. Patel, and M. Moallem, “Transparent time-
delayed bilateral teleoperation using wave variables,” IEEE Transactions on
Control Systems Technology, vol. 16, pp. 548–555, May 2008.
[17] G. Sankaranarayanan and B. Hannaford, “Virtual coupling schemes for position
coherency in networked haptic environments,” in Proceedings of the BioRob
Conference, (Pisa, Italy), 2006.
[18] M. Cavusoglu, A. Sherman, and F. Tendick, “Design of bilateral teleoperation
controllers for haptic exploration and telemanipulation of soft environments,”
IEEE Transactions on Robotics and Automation, vol. 20, pp. 1–7, August 2002.
[19] H. Lee and M. Chung, “Adaptive controller of a master-slave system for trans-
parent teleoperation,” Journal of Robotic Systems, vol. 15, pp. 465–475, 1998.
[20] R. Anderson and M. Spong, “Asymptotic stability for force reflecting teleop-
erators with time delay,” International Journal of Robotics Research, vol. 11,
pp. 135–149, April 1992.
[21] R. Anderson and M. Spong, “Bilateral control of teleoperators with time delay,”
IEEE Transactions on Automatic Control, vol. 34, pp. 494–501, May 1989.
[22] H. Kazerooni, T. Tsay, and C. Moore, “Telefunctioning: An approach to teler-
obotic manipulations,” in American Control Conference, (San Diego, CA),
pp. 2778–2783, 1990.
[23] Y. Strassberg, A. Goldenberg, and J. Mills, “A new control scheme for bilateral
teleoperating systems: Performance evaluation and comparison,” in Proceedings
of the IEEE/RSJ International Conference on Intelligent Robots and Systems,
(Raleigh, NC), pp. 865–872, July 1992.
[24] M. Tavakoli, A. Aziminejad, R. Patel, and M. Moallem, “High-fidelity bilateral
teleoperation systems and the effect of multimodal haptics,” IEEE Transactions
on Systems, Man, and Cybernetics, vol. 37, pp. 1512–1528, December 2007.
[25] Z. Hu, S. Salcudean, and P. Loewen, “Robust controller design for teleoperation
systems,” in IEEE Conference on Systems, Man, and Cybernetics Intelligent
Systems for the 21st Century, pp. 2127–2132, October 1995.
[26] J. Gil, A. Avello, A. Rubio, and J. Florez, “Stability analysis of a 1 DOF hap-
tic interface using the Routh-Hurwitz criterion,” IEEE Transactions on Control
Systems Technology, vol. 12, pp. 583–588, July 2004.
[27] Y. Yokokohji, E. V. Poorten, and T. Yoshikawa, “Haptic control architectures
based on scattering theory and wave-variables,” in Proceedings of the Virtual
Reality Society of Japan Annual Conference, (Japan), pp. 319–322, 2002.
[28] G. Niemeyer and J. Slotine, “Stable adaptive teleoperation,” IEEE Journal of
Oceanic Engineering, vol. 16, pp. 152–162, January 1991.
[29] S. Stramigioli, A. van der Schaft, B. Maschke, and C. Melchiorri, “Geometric
scattering in robotic telemanipulation,” IEEE Transactions on Robotics and
Automation, vol. 18, pp. 588–595, August 2002.
[30] J. Ryu, D. Kwon, and B. Hannaford, “Stable teleoperation with time-domain
passivity control,” IEEE Transactions on Robotics and Automation, vol. 20,
pp. 365–373, April 2004.
[31] H. Khalil, Nonlinear Systems. New Jersey: Prentice Hall, 2002.
[32] S. Zak, Systems and Control. New York: Oxford University Press, 2003.
[33] E. Sontag, “A Lyapunov-like characterization of asymptotic controllability,”
SIAM Journal of Control and Optimization, vol. 21, pp. 462–471, 1983.
[34] R. Freeman and P. Kokotovic, Lyapunov Design, pp. 932–940. IEEE Press,
1996.
[35] M. Krstic, P. Kokotovic, and I. Kanellakopoulos, Nonlinear and Adaptive Con-
trol Design. New York: John Wiley & Sons, Inc, 1995.
[36] M. Krstic, I. Kanellakopoulos, and P. Kokotovic, “Adaptive nonlinear control
without overparameterization,” Systems & Control Letters, vol. 19, pp. 177–
185, 1992.
[37] J. Park and I. Sandberg, “Universal approximation using radial-basis-function
networks,” Neural Computation, vol. 3, pp. 246–257, 1991.
[38] K. Hornik, “Approximation capabilities of multilayer feedforward networks,”
Neural Networks, vol. 4, pp. 251–257, 1991.
[39] E. Bishop, “A generalization of the Stone-Weierstrass theorem,” Pacific Journal
of Mathematics, vol. 11, pp. 777–783, 1961.
[40] S. Seshagiri and H. Khalil, “Output feedback control of nonlinear systems using
RBF neural networks,” IEEE Transactions on Neural Networks, vol. 11, pp. 69–
79, January 2000.
[41] Y. Li, S. Qiang, X. Zhuang, and O. Kaynak, “Robust and adaptive backstepping
control for nonlinear systems using RBF neural networks,” IEEE Transactions on
Neural Networks, vol. 15, pp. 693–701, May 2004.
[42] R. Sanner and J. Slotine, “Gaussian networks for direct adaptive control,” IEEE
Transactions on Neural Networks, vol. 3, pp. 837–863, November 1992.
[43] S. Ge and C. Wang, “Direct adaptive NN control of a class of nonlinear systems,”
IEEE Transactions on Neural Networks, vol. 13, pp. 214–221, January 2002.
[44] J. Slotine and W. Li, “Adaptive robot control: A new perspective,” in Proceedings
of the 26th Conference on Decision and Control, (Los Angeles, U.S.A.), pp. 192–
197, 1987.
[45] P. Ioannou and P. Kokotovic, “Instability analysis and improvement of robust-
ness of adaptive control,” Automatica, vol. 20, pp. 583–594, 1984.
[46] F. Chen and H. Khalil, “Adaptive control of nonlinear systems using neural
networks - a deadzone approach,” in American Control Conference, (Boston,
MA.), pp. 667–672, 1991.
[47] K. Narendra and A. Annaswamy, “A new adaptive law for robust adaptation
without persistent excitation,” in American Control Conference, (Seattle, WA.),
pp. 1067–1072, 1986.
[48] Y. Fung, Biomechanics: Mechanical Properties of Living Tissues. New York:
Springer-Verlag, 1993.
[49] U. Kuhnapfel, H. Cakmak, and H. Maab, “Endoscopic surgery training us-
ing virtual reality and deformable tissue simulation,” Computers and Graphics,
vol. 24, pp. 671–682, 2000.
[50] S. Eppinger and W. Seering, “Understanding bandwidth limitations in robot
force control,” in IEEE International Conference on Robotics and Automation,
1987.
[51] J. Chow and C. Hanley, “Singular perturbation analysis of high-frequency filter
design,” International Journal of Control, vol. 51, pp. 705–720, October 1990.
[52] C. Macnab, G. D’Eleuterio, and M. Meng, “CMAC adaptive control of flexible-joint
robots using backstepping with tuning functions,” in Proceedings of
the IEEE International Conference on Robotics and Automation, vol. 3, (New
Orleans, U.S.A.), pp. 2679–2686, 2004.
[53] L. Hsu and R. Costa, “Bursting phenomena in continuous-time adaptive systems
with a σ-modification,” IEEE Transactions on Automatic Control,
vol. 32, pp. 84–86, 1987.
[54] C. Macnab, “Preventing bursting in approximate-adaptive control when using
local basis functions,” Fuzzy Sets and Systems, vol. 160, pp. 439–462, 2009.
[55] L. Lubin, S. Grocott, and M. Athans, H2 (LQG) and H∞ Control, pp. 651–661.
IEEE Press, 1996.
[56] D. Lawrence, L. Pao, M. Salada, and M. Dougherty, “Quantitative experimen-
tal analysis of transparency and stability in haptic interfaces,” in Proceedings
of the ASME International Mechanical Engineering Congress and Exposition,
(Atlanta, GA.), pp. 441–449, November 1996.
[57] P. Millman and J. Colgate, “Effects of non-uniform environment damping on
haptic perception and performance of aimed movements,” in Proceedings of the
International Mechanical Engineering Congress and Exposition, (San Francisco,
CA.), 1995.
[58] D. Richert, A. Beirami, and C. Macnab, “Neuro-adaptive control of robotic
manipulators using a supervisor inertia matrix,” in Proceedings of the 4th
International Conference on Autonomous Robots and Agents, (Wellington, N.Z.),
pp. 634–639, 2009.
Appendix A
Stability Analysis Including Disturbances
The stability of the system is examined when disturbances are included in the system
model, with particular attention to the robustifying behavior of $F_{c,rob}$. Since a
uniform ultimate bound has already been established for the system in the absence of
disturbances, it is sufficient to show that the system with only disturbances is likewise
uniformly ultimately bounded, using the same Lyapunov functions of Chapter 4. Thus,
the analysis begins where disturbances appear in the derivative of the Lyapunov
candidate function of Eq. (4.9), and only the terms of importance are considered. In
the following analysis, only disturbances that arise from modeling errors are considered.
The analysis below is easiest to follow alongside Chapter 4, noting that a disturbance
term is added every time a neural network is used,
$$\dot{V}_{1,d} = s\bigl(d_1 + d_p[-F_m + F_c] + p\alpha_{rob}\bigr). \tag{A.1}$$
$\alpha_{rob}$ has been designed such that
$$\dot{V}_{1,d} = s\bigl(d_1 + d_p[-F_m+F_c] - |s|^{1.1}\bigl[\mu_2|{-F_m+F_c}|^{1.1} + \mu_1\bigr]\bigr). \tag{A.2}$$
The second stage of backstepping yields
$$\dot{V}_{2,d} = \dot{V}_{1,d} + z\bigl(F_{c,rob} + \dot{\hat{\alpha}} - \dot{\alpha}\bigr), \tag{A.3}$$
where $\dot{\hat{\alpha}}$ denotes the implemented approximation of $\dot{\alpha}$; considering only the disturbances,
$$\dot{V}_{2,d} = \dot{V}_{1,d} + z\bigl(F_{c,rob} - \alpha_{nom,d} - \alpha_{rob,d}\bigr). \tag{A.4}$$
Examining the disturbances that arise from $\alpha_{nom}$ (an extension of the analysis in Eq. (4.25)) yields
$$\alpha_{nom,d} = d_2 + p^{-1}\Bigl(\frac{\partial\phi_1}{\partial s}w_1 + G_1\Bigr)\bigl(d_1 + d_p[-F_m+F_c]\bigr), \tag{A.5}$$
and differentiating $\alpha_{rob}$,
$$\begin{aligned}
\dot{\alpha}_{rob} ={}& -p^{-1}|s|^{1.1}\frac{d}{dt}\bigl(\mu_2|{-F_m+F_c}|^{1.1} + \mu_1\bigr)
- p^{-1}\frac{d}{dt}\bigl(|s|^{1.1}\bigr)\bigl(\mu_2|{-F_m+F_c}|^{1.1} + \mu_1\bigr) \\
&- \frac{d}{dt}\bigl(p^{-1}\bigr)|s|^{1.1}\bigl(\mu_2|{-F_m+F_c}|^{1.1} + \mu_1\bigr),
\end{aligned} \tag{A.6}$$
$$\begin{aligned}
={}& -p^{-1}|s|^{1.1}\Bigl(1.1\mu_2|{-F_m+F_c}|^{0.1}\bigl[\dot{F}_c - \dot{F}_m\bigr]\mathrm{sgn}(-F_m+F_c)\Bigr) \\
&- 1.1\,p^{-1}|s|^{0.1}\dot{s}\,\mathrm{sgn}(s)\bigl(\mu_2|{-F_m+F_c}|^{1.1} + \mu_1\bigr) \\
&+ \dot{p}\,p^{-2}|s|^{1.1}\bigl(\mu_2|{-F_m+F_c}|^{1.1} + \mu_1\bigr).
\end{aligned} \tag{A.7}$$
Two of these terms, $\dot{s}$ and $\dot{F}_m$, give rise to disturbances in $\dot{\alpha}_{rob}$ because neural
network approximations must be used to implement them,
$$\begin{aligned}
\alpha_{rob,d} ={}& -p^{-1}|s|^{1.1}\Bigl(1.1\mu_2|{-F_m+F_c}|^{0.1}\,d_2\,\mathrm{sgn}(-F_m+F_c)\Bigr) \\
&- 1.1\,p^{-1}|s|^{0.1}\bigl(d_1 + d_p[-F_m+F_c]\bigr)\mathrm{sgn}(s)\bigl(\mu_2|{-F_m+F_c}|^{1.1} + \mu_1\bigr).
\end{aligned} \tag{A.8}$$
Renaming
$$\kappa_1 = -p^{-1}\Bigl(\frac{\partial\phi_1}{\partial s}w_1 + G_1\Bigr), \tag{A.9}$$
$$\kappa_2 = p^{-1}|s|^{1.1}\mu_2|{-F_m+F_c}|^{0.1}\,\mathrm{sgn}(-F_m+F_c), \tag{A.10}$$
$$\kappa_3 = p^{-1}|s|^{0.1}\bigl(\mu_2|{-F_m+F_c}|^{1.1} + \mu_1\bigr)\mathrm{sgn}(s), \tag{A.11}$$
and substituting these results back into Eq. (A.4),
$$\begin{aligned}
\dot{V}_{2,d} ={}& \dot{V}_{1,d} + z\bigl(F_{c,rob} - d_2 + \kappa_1[d_1 + d_p(-F_m+F_c)] \\
&+ \kappa_2 d_2 + \kappa_3[d_1 + d_p(-F_m+F_c)]\bigr),
\end{aligned} \tag{A.12}$$
$$\begin{aligned}
={}& s\bigl(d_1 + d_p[-F_m+F_c] - |s|^{1.1}\bigl[\mu_2|{-F_m+F_c}|^{1.1} + \mu_1\bigr]\bigr) \\
&+ z\bigl(-d_2 + \kappa_1 d_1 + \kappa_1[-F_m+F_c]d_p + \kappa_2 d_2 + \kappa_3 d_1 + \kappa_3[-F_m+F_c]d_p + F_{c,rob}\bigr),
\end{aligned} \tag{A.13}$$
and including the robust control $F_{c,rob}$,
$$\begin{aligned}
\dot{V}_{2,d} ={}& s\bigl(d_1 - \mu_1|s|^{1.1}\bigr) \\
&+ s\bigl(d_p[-F_m+F_c] - \mu_2|s|^{1.1}|{-F_m+F_c}|^{1.1}\bigr) \\
&+ z\bigl(-d_2 - \mu_3|z|^{1.1}\bigr) \\
&+ z\bigl(\kappa_1 d_1 - \mu_4|\kappa_1|^{1.1}|z|^{1.1}\bigr) \\
&+ z\bigl(\kappa_1[-F_m+F_c]d_p - \mu_5|\kappa_1[-F_m+F_c]|^{1.1}|z|^{1.1}\bigr) \\
&+ z\bigl(\kappa_2 d_2 - \mu_6|\kappa_2|^{1.1}|z|^{1.1}\bigr) \\
&+ z\bigl(\kappa_3 d_1 - \mu_7|\kappa_3|^{1.1}|z|^{1.1}\bigr) \\
&+ z\bigl(\kappa_3[-F_m+F_c]d_p - \mu_8|\kappa_3[-F_m+F_c]|^{1.1}|z|^{1.1}\bigr),
\end{aligned} \tag{A.14}$$
which defines eight regions. Each region contributes to $\dot{V}_{2,d}$ being negative definite,
and thus creates a bound for $\dot{V}_{2,d}$, according to the results in Table A.1. Using the
same approach as in Section 2.1.2, a conservative bound on $\dot{V}_{2,d}$ is found by noting
that $\dot{V}_{2,d}$ is certainly negative if
$$|s| > \left(\frac{d_{1,max}}{\mu_1}\right)^{0.91} \cup \left(\frac{d_{p,max}}{\mu_2}\right)^{0.91} = \delta_s, \tag{A.15}$$
Table A.1: Bounds which contribute to $\dot{V}_{2,d}$ being negative definite

Bound  Condition
  1    $|s| > (d_{1,max}/\mu_1)^{0.91}$
  2    $|s|\,|{-F_m+F_c}|^{0.091} > (d_{p,max}/\mu_2)^{0.91}$
  3    $|z| > (d_{2,max}/\mu_3)^{0.91}$
  4    $|z|\,|\kappa_1|^{0.091} > (d_{1,max}/\mu_4)^{0.91}$
  5    $|z|\,|\kappa_1(-F_m+F_c)|^{0.091} > (d_{p,max}/\mu_5)^{0.91}$
  6    $|z|\,|\kappa_2|^{0.091} > (d_{2,max}/\mu_6)^{0.91}$
  7    $|z|\,|\kappa_3|^{0.091} > (d_{1,max}/\mu_7)^{0.91}$
  8    $|z|\,|\kappa_3(-F_m+F_c)|^{0.091} > (d_{p,max}/\mu_8)^{0.91}$
or
$$|z| > \left(\frac{d_{2,max}}{\mu_3}\right)^{0.91} \cup \left(\frac{d_{1,max}}{\mu_4}\right)^{0.91} \cup \left(\frac{d_{p,max}}{\mu_5}\right)^{0.91} \cup \left(\frac{d_{2,max}}{\mu_6}\right)^{0.91} \cup \left(\frac{d_{1,max}}{\mu_7}\right)^{0.91} \cup \left(\frac{d_{p,max}}{\mu_8}\right)^{0.91} = \delta_z. \tag{A.16}$$
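To see where the exponent 0.91 comes from, consider the first region of Eq. (A.14) as a worked step (a sketch, assuming the robust term acts in the direction $-\mathrm{sgn}(s)$ so that its product with $s$ is negative):

```latex
s\bigl(d_1 - \mu_1|s|^{1.1}\,\mathrm{sgn}(s)\bigr)
\le |s|\bigl(d_{1,max} - \mu_1|s|^{1.1}\bigr) < 0
\quad\Longleftrightarrow\quad
|s| > \left(\frac{d_{1,max}}{\mu_1}\right)^{1/1.1}.
```

Since $1/1.1 \approx 0.91$ (and $0.1/1.1 \approx 0.091$), the remaining seven rows of Table A.1 follow the same pattern.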
Combining this result with that of Eq. (4.32) redefines the ultimate bound on the
system error to be
$$\xi_b = \sqrt{\,[\delta_\xi^2] \cup [\delta_s^2 + \delta_z^2] + \delta_w^2\,}. \tag{A.17}$$
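The bound computation of Eqs. (A.15)-(A.16) can be sketched numerically. In this sketch (not part of the thesis), the union of scalar thresholds is interpreted conservatively as their maximum, and the function name and argument layout are illustrative only:

```python
def ultimate_bounds(d1_max, dp_max, d2_max, mu):
    """Conservative radii (delta_s, delta_z) from Eqs. (A.15)-(A.16).

    mu maps the robust-gain index (1..8) to mu_i > 0; the exponent
    0.91 is 1/1.1, matching the |.|^{1.1} robust terms.
    """
    e = 1.0 / 1.1  # the 0.91 exponent
    delta_s = max((d1_max / mu[1]) ** e, (dp_max / mu[2]) ** e)
    delta_z = max((d2_max / mu[3]) ** e, (d1_max / mu[4]) ** e,
                  (dp_max / mu[5]) ** e, (d2_max / mu[6]) ** e,
                  (d1_max / mu[7]) ** e, (dp_max / mu[8]) ** e)
    return delta_s, delta_z
```

Increasing any robust gain $\mu_i$ shrinks the corresponding threshold, which is exactly the trade-off discussed in Appendix B: large robust gains give small bounds but expend control effort on rarely encountered worst-case disturbances.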
Appendix B
Robust Control for Scaled Tuning Functions
As discussed in Appendix A, it is sometimes undesirable to use robust control terms
to make the bounds due to disturbances small. Indeed, in doing so the control places
far too much effort into overcoming disturbance terms rather than driving state errors
to equilibrium. This problem is compounded because the state bounds are derived
from worst-case scenarios, which are rarely encountered.

Using a scaled tuning function approach, robust behavior is maintained while the
nominal control is still able to perform well. Nevertheless, it is necessary to design a
robustifying control term to ensure boundedness of the states when the tuning
functions are scaled.
The proof begins from Eq. (4.17) with a redefinition of the Lyapunov candidate,
$$V_2 = V_1 + \frac{\gamma}{2}z^2 + \frac{1}{2\beta_2}\tilde{w}_2^T\tilde{w}_2, \tag{B.1}$$
where the addition of $0 \le \gamma \le 1$ is paramount for the proof. Following the stability
proof in Chapter 4 from this point onwards, the final Lyapunov function derivative is
$$\begin{aligned}
\dot{V}_2 ={}& -G_1 s^2 - G_2 z^2 + (1-\gamma)spz + \gamma z u_{r,\tau} \\
&+ \tilde{p}\Bigl(\tau_p + \gamma(-F_m+F_c)\Bigl(\frac{\partial\phi_1}{\partial s}w_1 + G_1\Bigr)p^{-1}z - \frac{1}{\beta_p}\dot{\hat{p}}\Bigr) \\
&+ \tilde{w}_1^T\Bigl(\tau_1 + \gamma\phi_1^T\Bigl(\frac{\partial\phi_1}{\partial s}w_1 + G_1\Bigr)p^{-1}z - \frac{1}{\beta_1}\dot{\hat{w}}_1\Bigr)
+ \tilde{w}_2^T\Bigl(\phi_2^T z - \frac{1}{\beta_2}\dot{\hat{w}}_2\Bigr),
\end{aligned} \tag{B.2}$$
where an additional robust control $u_{r,\tau}$ is included for later design. Using the designed
weight update laws defined in Eqs. (3.26), (3.29), and (3.34) yields
$$\dot{V}_2 = -G_1 s^2 - G_2 z^2 + (1-\gamma)spz + \gamma z u_{r,\tau} - \zeta\tilde{p}(p - \tilde{p}) + \nu_1\tilde{w}_1^T w_1 + \nu_2\tilde{w}_2^T w_2. \tag{B.3}$$
By treating $(1-\gamma)spz$ as a system disturbance, now design
$$u_{r,\tau} = \frac{\gamma-1}{\gamma}\,sp, \tag{B.4}$$
to end up with the exact expression in Eq. (4.29). However, it is found that with
$\gamma \ll 1$, $u_{r,\tau}$ tends to become large. To ensure stability while still taking advantage
of the weighted tuning function method, the following approach is used. Assuming that
$\gamma \ll 1$, approximate $\dot{V}_2$ by
$$\dot{V}_{2,s,z} \approx -G_1 s^2 - G_2 z^2 + spz. \tag{B.5}$$
The states related to $w_{1,2}$ and $p$ have been excluded because they have already been
shown to be bounded. This Lyapunov derivative can therefore be evaluated online. A
switching criterion as proposed in [52] is used,
$$\gamma = \begin{cases} 1 & \text{if } -G_1 s^2 - G_2 z^2 + spz > 0, \\ \gamma_{design} & \text{otherwise}, \end{cases} \tag{B.6}$$
thus ensuring that $V$ is bounded. The authors of [52] also note that changing the
learning rates $\beta_{1,p}$ when $\gamma$ changes improves performance.
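The switching rule of Eq. (B.6) together with the robust term of Eq. (B.4) can be sketched as follows (the function names and scalar signatures are illustrative, not from the thesis):

```python
def gamma_switch(s, z, p, G1, G2, gamma_design):
    """Eq. (B.6): fall back to gamma = 1 whenever the simplified
    Lyapunov derivative estimate of Eq. (B.5) fails to be negative.
    G1, G2 > 0 are control gains; gamma_design is the scaled value."""
    V2_sz = -G1 * s ** 2 - G2 * z ** 2 + s * p * z  # Eq. (B.5)
    return 1.0 if V2_sz > 0 else gamma_design

def u_r_tau(s, p, gamma):
    """Eq. (B.4): robust term cancelling the (1 - gamma)*s*p*z cross term."""
    return (gamma - 1.0) / gamma * s * p
```

Note that `u_r_tau` vanishes when `gamma == 1`, so switching to $\gamma = 1$ simultaneously removes the potentially large robust term, consistent with the discussion following Eq. (B.4).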