NEURAL NETWORK MODEL-BASED CONTROL
OF DISTILLATION COLUMNS
by
BALSHEKAR RAMCHANDRAN, B.E., M.S.Ch.E
A DISSERTATION
IN
CHEMICAL ENGINEERING
Submitted to the Graduate Faculty of Texas Tech University in
Partial Fulfillment of the Requirements for
the Degree of
DOCTOR OF PHILOSOPHY
Approved
Accepted
August, 1994
ACKNOWLEDGMENTS
I wish to extend my sincere thanks to Dr. Russ Rhinehart for his mentorship and
guidance in this work. I appreciate his patience and the calm disposition he displayed at
times when I was too digressive for his liking. He, in my eyes, is a superb combination of
a great researcher and a true teacher, a rare combination these days. Thank you, Russ!
I also wish to extend my sincere thanks to Dr. Jim Riggs for his valuable insights
during the course of this work, and his support and guidance during my entire stay at
Texas Tech. His free spirit and "never-say-die" attitude will be remembered for a long
time. My thanks also go out to my other committee members Drs. Ray Desrosiers, Terry
Tolliver, and especially to Dr. Oldham, who taught me that it was not OK to be sloppy
because I was an engineer, and that anything that can be known, must be known! Many
thanks to all the faculty and staff members of the Department of Chemical Engineering for
all their help and support during my stay at Tech. Thanks are also extended to the
members of the Process Control and Optimization Consortium at Texas Tech University
for financial support of this work. I would like to thank Jim Deam for his constant
support and encouragement, and Stan Proctor and his Engineering Technology Group at
Monsanto Company, St. Louis, MO, where this work was conceived and got started.
This dissertation marks an important point in my life that began on August 17, 1988,
my first day in Lubbock, and in the past nearly six years I have had the opportunity to
grow and learn a number of new things, and experience the American way-of-life. There
are a number of people who have touched my life and left an impression that would last
this lifetime. I would like to thank the Heichelheims, for giving me a home away from
home; Vikram Singh, who tried very hard to teach me the "Meaning of Life;" Hoshang
Subawalla, for reasons I shall never know; June Heichelheim, for being such a wonderful
friend; Anand Laxminarayan, my cousin, the ultimate realist; Don Quixote, the ultimate
idealist who taught me that life is not what it really is, but what it really should be; Tammy
"Tamara" Kent, for her constant support and willingness to help with a smile (she forced
me to write these words!!), and also all the entertainment in the Chem. E. Office with
replays of "Seinfeld" and "Married... With Children!"; Mary Beth Abernathy, for patiently
listening to my philosophy about life (much to her chagrin, unfortunately!); Vikram
Gokhale and Suhas Kulkarni, from whom I learnt a lot; Paul Bray, for his daily morning
"Far Side" cartoons and introducing me to the brilliant flagon of his "cyanide-special"
cocktails; Bertie Wooster, the weedy butterfly who provided me with many hours of pure
unadulterated English "humour" at the times I needed it most; and to many many dear
friends who sought my ashram in Room 107, Chemical Engineering Building, and whose
company I shall cherish for many years to come.
In each and every step along this journey, the two people who beamed me their love
and unrelenting support "from across the seven seas" are my Appa and Amma back home
in India. All that they taught me and gave me has brought me a long way, indeed. And
finally, many thanks to the family that probably means more to me than anything else in my
life, and to the two wonderful people who have influenced me in more ways than I
possibly can count, who taught me the way of unconditional love, and to dream the
impossible dream, and to live life to the fullest. Thank you, Aunt Dahlia and Uncle Tom!
In all these times, I have tried always to enjoy life and all of its offerings; to live a little, to
learn a little, to laugh a lot, and revel till my ribs squeaked (a la Bertie Wooster), and this
work is a testimony to that.
TABLE OF CONTENTS
ACKNOWLEDGMENTS ii
ABSTRACT vii
LIST OF TABLES ix
LIST OF FIGURES xi
CHAPTER
I. INTRODUCTION 1
1.1 Distillation Control and Its Importance 1
1.2 Basics of Distillation Control 1
1.3 Neural Networks and Its Relevance to Process Control 5
II. NEURAL NETWORKS 7
2.1 Feedforward Neural Networks 8
2.2 Computational Aspects in a Feedforward Neural Network 12
2.3 Training of Neural Networks 15
2.4 Backpropagation Training Algorithm 17
2.5 Analysis of the Backpropagation Algorithm 18
2.6 Optimization Approach to Neural Network Training 20
2.7 The Levenberg-Marquardt Method for Nonlinear Least Squares 25
2.8 Examples Using the Marquardt Method for Training Neural Networks 29
III. STEADY-STATE MODELS FOR DISTILLATION 35
3.1 Process Models and Process Inverse Models 35
3.2 Distillation Column Test Cases 36
3.3 Development of Steady-State Inverse Models for Distillation 37
3.4 Optimal Training of Neural Networks 48
IV. DYNAMIC PROCESS SIMULATIONS 50
4.1 Mathematical Model for Nonideal Multicomponent Distillation 50
4.2 Additional Features of the Dynamic Process Simulators 57
4.3 Open-Loop Response Characteristics of the Processes 57
4.4 Steady-State Analyses of Distillation Column Operation 71
V. MODEL-BASED CONTROL STRATEGY 96
5.1 Nonlinear Process Model-Based Control (Nonlinear PMBC) 97
5.2 Nonlinear Process Model-Based Control of Distillation Columns 99
VI. CONTROL RESULTS 104
6.1 Lab-Column Controller Tests 104
6.2 High-Purity Column Controller Tests 126
VII. PROCESS-MODEL MISMATCH 149
7.1 Process-Model Mismatch 149
7.2 Process-Model Mismatch for the Distillation Columns 151
7.3 "It's the Gain Prediction, Stupid!" 157
VIII. DISCUSSION, CONCLUSION, AND RECOMMENDATIONS 163
8.1 On Using Neural Network Steady-State Process Inverse
Models 163
8.2 On Optimal Training of Neural Networks 164
8.3 In Conclusion 166
8.4 Recommendations 168
BIBLIOGRAPHY 176
APPENDICES
A. ERROR BACKPROPAGATION TRAINING ALGORITHM 185
B. THE MARQUARDT ALGORITHM 193
C. EMPIRICAL CORRELATIONS FOR THE METHANOL-WATER SYSTEM 206
D. FORTRAN PROGRAMS FOR EXAMPLES IN APPENDIX B 237
ABSTRACT
Distillation control is difficult because of its nonlinear, interactive, and nonstationary
behavior; but improved distillation control techniques can have a significant impact on
improving product quality and environmental resource protection. Advanced control
strategies use a model of the process to select the desired control action. While
phenomenological models have demonstrated efficient control of highly nonlinear and
interactive distillation columns, they can become complicated and computationally
intensive. Further, these models may require frequent reparametrization to eliminate any
process-model mismatch that may have accrued with time.
Neural networks provide an alternate approach to modeling process behavior, and
have received much attention because of their wide range of applicability, and their ability
to handle complex and nonlinear problems. The main advantage in using neural networks
is that neural network models are computationally simple, and possess enormous
processing power, speed, and generality.
In this study, neural network process-inverse models were developed for two
different methanol-water distillation columns: (i) a lab-scale column; and (ii) an industrial-
scale high-purity column. The data required for "training" and "testing" the neural
networks for the two distillation columns were obtained from steady-state simulations of
the two distillation columns developed using a commercial steady-state simulation
package. The neural networks were trained using a very efficient nonlinear optimization
algorithm based on the Levenberg-Marquardt method.
The neural network steady-state process-inverse models were used in conjunction
with a reference system synthesis based on first-order dynamics. The neural network
model-based controllers were tested on dynamic simulations of the two distillation
columns for both servo and regulatory modes of operation, and their performances were
compared with conventional static feedforward Proportional-Integral controllers.
The approach presented in this study is simple and direct. It addresses issues such as
obtaining training and testing data from steady-state simulation packages, training the
neural networks with a more robust and efficient nonlinear optimization algorithm, using
steady-state process-inverse neural network models, and incorporating the model with a
reference system synthesis to formulate a very simple multivariable control structure that
is distinct from, and performs better than, conventional Proportional-Integral controllers.
The methodology offers the advantages of easy implementation and a practical solution
to difficult control problems.
LIST OF TABLES
3.1 Design and Operating Conditions for the Two Distillation Columns 39
4.1 Process Gains for Overhead and Bottom Compositions for the Lab Distillation Column 89
4.2 Process Gains for Overhead and Bottom Compositions for the High-Purity Distillation Column 90
4.3 First-Order Plus Deadtime Models for Overhead and Bottom Compositions for the Lab Distillation Column 91
4.4 First-Order Plus Deadtime Models for Overhead and Bottom Compositions for the High-Purity Distillation Column 92
4.5 Relative Gain Array for the Lab Distillation Column using the Average Process Gains 94
4.6 Relative Gain Array for the High-Purity Distillation Column using the Average Process Gains 95
6.1 Description of the Controller Tests for the Lab Distillation Column 105
6.2 Description of the Servo-mode Controller Tests for the High-Purity Distillation Column 106
6.3 Description of the Regulatory-mode (Feed Flowrate Upsets) Controller Test for the High-Purity Distillation Column 107
6.4 Description of the Regulatory-mode (Feed Composition Upsets) Controller Test for the High-Purity Distillation Column 108
6.5 Comparison of Controller Performance for the Lab Distillation Column 118
6.6 Neural Network Model-Based Controller Performance for the High-Purity Distillation Column 147
6.7 Conventional Feedback PI plus Feedforward Controller Performance for the Lab Distillation Column 148
B.1 Values for INDEX(1) and their Corresponding Meaning 197
B.2 Values for INDEX(2) and their Corresponding Meaning 198
B.3 Values for INDEX(3) and their Corresponding Meaning 199
B.4 Values for INDEX(3) Under Error Returns and their Corresponding Meaning 202
C.1 Vapor-Liquid Equilibrium for Methanol-Water System at 1 atma 208
C.2 Enthalpy Data for Methanol-Water System at 1 atma 209
C.3 Transport Property Data for Methanol-Water System at 1 atma 210
C.4 VLE for Methanol-Water System at 2 atma 221
C.5 Enthalpy Data for Methanol-Water System at 2 atma 222
C.6 Transport Property Data for Methanol-Water System at 2 atma 223
LIST OF FIGURES
1.1 Schematic of a simple distillation column 2
2.1 Feedforward neural network architecture 9
2.2 Signal processing within a neuron 10
2.3 Computational aspects in a feedforward neural network 13
2.4 Basic types of neural network learning mechanisms 16
2.5 Mapping a nonlinear function with a neural network 33
3.1 Reflux rate predictions from neural networks for lab column 40
3.2 Boilup rate predictions from neural networks for lab column 42
3.3 Reflux rate predictions from neural networks for high-purity column 44
3.4 Boilup rate predictions from neural networks for high-purity column 46
4.1 Schematic of a distillation column with details on the i'th stage 52
4.2 Open-loop response to boilup rate changes in lab column 59
4.3 Open-loop response to reflux rate changes in lab column 62
4.4 Open-loop response to feed flowrate changes in lab column 65
4.5 Open-loop response to feed composition changes in lab column 68
4.6 Open-loop response to boilup rate changes in high-purity column 72
4.7 Open-loop response to reflux rate changes in high-purity column 76
4.8 Open-loop response to feed flowrate changes in high-purity column 80
4.9 Open-loop response to feed composition changes in high-purity column 84
5.1 The neural network model-based control strategy 103
6.1 Neural network model-based controller without dynamic compensation on lab column 110
6.2 Static feedforward Pl-controller without dynamic compensation on lab column 116
6.3 Neural network model-based controller with dynamic compensation on lab column 120
6.4 Response of controlled variables to neural network model-based controller without dynamic compensation on lab column 123
6.5 Setpoint changes with neural network model-based controller without dynamic compensation on high-purity column 127
6.6 Setpoint changes with static feedforward PI controller without dynamic compensation on high-purity column 132
6.7 Feed flowrate changes with neural network model-based controller without dynamic compensation on high-purity column 134
6.8 Feed flowrate changes with static feedforward PI controller without dynamic compensation 138
6.9 Feed composition changes with neural network model-based controller without dynamic compensation on high-purity column 140
6.10 Feed composition changes with static feedforward PI controller without dynamic compensation 145
7.1 Steady-state process-model mismatch for the lab column 153
7.2 Steady-state process-model mismatch for the high-purity column 155
7.3 Steady-state process gains for the lab column 159
7.4 Steady-state process gains for the high-purity column 161
8.1 Proposed structure for constrained neural network model-based control 170
A.1 A 3-layered feedforward neural network 186
A.2 Processes in the j'th neuron in the hidden layer 187
A.3 Processes in the k'th neuron in the output layer 188
C.1 Vapor-Liquid equilibrium for methanol-water system at 1 atma 211
C.2 Saturated liquid temperature versus liquid-phase composition for methanol-water system at 1 atma 212
C.3 Saturated liquid density versus liquid-phase composition for methanol-water system at 1 atma 214
C.4 Saturated vapor density versus liquid-phase composition for methanol-water system at 1 atma 215
C.5 Subcooled liquid density versus liquid-phase composition for methanol-water system at 1 atma 216
C.6 Saturated liquid enthalpy versus liquid-phase composition for methanol-water system at 1 atma 217
C.7 Saturated vapor enthalpy versus vapor-phase composition for methanol-water system at 1 atma 219
C.8 Average molecular weight versus liquid-phase composition for methanol-water system at 1 atma 220
C.9 Vapor-Liquid equilibrium for 0.0 < x < 0.1 for methanol-water system at 2 atma 225
C.10 Vapor-Liquid equilibrium for 0.1 < x < 0.98 for methanol-water system at 2 atma 226
C.11 Vapor-Liquid equilibrium for 0.98 < x < 1.0 for methanol-water system at 2 atma 227
C.12 Saturated liquid temperature versus liquid-phase composition for methanol-water system at 2 atma 228
C.13 Saturated liquid density versus liquid-phase composition for methanol-water system at 2 atma 229
C.14 Saturated vapor density versus vapor-phase composition for methanol-water system at 2 atma 231
C.15 Subcooled liquid density versus liquid-phase composition for methanol-water system at 2 atma 232
C.16 Saturated liquid enthalpy versus liquid-phase composition for methanol-water system at 2 atma 233
C.17 Saturated vapor enthalpy versus vapor-phase composition for methanol-water system at 2 atma 234
C.18 Subcooled liquid enthalpy versus liquid-phase composition for methanol-water system at 2 atma 235
CHAPTER I
INTRODUCTION
1.1 Distillation Control and Its Importance
For many reasons, distillation remains the most important separation technique in
chemical process industries around the world and constitutes a significant fraction of their
capital investment. The operating costs of distillation columns are often a major part of
the total operating costs of many processes. Within the U.S., there are an estimated 40,000
columns, which consume approximately 3% of the total U.S. energy usage (Humphrey et
al., 1991). For these reasons, improved distillation control can have a significant impact
on reducing energy consumption, improving product quality, and protecting environmental
resources. However, distillation columns present challenging control problems because
their behavior is usually nonlinear, nonstationary, interactive, and their operation is often
subject to constraints and disturbances.
1.2 Basics of Distillation Control
A simple distillation column such as the one shown in Figure 1.1 will be used to
present some fundamental aspects of distillation control. Even though the principles may
seem quite obvious and trivial, oversights of the basic process behavior are often reasons
for poor control. The column shown in Figure 1.1 has a single feed, and produces two
products. Heat is added to the column in the reboiler and removed in the condenser.
Reflux is introduced on the top tray. One of the best sources for details on the operational
and design aspects of distillation columns is the text by Treybal (1980).
The degrees of freedom, from a process control standpoint, are the number of variables that
can be or must be controlled. Mathematically speaking, the degrees of freedom can be
calculated by subtracting the total number of independent equations from the total number
of variables.

[Figure 1.1. Schematic of a simple distillation column, showing the feed (entering at stage NF), the condenser, the overhead distillate product, the boilup, the reboiler, and the bottoms product.]

An easier approach, suggested by Luyben (1990), is to add
the total number of rationally placed control valves, where the term "rationally placed"
disqualifies poorly conceived designs that use two control valves in series, etc. In
Figure 1.1, there are five control valves, one on each of the following streams: distillate,
reflux, coolant, bottoms, and heating medium (usually steam). It is assumed that the feed
flowrate is set by the upstream process. Therefore, the simple distillation column has five
degrees of freedom. In any process, inventories, which include the liquid levels and the
pressures, must be controlled. Subtracting three variables from the five that must be
controlled gives two degrees of freedom. Therefore, there are two and only two
additional variables that can be controlled in this distillation column. Note that no
assumption has been made with regard to the number or the type of chemical components
being separated. So, irrespective of whether the distillation column is a simple binary
column or a complex multicomponent column, it has only two degrees of freedom. Of
course, this is true only for systems under unconstrained operating conditions.
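The valve-counting bookkeeping above can be sketched in a few lines. This is a modern Python illustration, not code from this dissertation; the stream and loop names are taken from the Figure 1.1 discussion.

```python
# Luyben's rule of thumb: the degrees of freedom equal the number of
# rationally placed control valves. Inventories (levels and pressure)
# must be controlled first; what remains is available for composition
# or temperature control.
control_valves = ["distillate", "reflux", "coolant", "bottoms", "heating medium"]
inventory_loops = ["reflux-drum level", "base level", "column pressure"]

degrees_of_freedom = len(control_valves)               # 5 for Figure 1.1
remaining = degrees_of_freedom - len(inventory_loops)  # left for quality control
print(remaining)  # -> 2
```

The count is independent of the number of components being separated, which is exactly the point made in the text.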
The two variables that are chosen as controlled variables depend on many factors.
Some common situations are:
(1) Control the composition of the light-key impurity in the bottom product and the
composition of the heavy-key impurity in the overhead product (distillate).
(2) Control a temperature in the rectifying section (the section above the feed tray) of
the column and a temperature in the stripping section (the section below the feed tray) of
the column.
(3) Control the reflux flowrate and a temperature somewhere in the column.
(4) Control the flowrate of the heating medium to the reboiler and a temperature near
the top of the column.
(5) Control the reflux ratio (ratio of the reflux flowrate to the distillate flowrate) and
a temperature in the column.
The above discussion shows that (a) only two things can be controlled, and (b)
normally at least one composition (or one temperature) somewhere in the column must be
controlled. Once the five variables that must be controlled are specified (e.g., two
compositions, two levels, and pressure), we still have the task of deciding the choice of
controlled variable-manipulated variable pairing. The "pairing" problem is known as
determining the structure of the control system. Volumes have been written addressing
the issue of the distillation control structure (Luyben, 1992; Skogestad et al., 1990;
McAvoy, 1983). The pairing issue is of extreme importance because of the highly coupled
nature of the interactions between the controlled variables and the manipulated variables.
As a result, simple single-loop control using conventional Proportional-Integral-Derivative
(PID) controllers causes the control loops to interact, leading to deterioration in control
performance. This is especially true if the objective is to control the two compositions at
both ends of the column using the reflux and steam flowrates as the manipulated variables
(Wood and Berry, 1973).
Accordingly, much research and development, in both the private and public sectors, has
focused on control methods that use modern computing power to cope with these control-
related difficulties (Luyben, 1992). Advanced control techniques require either the use of
conventional PID controllers with complex configurations and elaborate tuning procedures
(Papastathapoulou and Luyben, 1991; Ding and Luyben, 1990; Muhrer et al., 1990; Finco
et al., 1989; Elaahi and Luyben, 1985; Tyreus and Luyben, 1976) or the use of nonlinear
multi-variable models (Hokanson and Grestle, 1992; Pandit et al., 1992; Rheil, 1992;
Patwardhan et al., 1990; Riggs, 1990) for efficient control of distillation columns.
Comprehensive reviews on the use of nonlinear multi-variable models for advanced
process control strategies can be found in the papers by Bequette (1990), Bosley et al.
(1992), and Seborg et al. (1986).
1.3 Neural Networks and Its Relevance to Process Control
The nonlinear models used in the nonlinear multi-variable control strategies
generally tend to become rigorous and computationally intensive as the process behavior
becomes complex. While control success has been demonstrated (Riggs et al., 1993;
Pandit et al., 1992; Pandit and Rhinehart, 1992; Cott et al., 1985), it is often at the
expense of computational power, operator-friendly interaction, or ease of controller
development and maintenance.
Neural networks offer an alternate approach to modeling process behavior as they do
not require a priori knowledge of the process phenomena. They "learn" by extracting
pre-existing patterns from data that describe the relationship between the inputs and the
outputs of any given process phenomenon. When appropriate inputs are applied to the
network, the network acquires "knowledge" from the environment in a process known as
"learning." As a result, the network assimilates information that can be recalled later.
Neural networks are capable of handling complex and nonlinear problems, processing
information rapidly, and reducing the engineering effort required in model development.
Neural networks have been applied successfully to a variety of problems, such as
process fault diagnosis (Venkatasubramanian et al., 1990; Venkatasubramanian and Chan,
1989), modeling of semiconductor manufacturing processes (Himmel and May, 1993;
Reitman and Lory, 1993), system identification (MacMurray and Himmelblau, 1993;
Pottman and Seborg, 1992; Narendra and Parthasarathy, 1990), pattern recognition and
adaptive control (Hinde and Cooper, 1993; Cooper et al., 1992a,b; Cooper et al., 1990),
process modeling and control (You and Nikolaou, 1993; Nahas et al., 1992; Psichogios
and Unger, 1992; Bhat and McAvoy, 1990; Bhat et al., 1990; Narendra and Parthasarathy,
1990; Guez et al., 1988), and statistical time series modeling (Poll and Jones, 1994;
Weigand et al., 1990). In the area of distillation control, neural networks have found
application in identification and control of a packed distillation column (MacMurray and
Himmelblau, 1993) where a neural network model was used as the model in model
predictive control. Neural network control of distillation in a multi-variable model
predictive control framework also includes studies on dynamic simulations (Willis et al.,
1990) and pilot plants (Megan and Cooper, 1993; Willis et al., 1992). The papers by
Thibault and Grandjean (1992) and Astrom and McAvoy (1992) provide in-depth reviews
on neural network applications in chemical process control.
CHAPTER II
NEURAL NETWORKS
Comparison of neural networks to conventional data processing and expert systems
allows a better understanding of the technology and its applications. Conventional
processing techniques apply explicit procedures or steps to numerical data in order to
arrive at an output. Expert systems, on the other hand, use logical facts as input and
employ a set of explicitly specified rules in a knowledge base to arrive at a decision. In
contrast, neural networks use no explicitly specified knowledge or procedure to analyze
new data. They extract pre-existing patterns from statistically-based data. When
appropriate inputs are applied to the network, the network acquires "knowledge" from the
environment in a process known as "learning." As a result, the network assimilates
information that can be recalled later. The above statements do not imply that the person
using neural networks for process modeling can entirely ignore understanding the process
behavior and its interactions. It simply means that the neural network does not need to
understand the underlying phenomena of the process being modeled.
There are several different types of neural networks that are being studied and/or
used in applications. Some of the network models used commonly in process control
systems are: multi-layer perceptron, Kohonen's self-organizing map, adaptive resonance
network, the Hopfield network, the Boltzmann machine, and the cerebellar model
articulation controller (CMAC). Some of the above mentioned networks are called
feedforward networks and, others, feedback (or recurrent) networks, based on whether
the information flow occurs only in the forward direction or in both forward and
backward directions. Neural networks are also further classified as either static or
dynamic systems, based on whether the stored mapping can be recalled instantly or
involves some delay or time-domain characteristics. Each type of neural network, with its
specific differences in internal structure and function, has specific uses and advantages in
control applications. In this study, our focus is restricted to only feedforward neural
networks. Details on the structure and applications of some of the other types of neural
networks can be found elsewhere (Astrom and McAvoy, 1992; Zurada, 1992; Bhat and
McAvoy, 1990).
2.1. Feedforward Neural Networks
A network is a dense mesh of nodes and connections. The basic processing element
of a neural network is the neuron. The neurons operate collectively and simultaneously on
most or all data. Figure 2.1 illustrates the general structure of a feedforward network in
which the information flow occurs only in the forward direction, i.e., from the input to the
output. Feedforward neural networks are organized in layers and, typically, consist of at
least three layers: an input layer; one or more hidden layers; and an output layer. In
addition, there may be a bias neuron that provides a constant and invariant output, say +1
or -1. The connections are the means for information flow. Each connection has an
associated weight, w, which is expressed by a numerical value that can be modified. The
weight is a measure of the connection strength between two neurons. Each neuron in the
hidden and output layers accepts information from other neurons in the network and
performs a specific computational task. Inputs to any neuron in the network are first
multiplied by the corresponding connection weight. The weighted inputs are then summed
up. The sum of the weighted inputs is finally modified by a transfer function to obtain
an output from the neuron. Information flows from the input layer to the output layer of
the neural network with the above mentioned processes occurring in each neuron in the
hidden and output layers until a network output is obtained. Figure 2.2 presents a graphical
picture of the processes occurring in each neuron in the hidden and output layers, with y
being the transformed output from the neuron, and z being the weighted sum of all inputs,
x, to the neuron.

[Figure 2.1. Feedforward neural network architecture: network inputs enter the input layer and flow through weighted connections to the neurons of the hidden layer and then the output layer, producing the network output.]

[Figure 2.2. Signal processing within a neuron: inputs x from other neurons are multiplied by their weights w, summed, and transformed to produce the output from the neuron.]
The transfer function is also known as a mapping function or an activation function.
In early neural network models (Rosenblatt, 1958), the transfer functions were discrete
and discontinuous functions with binary outputs. This led to the conclusion that layered
neural networks had limited potential in solving more complex problems (Minsky and
Papert, 1969). It was not until the mid-1980s that continuous differentiable transfer
functions came into use (Hopfield, 1984, 1982). It was then shown that a continuous-
valued neural network with a continuous differentiable nonlinear transfer function can
approximate any continuous function arbitrarily well in a compact set (Cybenko, 1988).
Cybenko (1989) also demonstrated that any arbitrary decision region can be well
approximated by a continuous neural network with only one single internal hidden layer
and any continuous sigmoidal transfer function. Typically, transfer functions are
sigmoidal (S-shaped), and can be either unipolar (output range from 0 to 1) or bipolar
(output range from -1 to +1) functions. The transfer functions can also be linear, and can
consist of algebraic or differential equations. In a discrete transfer function neuron, the
bias provides a threshold limit that triggers the neuron. In a continuous-valued transfer
function, its role is not quite clear, and many feedforward network formulations do not
have the additional bias neuron. In this research work, a hyperbolic tangent (y = tanh(z)),
which is a bipolar, continuously differentiable function, was used as the transfer function
for all neurons in the hidden and output layers. Also, a bias neuron with a constant output
of +1 was used. In our experience, the hyperbolic tangent performs extremely well in
mapping the input-output relationships of many complex processes. More details on the
types of transfer functions used in neural networks can be found elsewhere (Zurada,
1992).
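The two sigmoidal forms mentioned above differ only in their output range. A brief sketch (a modern Python illustration, not part of the original FORTRAN work):

```python
import math

def unipolar(z):
    # Logistic sigmoid: S-shaped, output range (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def bipolar(z):
    # Hyperbolic tangent: S-shaped, output range (-1, +1); this is the
    # transfer function used for all hidden and output neurons in this work
    return math.tanh(z)

print(bipolar(0.0), unipolar(0.0))  # -> 0.0 0.5
```

Both functions are continuously differentiable, which is what makes gradient-based training such as backpropagation possible.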
2.2. Computational Aspects in a Feedforward Neural Network
It is important to understand the functioning of a feedforward neural network from a
mathematical standpoint in order to understand how a neural network can be trained to
"learn" a particular process phenomenon. Figure 2.3 shows a feedforward neural network
similar to that in Figure 2.1 but with more details to allow a mathematical treatment.
Let x1 and x2 be two inputs to the network. Assume that they represent the
normalized values of some "real-world" data, scaled to a range of ±1. Let the bias neuron
have a constant output of +1, i.e., b = +1. Let us number the weights in the following
sequence: w1 is the weight for the connection between the bias node and the first node in
the hidden layer; w2 is the weight for the connection between the first node in the input
layer and the first node in the hidden layer; w3 is the weight for the connection between the
second node in the input layer and the first node in the hidden layer. Start again with the
bias node and connect the second node in the hidden layer, and so on. After all the
neurons in the hidden layer have been connected, start with the bias node and connect all
the neurons in the output layer. Figure 2.3 shows all the weights, 13 in total, for the
two-input, three-hidden-node, one-output (abbreviated as 2-3-1) feedforward neural network.
The general formula that defines the total number of weights in a feedforward network
with a bias neuron is

k = (n_in + 1)·n_hid + (n_hid + 1)·n_out,    (2.1)

where k is the total number of weights in the neural network, n_in is the number of inputs to
the network, n_hid is the number of neurons in the hidden layer, and n_out is the number of
outputs from the network.
The numbering scheme chosen herein is just one of several in common use. The
scheme shown here translates into an efficient algorithm that can be easily
coded in any computer programming language. Some algorithms use three subscripts to
Figure 2.3. Computational aspects in a feedforward neural network
denote each weight, w_kij, which represents the connection weight between the i'th node in
the (k−1)'th layer and the j'th node in the k'th layer in a multi-layered feedforward network.
If x_1 and x_2 are the two inputs to the network, then z_i, the summed input to the i'th
node in the hidden layer, is calculated as

z_1 = b·w_1 + x_1·w_2 + x_2·w_3,
z_2 = b·w_4 + x_1·w_5 + x_2·w_6, and
z_3 = b·w_7 + x_1·w_8 + x_2·w_9. (2.2)
The summed input to each neuron is then transformed by the transfer function, the
hyperbolic tangent in this case, to produce the transformed output. If h_j is the transformed
output of the j'th node in the hidden layer, then

h_1 = tanh(z_1),
h_2 = tanh(z_2), and
h_3 = tanh(z_3). (2.3)
If o_i is the summed input to the i'th node in the output layer, then

o_1 = b·w_10 + h_1·w_11 + h_2·w_12 + h_3·w_13, (2.4)

and the transformed output, y_1, which also happens to be the network output, will be

y_1 = tanh(o_1). (2.5)
The above scheme describes the mathematical functioning of a feedforward neural
network, and can easily be extended to any number of nodes in the input, hidden and
output layers. The above discussion also shows that a feedforward neural network is a
static system because once trained, the recall is instantaneous and the output is obtained in
one single-pass through the network. On the other hand, dynamic systems require
iterations with time before an output can be obtained.
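As an illustration (a sketch added here, not part of the network programs used in this work), the single-pass recall of Equations 2.2-2.5 can be written in a few lines of Python; the weight ordering follows the numbering scheme above:

```python
import numpy as np

def forward_2_3_1(x1, x2, w):
    """Single-pass recall through the 2-3-1 network of Figure 2.3.
    w holds the 13 weights numbered as in the text; the bias neuron
    has a constant output of b = +1."""
    b = 1.0
    # Summed inputs to the three hidden nodes (Equation 2.2)
    z1 = b * w[0] + x1 * w[1] + x2 * w[2]
    z2 = b * w[3] + x1 * w[4] + x2 * w[5]
    z3 = b * w[6] + x1 * w[7] + x2 * w[8]
    # Hyperbolic tangent transfer function (Equation 2.3)
    h1, h2, h3 = np.tanh(z1), np.tanh(z2), np.tanh(z3)
    # Summed input to the output node and network output (Equations 2.4, 2.5)
    o1 = b * w[9] + h1 * w[10] + h2 * w[11] + h3 * w[12]
    return np.tanh(o1)
```

Because the recall is one pass, the trained network acts as a static map from (x_1, x_2) to y_1, as noted above.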
2.3. Training of Neural Networks
Neural network model development involves training the neural network to "learn"
the input-output mapping from a set of examples. In general, learning can be defined as a
permanent change in behavior brought about by experience. In human beings, learning is
an inferred process; it can be assumed to have occurred by observing changes in
performance. In neural networks, learning is a more direct process, and typically, it can be
captured by distinct cause-effect relationships. For a neural network to perform any of the
tasks mentioned earlier, it has to learn the input-output mapping from a set of examples.
The process of learning corresponds to changes in the weights.
Training of a neural network can be either supervised or unsupervised. Figure 2.4
gives a graphical picture of the two basic learning modes. In supervised learning, the error
between the actual, K, and desired, D, responses is used to correct the network parameters
(the weights) externally so that the error decreases. Therefore, a set of input and output
patterns called a training set is required for supervised learning. In many situations, the
inputs, outputs, and the computed gradients are subject to random fluctuations, and the
minimization must proceed over those fluctuations. As a result, most supervised learning
algorithms reduce to stochastic minimization of error in a multi-dimensional weight space.
In unsupervised learning, the desired response is not known; therefore, explicit error
information cannot be used to improve network behavior. Suitable weight self-adaptation
mechanisms have to be embedded in the network. The topic of unsupervised learning is
an area of active research interest in pattern recognition where it is used to perform
clustering of objects when there is no a priori information available about the classes.
More details on unsupervised learning can be found in the text by Zurada (1992).
An important part of neural network training is the learning rule. Learning rules are
algorithms that govern the modification of the internal representation (the weights) of the
Figure 2.4. Basic types of neural network learning mechanisms: (a) supervised learning, in which a teacher compares the actual network output, Y, with the desired output, D, for a given network input, X; (b) unsupervised learning, in which only the network input, X, and actual output, Y, are available.
network in response to the inputs and the transfer function. The learning rules fall into
seven basic categories: Hebbian Learning Rule (Hebb, 1949); Perceptron Learning Rule
(Rosenblatt, 1958); Delta Learning Rule (McClelland and Rumelhart, 1986); Widrow-
Hoff Learning Rule (Widrow, 1962); Correlation Learning Rule; Winner-Take-All
Learning Rule (Hecht-Nielsen, 1987); and Outstar Learning Rule (Grossberg, 1982, 1977).
Details regarding how each rule accomplishes the weight adjustment, the learning type
(supervised or unsupervised) involved, and the characteristics of the neurons for these
different learning rules are discussed in the text by Zurada (1992).
2.4. Backpropagation Training Algorithm
A popular training algorithm for multi-layered feedforward networks is called the
error backpropagation training algorithm or, simply, backpropagation. Backpropagation
uses the Generalized Delta Learning Rule (Rumelhart et al., 1986; Werbos, 1974), and
has been used extensively by researchers for network training. The algorithm is so named
because even though the network operates in the forward manner (i.e., from input to
output) during the classification stage, the weight adjustments enforced by the learning
rules propagate exactly backward, from the output layer toward the input layer. Classical
backpropagation is a gradient approach to optimization, executed iteratively, in which the
distance moved along the search direction in weight space is implicitly bounded by the
learning rate, which is equivalent to a step size.
The problem of learning an input-output mapping from a set of P examples can be
transformed into the minimization of a suitably defined error function. Although different
definitions of the error have been used, we will consider the "traditional" sum-of-squared-
differences error function defined as

E = (1/2) Σ_{p=1}^{P} E_p = (1/2) Σ_{p=1}^{P} Σ_{i=1}^{n_out} (D_pi − Y_pi)², (2.6)
where E is the total sum-of-squared error for all P patterns, E_p is the sum-of-squared error
for the p'th pattern, and D_pi and Y_pi are the desired and network-predicted responses for
the i'th output of the p'th pattern. The backpropagation algorithm is composed of two
stages. In the first, the contributions to the gradient coming from each pattern (∂E_p/∂w_ij)
are calculated by "backpropagating" the error signal. The partial contributions are then used
to correct the weights after every pattern presentation. If δ_k, the gradient of the error
function, is defined as

δ_k = ∇E(W_k), (2.7)

where W is the weights matrix, the weight update is given as

ΔW_k = −η·δ_k, (2.8)

where η is the learning rate.
The backpropagation algorithm defines an important step in the history of neural
networks and understanding the mechanisms of weight adjustment using backpropagation
leads to a better appreciation of the general procedure of training neural networks. The
backpropagation algorithm is presented in detail in Appendix A.
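As an illustrative sketch of Equations 2.6-2.8 for the 2-3-1 network, the gradient is approximated below by finite differences purely for brevity; backpropagation proper obtains it analytically by propagating error signals backward, as detailed in Appendix A. The example patterns are arbitrary:

```python
import numpy as np

def sse(w, patterns):
    """Total sum-of-squared error over all patterns (Equation 2.6)
    for the 2-3-1 network of Figure 2.3."""
    total = 0.0
    for (x1, x2), d in patterns:
        b = 1.0
        h = np.tanh([b * w[0] + x1 * w[1] + x2 * w[2],
                     b * w[3] + x1 * w[4] + x2 * w[5],
                     b * w[6] + x1 * w[7] + x2 * w[8]])
        y = np.tanh(b * w[9] + h[0] * w[10] + h[1] * w[11] + h[2] * w[12])
        total += 0.5 * (d - y) ** 2
    return total

def train_step(w, patterns, eta=0.1, eps=1e-6):
    """One weight update per Equations 2.7-2.8: delta_k = grad E(W_k),
    Delta W_k = -eta * delta_k.  The gradient is estimated by forward
    differences here; backpropagation computes it analytically."""
    grad = np.zeros_like(w)
    e0 = sse(w, patterns)
    for i in range(len(w)):
        wp = w.copy()
        wp[i] += eps
        grad[i] = (sse(wp, patterns) - e0) / eps
    return w - eta * grad
```

Repeated calls to train_step drive the sum-of-squared error downhill, which is all that Equation 2.8 asserts.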
2.5. Analysis of the Backpropagation Algorithm
The essence of the backpropagation algorithm is the evaluation of the contribution of
each particular weight to the output error. This evaluation is possible because the
objective function of a neural network is composed of continuously differentiable functions
of the weights. It would not have been possible with discrete neurons.
Even though the backpropagation algorithm is a breakthrough in supervised learning
of layered neural networks, in practice its implementation may
encounter several difficulties. These difficulties are typical of those arising in
multi-dimensional optimization approaches. Backpropagation has the advantage of being readily
adaptable to parallel hardware architectures. However, most current studies of artificial
neural networks (ANNs) are conducted primarily on serial rather than parallel processors.
On serial machines, backpropagation can be very inefficient because the choice of initial
weights affects the algorithm severely in terms of its convergence and the rate at which it
converges (Kramer and Leonard, 1990).
One of the problems is that the error minimization procedure may get "hung up" in a
local minimum of the error function. The on-line procedure has to be used if all the
patterns are not available before learning starts, and a continuous adaptation to a stream of
input-output signals is desired. One of the reasons in favor of the on-line approach is that
it possesses some randomness that may help in escaping from a local minimum. The
objection to this is that the method may, for the same reason, miss a good local minimum.
The on-line update is useful when the number of patterns is so large that the errors
involved in the computation of the total gradient may be comparable to the gradient itself.
The fact that many patterns possess redundant information has been cited as an argument
in favor of on-line backpropagation, because many of the contributions to the gradient are
similar, so that waiting for all contributions before updating can be wasteful (Le Cun,
1986). This is especially true in pattern classification.
The effectiveness and convergence of the backpropagation algorithm depend
significantly on the learning constant, η. In general, however, the optimum value of η
depends on the problem being solved, and there is no single learning constant value
suitable for different training cases. This is a problem common to all gradient-based
optimization techniques. While gradient descent can be an efficient method for obtaining
the weight values that minimize an error, error surfaces frequently possess properties that
make the procedure slow to converge. In the original formulation, the learning constant,
η, was taken as a fixed parameter. Unfortunately, if the learning constant is fixed in an
arbitrary way, there is no guarantee that the network will converge. But even if η is
chosen appropriately so that the error decreases with a reasonable speed and oscillations
are avoided, gradient descent is not always the fastest method to employ.
In practice quite a few simple improvements have been used to speed up convergence
and improve the robustness of the backpropagation algorithm (Kramer and Leonard,
1990; Leonard and Kramer, 1990a,b; Hush and Salas, 1988). One such improvement is
the addition of a momentum term that helps to accelerate the convergence by
supplementing the current weight adjustment with a fraction of the most recent weight
adjustment. The weight update can be written as

ΔW_k = −η·δ_k + α·ΔW_{k−1}, (2.9)

where k and k−1 indicate the current and the most recent training step,
respectively, and α is a user-selected positive momentum constant. The second term in
Equation 2.9 is called the momentum term and, typically, α is chosen between 0.1 and 0.8.
The momentum term keeps the direction of the descent from changing too rapidly from
step to step. It has been shown that inclusion of the momentum term can considerably
speed up convergence, when comparable η and α are employed, compared with the
standard backpropagation technique.
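A sketch of the update of Equation 2.9, applied for illustration to a one-variable quadratic error surface (the test function and constants below are chosen arbitrarily):

```python
def momentum_step(w, grad, prev_dw, eta=0.1, alpha=0.5):
    """Equation 2.9: Delta W_k = -eta * delta_k + alpha * Delta W_{k-1}.
    The momentum constant alpha (typically 0.1 to 0.8) blends in the most
    recent adjustment so the descent direction does not change too rapidly."""
    dw = -eta * grad + alpha * prev_dw
    return w + dw, dw

# Example: descending the error surface E(w) = w^2, whose gradient is 2w
w, dw = 5.0, 0.0
for _ in range(50):
    w, dw = momentum_step(w, 2.0 * w, dw)
```

After the loop, w has been driven close to the minimum at zero; the previous adjustment dw must be carried between steps, which is the only bookkeeping the momentum term adds.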
If all the patterns are available, collecting the total gradient before deciding the next
step can be useful in order to avoid mutual interference of the weight changes. This
strategy is commonly known as batching and has been used with the standard
backpropagation algorithm (Rumelhart et al., 1986). Even so, the batch technique still
requires that the user select the values for the learning and momentum constants.
2.6. Optimization Approach to Neural Network Training
One of the competitive advantages of neural networks is the ease with which they
may be applied to novel or poorly understood problems. It is, therefore, essential to
consider automated and robust learning methods with good average performance on many
classes of problems. The current trend is to use optimization tools and strategies that exhibit
distinctly superior performance (Peel et al., 1992; Barnard, 1992; Battiti, 1992; Hsiung et
al., 1991) and, furthermore, are easier to apply because they do not require the choice of
critical parameters (such as η and α) by the user. Several researchers (Kramer and
Leonard, 1990; Kollias and Anastassiou, 1988; Kung and Hwang, 1988; Ricotti et al.,
1988; Parker, 1987; Watrous, 1987; White, 1987) have shown that optimization
algorithms employing modern unconstrained optimization techniques based on the secant
or conjugate gradient methods together with the backpropagation concept are much
better than classical backpropagation itself.
2.6.1. Conjugate Gradient Methods
One of the difficulties in using the steepest descent method is that a one-dimensional
minimization in some arbitrary direction a followed by a minimization in another direction
b does not imply that the function is minimized on the subspace generated by a and b.
Minimization along direction b may in general spoil a previous minimization along
direction a (this is why the one-dimensional minimization in general has to be repeated a
number of times larger than the number of variables). On the contrary, if the directions
were noninterfering and linearly independent, at the end of N steps the process would
converge to the minimum of the quadratic function. The concept of noninterfering
(conjugate) directions is the basis of the conjugate gradient method for minimization. A
major difficulty with the above form is that, for a general function, the obtained directions
are not necessarily descent directions and numerical instability can result.
The use of the momentum term to avoid oscillations in the backpropagation method
(Rumelhart et al., 1986) can be considered an approximate form of conjugate gradient.
In both cases, the gradient direction is modified with a term that takes the previous
direction into account, the important difference being that the parameter in the conjugate
gradient technique is automatically defined by the algorithm, while the momentum rate has
to be "guessed" by the user. More details on the conjugate gradient method are found
elsewhere (Press et al., 1992; Battiti, 1992; Leonard and Kramer, 1990a). Conjugate
gradient methods have been used by Barnard (1992) and Leonard and Kramer (1990a) for
training feedforward neural networks.
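The idea can be illustrated with the classical linear conjugate gradient iteration for a quadratic function f(x) = (1/2)·x·A·x − b·x, using the Fletcher-Reeves formula for the direction parameter (a sketch added for illustration; the particular matrix and vector in the usage example are arbitrary):

```python
import numpy as np

def conjugate_gradient(A, b, x0, n_steps):
    """Minimize the quadratic f(x) = 0.5*x.A.x - b.x by successive exact
    line minimizations along mutually conjugate (noninterfering)
    directions.  For an N-variable quadratic with symmetric positive
    definite A, N steps reach the minimum exactly; n_steps should not
    exceed the dimension of x."""
    x = np.asarray(x0, dtype=float)
    r = b - A @ x          # negative gradient at x
    d = r.copy()           # first direction: steepest descent
    for _ in range(n_steps):
        alpha = (r @ r) / (d @ (A @ d))   # exact line minimization
        x = x + alpha * d
        r_new = r - alpha * (A @ d)
        beta = (r_new @ r_new) / (r @ r)  # Fletcher-Reeves parameter,
        d = r_new + beta * d              # set by the algorithm itself
        r = r_new
    return x
```

Note that beta, the analogue of the momentum rate, is computed automatically at every step rather than guessed by the user.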
2.6.2. Newton's Method
Newton's method can be considered as the basic local method using second-order
information. It is important to stress that its practical applicability to multi-layered neural
networks is hampered by the fact that it requires calculation of the Hessian matrix, a
complex and expensive task. If the Hessian matrix (∇²E) is positive definite and the
quadratic model is correct, one iteration is sufficient to reach the minimum. Assuming
that the Hessian can be obtained in reasonable computing times, the main practical
difficulties in applying the "pure" Newton's method arise when the Hessian is not positive
definite, or when it is singular and ill-conditioned. It is worth observing that, although
troublesome for the above reasons, the existence of the directions of negative curvature
may be used to continue from the saddle point where the gradient is close to zero. Battiti
(1992) has reviewed in detail Newton's method and some of its modifications to deal with
global convergence, indefinite Hessians, and iterative approximations for the Hessian itself.
Modifications of Newton's method have been used by Poli and Jones (1994) and White
(1989) for training feedforward neural networks.
2.6.3. Secant Methods
When the Hessian is not available analytically, secant methods are widely used
techniques for approximating the Hessian in an iterative way using information only about
the gradient. Historically, these methods are also called quasi-Newton methods. The
suggested strategy is to update a previously available approximation instead of
determining a new approximation. The Broyden-Fletcher-Goldfarb-Shanno (BFGS)
update (Broyden et al., 1973), a positive definite secant update, has been the most
successful update in a number of studies performed over the years. Secant methods for
learning in multi-layer neural networks have been used, for example, by Watrous (1987).
The O(N²) complexity of BFGS is clearly a problem for very large networks, but the
method can still remain very competitive if the number of examples is very large, so that
the computation of the error function dominates. Hsiung et al. (1991) and Parker (1987)
have used modifications of the secant method for training of feedforward neural networks.
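The heart of these methods, the BFGS rank-two update of the Hessian approximation, can be sketched as follows (an illustration; B, s, and y are names chosen here for the current approximation, the step taken, and the gradient difference, and the matrices in the usage example are arbitrary):

```python
import numpy as np

def bfgs_update(B, s, y):
    """One BFGS secant update of the Hessian approximation B, built from
    gradient information only: s = x_new - x_old, y = grad_new - grad_old.
    The update satisfies the secant equation B_new @ s = y, and B_new
    stays symmetric positive definite whenever s @ y > 0."""
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)
```

Each update costs O(N²) work and storage for an N-parameter problem, which is the complexity figure quoted above.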
2.6.4. Special Method for Least Squares
One drawback of the BFGS method is that it requires storage for a matrix of size N x
N and a number of calculations of order O(N²). While available storage is less of a
problem now than it was a decade ago, the computational problem still exists when N
becomes of the order of one hundred or more. It has also been shown that it is possible to
use a secant approximation with O(N) computing and storage cost that uses second-order
information, in methods that are known as one-step secant (OSS) methods (Battiti, 1989).
But, if the error function that is to be minimized is the usual function described in
Equation 2.6, learning a set of examples is reduced to solving a nonlinear least-squares
problem, for which special methods have been devised. The question then is what makes
this problem different from that dealt with in the general nonlinear function minimization of
Sections 2.6.1-2.6.3? In a broad sense, it is not different at all!
Let us consider a model that depends nonlinearly on a set of N unknown parameters
a_k, k = 1, 2, ..., N. Using standard regression analysis notation to define an objective
function φ, we wish to determine best-fit parameters by minimizing the function φ. The
minimization has to proceed iteratively because of the nonlinear functional dependencies.
Given initial estimates for the parameters, a procedure that improves the initial solution
can be developed. The procedure is repeated until φ stops, or effectively stops, decreasing.
Sufficiently close to the minimum, the φ function can be expected to be well approximated
by a quadratic form, which can be written as

φ(a) ≈ γ − d·a + (1/2)·a·D·a, (2.10)

where γ is a constant, d is an N-vector of gradients, a is the N-vector of parameters, and
D is the N x N Hessian matrix (Press et al., 1992). If the approximation is a good one, the
new estimates, a_next, can be determined from the current estimates, a_cur, in a single step,
from the following relationship

a_next = a_cur + D⁻¹·[−∇φ(a_cur)]. (2.11)
On the other hand, Equation 2.10 could be a poor local approximation to the shape of
the function that we are trying to minimize at a_cur. In that case, using the steepest descent
method, we can take a step down the gradient, i.e.,

a_next = a_cur − c·∇φ(a_cur), (2.12)

where the constant c is small enough not to exhaust the downhill direction.
It is imperative that the gradient of the objective function φ be computed at any set of
parameters a in order to use either Equation 2.11 or 2.12. In addition, the matrix D,
which is the second derivative matrix (the Hessian) of the φ function, is also needed in
order to use Equation 2.11. The crucial difference between the second-order methods
discussed in Sections 2.6.1-2.6.3 and the method discussed here is that there was no way
of directly evaluating the Hessian matrix in the second-order methods. One could only
evaluate the function to be minimized and (in some cases) its gradient. Therefore, iterative
methods are required not just because the function is nonlinear, but also in order to
generate information about the Hessian matrix. In the present method, the form of φ is
known exactly, since it is based on a user-specified function. Therefore, the Hessian
matrix is known, and Equation 2.11 can be used whenever needed. Equation 2.12 will be
used whenever Equation 2.11 fails to improve the fit, signaling failure of Equation 2.10 as
a good local approximation. More details on the least-squares method are available in the
text by Press et al. (1992).
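The two-step logic of Equations 2.11 and 2.12 can be sketched as follows (an illustration added here; the quadratic test function in the usage example, and the fallback constant c, are arbitrary):

```python
import numpy as np

def next_estimate(a_cur, grad, hessian, c=0.01):
    """Try the full quadratic-model step of Equation 2.11,
    a_next = a_cur + D^{-1}(-grad); fall back to the steepest-descent
    step of Equation 2.12, a_next = a_cur - c*grad, when the Hessian
    cannot be inverted."""
    try:
        return a_cur - np.linalg.solve(hessian, grad)
    except np.linalg.LinAlgError:
        return a_cur - c * grad

# For a quadratic phi(a) = 0.5*a.D.a - d.a, the gradient at a is D@a - d,
# and a single Newton step lands exactly on the minimum D^{-1} d.
D = np.array([[2.0, 0.0], [0.0, 4.0]])
d = np.array([2.0, 4.0])
a0 = np.array([5.0, -3.0])
a1 = next_estimate(a0, D @ a0 - d, D)
```

For this quadratic the minimum is at (1, 1), and a1 reaches it in one step, which is exactly the claim made for Equation 2.11 when the quadratic model is correct.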
2.7. The Levenberg-Marquardt Method for Nonlinear Least Squares
The general strategy for supervised learning of an input-output mapping is based on
combining a quickly convergent local method with a globally convergent one. We use the
Levenberg-Marquardt method (also known as the Marquardt method) (Marquardt, 1963)
for solving the nonlinear least-squares problem. The Marquardt method is a trust-region
modification of a Gauss-Newton method. Line-search optimization techniques are based
on finding a search direction and moving by an acceptable amount in that direction (step-
length based methods). While in line-search algorithms the direction is maintained and
only the step length is changed, there are alternative strategies based on choosing a step
length first, and then using a full quadratic model to determine the appropriate direction.
These methods are called model-trust-region methods, with the idea that the model is
trusted only within a region that is updated using the experience accumulated during the
search process.
The Marquardt method switches smoothly between the extremes of the Gauss-
Newton method (inverse-Hessian method based on simplifying the computation of the
second derivatives) and the steepest descent or gradient method (model-trust-region
modification). The latter is used far from the minimum, switching continuously to the
former as the minimum is approached. The algorithm for the Marquardt method is
presented in detail in the original paper by Marquardt (1963) and in the text by Press et al.
(1992). The algorithm is presented here in a simple manner for the sake of clarity.
Let us consider a set of n equations in m unknown variables of the form

f_1(x_1, x_2, ..., x_m) = y_1,
f_2(x_1, x_2, ..., x_m) = y_2,
...
f_n(x_1, x_2, ..., x_m) = y_n, (2.13)

where x_j are the unknown variables, y_i are the known values, and f_i are the known
functions. The algorithm seeks to find a set of x that will minimize a user-defined
function, such as the sum of squares error, φ, given by

φ = Σ_{i=1}^{n} (f_i − y_i)². (2.14)

The functional forms of f_i are assumed to be known and the y_i are constant. The
gradient of φ with respect to the parameters x has components

∂φ/∂x_k = 2 Σ_{i=1}^{n} (f_i − y_i)·(∂f_i/∂x_k), (2.15)

where k = 1, 2, ..., m. The gradient is zero at the φ minimum.

Taking an additional partial derivative gives

∂²φ/∂x_k∂x_l = 2 Σ_{i=1}^{n} [(∂f_i/∂x_k)·(∂f_i/∂x_l) + (f_i − y_i)·(∂²f_i/∂x_k∂x_l)]. (2.16)

Removing the factors of 2 by defining

β_k = −(1/2)·∂φ/∂x_k

and

α_kl = (1/2)·∂²φ/∂x_k∂x_l,

and making A = D/2 in Equation 2.11, Equation 2.11 can now be rewritten as the set of
linear equations

Σ_{l=1}^{m} α_kl·δx_l = β_k. (2.17)

This set can be solved for the increments δx_l that, when added to the current estimates,
give the next estimates. In the context of least-squares, the matrix A, equal to one-half
times the Hessian matrix, is usually called the curvature matrix.

The steepest descent formula, given in Equation 2.12, translates to

δx_l = c·β_l. (2.18)
Note that the components α_kl of the Hessian matrix A (Equation 2.16) depend on
both the first and second derivatives of the basis functions with respect to their
parameters. The second derivatives occur because the gradient (Equation 2.15) already
has a dependence on ∂f_i/∂x_k, so the next derivative must contain terms involving
∂²f_i/∂x_k∂x_l. The second derivative term can be omitted when it is zero, or small enough to
be negligible when compared to the term involving the first derivatives. It also has the
additional possibility of being negligibly small in practice: the term multiplying the second
derivative in Equation 2.16 is (f_i − y_i). For a successful model, this term should be the
random measurement error of each point. The error can have either sign, and should, in
general, be uncorrelated with the model. Therefore, the second derivative terms tend to
cancel out when summed over i, giving a new definition for α_kl as

α_kl = Σ_{i=1}^{n} (∂f_i/∂x_k)·(∂f_i/∂x_l). (2.19)
Marquardt (1963) developed an elegant method for varying smoothly between the
extremes of the inverse-Hessian method (Equation 2.17) and the steepest descent method
(Equation 2.18). The method is based on two simple, but important, insights. There is no
clue about the order of magnitude or the scale of the constant c in Equation 2.18. The
gradient only gives the slope, and does not give the extent of that slope. Marquardt's first
insight is that the components of the Hessian matrix give some information regarding the
order-of-magnitude scale of the problem. If φ is taken to be non-dimensional, and β_k has
the dimensions of 1/x_k, the constant of proportionality between β_k and δx_k must therefore
have the dimensions of x_k². Scanning the components of A yields only one component with
these dimensions, and that is 1/α_kk, the reciprocal of the diagonal element. Dividing the
constant by some non-dimensional factor, λ, to reduce the scale, and with the possibility
of setting λ >> 1 to cut down the step, Equation 2.18 can be rewritten as

δx_l = (1/(λ·α_ll))·β_l,

or

λ·α_ll·δx_l = β_l. (2.20)

It is necessary that α_ll be positive, but, by the definition in Equation 2.19, this is guaranteed.
Marquardt's second insight is that Equations 2.20 and 2.17 can be combined to define
a new matrix A′ such that

α′_jj = α_jj·(1 + λ), and
α′_jk = α_jk, (j ≠ k), (2.21)

and then replace both Equations 2.20 and 2.17 by

Σ_{l=1}^{m} α′_kl·δx_l = β_k. (2.22)

When λ is very large, the matrix A′ is forced into being diagonally dominant, so
Equation 2.22 goes over to being identical to Equation 2.20 (the steepest descent method).
On the other hand, as λ approaches zero, Equation 2.22 goes over to Equation 2.17 (the
inverse-Hessian method). In this manner, the Marquardt method uses the steepest descent
method far from the minimum, switching continuously to the inverse-Hessian method as
the minimum is approached. The Marquardt method works very well in practice and has
become the standard of nonlinear least-squares routines.
The learning rule proposed here is that applicable to a standard nonlinear least-
squares analysis (Hsiung et al., 1991). The entire set of weights is adjusted at once
instead of adjusting them sequentially from the output layer to the input layer. The weight
adjustment is done at the end of each epoch (one exposure of the entire training set to the
network), and the sum of squares of all errors for all patterns is used as the objective
function for the optimization problem. More details on the Marquardt method are found
elsewhere (Battiti, 1992; Press et al., 1992; Henley and Rosen, 1969). A description of the
usage of the computer program that uses the Marquardt method for nonlinear estimation
and equation solving is presented in Appendix B.
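This epoch-wise strategy can be sketched using SciPy's Levenberg-Marquardt implementation on a 2-3-1 network (an illustration of the approach, not the program of Appendix B; the target mapping tanh(x_1·x_2), the data sizes, and the weight initialization range are assumptions made only for this sketch):

```python
import numpy as np
from scipy.optimize import least_squares

def residuals(w, X, D):
    """Residuals D_p - Y_p for every pattern in the training set; the
    optimizer adjusts all 13 weights at once to minimize their sum of
    squares over the whole epoch (bias handled by w[0], w[3], w[6], w[9])."""
    out = []
    for (x1, x2), d in zip(X, D):
        h = np.tanh([w[0] + x1 * w[1] + x2 * w[2],
                     w[3] + x1 * w[4] + x2 * w[5],
                     w[6] + x1 * w[7] + x2 * w[8]])
        y = np.tanh(w[9] + h[0] * w[10] + h[1] * w[11] + h[2] * w[12])
        out.append(d - y)
    return np.array(out)

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(30, 2))
D = np.tanh(X[:, 0] * X[:, 1])           # hypothetical target mapping
w0 = rng.uniform(-0.1, 0.1, size=13)     # small random initial weights
fit = least_squares(residuals, w0, args=(X, D), method="lm")
```

Note that method="lm" requires at least as many residuals as parameters (30 patterns versus 13 weights here); fit.x holds the trained weights.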
2.8. Examples Using the Marquardt Method for Training Neural Networks
In this section we compare the performance of two training algorithms: the
Marquardt method and the backpropagation method, with the help of two different
examples from the literature. The first example is a simple pattern recognition problem, and
the second one involves mapping of a nonlinear function that was presented by Namatame
and Kimata (1989).
2.8.1. The "Iris" Classification Problem
It is desired to classify three different types of iris flowers into Class A, B, or C based
on four common attributes, say X_1, X_2, X_3, and X_4. The training set comprises 75 data
points with four inputs and three outputs. The outputs are either "0" or "1." Only one of
the three outputs for any given set of inputs (X_1, X_2, X_3, and X_4) can be "1," with the other
two outputs being "0," implying that each flower belongs to a unique class (A, B, or C).
The test set also consists of 75 data points different from the ones in the training set.
A 4-4-3 network was trained using the Marquardt method, and was able to classify all
75 patterns without any error in 25 iterations through the nonlinear optimizer. Of the 75
data points in the test set, the network was able to classify 72 patterns correctly, yielding a
96% correct classification. Another 4-4-3 network was trained using a backpropagation
algorithm with fixed learning and momentum rates of 0.1 and 0.5, for 1000 data
presentations. The "backprop network" was unable to make a "clean" classification by
producing outputs of "0" or "1." In order to enable some comparison, it was decided to
consider the largest of the three outputs to correspond to "1," and the others as "0." With
this ad hoc aid to speed up the classification process, it was found that the "backprop
network" was able to classify all of the 75 patterns in the training set correctly. For the
test set, the network was able to classify 72 out of the 75 patterns correctly, giving once
again a 96% correct classification.
In the case of the backpropagation network, training had to be re-started several times
before "acceptable progress" in the decrease of the normalized root mean square error was
noticed. Each time, the learning rate and momentum rate had to be changed by trial-and-error
till "good" values were found. In comparison, the Marquardt algorithm always gave
"clean" classifications by producing outputs that were either "0" or "1," and always
converged to 100% classification in 25 iterations or less. The initial weights were selected
at random to be small positive and negative values in the range ±0.1. Also, the inputs and
outputs were scaled using the same scaling function for both training methods.
2.8.2. Mapping a Nonlinear Function
Hsiung et al. (1991) used successive quadratic programming for a nonlinear optimizer
on a function reported by Namatame and Kimata (1989) to demonstrate the relative
effectiveness of the nonlinear optimization strategy over backpropagation for training
neural networks. The function that was mapped is given by
y = (1/20)·(12 + 3x − 3.5x² + 7.2x³)·(1 + cos 4πx)·(1 + 0.8 sin 3πx). (2.23)
Hsiung et al. (1991) considered mapping y as a function of five inputs: x, x², x³,
cos(πx), and sin(πx). Sixty random values of x in the range 0 to 1 were chosen, and the
other four inputs (x², x³, cos(πx), sin(πx)) were then computed. The output y was
calculated for each value of x. The test data set was made up in a similar manner and
comprised 40 different points. Hsiung et al. (1991) report the results of training a fully
connected five-input, three-hidden node, one-output feedforward neural network. A fully
connected network is a network in which every layer is connected to each layer ahead of
it, i.e., the nodes in the input layer are not just connected to all nodes in the hidden layer,
but are also connected to all the nodes in the output layer. Hsiung et al. (1991) used a
successive quadratic programming code¹ with no constraints, and hence it defaulted to the BFGS
algorithm. They report that to reduce the mean square error to 3.17x10-^ (corresponds to
a normalized root mean square error of 2.298x10* ) for 60 training patterns took 200
iterations of the optimizer.
We followed the method suggested by Hsiung et al. (1991) to prepare the training
and test data sets for the function given by Equation 2.23. A 5-4-1 feedforward neural
network was trained using the Marquardt method and was allowed to run for 200
iterations. The normalized root mean square error after 200 iterations was calculated to
be 5.41x10- . Another 5-4-1 network was trained using the backpropagation algorithm
with fixed learning and momentum rates of 0.01 and 0.25, respectively, for 5000 data
presentations after which the normalized root mean square error was reduced to
¹NPSOL from Stanford University, Department of Operations Research.
2.93x10⁻³. Figure 2.5a compares the mapping obtained from the 5-4-1 network trained by
the Marquardt method for the training and test data sets, and the analytical value of the
function. Figure 2.5b shows a similar comparison for the 5-4-1 network trained using the
backpropagation algorithm.
It is to be noted that the normalized root mean square error reported for the network
trained using the backpropagation algorithm is the "best" value obtained after running
several trials. It is by no means an optimum training performance. As mentioned earlier,
the backpropagation algorithm is extremely sensitive to initial values of the weights. It is
also to be noted that we did not use the exact same data set as used by Hsiung et al.
(1991). Also, they used Gaussian transfer functions for all their hidden nodes, and a linear
transfer function for the output node; we used hyperbolic tangent transfer functions for all
the nodes in both networks. The purpose of the comparison is purely to
show that a more robust training algorithm, such as the Marquardt method, handles the
turning points in a complex nonlinear function quite well compared to the
backpropagation algorithm. The comparison with the method used by Hsiung et al.
(1991) is made only from the standpoint of affirming the approach of using optimization as
a tool for training neural networks, and not as a benchmark for the two optimization
techniques.
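The optimization-based training idea discussed above can be illustrated with one Marquardt (Levenberg-Marquardt) iteration for a generic least-squares problem. The finite-difference Jacobian, the damping value, and the toy line-fitting problem below are illustrative assumptions, not the dissertation's actual implementation:

```python
import numpy as np

def marquardt_step(residual, w, lam=1e-3, eps=1e-6):
    # One Levenberg-Marquardt update: solve (J'J + lam*I) dw = -J'r,
    # where J is the Jacobian of the residual vector with respect to the
    # parameters w (estimated here by forward finite differences).
    r = residual(w)
    J = np.empty((r.size, w.size))
    for k in range(w.size):
        w_pert = w.copy()
        w_pert[k] += eps
        J[:, k] = (residual(w_pert) - r) / eps
    A = J.T @ J + lam * np.eye(w.size)
    return w + np.linalg.solve(A, -J.T @ r)

# Toy problem: fit y = a*x + b to noiseless data, one step from a poor guess.
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x + 1.0
residual = lambda w: w[0] * x + w[1] - y
w = marquardt_step(residual, np.array([0.0, 0.0]))
```

Because this toy problem is linear in the parameters, a single damped step lands essentially on the solution; training a network requires many such iterations with an adaptive damping factor.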
Figure 2.5. Mapping a nonlinear function with a neural network. (a) Network trained using the Marquardt method. (b) Network trained using the backpropagation algorithm.
CHAPTER III
STEADY-STATE MODELS FOR DISTILLATION
In the past two decades, a great number of model-based control algorithms have been
proposed to achieve better performance and more robust control. In-depth reviews on
model-based control strategies are presented in the papers by Bequette (1990), Bosley et
al. (1992) and Seborg et al. (1986). All these advanced techniques rely heavily on the
availability of a mathematical model that is a good representation of the dynamics of the
process being controlled. A vast majority of the techniques use linear or nonlinear
dynamic empirical models comprised of past values of the inputs and outputs of the
process. More recently, neural network dynamic models have been used in place of the
conventional empirical dynamic models in model-based control strategies (You and
Nikolaou, 1993; Raich et al., 1991; Bhat and McAvoy, 1990). These control strategies
fall under a general class known as Model Predictive Control (MPC).
Another model-based control technique developed by Lee and Sullivan (1988),
known as Generic Model Control (GMC), uses a controller based on a steady-state
"process inverse" model and a reference system synthesis (Bartusiak et al., 1989) based on
first-order dynamics.
3.1. Process Models and Process Inverse Models
Before discussing the concept of using steady-state models for control purposes, it is
important to differentiate between "process" models and "process inverse" models. A
process model refers to a mathematical equation, or a set of equations, that could
determine the estimated output of the process when given the process inputs. For
instance, in the case of distillation, a process model would predict the compositions of the
overhead and bottom products given the feed flowrate, feed composition, reflux rate,
boilup rate (or steam flowrate to the reboiler), the number of ideal stages, the stage
efficiency, etc. A process inverse model refers to a mathematical equation, or set of
equations, that could determine the values of the manipulated variables that would
produce the desired process outputs. Once again, in the case of distillation, a process
inverse model would predict the reflux rate and boilup rate required to produce the desired
overhead and bottom product compositions, given all other pertinent input data.
Most MPC strategies use both forms of the model: the process model for system
identification, and the process inverse model for the control action. If the process model
happens to be an empirical model, then the same model can be inverted to obtain the
desired control action. If the process model is a neural network model, then a separate
neural network model has to be developed to represent the process inverse.
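The distinction can be illustrated with a deliberately simplified linear steady-state map between the manipulated variables (L, V) and the product compositions (xD, xB). The gain matrix and nominal values below are hypothetical; a real column is nonlinear, so each direction would be a separately trained network, as discussed above:

```python
import numpy as np

# Hypothetical steady-state gains d(xD, xB)/d(L, V) and nominal point.
G = np.array([[0.04, -0.03],
              [0.02, -0.05]])
u_nom = np.array([0.132, 0.243])   # nominal (L, V)
y_nom = np.array([0.92, 0.025])    # nominal (xD, xB)

def process_model(u):
    """Process model: manipulated variables (L, V) -> predicted (xD, xB)."""
    return y_nom + G @ (u - u_nom)

def process_inverse_model(y_desired):
    """Process inverse model: desired (xD, xB) -> required (L, V)."""
    return u_nom + np.linalg.solve(G, y_desired - y_nom)
```

The inverse model here is literally the algebraic inverse of the forward map; for a neural network process model no such closed-form inverse exists, which is why a second network must be trained.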
For chemical process industries, it is highly desirable to use models that predict
directly the manipulated variables that would produce the desired outputs (Bhagat, 1990).
More particularly for chemical process control, the use of a process inverse model to
calculate explicitly the manipulated variables in order to follow a reference system or to
bring the process back to its set-point is extremely appealing. If the process dynamics can
be approximated to be first-order, then the process inverse dynamic models can be
replaced by process inverse steady-state models to obtain the control action. This is
precisely what the GMC strategy is based on. More details on GMC will be presented in
Chapter V.
3.2. Distillation Column Test Cases
It is desired to demonstrate the neural network model-based controllers on dynamic
simulations of two different methanol-water distillation columns: one is a 7-stage column,
and represents an experimental system at Texas Tech University (Pandit et al., 1992); and
the other is a high-purity industrial column (Rhinehart, 1994). The lab-scale column is an
atmospheric column, and produces products with approximately 4-5% impurity in the
overhead product, and 1-2% impurity in the bottom product. The reboiler hold-up in the
lab-column is roughly 30 times more than that in the condensate receiver, and hence
creates vastly different dynamics at the two ends. By contrast, the high-purity column is
an industrial column producing products with less than 1000 parts per million impurity,
and is typical of the "refining" column in a 3-column industrial methanol separation
process (Fruehauf and Mahoney, 1994; Mehta and Pan, 1971; Mehta and Ross, 1970).
These two cases will enable us to perform control studies to evaluate the use of neural
network models in a process model-based control environment.
3.3. Development of the Steady-State Inverse Models for Distillation
The foremost requirement for development of any neural network model is the
availability of data that captures the relationship between the inputs and outputs of the
process. The data for training the neural networks were obtained from steady-state
process simulations for the two methanol-water distillation columns. The steady-state
simulations of the two methanol-water distillation columns were developed using a
commercially-available steady-state process simulation (CAD) package (HYSIM®²). The
feed flowrate, F, the feed composition, z, the overhead composition, x_D, and the bottoms
composition, x_B, were specified, and the steady-state simulators were used to determine the
reflux rate, L, and the boilup rate, V, needed to meet the specifications. The steady-state
process simulations were based on the NRTL thermodynamic model for vapor-liquid
equilibrium (VLE), and a Murphree stage efficiency of 75% for all the stages (chosen
arbitrarily), except the reboiler which is ideal, and both columns use a total condenser.
²HYSIM® is the registered trademark of Hyprotech, Calgary, Canada.
Table 3.1 gives the design specifications and operating conditions for the two methanol-
water distillation columns. "Experiments" performed on the steady-state simulators were
designed to cover a full square (factorial) design.
The training set for the lab column comprised 81 data points, while that for the high-
purity column comprised 375 data points. The 81 data points for the lab column were
obtained by considering three data points for each of the four network inputs: two data
points to mark the minimum and maximum limits in the ranges specified in Table 3.1, and
one data point in between, giving 3×3×3×3 = 81 data points. In the case of the high-purity
column, five data points each for F, x_D, and x_B over the ranges specified in Table 3.1, and
three data points for z were chosen, giving 5×3×5×5 = 375 data points. Two separate four-
input, five-hidden-node, two-output neural networks (abbreviated as 4-5-2 networks)
were trained on data sets representing the lab and high-purity columns. Two separate test
data sets, consisting of 100 data points for the lab column and 250 data points for the
high-purity column, were prepared using the same steady-state process simulations by
considering values for F, z, x_D, and x_B intermediate to the ones used to make up the
training data set. The trained 4-5-2 networks were then tested to see how well they
predict the values for reflux and boilup rates for data in the test set.
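The factorial designs described above can be sketched as follows. The even spacing of the intermediate levels is an assumption (the text specifies only the range endpoints and points in between); the ranges are those of Table 3.1:

```python
from itertools import product

import numpy as np

# Lab column: 3 levels each of F, z, xD, xB -> 3*3*3*3 = 81 training points.
lab_levels = {
    "F":  np.linspace(0.164, 0.588, 3),   # min, midpoint, max of the range
    "z":  np.linspace(0.2,   0.4,   3),
    "xD": np.linspace(0.85,  0.95,  3),
    "xB": np.linspace(0.02,  0.07,  3),
}
lab_grid = list(product(*lab_levels.values()))

# High-purity column: 5 levels of F, xD, xB and 3 levels of z -> 375 points.
hp_grid = list(product(np.linspace(600, 1000, 5),
                       np.linspace(0.1, 0.14, 3),
                       np.linspace(0.9985, 0.9995, 5),
                       np.linspace(0.0005, 0.0015, 5)))
```

Each tuple (F, z, xD, xB) would then be fed to the steady-state simulator to obtain the corresponding (L, V) training targets.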
Figures 3.1a and b show the comparison between the actual (CAD package data) and
the network predicted values for the reflux rate for the training data set and the test data
set, respectively, for the lab column. Figures 3.2a and b show similar comparisons for the
boilup rate for the training and test data sets, respectively, for the lab column. Figures
3.3a and b, and 3.4a and b show the corresponding results for the high-purity column.
From the above comparisons, it can be seen that the neural networks have been able to
capture the operational characteristics of the two distillation columns after the training
process, which took approximately 25 iterations of the nonlinear optimization routine
(approximately 5-10 minutes on a 486/50 MHz PC).
Table 3.1. Design and Operating Conditions for the Two Distillation Columns.

Specifications                              Lab Column            High-Purity Column

CAD Simulation Design Conditions
1. No. of Stages (includes reboiler)        7                     45
2. Feed Stage (from the reboiler)           —                     19
3. Feed Quality                             Subcooled to 120°F    Saturated Liquid
4. Reflux Quality                           Subcooled to 120°F    Subcooled to 120°F
5. Pressure                                 1 atma.               2 atma.
6. Murphree Stage Efficiency                75%                   75%

Nominal Operating Conditions
1. Feed Rate (lbmols/hr)                    0.35                  800
2. Feed Composition
   (mole fraction methanol)                 0.3                   0.12
3. Overhead Product Composition
   (mole fraction methanol)                 0.92                  0.999
4. Bottom Product Composition
   (mole fraction methanol)                 0.025                 0.001
5. Reflux Rate (lbmols/hr)                  0.132                 180
6. Boilup Rate (lbmols/hr)                  0.243                 258
7. Reflux Ratio                             1.17                  1.89

Normal Operating Range
1. Feed Rate (lbmols/hr)                    0.164-0.588           600-1000
2. Feed Composition
   (mole fraction methanol)                 0.2-0.4               0.1-0.14
3. Overhead Product Composition
   (mole fraction methanol)                 0.85-0.95             0.9985-0.9995
4. Bottom Product Composition
   (mole fraction methanol)                 0.02-0.07             0.0005-0.0015
5. Reflux Rate (lbmols/hr)                  0.0321-2.243          132.82-263.64
6. Boilup Rate (lbmols/hr)                  0.0832-2.166          180.06-375.09
Figure 3.1. Reflux rate predictions from neural networks for lab column. (a) Results from training data set. (b) Results from test data set.

Figure 3.2. Boilup rate predictions from neural networks for lab column. (a) Results from training data set. (b) Results from test data set.

Figure 3.3. Reflux rate predictions from neural networks for high-purity column. (a) Results from training data set. (b) Results from test data set.

Figure 3.4. Boilup rate predictions from neural networks for high-purity column. (a) Results from training data set. (b) Results from test data set.
3.4. Optimal Training of Neural Networks
The issue of optimal training of neural networks deals with determination of the
"best" model that solves a given problem. We define "best" as the lowest normalized root
mean squared error based on the test data set. The basic idea is to improve
generalizations, reduce the number of training examples required, and improve speed of
learning and/or classification using a minimum number of hidden nodes. It is analogous to
model parsimony in classical statistical regression parlance. The key issue here is to
realize that as applications become more complex, the networks become larger (i.e., more
connections, and hence, more weights). More importantly, as the number of parameters
increases, overfitting problems may arise, with devastating effects on generalization.
Overfitting describes an excessively close fit made possible by the number of free
parameters (the weights) of the network. As with other methods for function
approximation, such as polynomial regression, too many free parameters will allow the
network to fit the training data arbitrarily closely, but will not necessarily lead to optimal
generalization.
Several methods have been developed and studied to optimize network training (Le
Cun et al., 1991; Weigand et al., 1990). While overfitting is an issue that cannot be
ignored, it becomes of increasing importance when the number of weights is of the order
of the number of training examples. Such networks are referred to as oversized networks.
Also, overfitting becomes more important when the gradient learning process
(backpropagation) is used for weight adjustment. In gradient learning, initially the hidden
units in the network all do the same work, i.e., they all attempt to fit the major features of
the data. As those features are accounted for, the major source of error in the network is
determined by the second most important feature of the training data. The units then start
to differentiate with some of them beginning to fit this second most important aspect of
48
the data. As the process of differentiation continues, the effective number of degrees of
freedom starts to increase. Assuming that sampling error is small relative to other sources
of variation in the data, early network training seeks to fit the significant features of the
data. It is only at later times that the network tries to fit the noise.
A solution to stop overfitting is to stop training just before the network starts to fit
the sampling noise. We followed the technique proposed by Weigand et al. (1990) which
uses a separate validation data set to guide when to stop training. The validation data set
can be made up of an arbitrary number of data points from the training data set, say 10%
of the points in the training data set. At the end of each epoch (each time a new set of
weights has been determined), the validation data set is presented to the network, and the
prediction error on the validation set is obtained. The data points selected to be in the
validation set are not a part of the training data set anymore, and are not seen by the
network while training. Training is stopped when the normalized root mean square error
on the validation data set starts to increase.
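A minimal sketch of this validation-based stopping rule, assuming a generic per-epoch `train_step` routine and an error function `nrmse` evaluated on the held-out validation set (both names, and the one-epoch patience, are illustrative):

```python
def train_with_early_stopping(train_step, nrmse, max_epochs=500, patience=1):
    # train_step(None) initializes the weights; train_step(w) performs one
    # epoch and returns updated weights. nrmse(w) is the validation-set
    # normalized root mean square error. Training stops once the
    # validation error has risen for more than `patience` epochs, and the
    # best weights seen so far are returned.
    w = train_step(None)
    best_err, best_w, worse = nrmse(w), w, 0
    for _ in range(max_epochs):
        w = train_step(w)
        err = nrmse(w)
        if err < best_err:
            best_err, best_w, worse = err, w, 0
        else:
            worse += 1
            if worse > patience:   # validation error increasing: stop
                break
    return best_w
```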
However, with the optimization technique used for determining the new set of weights,
the presence or absence of the validation set did not make much difference to the overall
network prediction characteristics. It is our opinion that, with a more robust learning tool,
overfitting was not a serious problem as long as the network architecture was selected
to ensure training errors are reduced in a reasonable number of passes through the training
data set. Other network configurations with three, four, five, six, and seven hidden nodes
were also trained, but the network with five hidden nodes gave the best overall
performance based on the normalized root mean squared error for the test data set.
CHAPTER IV
DYNAMIC PROCESS SIMULATIONS
To study and compare the performance of the neural network model-based
controllers with conventional PI controllers, the controllers have to be implemented on the
two methanol-water distillation columns. Before implementing the neural network model-
based controllers on the "real" process systems, it is advisable to perform the tests on
dynamic simulations of the "real" processes. Dynamic simulations facilitate better
understanding of the dynamic behavior of the processes, and provide insights into the
nature of the interactions between the inputs and the outputs. Also, performing the tests
on dynamic simulators enables studying the control issues without interference from other
operational aspects such as safety, economics, etc. Dynamic simulators for the two
methanol-water distillation columns were developed from first principles based on the
design data given in Table 3.1. The dynamic simulator for each distillation column is a
tray-to-tray formulation based on the multicomponent distillation structure developed by
Luyben (1990), and involves solving ordinary differential equations and algebraic
relationships on each stage.
4.1. Mathematical Model for Nonideal Multicomponent Distillation
The mathematical model for the nonideal multicomponent distillation is based on the
following assumptions:
(1) One fixed feed plate is used to introduce the vapor and liquid feed regardless of the
feed or operating conditions.
(2) Pressure is constant and known on each tray.
(3) Coolant and heating media dynamics are negligible in the condenser and the reboiler,
respectively.
(4) The condenser is a total condenser.
(5) Liquid hydraulics are calculated from the Francis weir formula (Luyben, 1990).
(6) Perfect level control in the reflux drum and the reboiler allows a constant holdup in
the reflux drum and reboiler by changing flowrates of the bottoms product, B, and
liquid distillate product, D.
(7) The dynamic response of the internal energies on the trays is much faster than the
composition or total holdup changes, and therefore the energy balances on each tray are
just algebraic.
(8) Reflux rate, L, and boilup rate, V, are the manipulated variables.
(9) An empirically-correlated polynomial equation obtained from regressing experimental
data is used for thermodynamic VLE.
(10) A single Murphree stage efficiency is used for all the stages, except the reboiler which
is ideal.
Consider the 'i'th stage in an N-stage distillation column separating a feed containing n_c
components, as shown in Figure 4.1. Let the 'i'th stage represent the feed stage, which
allows for a feed containing both vapor and liquid fractions. The equations describing the
time-domain behavior on this stage comprise essentially an overall material
balance, component material balances, an energy balance, and the thermodynamic
equilibrium.
4.1.1. Overall Material Balance (one per stage)
The overall material balance on the feed stage can be written as

dM_i/dt = L_{i+1} + F_i^L + V_{i-1} + F_{i-1}^V - L_i - V_i,    (4.1)
Figure 4.1. Schematic of a distillation column with details on the 'i'th stage.
where M_i is the liquid holdup (lbmoles) on the 'i'th stage; L_i and L_{i+1} are the flowrates of
the liquid leaving the 'i'th and 'i+1'th stages, respectively; V_i and V_{i-1} are the flowrates of the
vapor leaving the 'i'th and 'i-1'th stages, respectively; F_i^L is the flowrate of the liquid
fraction of the feed entering on the 'i'th stage; and F_{i-1}^V is the flowrate of the vapor fraction
of the feed entering on the 'i-1'th stage.
4.1.2. Component Material Balance (n_c - 1 per tray)
The individual component balance on the feed stage can be written as
d(M_i x_{i,j})/dt = L_{i+1} x_{i+1,j} + F_i^L x_{i,j}^F + V_{i-1} y_{i-1,j} + F_{i-1}^V y_{i-1,j}^F - L_i x_{i,j} - V_i y_{i,j},    (4.2)
where x_{i,j} and x_{i+1,j} are the compositions of the jth component in the liquid leaving the 'i'th
and 'i+1'th stages, respectively; y_{i,j} and y_{i-1,j} are the compositions of the jth component in
the vapor leaving the 'i'th and 'i-1'th stages, respectively; x_{i,j}^F is the composition of the jth
component in the liquid fraction of the feed entering the 'i'th stage; and y_{i-1,j}^F is the
composition of the jth component in the vapor fraction of the feed entering on the 'i-1'th
stage.
4.1.3. Energy Balance (one per stage)
The energy balance on the feed stage can be written as
d(M_i h_i)/dt = L_{i+1} h_{i+1} + F_i^L h_i^F + V_{i-1} H_{i-1} + F_{i-1}^V H_{i-1}^F - L_i h_i - V_i H_i,    (4.3)
where h_i and h_{i+1} are the enthalpies of the liquid leaving the 'i'th and 'i+1'th stages,
respectively; H_i and H_{i-1} are the enthalpies of the vapor leaving the 'i'th and 'i-1'th stages,
respectively; h_i^F is the enthalpy of the liquid fraction of the feed entering the 'i'th stage;
and H_{i-1}^F is the enthalpy of the vapor fraction of the feed entering on the 'i-1'th stage.
4.1.4. Thermodynamic Equilibrium (n_c per tray)
The thermodynamic vapor-liquid equilibrium is given by the functional dependence
that can be expressed as
y*_{i,j} = f(P_i^s, P_T, x_{i,j}),    (4.4)
where y*_{i,j} is the composition of the jth component in the vapor phase in equilibrium with
the jth component in the liquid phase for the 'i'th stage; P_i^s is the saturation vapor
pressure at temperature T_i; and P_T is the total system pressure.
Equations 4.1, 4.2, and 4.3 are applicable to any stage in the distillation column. If
the 'i'th stage under consideration is not the feed stage, then the contributions due to the
feed are neglected. In terms of the dynamic process behavior, the liquid rates throughout
the column are not the same (assumption #5). They depend on the fluid mechanics of the
tray, and often a simple relationship such as the Francis weir formula can be used to relate
the liquid holdup on the stage, M_i, to the liquid flowrate leaving the tray, L_i. The Francis
weir formula is given as
Q_L = 3.33 l_w h_ow^{3/2},    (4.5)
where Q_L is the liquid flowrate over the weir (ft³/s), l_w is the length of the weir (ft),
and h_ow is the height of the liquid over the weir (ft). More rigorous relationships can be
obtained by considering detailed tray hydraulics to include effects of vapor flowrate,
densities, composition, etc.
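As a direct transcription of the Francis weir formula given above:

```python
def francis_weir_flow(l_w, h_ow):
    # Equation 4.5: liquid flowrate over the weir (ft^3/s) from the weir
    # length l_w (ft) and the height of liquid over the weir h_ow (ft).
    return 3.33 * l_w * h_ow ** 1.5
```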
A Murphree vapor-phase efficiency is used to describe the departure from equilibrium
(assumption #10) and is given as
E_{i,j} = (y_{i,j} - y_{i-1,j}) / (y*_{i,j} - y_{i-1,j}),    (4.6)
where y_{i,j} is the actual composition of the vapor leaving the 'i'th stage; y_{i-1,j} is the actual
composition of the vapor leaving the 'i-1'th stage; and E_{i,j} is the Murphree vapor efficiency
for the jth component on the 'i'th stage.
The reboiler is considered to be the first stage and is an ideal stage (100% efficiency),
and the condenser is a total condenser and hence, it is not an equilibrium stage. For
calculation purposes, we designate the condenser as the 'N+1'th stage. The equations
describing the reboiler and the condenser are slightly different due to the perfect level
control assumption (assumption #6). Perfect level control assumes that the holdup in the
condenser and reboiler is constant and does not change, and therefore

dM_1/dt = 0

and

dM_{N+1}/dt = 0.
Under this assumption, the overall material balance in the reboiler then becomes an
algebraic equation that can be written as
L_2 - L_1 - V_1 = 0.    (4.7)
Similarly, for the condenser, the overall material balance gives
V_N - L_{N+1} - D = 0.    (4.8)
The reflux and boilup rates are the two variables that have to be specified by the operator
(see discussion in Section 1.2 for the degrees-of-freedom analysis). Knowing the reflux
and boilup rates, Equations 4.7 and 4.8 can be solved explicitly to calculate the overhead
and bottom product draw rates, D and L_1 (commonly denoted as B). The component
balances still remain ordinary differential equations that need to be solved to determine the
rate of composition change for the overhead and bottom products.
Also, assumption #7 allows us to substitute the differential equation for the energy
balance with an algebraic relationship that can be solved explicitly on each stage to
determine the vapor flowrate leaving each stage inside the column. Therefore, for any
general stage 'i', taking feed into account, Equation 4.3 becomes

V_i = (L_{i+1} h_{i+1} + F_i^L h_i^F + V_{i-1} H_{i-1} + F_{i-1}^V H_{i-1}^F - L_i h_i) / H_i.    (4.9)
Empirically correlated polynomial equations were used for the thermodynamic vapor-
liquid equilibrium. For the lab-column, empirical correlations were obtained by regressing
the experimental data for a methanol-water system at 1 atmosphere absolute pressure
(Henley and Seader, 1981) to obtain polynomial relationships for vapor-phase
composition, liquid- and vapor-phase enthalpies, and temperature as a function of liquid-
phase composition. For the industrial methanol-water column, similar empirical
polynomial correlations were obtained, but the data for VLE, liquid and vapor enthalpies,
and temperature were obtained from HYSIM® for a methanol-water system with the
NRTL thermodynamic model. Details for the empirical correlations are presented in
Appendix C.
The dynamic simulations give the response of the overhead and bottom product
compositions from the distillation columns under various operating conditions. The
differential equations given in Equations 4.1 and 4.2 were integrated with respect to time
using an explicit Euler integrator (Riggs, 1994) along with the algebraic energy balance
and the thermodynamic VLE correlations. The accuracy of the explicit Euler integrator
was checked against a more rigorous fourth-order Runge-Kutta integrator (Riggs, 1994).
It was found that the explicit Euler integrator yielded performance comparable to the
fourth-order Runge-Kutta integrator, and hence, the explicit Euler integrator was used for
the dynamic simulations of both distillation columns.
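The explicit Euler scheme named above amounts to the update x ← x + Δt·(dx/dt) at each step; a generic sketch (not the simulator's actual code), demonstrated on a simple first-order response with an illustrative time constant:

```python
import numpy as np

def euler_integrate(deriv, state, dt, n_steps):
    # Explicit Euler: advance the state vector (holdups, compositions)
    # by dt times its time derivative, n_steps times.
    for _ in range(n_steps):
        state = state + dt * deriv(state)
    return state

# Example: dx/dt = (x_ss - x)/tau toward x_ss = 1 with tau = 2,
# integrated over 10 time units.
deriv = lambda x: (1.0 - x) / 2.0
x_end = euler_integrate(deriv, np.array([0.0]), dt=0.01, n_steps=1000)
```

With a sufficiently small step, the Euler result tracks the analytical solution 1 - exp(-t/τ) closely, which mirrors the comparison against the Runge-Kutta integrator described above.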
4.2. Additional Features of the Dynamic Process Simulators
First-order autoregressive drifts were added to all process inputs (F, z) to create
disturbances, and Gaussian noise was added to all measured variables (F, z, x_D, x_B, D, B,
L, V) to simulate instrument noise. Unmeasured process disturbances have a great effect
on the process behavior, and can affect the controllability of the process. Almost all
unmeasured disturbances in a distillation column can be simulated by changing stage
efficiencies. Accordingly, another first-order autoregressive drift was added to the
Murphree stage efficiency to simulate unmeasured process disturbances. The Gaussian
noise and autoregressive drifts provide realism to the simulated data. Also, all measured
data are filtered through a first-order filter before use in any calculation or historical
trending, thus introducing a dynamic lag. Nominal Murphree stage efficiencies of 80%
and 85% were used in the dynamic simulators for the lab column and the high-purity
column, respectively. In addition, a 5-minute analyzer delay was added to all the
composition measurements (z, x_D, and x_B) for the high-purity column. Since the purity
levels in the lab column are not high, the lab column uses temperature to infer
compositions and, hence, no analyzer delays were used.
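A sketch of the disturbance and filtering machinery described above, with illustrative parameter values (the dissertation does not give the drift coefficient, noise levels, or filter constant):

```python
import numpy as np

rng = np.random.default_rng(42)

def ar1_drift(n, phi=0.95, sigma=0.01):
    # First-order autoregressive drift d_k = phi*d_{k-1} + w_k, used to
    # perturb the process inputs (and the stage efficiency).
    d = np.zeros(n)
    for k in range(1, n):
        d[k] = phi * d[k - 1] + rng.normal(scale=sigma)
    return d

def first_order_filter(raw, alpha=0.2):
    # First-order filter applied to each measurement before use,
    # introducing the dynamic lag mentioned above.
    filt = np.empty_like(raw)
    filt[0] = raw[0]
    for k in range(1, raw.size):
        filt[k] = alpha * raw[k] + (1 - alpha) * filt[k - 1]
    return filt

# Noisy, drifting feed-rate "measurement" around the lab-column nominal value.
measured = 0.35 + ar1_drift(200) + rng.normal(scale=0.005, size=200)
filtered = first_order_filter(measured)
```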
4.3. Open-Loop Response Characteristics of the Processes
The open-loop response of any process enables the study of the degree of
nonlinearity, nonstationarity, and level of interaction between the various inputs and
outputs of the process from both a quantitative as well as a qualitative viewpoint. Open-
loop studies, typically, involve changing one of the input variables by a known amount
while keeping all other inputs about their base case values, and noting the response of the
process outputs over a period of time to study the effect of the change. The dynamic
simulators were used to study the open-loop responses of the two distillation columns.
4.3.1. Open-Loop Responses for the Lab-Column
The open-loop characteristics of the lab-column were studied by making ±10%
changes in F, z, L, and V, one variable at a time, from their corresponding base case values
shown in Table 3.1, and noting the response of the overhead and bottom product
compositions.
Figure 4.2a shows the response of the overhead and bottom product compositions to
the ±10% change in boilup rate; Figure 4.2b shows the sequence of the '+' and '-' 10%
changes in boilup rate and the essentially "constant" reflux rate during the period of the
test; Figure 4.2c shows the variation in the feed flowrate, feed composition, and the
Murphree stage efficiency (the disturbances) affecting the process during the same time
period. The nonlinear nature of the process is observed from the fact that the magnitude
of the change in the overhead and bottom product compositions for the +10% change in
boilup rate is vastly different from that due to the -10% change. Also, the nonstationary
behavior is noticed from the fact that the overhead and bottom compositions do not
necessarily return to the "same" base case values when the boilup rate is brought back to
its base case value.
Figures 4.3a-c show similar results for a ±10% change in the reflux rate;
Figures 4.4a-c show the results from a ±10% change in the feed flowrate; and
Figures 4.5a-c, the results for a ±10% change in the feed composition. The "seed" for the
pseudo-random number generator used in the algorithm for the autoregressive drift and
Gaussian noise was set to a different value for each open-loop test to study the influence
of random changes in the disturbances on the open-loop responses.
Figure 4.2. Open-loop response to boilup rate changes in lab column. (a) Overhead and bottom product compositions. (b) Reflux and boilup rates. (c) Process disturbances.

Figure 4.3. Open-loop response to reflux rate changes in lab column. (a) Overhead and bottom product compositions. (b) Reflux and boilup rates. (c) Process disturbances.

Figure 4.4. Open-loop response to feed flowrate changes in lab column. (a) Overhead and bottom product compositions. (b) Reflux and boilup rates. (c) Process disturbances.

Figure 4.5. Open-loop response to feed composition changes in lab column. (a) Overhead and bottom product compositions. (b) Reflux and boilup rates. (c) Process disturbances.
4.3.2. Open-Loop Responses for the High-Purity Column
The high-purity column produces products with less than 1000 parts per million
impurities, and its operation is more nonlinear than the lab-column. Therefore, the open-
loop step tests were performed by making ±1% changes in F, z, L, and V, one variable at a
time, from their corresponding base case values shown in Table 3.1.
Figure 4.6a shows the response of the overhead and bottom product compositions to
the ±1% change in the boilup rate; Figure 4.6b shows the sequence of the '+' and '-' 1%
changes in boilup rate and the essentially "constant" reflux rate during the period of the
test; Figure 4.6c shows the variation in the feed flowrate and feed composition (the
measured disturbances) affecting the process during the same time period; and Figure
4.6d, the variations in the Murphree stage efficiency, the unmeasured process disturbance.
Once again the nonlinear and nonstationary behavior of the process is easily noticeable
from the open-loop response.
Figures 4.7a-d show the results from the ±1% change in the reflux rate; Figures 4.8a-
d, the results from the ±1% change in the feed flowrate; and Figures 4.9a-d, the results
from the ±1% change in the feed composition.
4.4. Steady-State Analyses of Distillation Column Operation
The open-loop responses provide a qualitative picture of the process behavior in
terms of the nonlinearity and the extent of interaction between the inputs and outputs.
The process nonlinearity and degree of interaction influence the level of difficulty of a
control problem. Interactions arise whenever the control problem is multivariable, and
when each manipulated variable affects more than one controlled variable. While the
dynamic behavior of a process is of great importance to the selection of the control
strategy, oftentimes, a steady-state analysis of the process can yield valuable insights about
the nonlinearity and degree of interaction involved.
Figure 4.6. Open-loop response to boilup rate changes in high purity column, (a) Overhead and bottom product compositions.
Figure 4.6. Continued, (b) Reflux and boilup rates.
Figure 4.6. Continued, (c) Measured process disturbances.
Figure 4.6. Continued, (d) Unmeasured process disturbance.
Figure 4.7. Open-loop response to reflux rate changes in high-purity column, (a) Overhead and bottom product compositions.
Figure 4.7. Continued, (b) Reflux and boilup rates.
Figure 4.7. Continued, (c) Measured process disturbances.
Figure 4.7. Continued, (d) Unmeasured process disturbance.
Figure 4.8. Open-loop response to feed flowrate changes in high-purity column, (a) Overhead and bottom product compositions.
Figure 4.8. Continued, (b) Reflux and boilup rates.
Figure 4.8. Continued, (c) Measured process disturbances.
Figure 4.8. Continued, (d) Unmeasured process disturbance.
Figure 4.9. Open-loop response to feed composition changes in high-purity column, (a) Overhead and bottom product compositions.
Figure 4.9. Continued, (b) Reflux and boilup rates.
Figure 4.9. Continued, (c) Measured process disturbances.
Figure 4.9. Continued, (d) Unmeasured process disturbance.
The most important piece of information from a steady-state viewpoint is the process
gain, which is defined as the ratio of the magnitude of the change in an output variable
with respect to the magnitude of the change in any given input variable, the changes being
calculated from some base case value, when all other inputs are held constant. In essence,
all the open-loop responses shown in Figures 4.2a-4.9a give a qualitative picture of the
steady-state process gains. For example, in Figure 4.2a, the overhead product composition steady-state process gain for the +10% change in the boilup rate, K_{p,xD}^{+V}, is calculated as

K_{p,xD}^{+V} = (x_{D,BC} - x_{D,+10%V}) / (V_{BC} - V_{+10%}),

i.e.,

K_{p,xD}^{+V} = (0.87615 - 0.77424) / (0.243 - 0.2673) = -4.1938,

and the steady-state process gain for the overhead product composition for the -10% change in boilup rate, K_{p,xD}^{-V}, is calculated as

K_{p,xD}^{-V} = (x_{D,BC} - x_{D,-10%V}) / (V_{BC} - V_{-10%}) = (0.87615 - 0.90910) / (0.243 - 0.2187) = -1.3560.
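The gain calculation above can be sketched directly; a minimal example using the numbers quoted for Figure 4.2a (the function and variable names here are illustrative, not from the original work):

```python
# Steady-state process gain from an open-loop step test: the change in the
# output divided by the change in the input, both measured from a base case.
def process_gain(y_base, y_step, u_base, u_step):
    return (y_base - y_step) / (u_base - u_step)

# +10% boilup change: overhead composition gain
Kp_plus = process_gain(0.87615, 0.77424, 0.243, 0.2673)
# -10% boilup change
Kp_minus = process_gain(0.87615, 0.90910, 0.243, 0.2187)

print(round(Kp_plus, 4), round(Kp_minus, 4))
```

The asymmetry between the two values (roughly -4.2 versus -1.4) is the nonlinearity the text describes.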
The steady-state process gain for the bottom composition can be calculated for the ±10% change in the boilup rate in a similar manner. Table 4.1 shows the steady-state process gains for the overhead and bottom product compositions in the lab column, calculated for the ±10% changes in L, V, F, and z, along with the average values for the steady-state process gains. Table 4.2 shows similar results for the high-purity column. Both processes show nonlinear behavior with up to 100% process gain changes over the ±10% and ±1% ranges. Tables 4.3 and 4.4 show the first-order plus dead time models for the open-loop responses for the lab column and the high-purity distillation column
Table 4.1. Process Gains for Overhead and Bottom Compositions for the Lab Distillation Column

         K_p,xB                   K_p,xD
     +10%   -10%   Ave.      +10%   -10%   Ave.
L    3.03   1.88   2.46      2.21   3.93   3.07
V   -1.50  -3.59  -2.55     -4.20  -1.36  -2.78
F    0.15   0.79   0.47      0.40   1.37   0.89
z    1.29   0.76   1.03      0.71   1.24   0.97
Table 4.2. Process Gains for Overhead and Bottom Compositions for the High-Purity Distillation Column

          K_p,xB                         K_p,xD
      +1%     -1%     Ave.        +1%     -1%     Ave.
L    9.1e-4  3.5e-4  6.3e-4      2.7e-4  5.2e-3  2.7e-3
V   -3.9e-4 -1.4e-3 -8.8e-4     -5.2e-3 -2.0e-4 -2.7e-3
F    1.7e-4  9.3e-5  1.3e-4      2.7e-5  5.0e-5  3.9e-5
z    9.6e-1  3.1e-1  6.3e-1      2.6e-1  2.6e+0  1.4e+0
Table 4.3. First-Order Plus Deadtime Models for Overhead and Bottom Compositions for the Lab Distillation Column

     Bottom Composition     Overhead Composition
L    2.5/(0.315s+1)         3.1/(0.172s+1)
V   -2.5/(0.229s+1)        -2.7/(0.186s+1)
F    1.0/(0.286s+1)         0.9/(0.272s+1)
z    1.0/(0.329s+1)         0.9/(0.243s+1)
Table 4.4. First-Order Plus Deadtime Models for Overhead and Bottom Compositions for the High-Purity Distillation Column

     Bottom Composition                  Overhead Composition
L    0.00063 e^(-0.149s)/(0.568s+1)      0.00275 e^(-0.189s)/(0.703s+1)
V   -0.00088 e^(-0.078s)/(0.313s+1)     -0.0027 e^(-0.282s)/(1.493s+1)
F    0.00013 e^(-0.054s)/(0.297s+1)      0.000039 e^(-0.157s)/(0.799s+1)
z    0.63/(0.297s+1)                     1.45 e^(-0.189s)/(1.811s+1)
expressed in terms of the average values for the steady-state process gains, open-loop time constants, and the dead times.
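A first-order plus dead time entry from these tables can be exercised directly; the sketch below assumes the Table 4.4 entry for the bottom composition response to reflux changes is read as K = 0.00063, theta = 0.149 h, tau = 0.568 h (that reading of the transcribed transfer function is an assumption, as is the function name):

```python
import math

# Analytical step response of a first-order-plus-deadtime model:
# y(t) = K * (1 - exp(-(t - theta)/tau)) * du for t >= theta, else 0.
def fopdt_step(t, K, tau, theta, du=1.0):
    if t < theta:
        return 0.0
    return K * (1.0 - math.exp(-(t - theta) / tau)) * du

K, tau, theta = 0.00063, 0.568, 0.149
y_final = fopdt_step(50.0, K, tau, theta)  # settles near K for a unit step
```

Before the dead time elapses the output has not moved, and for a unit input step it settles at the steady-state gain K.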
Another tool for steady-state analyses of process systems is the relative gain array (RGA) and its use in analyzing control loop interactions (Bristol, 1966). The original RGA development involved steady-state considerations only. Since then, however, the analyses have been extended to include dynamic considerations to study control system stability and design (McAvoy, 1981). The RGA is a matrix of numbers, Λ, where each element λ_ij represents the ratio of the steady-state gain between the ith controlled variable and the jth manipulated variable when all other manipulated variables are constant to the steady-state gain between the same two variables when all the other controlled variables are constant. More details on the properties and calculations involved in determining the elements of the RGA are found elsewhere (McAvoy, 1983; Luyben, 1990).
Tables 4.5 and 4.6 show the RGAs for the lab column and the high-purity column calculated using the average values for the overhead and bottom product steady-state process gains for the reflux and boilup rate changes. The elements of the RGA can vary from very large negative values to very large positive values. The closer an element is to 1.0, the less difference closing the other loop makes on the loop under consideration, implying less interaction. In the two cases considered presently, it can be seen that the elements of the RGA indicate a strong interaction between the controlled variable-manipulated variable pairs. The large values for λ_ij in the RGAs are rather typical for the chosen controlled variable-manipulated variable pairings.
The open-loop responses, the process gains, and the RGAs provide a qualitative and quantitative assessment of the nonlinear, nonstationary, and interactive nature of the two distillation columns.
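For a 2x2 system the RGA follows from a single relative gain; a minimal sketch using the rounded average lab-column gains from Table 4.1 (x_D paired with L, x_B with V). Because these gains are rounded, the result reproduces the magnitude of Table 4.5 but not its exact digits:

```python
# Relative gain array for a 2x2 gain matrix K (rows: outputs, cols: inputs).
# lambda_11 = K11*K22 / (K11*K22 - K12*K21); each row and column sums to 1.
def rga_2x2(K):
    K11, K12 = K[0]
    K21, K22 = K[1]
    lam11 = (K11 * K22) / (K11 * K22 - K12 * K21)
    return [[lam11, 1.0 - lam11], [1.0 - lam11, lam11]]

# Rounded average gains (Table 4.1): rows [x_D, x_B], columns [L, V].
K = [[3.07, -2.78],
     [2.46, -2.55]]
lam = rga_2x2(K)
```

The computed λ_11 of about 7.9 confirms the strong interaction the text reports for the energy balance pairing.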
Table 4.5. Relative Gain Array for the Lab Distillation Column using the Average Process Gains

         L       V
x_D     7.86   -6.86
x_B    -6.86    7.86
Table 4.6. Relative Gain Array for the High-Purity Distillation Column using the Average Process Gains

         L         V
x_D     3.3658   -2.2658
x_B    -2.2658    3.3658
CHAPTER V
MODEL-BASED CONTROL STRATEGY
The inherent nonlinearities in the behavior of chemical process systems, such as
distillation columns, present a challenging control problem. In spite of this knowledge,
chemical processes have traditionally used linear system analysis and tools for design of
controller structure because the demands for linear system analysis and implementation are
usually quite small. Also, the fact that there is an analytical basis for the linear systems
theory lends itself to more rigorous stability and performance proofs. However, the use of
linear system techniques can be quite limiting if the process behavior is highly nonlinear.
During the past decade, there has been a significant increase in the number of control system techniques that are based on nonlinear system concepts (Bequette, 1991). Model-based controllers are not a new concept; the Ziegler-Nichols tuning rules utilizing a process response curve for identification of the tuning parameters (K_C, τ_I, τ_D) of a PID controller are based on the model parameters of a first-order plus dead time model (K_p, τ, θ).
Some of the most significant developments to model-based control include algorithms
that use linear models such as Dynamic Matrix Control (DMC) (Prett and Garcia, 1988;
Cutler and Ramaker, 1980), Model Algorithmic Control (MAC) (Richalet et al., 1978),
Internal Model Control (IMC) (Garcia and Morari, 1982), and some of their extensions
that used nonlinear models. An in-depth review of some of the above techniques and their
related extensions is available in the papers by Bosley et al. (1992) and Bequette (1991).
The above techniques are all similar in the sense that they rely on dynamic models to predict the behavior of the process over some future time interval, and control actions are based on these model predictions. These techniques are, therefore, classified under the
broad category of Model Predictive Control (MPC). Another technique that uses
steady-state models with a reference system based on first-order dynamics has also been
studied and implemented extensively (Lee, 1993; Pandit et al., 1992; Ramchandran et al.,
1992; Rhinehart and Riggs, 1990; Riggs and Rhinehart, 1990; Lee and Sullivan, 1988).
We shall examine this technique in more detail.
5.1. Nonlinear Process Model-Based Control (Nonlinear PMBC)
The basis for nonlinear PMBC lies in the concept of what is called Generic Model Control (GMC) (Lee and Sullivan, 1988), and its closely aligned relatives known as Reference System Synthesis (RSS) (Bartusiak et al., 1988) and Internal Decoupling (Balchen et al., 1988). The strategy is to find values of the manipulated variables that force a model of the process to follow a desired reference system or trajectory.
Consider a dynamic model of a process described by a set of differential equations:

\dot{y} = f(y, u, d, p, t),  (5.1)

where \dot{y} is the change in the process outputs with respect to time, t; f is some nonlinear function; y, the vector of process outputs of dimension n; u, the vector of manipulated variables of dimension m; d, the vector of process disturbances of dimension l; and p, the vector of model parameters of dimension q. In this simplified case, we have considered a square system, i.e., the number of outputs and inputs are the same. However, the technique is not limited to only such systems (Lee and Sullivan, 1988).
When the process is away from its desired setpoint, y_sp, we would like the rate of change of y, i.e., \dot{y}, to be such that the process is returning towards the setpoint, i.e.,

\dot{y}_{sp} = K_1 (y_{sp} - y),  (5.2)

where K_1 is a diagonal matrix. In addition, we would like the process to have zero offset, i.e.,

\dot{y}_{sp} = K_2 \int_0^t (y_{sp} - y) dt,  (5.3)

where K_2 is another diagonal matrix. Therefore, a suitable reference system that can yield satisfactory control performance will be some combination of the above objectives, i.e.,

\dot{y}_{sp} = K_1 (y_{sp} - y) + K_2 \int_0^t (y_{sp} - y) dt.  (5.4)
It is desired that the control algorithm ensure that the rate of change of the outputs follow the selected reference system, i.e.,

\dot{y} = \dot{y}_{sp}.  (5.5)

Therefore, combining Equations 5.1 and 5.4 yields

f(y, u, d, p, t) = K_1 (y_{sp} - y) + K_2 \int_0^t (y_{sp} - y) dt.  (5.6)
The control law in Equation 5.6, to be solved at every sample time for the manipulated variables, is a set of nonlinear algebraic equations in the unknown manipulated variables. In the control law described in Equation 5.6, it is possible to obtain a solution only when the manipulated variable chosen to control a particular output appears in the model equation.
Henson and Seborg (1990) have shown that the RSS methods, such as GMC, are based
on principles of differential geometry and are known as systems of relative degree 1.
The process model in Equation 5.1 assumes that a dynamic model of the process can be derived. However, steady-state models that describe the nonlinear, interactive behavior of the process are more easily available. Also, the exact nature of the process is rarely known. In the face of these uncertainties, an approximate model of the form

f_{ss}(y, u, d, p, t) = 0,  (5.7)
represents the steady-state behavior of the process. Although these models describe the
steady-state nonlinear, interactive behavior of the process, some estimate of the process
dynamics is required. The most likely estimates available to the designer are the average
time constants of the process obtained from step response curves. Although these
estimates may be inaccurate at different operating conditions, the degree of approximation
is often sufficient to obtain good control performance. Assuming that the dynamics of the
process can be represented by a first-order model, a simple estimate of the time response of the output variables in moving from one steady-state to another can be given as

\dot{y}_{sp} = T^{-1} (y_{ss} - y),  (5.8)

where T is a diagonal matrix of the estimated open-loop time constants, and y_{ss} are the steady-state values of the output variables if no further control action is taken (Lee, 1993). The diagonal elements of the matrix T are averaged time constants of the output variables based on step changes of all input variables. Combining this approximate description of the process dynamics with the reference system in Equation 5.4, the ultimate response can be calculated as

y_{ss} = y + T [K_1 (y_{sp} - y) + K_2 \int_0^t (y_{sp} - y) dt].
Note that T K_1 and T K_2 are simply two other diagonal matrices, and so the form of the control law becomes

y_{ss} = y + K_1 (y_{sp} - y) + K_2 \int_0^t (y_{sp} - y) dt.  (5.9)
The control action required to achieve this performance can be determined by replacing y with y_{ss} in the nonlinear steady-state model described in Equation 5.7, and solving for the manipulated variables, u.
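One control interval of this steady-state GMC law can be sketched as follows: compute the target y_ss from Equation 5.9, then solve f_ss(y_ss, u) = 0 for u. The steady-state relation used here is a made-up monotonic example, not the column model, and the function names are illustrative:

```python
import math

def gmc_target(y, y_sp, err_int, K1, K2):
    # y_ss = y + K1*(y_sp - y) + K2*integral(y_sp - y) dt   (Eq. 5.9)
    return y + K1 * (y_sp - y) + K2 * err_int

def solve_for_u(f_ss, y_ss, lo, hi, tol=1e-10):
    # Bisection on u so that f_ss(y_ss, u) = 0; assumes a sign change on [lo, hi].
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f_ss(y_ss, lo) * f_ss(y_ss, mid) <= 0.0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Hypothetical steady-state relation between output and input: y = 1 - exp(-u)
f_ss = lambda y, u: (1.0 - math.exp(-u)) - y

y, y_sp, err_int = 0.80, 0.90, 0.0
y_ss = gmc_target(y, y_sp, err_int, K1=0.5, K2=0.05)
u = solve_for_u(f_ss, y_ss, lo=0.0, hi=10.0)
```

With a phenomenological model the root-finding step does the inversion; the next section replaces it with a trained plant-inverse network that returns u directly.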
5.2. Nonlinear Process Model-Based Control of Distillation Columns
Control of any process involves selection of the manipulated variables, and there are a
number of choices or control configuration schemes for a given process. Distillation
columns also have their share of these control schemes (McAvoy, 1983). We have chosen
to use the x_D-L, x_B-V configuration (also known as the energy balance scheme). The energy balance scheme is a simple scheme wherein the overhead and bottom product draw rates, D and B, respectively, are on level control, and the reflux rate, L, and the boilup rate, V, are the manipulated variables that control the overhead and bottom product
compositions, respectively. Large values for the relative gains are typical of the energy
balance scheme (McAvoy, 1983). Even though the energy balance scheme gives the
highest degree of interaction for the controlled variable-manipulated variable pairing (high
values for the elements of the RGA), it has the advantage of excellent disturbance
rejection features, and is the simplest scheme commonly used in industrial practice. In
distillation control, disturbance rejection is the more important requirement as most
columns often operate over fixed operating ranges, and are frequently subject to
disturbances.
If an approximate steady-state model is a phenomenological model, it is not necessary
for the model to be explicit in either the output or manipulated variables. But, if the
approximate steady-state model is a neural network model, then it is advantageous to have
a process inverse model because the manipulated variables that will give the desired
performance can be calculated directly.
5.2.1. Using Neural Networks for Distillation Control
The neural network models already developed are the plant-inverse models of the distillation columns, which take feed flowrate, F, feed composition, z, overhead composition, x_D, and bottom composition, x_B, as inputs, and calculate the reflux rate, L, and the boilup rate, V, required to maintain the distillation column overhead (x_D) and bottom (x_B) compositions at their desired setpoints, x_{D,SP} and x_{B,SP}, respectively.
Using the neural network steady-state model and the reference system of Equation 5.4, the control law in Equation 5.9 can be rewritten for the overhead and bottom compositions as

x_{D,SS} = x_D + K_{1D} (x_{D,SP} - x_D) + K_{2D} \int_0^t (x_{D,SP} - x_D) dt  (5.10)
and

x_{B,SS} = x_B + K_{1B} (x_{B,SP} - x_B) + K_{2B} \int_0^t (x_{B,SP} - x_B) dt,  (5.11)
where x_{D,SS} and x_{B,SS} are the steady-state target values, x_{D,SP} and x_{B,SP} are the desired setpoints, and x_D and x_B are the current values for the overhead and bottom compositions, respectively. K_{1D}, K_{2D}, K_{1B} and K_{2B} are the control law tuning constants, which are actually the product of the estimated average open-loop time constant, τ, and the elements of the diagonal matrices K_1 or K_2, respectively. The diagonal elements of the matrices are chosen for each output independently to obtain a "reasonable" response for the process system, the term "reasonable" implying a close match to the natural dynamic response of the process system.
It is important to ensure a bumpless transfer from the "manual" mode (open loop) to the "auto" mode (closed loop). Here, the process simulators start up in open loop. The initial reflux and boilup rates are determined from the respective neural network inverse models given actual values of the feed flowrate and composition along with some desired x_{D,SS} and x_{B,SS}, and the processes are allowed to settle to near steady-state conditions. When the controller is switched on, it is brought on-line with the intention of maintaining the overhead and bottom product compositions at the last measured values. This prevents an old setpoint "bump." Under this condition, x_{D,SP} ≈ x_D and x_{B,SP} ≈ x_B, which implies that the contributions due to the error and cumulative error terms in Equations 5.10 and 5.11 will be negligible, and a bias can be calculated for each controlled variable as follows:

b_{xD} = x_{D,SS} - x_D  (5.12)

and

b_{xB} = x_{B,SS} - x_B,  (5.13)

where b_{xD} and b_{xB} are the biases on the overhead and bottom product compositions, respectively. The steady-state targets, x_{D,SS} and x_{B,SS}, are operator-specified
values. For the start-up operation, they are not calculated using the control law in Equations 5.10 or 5.11. The overhead and bottom product compositions, x_D and x_B, are measured from the process. The biases represent the mismatch between the process and the neural network model, and are calculated only once, whenever the controller is switched to automatic. The control law with the bias term included then reads as follows:

x_{D,SS} = b_{xD} + x_D + K_{1D} (x_{D,SP} - x_D) + K_{2D} \int_0^t (x_{D,SP} - x_D) dt  (5.14)

and

x_{B,SS} = b_{xB} + x_B + K_{1B} (x_{B,SP} - x_B) + K_{2B} \int_0^t (x_{B,SP} - x_B) dt.  (5.15)
Figure 5.1 gives a schematic description of the nonlinear PMBC control strategy that uses the neural network steady-state model. The nonlinear PMBC controller "looks" at the process at every controller time interval and calculates target values x_{D,SS} and x_{B,SS} based on Equations 5.14 and 5.15. The steady-state target values along with the measured values for feed flowrate, F, and feed composition, z, are used in the neural network model for the distillation column. The neural network then calculates the reflux rate, L, and the boilup rate, V, that will drive the process to the temporary steady-state targets, x_{D,SS} and x_{B,SS}.
Changes in the disturbances (feed flowrate and feed composition) are fed directly into the model, which enables the nonlinear PMBC controller to provide a nonlinear feedforward response as well as nonlinear feedback. The nonlinear PMBC law serves to linearize the outputs from the process with the help of the steady-state approximate model, assuming first-order process dynamics. This technique has also been referred to as external (input-output) linearization (Isidori, 1989; Isidori et al., 1981). Also, the PMBC controller has the advantage of using the process model to provide direct decoupling of the manipulated variables for a multiple-input multiple-output system.
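One sampling interval of this strategy can be sketched as below: the reference system (Equations 5.14 and 5.15) produces the steady-state targets, and the plant-inverse model maps (F, z, x_D,ss, x_B,ss) to (L, V). The linear stand-in inverse model and all gains here are invented for illustration; the trained network would take its place:

```python
def reference_target(x, x_sp, err_int, bias, K1, K2):
    # x_ss = b + x + K1*(x_sp - x) + K2*integral(x_sp - x) dt   (Eqs. 5.14-5.15)
    return bias + x + K1 * (x_sp - x) + K2 * err_int

def inverse_model(F, z, xD_ss, xB_ss):
    # Stand-in for the trained plant-inverse network (illustrative gains only).
    L = 0.1 + 0.5 * F * z + 0.2 * (xD_ss - xB_ss)
    V = L + 0.1 * F
    return L, V

# Measurements, setpoints, and biases at this sample (hypothetical values)
F, z = 0.35, 0.30
xD, xB = 0.89, 0.020
xD_sp, xB_sp = 0.91, 0.025
bD, bB = 0.0, 0.0  # biases from the bumpless-transfer step

xD_ss = reference_target(xD, xD_sp, 0.0, bD, K1=0.4, K2=0.02)
xB_ss = reference_target(xB, xB_sp, 0.0, bB, K1=0.4, K2=0.02)
L, V = inverse_model(F, z, xD_ss, xB_ss)
```

Because the measured disturbances F and z enter the inverse model directly, the feedforward action described in the text falls out of the same calculation as the feedback action.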
Figure 5.1. The neural network model-based control strategy.
CHAPTER VI
CONTROL RESULTS
The neural network model-based controller includes two elements: the reference systems defined by Equations 5.14 and 5.15, and the steady-state neural network process inverse model. The neural network controllers were tested for both servo (setpoint changes) as well as regulatory (disturbance rejection) modes of operation on the dynamic simulators of both columns. The results of the neural network controllers were benchmarked against conventional PI controllers with a feedforward element for feed flowrate and feed composition changes. Decouplers were not included in the conventional strategy because the cross gain changes required such extensive gain scheduling that they could not be structured as per conventional industrial practice.
Several controller tests were performed to check for setpoint changes and for disturbance rejection capabilities. The lab column has a faster response time than the high-purity column; therefore, one 60-hour run enabled study of both servo and regulatory modes of operation for the lab column. The high-purity column, with its slower response time, required separate tests to present effectively the servo and regulatory modes. Table 6.1 gives a description of the controller tests for the lab column, while Table 6.2 gives the description of the servo-mode controller tests, and Tables 6.3 and 6.4 describe the regulatory-mode controller tests for the high-purity column.
6.1. Lab Column Controller Tests
Figures 6.1a-e show the results from the controller tests described in Table 6.1 for the lab column with the neural network model-based controller. Figure 6.1a shows the response of the controlled variables, i.e., the overhead and bottom product compositions, to setpoint changes and the variations in the process disturbances; the measured
Table 6.1. Description of the Controller Tests for the Lab Distillation Column

Time (hours)  Description of the Changes
0.0    Open-loop start-up with the following nominal values: F = 0.35 lbmoles/h; z = 0.3 mole fraction methanol; L = 0.26 lbmoles/h; V = 0.37 lbmoles/h; η = 80%
10.0   Controller switched on after bumpless transfer operation: x_D,SP = 0.9 mole fraction methanol; x_B,SP = 0.014 mole fraction methanol
15.0   Dual Composition Setpoint Change: x_D,SP = 0.91 mole fraction methanol; x_B,SP = 0.025 mole fraction methanol
20.0   Dual Composition Setpoint Change: x_D,SP = 0.92 mole fraction methanol; x_B,SP = 0.035 mole fraction methanol
25.0   Dual Composition Setpoint Change: x_D,SP = 0.93 mole fraction methanol; x_B,SP = 0.025 mole fraction methanol
30.0   Feed flowrate upset: F = 0.42 lbmoles/h (+20% change from nominal value)
35.0   Feed flowrate upset: F = 0.28 lbmoles/h (-20% change from nominal value)
40.0   Feed flowrate upset: F = 0.35 lbmoles/h (brought back to nominal value)
45.0   Feed composition upset: z = 0.4 mole fraction methanol (+33% change from nominal value)
50.0   Feed composition upset: z = 0.2 mole fraction methanol (-33% change from nominal value)
55.0   Feed composition upset: z = 0.3 mole fraction methanol (brought back to nominal value)
60.0   End of Controller Tests
Table 6.2. Description of the Servo-mode Controller Test for the High-Purity Distillation Column

Time (hours)  Description of the Changes
0.0    Open-loop start-up with nominal values: F = 800.0 lbmoles/h; z = 0.12 mole fraction methanol; L = 180.0 lbmoles/h; V = 258.0 lbmoles/h; η = 80%
10.0   Controller switched on after bumpless transfer operation: x_D,SP = 0.99915 mole fraction methanol (850 ppm impurity); x_B,SP = 0.0022 mole fraction methanol (2200 ppm impurity)
15.0   Dual Composition Setpoint Change: x_D,SP = 0.999 mole fraction methanol (1000 ppm impurity); x_B,SP = 0.001 mole fraction methanol (1000 ppm impurity)
25.0   Dual Composition Setpoint Change: x_D,SP = 0.9995 mole fraction methanol (500 ppm impurity); x_B,SP = 0.0005 mole fraction methanol (500 ppm impurity)
35.0   Dual Composition Setpoint Change: x_D,SP = 0.999 mole fraction methanol (1000 ppm impurity); x_B,SP = 0.001 mole fraction methanol (1000 ppm impurity)
40.0   Dual Composition Setpoint Change: x_D,SP = 0.9985 mole fraction methanol (1500 ppm impurity); x_B,SP = 0.0015 mole fraction methanol (1500 ppm impurity)
50.0   End of Controller Test for Servo mode of operation
Table 6.3. Description of the Regulatory-mode (Feed Flowrate Upsets) Controller Test for the High-Purity Distillation Column

Time (hours)  Description
0.0    Open-loop start-up with nominal values: F = 800.0 lbmoles/h; z = 0.12 mole fraction methanol; L = 180.0 lbmoles/h; V = 258.0 lbmoles/h; η = 80%
10.0   Controller switched on after bumpless transfer operation: x_D,SP = 0.99915 mole fraction methanol (850 ppm impurity); x_B,SP = 0.0022 mole fraction methanol (2200 ppm impurity)
15.0   Dual Composition Setpoint Change: x_D,SP = 0.999 mole fraction methanol (1000 ppm impurity); x_B,SP = 0.001 mole fraction methanol (1000 ppm impurity)
25.0   Feed Flowrate upset: F = 900.0 lbmoles/h (+12.5% change from nominal value)
35.0   Feed Flowrate upset: F = 800.0 lbmoles/h (brought back to nominal value)
45.0   Feed Flowrate upset: F = 700.0 lbmoles/h (-12.5% change from nominal value)
55.0   End of Controller Test for Feed Flowrate Upsets
Table 6.4. Description of the Regulatory-mode (Feed Composition Upsets) Controller Test for the High-Purity Distillation Column

Time (hours)  Description
0.0    Open-loop start-up with nominal values: F = 800.0 lbmoles/h; z = 0.12 mole fraction methanol; L = 180.0 lbmoles/h; V = 258.0 lbmoles/h; η = 80%
10.0   Controller switched on after bumpless transfer operation: x_D,SP = 0.99915 mole fraction methanol (850 ppm impurity); x_B,SP = 0.0022 mole fraction methanol (2200 ppm impurity)
15.0   Dual Composition Setpoint Change: x_D,SP = 0.999 mole fraction methanol (1000 ppm impurity); x_B,SP = 0.001 mole fraction methanol (1000 ppm impurity)
25.0   Feed Composition upset: z = 0.14 mole fraction methanol (+16.6% change from nominal value)
35.0   Feed Composition upset: z = 0.12 mole fraction methanol (brought back to nominal value)
45.0   Feed Composition upset: z = 0.10 mole fraction methanol (-16.6% change from nominal value)
55.0   End of Controller Test for Feed Composition Upsets
disturbances, feed flowrate and feed composition, are shown in Figure 6.1b; the unmeasured disturbance, Murphree stage efficiency, is shown in Figure 6.1e. Figure 6.1c shows the changes in the manipulated variables, and Figure 6.1d shows the steady-state target values for the overhead and bottom product compositions with time. The neural network model presumes that the steady-state targets are the "true" setpoints to which it tries to control the process. The difference between the actual setpoint and the steady-state target is one indication of the mismatch between the neural network model and the actual process.
After the open-loop process start-up, the neural network controller is brought on-line (bumpless transfer) to maintain the overhead and bottom product compositions at the values measured at 10 hours. The reflux and boilup rates change (Figure 6.1c) to maintain the overhead and bottom compositions at their respective setpoints. Even though there is no nominal change, the random drift in the disturbances and Murphree stage efficiency requires a noticeable change in the manipulated variables. The changing Murphree stage efficiency is an unmeasured disturbance, and the neural network controller corrects for this change purely on feedback. Feed flowrate and composition influences are fed forward through the steady-state neural network model without any dynamic compensation. Dual composition setpoint changes are made at 15, 20 and 25 hours (see Figure 6.1a). The setpoint changes are filtered to give a reference trajectory for the setpoints. Feed flowrate disturbances are introduced at 30, 35 and 40 hours, and feed composition disturbances are introduced at 45, 50 and 55 hours (see Figure 6.1b). Each change was designed to make the lab column operate under different conditions. Note from Figure 6.1c that the manipulated variables work much harder for the second feed composition change, yet the noise on the controlled variables is the same. This shows the ability of the controller to understand the process gain changes, and to reflect them in the manipulated variable action.
Figure 6.1. Neural network model-based controller without dynamic compensation on lab column, (a) Overhead and bottom product compositions.
Figure 6.1. Continued, (b) Measured process disturbances.
Figure 6.1. Continued, (c) Manipulated variables.
Figure 6.1. Continued, (d) Steady-state targets for controlled variables.
Figure 6.1. Continued, (e) Unmeasured process disturbance.
As a benchmark, PI controllers with a static feedforward correction for feed flowrate
and feed composition influences were also implemented on the lab column and tested for
the same servo and regulatory changes described in Table 6.1. Figure 6.2a shows the
response of the controlled variables, and Figure 6.2b shows the changes in the manipulated
variables as determined by the PI controllers. Table 6.5 gives a comparison of the values
of the integrals of squared error (ISE), absolute error (IAE), and valve travel (VT) (a
penalty function that quantifies "the amount of work" done by the manipulated variables,
and is defined as the cumulative sum of |Z_i - Z_{i+1}|, where i is the sampling index).
All three performance measures are normalized by the time period over which the integrals
are accumulated. Also, the sequence of disturbances, noise and drifts was kept identical to
that used in the neural network controller tests.
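The three performance measures defined above can be computed from sampled data roughly as follows. The function and variable names are illustrative, not taken from the dissertation's implementation; only the definitions (integrals approximated by sums, cumulative valve travel, normalization by the accumulation period) come from the text.

```python
def performance_measures(errors, moves, dt):
    """ISE and IAE for one controlled variable and VT for one
    manipulated variable, each normalized by the accumulation
    period, per the definitions in the text.
    errors: e_i = setpoint - measurement at each sampling
    moves:  manipulated-variable values Z_i at each sampling
    dt:     sampling interval
    """
    T = dt * len(errors)  # time period for normalization
    ise = sum(e * e for e in errors) * dt / T
    iae = sum(abs(e) for e in errors) * dt / T
    # cumulative sum of |Z_i - Z_{i+1}| over consecutive samplings
    vt = sum(abs(moves[i] - moves[i + 1])
             for i in range(len(moves) - 1)) / T
    return ise, iae, vt
```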
Both controllers were tuned to subjectively balance minimization of the ISE, IAE,
and VT for both the servo and regulatory modes, and the same tuning constants were used
throughout all performance tests. The Ziegler-Nichols tuning rules, applied to the
open-loop step response tests, were used to calculate the initial values of the tuning
constants for the feedforward PI controllers. While there are methods for obtaining
estimates for the tuning constants in the GMC law (Lee, 1993), the neural network
controller was tuned heuristically by increasing the proportional gains (K1D and K1B in
Equations 5.14 and 5.15) until oscillations were observed. Then the integral constants (K2D
and K2B in Equations 5.14 and 5.15) were increased to remove offset in a reasonable time.
While the approach is not optimal, it reflects the industrial practice of tuning
controllers on-line, and is simple to implement.
The feedforward PI controller shows good control for the setpoint changes. But when the
operating conditions change, it is unable to predict the process gain changes,
implying the need for gain scheduling and more advanced PI control strategies. The
nature of the changes in the manipulated variables (Figures 6.1c and 6.2b) also shows that the
Figure 6.2. Static feedforward PI controller without dynamic compensation on lab column. (a) Overhead and bottom product compositions.
Figure 6.2. Continued. (b) Manipulated variables.
Table 6.5. Comparison of Controller Performances for the Lab Distillation Column

Neural Network Model-Based Controller

Time (hours)   ISE(x_B)   ISE(x_D)   IAE(x_B)   IAE(x_D)   VT(V)*   VT(L)*
15.0           1.161e-4   1.371e-4   1.559e-1   1.730e-1   0.5193   0.5300
20.0           1.235e-4   1.993e-4   1.602e-1   1.973e-1   0.5869   0.5670
25.0           1.291e-4   1.227e-4   1.672e-1   1.647e-1   0.5730   0.5944
30.0           1.196e-4   1.059e-4   1.639e-1   1.478e-1   0.9302   0.9596
35.0           6.606e-4   4.023e-4   2.869e-1   2.368e-1   1.7687   1.8918
40.0           2.804e-3   1.000e-3   5.575e-1   3.412e-1   1.1883   1.1816
45.0           1.107e-3   4.440e-4   3.727e-1   2.629e-1   1.3471   1.4232
50.0           4.553e-4   2.379e-4   2.482e-1   2.071e-1   1.4846   1.6371
55.0           4.594e-3   4.452e-3   6.239e-1   6.650e-1   1.3548   2.6353
60.0           1.031e-3   8.217e-4   3.532e-1   2.999e-1   1.3746   1.5011

Conventional Feedback PI plus Feedforward Controller

Time (hours)   ISE(x_B)   ISE(x_D)   IAE(x_B)   IAE(x_D)   VT(V)*   VT(L)*
15.0           1.251e-4   5.205e-4   1.639e-1   3.732e-1   0.9024   1.3690
20.0           1.248e-4   3.967e-4   1.651e-1   2.917e-1   0.8914   1.1833
25.0           1.640e-4   2.704e-4   1.852e-1   2.395e-1   0.9379   1.0184
30.0           9.432e-4   1.427e-3   5.062e-1   6.190e-1   0.9086   0.8570
35.0           1.624e-3   1.193e-3   5.323e-1   5.148e-1   1.0229   0.9237
40.0           4.364e-3   3.258e-3   8.787e-1   8.534e-1   1.1243   1.0084
45.0           1.746e-3   1.520e-3   6.250e-1   6.410e-1   1.0591   0.9731
50.0           8.942e-4   2.693e-4   3.413e-1   2.382e-1   1.0836   0.9931
55.0           1.433e-3   8.646e-4   4.283e-1   3.605e-1   1.0843   1.1811
60.0           4.716e-4   2.637e-4   2.506e-1   2.203e-1   0.9883   1.0459

* VT: Valve Travel for the manipulated variables (reflux rate, L, and boilup rate, V).
neural network controller is a nonlinear controller while the PI controller is a linear
controller. The neural network controller makes aggressive changes in the manipulated
variables when compared with the PI controller because the neural network controller uses
a nonlinear model of the process, which enables better prediction of the required changes in
manipulated variable action over the entire operating range.
The neural network model-based controller was also tested with dynamic
compensation for feed flowrate and feed composition for the tests described in Table 6.1.
Figure 6.3a shows the response of the overhead and bottom product compositions;
Figure 6.3b shows the changes in the manipulated variables; and Figure 6.3c shows the
steady-state target values for the overhead and bottom product compositions calculated by
the control laws. The controller performance did not show any significant improvement
that warranted dynamic compensation of the measured disturbances. Comparison of
Figures 6.1d and 6.3c shows the effect of dynamic compensation on the steady-state target
values. With dynamic compensation (Figure 6.3c), the controller does not take corrective
action immediately, and therefore makes the change in the "right" direction when the
disturbance is noticed. On the other hand, without dynamic compensation (Figure 6.1d),
the controller reacts too soon, and therefore heads in the "wrong" direction initially
before turning around.
Figures 6.4a-c show an expanded time-scale representation of the controller test
results shown in Figure 6.1a. Figures 6.4a, b, and c show the setpoint changes, the feed
flowrate disturbances, and the feed composition disturbances, respectively. The results
are for the neural network model-based controller without dynamic compensation for the
measured disturbances.
Figure 6.3. Neural network model-based controller with dynamic compensation on lab column. (a) Overhead and bottom product compositions.
Figure 6.3. Continued. (b) Manipulated variables.
Figure 6.3. Continued. (c) Steady-state targets for controlled variables.
Figure 6.4. Response of controlled variables to neural network model-based controller without dynamic compensation on lab column. (a) Setpoint changes.
Figure 6.4. Continued. (b) Feed flowrate changes.
Figure 6.4. Continued. (c) Feed composition changes.
6.2. High-Purity Column Controller Tests
Figures 6.5a-e show the control results for the controller tests described in Table 6.2
for the high-purity column with the neural network model-based controller. Figure 6.5a
shows the response of the controlled variables; Figure 6.5b shows the variations in feed
flowrate and composition (the measured disturbances) affecting the process; Figure 6.5c
shows the changes in the manipulated variables; Figure 6.5d shows the steady-state target
values for the overhead and bottom product compositions; and Figure 6.5e shows the
variations in Murphree stage efficiency (the unmeasured disturbance). The overhead and
bottom responses show similar initial open-loop start-up conditions to those seen in the
case of the lab column. The controller is switched on at 10 hours, followed by a series of
setpoint changes. Figures 6.6a and b show the corresponding responses obtained by using
a feedforward PI controller on the same column for the setpoint changes described in
Table 6.2, and with the same disturbances, noise, and drifts as used for the neural network
model-based controller.
Figures 6.7a-d show control results for the tests described in Table 6.3, with Figures
6.7a, b, c and d showing the response of the controlled variables, the feed flowrate
disturbances, the changes in the manipulated variables, and the steady-state target values,
respectively, for the neural network model-based controller. Once again, the column is
started up in open loop, and the neural network controller is used to bring the column to
the base case operating conditions of 1000 ppm impurity in both the overhead and bottom
products. The feed flowrate disturbances affect the column starting at 25 hours. Figures
6.8a and b show the corresponding results from the feedforward PI controller for the same
feed flowrate upsets at the same base operating conditions.
Figures 6.9a-d show control results for the tests described in Table 6.4, with Figures
6.9a, b, c and d showing the response of the controlled variables, the feed flowrate
Figure 6.5. Setpoint changes with neural network model-based controller without dynamic compensation on high-purity column. (a) Overhead and bottom product compositions.
Figure 6.5. Continued. (b) Measured process disturbances.
Figure 6.5. Continued. (c) Manipulated variables.
Figure 6.5. Continued. (d) Steady-state targets for controlled variables.
Figure 6.5. Continued. (e) Unmeasured process disturbance.
Figure 6.6. Setpoint changes with static feedforward PI controller without dynamic compensation on high-purity column. (a) Overhead and bottom product compositions.
Figure 6.6. Continued. (b) Manipulated variables.
Figure 6.7. Feed flowrate changes with neural network model-based controller without dynamic compensation on high-purity column. (a) Overhead and bottom product compositions.
Figure 6.7. Continued. (b) Measured process disturbance.
Figure 6.7. Continued. (c) Manipulated variables.
Figure 6.7. Continued. (d) Steady-state targets for controlled variables.
Figure 6.8. Feed flowrate changes with static feedforward PI controller without dynamic compensation on high-purity column. (a) Overhead and bottom product compositions.
Figure 6.8. Continued. (b) Manipulated variables.
Figure 6.9. Feed composition changes with neural network model-based controller without dynamic compensation on high-purity column. (a) Overhead and bottom product compositions.
Figure 6.9. Continued. (b) Measured process disturbance.
Figure 6.9. Continued. (c) Manipulated variables.
Figure 6.9. Continued. (d) Steady-state targets for controlled variables.
disturbances, the changes in the manipulated variables, and the steady-state target values,
respectively, for the neural network model-based controller. Figures 6.10a and b show the
corresponding results from the feedforward PI controller for the same feed composition
upsets at the same base operating conditions.
The neural network model-based controller shows superior performance because of
its ability to predict the process behavior with a fair degree of accuracy over the entire
operating range. The feedforward PI controller shows good control, in fact arguably
better control, for the setpoint changes, but shows its deficiency in predicting the process
gain changes when the process moves to different operating conditions. The controllers
were also tested for feed composition changes, and similar performances were observed.
Tables 6.6 and 6.7 give a comparison of the performance of the two controllers based on
the normalized values for ISE, IAE and VT for the servo and regulatory tests performed
on the high-purity column.
Figure 6.10. Feed composition changes with static feedforward PI controller without dynamic compensation on high-purity column. (a) Overhead and bottom product compositions.
Figure 6.10. Continued. (b) Manipulated variables.
Table 6.6. Neural Network Model-Based Controller Performance for the High-Purity Distillation Column

Set-point Changes

Time (hours)   ISE(x_B)   ISE(x_D)   IAE(x_B)   IAE(x_D)   VT(V)*   VT(L)*
15.0           1.117e-6   2.019e-6   1.489e-2   2.154e-2   113.84    89.80
25.0           2.286e-6   1.269e-5   3.084e-2   6.451e-2   214.88   183.64
35.0           1.844e-6   2.602e-6   2.775e-2   3.383e-2   210.13   189.79
45.0           1.846e-6   3.639e-6   2.811e-2   3.936e-2   220.69   182.55
55.0           2.019e-6   3.060e-6   2.885e-2   3.629e-2   223.67   191.86

Feed Composition Changes

Time (hours)   ISE(x_B)   ISE(x_D)   IAE(x_B)   IAE(x_D)   VT(V)*   VT(L)*
15.0           1.117e-6   2.019e-6   1.489e-2   2.154e-2   113.84    89.80
25.0           2.286e-6   1.269e-5   3.084e-2   6.451e-2   214.88   183.64
35.0           3.905e-6   7.072e-6   3.462e-2   4.956e-2   279.15   200.41
45.0           4.541e-6   1.100e-5   3.638e-2   5.868e-2   241.10   191.84
55.0           3.234e-6   1.100e-5   3.372e-2   5.659e-2   224.74   200.43

Feed Flowrate Changes

Time (hours)   ISE(x_B)   ISE(x_D)   IAE(x_B)   IAE(x_D)   VT(V)*   VT(L)*
15.0           1.117e-6   2.019e-6   1.489e-2   2.154e-2   113.84    89.80
25.0           2.286e-6   1.269e-5   3.084e-2   6.451e-2   214.88   183.64
35.0           2.809e-6   1.541e-5   3.260e-2   6.660e-2   274.67   228.63
45.0           2.952e-6   1.796e-5   3.305e-2   6.753e-2   254.35   208.34
55.0           3.745e-6   1.510e-5   3.336e-2   7.766e-2   254.41   211.18

* VT: Valve Travel for the manipulated variables (reflux rate, L, and boilup rate, V).
Table 6.7. Conventional Feedback PI plus Feedforward Controller Performance for the High-Purity Distillation Column

Set-point Changes

Time (hours)   ISE(x_B)   ISE(x_D)   IAE(x_B)   IAE(x_D)   VT(V)     VT(L)
15.0           2.331e-6   9.110e-7   2.245e-2   1.389e-2   1119.72   283.77
25.0           3.594e-6   5.047e-6   3.917e-2   4.572e-2   1660.48   539.19
35.0           3.103e-6   3.370e-6   3.635e-2   3.849e-2   1453.63   548.58
45.0           3.107e-6   2.250e-6   3.625e-2   3.120e-2   1549.52   548.19
55.0           3.797e-6   2.394e-6   3.933e-2   3.184e-2   1861.75   588.38

Feed Composition Changes

Time (hours)   ISE(x_B)   ISE(x_D)   IAE(x_B)   IAE(x_D)   VT(V)     VT(L)
15.0           2.331e-6   9.110e-7   2.245e-2   1.389e-2   1119.72   283.77
25.0           3.594e-6   5.047e-6   3.917e-2   4.572e-2   1660.48   539.19
35.0           5.637e-5   2.445e-5   1.601e-1   9.775e-2   1830.71   573.08
45.0           3.570e-6   1.411e-5   1.220e-1   7.104e-2   1539.18   571.32
55.0           5.075e-5   1.264e-5   1.501e-1   7.359e-2   1590.38   605.50

Feed Flowrate Changes

Time (hours)   ISE(x_B)   ISE(x_D)   IAE(x_B)   IAE(x_D)   VT(V)     VT(L)
15.0           2.331e-6   9.110e-7   2.245e-2   1.389e-2   1119.72   283.77
25.0           3.594e-6   5.047e-6   3.917e-2   4.572e-2   1660.48   539.19
35.0           1.199e-4   2.119e-4   2.378e-1   2.882e-1   1805.43   568.95
45.0           6.884e-5   1.341e-4   1.819e-1   2.416e-1   1487.70   552.94
55.0           1.327e-4   1.936e-4   2.591e-1   3.046e-1   1550.51   587.50
CHAPTER VII
PROCESS-MODEL MISMATCH
The issue of process-model mismatch is important from a model-based control
viewpoint because the efficacy of the control strategy depends critically on the model.
While a "good" model can greatly enhance the performance of the controller, a "bad"
model can easily make things worse. The question then becomes: what makes a model
"good" or "bad" from a process control perspective? It is useful to understand the basic
function of the model and how it can help improve the control of a process whose
behavior is highly nonlinear, nonstationary, and interactive.
7.1. Process-Model Mismatch
Conventional control structures based on classical PID controllers assume that the
process behavior over a small operating range exhibits a linear dependence between the
controlled variable-manipulated variable pairings. Hence, in this operating region, linear
control laws are applicable; and as the theory is based on linear mathematical concepts, an
optimum controller configuration can be determined from exact analytical solutions.
However, the assumption of linearity is valid only over very small operating ranges, as
most chemical processes are inherently nonlinear in their behavior. Therefore, as
operating conditions change, the optimal settings also change. It is not practicable to
correct constantly for changing operating conditions. Therefore, some compromise is sought,
and the controllers are tuned to obtain a "satisfactory" performance over a wide range of
operating conditions. Also, chemical processes exhibit nonstationary behavior, making
identification of the process (an essential step in the application of all linear control structures)
difficult because of the inability to identify the exact nature of the process. In addition,
there is the inherent interaction between the controlled variable-manipulated variable pairs.
It is important to note that conventional PID controllers are model-based controllers; the
models used to identify the process are linear models of the process over a sufficiently
small operating range where the assumption of linearity is valid. The simple linear control
laws work fine for a number of cases, but as the process behavior becomes more
complex in terms of interactions and nonlinearity, the controller performance
deteriorates because the models are no longer able to account for the nonlinear process
behavior and are unable to decouple the interactions between the controlled variable-
manipulated variable pairs.
The models in advanced model-based control strategies enable decoupling of the
interactions between the controlled variables and manipulated variables. In addition, if the
models are nonlinear, they provide a better understanding of the process behavior over a
wider range of operating conditions. The controller uses the model to decouple the
interactions between the controlled variables and manipulated variables while taking into
account the nonlinear process behavior, and therefore it should have the ability to
control the process better. It must be noted that not all advanced model-based controllers
use nonlinear models; many model predictive controllers use linear models (Bosley
et al., 1992).
Theoretically, if a model were an exact representation of the nonlinear, interactive
steady-state and dynamic process behavior, then it should provide the best control
performance under any operating condition. However, this is never realized because no model
is ever exact. Chemical processes tend to be nonstationary and show different
characteristics as operating conditions change, making the model inexact with respect to
the process. The extent of the inexactness between the actual process and the model of
the process is called the process-model mismatch. The mismatch between the process and
the model is one measure of the ability of the controller to control a process. Oftentimes,
as the mismatch increases, the controller performance can deteriorate rapidly. Controller
models may require adaptation (or parameter adjustment), either periodic (Riggs and
Rhinehart, 1990) or on-line (Rhinehart and Riggs, 1991), in order to minimize the process-
model mismatch and yield satisfactory controller performance.
Since it is virtually impossible to come up with an exact representation of the real
process behavior over a wide range of operating conditions, we attempt to capture the
major nonlinear characteristics of the process in the model by choosing a suitable number
of adjustable parameters. The aim in nonlinear PMBC is to develop models that can
approximate the real process behavior with sufficient accuracy. In the nonlinear PMBC
formulation using the GMC law, the integral term in the control law (Equations 5.14 and
5.15) provides an additional mechanism that adjusts for the process-model mismatch.
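As a rough illustration of how proportional and integral action in a GMC-type law can map the current error onto a steady-state target for the inverse model, a generic form can be sketched as below. This is an assumed, generic GMC sketch, not a reproduction of Equations 5.14 and 5.15 (which are not restated in this chapter); the gain names and the target form are illustrative only.

```python
def gmc_target(x_sp, x, err_integral, K1, K2, dt):
    """Generic GMC-style target calculation (an assumption, not the
    dissertation's Equations 5.14-5.15): the target composition
    passed to the steady-state inverse model is the measurement
    shifted by proportional and integral action on the error.
    The integral term is what absorbs process-model mismatch."""
    err = x_sp - x
    err_integral += err * dt          # accumulate the error integral
    x_target = x + K1 * err + K2 * err_integral
    return x_target, err_integral
```

With zero mismatch the integral term settles at zero; persistent mismatch leaves a nonzero integral that biases the target until the offset is removed.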
7.2. Process-Model Mismatch for the Distillation Columns
The neural network models developed to control the distillation columns were
intentionally kept different from the dynamic simulations of the two methanol-water
distillation columns; the dynamic simulators represent the processes being controlled. The
dynamic simulators for the distillation columns were not used to obtain data to train the
neural network models. All the data for neural network training and testing were obtained
from the CAD simulations. While an empirically correlated polynomial equation obtained
from regressing experimental data was used for the VLE in the dynamic simulators, the
NRTL VLE model was used in the steady-state CAD simulations. In the CAD
simulations a Murphree stage efficiency of 75% was used in obtaining all the data for
network training and testing for both columns, while in the dynamic simulations
nominal Murphree stage efficiencies of 80% and 85% were used in the lab column and the
high-purity column, respectively. Also, in the dynamic simulators, a first-order
autoregressive drift was added to the process inputs, feed flowrate and feed composition,
and to the nominal Murphree stage efficiency. The neural network models do not have
any additional parameters that are adjusted, either periodically or on-line, once the
networks have been trained.
In order to measure the extent of the process-model mismatch, 25 data points from
the two testing data sets were selected at random and presented to the trained neural
networks representing the two distillation columns. The neural networks were used to
calculate the reflux and boilup rates that would be required to achieve the steady-state
conditions defined by the values of F, z, x_D and x_B. The predicted values for L and V were
then used in the dynamic simulators along with values for F and z. The dynamic
simulators were allowed to run until "near" steady-state conditions were identified, and the
steady-state values for x_D and x_B were noted. The fact that the "process" (the dynamic
simulators) and the "model" (the neural networks trained using CAD data) are different is
illustrated in Figures 7.1 and 7.2. Figure 7.1a compares the steady-state values of the
overhead product composition obtained from the dynamic process simulator for the lab
column with those used in the neural network model of the lab column. Figure 7.1b shows
a similar comparison for the bottom product composition in the lab column. Figures 7.2a
and 7.2b show similar comparisons for the overhead and bottom product compositions
for the high-purity column.
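The mismatch test described above can be outlined as a short loop; `inverse_net` and `simulate_to_ss` are placeholders standing in for the trained network and the dynamic column simulator, and the function name is illustrative.

```python
def mismatch_check(test_points, inverse_net, simulate_to_ss):
    """Sketch of the mismatch test: for each sampled steady-state
    record (F, z, x_D, x_B), the trained inverse network predicts
    the (L, V) that should achieve it; the dynamic simulator is
    then run to near steady state with those inputs, and the
    achieved compositions are paired with the targets.
    inverse_net and simulate_to_ss are placeholder callables."""
    records = []
    for F, z, xD, xB in test_points:
        L, V = inverse_net(F, z, xD, xB)        # model's answer
        xD_ss, xB_ss = simulate_to_ss(F, z, L, V)  # process's answer
        records.append((xD, xD_ss, xB, xB_ss))
    return records
```

Plotting target against achieved composition for each record gives exactly the parity plots of Figures 7.1 and 7.2.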
The points indicated by the triangles represent the steady-state values of the overhead
and bottom compositions obtained from an idealized dynamic simulator (i.e., no drifts on
feed flowrate, feed composition, and Murphree stage efficiency, and no noise on the
measured variables), while the circles represent the "near" steady-state values for the same
variables from the dynamic simulator with all the "bells and whistles" (noise and drifts
included). It is worth noting that the two neural networks show distinctly different
characteristics for the "noisy" condition when compared with the "ideal" conditions.
Under ideal conditions, the same neural network model is closer to the real process
Figure 7.1. Steady-state process-model mismatch for the lab column. (a) Overhead product composition.
Figure 7.1. Continued. (b) Bottom product composition.
Figure 7.2. Steady-state process-model mismatch for the high-purity column. (a) Overhead product composition.
Figure 7.2. Continued. (b) Bottom product composition.
behavior, while under noisy conditions, the predictions are distributed evenly around the
actual process behavior. Even though the randomization introduced in the dynamic
simulator to simulate the "real world" experience tends to increase the process-model
mismatch at any one time instant, the neural network is able to control the process
satisfactorily because the neural network model is closer to the idealized process behavior.
The noise and other random fluctuations do not have any adverse influence on the
network predictions. The networks have successfully extracted the phenomenological
process characteristics from the training data during the learning process.
7.3. "It's the Gain Prediction, Stupid!"
From the above discussion on process-model mismatch it is clear that there is a fair
degree of mismatch between the processes and the models. However, it has been shown
from the results in Chapter VI that the controllers perform satisfactorily under various
operating conditions and were able to control the processes successfully. The process-
model mismatch shows only that the model is different from the process, and yet we have
shown good controller performance. It does not explain why using the model has
enabled better controller performance. One reason for using a model is to decouple the
interactions between the controlled and the manipulated variables. Another important
reason is that the model captures the nonlinear process behavior, and therefore is a better
representation of the real process. One of the most important functions of a model that is
critical to its success as a good controller model is its ability to predict process gain
changes. From a process control standpoint, it is the gain prediction that matters the
most (Riggs, 1993). Gain prediction is defined here as the change in the manipulated variable
for a given change in the controlled variable. Gain predictions have two components: the
magnitude and the direction. While it is important that the magnitude of the change
approximate the real process gain change, it is the direction that is more critical. If a
model is able to point in the right direction with a reasonably approximate magnitude of
change, the model has the potential to make good control decisions. For satisfactory
control of any process it is essential that the model used to infer the control action be able
to predict the process gains with sufficient accuracy (Riggs, 1993).
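One way to compare the gains of "process" and "model" is to estimate each by central finite differences from any steady-state inverse mapping (the dynamic simulator run to steady state, or the trained network) and then plot one against the other. The function below is an illustrative sketch; the step size, argument order, and function name are assumptions.

```python
def steady_state_gain(inverse_model, F, z, xD, xB, dxD=1e-4):
    """Central finite-difference estimate of the steady-state gain
    dL/dx_D from a steady-state inverse mapping
    (F, z, x_D, x_B) -> (L, V). Applying this to both the process
    and the model at the same operating point quantifies how well
    the model tracks the process gain (magnitude and sign)."""
    L_lo, _ = inverse_model(F, z, xD - dxD, xB)
    L_hi, _ = inverse_model(F, z, xD + dxD, xB)
    return (L_hi - L_lo) / (2.0 * dxD)
```

A scatter of model gain versus process gain over many operating points, as in Figures 7.3 and 7.4, then shows whether the model at least gets the direction right.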
Figure 7.3a shows the process gains for the reflux rate with respect to the overhead
composition, ΔL/Δx_D, and Figure 7.3b shows the process gain for the boilup rate with
respect to the bottom composition, ΔV/Δx_B, for the lab column. Figures 7.4a and b
show similar results for the high-purity distillation column. Despite the fact that the
"process" and the "model" are distinctly different from each other, the model is able to
describe the process gain changes. However, it must be noted that even though we
believe that an essential feature for a good process control model is the ability to predict
the gain changes accurately, there could be other aspects that determine the fitness of a
model for control applications.
Figure 7.3. Steady-state process gains for the lab column. (a) Change in reflux rate with overhead product composition.
Figure 7.3. Continued. (b) Change in boilup rate with bottom product composition.
Figure 7.4. Steady-state process gains for the high-purity column. (a) Change in reflux rate with overhead product composition.
161
-0.002
-0.01 -0.008 -0.006 -0.004 d(V)/d(XB)-Process, (Ibmols/h/mf MeOH)
-0.002
Figure 7.4. Continued, (b) Change in boilup rate with bottom product composhion.
CHAPTER VIII
DISCUSSION, CONCLUSION, AND
RECOMMENDATIONS
8.1. On Using Neural Network Steady-State Process Inverse Models
The idea of using neural networks to develop steady-state inverse models is simple in
concept. The use of a neural network representing the inverse of the process to calculate
explicitly the manipulated variables needed to follow a reference system is not only
extremely appealing but also highly practicable. The inverse models can be developed
using neural networks simply by choosing the right set of input-output variables at the
network-training stage. Most process engineers have access to some approximate
steady-state model (CAD package, analytical equations, etc.) of their process or plant
which they employ to optimize and evaluate its performance. These steady-state models
could readily be used to generate all the data required to develop neural network models.
Once a neural network model is developed, it is ideally suited for control purposes. The
method of implementing the multivariable controller using steady-state inverse models is
simple, and follows standard industrial practice for tuning controllers. Therefore, we think
the technique can find relatively frequent application in practice.
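The data-generation step described above amounts to running the steady-state model forward and storing the input-output pairs in inverse order, so the network learns the mapping from desired compositions to the required flows. A minimal sketch, with `steady_state_sim`, the function name, and the sampling ranges all as placeholders:

```python
import random

def inverse_training_data(steady_state_sim, n_samples, ranges):
    """Build inverse-model training pairs from a forward steady-state
    simulator: sample (F, z, L, V), run the simulator to get the
    resulting (x_D, x_B), then store ((F, z, x_D, x_B) -> (L, V))
    so the network is trained as the process inverse."""
    data = []
    for _ in range(n_samples):
        F = random.uniform(*ranges["F"])
        z = random.uniform(*ranges["z"])
        L = random.uniform(*ranges["L"])
        V = random.uniform(*ranges["V"])
        xD, xB = steady_state_sim(F, z, L, V)  # forward simulation
        # inputs and outputs swapped relative to the forward model
        data.append(((F, z, xD, xB), (L, V)))
    return data
```

Any standard feedforward training procedure applied to these pairs then yields the steady-state process inverse used by the controller.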
Another issue is that of robustness of the model in the face of data uncertainty, also
known as the fault tolerance characteristics of the model. Neural networks are known to be
highly fault tolerant (Harmon, 1992). This is an important advantage of using neural
network models for process control, and even though it is not noticed immediately, it has
been mentioned from time to time in the neural network literature. The issue deals with the
ability of models to handle faulty, corrupted, or physically meaningless data. We
experienced one aspect of the fault tolerance of neural network models by serendipity.
163
A cursory glance at Figure 6.1d shows that the steady-state target values for the
bottom composition set-point, x_B,SS, calculated by the control law in Equation 5.15, take
values that are not only outside the training range (0.02-0.07 mole fraction methanol), but
are also physically meaningless (< 0.0 mole fraction methanol) for a short period of time.
Despite these obvious infractions, the neural network model does not show any adverse
behavior on receiving such spurious data. The reason is that even when the network
has been trained over specified ranges for the input variables, the network output variables
(L and V) are always bounded between the minimum and maximum values for the outputs
as determined from the training data. Also, the nature of the sigmoidal transfer function
(the hyperbolic tangent, in this case) is such that for any input in the range ±∞, the
transformed output is always bounded (±1, in this case), thus ensuring that even when
"garbage" goes in, "garbage" does not come out. The issue of tolerating faulty or
corrupted data is a real-world problem extremely relevant to process control because most
field instruments are electrical or electronic devices that transmit information from remote
locations to a central data acquisition station and are susceptible to random influences that
can corrupt the information very easily. A phenomenological model, for instance, would
fail under similar circumstances, and would require proper safeguards to be built into the
system to prevent the model from "crashing." This is just one aspect of fault tolerance,
and is by no means an attempt to state that neural networks can handle all types of faults.
There are several different types of faults that can adversely affect a multivariable
controller, and this research does not aim to evaluate the fault tolerance of neural network
models. However, the fault tolerance aspect of neural network models is worth exploring.
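The bounded-output property discussed above can be demonstrated with a one-neuron Python sketch. The weight, bias, and assumed training-range bounds on L are invented for illustration; the point is only that a tanh output rescaled to the training range cannot leave [y_min, y_max], no matter how spurious the input.

```python
import math

# With a tanh output rescaled to the training range, any input -- even a
# physically meaningless value such as a negative mole fraction -- yields
# an output within [L_MIN, L_MAX]. Weights and bounds are placeholders.
L_MIN, L_MAX = 50.0, 150.0      # assumed training-range bounds on reflux L

def bounded_output(net_input, w=2.0, b=0.1):
    t = math.tanh(w * net_input + b)          # t is always in (-1, 1)
    return L_MIN + (L_MAX - L_MIN) * (t + 1.0) / 2.0

for garbage in (-0.5, 0.0, 1e6, -1e6):        # includes out-of-range inputs
    y = bounded_output(garbage)
    assert L_MIN <= y <= L_MAX
```

"Garbage" in thus cannot produce unbounded "garbage" out; the prediction saturates at the training-data extremes.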
8.2. On Optimal Training of Neural Networks
Determining the "right" neural network model does require some experience. Like all
other regression techniques, coming up with an appropriate model is a trial-and-error
164
procedure with no guaranteed method to assure the "right" model. While techniques have
been investigated to determine initial network configuration and a set of initial weights
(Scott and Ray, 1993), these are, at best, estimates that enable more efficient training with
an algorithm such as backpropagation. While overfitting is an issue that cannot be
ignored, it becomes of increasing importance when the number of weights is of the order
of the number of training examples (such networks are called oversized networks). Also,
overfitting becomes more important when the gradient learning process (backpropagation)
is used for weight adjustment. However, with the optimization technique used here for
determining each new set of weights, the presence or absence of the validation set did not
make much difference to the overall network prediction characteristics. Other network
architectures with three, four, five, six and seven hidden nodes were also trained, but the
network with five hidden nodes gave the best overall performance based on the
normalized root mean square error for all patterns in the test data set.
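The selection criterion just described can be sketched as follows; the target and prediction values below are invented, and the real comparison was computed over all patterns in the test data set for each candidate hidden-layer size.

```python
import math

# Architecture selection by normalized root-mean-square error over the
# test patterns: the hidden-node count with the lowest NRMSE wins.
def nrmse(targets, predictions):
    n = len(targets)
    mse = sum((t - p) ** 2 for t, p in zip(targets, predictions)) / n
    span = max(targets) - min(targets)        # normalize by target range
    return math.sqrt(mse) / span

# illustrative test-set targets and predictions from two candidate networks
targets = [0.02, 0.03, 0.05, 0.07]
preds_5_hidden = [0.021, 0.029, 0.051, 0.069]
preds_3_hidden = [0.025, 0.027, 0.056, 0.064]
assert nrmse(targets, preds_5_hidden) < nrmse(targets, preds_3_hidden)
```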
While it would be desirable to include as many data points as possible to constitute a
training data set, it would be "best" to obtain a "reasonably good" model using the
minimum amount of data, from an engineering viewpoint. Our reason for using neural
networks differs distinctly from classical Connectionist ideology. Most Connectionists
prefer to use neural networks to model processes where the phenomenological behavior is
not quite clear. In such cases, it is advisable to gather as
much data as possible to train and test the network. In the present case, the process
phenomenon of distillation is older than the science of chemical engineering, and is quite
well understood with volumes written about its design and operation (Kister, 1992; Kister,
1990). We wish to use neural networks as a modeling tool because they offer the
advantages of simplicity and tremendous computational speed, two important necessities
for process control applications, among the other advantages already mentioned earlier.
Therefore, the aim is to develop as good a model with as little data as possible. The
165
lab-column model required 81 data points, while the high-purity column model required
375 data points. Using the standard rule-of-thumb from statistical regression
methodology, if n is the number of unknowns (weights), then at least 2n data points were
selected to constitute the training set. The training of the high-purity column model was
tried with 225 data points (3 data points each for F and z, and 5 for each of x_D and x_B; i.e.,
3x3x5x5 = 225 points), and with 375 points (3 for F, and 5 for each of z, x_D, and x_B; i.e.,
3x5x5x5 = 375 points), but these data sets did not yield as good a testing error as that
obtained when the data set comprised 375 data points obtained by selecting 3 data points
for z, and 5 for each of F, x_D, and x_B. The dynamic simulations confirmed that upsets due
to feed flowrate changes were more critical than those due to feed composition changes,
and hence, more data points were needed to capture the changes due to the former. The
high-purity column operation is more nonlinear than the lab column, and hence, a more
"fine-grain" description of the input-output behavior is needed. However, no effort was made
to determine an optimum number of training data points in this study.
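The factorial grids described above can be generated mechanically; in the sketch below the 0.02-0.07 range for x_B follows the text, while the other ranges are purely illustrative.

```python
from itertools import product

# Factorial training-grid construction: 3 levels for feed composition z
# and 5 each for F, xD, and xB gives 3 x 5 x 5 x 5 = 375 input patterns.
def levels(lo, hi, n):
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

F_grid  = levels(80.0, 120.0, 5)   # illustrative feed-rate range
z_grid  = levels(0.40, 0.60, 3)    # illustrative feed-composition range
xD_grid = levels(0.93, 0.97, 5)    # illustrative overhead range
xB_grid = levels(0.02, 0.07, 5)    # bottoms range stated in the text

training_inputs = list(product(F_grid, z_grid, xD_grid, xB_grid))
assert len(training_inputs) == 3 * 5 * 5 * 5   # 375 patterns
```

Each grid point would then be fed to the steady-state CAD simulation to obtain the corresponding L and V targets for training.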
8.3. In Conclusion
The novel approach presented in this study shows that neural networks can indeed be
used to model the steady-state process inverse of complex systems (see Section 3.3), such
as distillation columns. The neural network models, when coupled with a simple reference
system synthesis can be used to formulate a very simple multivariable controller (see
Section 5.2). The control strategy and the controller structure is simple to implement and
offers a practicable solution to a difficult control problem. The simplicity and directness
of the approach in addressing issues such as obtaining training and testing data from
CAD packages (see Section 3.3), training the neural networks with a more robust and
efficient nonlinear least-squares algorithm (see Sections 2.7 and 2.8), incorporating the
model in the feedback controller (see Section 5.2), and the use of steady-state models
166
(see Section 5.2.1) make it distinct from, and superior to, a conventional PI
control strategy.
The study was carried out using dynamic simulations of two different methanol-water
distillation columns: a lab-scale system and a high-purity industrial system. The neural
network models were trained on data obtained using steady-state CAD simulations for the
two methanol-water columns, and the CAD simulations were kept intentionally different
from the dynamic simulations to introduce mismatch between the actual process and the
model of the process. The neural network model-based controllers show good
performance for both servo and regulatory modes of operation. The neural network
models are extremely portable because the only thing that distinguishes one model from
another is the number and the numerical values of the weights. There is no difference in the
internal structure and working of two feedforward neural networks once their
architectures (number of input, hidden, and output nodes, and the type of transfer
function) have been specified. It is the set of weights that defines the model. This is a
distinct advantage while considering the practical implementation of such systems.
Neural networks are not, by any means, a panacea. They have their advantages and
disadvantages just like any other methodology. Neural networks should be considered as
a tool that has its place among other tools in a modeler's kit. Neural networks offer the
advantages of computational simplicity with the added ability to model complex systems
with enormous processing power, speed and generality, and their development requires
little engineering effort. The disadvantages are that developing neural network models
requires some expertise. Data capturing the essence of the input-output relationship is
critical. In our case, we had a reasonably good understanding of the underlying process
phenomena; but, there are many problems of practical importance where there is little or
no understanding of the physical process. For such situations, neural networks do offer a
possible opportunity for modeling, but the model development is at best a trial-and-error
167
procedure, and will require extensive testing and validation by an experienced "coach"
before it can be used with any confidence. Neural networks come in different shapes and
sizes with respect to neuron structure, architecture, and training techniques, much like ice
cream flavors. What is the best neuron structure? What is the right architectural size?
What is the best training technique for a given structure and type? These questions are all
highly debatable, and much research is focused on finding answers to them. We
have not attempted to determine the "best" answer to a given problem. Instead, our
attempt was to determine if a particular type of neural network can be utilized to solve a
practical problem in an efficient manner. It is our opinion that neural networks indeed do
a remarkable job and have great potential, especially in the area of modeling complex
chemical process systems for model-based process control applications.
8.4. Recommendations
No study can ever be said to be complete, and this one certainly is not. The main
purpose of any study is to open further doors and avenues to explore. Here are some
issues that are important, in my opinion, from both a scientific and a technological
viewpoint, in the further development of neural networks and their application in process
control.
8.4.1. Control of Systems with Higher Degrees of Complexity
The present study was concerned with demonstration of using neural network models
on two methanol-water distillation columns. Even though the binary system has a
nonideal VLE, and distillation operation is nonlinear, interactive, and nonstationary, it still
remains a fairly simple system from a phenomenological viewpoint. As the number of
components increases, the process phenomena become increasingly complex. The real
advantage of neural networks can be realized in control of complex process systems such
168
as multi-component distillation with nonideal VLE, and reactor systems with multi-phase
reactions. Presently, control of such systems is at best nominal, and relies heavily on
operator experience.
8.4.2. Constraint Control
Most processes are affected by operating constraints. The issue of constraint
control is becoming more important from an industrial practice perspective because almost
all processes are pushed to operate close to their designed capacity. Under such operating
conditions, often most of the "knobs" available to the operator are close to their
"saturation" values. For instance, in the case of a distillation column, it may be desirable to
operate the column at the highest possible throughput for a specified separation. The
reboiler heat duty and the overhead condenser cooling duty may be close to their
maximum capacity. The heating medium in the reboiler and the cooling medium in the
condenser may be constrained with respect to their flowrates. When one of the
manipulated variables is constrained, it can no longer be used as a manipulated variable,
but remains fixed and invariant until the constraint is no longer active. This study did not
address the issue of constraint control using neural network model-based controllers.
However, some conceptual thoughts on using neural networks in a model-based control
environment that would permit dealing with both constrained and unconstrained
operating conditions are presented here.
Figure 8.1 illustrates a possible structure for a neural network model-based control
that enables handling constraints (Rhinehart, 1993). The neural network model in this
case is a "process" model and not a "process inverse" model, i.e., it predicts the process
outputs when given the process inputs. In the case of a specific distillation column, a process
model could predict the overhead and bottom product compositions given the feed
flowrate and composition, reflux and boilup rates. The neural network model-based
169
[Block diagram. Labeled elements: Disturbances; Setpoints; Manipulated Variables from
Constrained Optimizer; Neural Network Process Model; Constraints; Constrained
Optimization Routine; Process Outputs (CVs); Steady-State Targets for CVs; Network
Predicted CVs; Error between Actual and Network Predictions for the CVs.]
Figure 8.1. Proposed structure for constrained neural network model-based control.
170
controller could comprise the neural network process model along with a reference system
synthesis with assumed first order dynamics such as the GMC law, and a constrained
optimization algorithm such as the Marquardt method. The error between the steady-state
targets for the controlled variables predicted by the reference system and the values
predicted by the neural network process model can be minimized to determine the
manipulated variables. A suitable objective function could be defined as

    φ = η (y_1,SS - y_1,NN)² + (1 - η) (y_2,SS - y_2,NN)² ,    (8.1)

where y_1,SS and y_2,SS are the steady-state target values for the controlled variables, y_1,NN
and y_2,NN are the neural network predictions for the controlled variables, and η (0 < η < 1)
is a weighting factor on the objective function to decide which controlled variable is more
important. The optimization algorithm can be used to search for the manipulated variables
that minimize the error function under both unconstrained and constrained operating
conditions.
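A sketch of the proposed scheme follows. A hypothetical linear stand-in replaces the trained neural network process model, and a crude bounded grid search replaces the Marquardt-type constrained optimizer, but the objective is exactly the weighted form of Equation 8.1.

```python
# Minimizing the Equation 8.1 objective over the manipulated variables
# (L, V), subject to simple bound constraints. The process model below is
# a hypothetical linear stand-in; a real application would use the trained
# network and a Marquardt-type constrained optimizer.
def nn_process_model(L, V):
    # placeholder: predicts (y1, y2) = (xD, xB) from (L, V)
    return 0.90 + 0.0005 * (L - 0.2 * V), 0.10 - 0.0004 * (V - 0.1 * L)

def objective(L, V, y1_ss, y2_ss, eta=0.5):
    y1_nn, y2_nn = nn_process_model(L, V)
    return eta * (y1_ss - y1_nn) ** 2 + (1.0 - eta) * (y2_ss - y2_nn) ** 2

# crude constrained search within the allowed operating ranges
best = min(((objective(L, V, 0.95, 0.05), L, V)
            for L in range(50, 151) for V in range(60, 181)),
           key=lambda t: t[0])
phi, L_opt, V_opt = best
assert 50 <= L_opt <= 150 and 60 <= V_opt <= 180   # constraints respected
```

Because the search is confined to the admissible (L, V) box, the returned move respects the constraints by construction; an active constraint simply pins the corresponding variable at its bound.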
There are several issues that need to be addressed with respect to developing and
using the process model in the manner described above. While development of the
process model should not be much different from that for developing process inverse
models, one has to be careful in using the models. Care should be taken to check for input
multiplicity, which can cause a great deal of trouble. The fault tolerance and robustness of
such a system also need to be carefully examined. The above control structure, however,
does have the promise of handling both unconstrained and constrained operations.
8.4.3. Comparison of Various Model-Based Controllers with Similar Control Structures
Advanced control using conventional PI controllers has been the focus of a number of
studies on distillation columns (Skogestad et al., 1990), and it is understood that, while
using PI controllers, selection of the control configuration is critical to the success of the
171
application. McAvoy (1983) has shown using steady-state RGA analysis that, for dual-
composition control in distillation columns, the energy balance scheme (x_D-L and x_B-V)
gives the most coupled controlled variable-manipulated variable pairings, and the double
ratio scheme (x_D-L/D, x_B-V/B) gives the most decoupled scheme, while the two ratio schemes
(x_D-L/D, x_B-V and x_D-L, x_B-V/B) give intermediate coupling. The PI controller
comparisons made in this study use the x_D-L and x_B-V scheme, which happens to be the
most difficult PI control configuration. To make a more reasonable comparison, one has
to choose a suitable PI control configuration and test it against a model-based control
structure of the same type. Such a study would enable making some qualitative and
quantitative assessment about conventional control versus advanced control strategies.
8.4.4. Optimize Neural Network Structure
The neural network models developed in this study have shown that such models can
be easily developed and are suitable for use as process control models. The models are
not the "best" neural network models. Even though the models that have been developed
do a great job of predicting actual process gain changes, there could still be some room for
improvement. Optimal network training addresses issues of finding (training) the "best"
neural network for a given problem. Many such optimal training schemes have been
suggested in literature (Weigand et al., 1990; Le Cun et al., 1990). In my opinion, the
issue of optimal training becomes more critical when the networks become oversized (i.e.,
when the number of weights approaches the number of data points), and if a training
algorithm such as backpropagation is used for weight adjustment. With the more robust
Marquardt method used for weight adjustment, it does not appear to be that critical. However, a suitable
technique, such as using a separate validation set (Weigand et al., 1990) or using the
second-derivative information (Le Cun et al., 1990), may be useful in deciding when to
stop training.
172
8.4.5. Using Neural Networks on Systems with Little Process Understanding
In this study, neural networks have been used to develop models of processes where
the phenomenological understanding of the process is quite clear. Distillation operation is
quite well known and understood to a great extent. The main point in using neural
network models instead of phenomenological models was to reduce the computations
involved and, thereby, increase the speed of the controllers. But there are many chemical
systems where the phenomenological understanding is still not very clear. Neural
networks have the potential to model such systems provided enough data is available to
enable development of a neural network model. It is important to note that the data has to
capture the variations in the different independent variables and the effect these changes
have on the dependent variables. Oftentimes, in cases where the phenomenological
understanding is limited, the available data is repetitious and does not reflect the cause-
effect patterns. However, neural networks do have the potential to solve such problems
and with the help of suitably designed experiments neural network models can be
developed.
8.4.6. Experimental Demonstration of Neural Network Model-Based Control
Even though the present study focused only on implementing the neural network
model-based controllers on dynamic simulations of the actual processes, care was taken to
make these dynamic simulations as "real" as possible in order to study the controller
performance from a real-world perspective. In order to add to the credibility of the
technique, the neural network model-based controller has to be demonstrated on an
experimental system and, if possible, followed up with an industrial implementation.
173
Implementation with a real-time system requires addressing several new issues, such as
interfacing with the data acquisition system, and transportability of the models. The
experimental demonstration will verify some of the salient features of the neural network
models, such as fault tolerance and sensitivity to noisy data, that were observed from the
simulation studies.
8.4.7. Testing Neural Network Models with Different Steady-State Characteristics
This issue is specific to modeling distillation columns. In the neural network models
developed for controlling distillation columns, four inputs were used, i.e., F, z, x_D, and x_B,
to obtain predictions for two outputs, i.e., L and V. In certain cases, when the stage
efficiency in a column is not affected greatly by changes in vapor and liquid loading in the
column, an alternate structure can be formulated. On a steady-state basis, Equations 4.1,
4.2, and 4.3 can be written to eliminate the feed flowrate F from the above equations.
This is valid because now the effect of changing F is captured in the ratios L/F and V/F,
and L and V increase or decrease proportionally with F when all other conditions remain
unchanged. A neural network model can be developed to take three inputs, i.e., z, x_D, and
x_B, and predict the ratios L/F and V/F. The network now becomes simpler (fewer inputs,
and therefore, fewer weights, and perhaps, fewer data points for training).
However, one has to be extremely cautious in applying such a model because,
oftentimes, systems with large flowrates require larger internal liquid and vapor flowrates,
and in such cases, the stage efficiency is affected greatly by the loading in the column. The
idea may not be valid for all cases but, nevertheless, may be worth exploring.
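The alternate structure can be sketched briefly; the ratio_model below is a hypothetical placeholder for the trained three-input network, and the point is only that the recovered L and V scale proportionally with F.

```python
# Ratio-based structure: a (hypothetical) network predicts L/F and V/F
# from (z, xD, xB); the flows are recovered by multiplying by the measured
# feed rate F, so L and V scale proportionally with F.
def ratio_model(z, xD, xB):
    # placeholder for the trained three-input network
    return 1.2 + 0.5 * (xD - xB), 1.5 + 0.4 * (xD - xB)   # (L/F, V/F)

def flows(F, z, xD, xB):
    lf, vf = ratio_model(z, xD, xB)
    return lf * F, vf * F

L1, V1 = flows(100.0, 0.5, 0.95, 0.05)
L2, V2 = flows(200.0, 0.5, 0.95, 0.05)    # doubling F doubles L and V
assert abs(L2 - 2 * L1) < 1e-9 and abs(V2 - 2 * V1) < 1e-9
```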
174
8.4.8. On-line Adaptation of the Neural Network Models
In the present case, all the neural network models used for control purposes were
static models, i.e., the models do not change or adapt to changing process conditions.
Any mismatch or offset between the process and the model is eliminated by the integral
terms in the nonlinear PMBC control laws. The basic difference between a neural network
model and a phenomenological model is that the latter can be adjusted or adapted either
periodically (i.e., at steady-state conditions) or continuously (i.e., at every sampling
interval) to keep the model as close as possible to the process at all times. With the
present neural network models, no attempt was made to adapt the model. If the model
represents the actual process behavior closely, it can be used for process optimization
purposes with a greater degree of confidence. There are several possible schemes to adapt
neural network models, and all the procedures usually involve adjusting at least one
parameter that captures most of the process uncertainties. For instance, stage efficiency in
distillation columns is one quantity that introduces the greatest uncertainty. One possible
scheme for adaptation of the neural network process-inverse model could be as follows:
use a separate neural network model that takes F, z, x_D, x_B, L, and V as inputs and predicts
the stage efficiency, η, as the output. This network can be used to obtain predictions for
the stage efficiency under the current operating conditions defined by F, z, x_D, x_B, L, and V.
The new η can then be used in the neural network process-inverse model, which now takes
F, z, x_D, x_B, and η as inputs and predicts L and V. The adaptation procedure can be
applied either periodically (Riggs and Rhinehart, 1990) or in an incremental on-line
manner (Rhinehart and Riggs, 1991).
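One adaptation step of the proposed scheme might look as follows; both networks are hypothetical placeholders with invented coefficients, standing in for the trained efficiency estimator and the η-aware process-inverse model.

```python
# One periodic adaptation step: an auxiliary network estimates the stage
# efficiency eta from current operating data, and the updated eta is fed
# to the process-inverse model along with the composition targets.
def efficiency_model(F, z, xD, xB, L, V):
    # placeholder for the trained efficiency-estimating network
    return max(0.0, min(1.0, 0.7 + 0.001 * (V - L)))

def inverse_model(F, z, xD_sp, xB_sp, eta):
    # placeholder for the eta-aware process-inverse network
    L = F * (0.8 + 0.5 * (xD_sp - xB_sp)) / eta
    V = F * (1.0 + 0.4 * (xD_sp - xB_sp)) / eta
    return L, V

# estimate eta at the current operating point, then recompute L and V
F, z, xD, xB, L, V = 100.0, 0.5, 0.95, 0.05, 120.0, 150.0
eta = efficiency_model(F, z, xD, xB, L, V)
L_new, V_new = inverse_model(F, z, 0.95, 0.05, eta)
assert 0.0 < eta <= 1.0
```

Repeating this pair of evaluations at steady state (periodic adaptation) or at every sampling interval (incremental adaptation) would keep the inverse model tracking a drifting stage efficiency.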
175
BIBLIOGRAPHY
Astrom, K. J., and McAvoy, T. J. "Intelligent Control," in J. Proc. Cont., Vol. 2, No. 3, 1992, 115-127.
Balchen, J. G., Lie, B., and Solberg, I. "Internal Decoupling in Non-Linear Process Control," in Model. Ident. Control, Vol. 9, 1988, 137-148.
Barnard, E. "Optimization for Training Neural Networks," in IEEE Trans. on Neural Networks, Vol. 3, No. 2, 1992, 232-240.
Bartusiak, R. D., Georgakis, C., and Reilly, M. "Nonlinear Feedforward/Feedback Control Structures Designed by Reference System Synthesis," in Chem. Eng. Sci., Vol. 44, 1989, 1837-1851.
Battiti, R. "Accelerated Back-propagation Learning: Two Optimization Methods," in Complex Syst., Vol. 3, 1989, 331-342.
Battiti, R. "First- and Second-Order Methods for Learning: Between Steepest Descent and Newton's Method," in Neural Computation, Vol. 4, 1992, 141-166.
Bequette, B. W. "Nonlinear Control of Chemical Processes: A Review," in Ind. Eng. Chem. Res., Vol. 30, 1991, 1391-1413.
Bhagat, P. "An Introduction to Neural Networks," in Chem. Eng. Prog., Vol. 86, No. 8, 1990, 55-60.
Bhat, N., and McAvoy, T. J. "Use of Neural Nets for Dynamic Modeling and Control of Chemical Process Systems," in Comp. and Chem. Eng., Vol. 14, No. 4/5, 1990, 573-583.
Bhat, N. V., Minderman, P., McAvoy, T. J., and Wang, N. S. "Modeling Chemical Process Systems Via Neural Computation," in IEEE Cont. Syst. Mag., Vol. 10, No. 3, 1990, 24-30.
Bosley, J. R., Edgar, T. F., Patwardhan, A. A., and Wright, G. T. "Model-Based Control: A Survey," in Advanced Control of Chemical Processes. Eds. K. Najim and E. Dufour, IFAC Symposium Series No. 8, 1992, 127-136.
Bristol, E. H. "On a Measure of Interactions for Multivariable Control," in IEEE Trans. Auto. Cont., AC-11, 1966, 133.
176
Broyden, C. G., Dennis, J. E., and More, J. J. "On the Local and Superlinear Convergence of Quasi-Newton Methods," in J.I.M.A., Vol. 12, 1973, 223-246.
Cooper, D. J., Hinde, Jr., R. F., and Megan, L. "Pattern-Based Adaptive Process Control," in Comp. and Chem. Eng., Vol. 14, 1990, 1339-1350.
Cooper, D. J., Megan, L., and Hinde, Jr., R. F. "Comparing Two Neural Networks for Pattern-Based Adaptive Process Control," in AIChE J., Vol. 38, No. 1, 1992a, 41-55.
Cooper, D. J., Megan, L., and Hinde, Jr., R. F. "Disturbance Pattern Classification and Neuro-Adaptive Control," in IEEE Cont. Syst. Mag., Vol. 12, No. 2, 1992b, 42.
Cott, B. J., Reilly, P. M., and Sullivan, G. R. "Selection Techniques for Process Model-Based Controllers." Paper presented at AIChE Spring National Meeting, Houston, TX, 1985.
Cutler, C. R., and Ramaker, B. L. "Dynamic Matrix Control-A Computer Control Algorithm," in Proc. Amer. Cont. Conf., San Francisco, CA, 1980, Paper WP5-B.
Cybenko, G. "Continuous Valued Neural Networks with Two Hidden Layers are Sufficient." Report, Department of Computer Science, Tufts University, Medford, 1988.
Cybenko, G. "Approximations by Superpositions of a Sigmoidal Function," in Math. Control Signal Systems, Vol. 2, 1989, 303-314.
Ding, S. S., and Luyben, W. L. "Control of Heat-Integrated Complex Distillation Configuration," in Ind. Eng. Chem. Res., Vol. 29, 1990, 1240-1249.
Elaahi, A., and Luyben, W. L. "Control of an Energy-Conservative Complex Configuration of Distillation Columns for Four-Component Separations," in Ind. Eng. Chem. Process Des. Dev., Vol. 24, 1985, 368-376.
Finco, M. V., Luyben, W. L., and Polleck, R. E. "Control of Distillation Columns with Low Relative Volatilities," in Ind. Eng. Chem. Res., Vol. 28, 1989, 75-83.
Fruehauf, P. S., and Mahoney, D. P. "Improve Distillation-Column Control Design," in Chem. Eng. Progress, Vol. 90, No. 3, 1994, 75-83.
Garcia, C. E., and Morari, M. M. "Internal Model Control. 1. A Unifying Review and Some New Results," in Ind. Eng. Chem. Process Des. Dev., Vol. 21, 1982, 308-323.
Grossberg, S. Classical and Instrumental Learning by Neural Networks in Progress in Theoretical Biology. Vol. 3, Academic Press, New York, 1977, 51-141.
177
Grossberg, S. Studies of Mind and Brain: Neural Principles of Learning, Perception, Development, Cognition, and Motor Control. Reidel Press, Boston, MA, 1982.
Guez, A., Eilbert, J. A., and Kam, M. "Neural Network Architecture for Control," in IEEE Cont. Sys. Mag., Vol. 8, No. 2, 1988, 22-25.
Harmon, P. "Neural Networks: Hot Air or Hot Technology? Part I," in Intelligent Software Strategies, Ed. Paul Harmon, Vol. VIII, No. 4, 1992, 1-12.
Hebb, D. O. The Organization of Behavior, a Neuropsychological Theory. John Wiley, New York, 1949.
Hecht-Nielsen, R. "Counterpropagation Networks," in Appl. Opt., Vol. 26, No. 23, 1987, 4979-4984.
Henley, E. J., and Rosen, E. M. Material and Energy Balance Computations. John Wiley & Sons, New York, 1969.
Henley, E. J., and Seader, J. D. Equilibrium-Stage Separation Operations in Chemical Engineering. John Wiley, New York, 1981.
Henson, M. E., and Seborg, D. E. "A Critique of Differential Geometric Approach to Nonlinear Process Control." Presented at the IFAC World Congress, Tallinn, Estonia, 1990.
Himmel, C. D., and May, G. S. "Advantages of Plasma Etch Modeling Using Neural Networks Over Statistical Techniques," in IEEE Transactions on Semiconductor Manufacturing, Vol. 6, No. 2, 1993, 103-111.
Hinde, R. F., and Cooper, D. J. "Using Pattern Recognition in Controller Adaptation and Performance Evaluation," in Proc. Amer. Cont. Conf., San Francisco, CA, 1993, 74-78.
Hokanson, D. A., and Gerstle, J. G. "Dynamic Matrix Control Multivariable Controllers," in Practical Distillation Control. Ed. W. L. Luyben, Van Nostrand Reinhold, New York, 1992, Chapter 12, 248-271.
Hopfield, J. J. "Neural Networks and Physical Systems with Emergent Collective Computational Abilities," in Proc. Natl. Acad. Sci., Vol. 79, 1982, 2554-2558.
Hopfield, J. J. "Neurons with Graded Response Have Collective Computational Properties Like Those of Two State Neurons," in Proc. Natl. Acad. Sci., Vol. 81, 1984, 3088-3092.
178
Hsiung, J. T., Suewatanakal, W., and Himmelblau, D. M. "Should Backpropagation be Replaced by a More Effective Optimization Algorithm?" in Proc. IJCNN, Seattle, WA, 1991.
Humphrey, J. L., Seibert, A. F., and Koort, R. A. "Separation Technologies-Advances and Priorities." DOE Contract AC07-90ID 12920, February 1991.
Hush, D. R., and Sales, J. M. "Improving the Learning Rate of Backpropagation with the Gradient Reuse Algorithm," in Proc. IEEE Intl. Conf. of Neural Networks (2nd), July, 1988, 441-448.
Isidori, A. Nonlinear Control Systems. 2nd Edition, Springer-Verlag, New York, 1989.
Isidori, A., Krener, A. J., Gori-Giorgi, C., and Monaco, S. "Nonlinear Decoupling via Feedback: A Differential Geometric Approach," in IEEE Trans. Auto. Cont., AC-26, 1981, 331.
Kister, H. Z. Distillation Design. McGraw-Hill, New York, 1992.
Kister, H. Z. Distillation Operation. McGraw-Hill, New York, 1990.
Kollias, S., and Anastassiou, D. "Adaptive Training of Multilayer Neural Networks Using a Least Squares Estimation Technique," in Proc. IEEE Intl. Conf. of Neural Networks (2nd), Vol. I, 1988, 383-390.
Kramer, M. A., and Leonard, J. A. "Diagnosis Using Backpropagation Neural Networks-Analysis and Criticism," in Comp. and Chem. Eng., Vol. 14, No. 12, 1990, 1323-1338.
Kung, S. Y., and Hwang, J. N. "An Algebraic Projection Analysis of Optimal Hidden Units Size and Learning Rates in Backpropagation Learning," in Proc. IEEE Intl. Conf. of Neural Networks (2nd), Vol. I, 1988, 363-370.
Le Cun, Y., Denker, J. S., and Solla, S. A. "Optimal Brain Damage," in Advances in Neural Information Processing Systems. Ed. David S. Touretzky, Morgan Kaufmann, Vol. 2, 1990, 598-605.
Le Cun, Y. "HLM: A Multilayer Learning Network," in Proc. Connectionist Models Summer School, Pittsburgh, 1986, 169-177.
Lee, P. L. "Generic Model Control-The Basics," in Nonlinear Process Control: Applications of Generic Model Control. Ed. Peter L. Lee, Springer-Verlag London Ltd., UK, 1993, 7-42.
179
Lee, P. L., and Sullivan, G. R. "Generic Model Control," in Comp. and Chem. Eng., Vol. 12, No. 6, 1988, 573-580.
Leonard, J. A., and Kramer, M. A. "Improvement of the Backpropagation Algorithm for Training Neural Networks," in Comp. and Chem. Eng., Vol. 14, No. 3, 1990a, 337-341.
Leonard, J. A., and Kramer, M. A. "Limitation of the Backpropagation Approach to Fault-Diagnosis and Improvement with Radial Basis Functions." Presented at the AIChE Annual Meeting, Chicago, IL, 1990b, Paper 96e.
Luyben, W. L. Process Modeling, Simulation and Control for Chemical Engineers. 2nd Edition, McGraw-Hill Co., New York, 1990, 129-141.
Luyben, W. L. (Editor). Practical Distillation Control. Van Nostrand Reinhold, New York, 1992.
MacMurray, J., and Himmelblau, D. "Identification of a Packed Distillation Column for Control via Artificial Neural Networks," in Proc. Amer. Cont. Conf., San Francisco, CA, 1993, 1455-1459.
Marquardt, D. W. "An Algorithm for Least-Squares Estimation of Nonlinear Parameters," in J. Soc. Indust. Appl. Math., Vol. 11, No. 2, 1963, 431-441.
McAvoy, T. J. "Connection Between Relative Gain and Control Loop Stability and Design," in AIChE J., Vol. 27, 1981, 613-619.
McAvoy, T. J. Interaction Analysis: Principles and Application. ISA Monograph #6, Instrument Society of America, NC, 1983.
McClelland, T. L., and Rumelhart, D. E. Parallel Distributed Processing. PDP Research Group, MIT Press, Cambridge, MA, 1986.
Megan, L., and Cooper, D. J. "A Neural Network Approach to Adaptive Control of a Pilot Plant Distillation Column." Presented at the AIChE Annual Meeting, St. Louis, MO, 1993, Paper 145f.
Mehta, D. D., and Ross, D. E. "Optimize ICI Methanol Process," in Hydrocarbon Processing, November, 1970, 183-186.
Mehta, D. D., and Pan, W. W. "Purify Methanol This Way," in Hydrocarbon Processing, February, 1971, 115-120.
Minsky, M., and Papert, S. Perceptrons. MIT Press, Cambridge, MA, 1969.
Muhrer, C. A., Collura, M. A., and Luyben, W. L. "Control of Vapor Recompression Distillation Columns," in Ind. Eng. Chem. Res., Vol. 29, 1990, 59-71.
Nahas, E. P., Henson, M. A., and Seborg, D. E. "Nonlinear Internal Model Control Strategy for Neural Network Models," in Comp. and Chem. Eng., Vol. 16, No. 12, 1992, 1039-1057.
Namatane, A., and Kimata, Y. "Improving the Generalizing Capabilities of a Back-Propagation Network," in Neural Networks, Vol. 1, 1989, 86-93.
Narendra, K. S., and Parthasarathy, K. "Identification and Control of Dynamical Systems Using Neural Networks," in IEEE Trans. on Neural Networks, Vol. 1, No. 1, 1990, 4-27.
Pandit, H. G., and Rhinehart, R. R. "Experimental Demonstration of Constrained Process Model-Based Control of a Nonideal Binary Distillation Column," Proc. Amer. Cont. Conf., Chicago, IL, 1992, 630-631.
Pandit, H. G., Rhinehart, R. R., and Riggs, J. B. "Experimental Demonstration of Nonlinear Model-Based Control of a Nonideal Binary Distillation Column," Proc. Amer. Cont. Conf., Chicago, IL, 1992, 625-629.
Papastathopoulou, H. S., and Luyben, W. L. "Control of Binary Sidestream Distillation Columns," in Ind. Eng. Chem. Res., Vol. 30, 1991, 705-713.
Parker, D. B. "Optimal Algorithm for Adaptive Networks: Second Order Backpropagation, Second Order Direct Propagation, and Second Order Hebbian Learning," in Proc. IEEE Conf. on Neural Networks, Vol. II, 1987, 593-600.
Patwardhan, A. A., Rawlings, J. B., and Edgar, T. F. "Nonlinear Model Predictive Control," in Comp. Chem. Eng., Vol. 14, 1990, 123.
Peel, C., Willis, M. J., and Tham, M. T. "A Fast Procedure for the Training of Neural Networks," in J. Proc. Cont., Vol. 2, No. 4, 1992, 205-211.
Poh, I., and Jones, R. D. "A Neural Network Model for Prediction," in J. Amer. Stat. Assn., Vol. 89, No. 425, 1994, 117-121.
Pottman, M., and Seborg, D. E. "Identification of Nonlinear Processes Using Reciprocal Multiquadratic Functions," in J. Proc. Cont., Vol. 2, No. 4, 1992, 189-203.
Press, W. H., Teukolsky, S. A., Vetterling, W. T., and Flannery, B. P. Numerical Recipes in C: The Art of Scientific Computing. 2nd Edition, Cambridge University Press, England, 1992, 683-688.
Prett, D. M., and Garcia, C. E. Fundamental Process Control. Butterworths, Boston, MA, 1988.
Psichogios, D. C., and Ungar, L. H. "A Hybrid Neural Network-First Principles Approach to Process Modeling," in AIChE J., Vol. 38, No. 10, 1992, 1499-1511.
Raich, A., Wu, X., and Qmi, A. "Approximate Dynamic Models for Chemical Processes: A Comparative Study of Neural Networks and Nonlinear Time Series Modeling Techniques." Presented at the AIChE Annual Meeting, Los Angeles, CA, 1991, Paper 143b.
Ramchandran, S. "Marquardt Method-A Program for Nonlinear Optimization and Equation Solving." Technical Report, Department of Chemical Engineering, Texas Tech University, Lubbock, TX, 1993.
Ramchandran, B., Riggs, J. B., and Heichelheim, H. R. "Nonlinear Plant-Wide Control: Application to a Supercritical Fluid Extraction Process," in Ind. Eng. Chem. Res., Vol. 31, 1992, 290-300.
Rhiel, F. F. "Model-Based Control," in Practical Distillation Control. Ed. W. L. Luyben, Van Nostrand Reinhold, New York, 1992, Chapter 21, 440-450.
Rhinehart, R. R. Personal Communication, Department of Chemical Engineering, Texas Tech University, Lubbock, TX, 1993.
Rhinehart, R. R. and Riggs, J. B. "Two Simple Methods for On-line Incremental Model Parameterization," in Comp. and Chem. Eng, Vol. 15, No. 3, 1991, 181-189.
Rhinehart, R. R. and Riggs, J. B. "Process Control Through Nonlinear Modeling," in Control, Vol. 1, No. 7, 1990, 86-90.
Richalet, J., Rault, A., Testud, J. L., and Papon, J. "Model Predictive Heuristic Control: Applications to Industrial Processes," in Automatica, Vol. 14, 1978, 413.
Ricotti, L. P., Ragazzini, S., and Martinelli, G. "Learning Word Stress in a Sub-optimal Second Order Backpropagation Neural Network," in Proc. IEEE Conf. on Neural Networks, Vol. I, 1988, 355-361.
Rietman, E. A., and Lory, E. R. "Use of Neural Networks in Modeling Semiconductor Manufacturing Processes: An Example for Plasma Etch Modeling," in IEEE Transactions on Semiconductor Manufacturing, Vol. 6, No. 4, 1993, 343-347.
Riggs, J. B. "It's the Gain Prediction, Stupid!" Personal Communication, Department of Chemical Engineering, Texas Tech University, Lubbock, TX, 1993.
Riggs, J. B. "Nonlinear Process Model Based Control of a Propylene Sidestream Draw Column," in Ind. Eng. Chem. Res., Vol. 29, 1990, 2221-2226.
Riggs, J. B., Beauford, M., and Watts, J. "Using Tray-to-Tray Models for Distillation Control," in Nonlinear Process Control: Applications of Generic Model Control. Ed. Peter L. Lee, Springer-Verlag London Ltd., UK, 1993, 67-103.
Riggs, J. B., and Rhinehart, R. R. "Comparison Between Two Nonlinear Process-Model Based Controllers," in Comp. and Chem. Eng, Vol. 14, No. 10, 1990, 1075-1081.
Rosenblatt, F. "The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain," in Psych. Rev., Vol. 65, 1958, 386-408.
Rumelhart, D. E., Hinton, G. E., and Williams, R. J. "Learning Internal Representations by Error Propagation," in Parallel Distributed Processing. Eds. D. E. Rumelhart, J. L. McClelland, and the PDP Research Group, MIT Press, Vol. 1, Chap. 8, 1986, 318-362.
Scott, G. M., and Ray, W. H. "Creating Efficient Nonlinear Neural Network Process Models That Allow Model Interpretation," in J. Proc. Cont., Vol. 3, No. 3, 1993, 163-178.
Seborg, D. E., Edgar, T. F., and Shah, S. L. "Adaptive Control Strategies for Process Control: A Survey," in AIChE J., Vol. 32, No. 6, 1986, 881-913.
Skogestad, S., Lundstrom, P., and Jacobsen, E. W. "Selecting the Best Distillation Control Structure," in AIChE J., Vol. 36, No. 5, 1990, 753-764.
Thibault, J., and Grandjean, B. P. A. "Neural Networks in Process Control-A Survey," in Advanced Control of Chemical Processes. Eds. K. Najim and E. Dufour, IFAC Symposium Series No. 8, 1992, 251-260.
Treybal, R. E. Mass Transfer Operations. 3rd Edition, McGraw-Hill Co., New York, 1980, 342-473.
Tyreus, B. D., and Luyben, W. L. "Controlling Heat Integrated Distillation Columns," in Chem. Eng. Progress, Vol. 72, No. 9, 1976, 59-66.
Venkatasubramanian, V., Vaidyanathan, R., and Yamamoto, Y. "Process Fault Detection and Diagnosis Using Neural Networks-1. Steady-State Processes," in Comp. and Chem. Eng., Vol. 14, No. 7, 1990, 699-712.
Venkatasubramanian, V., and Chan, K. "A Neural Network Methodology for Process Fault Diagnosis," in AIChE J., Vol. 35, 1989, 1993-2005.
Watrous, R. L. "Learning Algorithms for Connections and Networks: Applied Gradient Methods for Nonlinear Optimization," in Proc. IEEE Conf. on Neural Networks, Vol. II, 1987, 619-627.
Weigend, A. S., Huberman, B. A., and Rumelhart, D. E. "Predicting the Future: A Connectionist Approach," in Intl. J. Neural Systems, Vol. 1, No. 3, 1990, 193-209.
Werbos, P. "Beyond Regression: New Tools for Prediction and Analysis in Behavioral Sciences," Ph.D. Dissertation, Harvard University, 1974.
White, H. "Some Asymptotic Results for Backpropagation," in Proc. IEEE Conf. on Neural Networks, Vol. III, 1987, 261-266.
Widrow, B. "Generalization and Information Storage in Networks of Adaline 'Neurons'," in Self-Organizing Systems. Eds. M. C. Jovitz, G. T. Jacobi, and G. Goldstein, Spartan Books, Washington, DC, 1962, 435-461.
Willis, M. J., Montague, G. A., Di Massimo, C., Tham, M. T., and Morris, A. J. "Artificial Neural Network Based Predictive Control," in Advanced Control of Chemical Processes. Eds. K. Najim and E. Dufour, IFAC Symposium Series No. 8, 1992, 261-266.
Willis, M. J., Di Massimo, C., Montague, G. A., Tham, M. T., and Morris, A. J. "On The Applicability of Neural Networks in Chemical Process Control." Presented at the AIChE Annual Meeting, Chicago, IL, 1990, Paper 16d.
Wood, R. K., and Berry, M. W. "Terminal Composition Control of a Binary Distillation Column," in Chem. Eng. Sci., Vol. 28, 1973, 1707-1717.
You, Y., and Nikolaou, M. "Dynamic Process Modeling with Recurrent Neural Networks," in AIChE J., Vol. 39, No. 10, 1993, 1654-1667.
Zurada, J. M. Introduction to Artificial Neural Systems. West Publishing Co., New York, 1992.
APPENDIX A
ERROR BACKPROPAGATION TRAINING
ALGORITHM
The following development of the error backpropagation training algorithm has been
adapted from the class notes of Dr. W. J. B. Oldham's graduate course on neural networks
at Texas Tech University (CS 5388: Neural Networks). Let us consider a three-layered
feedforward neural network with hyperbolic tangent transfer functions for all the nodes in
the hidden and output layers. The hyperbolic tangent transfer function is given as
ψ(x) = tanh(x),

which can also be written as

ψ(x) = (e^x - e^(-x)) / (e^x + e^(-x)),   (A.1)

and the derivative of ψ(x) is given as

ψ'(x) = 1 - [(e^x - e^(-x)) / (e^x + e^(-x))]^2,

i.e.,

ψ'(x) = 1 - (ψ(x))^2.   (A.2)

The hyperbolic tangent transfer function described in Equation A.1 is continuous and is
bounded between ±1 for the range -∞ < x < ∞.
Consider the three-layered network as shown in Figure A.1. The details of the
processes occurring at the 'i'th hidden node and the 'k'th output node are shown in
Figures A.2 and A.3, respectively. The summed input to the 'i'th node in the hidden
layer can be written as

z_i = Σ_j w1_{i,j} x_j,   (A.3)

and the transformed output from the 'i'th node in the hidden layer can be written as
Figure A.1. A 3-layered feedforward neural network with inputs X(1), ..., X(j), ..., X(n) and outputs Y(1), ..., Y(k), ..., Y(m).
Figure A.2. Processes in the 'i'th neuron in the hidden layer.
Figure A.3. Processes in the 'k'th neuron in the output layer.
h_i = ψ(z_i).   (A.4)

Similarly, for the 'k'th node in the output layer, the summed input is given as

r_k = Σ_i w2_{k,i} h_i,   (A.5)

and the transformed output is given as

Y_k = ψ(r_k).   (A.6)
Let Y_pl and D_pl be the actual (network-predicted) and desired responses, respectively,
at the 'l'th output node for the 'p'th input pattern, where 1 ≤ p ≤ P, and P is the total
number of input patterns. If E_p is defined as half of the squared error between the
desired and actual values, then

E_p = (1/2) Σ_{l=1..nl} (Y_pl - D_pl)^2,   (A.7)

where nl is the total number of outputs from the network. The total error over all P
patterns is then

E = Σ_{p=1..P} E_p,

or

E = (1/2) Σ_{p=1..P} Σ_{l=1..nl} (Y_pl - D_pl)^2.   (A.8)
In a least-squares method, the weights are adjusted to minimize E. In the error
backpropagation training algorithm (EBTA), the weights are adjusted to minimize E_p.
The change in the second layer of weights is given as

Δ_p w2_{l,i} = -α ∂E_p/∂w2_{l,i},   (A.9)

where α is the learning rate (step size). The partial derivative of the error function can be
obtained by differentiating Equation A.7 as shown below:
∂E_p/∂w2_{l,i} = (1/2) · 2 · (Y_pl - D_pl) ∂Y_pl/∂w2_{l,i},

i.e.,

∂E_p/∂w2_{l,i} = (Y_pl - D_pl) ∂Y_pl/∂w2_{l,i}.   (A.10)

But, according to Equation A.6,

Y_pl = ψ(r_pl).   (A.11)

Therefore,

∂Y_pl/∂w2_{l,i} = ψ'(r_pl) ∂r_pl/∂w2_{l,i}.   (A.12)

Substituting Equations A.2 and A.11 in Equation A.12 gives

∂Y_pl/∂w2_{l,i} = (1 - (Y_pl)^2) ∂r_pl/∂w2_{l,i}.   (A.13)

Equation A.10, after substitution from Equation A.13, becomes

∂E_p/∂w2_{l,i} = (Y_pl - D_pl)(1 - (Y_pl)^2) ∂r_pl/∂w2_{l,i}.   (A.14)

Now, differentiating Equation A.5 with respect to w2_{l,i} gives

∂r_pl/∂w2_{l,i} = h_pi.   (A.15)

Substituting Equation A.15 in Equation A.14 gives

∂E_p/∂w2_{l,i} = (Y_pl - D_pl)(1 - (Y_pl)^2) h_pi,   (A.16)

and substituting Equation A.16 in Equation A.9 yields

Δ_p w2_{l,i} = α (D_pl - Y_pl)(1 - (Y_pl)^2) h_pi.   (A.17)
With a similar treatment for the weights between the input and hidden layers, it can be
shown that

∂E_p/∂w1_{i,j} = Σ_{l=1..nl} (Y_pl - D_pl)(1 - (Y_pl)^2) ∂r_pl/∂w1_{i,j},   (A.18)

and

∂r_pl/∂w1_{i,j} = (∂r_pl/∂h_pi)(∂h_pi/∂z_pi)(∂z_pi/∂w1_{i,j}).   (A.19)

Differentiating Equation A.5 with respect to h_pi gives

∂r_pl/∂h_pi = w2_{l,i},   (A.20)

differentiating Equation A.4 with respect to z_pi gives

∂h_pi/∂z_pi = 1 - (h_pi)^2,   (A.21)

and differentiating Equation A.3 with respect to w1_{i,j} gives

∂z_pi/∂w1_{i,j} = x_pj.   (A.22)

Substituting Equations A.20, A.21, and A.22 in Equation A.19, and then substituting
Equation A.19 in Equation A.18, gives

∂E_p/∂w1_{i,j} = (1 - (h_pi)^2) x_pj Σ_{l=1..nl} (Y_pl - D_pl)(1 - (Y_pl)^2) w2_{l,i},

which can be expressed as

Δ_p w1_{i,j} = α (1 - (h_pi)^2) x_pj Σ_{l=1..nl} (D_pl - Y_pl)(1 - (Y_pl)^2) w2_{l,i}.   (A.23)

Equations A.17 and A.23 are the relations used to "backpropagate" the error from the
outputs to the inputs. In general, the EBTA weight adjustment for an M-layered neural
network can be shown to take the form given below:

Δ_p w^{M-l}_{i,j} = α δ^{M-l+1}_i x^{M-l}_j,   for l = 1, 2, ..., M-1,

where

δ^{M-l+1}_i = (D_i - Y_i)(1 - (Y_i)^2),   for l = 1,

and

δ^{M-l+1}_i = (1 - (x^{M-l+1}_i)^2) Σ_{k=1..n} δ^{M-l+2}_k w^{M-l+1}_{k,i},   for l = 2, 3, ..., M-1,

where x^m_j denotes the output of the 'j'th node in the 'm'th layer and n is the number of
nodes in the (M-l+2)th layer.
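As an illustration (not part of the original appendix), the per-pattern updates of Equations A.17 and A.23 can be sketched in Python; the network size, weight values, and learning rate below are arbitrary choices for demonstration.

```python
import math

def forward(x, w1, w2):
    """Forward pass through a 3-layered tanh network (Equations A.3-A.6)."""
    z = [sum(w1[i][j] * x[j] for j in range(len(x))) for i in range(len(w1))]
    h = [math.tanh(zi) for zi in z]                      # hidden outputs (A.4)
    r = [sum(w2[k][i] * h[i] for i in range(len(h))) for k in range(len(w2))]
    y = [math.tanh(rk) for rk in r]                      # network outputs (A.6)
    return h, y

def ebta_deltas(x, d, w1, w2, alpha):
    """Weight changes for one input pattern, per Equations A.17 and A.23."""
    h, y = forward(x, w1, w2)
    # second-layer change: alpha * (D - Y) * (1 - Y^2) * h            (A.17)
    dw2 = [[alpha * (d[k] - y[k]) * (1.0 - y[k] ** 2) * h[i]
            for i in range(len(h))] for k in range(len(w2))]
    # first-layer change: alpha * (1 - h^2) * x * backpropagated sum  (A.23)
    dw1 = [[alpha * (1.0 - h[i] ** 2) * x[j] *
            sum((d[k] - y[k]) * (1.0 - y[k] ** 2) * w2[k][i]
                for k in range(len(w2)))
            for j in range(len(x))] for i in range(len(w1))]
    return dw1, dw2

# small illustrative network: 2 inputs, 3 hidden nodes, 1 output
w1 = [[0.1, -0.2], [0.3, 0.05], [-0.15, 0.2]]
w2 = [[0.2, -0.1, 0.25]]
dw1, dw2 = ebta_deltas([0.5, -0.3], [0.4], w1, w2, alpha=0.1)
```

Because the updates equal -α ∂E_p/∂w, they can be checked against a finite-difference gradient of E_p, which is a convenient sanity test for any hand-coded backpropagation.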
APPENDIX B
THE MARQUARDT ALGORITHM
The Marquardt-Levenberg method (also known as the Marquardt method)
(Marquardt, 1963) is a nonlinear optimization and equation-solving technique. The
algorithm can be used to estimate unknown variables in sets of nonlinear equations where
the number of variables is less than or equal to the number of equations. Simple
constraints on the parameters may be used to keep the solution in bounds. The following
paragraphs give a description and explanation for the usage of a FORTRAN subroutine
Marquardt (Ramchandran, 1993). Examples are provided to illustrate the usage of the
program code.
B.1. Description
Consider a set of n equations in k unknown variables of the form:

f_1(x_1, x_2, ..., x_k) = y_1,
f_2(x_1, x_2, ..., x_k) = y_2,
...
f_n(x_1, x_2, ..., x_k) = y_n,

where x_i are the unknown variables, y_i are the known values, and f_i are the known
functions. Also n ≥ k, and (x_i)_min ≤ x_i ≤ (x_i)_max, where (x_i)_min and (x_i)_max
are the minimum and maximum constraints on the unknowns.
The algorithm seeks to find a set of x that will minimize a user-defined function, such
as the sum of squares error, φ, given by

φ = Σ_{i=1..n} (f_i - y_i)^2.
The algorithm can also be used to maximize a user-defined function and will also
handle weighted objective functions. The method of solution combines the best features
of the gradient and Newton-Raphson procedures by using a suitable weighting parameter
(the parameter is adjusted internally by the routine). The method has the stability of the
gradient procedure with respect to poor starting values, and at the same time, it possesses
the speed of convergence of the Newton-Raphson method when close to the final solution.
More details on the Marquardt method may be obtained from Press et al. (1992) and
Battiti (1992). The FORTRAN program code for the Marquardt method is attached in
Appendix B.
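As a sketch of the idea (this Python code is not the FORTRAN subroutine itself; the routine name, step sizes, and the test problem below are illustrative assumptions), the core Marquardt iteration blends the gradient and Newton-Raphson directions through the factor λ, shrinking λ after a successful step and growing it by the factor ν after a failed one:

```python
def solve_linear(A, b):
    """Gaussian elimination with partial pivoting for a small k x k system."""
    n = len(b)
    M = [row[:] + [bv] for row, bv in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for cc in range(c, n + 1):
                M[r][cc] -= factor * M[c][cc]
    xs = [0.0] * n
    for r in range(n - 1, -1, -1):
        xs[r] = (M[r][n] - sum(M[r][cc] * xs[cc]
                               for cc in range(r + 1, n))) / M[r][r]
    return xs

def marquardt(f, x0, y, lam=0.01, nu=10.0, tol=1e-12, max_iter=500):
    """Adjust x to drive the n functions f(x) toward the desired values y
    (n >= k), minimizing the sum of squares phi = sum (f_i - y_i)^2."""
    def phi(xv):
        return sum((fi - yi) ** 2 for fi, yi in zip(f(xv), y))
    x = list(x0)
    k = len(x)
    for _ in range(max_iter):
        fx = f(x)
        r = [yi - fi for yi, fi in zip(y, fx)]       # residuals y - f(x)
        h = 1e-7                                     # finite-difference Jacobian
        J = [[0.0] * k for _ in fx]
        for j in range(k):
            xp = list(x)
            xp[j] += h
            fp = f(xp)
            for i in range(len(fx)):
                J[i][j] = (fp[i] - fx[i]) / h
        # normal equations (J^T J + lam I) delta = J^T r
        A = [[sum(J[i][a] * J[i][b] for i in range(len(fx))) +
              (lam if a == b else 0.0) for b in range(k)] for a in range(k)]
        g = [sum(J[i][a] * r[i] for i in range(len(fx))) for a in range(k)]
        delta = solve_linear(A, g)
        if max(abs(d) for d in delta) < tol:
            break
        trial = [xi + di for xi, di in zip(x, delta)]
        if phi(trial) < phi(x):
            x = trial
            lam /= nu     # success: lean toward the Newton-Raphson direction
        else:
            lam *= nu     # failure: lean toward the steepest descent direction
    return x
```

For example, driving the two residuals of the Rosenbrock test problem to zero from the standard starting point (-1.2, 1.0) converges to (1, 1).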
B.2. Limitations
The routine may converge to a relative minimum in the sum of squares surface (rather
than the global minimum), get hung up on a ridge (when the ridge is very long and
narrow), or terminate due to round-off errors. The bounds are intended to keep the
solution inside a feasible region, but it is assumed that the solution does not lie on a
bound. If the answer does lie on a bound, then it may not be found. If the solution is on
one of the bounds, then the user might want to extend the bounds and retry.
B.3. Program Usage
The FORTRAN subroutine MARQUARDT calculates the derivatives numerically.
The program can, however, be modified to handle analytical derivatives as well. The user
is required to code the equations for the function evaluations in the calling program.
B.4. Declaratives
The following declaratives are needed for proper execution of the program:
DIMENSION B(2*K), Z(2*N), Y(N), BV(K), BMIN(K), BMAX(K), P(K*(N+2)+N),
A(KD,1), AC(KD,1), CC(6), INDEX(5), OUTPUT(5), where K and N are defined in the
argument list.
B.5. Argument List
The calling sequence is CALL MARQUARDT (K, B, N, Z, Y, CC, INDEX, BV,
BMIN, BMAX, OUTPUT, KD, P, A, AC, INDEX1), where

1. K-the number of independent variables (K > 1). (Input)

2. B-the vector of K unknowns. On first entry into the subroutine, initial estimates
must be supplied for B(1) through B(K). On each exit, the routine supplies a new
improved estimate of the unknowns. On the final exit, the vector contains the "best" point
found to date. Locations B(K+1) through B(2*K) always contain the "best" point (i.e.,
the base point) found to date. (Input and Output)

3. N-the number of equations to be solved (N ≥ K). (Input)

4. Z-a vector of the N computed function values calculated in the calling program
before first entry and on subsequent requests for function evaluations. Locations Z(N+1)
through Z(2*N) contain function values corresponding to B(K+1) through B(2*K).
(Input)

5. Y-a vector of N desired function values. (Input)

6. CC-a real data storage vector for the convergence criteria factors.

(i) CC(1)-initial value of ν. If CC(1) ≤ 0.0, ν is set internally to 10.0. ν is the factor
used to change λ by multiplication or division. For a finer one-dimensional search, set ν
to a smaller value, say 2.0. (Input)
(ii) CC(2)-initial value of λ. If CC(2) ≤ 0.0, λ is set internally to 0.01. The value will
automatically change as computation continues. λ is the factor that is used to combine the
moves from the gradient and Newton-Raphson methods. When λ is large (i.e., 1.0), the
search is primarily in the negative gradient direction. When λ is small (i.e., 0.00001), it is
primarily in the Newton-Raphson direction. (Input)

(iii) CC(3)-initial value of τ. If CC(3) ≤ 0.0, τ is set internally to 0.001. τ is used in
the convergence test (explained under INDEX(3)). (Input)

(iv) CC(4)-initial value of ε. If CC(4) ≤ 0.0, ε is set internally to 0.00002. ε is used
in the convergence test. (Input)

(v) CC(5)-initial value of φ_min. If CC(5) ≤ 0.0, φ_min is set internally to 0.0. When
φ < φ_min, the partial derivatives from the previous iteration are used instead of
computing them again. (Input)

(vi) CC(6)-error limit, set to 1.0E-10. (Input)
7. INDEX-an integer storage vector.

(i) INDEX(1)-used to control the sequence of operations internally. Must be set to
1 on initial entry into MARQUARDT. It is reset by MARQUARDT after initial entry.
Table B.1 describes the values that INDEX(1) can have and their corresponding meaning.
(Input and Output)

(ii) INDEX(2)-used to determine if a function or derivative needs to be calculated, or
if a new base point is being reported. It is set by MARQUARDT. Table B.2 describes the
values that INDEX(2) can take and their corresponding meaning. (Output)

(iii) INDEX(3)-indicates the status of the search at a new base point. INDEX(3) is set
to K initially. Table B.3 describes the values that INDEX(3) can take and their
corresponding meaning. (Output)

(iv) INDEX(4)-iteration counter. (Output)
Table B.1. Values for INDEX(1) and their Corresponding Meaning

Value   Meaning
  1     Must be set on initial entry. (Input)
  2     Analytical derivative mode (not applicable in this version). (Output)
  3     Numerical derivative mode. (Output)
  4     Search mode. (Output)
  5     New base point mode. (Output)
 -1     Search cannot continue. (Output)
Table B.2. Values for INDEX(2) and their Corresponding Meaning

Value   Meaning
  0     Calculate the function, Z(X), for all new values of X.
  1     Calculate the derivative vector of the function with respect to X(J),
        where J is given in INDEX1.
 -1     A new base point has been found (the starting point is a new base
        point). Examine INDEX(3) for convergence.
Table B.3. Values for INDEX(3) and their Corresponding Meaning

Value   Meaning
 > 0    Gives the number of variables not satisfying the convergence
        criterion, i.e., those for which |Δx_i| > τ|x_i| + ε, where τ and ε are
        specified in CC(3) and CC(4). Recall MARQUARDT.
   0    All parameters satisfy the convergence criterion.
 < 0    Discussed under Error Returns.
8. BV-a vector indicating which of the B variables are actually to be varied by the
program. It may be varied (by the user) after each new base point. If BV(I) = 0.0, hold
B(I) constant. If BV(I) = 1.0, allow B(I) to vary by using numerical derivatives. (Input)

9. BMIN-a vector containing the lower bounds on all B variables. It may be varied
(by the user) after each new base point has been found. (Input)

10. BMAX-a vector containing the upper bounds on all B variables. It may be varied
(by the user) after each new base point has been found. (Input)
11. OUTPUT-an output vector of real variables which is reported at each new base
point. (Output)

(i) OUTPUT(1)-φ, the value of the user-defined objective function at the current base
point.

(ii) OUTPUT(2)-γ, the angle in degrees between the step actually taken and the
steepest descent direction at the last base point.

(iii) OUTPUT(3)-a counter for the number of times a return from MARQUARDT is
made. It is set to zero on first entry and incremented by 1 on each exit.

(iv) OUTPUT(4)-a counter for the number of function evaluations required by
MARQUARDT. It is set to 1 on initial entry (to count the initial function evaluation)
and incremented by 1 each time a return from a function evaluation is made.

(v) OUTPUT(5)-a counter for the number of derivative evaluations. It is set to 0 on
the first entry and incremented by one each time a return from a partial derivative request
is made.
12. KD-the number of rows of the storage matrices A and AC in the calling program;
KD must be greater than or equal to K. Generally, KD = K+2. (Input)

13. P-a scratch vector used to store the values of all the partial derivatives computed
in the calling program. The first N*K locations contain the partial derivatives stored
columnwise:
∂z_1/∂x_1   ∂z_1/∂x_2   ...   ∂z_1/∂x_k
∂z_2/∂x_1   ∂z_2/∂x_2   ...   ∂z_2/∂x_k
   ...         ...               ...
∂z_n/∂x_1   ∂z_n/∂x_2   ...   ∂z_n/∂x_k
The partial derivatives in the P vector are calculated using finite difference approximations.
The space in the P vector is used to store the following data:

P(1) - P(N*K) - the N*K Jacobian matrix
P(N*K+1) - P(N*K+K) - current value of B
P(N*K+K+1) - P(N*K+2K) - value of B at each new base point + ΔB
P(N*K+2K+1) - P(N*K+2K+N) - value of Z corresponding to the current B
14. A-a scratch matrix used internally, of dimension (K, KD).

15. AC-a scratch matrix used internally, of dimension (K, KD).

16. INDEX1-a dummy counter to store the index of B when evaluating functions.
B.6. Error Returns

INDEX(2) = -1 implies that either a new base point has been found (i.e., INDEX(1) =
5) or that the search cannot be continued (i.e., INDEX(1) = -1). If the search is to be
terminated, either INDEX(3) has to be 0 (i.e., the convergence criteria have been satisfied)
or INDEX(3) is negative. Table B.4 describes the values that INDEX(3) can take and
their corresponding meaning.
Table B.4. Values for INDEX(3) Under Error Returns and their Corresponding Meaning

Value   Meaning
 -1     A new base point has been found, but λ > 1 and γ > 90°, and the
        convergence criteria have not been met. This implies that numerical
        difficulties are present.
 -2     There are more unknowns than equations (N < K).
 -3     The total number of variables to be varied is zero, as indicated in the
        BV vector.
 -4     The convergence criteria have been met (same as INDEX(3) = 0), but
        λ > 1 and γ > 45°. This generally means that progress has been very
        slow, perhaps due to the presence of a ridge.
 -5     On entry the value of INDEX(1) was 0 or negative.
 -6     One of the variables was out of the stated range of BMAX and BMIN
        on entry.
 -7     The value of λ > 10 but the convergence criteria have not been met.
        This implies that the convergence tolerance may be too small.
 -8     The convergence criteria have been met in equation solving, but φ is
        greater than the error limit, CC(6). This implies the existence of a
        relative minimum which is not an exact solution.
B.7. Examples for Use of the Marquardt Method for Equation Solving and Optimization
The examples discussed below illustrate the use of the FORTRAN subroutine
MARQUARDT as an equation solver and an optimizer.
B.7.1. Example 1: Multiple Reaction System (Example 8-12.2 from Henley and Rosen, 1969)
Synthesis gas manufacture involves the following reactions:

CH4 + H2O ⇌ CO + 3H2   (I)

CO + H2O ⇌ CO2 + H2   (II)

The K_a values for these reactions are 0.59 and 2.49, respectively. If the ideal gas law and
Dalton's law apply, how many moles of each component are present at equilibrium if
initially 6 moles of CH4 and 5 moles of H2O are charged? The pressure is 1 atmosphere.
Solution: The equation relating the equilibrium constant to the standard free energy of
reaction and the liquid- or gas-phase composition can be written as

(ln K_{a,T})_j = Σ_{i=1..N} a_{ij} ln(x_i p_i) = Σ_{i=1..N} a_{ij} ln(y_i P) = -(ΔG°_{j,T} / RT),

where j = 1, 2, ..., M reactions; x_i is the mole fraction of component i in the liquid
phase; p_i is the vapor pressure; K_{a,T} is the reaction equilibrium constant at
temperature T; y_i is the mole fraction of component i in the gas phase; a_{ij} is the
stoichiometric coefficient of component i in reaction j; ΔG°_{j,T} is the standard free
energy change for the reaction at temperature T; and R is the gas constant.
We seek a solution for the extents of reaction, e_1 and e_2, that make f_1 and f_2 equal
to zero, where f_1 and f_2 are given by

f_1(e_1, e_2) = -ln 0.59 + Σ_{i=1..5} a_{i1} ln(y_i · 1),

f_2(e_1, e_2) = -ln 2.49 + Σ_{i=1..5} a_{i2} ln(y_i · 1).
The stoichiometric matrix, a_{ij}, and initial moles are as given below:

Component   Reaction (I)   Reaction (II)   Initial Moles
CH4             -1              0               6
H2O             -1             -1               5
CO               1             -1               0
H2               3              1               0
CO2              0              1               0
If n_{i0} is the initial number of moles of component i, then the moles present at any set
of extents are given as

n_i = n_{i0} + Σ_{j=1..2} a_{ij} e_j,   i = 1, 2, ..., 5.

Using the above equation, the values of n_i are calculated, f_1 and f_2 are evaluated, and
e_1 and e_2 are adjusted by MARQUARDT to minimize f_1 and f_2. The FORTRAN
program code for Example 1 is attached in Appendix D along with the corresponding
program output after execution.
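For comparison (an illustration, not part of the original appendix), the same equilibrium problem can be solved with a small damped Newton iteration in Python standing in for MARQUARDT; the starting point, step damping, and tolerances below are assumed values, and the K values are those quoted in the problem statement.

```python
import math

def moles(e1, e2):
    """Moles of CH4, H2O, CO, H2, CO2 at extents e1, e2, from the
    stoichiometric matrix and initial charge given above."""
    return [6.0 - e1, 5.0 - e1 - e2, e1 - e2, 3.0 * e1 + e2, e2]

def residuals(e1, e2, K1=0.59, K2=2.49):
    """f1 and f2: log-form equilibrium relations for reactions (I) and (II)
    at 1 atmosphere."""
    n = moles(e1, e2)
    N = sum(n)                       # total moles = 11 + 2*e1
    y = [ni / N for ni in n]
    f1 = (math.log(y[2]) + 3.0 * math.log(y[3])
          - math.log(y[0]) - math.log(y[1]) - math.log(K1))
    f2 = (math.log(y[4]) + math.log(y[3])
          - math.log(y[2]) - math.log(y[1]) - math.log(K2))
    return [f1, f2]

def solve_extents(e1=2.0, e2=0.5, tol=1e-10):
    """Damped Newton iteration with a finite-difference Jacobian."""
    for _ in range(200):
        f = residuals(e1, e2)
        if max(abs(fi) for fi in f) < tol:
            break
        h = 1e-7
        fa = residuals(e1 + h, e2)
        fb = residuals(e1, e2 + h)
        J = [[(fa[i] - f[i]) / h, (fb[i] - f[i]) / h] for i in range(2)]
        det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
        d1 = (-f[0] * J[1][1] + f[1] * J[0][1]) / det   # Cramer's rule for
        d2 = (-f[1] * J[0][0] + f[0] * J[1][0]) / det   # J * delta = -f
        s = 1.0                # damp the step so all mole numbers stay positive
        while min(moles(e1 + s * d1, e2 + s * d2)) <= 1e-6:
            s *= 0.5
        e1 += s * d1
        e2 += s * d2
    return e1, e2
```

The converged extents then give the equilibrium moles of each component through moles(e1, e2).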
B.7.2. Example 2: Find the solution to the following set of equations:

3x_1 + x_2 + 2x_3^2 - 3 = 0

-3x_1 + 5x_2^2 + 2x_1 x_3 - 1 = 0

25x_1 x_2 + 20x_3 + 12 = 0

Starting values are (1.0, 1.0, 1.0). The equations are scaled to the form f(x) = 1, and
the y vector is set to 1.0. Using numerical derivatives, the solution was found after 28
evaluations of the function. The routine was entered 34 times. The angle between the
actual and steepest descent direction for the last step was 51.16°. The FORTRAN
program code for Example 2 is attached in Appendix D along with the corresponding
program output after execution.
B.7.3. Example 3: It is desired to determine the parameters a_1, a_2, a_3, and a_4 that fit
an equation of the form

y_i = a_1 e^{a_2 t_i} + a_3 e^{a_4 t_i}

to a set of nine data points in a least-squares sense. Starting values, and maximum and
minimum values for the a's, are given in the DATA statements. The FORTRAN program
code for Example 3 is attached in Appendix D along with the corresponding program
output after execution.
APPENDIX C
EMPIRICAL CORRELATIONS FOR THE METHANOL-
WATER SYSTEM
The dynamic simulations for the methanol-water distillation columns require system
specific information that describes the vapor-liquid equilibrium (VLE) for the methanol-
water system under the chosen operating conditions of pressure and temperature, the
liquid and vapor enthalpy data, and the liquid and vapor density data. While the
differential equations that describe the material and energy balances (discussed in Chapter
IV, Section 4.2) are standard for any distillation column, the behavior of the distillation
column, both from a steady state as well as a dynamic viewpoint, is dependent on the
thermodynamic properties of the system under consideration.
The system under consideration is a methanol-water system which is a nonideal binary
mixture. In order to get an accurate description of the thermodynamic behavior of the
system, one has to account for the nonidealities in the vapor and liquid phases. Since the
operating pressures are close to atmospheric pressure, one can assume that the vapor
phase behaves ideally, and that the nonideal behavior is essentially in the liquid phase. The
liquid-phase nonideality can be described with the help of an activity coefficient model,
such as the Wilson model or the NRTL model. Even though the activity models describe
the thermodynamic behavior more accurately, they tend to be more complicated from a
computational standpoint. Also, the parameters of the activity model are values
regressed from experimental data obtained over a specific range of conditions. Instead of
using a detailed thermodynamic model in the dynamic simulations, empirical relations
were developed using experimental data to correlate the VLE, vapor- and liquid-phase
enthalpies, and densities as functions of the liquid-phase composition. The empirical
correlations are standard polynomial regression equations that provide sufficient accuracy
with the advantage of high computational speed.
C.1. Correlations for the Lab-Column Dynamic Simulator

The lab column operates essentially at atmospheric pressure, and the vapor-liquid
equilibrium for the methanol-water system at one atmosphere absolute pressure
(Henley and Seader, 1981) is reported in Table C.1, along with the corresponding
equilibrium temperature. Table C.2 shows enthalpy data for the same system for one
atmosphere absolute (Henley and Seader, 1981). Data for the average molecular weight,
saturated liquid and vapor density, and density of liquid subcooled to 120°F were obtained
from a steady-state process simulation package (HYSIM) using the NRTL thermodynamic
model for the methanol-water system at one atmosphere absolute pressure. Table C.3
shows the above mentioned data.
C.1.1. Correlation for Vapor-Liquid Equilibrium

The composition of the vapor phase y, in mole fraction methanol, in terms of the
liquid-phase composition x, in mole fraction methanol, is given by the equation

y = 0.0207 + 5.6509x - 20.2753x^2 + 37.8756x^3 - 33.4747x^4 + 11.2092x^5.   (C.1)

Figure C.1 shows the fit obtained by the above equation.
C.1.2. Correlation for Saturation Temperature

The saturation temperature T, in degrees Fahrenheit, at any liquid-phase composition
x (mole fraction methanol), is given by the equation

T = 210.76 - 243.45x + 515.74x^2 - 547.70x^3 + 213.33x^4.   (C.2)

Figure C.2 shows the fit obtained by the above equation.
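As an illustrative check (not part of the original appendix; the function names are arbitrary), Equations C.1 and C.2 can be coded directly and compared against the tabulated equilibrium data:

```python
def vle_y(x):
    """Equation C.1: vapor-phase mole fraction methanol at liquid-phase
    mole fraction x (fifth-order polynomial correlation)."""
    return (0.0207 + 5.6509 * x - 20.2753 * x**2 + 37.8756 * x**3
            - 33.4747 * x**4 + 11.2092 * x**5)

def sat_temperature(x):
    """Equation C.2: saturation temperature in deg F at liquid-phase
    mole fraction x."""
    return (210.76 - 243.45 * x + 515.74 * x**2
            - 547.70 * x**3 + 213.33 * x**4)
```

At x = 0.5, Equation C.1 gives y ≈ 0.770 versus 0.78 in Table C.1, and Equation C.2 gives about 162.8 deg F versus the tabulated 73.1 deg C (about 163.6 deg F), consistent with the regression quality shown in Figures C.1 and C.2.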
Table C.1. Vapor-Liquid Equilibrium for Methanol-Water System at 1 atma (from Henley and Seader, 1981).

X (mf MeOH)   Y (mf MeOH)   T (deg. C)
0.00          0.00          100.00
0.02          0.13           96.40
0.04          0.23           93.50
0.06          0.30           91.20
0.08          0.37           89.30
0.10          0.42           87.70
0.15          0.52           84.40
0.20          0.58           81.70
0.30          0.67           78.00
0.40          0.73           75.30
0.50          0.78           73.10
0.60          0.83           71.20
0.70          0.87           69.30
0.80          0.92           67.60
0.90          0.96           66.00
0.95          0.98           65.00
1.00          1.00           64.50
Table C.2. Enthalpy Data for Methanol-Water System at 1 atma (Henley and Seader, 1981).

X or Y (mf MeOH)   HV (BTU/lbmol)   hL (BTU/lbmol)
0.00               20720.00         3240.00
0.05               20520.00         3070.00
0.10               20340.00         2950.00
0.15               20160.00         2850.00
0.20               20000.00         2760.00
0.30               19640.00         2620.00
0.40               19310.00         2540.00
0.50               18970.00         2470.00
0.60               18650.00         2410.00
0.70               18310.00         2370.00
0.80               17980.00         2330.00
0.90               17680.00         2290.00
1.00               17930.00         2250.00
Table C.3. Transport Property Data for Methanol-Water System at 1 atma.

X (mf MeOH)   AMW (lb/lbmol)   T (deg. F)   rho_L (lb/ft^3)   rho_V (lb/ft^3)   rho_L_SC (lb/ft^3)
0.00          18.02            212.00       59.18             0.0367            61.75
0.05          18.72            198.84       58.12             0.0383            60.37
0.10          19.42            190.01       57.07             0.0399            59.11
0.15          20.12            183.65       56.07             0.0414            57.96
0.20          20.82            178.80       55.12             0.0431            56.90
0.25          21.52            174.94       54.24             0.0447            55.93
0.30          22.22            171.75       53.41             0.0463            55.04
0.35          22.93            169.04       52.65             0.0480            54.21
0.40          23.63            166.67       51.93             0.0497            53.45
0.45          24.33            164.55       51.26             0.0514            52.73
0.50          25.03            162.63       50.64             0.0532            52.07
0.55          25.73            160.85       50.07             0.0550            51.46
0.60          26.43            159.19       49.53             0.0568            50.89
0.65          27.13            157.62       49.03             0.0586            50.35
0.70          27.83            156.12       48.57             0.0605            49.85
0.75          28.54            154.98       48.13             0.0625            49.39
0.80          29.24            153.29       47.73             0.0644            48.95
0.85          29.94            151.94       47.36             0.0664            48.55
0.90          30.64            150.62       47.01             0.0683            48.17
0.95          31.34            149.33       46.69             0.0703            47.81
1.00          32.04            148.07       46.40             0.0722            47.48
Figure C.1. Vapor-liquid equilibrium for methanol-water system at 1 atma (fifth-order polynomial fit of Equation C.1; r^2 = 0.9991).
Figure C.2. Saturated liquid temperature versus liquid-phase composition for methanol-water system at 1 atma.
C.1.3. Correlation for Saturated Liquid Density
The saturated liquid density rho_L, in lb/ft^3, at any liquid-phase composition x (mole fraction methanol), is given by the equation
rho_L = 59.215 - 23.01x + 13.298x^2 - 3.104x^3.    (C.3)
Figure C.3 shows the fit obtained by the above equation.

C.1.4. Correlation for Saturated Vapor Density
The saturated vapor density rho_V, in lb/ft^3, at any vapor-phase composition y (mole fraction methanol), is given by the equation
rho_V = 3.68×10^-2 + 3.02×10^-2 y + 5.57×10^-3 y^2 - 2.02×10^-4 y^3.    (C.4)
Figure C.4 shows the fit obtained by the above equation.

C.1.5. Correlation for Liquid Density at 120°F
The liquid density at 120°F, rho_L_SC, in lb/ft^3, at any liquid-phase composition x (mole fraction methanol), is given by the equation
rho_L_SC = 61.696 - 27.477x + 19.492x^2 - 6.269x^3.    (C.5)
Figure C.5 shows the fit obtained by the above equation.

C.1.6. Correlation for Liquid Enthalpy
The liquid enthalpy hL, in BTU/lbmol, at any liquid-phase composition x (mole fraction methanol), is given by the equation
hL = 3218.5 - 2918.9x + 3631.7x^2 - 1692.5x^3.    (C.6)
Figure C.6 shows the fit obtained by the above equation.
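Because Equations C.3 through C.6 are simple polynomials in composition, they can be evaluated directly. The sketch below (in Python rather than the dissertation's FORTRAN, for illustration only) evaluates the four 1-atma correlations with Horner's rule and can be spot-checked against the tabulated data.

```python
def horner(coeffs, z):
    """Evaluate a polynomial whose coefficients are listed in
    increasing power of z."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * z + c
    return result

# 1 atma correlations (Equations C.3-C.6); x, y in mole fraction methanol
def rho_L(x):       # saturated liquid density, lb/ft^3 (Eq. C.3)
    return horner([59.215, -23.01, 13.298, -3.104], x)

def rho_V(y):       # saturated vapor density, lb/ft^3 (Eq. C.4)
    return horner([3.68e-2, 3.02e-2, 5.57e-3, -2.02e-4], y)

def rho_L_SC(x):    # liquid density subcooled to 120 deg F, lb/ft^3 (Eq. C.5)
    return horner([61.696, -27.477, 19.492, -6.269], x)

def hL(x):          # saturated liquid enthalpy, BTU/lbmol (Eq. C.6)
    return horner([3218.5, -2918.9, 3631.7, -1692.5], x)
```

For example, at x = 1.0 the liquid-density correlation returns 46.399 lb/ft^3, matching the Table C.3 value of 46.40 lb/ft^3.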
Figure C.3. Saturated liquid density versus liquid-phase composition for methanol-water system at 1 atma. [Figure not reproduced; fitted curve den_L = 59.215 - 23.01X + 13.298X^2 - 3.104X^3, r^2 = 1.000.]
Figure C.4. Saturated vapor density versus liquid-phase composition for methanol-water system at 1 atma. [Figure not reproduced; fitted curve den_V = 3.68E-2 + 3.02E-2X + 5.57E-3X^2 - 2.02E-4X^3, r^2 = 1.00.]
Figure C.5. Subcooled liquid density versus liquid-phase composition for methanol-water system at 1 atma. [Figure not reproduced; fitted curve den_SCL = 61.696 - 27.477X + 19.492X^2 - 6.269X^3, r^2 = 1.000.]
Figure C.6. Saturated liquid enthalpy versus liquid-phase composition for methanol-water system at 1 atma. [Figure not reproduced; fitted curve hL = 3218.5 - 2918.9X + 3631.7X^2 - 1692.5X^3, r^2 = 0.9987.]
217
C. 1.7. Correlation for Vapor Enthalpy
The vapor enthalpy Hy, in BTU/lbmol, at any vapor-phase composhion >' (mole
fraction methanol) is given by the equation
/ / j . =20669.1-3338.3>'. (C.7)
Figure C.7 shows the fit obtained by the above equation.
C. 1.8. Correlation for Average Molecular Weight
The average molecular weight AMW, in Ib/lbmol, at any liquid-phase composition x
(mole fraction methanol), is given by the equation
/}A/W^ = 18.015+14.027X. (C.8)
Figure C.8 shows the fit obtained by the above equation.
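Since Equations C.6 and C.7 share a mole-fraction-methanol basis, their difference gives a rough molar heat of vaporization, a quantity useful in column energy balances. The sketch below (Python, illustrative only; the `latent_heat` function is not taken from the dissertation's simulator code) evaluates Equations C.6 through C.8.

```python
def HV(y):    # saturated vapor enthalpy at 1 atma, BTU/lbmol (Eq. C.7)
    return 20669.1 - 3338.3 * y

def AMW(x):   # average molecular weight, lb/lbmol (Eq. C.8)
    return 18.015 + 14.027 * x

def hL(x):    # saturated liquid enthalpy at 1 atma, BTU/lbmol (Eq. C.6)
    return 3218.5 - 2918.9 * x + 3631.7 * x**2 - 1692.5 * x**3

def latent_heat(z):
    """Rough molar latent heat (BTU/lbmol) at equal liquid and vapor
    composition z; an illustrative use of Eqs. C.6 and C.7 only."""
    return HV(z) - hL(z)
```

For instance, latent_heat(0.5) gives roughly 16,500 BTU/lbmol, between the values for pure water and pure methanol, as expected.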
C.2. Correlations for the High-Purity Column Dynamic Simulator
The high-purity column operates at approximately two atmospheres absolute pressure. The vapor-liquid equilibrium for the methanol-water system at two atmospheres absolute (computed with HYSIM using the NRTL thermodynamic model) is reported in Table C.4, along with the corresponding equilibrium temperature. Table C.5 shows enthalpy data for the same system at two atmospheres absolute. Data for the saturated liquid and vapor densities, and for the density of liquid subcooled to 120°F, were also obtained from the steady-state process simulation package (HYSIM) using the NRTL thermodynamic model for the methanol-water system at two atmospheres absolute; Table C.6 shows these data.
Figure C.7. Saturated vapor enthalpy versus vapor-phase composition for methanol-water system at 1 atma. [Figure not reproduced; fitted line HV = 20669.1 - 3338.3Y, r^2 = 0.9993.]
Figure C.8. Average molecular weight versus liquid-phase composition for methanol-water system. [Figure not reproduced; plot of AMW (lb/lbmol) vs. X (mf MeOH).]
Table C.4. VLE for Methanol-Water System at 2 atma.

X          Y          T
(mf MeOH)  (mf MeOH)  (deg. F)
0.000   0.0000   242.26
0.025   0.1442   235.04
0.050   0.2517   229.10
0.075   0.3346   224.14
0.100   0.4003   219.94
0.150   0.4977   213.22
0.200   0.5669   208.06
0.250   0.6194   203.95
0.300   0.6612   200.55
0.350   0.6962   197.67
0.400   0.7263   195.17
0.450   0.7533   192.94
0.500   0.7781   190.91
0.550   0.8014   189.04
0.600   0.8236   187.29
0.650   0.8454   185.63
0.700   0.8668   184.04
0.750   0.8881   182.51
0.800   0.9096   181.03
0.850   0.9315   179.60
0.900   0.9537   178.20
0.925   0.9651   177.51
0.950   0.9765   176.83
0.975   0.9882   176.16
1.000   1.0000   175.49
Table C.5. Enthalpy Data for Methanol-Water System at 2 atma.

X or Y     HV           hL           hL_SC
(mf MeOH)  (BTU/lbmol)  (BTU/lbmol)  (BTU/lbmol)
0.000   21301.89   3799.75   1588.99
0.025   21246.98   3723.40   1609.98
0.050   21191.66   3664.80   1630.98
0.075   21135.91   3622.16   1651.97
0.100   21079.71   3591.20   1672.96
0.150   20965.94   3555.07   1714.95
0.200   20850.24   3542.90   1756.93
0.250   20732.43   3546.72   1798.92
0.300   20612.40   3561.36   1840.90
0.350   20490.01   3583.56   1882.89
0.400   20365.11   3610.82   1924.87
0.450   20237.52   3641.58   1966.86
0.500   20107.09   3674.63   2008.84
0.550   19973.70   3709.23   2050.83
0.600   19836.14   3745.47   2092.81
0.650   19698.13   3780.43   2134.80
0.700   19556.36   3816.28   2176.78
0.750   19412.15   3851.94   2218.77
0.800   19269.19   3889.61   2260.75
0.850   19126.26   3922.06   2302.74
0.900   18985.26   3956.29   2344.72
0.925   18915.70   3973.18   2365.71
0.950   18846.84   3989.92   2386.71
0.975   18778.67   4006.53   2407.70
1.000   18711.21   4022.88   2428.69
Table C.6. Transport Property Data for Methanol-Water System at 2 atma.

X          rho_L      rho_V      rho_L_SC
(mf MeOH)  (lb/ft^3)  (lb/ft^3)  (lb/ft^3)
0.000   58.28   0.0622   59.38
0.025   57.75   0.0635   58.94
0.050   57.21   0.0648   58.52
0.075   56.68   0.0662   58.10
0.100   56.16   0.0675   57.68
0.150   55.15   0.0702   56.87
0.200   54.20   0.0729   56.10
0.250   53.31   0.0757   55.35
0.300   52.47   0.0785   54.62
0.350   51.69   0.0813   53.93
0.400   50.96   0.0842   53.26
0.450   50.29   0.0872   52.63
0.500   49.65   0.0901   52.02
0.550   49.07   0.0932   51.44
0.600   48.50   0.0963   50.89
0.650   48.01   0.0994   50.36
0.700   47.53   0.1026   49.87
0.750   47.09   0.1059   49.40
0.800   46.69   0.1092   48.96
0.850   46.29   0.1125   48.55
0.900   45.93   0.1157   48.17
0.925   45.77   0.1174   47.99
0.950   45.60   0.1190   47.81
0.975   45.45   0.1206   47.65
1.000   45.30   0.1222   47.49
C.2.1. Correlation for Vapor-Liquid Equilibrium
The composition of the vapor phase y, in mole fraction methanol, in terms of the liquid-phase composition x, in mole fraction methanol, for the range 0.0 < x < 0.1 is given by the equation
y = 0.000142 + 6.596215x - 36.315685x^2 + 103.809028x^3.    (C.9)
Figure C.9 shows the fit obtained by the above equation. For the range 0.1 < x < 0.98, the relationship is given by the equation
y = 0.118525 + 3.658772x - 9.584915x^2 + 14.423370x^3 - 10.870606x^4 + 3.256565x^5,    (C.10)
and for the range 0.98 < x < 1.0, the relationship is
y = 0.758018 + 0.006244x + 0.235730x^2.    (C.11)
Figures C.10 and C.11 show the fits obtained by Equations C.10 and C.11, respectively.
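The piecewise evaluation of Equations C.9 through C.11 can be sketched as follows (Python, for illustration; the simulator itself is written in FORTRAN):

```python
def y_equilibrium(x):
    """Vapor composition (mf MeOH) vs. liquid composition x for the
    methanol-water system at 2 atma, piecewise per Eqs. C.9-C.11."""
    if x <= 0.1:        # Eq. C.9
        return (0.000142 + 6.596215 * x - 36.315685 * x**2
                + 103.809028 * x**3)
    elif x <= 0.98:     # Eq. C.10
        return (0.118525 + 3.658772 * x - 9.584915 * x**2
                + 14.423370 * x**3 - 10.870606 * x**4 + 3.256565 * x**5)
    else:               # Eq. C.11
        return 0.758018 + 0.006244 * x + 0.235730 * x**2
```

For example, y_equilibrium(0.05) returns about 0.252 against the Table C.4 value of 0.2517, and y_equilibrium(1.0) returns 0.999992, effectively the pure-methanol limit.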
C.2.2. Correlation for Saturation Temperature
The saturation temperature T, in degrees Fahrenheit, at any liquid-phase composition x (mole fraction methanol), is given by the equation
T = 240.96 - 249.05x + 518.71x^2 - 545.46x^3 + 210.95x^4.    (C.12)
Figure C.12 shows the fit obtained by the above equation.

C.2.3. Correlation for Saturated Liquid Density
The saturated liquid density rho_L, in lb/ft^3, at any liquid-phase composition x (mole fraction methanol), is given by the equation
rho_L = 58.322 - 23.133x + 13.125x^2 - 3.017x^3.    (C.13)
Figure C.13 shows the fit obtained by the above equation.
Figure C.9. Vapor-liquid equilibrium for 0.0 < x < 0.1 for methanol-water system at 2 atma. [Figure not reproduced; plot of Y vs. X (mf MeOH).]
Figure C.10. Vapor-liquid equilibrium for 0.1 < x < 0.98 for methanol-water system at 2 atma. [Figure not reproduced; fitted curve Y = 0.118525 + 3.658772X - 9.584915X^2 + 14.423370X^3 - 10.870606X^4 + 3.256565X^5, r^2 = 0.999961.]
Figure C.11. Vapor-liquid equilibrium for 0.98 < x < 1.0 for methanol-water system at 2 atma. [Figure not reproduced; plot of Y vs. X (mf MeOH).]
Figure C.12. Saturated liquid temperature versus liquid composition for methanol-water system at 2 atma. [Figure not reproduced; fitted curve T = 240.96 - 249.05X + 518.71X^2 - 545.46X^3 + 210.95X^4, r^2 = 0.999.]
Figure C.13. Saturated liquid density versus liquid composition for methanol-water system at 2 atma. [Figure not reproduced; fitted curve rho_L = 58.322 - 23.133X + 13.125X^2 - 3.017X^3, r^2 = 1.000.]
C.2.4. Correlation for Saturated Vapor Density
The saturated vapor density rho_V, in lb/ft^3, at any vapor-phase composition y (mole fraction methanol), is given by the equation
rho_V = 0.0623 + 0.0506y + 0.0118y^2 - 0.0024y^3.    (C.14)
Figure C.14 shows the fit obtained by the above equation.

C.2.5. Correlation for Liquid Density at 120°F
The liquid density at 120°F, rho_L_SC, in lb/ft^3, at any liquid-phase composition x (mole fraction methanol), is given by the equation
rho_L_SC = 59.380 - 17.554x + 5.663x^2.    (C.15)
Figure C.15 shows the fit obtained by the above equation.

C.2.6. Correlation for Liquid Enthalpy
The liquid enthalpy hL, in BTU/lbmol, at any liquid-phase composition x (mole fraction methanol), is given by the equation
hL = 3793.26 - 3025.26x + 12111.98x^2 - 19761.89x^3 + 15960.15x^4 - 5058.99x^5.    (C.16)
Figure C.16 shows the fit obtained by the above equation.

C.2.7. Correlation for Vapor Enthalpy
The vapor enthalpy HV, in BTU/lbmol, at any vapor-phase composition y (mole fraction methanol), is given by the equation
HV = 21296.22 - 2059.04y - 786.02y^2 + 252.15y^3.    (C.17)
Figure C.17 shows the fit obtained by the above equation.
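The 2-atma property correlations of Equations C.14 through C.17 can likewise be evaluated directly; the sketch below (Python, illustrative only) reproduces them so that the endpoint values can be checked against Tables C.5 and C.6.

```python
def rho_V(y):       # saturated vapor density, lb/ft^3 (Eq. C.14)
    return 0.0623 + 0.0506 * y + 0.0118 * y**2 - 0.0024 * y**3

def rho_L_SC(x):    # liquid density subcooled to 120 deg F, lb/ft^3 (Eq. C.15)
    return 59.380 - 17.554 * x + 5.663 * x**2

def hL(x):          # saturated liquid enthalpy, BTU/lbmol (Eq. C.16)
    return (3793.26 - 3025.26 * x + 12111.98 * x**2 - 19761.89 * x**3
            + 15960.15 * x**4 - 5058.99 * x**5)

def HV(y):          # saturated vapor enthalpy, BTU/lbmol (Eq. C.17)
    return 21296.22 - 2059.04 * y - 786.02 * y**2 + 252.15 * y**3
```

At the composition extremes the correlations return, for example, rho_V(0.0) = 0.0623 and rho_V(1.0) = 0.1223 lb/ft^3, against the tabulated 0.0622 and 0.1222.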
Figure C.14. Saturated vapor density versus vapor composition for methanol-water system at 2 atma. [Figure not reproduced; fitted curve den_V = 0.0623 + 0.0506Y + 0.0118Y^2 - 0.0024Y^3, r^2 = 1.0000.]
Figure C.15. Subcooled liquid density versus liquid composition for methanol-water system at 2 atma. [Figure not reproduced; fitted curve rho_L_SC = 59.380 - 17.554X + 5.663X^2, r^2 = 1.000.]
Figure C.16. Saturated liquid enthalpy versus liquid composition for methanol-water system at 2 atma. [Figure not reproduced; fitted curve hL = 3793.26 - 3025.26X + 12111.98X^2 - 19761.89X^3 + 15960.15X^4 - 5058.99X^5, r^2 = 1.000.]
Figure C.17. Saturated vapor enthalpy versus vapor composition for methanol-water system at 2 atma. [Figure not reproduced; fitted curve HV = 21296.22 - 2059.04Y - 786.02Y^2 + 252.15Y^3, r^2 = 1.000.]
C.2.8. Correlation for Liquid Enthalpy at 120°F
The liquid enthalpy subcooled to 120°F, hL_SC, in BTU/lbmol, at any liquid-phase composition x (mole fraction methanol), is given by the equation
hL_SC = 1588.99 + 839.7x.    (C.18)
Figure C.18 shows the fit obtained by the above equation.
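One place a subcooled-liquid enthalpy enters a column simulation is in accounting for the sensible heat removed when the condensed overhead is cooled from saturation to 120°F. The sketch below is a hedged illustration of that bookkeeping using Equations C.16 and C.18 (Python; the `subcooling_duty` function is illustrative and not taken from the dissertation's code):

```python
def hL_sat(x):   # saturated liquid enthalpy at 2 atma, BTU/lbmol (Eq. C.16)
    return (3793.26 - 3025.26 * x + 12111.98 * x**2 - 19761.89 * x**3
            + 15960.15 * x**4 - 5058.99 * x**5)

def hL_SC(x):    # liquid enthalpy subcooled to 120 deg F, BTU/lbmol (Eq. C.18)
    return 1588.99 + 839.7 * x

def subcooling_duty(x):
    """Sensible heat removed per lbmol of condensed overhead when
    subcooling from saturation to 120 deg F (BTU/lbmol)."""
    return hL_sat(x) - hL_SC(x)
```

At a distillate composition of x = 0.95, for example, the correlation gives hL_SC = 2386.7 BTU/lbmol, matching Table C.5, and a subcooling duty of roughly 1600 BTU/lbmol.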
Figure C.18. Subcooled liquid enthalpy versus liquid composition for methanol-water system at 2 atma. [Figure not reproduced; fitted line hL_SC = 1588.99 + 839.7X, r^2 = 1.000.]
APPENDIX D
FORTRAN PROGRAMS FOR EXAMPLES IN
APPENDIX B
Listed below are the FORTRAN program codes for Examples 1, 2, and 3 discussed in Appendix B, Sections 7.1, 7.2, and 7.3, respectively.

D.1. FORTRAN Code for Example 1
C***********************************************************************
C
C     PROGRAM    : EXAMPLE1.FOR
C     DATE       : DECEMBER 20, 1993
C     VERSION    : 1.0
C     PROGRAMMER : SOUNDAR RAMCHANDRAN, DEPT. OF CHEM. ENG.,
C                  TEXAS TECH UNIVERSITY
C
C***********************************************************************
C
C     PROGRAM TO TEST SUBROUTINE MARQUARDT TO ILLUSTRATE
C     EXAMPLE 1 FROM APPENDIX B, SECTION 7.1
C
C***********************************************************************
C
      IMPLICIT REAL*8 (A-H,O-Z)
      DIMENSION B(4),Z(4),Y(2),BV(2),BMIN(2),BMAX(2),P(10),A(2,4),
     *          AC(2,4),CC(6),INDEX(5),OUTPUT(5)
      COMMON XN(5)
C
      OPEN (6,FILE = "EX1.PRN")
C
      IFV = 0
      KD = 2
      KK = 2
      NN = 2
      DO 10 J = 1,KK
         Y(J) = 0.0
         B(J) = 0.0
         BV(J) = 1.0
         BMIN(J) = -100.0
         BMAX(J) = 100.0
   10 CONTINUE
C
      INDEX(1) = 0
      INDEX(2) = 0
      INDEX(3) = KK
      INDEX(4) = 0
C
      CC(1) = 0.0
      CC(2) = 0.0
      CC(3) = 0.0
      CC(4) = 0.0
      CC(5) = 0.0
      CC(6) = 1.0E-10
C
      WRITE (6,3000) (B(J), J = 1,KK)
      WRITE (6,3001)
C
  125 CALL MARQUARDT(KK,B,NN,Z,Y,CC,INDEX,BV,BMIN,BMAX,
     *               OUTPUT,KD,P,A,AC,INDEX1)
      IF (INDEX(4).GT.100) THEN
         WRITE (6,*) 'ITER GREATER THAN 100'
         GO TO 200
      ENDIF
      IF (INDEX(3).NE.0) THEN
         IF (INDEX(3).LT.0) THEN
            WRITE (6,*) 'ICON IS LESS THAN ZERO'
            STOP
         ENDIF
         IF (INDEX(1).EQ.0) THEN
            WRITE (6,3002) IFV,INDEX(3),(B(J), J = 1,KK),OUTPUT(1)
            GO TO 125
         ENDIF
         IF (INDEX(3).GT.0) THEN
            CALL FNTX (B,Z,IFV)
            GO TO 125
         ENDIF
      ENDIF
C
  200 WRITE (6,3002) IFV,INDEX(3),(B(J), J = 1,KK),OUTPUT(1)
      WRITE (6,3003) (XN(J), J = 1,5)
      STOP
C
 3000 FORMAT (' STARTING VALUES =',6F8.3//)
 3001 FORMAT (' EVALUATIONS        ICON           B1             B2',
     *        '          ERROR')
 3002 FORMAT (1X,I10,I12,3E15.5)
 3003 FORMAT (3X,'MOLES PRESENT ='/1X,5E14.5)
      END
C
C***********************************************************************
C
C     SUBROUTINE FNTX
C
C***********************************************************************
C
      SUBROUTINE FNTX (B,Z,IFV)
C
      IMPLICIT REAL*8 (A-H,O-Z)
      DIMENSION AR(5,3),Y(5),B(2),Z(2)
      COMMON XN(5)
      DATA AR/-1.0,-1.0,1.0,3.0,0.0,0.0,-1.0,-1.0,1.0,1.0,
     *        6.0,5.0,0.0,0.0,0.0/
C
      IFV = IFV+1
      SUM = 0.0
      DO 10 J = 1,5
         XN(J) = AR(J,3)+AR(J,1)*B(1)+AR(J,2)*B(2)
         SUM = SUM+XN(J)
   10 CONTINUE
C
      DO 20 J = 1,5
         Y(J) = XN(J)/SUM
         IF (Y(J).LE.1.0E-10) Y(J) = 1.0E-10
   20 CONTINUE
C
      SUM2 = 0.0
      SUM3 = 0.0
      DO 30 J = 1,5
         SUM2 = SUM2+AR(J,1)*LOG(Y(J))
         SUM3 = SUM3+AR(J,2)*LOG(Y(J))
   30 CONTINUE
C
      Z(1) = -LOG(0.54D0)+SUM2
      Z(2) = -LOG(2.49D0)+SUM3
C
      RETURN
      END
D.2. Solution For Example 1
C***********************************************************************
C
C     SOLUTION FOR EXAMPLE1.FOR
C
C     SOUNDAR RAMCHANDRAN
C     DEPT. OF CHEM. ENG., TEXAS TECH UNIVERSITY, LUBBOCK, TX
C     79409-3121
C
C***********************************************************************

 STARTING VALUES =    .000    .000

 EVALUATIONS  ICON       B1            B2           ERROR
          1     2   .00000E+00    .00000E+00    .86526E+04
          4     2   .14507E-04    .11834E-04    .25993E+04
          7     2   .32512E-03    .65127E-04    .14439E+04
         10     2   .33480E-02    .77429E-03    .80367E+03
         13     2   .26166E-01    .68779E-02    .39366E+03
         16     2   .15052E+00    .45418E-01    .15824E+03
         19     2   .60114E+00    .20816E+00    .45444E+02
         22     2   .15364E+01    .58457E+00    .60978E+01
         25     2   .23077E+01    .85912E+00    .71153E-01
         28     2   .24200E+01    .84402E+00    .31884E-04
         31     0   .24219E+01    .84218E+00    .74327E-11

 MOLES PRESENT =
  .35781E+01  .17359E+01  .15797E+01  .81079E+01  .84218E+00
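Example 1 is a two-reaction chemical-equilibrium problem, and the converged reaction coordinates are B1 = 2.4219 and B2 = 0.84218. As an independent check, the same residuals coded in subroutine FNTX can be driven to zero with a damped Newton iteration. The sketch below is in Python and uses Newton's method with a finite-difference Jacobian in place of subroutine MARQUARDT; it is an illustration, not the dissertation's code.

```python
import math

# Stoichiometric data from subroutine FNTX: the first two entries of each
# row are the coefficients of the extents b1, b2; the third is initial moles.
AR = [(-1.0,  0.0, 6.0),
      (-1.0, -1.0, 5.0),
      ( 1.0, -1.0, 0.0),
      ( 3.0,  1.0, 0.0),
      ( 0.0,  1.0, 0.0)]
K1, K2 = 0.54, 2.49   # equilibrium constants embedded in Z(1) and Z(2)

def moles(b1, b2):
    return [a3 + a1 * b1 + a2 * b2 for a1, a2, a3 in AR]

def residuals(b1, b2):
    xn = moles(b1, b2)
    if min(xn) <= 0.0:
        return None                      # infeasible: negative moles
    total = sum(xn)
    logy = [math.log(n / total) for n in xn]
    z1 = -math.log(K1) + sum(a[0] * ly for a, ly in zip(AR, logy))
    z2 = -math.log(K2) + sum(a[1] * ly for a, ly in zip(AR, logy))
    return z1, z2

def solve(b1=1.0, b2=0.5, tol=1e-12, h=1e-7):
    """Damped Newton iteration on the two equilibrium residuals."""
    for _ in range(200):
        z = residuals(b1, b2)
        if max(abs(z[0]), abs(z[1])) < tol:
            break
        za = residuals(b1 + h, b2)       # perturb b1
        zb = residuals(b1, b2 + h)       # perturb b2
        j11, j21 = (za[0] - z[0]) / h, (za[1] - z[1]) / h
        j12, j22 = (zb[0] - z[0]) / h, (zb[1] - z[1]) / h
        det = j11 * j22 - j12 * j21
        db1 = (-z[0] * j22 + z[1] * j12) / det
        db2 = ( z[0] * j21 - z[1] * j11) / det
        step = 1.0                       # halve the step until feasible
        while residuals(b1 + step * db1, b2 + step * db2) is None:
            step *= 0.5
        b1, b2 = b1 + step * db1, b2 + step * db2
    return b1, b2
```

Running solve() returns approximately (2.4219, 0.84218), and moles(*solve()) reproduces the MOLES PRESENT line of the solution above.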
D.3. FORTRAN Code for Example 2
C***********************************************************************
C
C     PROGRAM    : EXAMPLE2.FOR
C     DATE       : DECEMBER 20, 1993
C     VERSION    : 1.0
C     PROGRAMMER : SOUNDAR RAMCHANDRAN, DEPT. OF CHEM. ENG.,
C                  TEXAS TECH UNIVERSITY
C
C***********************************************************************
C
C     PROGRAM TO TEST SUBROUTINE MARQUARDT TO ILLUSTRATE
C     EXAMPLE 2 FROM APPENDIX B, SECTION 7.2
C
C***********************************************************************
C
      IMPLICIT REAL*8 (A-H,O-Z)
      DIMENSION B(6),Z(6),Y(3),BV(3),BMIN(3),BMAX(3),P(18),A(3,5),
     *          AC(3,5),CC(6),INDEX(5),OUTPUT(5),XO(3)
C
      OPEN (6, FILE = "EX2.PRN")
C
      IFV = 0
      KD = 3
      KK = 3
      NN = 3
      DO 10 J = 1,KK
         Y(J) = 1.0
         XO(J) = 1.0
         B(J) = XO(J)
         BV(J) = 1.0
         BMIN(J) = -20.0
         BMAX(J) = 20.0
   10 CONTINUE
C
      INDEX(1) = 0
      INDEX(2) = 0
      INDEX(3) = KK
      INDEX(4) = 0
C
      CC(1) = 0.0
      CC(2) = 0.0
      CC(3) = 0.0
      CC(4) = 0.0
      CC(5) = 0.0
      CC(6) = 1.0E-10
C
      WRITE (6,401) (B(J), J = 1,KK)
C
  125 CALL MARQUARDT(KK,B,NN,Z,Y,CC,INDEX,BV,BMIN,BMAX,OUTPUT,
     *               KD,P,A,AC,INDEX1)
      IF (INDEX(4).GT.100) THEN
         WRITE (6,*) 'ITER GREATER THAN 100'
         GO TO 200
      ENDIF
      IF (INDEX(3).NE.0) THEN
         IF (INDEX(3).LT.0) THEN
            WRITE (6,*) 'ICON IS LESS THAN ZERO; ICON = ',INDEX(3)
            STOP
         ENDIF
         IF (INDEX(1).EQ.0) THEN
            WRITE (6,150) IFV,INDEX(3),(B(J), J = 1,KK),OUTPUT(1)
  150       FORMAT (I5,I5,3(F15.5),E15.5)
            GO TO 125
         ENDIF
         IF (INDEX(3).GT.0) THEN
            CALL FNTX (B,Z,IFV)
            GO TO 125
         ENDIF
      ENDIF
C
  200 WRITE (6,*)
      WRITE (6,403) INDEX(3)
      WRITE (6,404) OUTPUT(1)
      WRITE (6,405) OUTPUT(2)
      WRITE (6,406) IFV
      WRITE (6,407)
      DO 300 I = 1,KK
         WRITE (6,408) I,XO(I),B(I),Z(I)
  300 CONTINUE
C
      STOP
C
  401 FORMAT (' STARTING VALUES =',3F8.3/)
  403 FORMAT (' ICON =',I12)
  404 FORMAT (' SUM OF SQUARES =',E12.5)
  405 FORMAT (' ANGLE =',F12.2)
  406 FORMAT (' NUMBER OF FUNCTION EVALUATIONS =',I12/)
  407 FORMAT (' NUMBER   INITIAL X      FINAL X      VALUE OF Z'/)
  408 FORMAT (I5,2X,E12.5,2X,E12.5,2X,E12.5)
      END
C
C***********************************************************************
C
C     SUBROUTINE FNTX
C
C***********************************************************************
C
      SUBROUTINE FNTX (B,Z,IFV)
C
      IMPLICIT REAL*8 (A-H,O-Z)
      DIMENSION B(3),Z(3)
C
      IFV = IFV+1
      Z(1) = (3.0*B(1)+B(2)+2.0*B(3)**2)/3.0
      Z(2) = (-3.0*B(1)+5.0*B(2)**2+2.0*B(1)*B(3))
      Z(3) = (25.0*B(1)*B(2)+20.0*B(3))/(-12.0)
C
      RETURN
      END
D.4. Solution For Example 2
C***********************************************************************
C
C     SOLUTION FOR EXAMPLE2.FOR
C
C     SOUNDAR RAMCHANDRAN
C     DEPT. OF CHEM. ENG., TEXAS TECH UNIVERSITY, LUBBOCK, TX
C     79409-3121
C
C***********************************************************************

 STARTING VALUES =   1.000   1.000   1.000

    1    3        1.00000        1.00000        1.00000    .32563E+02
    5    3       -1.47169         .20160        2.25949    .27396E+02
   12    3       -1.60991        1.19071        1.86049    .24308E+02
   16    3       -2.23996         .89063        2.12826    .18164E+00
   20    3       -2.41790         .91513        2.16088    .14245E-03
   24    3       -2.41352         .91465        2.15939    .17442E-09

 ICON =           0
 SUM OF SQUARES = .46675E-16
 ANGLE =       44.98
 NUMBER OF FUNCTION EVALUATIONS =          28

 NUMBER   INITIAL X      FINAL X      VALUE OF Z
     1   .10000E+01   -0.24135E+01   .10000E+01
     2   .10000E+01    .91464E+00    .10000E+01
     3   .10000E+01    .21594E+01    .10000E+01
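The converged root reported above can be verified independently: substituting the FINAL X values into the three model functions from subroutine FNTX should reproduce the target values Y(J) = 1.0. A quick check (Python, outside the original FORTRAN):

```python
b1, b2, b3 = -2.4135, 0.91464, 2.1594   # FINAL X values reported above

# The three functions from subroutine FNTX of Example 2
z1 = (3.0 * b1 + b2 + 2.0 * b3**2) / 3.0
z2 = -3.0 * b1 + 5.0 * b2**2 + 2.0 * b1 * b3
z3 = (25.0 * b1 * b2 + 20.0 * b3) / (-12.0)

for z in (z1, z2, z3):
    assert abs(z - 1.0) < 2e-3   # each matches its target Y(J) = 1.0
```

Each function evaluates to 1.000 to within the rounding of the printed parameter values.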
D.5. FORTRAN Code for Example 3
C***********************************************************************
C
C     PROGRAM    : EXAMPLE3.FOR
C     DATE       : DECEMBER 20, 1993
C     VERSION    : 1.0
C     PROGRAMMER : SOUNDAR RAMCHANDRAN, DEPT. OF CHEM. ENG.,
C                  TEXAS TECH UNIVERSITY
C
C***********************************************************************
C
C     PROGRAM TO TEST SUBROUTINE MARQUARDT TO ILLUSTRATE
C     EXAMPLE 3 FROM APPENDIX B, SECTION 7.3
C
C***********************************************************************
C
      IMPLICIT REAL*8 (A-H,O-Z)
      DIMENSION B(8),Z(18),Y(9),BV(4),BMIN(4),BMAX(4),P(53),A(4,9),
     *          AC(4,9),CC(6),INDEX(5),OUTPUT(5),BO(4),X(9)
C
      DATA Y/0.173,0.292,0.369,0.429,0.465,0.486,0.504,0.521,0.535/
      DATA BO/0.11400253,0.6856597E-3,-1.7036566,-0.53485967E-3/
      DATA BMIN/0.0,0.0,-1.8,-0.1/
      DATA BMAX/0.5,0.005,0.0,0.0/
C
      OPEN (6, FILE = "EX3.PRN")
C
      X(1) = 540.0
      X(2) = 900.0
      X(3) = 1260.0
      X(4) = 1800.0
      X(5) = 2340.0
      X(6) = 2880.0
      X(7) = 3600.0
      X(8) = 4500.0
      X(9) = 5400.0
C
      IFV = 0
      KD = 4
      KK = 4
      NN = 9
      DO 10 J = 1,KK
         BV(J) = 1.0
         B(J) = BO(J)
   10 CONTINUE
C
      INDEX(1) = 0
      INDEX(2) = 0
      INDEX(3) = KK
      INDEX(4) = 0
C
      CC(1) = 0.0
      CC(2) = 0.0
      CC(3) = 0.0
      CC(4) = 0.0
      CC(5) = 0.0
      CC(6) = 1.0E-10
C
      WRITE (6,401) (B(J), J = 1,KK)
C
  125 CALL MARQUARDT(KK,B,NN,Z,Y,CC,INDEX,BV,BMIN,BMAX,OUTPUT,
     *               KD,P,A,AC,INDEX1)
      IF (INDEX(4).GT.100) THEN
         WRITE (6,*) 'ITER GREATER THAN 100'
         GO TO 200
      ENDIF
      IF (INDEX(3).NE.0) THEN
         IF (INDEX(3).LT.0) THEN
            WRITE (6,*) 'ICON IS LESS THAN ZERO; ICON = ',INDEX(3)
            STOP
         ENDIF
         IF (INDEX(1).EQ.0) THEN
            WRITE (6,150) IFV,INDEX(3),(B(J), J = 1,KK),OUTPUT(1)
  150       FORMAT (I4,I4,4(E12.5),E12.5)
            GO TO 125
         ENDIF
         IF (INDEX(3).GT.0) THEN
            CALL FNTX (X,B,Z,IFV,KK,NN)
            GO TO 125
         ENDIF
      ENDIF
C
  200 WRITE (6,*)
      WRITE (6,403) INDEX(3)
      WRITE (6,404) OUTPUT(1)
      WRITE (6,405) OUTPUT(2)
      WRITE (6,406) IFV
      WRITE (6,407)
C
      DO 300 I = 1,KK
         WRITE (6,408) I,BO(I),B(I)
  300 CONTINUE
C
      STOP
C
  401 FORMAT (' STARTING VALUES =',4F8.3/)
  403 FORMAT (' ICON = ',I12)
  404 FORMAT (' SUM OF SQUARES = ',E12.5)
  405 FORMAT (' ANGLE = ',F12.2)
  406 FORMAT (' NUMBER OF FUNCTION EVALUATIONS = ',I12/)
  407 FORMAT (' NUMBER   INITIAL A      FINAL A'/)
  408 FORMAT (I5,2X,E12.5,2X,E12.5)
C
      END
C
C***********************************************************************
C
C     SUBROUTINE FNTX
C
C***********************************************************************
C
      SUBROUTINE FNTX (X,B,Z,IFV,K,N)
C
      IMPLICIT REAL*8 (A-H,O-Z)
      DIMENSION B(K),Z(N),X(N)
C
      IFV = IFV+1
      DO 20 I = 1,N
         Z(I) = B(1)*EXP(B(2)*X(I))+B(3)*EXP(B(4)*X(I))
   20 CONTINUE
C
      RETURN
      END
D.6. Solution For Example 3
C***********************************************************************
C
C     SOLUTION FOR EXAMPLE3.FOR
C
C     SOUNDAR RAMCHANDRAN
C     DEPT. OF CHEM. ENG., TEXAS TECH UNIVERSITY, LUBBOCK, TX
C     79409-3121
C
C***********************************************************************

 STARTING VALUES =    .114    .001  -1.704   -.001

   1   4  .11400E+00  .68566E-03 -0.17037E+01 -0.53486E-03  .24099E+02
   6   4  .47283E-01  .62574E-03 -0.11165E+00 -0.82820E-03  .12072E+01
  11   4  .81681E-01  .37763E-03 -0.34022E+00 -0.79134E-02  .34520E+00
  16   4  .27235E+00  .00000E+00  0.00000E+00  0.00000E+00  .31137E+00
  21   3  .26610E+00  .22271E-03 -0.62521E-02  0.00000E+00  .18267E+00
  31   4  .24239E+00  .10595E-03 -0.13442E-01 -0.10335E-01  .13343E+00
  36   4  .34488E+00  .87630E-04  0.00000E+00  0.00000E+00  .47967E-01
  42   3  .50000E+00  .87169E-04 -0.23513E+00  0.00000E+00  .39940E-01
  48   4  .46836E+00  .00000E+00 -0.28848E+00 -0.89157E-03  .29281E-01
  53   4  .40264E+00  .48475E-04 -0.52887E+00 -0.17857E-02  .45120E-02
  58   4  .44786E+00  .32769E-04 -0.58655E+00 -0.12414E-02  .20659E-02
  63   4  .45331E+00  .30995E-04 -0.60999E+00 -0.13904E-02  .18546E-04
  68   4  .45537E+00  .30176E-04 -0.61610E+00 -0.13966E-02  .90993E-05
  73   4  .45541E+00  .30141E-04 -0.61616E+00 -0.13966E-02  .90933E-05

 ICON =            0
 SUM OF SQUARES =  .90933E-05
 ANGLE =        44.98
 NUMBER OF FUNCTION EVALUATIONS =           84

 NUMBER   INITIAL A      FINAL A
     1   0.11400E+00   0.45541E+00
     2   0.68566E-03   0.30139E-04
     3  -0.17037E+01  -0.61616E+00
     4  -0.53486E-03  -0.13966E-02
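The quality of the converged fit can be checked by evaluating the two-exponential model at the reported FINAL A parameters and accumulating the sum of squared residuals, which should reproduce the printed value of about .90933E-05. A sketch in Python (not the original FORTRAN):

```python
import math

# The nine data points from the main program of Example 3
x = [540.0, 900.0, 1260.0, 1800.0, 2340.0, 2880.0, 3600.0, 4500.0, 5400.0]
y = [0.173, 0.292, 0.369, 0.429, 0.465, 0.486, 0.504, 0.521, 0.535]

# FINAL A values reported above
a1, a2, a3, a4 = 0.45541, 0.30139e-4, -0.61616, -0.13966e-2

# Model from subroutine FNTX: z = a1*exp(a2*x) + a3*exp(a4*x)
ssq = sum((a1 * math.exp(a2 * xi) + a3 * math.exp(a4 * xi) - yi) ** 2
          for xi, yi in zip(x, y))
```

The accumulated sum of squares comes out near 9e-6, consistent with the converged value printed by the program.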