NEURAL NETWORK MODEL-BASED CONTROL
OF DISTILLATION COLUMNS
by
BALSHEKAR RAMCHANDRAN, B.E., M.S.Ch.E
A DISSERTATION
IN
CHEMICAL ENGINEERING
Submitted to the Graduate Faculty of Texas Tech University in
Partial Fulfillment of the Requirements for
the Degree of
DOCTOR OF PHILOSOPHY
Approved
Accepted
August, 1994
ACKNOWLEDGMENTS
I wish to extend my sincere thanks to Dr. Russ Rhinehart for his mentorship and
guidance in this work. I appreciate his patience and the calm disposition he displayed at
times when I was too digressive for his liking. He, in my eyes, is a superb combination of
a great researcher and a true teacher, a rare combination these days. Thank you, Russ!
I also wish to extend my sincere thanks to Dr. Jim Riggs for his valuable insights
during the course of this work, and his support and guidance during my entire stay at
Texas Tech. His free spirit and "never-say-die" attitude will be remembered for a long
time. My thanks also go out to my other committee members Drs. Ray Desrosiers, Terry
Tolliver, and especially to Dr. Oldham, who taught me that it was not OK to be sloppy
because I was an engineer, and that anything that can be known, must be known! Many
thanks to all the faculty and staff members of the Department of Chemical Engineering for
all their help and support during my stay at Tech. Thanks are also extended to the
members of the Process Control and Optimization Consortium at Texas Tech University
for financial support of this work. I would like to thank Jim Deam for his constant
support and encouragement, and Stan Proctor and his Engineering Technology Group at
Monsanto Company, St. Louis, MO, where this work was conceived and got started.
This dissertation marks an important point in my life that began on August 17, 1988,
my first day in Lubbock, and in the past nearly six years I have had the opportunity to
grow and learn a number of new things, and experience the American way-of-life. There
are a number of people who have touched my life and left an impression that would last
this lifetime. I would like to thank the Heichelheims, for giving me a home away from
home; Vikram Singh, who tried very hard to teach me the "Meaning of Life;" Hoshang
Subawalla, for reasons I shall never know; June Heichelheim, for being such a wonderful
friend; Anand Laxminarayan, my cousin, the ultimate realist; Don Quixote, the ultimate
idealist who taught me that life is not what it really is, but what it really should be; Tammy
"Tamara" Kent, for her constant support and willingness to help with a smile (she forced
me to write these words!!), and also all the entertainment in the Chem. E. Office with
replays of "Seinfeld" and "Married... With Children!"; Mary Beth Abernathy, for patiently
listening to my philosophy about life (much to her chagrin, unfortunately!); Vikram
Gokhale and Suhas Kulkarni, from whom I learnt a lot; Paul Bray, for his daily morning
"Far Side" cartoons and introducing me to the brilliant flagon of his "cyanide-special"
cocktails; Bertie Wooster, the weedy butterfly who provided me with many hours of pure
unadulterated English "humour" at the times I needed it most; and to many many dear
friends who sought my ashram in Room 107, Chemical Engineering Building, and whose
company I shall cherish for many years to come.
In each and every step along this journey, the two people who beamed me their love
and unrelenting support "from across the seven seas" are my Appa and Amma back home
in India. All that they taught me and gave me has brought me a long way, indeed. And
finally, many thanks to the family that probably means more to me than anything else in my
life, and to the two wonderful people who have influenced me in more ways than I
possibly can count, who taught me the way of unconditional love, and to dream the
impossible dream, and to live life to the fullest. Thank you, Aunt Dahlia and Uncle Tom!
In all these times, I have tried always to enjoy life and all of its offerings; to live a little, to
learn a little, to laugh a lot, and revel till my ribs squeaked (a la Bertie Wooster), and this
work is a testimony to that.
TABLE OF CONTENTS
ACKNOWLEDGMENTS ii
ABSTRACT vii
LIST OF TABLES ix
LIST OF FIGURES xi
CHAPTER
I. INTRODUCTION 1
1.1 Distillation Control and Its Importance 1
1.2 Basics of Distillation Control 1
1.3 Neural Networks and Its Relevance to Process Control 5
II. NEURAL NETWORKS 7
2.1 Feedforward Neural Networks 8
2.2 Computational Aspects in a Feedforward Neural Network 12
2.3 Training of Neural Networks 15
2.4 Backpropagation Training Algorithm 17
2.5 Analysis of the Backpropagation Algorithm 18
2.6 Optimization Approach to Neural Network Training 20
2.7 The Levenberg-Marquardt Method for Nonlinear Least Squares 25
2.8 Examples Using the Marquardt Method for Training Neural Networks 29
III. STEADY-STATE MODELS FOR DISTILLATION 35
3.1 Process Models and Process Inverse Models 35
3.2 Distillation Column Test Cases 36
3.3 Development of Steady-State Inverse Models for Distillation 37
3.4 Optimal Training of Neural Networks 48
IV. DYNAMIC PROCESS SIMULATIONS 50
4.1 Mathematical Model for Nonideal Multicomponent Distillation 50
4.2 Additional Features of the Dynamic Process Simulators 57
4.3 Open-Loop Response Characteristics of the Processes 57
4.4 Steady-State Analyses of Distillation Column Operation 71
V. MODEL-BASED CONTROL STRATEGY 96
5.1 Nonlinear Process Model-Based Control (Nonlinear PMBC) 97
5.2 Nonlinear Process Model-Based Control of Distillation Columns 99
VI. CONTROL RESULTS 104
6.1 Lab-Column Controller Tests 104
6.2 High-Purity Column Controller Tests 126
VII. PROCESS-MODEL MISMATCH 149
7.1 Process-Model Mismatch 149
7.2 Process-Model Mismatch for the Distillation Columns 151
7.3 "It's the Gain Prediction, Stupid!" 157
VIII. DISCUSSION, CONCLUSION, AND RECOMMENDATIONS 163
8.1 On Using Neural Network Steady-State Process Inverse
Models 163
8.2 On Optimal Training of Neural Networks 164
8.3 In Conclusion 166
8.4 Recommendations 168
BIBLIOGRAPHY 176
APPENDICES
A. ERROR BACKPROPAGATION TRAINING ALGORITHM 185
B. THE MARQUARDT ALGORITHM 193
C. EMPIRICAL CORRELATIONS FOR THE METHANOL-WATER SYSTEM 206
D. FORTRAN PROGRAMS FOR EXAMPLES IN APPENDIX B 237
ABSTRACT
Distillation control is difficult because of its nonlinear, interactive, and nonstationary
behavior; but improved distillation control techniques can have a significant impact on
improving product quality and environmental resource protection. Advanced control
strategies use a model of the process to select the desired control action. While
phenomenological models have demonstrated efficient control of highly nonlinear and
interactive distillation columns, they can become complicated and computationally
intensive. Further, these models may require frequent reparametrization to eliminate any
process-model mismatch that may have accrued with time.
Neural networks provide an alternate approach to modeling process behavior, and
have received much attention because of their wide range of applicability, and their ability
to handle complex and nonlinear problems. The main advantage in using neural networks
is that neural network models are computationally simple, and possess enormous
processing power, speed, and generality.
In this study, neural network process-inverse models were developed for two
different methanol-water distillation columns: (i) a lab-scale column; and (ii) an industrial-
scale high-purity column. The data required for "training" and "testing" the neural
networks for the two distillation columns were obtained from steady-state simulations of
the two distillation columns developed using a commercial steady-state simulation
package. The neural networks were trained using a very efficient nonlinear optimization
algorithm based on the Levenberg-Marquardt method.
The neural network steady-state process-inverse models were used in conjunction
with a reference system synthesis based on first-order dynamics. The neural network
model-based controllers were tested on dynamic simulations of the two distillation
columns for both servo and regulatory modes of operation, and their performances were
compared with conventional static feedforward Proportional-Integral controllers.
The approach presented in this study is simple and direct. It addresses issues such as
obtaining training and testing data from steady-state simulation packages, training the
neural networks with a more robust and efficient nonlinear optimization algorithm, using
steady-state process-inverse neural network models, and incorporating the model with a
reference system synthesis to formulate a very simple multivariable control structure that
is distinct from, and performs better than, conventional Proportional-Integral controllers.
The methodology offers the advantages of easy implementation and a practical solution
to difficult control problems.
LIST OF TABLES
3.1 Design and Operating Conditions for the Two Distillation Columns 39
4.1 Process Gains for Overhead and Bottom Compositions for the Lab Distillation Column 89
4.2 Process Gains for Overhead and Bottom Compositions for the High-Purity Distillation Column 90
4.3 First-Order Plus Deadtime Models for Overhead and Bottom Compositions for the Lab Distillation Column 91
4.4 First-Order Plus Deadtime Models for Overhead and Bottom Compositions for the High-Purity Distillation Column 92
4.5 Relative Gain Array for the Lab Distillation Column using the Average Process Gains 94
4.6 Relative Gain Array for the High-Purity Distillation Column using the Average Process Gains 95
6.1 Description of the Controller Tests for the Lab Distillation Column 105
6.2 Description of the Servo-mode Controller Tests for the High-Purity Distillation Column 106
6.3 Description of the Regulatory-mode (Feed Flowrate Upsets) Controller Test for the High-Purity Distillation Column 107
6.4 Description of the Regulatory-mode (Feed Composition Upsets) Controller Test for the High-Purity Distillation Column 108
6.5 Comparison of Controller Performance for the Lab Distillation Column 118
6.6 Neural Network Model-Based Controller Performance for the High-Purity Distillation Column 147
6.7 Conventional Feedback PI plus Feedforward Controller Performance for the Lab Distillation Column 148
B.1 Values for INDEX(1) and their Corresponding Meaning 197
B.2 Values for INDEX(2) and their Corresponding Meaning 198
B.3 Values for INDEX(3) and their Corresponding Meaning 199
B.4 Values for INDEX(3) Under Error Returns and their Corresponding Meaning 202
C.1 Vapor-Liquid Equilibrium for Methanol-Water System at 1 atma 208
C.2 Enthalpy Data for Methanol-Water System at 1 atma 209
C.3 Transport Property Data for Methanol-Water System at 1 atma 210
C.4 VLE for Methanol-Water System at 2 atma 221
C.5 Enthalpy Data for Methanol-Water System at 2 atma 222
C.6 Transport Property Data for Methanol-Water System at 2 atma 223
LIST OF FIGURES
1.1 Schematic of a simple distillation column 2
2.1 Feedforward neural network architecture 9
2.2 Signal processing within a neuron 10
2.3 Computational aspects in a feedforward neural network 13
2.4 Basic types of neural network learning mechanisms 16
2.5 Mapping a nonlinear function with a neural network 33
3.1 Reflux rate predictions from neural networks for lab column 40
3.2 Boilup rate predictions from neural networks for lab column 42
3.3 Reflux rate predictions from neural networks for high-purity column 44
3.4 Boilup rate predictions from neural networks for high-purity column 46
4.1 Schematic of a distillation column with details on the i'th stage 52
4.2 Open-loop response to boilup rate changes in lab column 59
4.3 Open-loop response to reflux rate changes in lab column 62
4.4 Open-loop response to feed flowrate changes in lab column 65
4.5 Open-loop response to feed composition changes in lab column 68
4.6 Open-loop response to boilup rate changes in high-purity column 72
4.7 Open-loop response to reflux rate changes in high-purity column 76
4.8 Open-loop response to feed flowrate changes in high-purity column 80
4.9 Open-loop response to feed composition changes in high-purity column 84
5.1 The neural network model-based control strategy 103
6.1 Neural network model-based controller without dynamic compensation on lab column 110
6.2 Static feedforward Pl-controller without dynamic compensation on lab column 116
6.3 Neural network model-based controller with dynamic compensation on lab column 120
6.4 Response of controlled variables to neural network model-based controller without dynamic compensation on lab column 123
6.5 Setpoint changes with neural network model-based controller without dynamic compensation on high-purity column 127
6.6 Setpoint changes with static feedforward PI controller without dynamic compensation on high-purity column 132
6.7 Feed flowrate changes with neural network model-based controller without dynamic compensation on high-purity column 134
6.8 Feed flowrate changes with static feedforward PI controller without dynamic compensation 138
6.9 Feed composition changes with neural network model-based controller without dynamic compensation on high-purity column 140
6.10 Feed composition changes with static feedforward PI controller without dynamic compensation 145
7.1 Steady-state process-model mismatch for the lab column 153
7.2 Steady-state process-model mismatch for the high-purity column 155
7.3 Steady-state process gains for the lab column 159
7.4 Steady-state process gains for the high-purity column 161
8.1 Proposed structure for constrained neural network model-based control 170
A.1 A 3-layered feedforward neural network 186
A.2 Processes in the j'th neuron in the hidden layer 187
A.3 Processes in the k'th neuron in the output layer 188
C.1 Vapor-Liquid equilibrium for methanol-water system at 1 atma 211
C.2 Saturated liquid temperature versus liquid-phase composition for methanol-water system at 1 atma 212
C.3 Saturated liquid density versus liquid-phase composition for methanol-water system at 1 atma 214
C.4 Saturated vapor density versus liquid-phase composition for methanol-water system at 1 atma 215
C.5 Subcooled liquid density versus liquid-phase composition for methanol-water system at 1 atma 216
C.6 Saturated liquid enthalpy versus liquid-phase composition for methanol-water system at 1 atma 217
C.7 Saturated vapor enthalpy versus vapor-phase composition for methanol-water system at 1 atma 219
C.8 Average molecular weight versus liquid-phase composition for methanol-water system at 1 atma 220
C.9 Vapor-Liquid equilibrium for 0.0 < x < 0.1 for methanol-water system at 2 atma 225
C.10 Vapor-Liquid equilibrium for 0.1 < x < 0.98 for methanol-water system at 2 atma 226
C.11 Vapor-Liquid equilibrium for 0.98 < x < 1.0 for methanol-water system at 2 atma 227
C.12 Saturated liquid temperature versus liquid-phase composition for methanol-water system at 2 atma 228
C.13 Saturated liquid density versus liquid-phase composition for methanol-water system at 2 atma 229
C.14 Saturated vapor density versus vapor-phase composition for methanol-water system at 2 atma 231
C.15 Subcooled liquid density versus liquid-phase composition for methanol-water system at 2 atma 232
C.16 Saturated liquid enthalpy versus liquid-phase composition for methanol-water system at 2 atma 233
C.17 Saturated vapor enthalpy versus vapor-phase composition for methanol-water system at 2 atma 234
C.18 Subcooled liquid enthalpy versus liquid-phase composition for methanol-water system at 2 atma 235
CHAPTER I
INTRODUCTION
1.1 Distillation Control and Its Importance
For many reasons, distillation remains the most important separation technique in
chemical process industries around the world and constitutes a significant fraction of their
capital investment. The operating costs of distillation columns are often a major part of
the total operating costs of many processes. Within the U.S., there are an estimated 40,000
columns, which consume approximately 3% of the total U.S. energy usage (Humphrey et
al., 1991). For these reasons, improved distillation control can have a significant impact
on reducing energy consumption, improving product quality, and protecting environmental
resources. However, distillation columns present challenging control problems because
their behavior is usually nonlinear, nonstationary, interactive, and their operation is often
subject to constraints and disturbances.
1.2 Basics of Distillation Control
A simple distillation column such as the one shown in Figure 1.1 will be used to
present some fundamental aspects of distillation control. Even though the principles may
seem quite obvious and trivial, oversights of the basic process behavior are often reasons
for poor control. The column shown in Figure 1.1 has a single feed, and produces two
products. Heat is added to the column in the reboiler and removed in the condenser.
Reflux is introduced on the top tray. One of the best sources for details on the operational
and design aspects of distillation columns is the text by Treybal (1980).
The degrees of freedom, from a process control standpoint, are the number of variables that
can be or must be controlled. Mathematically speaking, the degrees of freedom can be
calculated by subtracting the total number of independent equations from the total number
of variables.

[Figure 1.1. Schematic of a simple distillation column, showing the feed (entering at stage NF), the condenser, the overhead distillate product, the boilup, the reboiler, and the bottoms product.]

An easier approach, suggested by Luyben (1990), is to add
the total number of rationally placed control valves, where the term "rationally placed"
disqualifies poorly conceived designs that use two control valves in series, etc. In
Figure 1.1, there are five control valves, one on each of the following streams: distillate,
reflux, coolant, bottoms, and heating medium (usually steam). It is assumed that the feed
flowrate is set by the upstream process. Therefore, the simple distillation column has five
degrees of freedom. In any process, inventories, which include the liquid levels and the
pressures, must be controlled. Subtracting three variables from the five that must be
controlled gives two degrees of freedom. Therefore, there are two and only two
additional variables that can be controlled in this distillation column. Note that no
assumption has been made with regard to the number or the type of chemical components
being separated. So, irrespective of whether the distillation column is a simple binary
column or a complex multicomponent column, it has only two degrees of freedom. Of
course, this is true only for systems under unconstrained operating conditions.
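The valve-counting bookkeeping above can be sketched in a few lines. This is a modern Python illustration, not code from this dissertation; the stream and loop names are taken from the Figure 1.1 discussion.

```python
# Luyben's rule of thumb: the degrees of freedom equal the number of
# rationally placed control valves. Inventories (levels and pressure)
# must be controlled first; what remains is available for composition
# or temperature control.
control_valves = ["distillate", "reflux", "coolant", "bottoms", "heating medium"]
inventory_loops = ["reflux-drum level", "base level", "column pressure"]

degrees_of_freedom = len(control_valves)               # 5 for Figure 1.1
remaining = degrees_of_freedom - len(inventory_loops)  # left for quality control
print(remaining)  # -> 2
```

The count is independent of the number of components being separated, which is exactly the point made in the text.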
The two variables that are chosen as controlled variables depend on many factors.
Some common situations are:
(1) Control the composition of the light-key impurity in the bottom product and the
composition of the heavy-key impurity in the overhead product (distillate).
(2) Control a temperature in the rectifying section (the section above the feed tray) of
the column and a temperature in the stripping section (the section below the feed tray) of
the column.
(3) Control the reflux flowrate and a temperature somewhere in the column.
(4) Control the flowrate of the heating medium to the reboiler and a temperature near
the top of the column.
(5) Control the reflux ratio (ratio of the reflux flowrate to the distillate flowrate) and
a temperature in the column.
The above discussion shows that (a) only two things can be controlled, and (b)
normally at least one composition (or one temperature) somewhere in the column must be
controlled. Once the five variables that must be controlled are specified (e.g., two
compositions, two levels, and pressure), we still have the task of deciding the choice of
controlled variable-manipulated variable pairing. The "pairing" problem is known as
determining the structure of the control system. Volumes have been written addressing
the issue of the distillation control structure (Luyben, 1992; Skogestad et al., 1990;
McAvoy, 1983). The pairing issue is of extreme importance because of the highly coupled
nature of the interactions between the controlled variables and the manipulated variables.
As a result, simple single-loop control using conventional Proportional-Integral-Derivative
(PID) controllers causes the control loops to interact, leading to deterioration in control
performance. This is especially true if the objective is to control the two compositions at
both ends of the column using the reflux and steam flowrates as the manipulated variables
(Wood and Berry, 1973).
Accordingly, much research and development, in both the private and public sectors, has
focused on control methods that use modern computing power to cope with these control-
related difficulties (Luyben, 1992). Advanced control techniques require either the use of
conventional PID controllers with complex configurations and elaborate tuning procedures
(Papastathapoulou and Luyben, 1991; Ding and Luyben, 1990; Muhrer et al., 1990; Finco
et al., 1989; Elaahi and Luyben, 1985; Tyreus and Luyben, 1976) or the use of nonlinear
multi-variable models (Hokanson and Grestle, 1992; Pandit et al., 1992; Rheil, 1992;
Patwardhan et al., 1990; Riggs, 1990) for efficient control of distillation columns.
Comprehensive reviews on the use of nonlinear multi-variable models for advanced
process control strategies can be found in the papers by Bequette (1990), Bosley et al.
(1992), and Seborg et al. (1986).
1.3 Neural Networks and Its Relevance to Process Control
The nonlinear models used in the nonlinear multi-variable control strategies
generally tend to become rigorous and computationally intensive as the process behavior
becomes complex. While control success has been demonstrated (Riggs et al., 1993;
Pandit et al., 1992; Pandit and Rhinehart, 1992; Cott et al., 1985), it is often at the
expense of computational power, operator-friendly interaction, or ease of controller
development and maintenance.
Neural networks offer an alternate approach to modeling process behavior as they do
not require a priori knowledge of the process phenomena. They "learn" by extracting
pre-existing patterns from data that describe the relationship between the inputs and the
outputs of any given process phenomenon. When appropriate inputs are applied to the
network, the network acquires "knowledge" from the environment in a process known as
"learning." As a result, the network assimilates information that can be recalled later.
Neural networks are capable of handling complex and nonlinear problems, processing
information rapidly, and reducing the engineering effort required in model development.
Neural networks have been applied successfully to a variety of problems, such as
process fault diagnosis (Venkatasubramanian et al., 1990; Venkatasubramanian and Chan,
1989), modeling of semiconductor manufacturing processes (Himmel and May, 1993;
Reitman and Lory, 1993), system identification (MacMurray and Himmelblau, 1993;
Pottman and Seborg, 1992; Narendra and Parthasarathy, 1990), pattern recognition and
adaptive control (Hinde and Cooper, 1993; Cooper et al., 1992a,b; Cooper et al., 1990),
process modeling and control (You and Nikolaou, 1993; Nahas et al., 1992; Psichogios
and Unger, 1992; Bhat and McAvoy, 1990; Bhat et al., 1990; Narendra and Parthasarathy,
1990; Guez et al., 1988), and statistical time series modeling (Poll and Jones, 1994;
Weigand et al., 1990). In the area of distillation control, neural networks have found
application in identification and control of a packed distillation column (MacMurray and
Himmelblau, 1993) where a neural network model was used as the model in model
predictive control. Neural network control of distillation in a multi-variable model
predictive control framework also includes studies on dynamic simulations (Willis et al.,
1990) and pilot plants (Megan and Cooper, 1993; Willis et al., 1992). The papers by
Thibault and Grandjean (1992) and Astrom and McAvoy (1992) provide in-depth reviews
on neural network applications in chemical process control.
CHAPTER II
NEURAL NETWORKS
Comparison of neural networks to conventional data processing and expert systems
allows a better understanding of the technology and its applications. Conventional
processing techniques apply explicit procedures or steps to numerical data in order to
arrive at an output. Expert systems, on the other hand, use logical facts as input and
employ a set of explicitly specified rules in a knowledge base to arrive at a decision. In
contrast, neural networks use no explicitly specified knowledge or procedure to analyze
new data. They extract pre-existing patterns from statistically-based data. When
appropriate inputs are applied to the network, the network acquires "knowledge" from the
environment in a process known as "learning." As a result, the network assimilates
information that can be recalled later. The above statements do not imply that the person
using neural networks for process modeling can entirely ignore understanding the process
behavior and its interactions. It simply means that the neural network does not need to
understand the underlying phenomena of the process being modeled.
There are several different types of neural networks that are being studied and/or
used in applications. Some of the network models used commonly in process control
systems are: multi-layer perceptron, Kohonen's self-organizing map, adaptive resonance
network, the Hopfield network, the Boltzmann machine, and the cerebellar model
articulation controller (CMAC). Some of the above mentioned networks are called
feedforward networks and, others, feedback (or recurrent) networks, based on whether
the information flow occurs only in the forward direction or in both forward and
backward directions. Neural networks are also further classified as either static or
dynamic systems, based on whether the stored mapping can be recalled instantly or
involves some delay or time-domain characteristics. Each type of neural network, with its
specific differences in internal structure and function, has specific uses and advantages in
control applications. In this study, our focus is restricted to only feedforward neural
networks. Details on the structure and applications of some of the other types of neural
networks can be found elsewhere (Astrom and McAvoy, 1992; Zurada, 1992; Bhat and
McAvoy, 1990).
2.1. Feedforward Neural Networks
A network is a dense mesh of nodes and connections. The basic processing element
of a neural network is the neuron. The neurons operate collectively and simultaneously on
most or all data. Figure 2.1 illustrates the general structure of a feedforward network in
which the information flow occurs only in the forward direction, i.e., from the input to the
output. Feedforward neural networks are organized in layers and, typically, consist of at
least three layers: an input layer; one or more hidden layers; and an output layer. In
addition, there may be a bias neuron that provides a constant and invariant output, say +1
or -1. The connections are the means for information flow. Each connection has an
associated weight, w, which is expressed by a numerical value that can be modified. The
weight is a measure of the connection strength between two neurons. Each neuron in the
hidden and output layers accepts information from other neurons in the network and
performs a specific computational task. Inputs to any neuron in the network are first
multiplied by the corresponding connection weight. The weighted inputs are then summed
up. The sum of the weighted inputs is finally modified by a transfer function to obtain
an output from the neuron. Information flows from the input layer to the output layer of
the neural network with the above mentioned processes occurring in each neuron in the
hidden and output layers until a network output is obtained. Figure 2.2 presents a graphical
picture of the processes occurring in each neuron in the hidden and output layers, with y
being the transformed output from the neuron, and z being the weighted sum of all inputs,
x, to the neuron.

[Figure 2.1. Feedforward neural network architecture: network inputs enter the input layer and flow through weighted connections to the neurons of the hidden layer and then the output layer, producing the network output.]

[Figure 2.2. Signal processing within a neuron: inputs x from other neurons are multiplied by their weights w, summed, and transformed to produce the output from the neuron.]
The transfer function is also known as a mapping function or an activation function.
In early neural network models (Rosenblatt, 1958), the transfer functions were discrete
and discontinuous functions with binary outputs. This led to the conclusion that layered
neural networks had limited potential in solving more complex problems (Minsky and
Papert, 1969). It was not until the mid-1980s that continuous differentiable transfer
functions came into use (Hopfield, 1984, 1982). It was then shown that a continuous-
valued neural network with a continuous differentiable nonlinear transfer function can
approximate any continuous function arbitrarily well in a compact set (Cybenko, 1988).
Cybenko (1989) also demonstrated that any arbitrary decision region can be well
approximated by a continuous neural network with only one single internal hidden layer
and any continuous sigmoidal transfer function. Typically, transfer functions are
sigmoidal (S-shaped), and can be either unipolar (output range from 0 to 1) or bipolar
(output range from -1 to +1) functions. The transfer functions can also be linear, and can
consist of algebraic or differential equations. In a discrete transfer function neuron, the
bias provides a threshold limit that triggers the neuron. In a continuous-valued transfer
function, its role is not quite clear, and many feedforward network formulations do not
have the additional bias neuron. In this research work, a hyperbolic tangent (y = tanh(z)),
which is a bipolar, continuously differentiable function, was used as the transfer function
for all neurons in the hidden and output layers. Also, a bias neuron with a constant output
of +1 was used. In our experience, the hyperbolic tangent performs extremely well in
mapping the input-output relationships of many complex processes. More details on the
types of transfer functions used in neural networks can be found elsewhere (Zurada,
1992).
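The two sigmoidal forms mentioned above differ only in their output range. A brief sketch (a modern Python illustration, not part of the original FORTRAN work):

```python
import math

def unipolar(z):
    # Logistic sigmoid: S-shaped, output range (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def bipolar(z):
    # Hyperbolic tangent: S-shaped, output range (-1, +1); this is the
    # transfer function used for all hidden and output neurons in this work
    return math.tanh(z)

print(bipolar(0.0), unipolar(0.0))  # -> 0.0 0.5
```

Both functions are continuously differentiable, which is what makes gradient-based training such as backpropagation possible.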
2.2. Computational Aspects in a Feedforward Neural Network
It is important to understand the functioning of a feedforward neural network from a
mathematical standpoint in order to understand how a neural network can be trained to
"learn" a particular process phenomenon. Figure 2.3 shows a feedforward neural network
similar to that in Figure 2.1 but with more details to allow a mathematical treatment.
Let x1 and x2 be two inputs to the network. Assume that they represent the
normalized values of some "real-world" data, scaled to a range of ±1. Let the bias neuron
have a constant output of +1, i.e., b = +1. Let us number the weights in the following
sequence: w1 is the weight for the connection between the bias node and the first node in
the hidden layer; w2 is the weight for the connection between the first node in the input
layer and the first node in the hidden layer; w3 is the weight for the connection between the
second node in the input layer and the first node in the hidden layer. Start again with the
bias node and connect the second node in the hidden layer, and so on. After all the
neurons in the hidden layer have been connected, start with the bias node and connect all
the neurons in the output layer. Figure 2.3 shows all the weights, 13 in total, for the
two-input, three-hidden-node, one-output (abbreviated as 2-3-1) feedforward neural network.
The general formula that defines the total number of weights in a feedforward network
with a bias neuron is

k = (n_in + 1)·n_hid + (n_hid + 1)·n_out,    (2.1)

where k is the total number of weights in the neural network, n_in is the number of inputs to
the network, n_hid is the number of neurons in the hidden layer, and n_out is the number of
outputs from the network.
The numbering scheme chosen herein is just one of several in common use. The
scheme shown here translates into an efficient algorithm that can be easily
coded in any computer programming language. Some algorithms use three subscripts to
Figure 2.3. Computational aspects in a feedforward neural network
denote each weight, w_kij, which represents the connection weight between the i'th node in
the (k−1)'th layer and the j'th node in the k'th layer in a multi-layered feedforward network.
If x_1 and x_2 are the two inputs to the network, then z_i, the summed input to the i'th
node in the hidden layer, is calculated as

z_1 = b·w_1 + x_1·w_2 + x_2·w_3,
z_2 = b·w_4 + x_1·w_5 + x_2·w_6, and
z_3 = b·w_7 + x_1·w_8 + x_2·w_9. (2.2)
The summed input to each neuron is then transformed by the transfer function, the
hyperbolic tangent in this case, to produce the transformed output. If h_j is the transformed
output of the j'th node in the hidden layer, then

h_1 = tanh(z_1),
h_2 = tanh(z_2), and
h_3 = tanh(z_3). (2.3)
If o_i is the summed input to the i'th node in the output layer, then

o_1 = b·w_10 + h_1·w_11 + h_2·w_12 + h_3·w_13, (2.4)

and the transformed output, y_1, which also happens to be the network output, will be

y_1 = tanh(o_1). (2.5)
The above scheme describes the mathematical functioning of a feedforward neural
network, and can easily be extended to any number of nodes in the input, hidden and
output layers. The above discussion also shows that a feedforward neural network is a
static system because once trained, the recall is instantaneous and the output is obtained in
one single-pass through the network. On the other hand, dynamic systems require
iterations with time before an output can be obtained.
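As an illustration (a sketch added here, not part of the network programs used in this work), the single-pass recall of Equations 2.2-2.5 can be written in a few lines of Python; the weight ordering follows the numbering scheme above:

```python
import numpy as np

def forward_2_3_1(x1, x2, w):
    """Single-pass recall through the 2-3-1 network of Figure 2.3.
    w holds the 13 weights numbered as in the text; the bias neuron
    has a constant output of b = +1."""
    b = 1.0
    # Summed inputs to the three hidden nodes (Equation 2.2)
    z1 = b * w[0] + x1 * w[1] + x2 * w[2]
    z2 = b * w[3] + x1 * w[4] + x2 * w[5]
    z3 = b * w[6] + x1 * w[7] + x2 * w[8]
    # Hyperbolic tangent transfer function (Equation 2.3)
    h1, h2, h3 = np.tanh(z1), np.tanh(z2), np.tanh(z3)
    # Summed input to the output node and network output (Equations 2.4, 2.5)
    o1 = b * w[9] + h1 * w[10] + h2 * w[11] + h3 * w[12]
    return np.tanh(o1)
```

Because the recall is one pass, the trained network acts as a static map from (x_1, x_2) to y_1, as noted above.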
2.3. Training of Neural Networks
Neural network model development involves training the neural network to "learn"
the input-output mapping from a set of examples. In general, learning can be defined as a
permanent change in behavior brought about by experience. In human beings, learning is
an inferred process; it can be assumed to have occurred by observing changes in
performance. In neural networks, learning is a more direct process, and typically, it can be
captured by distinct cause-effect relationships. For a neural network to perform any of the
tasks mentioned earlier, it has to learn the input-output mapping from a set of examples.
The process of learning corresponds to changes in the weights.
Training of a neural network can be either supervised or unsupervised. Figure 2.4
gives a graphical picture of the two basic learning modes. In supervised learning, the error
between the actual, K, and desired, D, responses is used to correct the network parameters
(the weights) externally so that the error decreases. Therefore, a set of input and output
patterns called a training set is required for supervised learning. In many situations, the
inputs, outputs, and the computed gradients are subject to random fluctuations, and the
minimization must proceed over those fluctuations. As a result, most supervised learning
algorithms reduce to stochastic minimization of error in a multi-dimensional weight space.
In unsupervised learning, the desired response is not known; therefore, explicit error
information cannot be used to improve network behavior. Suitable weight self-adaptation
mechanisms have to be embedded in the network. The topic of unsupervised learning is
an area of active research interest in pattern recognition where it is used to perform
clustering of objects when there is no a priori information available about the classes.
More details on unsupervised learning can be found in the text by Zurada (1992).
An important part of neural network training is the learning rule. Learning rules are
algorithms that govern the modification of the internal representation (the weights) of the
Figure 2.4. Basic types of neural network learning mechanisms: (a) supervised learning, in which a teacher compares the actual network output, Y, with the desired output, D, for a given network input, X; (b) unsupervised learning, in which only the network input, X, and actual output, Y, are available.
network in response to the inputs and the transfer function. The learning rules fall into
seven basic categories: Hebbian Learning Rule (Hebb, 1949); Perceptron Learning Rule
(Rosenblatt, 1958); Delta Learning Rule (McClelland and Rumelhart, 1986); Widrow-
Hoff Learning Rule (Widrow, 1962); Correlation Learning Rule; Winner-Take-All
Learning Rule (Hecht-Nielsen, 1987); and Outstar Learning Rule (Grossberg, 1982, 1977).
Details regarding how each rule accomplishes the weight adjustment, the learning type
(supervised or unsupervised) involved, and the characteristics of the neurons for these
different learning rules are discussed in the text by Zurada (1992).
2.4. Backpropagation Training Algorithm
A popular training algorithm for multi-layered feedforward networks is called the
error backpropagation training algorithm or, simply, backpropagation. Backpropagation
uses the Generalized Delta Learning Rule (Rumelhart et al., 1986; Werbos, 1974), and
has been used extensively by researchers for network training. The algorithm is so named
because even though the network operates in the forward manner (i.e., from input to
output) during the classification stage, the weight adjustments enforced by the learning
rules propagate exactly backward, from the output layer toward the input layer. Classical
backpropagation is a gradient approach to optimization, executed iteratively, in which the
distance moved along the search direction in weight space is implicitly bounded by the
learning rate, which is equivalent to a step size.
The problem of learning an input-output mapping from a set of P examples can be
transformed into the minimization of a suitably defined error function. Although different
definitions of the error have been used, we will consider the "traditional" sum-of-squared-
differences error function defined as

E = (1/2) Σ_{p=1}^{P} E_p = (1/2) Σ_{p=1}^{P} Σ_{i=1}^{n_out} (D_pi − Y_pi)², (2.6)
where E is the total sum-of-squared error for all P patterns, E_p is the sum-of-squared error
for the p'th pattern, and D_pi and Y_pi are the desired and network-predicted responses for
the i'th output of the p'th pattern. The backpropagation algorithm is composed of two
stages. In the first, the contributions to the gradient coming from each pattern (∂E_p/∂w_ij)
are calculated by "backpropagating" the error signal. The partial contributions are then used
to correct the weights after every pattern presentation. If δ_k, the gradient of the error
function, is defined as

δ_k = ∇E(W_k), (2.7)

where W is the weights matrix, the weight update is given as

ΔW_k = −η·δ_k, (2.8)

where η is the learning rate.
The backpropagation algorithm defines an important step in the history of neural
networks and understanding the mechanisms of weight adjustment using backpropagation
leads to a better appreciation of the general procedure of training neural networks. The
backpropagation algorithm is presented in detail in Appendix A.
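As an illustrative sketch of Equations 2.6-2.8 for the 2-3-1 network, the gradient is approximated below by finite differences purely for brevity; backpropagation proper obtains it analytically by propagating error signals backward, as detailed in Appendix A. The example patterns are arbitrary:

```python
import numpy as np

def sse(w, patterns):
    """Total sum-of-squared error over all patterns (Equation 2.6)
    for the 2-3-1 network of Figure 2.3."""
    total = 0.0
    for (x1, x2), d in patterns:
        b = 1.0
        h = np.tanh([b * w[0] + x1 * w[1] + x2 * w[2],
                     b * w[3] + x1 * w[4] + x2 * w[5],
                     b * w[6] + x1 * w[7] + x2 * w[8]])
        y = np.tanh(b * w[9] + h[0] * w[10] + h[1] * w[11] + h[2] * w[12])
        total += 0.5 * (d - y) ** 2
    return total

def train_step(w, patterns, eta=0.1, eps=1e-6):
    """One weight update per Equations 2.7-2.8: delta_k = grad E(W_k),
    Delta W_k = -eta * delta_k.  The gradient is estimated by forward
    differences here; backpropagation computes it analytically."""
    grad = np.zeros_like(w)
    e0 = sse(w, patterns)
    for i in range(len(w)):
        wp = w.copy()
        wp[i] += eps
        grad[i] = (sse(wp, patterns) - e0) / eps
    return w - eta * grad
```

Repeated calls to train_step drive the sum-of-squared error downhill, which is all that Equation 2.8 asserts.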
2.5. Analysis of the Backpropagation Algorithm
The essence of the backpropagation algorithm is the evaluation of the contribution of
each particular weight to the output error. This evaluation is possible because the
objective function of a neural network is composed of continuously differentiable functions
of the weights. It would not have been possible with discrete neurons.
Even though the backpropagation algorithm is a breakthrough in supervised learning
of layered neural networks, in practice its implementation may
encounter several difficulties. These difficulties are typical of those arising in
multi-dimensional optimization approaches. Backpropagation has the advantage of being readily
adaptable to parallel hardware architectures. However, most current studies of artificial
neural networks (ANNs) are conducted primarily on serial rather than parallel processors.
On serial machines, backpropagation can be very inefficient because the choice of initial
weights affects the algorithm severely in terms of its convergence and the rate at which it
converges (Kramer and Leonard, 1990).
One of the problems is that the error minimization procedure may get "hung up" in a
local minimum of the error function. The on-line procedure has to be used if all the
patterns are not available before learning starts, and a continuous adaptation to a stream of
input-output signals is desired. One of the reasons in favor of the on-line approach is that
it possesses some randomness that may help in escaping from a local minimum. The
objection to this is that the method may, for the same reason, miss a good local minimum.
The on-line update is useful when the number of patterns is so large that the errors
involved in the computation of the total gradient may be comparable to the gradient itself.
The fact that many patterns possess redundant information has been cited as an argument
in favor of on-line backpropagation, because many of the contributions to the gradient are
similar, so that waiting for all contributions before updating can be wasteful (Le Cun,
1986). This is especially true in pattern classification.
The effectiveness and convergence of the backpropagation algorithm depend
significantly on the learning constant, η. In general, however, the optimum value of η
depends on the problem being solved, and there is no single learning constant value
suitable for different training cases. This is a problem common to all gradient-based
optimization techniques. While gradient descent can be an efficient method for obtaining
the weight values that minimize an error, error surfaces frequently possess properties that
make the procedure slow to converge. In the original formulation, the learning constant,
η, was taken as a fixed parameter. Unfortunately, if the learning constant is fixed in an
arbitrary way, there is no guarantee that the network will converge. But even if η is
chosen appropriately so that the error decreases with a reasonable speed and oscillations
are avoided, gradient descent is not always the fastest method to employ.
In practice quite a few simple improvements have been used to speed up convergence
and improve the robustness of the backpropagation algorithm (Kramer and Leonard,
1990; Leonard and Kramer, 1990a,b; Hush and Salas, 1988). One such improvement is
the addition of a momentum term that helps to accelerate the convergence by
supplementing the current weight adjustment with a fraction of the most recent weight
adjustment. The weight update can be written as

ΔW_k = −η·δ_k + α·ΔW_{k−1}, (2.9)

where k and k−1 indicate the current and the most recent training step,
respectively, and α is a user-selected positive momentum constant. The second term in
Equation 2.9 is called the momentum term and, typically, α is chosen between 0.1 and 0.8.
The momentum term keeps the direction of the descent from changing too rapidly from
step to step. It has been shown that inclusion of the momentum term can considerably
speed up convergence, when comparable η and α are employed, compared with the
standard backpropagation technique.
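A sketch of the update of Equation 2.9, applied for illustration to a one-variable quadratic error surface (the test function and constants below are chosen arbitrarily):

```python
def momentum_step(w, grad, prev_dw, eta=0.1, alpha=0.5):
    """Equation 2.9: Delta W_k = -eta * delta_k + alpha * Delta W_{k-1}.
    The momentum constant alpha (typically 0.1 to 0.8) blends in the most
    recent adjustment so the descent direction does not change too rapidly."""
    dw = -eta * grad + alpha * prev_dw
    return w + dw, dw

# Example: descending the error surface E(w) = w^2, whose gradient is 2w
w, dw = 5.0, 0.0
for _ in range(50):
    w, dw = momentum_step(w, 2.0 * w, dw)
```

After the loop, w has been driven close to the minimum at zero; the previous adjustment dw must be carried between steps, which is the only bookkeeping the momentum term adds.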
If all the patterns are available, collecting the total gradient before deciding the next
step can be useful in order to avoid mutual interference of the weight changes. This
strategy is commonly known as batching and has been used with the standard
backpropagation algorithm (Rumelhart et al., 1986). Even so, the batch technique still
requires that the user select the values for the learning and momentum constants.
2.6. Optimization Approach to Neural Network Training
One of the competitive advantages of neural networks is the ease with which they
may be applied to novel or poorly understood problems. It is, therefore, essential to
consider automated and robust learning methods with good average performance on many
classes of problems. The current trend is to use optimization tools and strategies that exhibit
distinctly superior performance (Peel et al., 1992; Barnard, 1992; Battiti, 1992; Hsiung et
al., 1991) and, furthermore, are easier to apply because they do not require the choice of
critical parameters (such as η and α) by the user. Several researchers (Kramer and
Leonard, 1990; Kollias and Anastassiou, 1988; Kung and Hwang, 1988; Ricotti et al.,
1988; Parker, 1987; Watrous, 1987; White, 1987) have shown that optimization
algorithms employing modern unconstrained optimization techniques based on the secant
or conjugate gradient methods together with the backpropagation concept are much
better than classical backpropagation itself.
2.6.1. Conjugate Gradient Methods
One of the difficulties in using the steepest descent method is that a one-dimensional
minimization in some arbitrary direction a followed by a minimization in another direction
b does not imply that the function is minimized on the subspace generated by a and b.
Minimization along direction b may in general spoil a previous minimization along
direction a (this is why the one-dimensional minimization in general has to be repeated a
number of times larger than the number of variables). On the contrary, if the directions
were noninterfering and linearly independent, at the end of N steps the process would
converge to the minimum of the quadratic function. The concept of noninterfering
(conjugate) directions is the basis of the conjugate gradient method for minimization. A
major difficulty with the above form is that, for a general function, the obtained directions
are not necessarily descent directions and numerical instability can result.
The use of the momentum term to avoid oscillations in the backpropagation method
(Rumelhart et al., 1986) can be considered an approximate form of conjugate gradient.
In both cases, the gradient direction is modified with a term that takes the previous
direction into account, the important difference being that the parameter in the conjugate
gradient technique is automatically defined by the algorithm, while the momentum rate has
to be "guessed" by the user. More details on the conjugate gradient method are found
elsewhere (Press et al., 1992; Battiti, 1992; Leonard and Kramer, 1990a). Conjugate
gradient methods have been used by Barnard (1992) and Leonard and Kramer (1990a) for
training feedforward neural networks.
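The idea can be illustrated with the classical linear conjugate gradient iteration for a quadratic function f(x) = (1/2)·x·A·x − b·x, using the Fletcher-Reeves formula for the direction parameter (a sketch added for illustration; the particular matrix and vector in the usage example are arbitrary):

```python
import numpy as np

def conjugate_gradient(A, b, x0, n_steps):
    """Minimize the quadratic f(x) = 0.5*x.A.x - b.x by successive exact
    line minimizations along mutually conjugate (noninterfering)
    directions.  For an N-variable quadratic with symmetric positive
    definite A, N steps reach the minimum exactly; n_steps should not
    exceed the dimension of x."""
    x = np.asarray(x0, dtype=float)
    r = b - A @ x          # negative gradient at x
    d = r.copy()           # first direction: steepest descent
    for _ in range(n_steps):
        alpha = (r @ r) / (d @ (A @ d))   # exact line minimization
        x = x + alpha * d
        r_new = r - alpha * (A @ d)
        beta = (r_new @ r_new) / (r @ r)  # Fletcher-Reeves parameter,
        d = r_new + beta * d              # set by the algorithm itself
        r = r_new
    return x
```

Note that beta, the analogue of the momentum rate, is computed automatically at every step rather than guessed by the user.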
2.6.2. Newton's Method
Newton's method can be considered as the basic local method using second-order
information. It is important to stress that its practical applicability to multi-layered neural
networks is hampered by the fact that it requires calculation of the Hessian matrix, a
complex and expensive task. If the Hessian matrix (∇²E) is positive definite and the
quadratic model is correct, one iteration is sufficient to reach the minimum. Assuming
that the Hessian can be obtained in reasonable computing times, the main practical
difficulties in applying the "pure" Newton's method arise when the Hessian is not positive
definite, or when it is singular and ill-conditioned. It is worth observing that, although
troublesome for the above reasons, the existence of the directions of negative curvature
may be used to continue from the saddle point where the gradient is close to zero. Battiti
(1992) has reviewed in detail Newton's method and some of its modifications to deal with
global convergence, indefinite Hessians, and iterative approximations for the Hessian itself.
Modifications of Newton's method have been used by Poli and Jones (1994) and White
(1989) for training feedforward neural networks.
2.6.3. Secant Methods
When the Hessian is not available analytically, secant methods are widely used
techniques for approximating the Hessian in an iterative way using information only about
the gradient. Historically, these methods are also called quasi-Newton methods. The
suggested strategy is to update a previously available approximation instead of
determining a new approximation. The Broyden-Fletcher-Goldfarb-Shanno (BFGS)
update (Broyden et al., 1973), a positive definite secant update, has been the most
successful update in a number of studies performed over the years. Secant methods for
learning in multi-layer neural networks have been used, for example, by Watrous (1987).
The O(N²) complexity of BFGS is clearly a problem for very large networks, but the
method can still remain very competitive if the number of examples is very large, so that
the computation of the error function dominates. Hsiung et al. (1991) and Parker (1987)
have used modifications of the secant method for training of feedforward neural networks.
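The heart of these methods, the BFGS rank-two update of the Hessian approximation, can be sketched as follows (an illustration; B, s, and y are names chosen here for the current approximation, the step taken, and the gradient difference, and the matrices in the usage example are arbitrary):

```python
import numpy as np

def bfgs_update(B, s, y):
    """One BFGS secant update of the Hessian approximation B, built from
    gradient information only: s = x_new - x_old, y = grad_new - grad_old.
    The update satisfies the secant equation B_new @ s = y, and B_new
    stays symmetric positive definite whenever s @ y > 0."""
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)
```

Each update costs O(N²) work and storage for an N-parameter problem, which is the complexity figure quoted above.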
2.6.4. Special Method for Least Squares
One drawback of the BFGS method is that it requires storage for a matrix of size N x
N and a number of calculations of order O(N²). While available storage is less of a
problem now than it was a decade ago, the computational problem still exists when N
becomes of the order of one hundred or more. It has also been shown that it is possible to
use a secant approximation with O(N) computing and storage cost that uses second-order
information, in methods that are known as one-step secant (OSS) methods (Battiti, 1989).
But, if the error function that is to be minimized is the usual function described in
Equation 2.6, learning a set of examples is reduced to solving a nonlinear least-squares
problem, for which special methods have been devised. The question then is what makes
this problem different from that dealt with in the general nonlinear function minimization of
Sections 2.6.1-2.6.3? In a broad sense, it is not different at all!
Let us consider a model that depends nonlinearly on a set of N unknown parameters
a_k, k = 1, 2, ..., N. Using standard regression analysis notation to define an objective
function φ, we wish to determine best-fit parameters by minimizing the function φ. The
minimization has to proceed iteratively because of the nonlinear functional dependencies.
Given initial estimates for the parameters, a procedure that improves the initial solution
can be developed. The procedure is repeated until φ stops, or effectively stops, decreasing.
Sufficiently close to the minimum, the φ function can be expected to be well approximated
by a quadratic form, which can be written as

φ(a) ≈ γ − d·a + (1/2)·a·D·a, (2.10)

where γ is a constant, d is an N-vector of gradients, a is the N-vector of parameters, and
D is the N x N Hessian matrix (Press et al., 1992). If the approximation is a good one, the
new estimates, a_next, can be determined from the current estimates, a_cur, in a single step,
from the following relationship

a_next = a_cur + D⁻¹·[−∇φ(a_cur)]. (2.11)
On the other hand, Equation 2.10 could be a poor local approximation to the shape of
the function that we are trying to minimize at a_cur. In that case, using the steepest descent
method, we can take a step down the gradient, i.e.,

a_next = a_cur − c·∇φ(a_cur), (2.12)

where the constant c is small enough not to exhaust the downhill direction.
It is imperative that the gradient of the objective function φ be computed at any set of
parameters a in order to use either Equation 2.11 or 2.12. In addition, the matrix D,
which is the second derivative matrix (the Hessian) of the φ function, is also needed in
order to use Equation 2.11. The crucial difference between the second-order methods
discussed in Sections 2.6.1-2.6.3 and the method discussed here is that there was no way
of directly evaluating the Hessian matrix in the second-order methods. One could only
evaluate the function to be minimized and (in some cases) its gradient. Therefore, iterative
methods are required not just because the function is nonlinear, but also in order to
generate information about the Hessian matrix. In the present method, the form of φ is
known exactly, since it is based on a user-specified function. Therefore, the Hessian
matrix is known, and Equation 2.11 can be used whenever needed. Equation 2.12 will be
used whenever Equation 2.11 fails to improve the fit, signaling failure of Equation 2.10 as
a good local approximation. More details on the least-squares method are available in the
text by Press et al. (1992).
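The two-step logic of Equations 2.11 and 2.12 can be sketched as follows (an illustration added here; the quadratic test function in the usage example, and the fallback constant c, are arbitrary):

```python
import numpy as np

def next_estimate(a_cur, grad, hessian, c=0.01):
    """Try the full quadratic-model step of Equation 2.11,
    a_next = a_cur + D^{-1}(-grad); fall back to the steepest-descent
    step of Equation 2.12, a_next = a_cur - c*grad, when the Hessian
    cannot be inverted."""
    try:
        return a_cur - np.linalg.solve(hessian, grad)
    except np.linalg.LinAlgError:
        return a_cur - c * grad

# For a quadratic phi(a) = 0.5*a.D.a - d.a, the gradient at a is D@a - d,
# and a single Newton step lands exactly on the minimum D^{-1} d.
D = np.array([[2.0, 0.0], [0.0, 4.0]])
d = np.array([2.0, 4.0])
a0 = np.array([5.0, -3.0])
a1 = next_estimate(a0, D @ a0 - d, D)
```

For this quadratic the minimum is at (1, 1), and a1 reaches it in one step, which is exactly the claim made for Equation 2.11 when the quadratic model is correct.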
2.7. The Levenberg-Marquardt Method for Nonlinear Least Squares
The general strategy for supervised learning of an input-output mapping is based on
combining a quickly convergent local method with a globally convergent one. We use the
Levenberg-Marquardt method (also known as the Marquardt method) (Marquardt, 1963)
for solving the nonlinear least-squares problem. The Marquardt method is a trust-region
modification of a Gauss-Newton method. Line-search optimization techniques are based
on finding a search direction and moving by an acceptable amount in that direction (step-
length based methods). While in line-search algorithms the direction is maintained and
only the step length is changed, there are alternative strategies based on choosing a step
length first, and then using a full quadratic model to determine the appropriate direction.
These methods are called model-trust-region methods, with the idea that the model is
trusted only within a region that is updated using the experience accumulated during the
search process.
The Marquardt method switches smoothly between the extremes of the Gauss-
Newton method (inverse-Hessian method based on simplifying the computation of the
second derivatives) and the steepest descent or gradient method (model-trust-region
modification). The latter is used far from the minimum, switching continuously to the
former as the minimum is approached. The algorithm for the Marquardt method is
presented in detail in the original paper by Marquardt (1963) and in the text by Press et al.
(1992). The algorithm is presented here in a simple manner for the sake of clarity.
Let us consider a set of n equations in m unknown variables of the form

f_1(x_1, x_2, ..., x_m) = y_1,
f_2(x_1, x_2, ..., x_m) = y_2,
...
f_n(x_1, x_2, ..., x_m) = y_n, (2.13)

where x_j are the unknown variables, y_i are the known values, and f_i are the known
functions. The algorithm seeks to find a set of x that will minimize a user-defined
function, such as the sum of squares error, φ, given by

φ = Σ_{i=1}^{n} (f_i − y_i)². (2.14)

The functional forms of f_i are assumed to be known and the y_i are constant. The
gradient of φ with respect to the parameters x has components

∂φ/∂x_k = 2 Σ_{i=1}^{n} (f_i − y_i)·(∂f_i/∂x_k), (2.15)

where k = 1, 2, ..., m. The gradient is zero at the φ minimum.

Taking an additional partial derivative gives

∂²φ/∂x_k∂x_l = 2 Σ_{i=1}^{n} [(∂f_i/∂x_k)·(∂f_i/∂x_l) + (f_i − y_i)·(∂²f_i/∂x_k∂x_l)]. (2.16)

Removing the factors of 2 by defining

β_k = −(1/2)·∂φ/∂x_k

and

α_kl = (1/2)·∂²φ/∂x_k∂x_l,

and making A = D/2 in Equation 2.11, Equation 2.11 can now be rewritten as the set of
linear equations

Σ_{l=1}^{m} α_kl·δx_l = β_k. (2.17)

This set can be solved for the increments δx_l that, when added to the current estimates,
give the next estimates. In the context of least-squares, the matrix A, equal to one-half
times the Hessian matrix, is usually called the curvature matrix.

The steepest descent formula, given in Equation 2.12, translates to

δx_l = c·β_l. (2.18)
Note that the components α_kl of the Hessian matrix A (Equation 2.16) depend on
both the first and second derivatives of the basis functions with respect to their
parameters. The second derivatives occur because the gradient (Equation 2.15) already
has a dependence on ∂f_i/∂x_k, so the next derivative must contain terms involving
∂²f_i/∂x_k∂x_l. The second derivative term can be omitted when it is zero, or small enough to
be negligible when compared to the term involving the first derivatives. It also has the
additional possibility of being negligibly small in practice: the term multiplying the second
derivative in Equation 2.16 is (f_i − y_i). For a successful model, this term should be the
random measurement error of each point. The error can have either sign, and should, in
general, be uncorrelated with the model. Therefore, the second derivative terms tend to
cancel out when summed over i, giving a new definition for α_kl as

α_kl = Σ_{i=1}^{n} (∂f_i/∂x_k)·(∂f_i/∂x_l). (2.19)
Marquardt (1963) developed an elegant method for varying smoothly between the
extremes of the inverse-Hessian method (Equation 2.17) and the steepest descent method
(Equation 2.18). The method is based on two simple, but important, insights. There is no
clue about the order of magnitude or the scale of the constant c in Equation 2.18. The
gradient only gives the slope, and does not give the extent of that slope. Marquardt's first
insight is that the components of the Hessian matrix give some information regarding the
order-of-magnitude scale of the problem. If φ is taken to be non-dimensional, and β_k has
the dimensions of 1/x_k, the constant of proportionality between β_k and δx_k must therefore
have the dimensions of x_k². Scanning the components of A yields only one component with
these dimensions, and that is 1/α_kk, the reciprocal of the diagonal element. Dividing the
constant by some non-dimensional factor, λ, to reduce the scale, and with the possibility
of setting λ >> 1 to cut down the step, Equation 2.18 can be rewritten as

δx_l = (1/(λ·α_ll))·β_l,

or

λ·α_ll·δx_l = β_l. (2.20)

It is necessary that α_ll be positive, but, by the definition in Equation 2.19, this is guaranteed.
Marquardt's second insight is that Equations 2.20 and 2.17 can be combined to define
a new matrix A′ such that

α′_jj = α_jj·(1 + λ), and
α′_jk = α_jk, (j ≠ k), (2.21)

and then replace both Equations 2.20 and 2.17 by

Σ_{l=1}^{m} α′_kl·δx_l = β_k. (2.22)

When λ is very large, the matrix A′ is forced into being diagonally dominant, so
Equation 2.22 goes over to being identical to Equation 2.20 (the steepest descent method).
On the other hand, as λ approaches zero, Equation 2.22 goes over to Equation 2.17 (the
inverse-Hessian method). In this manner, the Marquardt method uses the steepest descent
method far from the minimum, switching continuously to the inverse-Hessian method as
the minimum is approached. The Marquardt method works very well in practice and has
become the standard of nonlinear least-squares routines.
The learning rule proposed here is that applicable to a standard nonlinear least-
squares analysis (Hsiung et al., 1991). The entire set of weights is adjusted at once
instead of adjusting them sequentially from the output layer to the input layer. The weight
adjustment is done at the end of each epoch (one exposure of the entire training set to the
network), and the sum of squares of all errors for all patterns is used as the objective
function for the optimization problem. More details on the Marquardt method are found
elsewhere (Battiti, 1992; Press et al., 1992; Henley and Rosen, 1969). A description of the
usage of the computer program that uses the Marquardt method for nonlinear estimation
and equation solving is presented in Appendix B.
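This epoch-wise strategy can be sketched using SciPy's Levenberg-Marquardt implementation on a 2-3-1 network (an illustration of the approach, not the program of Appendix B; the target mapping tanh(x_1·x_2), the data sizes, and the weight initialization range are assumptions made only for this sketch):

```python
import numpy as np
from scipy.optimize import least_squares

def residuals(w, X, D):
    """Residuals D_p - Y_p for every pattern in the training set; the
    optimizer adjusts all 13 weights at once to minimize their sum of
    squares over the whole epoch (bias handled by w[0], w[3], w[6], w[9])."""
    out = []
    for (x1, x2), d in zip(X, D):
        h = np.tanh([w[0] + x1 * w[1] + x2 * w[2],
                     w[3] + x1 * w[4] + x2 * w[5],
                     w[6] + x1 * w[7] + x2 * w[8]])
        y = np.tanh(w[9] + h[0] * w[10] + h[1] * w[11] + h[2] * w[12])
        out.append(d - y)
    return np.array(out)

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(30, 2))
D = np.tanh(X[:, 0] * X[:, 1])           # hypothetical target mapping
w0 = rng.uniform(-0.1, 0.1, size=13)     # small random initial weights
fit = least_squares(residuals, w0, args=(X, D), method="lm")
```

Note that method="lm" requires at least as many residuals as parameters (30 patterns versus 13 weights here); fit.x holds the trained weights.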
2.8. Examples Using the Marquardt Method for Training Neural Networks
In this section we compare the performance of two training algorithms: the
Marquardt method and the backpropagation method, with the help of two different
examples from the literature. The first example is a simple pattern recognition problem, and
the second one involves mapping of a nonlinear function that was presented by Namatame
and Kimata (1989).
2.8.1. The "Iris" Classification Problem
It is desired to classify three different types of iris flowers into Class A, B, or C based
on four common attributes, say X_1, X_2, X_3, and X_4. The training set comprises 75 data
points with four inputs and three outputs. The outputs are either "0" or "1." Only one of
the three outputs for any given set of inputs (X_1, X_2, X_3, and X_4) can be "1," with the other
two outputs being "0," implying that each flower belongs to a unique class (A, B, or C).
The test set also consists of 75 data points different from the ones in the training set.
A 4-4-3 network was trained using the Marquardt method, and was able to classify all
75 patterns without any error in 25 iterations through the nonlinear optimizer. Of the 75
data points in the test set, the network was able to classify 72 patterns correctly, yielding a
96% correct classification. Another 4-4-3 network was trained using a backpropagation
algorithm with fixed learning and momentum rates of 0.1 and 0.5, for 1000 data
presentations. The "backprop network" was unable to make a "clean" classification by
producing outputs of "0" or "1." In order to enable some comparison, it was decided to
consider the largest of the three outputs to correspond to "1," and the others as "0." With
this ad hoc aid to speed up the classification process, it was found that the "backprop
network" was able to classify all of the 75 patterns in the training set correctly. For the
test set, the network was able to classify 72 out of the 75 patterns correctly, giving once
again a 96% correct classification.
In the case of the backpropagation network, training had to be re-started several times
before "acceptable progress" in the decrease of the normalized root mean square error was
noticed. Each time, the learning rate and momentum rate had to be changed by trial-and-error
till "good" values were found. In comparison, the Marquardt algorithm always gave
"clean" classifications by producing outputs that were either "0" or "1," and always
converged to 100% classification in 25 iterations or less. The initial weights were selected
at random to be small positive and negative values in the range ±0.1. Also, the inputs and
outputs were scaled using the same scaling function for both training methods.
2.8.2. Mapping a Nonlinear Function
Hsiung et al. (1991) used successive quadratic programming for a nonlinear optimizer
on a function reported by Namatame and Kimata (1989) to demonstrate the relative
effectiveness of the nonlinear optimization strategy over backpropagation for training
neural networks. The function that was mapped is given by
y = (1/20)·(12 + 3x − 3.5x² + 7.2x³)·(1 + cos 4πx)·(1 + 0.8 sin 3πx). (2.23)
Hsiung et al. (1991) considered mapping y as a function of five inputs: x, x², x³,
cos(πx), and sin(πx). Sixty random values of x in the range 0 to 1 were chosen, and the
other four inputs (x², x³, cos(πx), sin(πx)) were then computed. The output y was
calculated for each value of x. The test data set was made up in a similar manner and
comprised 40 different points. Hsiung et al. (1991) report the results of training a fully
connected five-input, three-hidden node, one-output feedforward neural network. A fully
connected network is a network in which every layer is connected to each layer ahead of
it, i.e., the nodes in the input layer are not just connected to all nodes in the hidden layer,
but are also connected to all the nodes in the output layer. Hsiung et al. (1991) used a
successive quadratic programming code¹ with no constraints, and hence it defaulted to the BFGS
algorithm. They report that to reduce the mean square error to 3.17x10-^ (corresponds to
a normalized root mean square error of 2.298x10* ) for 60 training patterns took 200
iterations of the optimizer.
We followed the method suggested by Hsiung et al. (1991) to prepare the training
and test data sets for the function given by Equation 2.23. A 5-4-1 feedforward neural
network was trained using the Marquardt method and was allowed to run for 200
iterations. The normalized root mean square error after 200 iterations was calculated to
be 5.41x10- . Another 5-4-1 network was trained using the backpropagation algorithm
with fixed learning and momentum rates of 0.01 and 0.25, respectively, for 5000 data
presentations after which the normalized root mean square error was reduced to
¹NPSOL from Stanford University, Department of Operations Research.
2.93x10⁻³. Figure 2.5a compares the mapping obtained from the 5-4-1 network trained by
the Marquardt method for the training and test data sets, and the analytical value of the
function. Figure 2.5b shows a similar comparison for the 5-4-1 network trained using the
backpropagation algorithm.
It is to be noted that the normalized root mean square error reported for the network
trained using the backpropagation algorithm is the "best" value obtained after running
several trials. It is by no means an optimum training performance. As mentioned earlier,
the backpropagation algorithm is extremely sensitive to initial values of the weights. It is
also to be noted that we did not use the exact same data set as used by Hsiung et al.
(1991). Also, they used Gaussian transfer functions for all their hidden nodes, and a linear
transfer function for the output node; we used hyperbolic tangent transfer functions for all
the nodes in both networks. The purpose of the comparison is purely to
show that a more robust training algorithm, such as the Marquardt method, handles the
turning points in a complex nonlinear function quite well compared to the
backpropagation algorithm. The comparison with the method used by Hsiung et al.
(1991) is made only from the standpoint of affirming the approach of using optimization as
a tool for training neural networks, and not as a benchmark for the two optimization
techniques.
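The optimization-based training idea discussed above can be illustrated with one Marquardt (Levenberg-Marquardt) iteration for a generic least-squares problem. The finite-difference Jacobian, the damping value, and the toy line-fitting problem below are illustrative assumptions, not the dissertation's actual implementation:

```python
import numpy as np

def marquardt_step(residual, w, lam=1e-3, eps=1e-6):
    # One Levenberg-Marquardt update: solve (J'J + lam*I) dw = -J'r,
    # where J is the Jacobian of the residual vector with respect to the
    # parameters w (estimated here by forward finite differences).
    r = residual(w)
    J = np.empty((r.size, w.size))
    for k in range(w.size):
        w_pert = w.copy()
        w_pert[k] += eps
        J[:, k] = (residual(w_pert) - r) / eps
    A = J.T @ J + lam * np.eye(w.size)
    return w + np.linalg.solve(A, -J.T @ r)

# Toy problem: fit y = a*x + b to noiseless data, one step from a poor guess.
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x + 1.0
residual = lambda w: w[0] * x + w[1] - y
w = marquardt_step(residual, np.array([0.0, 0.0]))
```

Because this toy problem is linear in the parameters, a single damped step lands essentially on the solution; training a network requires many such iterations with an adaptive damping factor.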
Figure 2.5. Mapping a nonlinear function with a neural network. (a) Network trained using the Marquardt method. (b) Network trained using the backpropagation algorithm.
CHAPTER III
STEADY-STATE MODELS FOR DISTILLATION
In the past two decades, a great number of model-based control algorithms have been
proposed to achieve better performance and more robust control. In-depth reviews on
model-based control strategies are presented in the papers by Bequette (1990), Bosley et
al. (1992) and Seborg et al. (1986). All these advanced techniques rely heavily on the
availability of a mathematical model that is a good representation of the dynamics of the
process being controlled. A vast majority of the techniques use linear or nonlinear
dynamic empirical models comprised of past values of the inputs and outputs of the
process. More recently, neural network dynamic models have been used in place of the
conventional empirical dynamic models in model-based control strategies (You and
Nikolaou, 1993; Raich et al., 1991; Bhat and McAvoy, 1990). These control strategies
fall under a general class known as Model Predictive Control (MPC).
Another model-based control technique developed by Lee and Sullivan (1988),
known as Generic Model Control (GMC), uses a controller based on a steady-state
"process inverse" model and a reference system synthesis (Bartusiak et al., 1989) based on
first-order dynamics.
3.1. Process Models and Process Inverse Models
Before discussing the concept of using steady-state models for control purposes, it is
important to differentiate between "process" models and "process inverse" models. A
process model refers to a mathematical equation, or a set of equations, that could
determine the estimated output of the process when given the process inputs. For
instance, in the case of distillation, a process model would predict the compositions of the
overhead and bottom products given the feed flowrate, feed composition, reflux rate,
boilup rate (or steam flowrate to the reboiler), the number of ideal stages, the stage
efficiency, etc. A process inverse model refers to a mathematical equation, or set of
equations, that could determine the values of the manipulated variables that would
produce the desired process outputs. Once again, in the case of distillation, a process
inverse model would predict the reflux rate and boilup rate required to produce the desired
overhead and bottom product compositions, given all other pertinent input data.
Most MPC strategies use both forms of the model: the process model for system
identification, and the process inverse model for the control action. If the process model
happens to be an empirical model, then the same model can be inverted to obtain the
desired control action. If the process model is a neural network model, then a separate
neural network model has to be developed to represent the process inverse.
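The distinction can be illustrated with a deliberately simplified linear steady-state map between the manipulated variables (L, V) and the product compositions (xD, xB). The gain matrix and nominal values below are hypothetical; a real column is nonlinear, so each direction would be a separately trained network, as discussed above:

```python
import numpy as np

# Hypothetical steady-state gains d(xD, xB)/d(L, V) and nominal point.
G = np.array([[0.04, -0.03],
              [0.02, -0.05]])
u_nom = np.array([0.132, 0.243])   # nominal (L, V)
y_nom = np.array([0.92, 0.025])    # nominal (xD, xB)

def process_model(u):
    """Process model: manipulated variables (L, V) -> predicted (xD, xB)."""
    return y_nom + G @ (u - u_nom)

def process_inverse_model(y_desired):
    """Process inverse model: desired (xD, xB) -> required (L, V)."""
    return u_nom + np.linalg.solve(G, y_desired - y_nom)
```

The inverse model here is literally the algebraic inverse of the forward map; for a neural network process model no such closed-form inverse exists, which is why a second network must be trained.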
For chemical process industries, it is highly desirable to use models that predict
directly the manipulated variables that would produce the desired outputs (Bhagat, 1990).
More particularly for chemical process control, the use of a process inverse model to
calculate explicitly the manipulated variables in order to follow a reference system or to
bring the process back to its set-point is extremely appealing. If the process dynamics can
be approximated to be first-order, then the process inverse dynamic models can be
replaced by process inverse steady-state models to obtain the control action. This is
precisely what the GMC strategy is based on. More details on GMC will be presented in
Chapter V.
3.2. Distillation Column Test Cases
It is desired to demonstrate the neural network model-based controllers on dynamic
simulations of two different methanol-water distillation columns: one is a 7-stage column,
and represents an experimental system at Texas Tech University (Pandit et al., 1992); and
the other is a high-purity industrial column (Rhinehart, 1994). The lab-scale column is an
atmospheric column, and produces products with approximately 4-5% impurity in the
overhead product, and 1-2% impurity in the bottom product. The reboiler hold-up in the
lab-column is roughly 30 times more than that in the condensate receiver, and hence
creates vastly different dynamics at the two ends. By contrast, the high-purity column is
an industrial column producing products with less than 1000 parts per million impurity,
and is typical of the "refining" column in a 3-column industrial methanol separation
process (Fruehauf and Mahoney, 1994; Mehta and Pan, 1971; Mehta and Ross, 1970).
These two cases will enable us to perform control studies to evaluate the use of neural
network models in a process model-based control environment.
3.3. Development of the Steady-State Inverse Models for Distillation
The foremost requirement for development of any neural network model is the
availability of data that captures the relationship between the inputs and outputs of the
process. The data for training the neural networks were obtained from steady-state
process simulations for the two methanol-water distillation columns. The steady-state
simulations of the two methanol-water distillation columns were developed using a
commercially-available steady-state process simulation (CAD) package (HYSIM®²). The
feed flowrate, F, the feed composition, z, the overhead composition, x_D, and the bottoms
composition, x_B, were specified, and the steady-state simulators were used to determine the
reflux rate, L, and the boilup rate, V, needed to meet the specifications. The steady-state
process simulations were based on the NRTL thermodynamic model for vapor-liquid
equilibrium (VLE), and a Murphree stage efficiency of 75% for all the stages (chosen
arbitrarily), except the reboiler which is ideal, and both columns use a total condenser.
²HYSIM® is the registered trademark of Hyprotech, Calgary, Canada.
Table 3.1 gives the design specifications and operating conditions for the two methanol-
water distillation columns. "Experiments" performed on the steady-state simulators were
designed to cover a full square (factorial) design.
The training set for the lab column comprised 81 data points, while that for the high-
purity column comprised 375 data points. The 81 data points for the lab column were
obtained by considering three data points for each of the four network inputs: two data
points to mark the minimum and maximum limits in the ranges specified in Table 3.1, and
one data point in between, giving 3×3×3×3 = 81 data points. In the case of the high-purity
column, five data points each for F, x_D, and x_B over the ranges specified in Table 3.1, and
three data points for z were chosen, giving 5×3×5×5 = 375 data points. Two separate four-
input, five-hidden-node, two-output neural networks (abbreviated as 4-5-2 networks)
were trained on data sets representing the lab and high-purity columns. Two separate test
data sets, consisting of 100 data points for the lab column and 250 data points for the
high-purity column, were prepared using the same steady-state process simulations by
considering values for F, z, x_D, and x_B intermediate to the ones used to make up the
training data set. The trained 4-5-2 networks were then tested to see how well they
predict the values for reflux and boilup rates for data in the test set.
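The factorial designs described above can be sketched as follows. The even spacing of the intermediate levels is an assumption (the text specifies only the range endpoints and points in between); the ranges are those of Table 3.1:

```python
from itertools import product

import numpy as np

# Lab column: 3 levels each of F, z, xD, xB -> 3*3*3*3 = 81 training points.
lab_levels = {
    "F":  np.linspace(0.164, 0.588, 3),   # min, midpoint, max of the range
    "z":  np.linspace(0.2,   0.4,   3),
    "xD": np.linspace(0.85,  0.95,  3),
    "xB": np.linspace(0.02,  0.07,  3),
}
lab_grid = list(product(*lab_levels.values()))

# High-purity column: 5 levels of F, xD, xB and 3 levels of z -> 375 points.
hp_grid = list(product(np.linspace(600, 1000, 5),
                       np.linspace(0.1, 0.14, 3),
                       np.linspace(0.9985, 0.9995, 5),
                       np.linspace(0.0005, 0.0015, 5)))
```

Each tuple (F, z, xD, xB) would then be fed to the steady-state simulator to obtain the corresponding (L, V) training targets.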
Figures 3.1a and b show the comparison between the actual (CAD package data) and
the network predicted values for the reflux rate for the training data set and the test data
set, respectively, for the lab column. Figures 3.2a and b show similar comparisons for the
boilup rate for the training and test data sets, respectively, for the lab column. Figures
3.3a and b, and 3.4a and b show the corresponding results for the high-purity column.
From the above comparisons, it can be seen that the neural networks have been able to
capture the operational characteristics of the two distillation columns after the training
process, which took approximately 25 iterations of the nonlinear optimization routine
(approximately 5-10 minutes on a 486/50 MHz PC).
Table 3.1. Design and Operating Conditions for the Two Distillation Columns.

Specifications                              Lab Column            High-Purity Column

CAD Simulation Design Conditions
1. No. of Stages (includes reboiler)        7                     45
2. Feed Stage (from the reboiler)           —                     19
3. Feed Quality                             Subcooled to 120°F    Saturated Liquid
4. Reflux Quality                           Subcooled to 120°F    Subcooled to 120°F
5. Pressure                                 1 atma.               2 atma.
6. Murphree Stage Efficiency                75%                   75%

Nominal Operating Conditions
1. Feed Rate (lbmols/hr)                    0.35                  800
2. Feed Composition
   (mole fraction methanol)                 0.3                   0.12
3. Overhead Product Composition
   (mole fraction methanol)                 0.92                  0.999
4. Bottom Product Composition
   (mole fraction methanol)                 0.025                 0.001
5. Reflux Rate (lbmols/hr)                  0.132                 180
6. Boilup Rate (lbmols/hr)                  0.243                 258
7. Reflux Ratio                             1.17                  1.89

Normal Operating Range
1. Feed Rate (lbmols/hr)                    0.164-0.588           600-1000
2. Feed Composition
   (mole fraction methanol)                 0.2-0.4               0.1-0.14
3. Overhead Product Composition
   (mole fraction methanol)                 0.85-0.95             0.9985-0.9995
4. Bottom Product Composition
   (mole fraction methanol)                 0.02-0.07             0.0005-0.0015
5. Reflux Rate (lbmols/hr)                  0.0321-2.243          132.82-263.64
6. Boilup Rate (lbmols/hr)                  0.0832-2.166          180.06-375.09
Figure 3.1. Reflux rate predictions from neural networks for lab column. (a) Results from training data set. (b) Results from test data set.

Figure 3.2. Boilup rate predictions from neural networks for lab column. (a) Results from training data set. (b) Results from test data set.

Figure 3.3. Reflux rate predictions from neural networks for high-purity column. (a) Results from training data set. (b) Results from test data set.

Figure 3.4. Boilup rate predictions from neural networks for high-purity column. (a) Results from training data set. (b) Results from test data set.
3.4. Optimal Training of Neural Networks
The issue of optimal training of neural networks deals with determination of the
"best" model that solves a given problem. We define "best" as the lowest normalized root
mean squared error based on the test data set. The basic idea is to improve
generalizations, reduce the number of training examples required, and improve speed of
learning and/or classification using a minimum number of hidden nodes. It is analogous to
model parsimony in classical statistical regression parlance. The key issue here is to
realize that as applications become more complex, the networks become larger (i.e., more
connections, and hence, more weights). More importantly, as the number of parameters
increases, overfitting problems may arise, with devastating effects on generalization.
Overfitting describes an excessively close fit made possible by the number of free
parameters (the weights) of the network. As with other methods for function
approximation, such as polynomial regression, too many free parameters will allow the
network to fit the training data arbitrarily closely, but will not necessarily lead to optimal
generalization.
Several methods have been developed and studied to optimize network training (Le
Cun et al., 1991; Weigand et al., 1990). While overfitting is an issue that cannot be
ignored, it becomes of increasing importance when the number of weights is of the order
of the number of training examples. Such networks are referred to as oversized networks.
Also, overfitting becomes more important when the gradient learning process
(backpropagation) is used for weight adjustment. In gradient learning, initially the hidden
units in the network all do the same work, i.e., they all attempt to fit the major features of
the data. As those features are accounted for, the major source of error in the network is
determined by the second most important feature of the training data. The units then start
to differentiate with some of them beginning to fit this second most important aspect of
48
the data. As the process of differentiation continues, the effective number of degrees of
freedom starts to increase. Assuming that sampling error is small relative to other sources
of variation in the data, early network training seeks to fit the significant features of the
data. It is only at later times that the network tries to fit the noise.
A solution to stop overfitting is to stop training just before the network starts to fit
the sampling noise. We followed the technique proposed by Weigand et al. (1990) which
uses a separate validation data set to guide when to stop training. The validation data set
can be made up of an arbitrary number of data points from the training data set, say 10%
of the points in the training data set. At the end of each epoch (each time a new set of
weights has been determined), the validation data set is presented to the network, and the
prediction error on the validation set is obtained. The data points selected to be in the
validation set are not a part of the training data set anymore, and are not seen by the
network while training. Training is stopped when the normalized root mean square error
on the validation data set starts to increase.
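A minimal sketch of this validation-based stopping rule, assuming a generic per-epoch `train_step` routine and an error function `nrmse` evaluated on the held-out validation set (both names, and the one-epoch patience, are illustrative):

```python
def train_with_early_stopping(train_step, nrmse, max_epochs=500, patience=1):
    # train_step(None) initializes the weights; train_step(w) performs one
    # epoch and returns updated weights. nrmse(w) is the validation-set
    # normalized root mean square error. Training stops once the
    # validation error has risen for more than `patience` epochs, and the
    # best weights seen so far are returned.
    w = train_step(None)
    best_err, best_w, worse = nrmse(w), w, 0
    for _ in range(max_epochs):
        w = train_step(w)
        err = nrmse(w)
        if err < best_err:
            best_err, best_w, worse = err, w, 0
        else:
            worse += 1
            if worse > patience:   # validation error increasing: stop
                break
    return best_w
```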
However, with the optimization technique used for determining the new set of weights,
the presence or absence of the validation set did not make much difference to the overall
network prediction characteristics. It is our opinion that, with a more robust learning tool,
overfitting was not a serious problem as long as the network architecture was selected
to ensure training errors are reduced in a reasonable number of passes through the training
data set. Other network configurations with three, four, five, six, and seven hidden nodes
were also trained, but the network with five hidden nodes gave the best overall
performance based on the normalized root mean squared error for the test data set.
CHAPTER IV
DYNAMIC PROCESS SIMULATIONS
To study and compare the performance of the neural network model-based
controllers with conventional PI controllers, the controllers have to be implemented on the
two methanol-water distillation columns. Before implementing the neural network model-
based controllers on the "real" process systems, it is advisable to perform the tests on
dynamic simulations of the "real" processes. Dynamic simulations facilitate better
understanding of the dynamic behavior of the processes, and provide insights into the
nature of the interactions between the inputs and the outputs. Also, performing the tests
on dynamic simulators enables studying the control issues without interference from other
operational aspects such as safety, economics, etc. Dynamic simulators for the two
methanol-water distillation columns were developed from first principles based on the
design data given in Table 3.1. The dynamic simulator for each distillation column is a
tray-to-tray formulation based on the multicomponent distillation structure developed by
Luyben (1990), and involves solving ordinary differential equations and algebraic
relationships on each stage.
4.1. Mathematical Model for Nonideal Multicomponent Distillation
The mathematical model for the nonideal multicomponent distillation is based on the
following assumptions:
(1) One fixed feed plate is used to introduce the vapor and liquid feed regardless of the
feed or operating conditions.
(2) Pressure is constant and known on each tray.
(3) Coolant and heating media dynamics are negligible in the condenser and the reboiler,
respectively.
(4) The condenser is a total condenser.
(5) Liquid hydraulics are calculated from the Francis weir formula (Luyben, 1990).
(6) Perfect level control in the reflux drum and the reboiler allows a constant holdup in
the reflux drum and reboiler by changing flowrates of the bottoms product, B, and
liquid distillate product, D.
(7) The dynamic response of the internal energies on the trays is much faster than the
composition or total holdup changes, and therefore the energy balances on each tray are
just algebraic.
(8) Reflux rate, L, and boilup rate, V, are the manipulated variables.
(9) An empirically-correlated polynomial equation obtained from regressing experimental
data is used for thermodynamic VLE.
(10) A single Murphree stage efficiency is used for all the stages, except the reboiler which
is ideal.
Consider the 'i'th stage in an N-stage distillation column separating a feed containing n_c
components, as shown in Figure 4.1. Let the 'i'th stage represent the feed stage, which
allows for a feed containing both vapor and liquid fractions. The equations describing the
time-domain behavior on this stage comprise essentially an overall material
balance, component material balances, an energy balance, and the thermodynamic
equilibrium.
4.1.1. Overall Material Balance (one per stage)
The overall material balance on the feed stage can be written as

dM_i/dt = L_{i+1} + F_i^L + V_{i-1} + F_{i-1}^V - L_i - V_i,    (4.1)
Figure 4.1. Schematic of a distillation column with details on the 'i'th stage.
where M_i is the liquid holdup (lbmoles) on the 'i'th stage; L_i and L_{i+1} are the flowrates of
the liquid leaving the 'i'th and 'i+1'th stages, respectively; V_i and V_{i-1} are the flowrates of the
vapor leaving the 'i'th and 'i-1'th stages, respectively; F_i^L is the flowrate of the liquid
fraction of the feed entering on the 'i'th stage; and F_{i-1}^V is the flowrate of the vapor fraction
of the feed entering on the 'i-1'th stage.
4.1.2. Component Material Balance (n_c - 1 per tray)
The individual component balance on the feed stage can be written as
d(M_i x_{i,j})/dt = L_{i+1} x_{i+1,j} + F_i^L x_{i,j}^F + V_{i-1} y_{i-1,j} + F_{i-1}^V y_{i-1,j}^F - L_i x_{i,j} - V_i y_{i,j},    (4.2)
where x_{i,j} and x_{i+1,j} are the compositions of the jth component in the liquid leaving the 'i'th
and 'i+1'th stages, respectively; y_{i,j} and y_{i-1,j} are the compositions of the jth component in
the vapor leaving the 'i'th and 'i-1'th stages, respectively; x_{i,j}^F is the composition of the jth
component in the liquid fraction of the feed entering the 'i'th stage; and y_{i-1,j}^F is the
composition of the jth component in the vapor fraction of the feed entering on the 'i-1'th
stage.
4.1.3. Energy Balance (one per stage)
The energy balance on the feed stage can be written as
d(M_i h_i)/dt = L_{i+1} h_{i+1} + F_i^L h_i^F + V_{i-1} H_{i-1} + F_{i-1}^V H_{i-1}^F - L_i h_i - V_i H_i,    (4.3)
where h_i and h_{i+1} are the enthalpies of the liquid leaving the 'i'th and 'i+1'th stages,
respectively; H_i and H_{i-1} are the enthalpies of the vapor leaving the 'i'th and 'i-1'th stages,
respectively; h_i^F is the enthalpy of the liquid fraction of the feed entering the 'i'th stage;
and H_{i-1}^F is the enthalpy of the vapor fraction of the feed entering on the 'i-1'th stage.
4.1.4. Thermodynamic Equilibrium (n_c per tray)
The thermodynamic vapor-liquid equilibrium is given by the functional dependence
that can be expressed as
y*_{i,j} = f(P_i^s, P_T, x_{i,j}),    (4.4)
where y*_{i,j} is the composition of the jth component in the vapor phase in equilibrium with
the jth component in the liquid phase for the 'i'th stage; P_i^s is the saturation vapor
pressure at temperature T_i; and P_T is the total system pressure.
Equations 4.1, 4.2, and 4.3 are applicable to any stage in the distillation column. If
the 'i'th stage under consideration is not the feed stage, then the contributions due to the
feed are neglected. In terms of the dynamic process behavior, the liquid rates throughout
the column are not the same (assumption #5). They depend on the fluid mechanics of the
tray, and often a simple relationship such as the Francis weir formula can be used to relate
the liquid holdup on the stage, M_i, to the liquid flowrate leaving the tray, L_i. The Francis
weir formula is given as
Q_L = 3.33 l_w h_ow^{3/2},    (4.5)
where Q_L is the liquid flowrate over the weir (ft³/s), l_w is the length of the weir (ft),
and h_ow is the height of the liquid over the weir (ft). More rigorous relationships can be
obtained by considering detailed tray hydraulics to include effects of vapor flowrate,
densities, composition, etc.
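As a direct transcription of the Francis weir formula given above:

```python
def francis_weir_flow(l_w, h_ow):
    # Equation 4.5: liquid flowrate over the weir (ft^3/s) from the weir
    # length l_w (ft) and the height of liquid over the weir h_ow (ft).
    return 3.33 * l_w * h_ow ** 1.5
```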
A Murphree vapor-phase efficiency is used to describe the departure from equilibrium
(assumption #10) and is given as
E_{i,j} = (y_{i,j} - y_{i-1,j}) / (y*_{i,j} - y_{i-1,j}),    (4.6)
where y_{i,j} is the actual composition of the vapor leaving the 'i'th stage; y_{i-1,j} is the actual
composition of the vapor leaving the 'i-1'th stage; and E_{i,j} is the Murphree vapor efficiency
for the jth component on the 'i'th stage.
The reboiler is considered to be the first stage and is an ideal stage (100% efficiency),
and the condenser is a total condenser and hence, it is not an equilibrium stage. For
calculation purposes, we designate the condenser as the 'N+1'th stage. The equations
describing the reboiler and the condenser are slightly different due to the perfect level
control assumption (assumption #6). Perfect level control assumes that the holdup in the
condenser and reboiler is constant and does not change, and therefore

dM_1/dt = 0

and

dM_{N+1}/dt = 0.
Under this assumption, the overall material balance in the reboiler then becomes an
algebraic equation that can be written as
L_2 - L_1 - V_1 = 0.    (4.7)
Similarly, for the condenser, the overall material balance gives
V_N - L_{N+1} - D = 0.    (4.8)
The reflux and boilup rates are the two variables that have to be specified by the operator
(see discussion in Section 1.2 for the degrees-of-freedom analysis). Knowing the reflux
and boilup rates, Equations 4.7 and 4.8 can be solved explicitly to calculate the overhead
and bottom product draw rates, D and L_1 (commonly denoted as B). The component
balances still remain ordinary differential equations that need to be solved to determine the
rate of composition change for the overhead and bottom products.
Also, assumption #7 allows us to substitute the differential equation for the energy
balance with an algebraic relationship that can be solved explicitly on each stage to
determine the vapor flowrate leaving each stage inside the column. Therefore, for any
general stage 'i', taking feed into account, Equation 4.3 becomes

V_i = (L_{i+1} h_{i+1} + F_i^L h_i^F + V_{i-1} H_{i-1} + F_{i-1}^V H_{i-1}^F - L_i h_i) / H_i.    (4.9)
Empirically correlated polynomial equations were used for the thermodynamic vapor-
liquid equilibrium. For the lab-column, empirical correlations were obtained by regressing
the experimental data for a methanol-water system at 1 atmosphere absolute pressure
(Henley and Seader, 1981) to obtain polynomial relationships for vapor-phase
composition, liquid- and vapor-phase enthalpies, and temperature as a function of liquid-
phase composition. For the industrial methanol-water column, similar empirical
polynomial correlations were obtained, but the data for VLE, liquid and vapor enthalpies,
and temperature were obtained from HYSIM® for a methanol-water system with the
NRTL thermodynamic model. Details for the empirical correlations are presented in
Appendix C.
The dynamic simulations give the response of the overhead and bottom product
compositions from the distillation columns under various operating conditions. The
differential equations given in Equations 4.1 and 4.2 were integrated with respect to time
using an explicit Euler integrator (Riggs, 1994) along with the algebraic energy balance
and the thermodynamic VLE correlations. The accuracy of the explicit Euler integrator
was checked against a more rigorous fourth-order Runge-Kutta integrator (Riggs, 1994).
It was found that the explicit Euler integrator yielded performance comparable to the
fourth-order Runge-Kutta integrator, and hence, the explicit Euler integrator was used for
the dynamic simulations of both distillation columns.
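The explicit Euler scheme named above amounts to the update x ← x + Δt·(dx/dt) at each step; a generic sketch (not the simulator's actual code), demonstrated on a simple first-order response with an illustrative time constant:

```python
import numpy as np

def euler_integrate(deriv, state, dt, n_steps):
    # Explicit Euler: advance the state vector (holdups, compositions)
    # by dt times its time derivative, n_steps times.
    for _ in range(n_steps):
        state = state + dt * deriv(state)
    return state

# Example: dx/dt = (x_ss - x)/tau toward x_ss = 1 with tau = 2,
# integrated over 10 time units.
deriv = lambda x: (1.0 - x) / 2.0
x_end = euler_integrate(deriv, np.array([0.0]), dt=0.01, n_steps=1000)
```

With a sufficiently small step, the Euler result tracks the analytical solution 1 - exp(-t/τ) closely, which mirrors the comparison against the Runge-Kutta integrator described above.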
4.2. Additional Features of the Dynamic Process Simulators
First-order autoregressive drifts were added to all process inputs (F, z) to create
disturbances, and Gaussian noise was added to all measured variables (F, z, x_D, x_B, D, B,
L, V) to simulate instrument noise. Unmeasured process disturbances have a great effect
on the process behavior, and can affect the controllability of the process. Almost all
unmeasured disturbances in a distillation column can be simulated by changing stage
efficiencies. Accordingly, another first-order autoregressive drift was added to the
Murphree stage efficiency to simulate unmeasured process disturbances. The Gaussian
noise and autoregressive drifts provide realism to the simulated data. Also, all measured
data are filtered through a first-order filter before use in any calculation or historical
trending, thus introducing a dynamic lag. Nominal Murphree stage efficiencies of 80%
and 85% were used in the dynamic simulators for the lab column and the high-purity
column, respectively. In addition, a 5-minute analyzer delay was added to all the
composition measurements (z, x_D, and x_B) for the high-purity column. Since the purity
levels in the lab column are not high, the lab column uses temperature to infer
compositions and, hence, no analyzer delays were used.
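A sketch of the disturbance and filtering machinery described above, with illustrative parameter values (the dissertation does not give the drift coefficient, noise levels, or filter constant):

```python
import numpy as np

rng = np.random.default_rng(42)

def ar1_drift(n, phi=0.95, sigma=0.01):
    # First-order autoregressive drift d_k = phi*d_{k-1} + w_k, used to
    # perturb the process inputs (and the stage efficiency).
    d = np.zeros(n)
    for k in range(1, n):
        d[k] = phi * d[k - 1] + rng.normal(scale=sigma)
    return d

def first_order_filter(raw, alpha=0.2):
    # First-order filter applied to each measurement before use,
    # introducing the dynamic lag mentioned above.
    filt = np.empty_like(raw)
    filt[0] = raw[0]
    for k in range(1, raw.size):
        filt[k] = alpha * raw[k] + (1 - alpha) * filt[k - 1]
    return filt

# Noisy, drifting feed-rate "measurement" around the lab-column nominal value.
measured = 0.35 + ar1_drift(200) + rng.normal(scale=0.005, size=200)
filtered = first_order_filter(measured)
```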
4.3. Open-Loop Response Characteristics of the Processes
The open-loop response of any process enables the study of the degree of
nonlinearity, nonstationarity, and level of interaction between the various inputs and
outputs of the process from both a quantitative as well as a qualitative viewpoint. Open-
loop studies, typically, involve changing one of the input variables by a known amount
while keeping all other inputs about their base case values, and noting the response of the
process outputs over a period of time to study the effect of the change. The dynamic
simulators were used to study the open-loop responses of the two distillation columns.
4.3.1. Open-Loop Responses for the Lab-Column
The open-loop characteristics of the lab-column were studied by making ±10%
changes in F, z, L, and V, one variable at a time, from their corresponding base case values
shown in Table 3.1, and noting the response of the overhead and bottom product
compositions.
Figure 4.2a shows the response of the overhead and bottom product compositions to
the ±10% change in boilup rate; Figure 4.2b shows the sequence of the '+' and '-' 10%
changes in boilup rate and the essentially "constant" reflux rate during the period of the
test; Figure 4.2c shows the variation in the feed flowrate, feed composition, and the
Murphree stage efficiency (the disturbances) affecting the process during the same time
period. The nonlinear nature of the process is observed from the fact that the magnitude
of the change in the overhead and bottom product compositions for the +10% change in
boilup rate is vastly different from that due to the -10% change. Also, the nonstationary
behavior is noticed from the fact that the overhead and bottom compositions do not
necessarily return to the "same" base case values when the boilup rate is brought back to
its base case value.
Figures 4.3a-c show similar results for a ±10% change in the reflux rate;
Figures 4.4a-c show the results from a ±10% change in the feed flowrate; and
Figures 4.5a-c, the results for a ±10% change in the feed composition. The "seed" for the
pseudo-random number generator used in the algorithm for the autoregressive drift and
Gaussian noise was set to a different value for each open-loop test to study the influence
of random changes in the disturbances on the open-loop responses.
Figure 4.2. Open-loop response to boilup rate changes in lab column. (a) Overhead and bottom product compositions. (b) Reflux and boilup rates. (c) Process disturbances.

Figure 4.3. Open-loop response to reflux rate changes in lab column. (a) Overhead and bottom product compositions. (b) Reflux and boilup rates. (c) Process disturbances.

Figure 4.4. Open-loop response to feed flowrate changes in lab column. (a) Overhead and bottom product compositions. (b) Reflux and boilup rates. (c) Process disturbances.

Figure 4.5. Open-loop response to feed composition changes in lab column. (a) Overhead and bottom product compositions. (b) Reflux and boilup rates. (c) Process disturbances.
4.3.2. Open-Loop Responses for the High-Purity Column
The high-purity column produces products with less than 1000 parts per million
impurities, and its operation is more nonlinear than the lab-column. Therefore, the open-
loop step tests were performed by making ±1% changes in F, z, L, and V, one variable at a
time, from their corresponding base case values shown in Table 3.1.
Figure 4.6a shows the response of the overhead and bottom product compositions to
the ±1% change in the boilup rate; Figure 4.6b shows the sequence of the '+' and '-' 1%
changes in boilup rate and the essentially "constant" reflux rate during the period of the
test; Figure 4.6c shows the variation in the feed flowrate and feed composition (the
measured disturbances) affecting the process during the same time period; and Figure
4.6d, the variations in the Murphree stage efficiency, the unmeasured process disturbance.
Once again the nonlinear and nonstationary behavior of the process is easily noticeable
from the open-loop response.
Figures 4.7a-d show the results from the ±1% change in the reflux rate; Figures 4.8a-
d, the results from the ±1% change in the feed flowrate; and Figures 4.9a-d, the results
from the ±1% change in the feed composition.
4.4. Steady-State Analyses of Distillation Column Operation
The open-loop responses provide a qualitative picture of the process behavior in
terms of the nonlinearity and the extent of interaction between the inputs and outputs.
The process nonlinearity and degree of interaction influence the level of difficulty of a
control problem. Interactions arise whenever the control problem is multivariable, and
when each manipulated variable affects more than one controlled variable. While the
dynamic behavior of a process is of great importance to the selection of the control
strategy, oftentimes, a steady-state analysis of the process can yield valuable insights about
the nonlinearity and degree of interaction involved.
Figure 4.6. Open-loop response to boilup rate changes in high purity column, (a) Overhead and bottom product compositions.
Figure 4.6. Continued, (b) Reflux and boilup rates.
Figure 4.6. Continued, (c) Measured process disturbances.
Figure 4.6. Continued, (d) Unmeasured process disturbance.
Figure 4.7. Open-loop response to reflux rate changes in high-purity column, (a) Overhead and bottom product compositions.
Figure 4.7. Continued, (b) Reflux and boilup rates.
Figure 4.7. Continued, (c) Measured process disturbances.
Figure 4.7. Continued, (d) Unmeasured process disturbance.
Figure 4.8. Open-loop response to feed flowrate changes in high-purity column, (a) Overhead and bottom product compositions.
Figure 4.8. Continued, (b) Reflux and boilup rates.
Figure 4.8. Continued, (c) Measured process disturbances.
Figure 4.8. Continued, (d) Unmeasured process disturbance.
Figure 4.9. Open-loop response to feed composition changes in high-purity column, (a) Overhead and bottom product compositions.
Figure 4.9. Continued, (b) Reflux and boilup rates.
Figure 4.9. Continued, (c) Measured process disturbances.
Figure 4.9. Continued, (d) Unmeasured process disturbance.
The most important piece of information from a steady-state viewpoint is the process
gain, which is defined as the ratio of the magnitude of the change in an output variable
with respect to the magnitude of the change in any given input variable, the changes being
calculated from some base case value, when all other inputs are held constant. In essence,
all the open-loop responses shown in Figures 4.2a-4.9a give a qualitative picture of the
steady-state process gains. For example, in Figure 4.2a, the overhead product composition steady-state process gain for the +10% change in the boilup rate, K_{p,xD}^{+V}, is calculated as

K_{p,xD}^{+V} = (x_{D,BC} - x_{D,+10%V}) / (V_{BC} - V_{+10%}),

i.e.,

K_{p,xD}^{+V} = (0.87615 - 0.77424) / (0.243 - 0.2673) = -4.1938,

and the steady-state process gain for the overhead product composition for the -10% change in boilup rate, K_{p,xD}^{-V}, is calculated as

K_{p,xD}^{-V} = (x_{D,BC} - x_{D,-10%V}) / (V_{BC} - V_{-10%}) = (0.87615 - 0.90910) / (0.243 - 0.2187) = -1.3560.
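The gain calculation above can be sketched directly; a minimal example using the numbers quoted for Figure 4.2a (the function and variable names here are illustrative, not from the original work):

```python
# Steady-state process gain from an open-loop step test: the change in the
# output divided by the change in the input, both measured from a base case.
def process_gain(y_base, y_step, u_base, u_step):
    return (y_base - y_step) / (u_base - u_step)

# +10% boilup change: overhead composition gain
Kp_plus = process_gain(0.87615, 0.77424, 0.243, 0.2673)
# -10% boilup change
Kp_minus = process_gain(0.87615, 0.90910, 0.243, 0.2187)

print(round(Kp_plus, 4), round(Kp_minus, 4))
```

The asymmetry between the two values (roughly -4.2 versus -1.4) is the nonlinearity the text describes.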
The steady-state process gain for the bottom composition can be calculated for the ±10% change in the boilup rate in a similar manner. Table 4.1 shows the steady-state process gains for the overhead and bottom product compositions in the lab column, calculated for the ±10% changes in L, V, F, and z, along with the average values for the steady-state process gains. Table 4.2 shows similar results for the high-purity column. Both processes show nonlinear behavior with up to 100% process gain changes over the ±10% and ±1% ranges. Tables 4.3 and 4.4 show the first-order plus dead time models for the open-loop responses for the lab column and the high-purity distillation column
Table 4.1. Process Gains for Overhead and Bottom Compositions for the Lab Distillation Column

         K_p,xB                   K_p,xD
     +10%   -10%   Ave.      +10%   -10%   Ave.
L    3.03   1.88   2.46      2.21   3.93   3.07
V   -1.50  -3.59  -2.55     -4.20  -1.36  -2.78
F    0.15   0.79   0.47      0.40   1.37   0.89
z    1.29   0.76   1.03      0.71   1.24   0.97
Table 4.2. Process Gains for Overhead and Bottom Compositions for the High-Purity Distillation Column

          K_p,xB                         K_p,xD
      +1%     -1%     Ave.        +1%     -1%     Ave.
L    9.1e-4  3.5e-4  6.3e-4      2.7e-4  5.2e-3  2.7e-3
V   -3.9e-4 -1.4e-3 -8.8e-4     -5.2e-3 -2.0e-4 -2.7e-3
F    1.7e-4  9.3e-5  1.3e-4      2.7e-5  5.0e-5  3.9e-5
z    9.6e-1  3.1e-1  6.3e-1      2.6e-1  2.6e+0  1.4e+0
Table 4.3. First-Order Plus Deadtime Models for Overhead and Bottom Compositions for the Lab Distillation Column

     Bottom Composition     Overhead Composition
L    2.5/(0.315s+1)         3.1/(0.172s+1)
V   -2.5/(0.229s+1)        -2.7/(0.186s+1)
F    1.0/(0.286s+1)         0.9/(0.272s+1)
z    1.0/(0.329s+1)         0.9/(0.243s+1)
Table 4.4. First-Order Plus Deadtime Models for Overhead and Bottom Compositions for the High-Purity Distillation Column

     Bottom Composition                  Overhead Composition
L    0.00063 e^(-0.149s)/(0.568s+1)      0.00275 e^(-0.189s)/(0.703s+1)
V   -0.00088 e^(-0.078s)/(0.313s+1)     -0.0027 e^(-0.282s)/(1.493s+1)
F    0.00013 e^(-0.054s)/(0.297s+1)      0.000039 e^(-0.157s)/(0.799s+1)
z    0.63/(0.297s+1)                     1.45 e^(-0.189s)/(1.811s+1)
expressed in terms of the average values for the steady-state process gains, open-loop time constants, and the dead times.
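A first-order plus dead time entry from these tables can be exercised directly; the sketch below assumes the Table 4.4 entry for the bottom composition response to reflux changes is read as K = 0.00063, theta = 0.149 h, tau = 0.568 h (that reading of the transcribed transfer function is an assumption, as is the function name):

```python
import math

# Analytical step response of a first-order-plus-deadtime model:
# y(t) = K * (1 - exp(-(t - theta)/tau)) * du for t >= theta, else 0.
def fopdt_step(t, K, tau, theta, du=1.0):
    if t < theta:
        return 0.0
    return K * (1.0 - math.exp(-(t - theta) / tau)) * du

K, tau, theta = 0.00063, 0.568, 0.149
y_final = fopdt_step(50.0, K, tau, theta)  # settles near K for a unit step
```

Before the dead time elapses the output has not moved, and for a unit input step it settles at the steady-state gain K.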
Another tool for steady-state analyses of process systems is the relative gain array (RGA) and its use in analyzing control loop interactions (Bristol, 1966). The original RGA development involved steady-state considerations only. Since then, however, the analyses have been extended to include dynamic considerations to study control system stability and design (McAvoy, 1981). The RGA is a matrix of numbers, Λ, where each element λ_ij represents the ratio of the steady-state gain between the ith controlled variable and the jth manipulated variable when all other manipulated variables are constant to the steady-state gain between the same two variables when all the other controlled variables are constant. More details on the properties and calculations involved in determining the elements of the RGA are found elsewhere (McAvoy, 1983; Luyben, 1990).
Tables 4.5 and 4.6 show the RGAs for the lab column and the high-purity column calculated using the average values for the overhead and bottom product steady-state process gains for the reflux and boilup rate changes. The elements of the RGA can vary from very large negative values to very large positive values. The closer an element is to 1.0, the less difference closing the other loop makes on the loop under consideration, implying less interaction. In the two cases considered presently, it can be seen that the elements of the RGA indicate a strong interaction between the controlled variable-manipulated variable pairs. The large values for λ_ij in the RGAs are rather typical for the chosen controlled variable-manipulated variable pairings.
The open-loop responses, the process gains, and the RGAs provide a qualitative and quantitative assessment of the nonlinear, nonstationary, and interactive nature of the two distillation columns.
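For a 2x2 system the RGA follows from a single relative gain; a minimal sketch using the rounded average lab-column gains from Table 4.1 (x_D paired with L, x_B with V). Because these gains are rounded, the result reproduces the magnitude of Table 4.5 but not its exact digits:

```python
# Relative gain array for a 2x2 gain matrix K (rows: outputs, cols: inputs).
# lambda_11 = K11*K22 / (K11*K22 - K12*K21); each row and column sums to 1.
def rga_2x2(K):
    K11, K12 = K[0]
    K21, K22 = K[1]
    lam11 = (K11 * K22) / (K11 * K22 - K12 * K21)
    return [[lam11, 1.0 - lam11], [1.0 - lam11, lam11]]

# Rounded average gains (Table 4.1): rows [x_D, x_B], columns [L, V].
K = [[3.07, -2.78],
     [2.46, -2.55]]
lam = rga_2x2(K)
```

The computed λ_11 of about 7.9 confirms the strong interaction the text reports for the energy balance pairing.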
Table 4.5. Relative Gain Array for the Lab Distillation Column using the Average Process Gains

         L       V
x_D     7.86   -6.86
x_B    -6.86    7.86
Table 4.6. Relative Gain Array for the High-Purity Distillation Column using the Average Process Gains

         L         V
x_D     3.3658   -2.2658
x_B    -2.2658    3.3658
CHAPTER V
MODEL-BASED CONTROL STRATEGY
The inherent nonlinearities in the behavior of chemical process systems, such as
distillation columns, present a challenging control problem. In spite of this knowledge,
chemical processes have traditionally used linear system analysis and tools for design of
controller structure because the demands for linear system analysis and implementation are
usually quite small. Also, the fact that there is an analytical basis for the linear systems
theory lends itself to more rigorous stability and performance proofs. However, the use of
linear system techniques can be quite limiting if the process behavior is highly nonlinear.
During the past decade, there has been a significant increase in the number of control system techniques that are based on nonlinear system concepts (Bequette, 1991). Model-based controllers are not a new concept; the Ziegler-Nichols tuning rules utilizing a process response curve for identification of the tuning parameters (K_C, τ_I, τ_D) of a PID controller are based on the model parameters of a first-order plus dead time model (K_p, τ, θ).
Some of the most significant developments to model-based control include algorithms
that use linear models such as Dynamic Matrix Control (DMC) (Prett and Garcia, 1988;
Cutler and Ramaker, 1980), Model Algorithmic Control (MAC) (Richalet et al., 1978),
Internal Model Control (IMC) (Garcia and Morari, 1982), and some of their extensions
that used nonlinear models. An in-depth review of some of the above techniques and their
related extensions is available in the papers by Bosley et al. (1992) and Bequette (1991).
The above techniques are all similar in the sense that they rely on dynamic models to predict the behavior of the process over some future time interval, and control actions are based on these model predictions. These techniques are, therefore, classified under the
broad category of Model Predictive Control (MPC). Another technique that uses
steady-state models with a reference system based on first-order dynamics has also been
studied and implemented extensively (Lee, 1993; Pandit et al., 1992; Ramchandran et al.,
1992; Rhinehart and Riggs, 1990; Riggs and Rhinehart, 1990; Lee and Sullivan, 1988).
We shall examine this technique in more detail.
5.1. Nonlinear Process Model-Based Control (Nonlinear PMBC)
The basis for nonlinear PMBC lies in the concept of what is called Generic Model Control (GMC) (Lee and Sullivan, 1988), and its closely aligned relatives known as Reference System Synthesis (RSS) (Bartusiak et al., 1988) and Internal Decoupling (Balchen et al., 1988). The strategy is to find values of the manipulated variables that force a model of the process to follow a desired reference system or trajectory.
Consider a dynamic model of a process described by a set of differential equations:

\dot{y} = f(y, u, d, p, t),  (5.1)

where \dot{y} is the change in the process outputs with respect to time, t; f is some nonlinear function; y, the vector of process outputs of dimension n; u, the vector of manipulated variables of dimension m; d, the vector of process disturbances of dimension l; and p, the vector of model parameters of dimension q. In this simplified case, we have considered a square system, i.e., the number of outputs and inputs are the same. However, the technique is not limited to only such systems (Lee and Sullivan, 1988).
When the process is away from its desired setpoint, y_sp, we would like the rate of change of y, i.e., \dot{y}, to be such that the process is returning towards the setpoint, i.e.,

\dot{y}_{sp} = K_1 (y_{sp} - y),  (5.2)

where K_1 is a diagonal matrix. In addition, we would like the process to have zero offset, i.e.,

\dot{y}_{sp} = K_2 \int_0^t (y_{sp} - y) dt,  (5.3)

where K_2 is another diagonal matrix. Therefore, a suitable reference system that can yield satisfactory control performance will be some combination of the above objectives, i.e.,

\dot{y}_{sp} = K_1 (y_{sp} - y) + K_2 \int_0^t (y_{sp} - y) dt.  (5.4)
It is desired that the control algorithm ensure that the rate of change of the outputs follow the selected reference system, i.e.,

\dot{y} = \dot{y}_{sp}.  (5.5)

Therefore, combining Equations 5.1 and 5.4 yields

f(y, u, d, p, t) = K_1 (y_{sp} - y) + K_2 \int_0^t (y_{sp} - y) dt.  (5.6)
The control law in Equation 5.6, to be solved at every sample time for the manipulated variables, is a set of nonlinear algebraic equations in the unknown manipulated variables. In the control law described in Equation 5.6, it is possible to obtain a solution only when the manipulated variable chosen to control a particular output appears in the model equation.
Henson and Seborg (1990) have shown that the RSS methods, such as GMC, are based
on principles of differential geometry and are known as systems of relative degree 1.
The process model in Equation 5.1 assumes that a dynamic model of the process can be derived. However, steady-state models that describe the nonlinear, interactive behavior of the process are more easily available. Also, the exact nature of the process is rarely known. In the face of these uncertainties, an approximate model of the form

f_{ss}(y, u, d, p, t) = 0,  (5.7)
represents the steady-state behavior of the process. Although these models describe the
steady-state nonlinear, interactive behavior of the process, some estimate of the process
dynamics is required. The most likely estimates available to the designer are the average
time constants of the process obtained from step response curves. Although these
estimates may be inaccurate at different operating conditions, the degree of approximation
is often sufficient to obtain good control performance. Assuming that the dynamics of the
process can be represented by a first-order model, a simple estimate of the time response of the output variables in moving from one steady-state to another can be given as

\dot{y}_{sp} = T^{-1} (y_{ss} - y),  (5.8)

where T is a diagonal matrix of the estimated open-loop time constants, and y_{ss} are the steady-state values of the output variables if no further control action is taken (Lee, 1993). The diagonal elements of the matrix T are averaged time constants of the output variables based on step changes of all input variables. Combining this approximate description of the process dynamics with the reference system in Equation 5.4, the ultimate response can be calculated as

y_{ss} = y + T [K_1 (y_{sp} - y) + K_2 \int_0^t (y_{sp} - y) dt].
Note that T K_1 and T K_2 are simply two other diagonal matrices, and so the form of the control law becomes

y_{ss} = y + K_1 (y_{sp} - y) + K_2 \int_0^t (y_{sp} - y) dt.  (5.9)
The control action required to achieve this performance can be determined by replacing y with y_{ss} in the nonlinear steady-state model described in Equation 5.7, and solving for the manipulated variables, u.
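One control interval of this steady-state GMC law can be sketched as follows: compute the target y_ss from Equation 5.9, then solve f_ss(y_ss, u) = 0 for u. The steady-state relation used here is a made-up monotonic example, not the column model, and the function names are illustrative:

```python
import math

def gmc_target(y, y_sp, err_int, K1, K2):
    # y_ss = y + K1*(y_sp - y) + K2*integral(y_sp - y) dt   (Eq. 5.9)
    return y + K1 * (y_sp - y) + K2 * err_int

def solve_for_u(f_ss, y_ss, lo, hi, tol=1e-10):
    # Bisection on u so that f_ss(y_ss, u) = 0; assumes a sign change on [lo, hi].
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f_ss(y_ss, lo) * f_ss(y_ss, mid) <= 0.0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Hypothetical steady-state relation between output and input: y = 1 - exp(-u)
f_ss = lambda y, u: (1.0 - math.exp(-u)) - y

y, y_sp, err_int = 0.80, 0.90, 0.0
y_ss = gmc_target(y, y_sp, err_int, K1=0.5, K2=0.05)
u = solve_for_u(f_ss, y_ss, lo=0.0, hi=10.0)
```

With a phenomenological model the root-finding step does the inversion; the next section replaces it with a trained plant-inverse network that returns u directly.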
5.2. Nonlinear Process Model-Based Control of Distillation Columns
Control of any process involves selection of the manipulated variables, and there are a
number of choices or control configuration schemes for a given process. Distillation
columns also have their share of these control schemes (McAvoy, 1983). We have chosen
to use the x_D-L, x_B-V configuration (also known as the energy balance scheme). The energy balance scheme is a simple scheme wherein the overhead and bottom product draw rates, D and B, respectively, are on level control, and the reflux rate, L, and the boilup rate, V, are the manipulated variables that control the overhead and bottom product
compositions, respectively. Large values for the relative gains are typical of the energy
balance scheme (McAvoy, 1983). Even though the energy balance scheme gives the
highest degree of interaction for the controlled variable-manipulated variable pairing (high
values for the elements of the RGA), it has the advantage of excellent disturbance
rejection features, and is the simplest scheme commonly used in industrial practice. In
distillation control, disturbance rejection is the more important requirement as most
columns often operate over fixed operating ranges, and are frequently subject to
disturbances.
If an approximate steady-state model is a phenomenological model, it is not necessary
for the model to be explicit in either the output or manipulated variables. But, if the
approximate steady-state model is a neural network model, then it is advantageous to have
a process inverse model because the manipulated variables that will give the desired
performance can be calculated directly.
5.2.1. Using Neural Networks for Distillation Control
The neural network models already developed are the plant-inverse models of the distillation columns, which take feed flowrate, F, feed composition, z, overhead composition, x_D, and bottom composition, x_B, as inputs, and calculate the reflux rate, L, and the boilup rate, V, required to maintain the distillation column overhead (x_D) and bottom (x_B) compositions at their desired setpoints, x_{D,SP} and x_{B,SP}, respectively.
Using the neural network steady-state model and the reference system of Equation 5.4, the control law in Equation 5.9 can be rewritten for the overhead and bottom compositions as

x_{D,SS} = x_D + K_{1D} (x_{D,SP} - x_D) + K_{2D} \int_0^t (x_{D,SP} - x_D) dt  (5.10)
and

x_{B,SS} = x_B + K_{1B} (x_{B,SP} - x_B) + K_{2B} \int_0^t (x_{B,SP} - x_B) dt,  (5.11)
where x_{D,SS} and x_{B,SS} are the steady-state target values, x_{D,SP} and x_{B,SP} are the desired setpoints, and x_D and x_B are the current values for the overhead and bottom compositions, respectively. K_{1D}, K_{2D}, K_{1B} and K_{2B} are the control law tuning constants, which are actually the product of the estimated average open-loop time constant, τ, and the elements of the diagonal matrices K_1 or K_2, respectively. The diagonal elements of the matrices are chosen for each output independently to obtain a "reasonable" response for the process system, the term "reasonable" implying a close match to the natural dynamic response of the process system.
It is important to ensure a bumpless transfer from the "manual" mode (open loop) to the "auto" mode (closed loop). Here, the process simulators start up in open loop. The initial reflux and boilup rates are determined from the respective neural network inverse models given actual values of the feed flowrate and composition along with some desired x_{D,SS} and x_{B,SS}, and the processes are allowed to settle to near steady-state conditions. When the controller is switched on, it is brought on-line with the intention of maintaining the overhead and bottom product compositions at the last measured values. This prevents an old setpoint "bump." Under this condition, x_{D,SP} ≈ x_D and x_{B,SP} ≈ x_B, which implies that the contributions due to the error and cumulative error terms in Equations 5.10 and 5.11 will be negligible, and a bias can be calculated for each controlled variable as follows:

b_{xD} = x_{D,SS} - x_D  (5.12)

and

b_{xB} = x_{B,SS} - x_B,  (5.13)

where b_{xD} and b_{xB} are the biases on the overhead and bottom product compositions, respectively. The steady-state targets, x_{D,SS} and x_{B,SS}, are operator-specified
values. For the start-up operation, they are not calculated using the control law in Equations 5.10 or 5.11. The overhead and bottom product compositions, x_D and x_B, are measured from the process. The biases represent the mismatch between the process and the neural network model, and are calculated only once, whenever the controller is switched to automatic. The control law with the bias term included then reads as follows:

x_{D,SS} = b_{xD} + x_D + K_{1D} (x_{D,SP} - x_D) + K_{2D} \int_0^t (x_{D,SP} - x_D) dt  (5.14)

and

x_{B,SS} = b_{xB} + x_B + K_{1B} (x_{B,SP} - x_B) + K_{2B} \int_0^t (x_{B,SP} - x_B) dt.  (5.15)
Figure 5.1 gives a schematic description of the nonlinear PMBC control strategy that uses the neural network steady-state model. The nonlinear PMBC controller "looks" at the process at every controller time interval and calculates target values x_{D,SS} and x_{B,SS} based on Equations 5.14 and 5.15. The steady-state target values along with the measured values for feed flowrate, F, and feed composition, z, are used in the neural network model for the distillation column. The neural network then calculates the reflux rate, L, and the boilup rate, V, that will drive the process to the temporary steady-state targets, x_{D,SS} and x_{B,SS}.
Changes in the disturbances (feed flowrate and feed composition) are fed directly into the model, which enables the nonlinear PMBC controller to provide a nonlinear feedforward response as well as nonlinear feedback. The nonlinear PMBC law serves to linearize the outputs from the process with the help of the steady-state approximate model, assuming first-order process dynamics. This technique has also been referred to as external (input-output) linearization (Isidori, 1989; Isidori et al., 1981). Also, the PMBC controller has the advantage of using the process model to provide direct decoupling of the manipulated variables for a multiple-input multiple-output system.
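One sampling interval of this strategy can be sketched as below: the reference system (Equations 5.14 and 5.15) produces the steady-state targets, and the plant-inverse model maps (F, z, x_D,ss, x_B,ss) to (L, V). The linear stand-in inverse model and all gains here are invented for illustration; the trained network would take its place:

```python
def reference_target(x, x_sp, err_int, bias, K1, K2):
    # x_ss = b + x + K1*(x_sp - x) + K2*integral(x_sp - x) dt   (Eqs. 5.14-5.15)
    return bias + x + K1 * (x_sp - x) + K2 * err_int

def inverse_model(F, z, xD_ss, xB_ss):
    # Stand-in for the trained plant-inverse network (illustrative gains only).
    L = 0.1 + 0.5 * F * z + 0.2 * (xD_ss - xB_ss)
    V = L + 0.1 * F
    return L, V

# Measurements, setpoints, and biases at this sample (hypothetical values)
F, z = 0.35, 0.30
xD, xB = 0.89, 0.020
xD_sp, xB_sp = 0.91, 0.025
bD, bB = 0.0, 0.0  # biases from the bumpless-transfer step

xD_ss = reference_target(xD, xD_sp, 0.0, bD, K1=0.4, K2=0.02)
xB_ss = reference_target(xB, xB_sp, 0.0, bB, K1=0.4, K2=0.02)
L, V = inverse_model(F, z, xD_ss, xB_ss)
```

Because the measured disturbances F and z enter the inverse model directly, the feedforward action described in the text falls out of the same calculation as the feedback action.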
Figure 5.1. The neural network model-based control strategy.
CHAPTER VI
CONTROL RESULTS
The neural network model-based controller includes two elements: the reference systems defined by Equations 5.14 and 5.15, and the steady-state neural network process inverse model. The neural network controllers were tested for both servo (setpoint changes) as well as regulatory (disturbance rejection) modes of operation on the dynamic simulators of both columns. The results of the neural network controllers were benchmarked against conventional PI controllers with a feedforward element for feed flowrate and feed composition changes. Decouplers were not included in the conventional strategy because the cross gain changes required such extensive gain scheduling that they could not be structured as per conventional industrial practice.
Several controller tests were performed to check for setpoint changes and for disturbance rejection capabilities. The lab column has a faster response time than the high-purity column; therefore, one 60-hour run enabled study of both servo and regulatory modes of operation for the lab column. The high-purity column, with its slower response time, required separate tests to present effectively the servo and regulatory modes. Table 6.1 gives a description of the controller tests for the lab column, while Table 6.2 gives the description of the servo-mode controller tests, and Tables 6.3 and 6.4 describe the regulatory-mode controller tests for the high-purity column.
6.1. Lab Column Controller Tests
Figures 6.1a-e show the results from the controller tests described in Table 6.1 for the lab column with the neural network model-based controller. Figure 6.1a shows the response of the controlled variables, i.e., the overhead and bottom product compositions, to setpoint changes and the variations in the process disturbances; the measured
Table 6.1. Description of the Controller Tests for the Lab Distillation Column

Time (hours)  Description of the Changes
0.0    Open-loop start-up with the following nominal values: F = 0.35 lbmoles/h; z = 0.3 mole fraction methanol; L = 0.26 lbmoles/h; V = 0.37 lbmoles/h; η = 80%
10.0   Controller switched on after bumpless transfer operation: x_D,SP = 0.9 mole fraction methanol; x_B,SP = 0.014 mole fraction methanol
15.0   Dual Composition Setpoint Change: x_D,SP = 0.91 mole fraction methanol; x_B,SP = 0.025 mole fraction methanol
20.0   Dual Composition Setpoint Change: x_D,SP = 0.92 mole fraction methanol; x_B,SP = 0.035 mole fraction methanol
25.0   Dual Composition Setpoint Change: x_D,SP = 0.93 mole fraction methanol; x_B,SP = 0.025 mole fraction methanol
30.0   Feed flowrate upset: F = 0.42 lbmoles/h (+20% change from nominal value)
35.0   Feed flowrate upset: F = 0.28 lbmoles/h (-20% change from nominal value)
40.0   Feed flowrate upset: F = 0.35 lbmoles/h (brought back to nominal value)
45.0   Feed composition upset: z = 0.4 mole fraction methanol (+33% change from nominal value)
50.0   Feed composition upset: z = 0.2 mole fraction methanol (-33% change from nominal value)
55.0   Feed composition upset: z = 0.3 mole fraction methanol (brought back to nominal value)
60.0   End of Controller Tests
Table 6.2. Description of the Servo-mode Controller Test for the High-Purity Distillation Column

Time (hours)  Description of the Changes
0.0    Open-loop start-up with nominal values: F = 800.0 lbmoles/h; z = 0.12 mole fraction methanol; L = 180.0 lbmoles/h; V = 258.0 lbmoles/h; η = 80%
10.0   Controller switched on after bumpless transfer operation: x_D,SP = 0.99915 mole fraction methanol (850 ppm impurity); x_B,SP = 0.0022 mole fraction methanol (2200 ppm impurity)
15.0   Dual Composition Setpoint Change: x_D,SP = 0.999 mole fraction methanol (1000 ppm impurity); x_B,SP = 0.001 mole fraction methanol (1000 ppm impurity)
25.0   Dual Composition Setpoint Change: x_D,SP = 0.9995 mole fraction methanol (500 ppm impurity); x_B,SP = 0.0005 mole fraction methanol (500 ppm impurity)
35.0   Dual Composition Setpoint Change: x_D,SP = 0.999 mole fraction methanol (1000 ppm impurity); x_B,SP = 0.001 mole fraction methanol (1000 ppm impurity)
40.0   Dual Composition Setpoint Change: x_D,SP = 0.9985 mole fraction methanol (1500 ppm impurity); x_B,SP = 0.0015 mole fraction methanol (1500 ppm impurity)
50.0   End of Controller Test for Servo mode of operation
Table 6.3. Description of the Regulatory-mode (Feed Flowrate Upsets) Controller Test for the High-Purity Distillation Column

Time (hours)  Description
0.0    Open-loop start-up with nominal values: F = 800.0 lbmoles/h; z = 0.12 mole fraction methanol; L = 180.0 lbmoles/h; V = 258.0 lbmoles/h; η = 80%
10.0   Controller switched on after bumpless transfer operation: x_D,SP = 0.99915 mole fraction methanol (850 ppm impurity); x_B,SP = 0.0022 mole fraction methanol (2200 ppm impurity)
15.0   Dual Composition Setpoint Change: x_D,SP = 0.999 mole fraction methanol (1000 ppm impurity); x_B,SP = 0.001 mole fraction methanol (1000 ppm impurity)
25.0   Feed Flowrate upset: F = 900.0 lbmoles/h (+12.5% change from nominal value)
35.0   Feed Flowrate upset: F = 800.0 lbmoles/h (brought back to nominal value)
45.0   Feed Flowrate upset: F = 700.0 lbmoles/h (-12.5% change from nominal value)
55.0   End of Controller Test for Feed Flowrate Upsets
Table 6.4. Description of the Regulatory-mode (Feed Composition Upsets) Controller Test for the High-Purity Distillation Column

Time (hours)  Description
0.0    Open-loop start-up with nominal values: F = 800.0 lbmoles/h; z = 0.12 mole fraction methanol; L = 180.0 lbmoles/h; V = 258.0 lbmoles/h; η = 80%
10.0   Controller switched on after bumpless transfer operation: x_D,SP = 0.99915 mole fraction methanol (850 ppm impurity); x_B,SP = 0.0022 mole fraction methanol (2200 ppm impurity)
15.0   Dual Composition Setpoint Change: x_D,SP = 0.999 mole fraction methanol (1000 ppm impurity); x_B,SP = 0.001 mole fraction methanol (1000 ppm impurity)
25.0   Feed Composition upset: z = 0.14 mole fraction methanol (+16.6% change from nominal value)
35.0   Feed Composition upset: z = 0.12 mole fraction methanol (brought back to nominal value)
45.0   Feed Composition upset: z = 0.10 mole fraction methanol (-16.6% change from nominal value)
55.0   End of Controller Test for Feed Composition Upsets
disturbances, feed flowrate and feed composition, are shown in Figure 6.1b; the unmeasured disturbance, Murphree stage efficiency, is shown in Figure 6.1e. Figure 6.1c shows the changes in the manipulated variables, and Figure 6.1d shows the steady-state target values for the overhead and bottom product compositions with time. The neural network model presumes that the steady-state targets are the "true" setpoints to which it tries to control the process. The difference between the actual setpoint and the steady-state target is one indication of the mismatch between the neural network model and the actual process.
After the open-loop process start-up, the neural network controller is brought on-line (bumpless transfer) to maintain the overhead and bottom product compositions at the values measured at 10 hours. The reflux and boilup rates change (Figure 6.1c) to maintain the overhead and bottom compositions at their respective setpoints. Even though there is no nominal change, the random drift in the disturbances and Murphree stage efficiency requires a noticeable change in the manipulated variables. The changing Murphree stage efficiency is an unmeasured disturbance, and the neural network controller corrects for this change purely on feedback. Feed flowrate and composition influences are fed forward through the steady-state neural network model without any dynamic compensation. Dual composition setpoint changes are made at 15, 20 and 25 hours (see Figure 6.1a). The setpoint changes are filtered to give a reference trajectory for the setpoints. Feed flowrate disturbances are introduced at 30, 35 and 40 hours, and feed composition disturbances are introduced at 45, 50 and 55 hours (see Figure 6.1b). Each change was designed to make the lab column operate under different conditions. Note from Figure 6.1c that the manipulated variables work much harder for the second feed composition change, yet the noise on the controlled variables is the same. This shows the ability of the controller to understand the process gain changes, and to reflect them in the manipulated variable action.
Figure 6.1. Neural network model-based controller without dynamic compensation on lab column, (a) Overhead and bottom product compositions.
Figure 6.1. Continued, (b) Measured process disturbances.
Figure 6.1. Continued, (c) Manipulated variables.
Figure 6.1. Continued, (d) Steady-state targets for controlled variables.
Figure 6.1. Continued, (e) Unmeasured process disturbance.
As a benchmark, PI controllers with a static feedforward correction for feed flowrate
and feed composition influences were also implemented on the lab column and tested for
the same servo and regulatory changes described in Table 6.1. Figure 6.2a shows the
response of the controlled variables, and Figure 6.2b shows the changes in the manipulated
variables as determined by the PI controllers. Table 6.5 gives a comparison of the values
of the integrals of squared error (ISE), absolute error (IAE), and valve travel (VT) (a
penalty function that quantifies "the amount of work" done by the manipulated variables,
and is defined as the cumulative sum of |Z_i - Z_{i+1}|, where i is the sampling index).
All three performance measures are normalized by the time period over which the integrals
are accumulated. Also, the sequence of disturbances, noise and drifts was kept identical to
that used in the neural network controller tests.
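The three performance measures defined above can be computed from sampled data roughly as follows. The function and variable names are illustrative, not taken from the dissertation's implementation; only the definitions (integrals approximated by sums, cumulative valve travel, normalization by the accumulation period) come from the text.

```python
def performance_measures(errors, moves, dt):
    """ISE and IAE for one controlled variable and VT for one
    manipulated variable, each normalized by the accumulation
    period, per the definitions in the text.
    errors: e_i = setpoint - measurement at each sampling
    moves:  manipulated-variable values Z_i at each sampling
    dt:     sampling interval
    """
    T = dt * len(errors)  # time period for normalization
    ise = sum(e * e for e in errors) * dt / T
    iae = sum(abs(e) for e in errors) * dt / T
    # cumulative sum of |Z_i - Z_{i+1}| over consecutive samplings
    vt = sum(abs(moves[i] - moves[i + 1])
             for i in range(len(moves) - 1)) / T
    return ise, iae, vt
```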
Both controllers were tuned to subjectively balance minimization of the ISE, IAE,
and VT for both the servo and regulatory modes, and the same tuning constants were used
throughout all performance tests. The Ziegler-Nichols tuning rules, applied to the
open-loop step response tests, were used to calculate the initial values of the tuning
constants for the feedforward PI controllers. While there are methods for obtaining
estimates for the tuning constants in the GMC law (Lee, 1993), the neural network
controller was tuned heuristically by increasing the proportional gains (K1D and K1B in
Equations 5.14 and 5.15) until oscillations were observed. Then the integral constants (K2D
and K2B in Equations 5.14 and 5.15) were increased to remove offset in a reasonable time.
While the approach is not optimal, it reflects the industrial practice of tuning
controllers on-line, and is simple to implement.
The feedforward PI controller shows good control for the setpoint changes. But when the
operating conditions change, it is unable to predict the process gain changes,
implying the need for gain scheduling and more advanced PI control strategies. The
nature of the changes in the manipulated variables (Figures 6.1c and 6.2b) also shows that the
Figure 6.2. Static feedforward PI controller without dynamic compensation on lab column. (a) Overhead and bottom product compositions.
Figure 6.2. Continued. (b) Manipulated variables.
Table 6.5. Comparison of Controller Performances for the Lab Distillation Column

Neural Network Model-Based Controller

Time (hours)   ISE(x_B)   ISE(x_D)   IAE(x_B)   IAE(x_D)   VT(V)*   VT(L)*
15.0           1.161e-4   1.371e-4   1.559e-1   1.730e-1   0.5193   0.5300
20.0           1.235e-4   1.993e-4   1.602e-1   1.973e-1   0.5869   0.5670
25.0           1.291e-4   1.227e-4   1.672e-1   1.647e-1   0.5730   0.5944
30.0           1.196e-4   1.059e-4   1.639e-1   1.478e-1   0.9302   0.9596
35.0           6.606e-4   4.023e-4   2.869e-1   2.368e-1   1.7687   1.8918
40.0           2.804e-3   1.000e-3   5.575e-1   3.412e-1   1.1883   1.1816
45.0           1.107e-3   4.440e-4   3.727e-1   2.629e-1   1.3471   1.4232
50.0           4.553e-4   2.379e-4   2.482e-1   2.071e-1   1.4846   1.6371
55.0           4.594e-3   4.452e-3   6.239e-1   6.650e-1   1.3548   2.6353
60.0           1.031e-3   8.217e-4   3.532e-1   2.999e-1   1.3746   1.5011

Conventional Feedback PI plus Feedforward Controller

Time (hours)   ISE(x_B)   ISE(x_D)   IAE(x_B)   IAE(x_D)   VT(V)*   VT(L)*
15.0           1.251e-4   5.205e-4   1.639e-1   3.732e-1   0.9024   1.3690
20.0           1.248e-4   3.967e-4   1.651e-1   2.917e-1   0.8914   1.1833
25.0           1.640e-4   2.704e-4   1.852e-1   2.395e-1   0.9379   1.0184
30.0           9.432e-4   1.427e-3   5.062e-1   6.190e-1   0.9086   0.8570
35.0           1.624e-3   1.193e-3   5.323e-1   5.148e-1   1.0229   0.9237
40.0           4.364e-3   3.258e-3   8.787e-1   8.534e-1   1.1243   1.0084
45.0           1.746e-3   1.520e-3   6.250e-1   6.410e-1   1.0591   0.9731
50.0           8.942e-4   2.693e-4   3.413e-1   2.382e-1   1.0836   0.9931
55.0           1.433e-3   8.646e-4   4.283e-1   3.605e-1   1.0843   1.1811
60.0           4.716e-4   2.637e-4   2.506e-1   2.203e-1   0.9883   1.0459

* VT: Valve Travel for the manipulated variables (reflux rate, L, and boilup rate, V).
neural network controller is a nonlinear controller while the PI controller is a linear
controller. The neural network controller makes aggressive changes in the manipulated
variables when compared with the PI controller because the neural network controller uses
a nonlinear model of the process, which enables better prediction of the required changes in
manipulated variable action over the entire operating range.
The neural network model-based controller was also tested with dynamic
compensation for feed flowrate and feed composition for the tests described in Table 6.1.
Figure 6.3a shows the response of the overhead and bottom product compositions;
Figure 6.3b shows the changes in the manipulated variables; and Figure 6.3c shows the
steady-state target values for the overhead and bottom product compositions calculated by
the control laws. The controller performance did not show any significant improvement
that warranted dynamic compensation of the measured disturbances. Comparison of
Figures 6.1d and 6.3c shows the effect of dynamic compensation on the steady-state target
values. With dynamic compensation (Figure 6.3c), the controller does not take corrective
action immediately, and therefore makes the change in the "right" direction when the
disturbance is noticed. On the other hand, without dynamic compensation (Figure 6.1d),
the controller reacts too soon, and therefore heads in the "wrong" direction initially
before turning around.
Figures 6.4a-c show an expanded time-scale representation of the controller test
results shown in Figure 6.1a. Figures 6.4a, b, and c show the setpoint changes, the feed
flowrate disturbances, and the feed composition disturbances, respectively. The results
are for the neural network model-based controller without dynamic compensation for the
measured disturbances.
Figure 6.3. Neural network model-based controller with dynamic compensation on lab column. (a) Overhead and bottom product compositions.
Figure 6.3. Continued. (b) Manipulated variables.
Figure 6.3. Continued. (c) Steady-state targets for controlled variables.
Figure 6.4. Response of controlled variables to neural network model-based controller without dynamic compensation on lab column. (a) Setpoint changes.
Figure 6.4. Continued. (b) Feed flowrate changes.
Figure 6.4. Continued. (c) Feed composition changes.
6.2. High-Purity Column Controller Tests
Figures 6.5a-e show the control results for the controller tests described in Table 6.2
for the high-purity column with the neural network model-based controller. Figure 6.5a
shows the response of the controlled variables; Figure 6.5b shows the variations in feed
flowrate and composition (the measured disturbances) affecting the process; Figure 6.5c
shows the changes in the manipulated variables; Figure 6.5d shows the steady-state target
values for the overhead and bottom product compositions; and Figure 6.5e shows the
variations in Murphree stage efficiency (the unmeasured disturbance). The overhead and
bottom responses show similar initial open-loop start-up conditions to those seen in the
case of the lab column. The controller is switched on at 10 hours, followed by a series of
setpoint changes. Figures 6.6a and b show the corresponding responses obtained by using
a feedforward PI controller on the same column for the setpoint changes described in
Table 6.2, and with the same disturbances, noise, and drifts as used for the neural network
model-based controller.
Figures 6.7a-d show control results for the tests described in Table 6.3, with Figures
6.7a, b, c and d showing the response of the controlled variables, the feed flowrate
disturbances, the changes in the manipulated variables, and the steady-state target values,
respectively, for the neural network model-based controller. Once again, the column is
started up in open loop, and the neural network controller is used to bring the column to
the base case operating conditions of 1000 ppm impurity in both the overhead and bottom
products. The feed flowrate disturbances affect the column starting at 25 hours. Figures
6.8a and b show the corresponding results from the feedforward PI controller for the same
feed flowrate upsets at the same base operating conditions.
Figures 6.9a-d show control results for the tests described in Table 6.4, with Figures
6.9a, b, c and d showing the response of the controlled variables, the feed flowrate
Figure 6.5. Setpoint changes with neural network model-based controller without dynamic compensation on high-purity column. (a) Overhead and bottom product compositions.
Figure 6.5. Continued. (b) Measured process disturbances.
Figure 6.5. Continued. (c) Manipulated variables.
Figure 6.5. Continued. (d) Steady-state targets for controlled variables.
Figure 6.5. Continued. (e) Unmeasured process disturbance.
Figure 6.6. Setpoint changes with static feedforward PI controller without dynamic compensation on high-purity column. (a) Overhead and bottom product compositions.
Figure 6.6. Continued. (b) Manipulated variables.
Figure 6.7. Feed flowrate changes with neural network model-based controller without dynamic compensation on high-purity column. (a) Overhead and bottom product compositions.
Figure 6.7. Continued. (b) Measured process disturbance.
Figure 6.7. Continued. (c) Manipulated variables.
Figure 6.7. Continued. (d) Steady-state targets for controlled variables.
Figure 6.8. Feed flowrate changes with static feedforward PI controller without dynamic compensation on high-purity column. (a) Overhead and bottom product compositions.
Figure 6.8. Continued. (b) Manipulated variables.
Figure 6.9. Feed composition changes with neural network model-based controller without dynamic compensation on high-purity column. (a) Overhead and bottom product compositions.
Figure 6.9. Continued. (b) Measured process disturbance.
Figure 6.9. Continued. (c) Manipulated variables.
Figure 6.9. Continued. (d) Steady-state targets for controlled variables.
disturbances, the changes in the manipulated variables, and the steady-state target values,
respectively, for the neural network model-based controller. Figures 6.10a and b show the
corresponding results from the feedforward PI controller for the same feed composition
upsets at the same base operating conditions.
The neural network model-based controller shows superior performance because of
its ability to predict the process behavior with a fair degree of accuracy over the entire
operating range. The feedforward PI controller shows good control, in fact arguably
better control, for the setpoint changes, but shows its deficiency in predicting the process
gain changes when the process moves to different operating conditions. The controllers
were also tested for feed composition changes, and similar performances were observed.
Tables 6.6 and 6.7 give a comparison of the performance of the two controllers based on
the normalized values for ISE, IAE and VT for the servo and regulatory tests performed
on the high-purity column.
Figure 6.10. Feed composition changes with static feedforward PI controller without dynamic compensation on high-purity column. (a) Overhead and bottom product compositions.
Figure 6.10. Continued. (b) Manipulated variables.
Table 6.6. Neural Network Model-Based Controller Performance for the High-Purity Distillation Column

Set-point Changes

Time (hours)   ISE(x_B)   ISE(x_D)   IAE(x_B)   IAE(x_D)   VT(V)*   VT(L)*
15.0           1.117e-6   2.019e-6   1.489e-2   2.154e-2   113.84    89.80
25.0           2.286e-6   1.269e-5   3.084e-2   6.451e-2   214.88   183.64
35.0           1.844e-6   2.602e-6   2.775e-2   3.383e-2   210.13   189.79
45.0           1.846e-6   3.639e-6   2.811e-2   3.936e-2   220.69   182.55
55.0           2.019e-6   3.060e-6   2.885e-2   3.629e-2   223.67   191.86

Feed Composition Changes

Time (hours)   ISE(x_B)   ISE(x_D)   IAE(x_B)   IAE(x_D)   VT(V)*   VT(L)*
15.0           1.117e-6   2.019e-6   1.489e-2   2.154e-2   113.84    89.80
25.0           2.286e-6   1.269e-5   3.084e-2   6.451e-2   214.88   183.64
35.0           3.905e-6   7.072e-6   3.462e-2   4.956e-2   279.15   200.41
45.0           4.541e-6   1.100e-5   3.638e-2   5.868e-2   241.10   191.84
55.0           3.234e-6   1.100e-5   3.372e-2   5.659e-2   224.74   200.43

Feed Flowrate Changes

Time (hours)   ISE(x_B)   ISE(x_D)   IAE(x_B)   IAE(x_D)   VT(V)*   VT(L)*
15.0           1.117e-6   2.019e-6   1.489e-2   2.154e-2   113.84    89.80
25.0           2.286e-6   1.269e-5   3.084e-2   6.451e-2   214.88   183.64
35.0           2.809e-6   1.541e-5   3.260e-2   6.660e-2   274.67   228.63
45.0           2.952e-6   1.796e-5   3.305e-2   6.753e-2   254.35   208.34
55.0           3.745e-6   1.510e-5   3.336e-2   7.766e-2   254.41   211.18

* VT: Valve Travel for the manipulated variables (reflux rate, L, and boilup rate, V).
Table 6.7. Conventional Feedback PI plus Feedforward Controller Performance for the High-Purity Distillation Column

Set-point Changes

Time (hours)   ISE(x_B)   ISE(x_D)   IAE(x_B)   IAE(x_D)   VT(V)     VT(L)
15.0           2.331e-6   9.110e-7   2.245e-2   1.389e-2   1119.72   283.77
25.0           3.594e-6   5.047e-6   3.917e-2   4.572e-2   1660.48   539.19
35.0           3.103e-6   3.370e-6   3.635e-2   3.849e-2   1453.63   548.58
45.0           3.107e-6   2.250e-6   3.625e-2   3.120e-2   1549.52   548.19
55.0           3.797e-6   2.394e-6   3.933e-2   3.184e-2   1861.75   588.38

Feed Composition Changes

Time (hours)   ISE(x_B)   ISE(x_D)   IAE(x_B)   IAE(x_D)   VT(V)     VT(L)
15.0           2.331e-6   9.110e-7   2.245e-2   1.389e-2   1119.72   283.77
25.0           3.594e-6   5.047e-6   3.917e-2   4.572e-2   1660.48   539.19
35.0           5.637e-5   2.445e-5   1.601e-1   9.775e-2   1830.71   573.08
45.0           3.570e-6   1.411e-5   1.220e-1   7.104e-2   1539.18   571.32
55.0           5.075e-5   1.264e-5   1.501e-1   7.359e-2   1590.38   605.50

Feed Flowrate Changes

Time (hours)   ISE(x_B)   ISE(x_D)   IAE(x_B)   IAE(x_D)   VT(V)     VT(L)
15.0           2.331e-6   9.110e-7   2.245e-2   1.389e-2   1119.72   283.77
25.0           3.594e-6   5.047e-6   3.917e-2   4.572e-2   1660.48   539.19
35.0           1.199e-4   2.119e-4   2.378e-1   2.882e-1   1805.43   568.95
45.0           6.884e-5   1.341e-4   1.819e-1   2.416e-1   1487.70   552.94
55.0           1.327e-4   1.936e-4   2.591e-1   3.046e-1   1550.51   587.50
CHAPTER VII
PROCESS-MODEL MISMATCH
The issue of process-model mismatch is important from a model-based control
viewpoint because the efficacy of the control strategy depends critically on the model.
While a "good" model can greatly enhance the performance of the controller, a "bad"
model can easily make things worse. The question then becomes: what makes a model
"good" or "bad" from a process control perspective? It is useful to understand the basic
function of the model and how it can help improve the control of a process whose
behavior is highly nonlinear, nonstationary, and interactive.
7.1. Process-Model Mismatch
Conventional control structures based on classical PID controllers assume that the
process behavior over a small operating range exhibits a linear dependence between the
controlled variable-manipulated variable pairings. Hence, in this operating region, linear
control laws are applicable; and as the theory is based on linear mathematical concepts, an
optimum controller configuration can be determined from exact analytical solutions.
However, the assumption of linearity is valid only over very small operating ranges, as
most chemical processes are inherently nonlinear in their behavior. Therefore, as
operating conditions change, the optimal settings also change. It is not practicable to
correct constantly for changing operating conditions. Therefore, some compromise is sought,
and the controllers are tuned to obtain a "satisfactory" performance over a wide range of
operating conditions. Also, chemical processes exhibit nonstationary behavior, making
identification of the process (an essential step in the application of all linear control structures)
difficult because of the inability to identify the exact nature of the process. In addition,
there is the inherent interaction between the controlled variable-manipulated variable pairs.
It is important to note that conventional PID controllers are model-based controllers; the
models used to identify the process are linear models of the process over a sufficiently
small operating range where the assumption of linearity is valid. The simple linear control
laws work fine for a number of cases, but as the process behavior becomes more
complex in terms of interactions and nonlinearity, the controller performance
deteriorates because the models are no longer able to account for the nonlinear process
behavior and are unable to decouple the interactions between the controlled variable-
manipulated variable pairs.
The models in advanced model-based control strategies enable decoupling of the
interactions between the controlled variables and manipulated variables. In addition, if the
models are nonlinear, they provide a better understanding of the process behavior over a
wider range of operating conditions. The controller uses the model to decouple the
interactions between the controlled variables and manipulated variables while taking into
account the nonlinear process behavior, and therefore it should have the ability to
control the process better. It must be noted that not all advanced model-based controllers
use nonlinear models; many model predictive controllers use linear models (Bosley
et al., 1992).
Theoretically, if a model were an exact representation of the nonlinear, interactive
steady-state and dynamic process behavior, then it should provide the best control
performance under any operating condition. However, this is never realized because no model
is ever exact. Chemical processes tend to be nonstationary and show different
characteristics as operating conditions change, making the model inexact with respect to
the process. The extent of the inexactness between the actual process and the model of
the process is called the process-model mismatch. The mismatch between the process and
the model is one measure of the ability of the controller to control a process. Oftentimes,
as the mismatch increases, the controller performance can deteriorate rapidly. Controller
models may require adaptation (or parameter adjustment), either periodic (Riggs and
Rhinehart, 1990) or on-line (Rhinehart and Riggs, 1991), in order to minimize the process-
model mismatch and yield satisfactory controller performance.
Since it is virtually impossible to come up with an exact representation of the real
process behavior over a wide range of operating conditions, we attempt to capture the
major nonlinear characteristics of the process in the model by choosing a suitable number
of adjustable parameters. The aim in nonlinear PMBC is to develop models that can
approximate the real process behavior with sufficient accuracy. In the nonlinear PMBC
formulation using the GMC law, the integral term in the control law (Equations 5.14 and
5.15) provides an additional mechanism that adjusts for the process-model mismatch.
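As a rough illustration of how proportional and integral action in a GMC-type law can map the current error onto a steady-state target for the inverse model, a generic form can be sketched as below. This is an assumed, generic GMC sketch, not a reproduction of Equations 5.14 and 5.15 (which are not restated in this chapter); the gain names and the target form are illustrative only.

```python
def gmc_target(x_sp, x, err_integral, K1, K2, dt):
    """Generic GMC-style target calculation (an assumption, not the
    dissertation's Equations 5.14-5.15): the target composition
    passed to the steady-state inverse model is the measurement
    shifted by proportional and integral action on the error.
    The integral term is what absorbs process-model mismatch."""
    err = x_sp - x
    err_integral += err * dt          # accumulate the error integral
    x_target = x + K1 * err + K2 * err_integral
    return x_target, err_integral
```

With zero mismatch the integral term settles at zero; persistent mismatch leaves a nonzero integral that biases the target until the offset is removed.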
7.2. Process-Model Mismatch for the Distillation Columns
The neural network models developed to control the distillation columns were
intentionally kept different from the dynamic simulations of the two methanol-water
distillation columns; the dynamic simulators represent the processes being controlled. The
dynamic simulators for the distillation columns were not used to obtain data to train the
neural network models. All the data for neural network training and testing were obtained
from the CAD simulations. While an empirically correlated polynomial equation obtained
from regressing experimental data was used for the VLE in the dynamic simulators, the
NRTL VLE model was used in the steady-state CAD simulations. In the CAD
simulations a Murphree stage efficiency of 75% was used in obtaining all the data for
network training and testing for both columns, while in the dynamic simulations
nominal Murphree stage efficiencies of 80% and 85% were used in the lab column and the
high-purity column, respectively. Also, in the dynamic simulators, a first-order
autoregressive drift was added to the process inputs, feed flowrate and feed composition,
and to the nominal Murphree stage efficiency. The neural network models do not have
any additional parameters that are adjusted, either periodically or on-line, once the
networks have been trained.
In order to measure the extent of the process-model mismatch, 25 data points from
the two testing data sets were selected at random and presented to the trained neural
networks representing the two distillation columns. The neural networks were used to
calculate the reflux and boilup rates that would be required to achieve the steady-state
conditions defined by the values of F, z, x_D and x_B. The predicted values for L and V were
then used in the dynamic simulators along with values for F and z. The dynamic
simulators were allowed to run until "near" steady-state conditions were identified, and the
steady-state values for x_D and x_B were noted. The fact that the "process" (the dynamic
simulators) and the "model" (the neural networks trained using CAD data) are different is
illustrated in Figures 7.1 and 7.2. Figure 7.1a compares the steady-state values of the
overhead product composition obtained from the dynamic process simulator for the lab
column with those used in the neural network model of the lab column. Figure 7.1b shows
a similar comparison for the bottom product composition in the lab column. Figures 7.2a
and 7.2b show similar comparisons for the overhead and bottom product compositions
for the high-purity column.
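The mismatch test described above can be outlined as a short loop; `inverse_net` and `simulate_to_ss` are placeholders standing in for the trained network and the dynamic column simulator, and the function name is illustrative.

```python
def mismatch_check(test_points, inverse_net, simulate_to_ss):
    """Sketch of the mismatch test: for each sampled steady-state
    record (F, z, x_D, x_B), the trained inverse network predicts
    the (L, V) that should achieve it; the dynamic simulator is
    then run to near steady state with those inputs, and the
    achieved compositions are paired with the targets.
    inverse_net and simulate_to_ss are placeholder callables."""
    records = []
    for F, z, xD, xB in test_points:
        L, V = inverse_net(F, z, xD, xB)        # model's answer
        xD_ss, xB_ss = simulate_to_ss(F, z, L, V)  # process's answer
        records.append((xD, xD_ss, xB, xB_ss))
    return records
```

Plotting target against achieved composition for each record gives exactly the parity plots of Figures 7.1 and 7.2.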
The points indicated by the triangles represent the steady-state values of the overhead
and bottom compositions obtained from an idealized dynamic simulator (i.e., no drifts on
feed flowrate, feed composition, and Murphree stage efficiency, and no noise on the
measured variables), while the circles represent the "near" steady-state values for the same
variables from the dynamic simulator with all the "bells and whistles" (noise and drifts
included). It is worth noting that the two neural networks show distinctly different
characteristics for the "noisy" condition when compared with the "ideal" conditions.
Under ideal conditions, the same neural network model is closer to the real process
Figure 7.1. Steady-state process-model mismatch for the lab column. (a) Overhead product composition.
Figure 7.1. Continued. (b) Bottom product composition.
Figure 7.2. Steady-state process-model mismatch for the high-purity column. (a) Overhead product composition.
Figure 7.2. Continued. (b) Bottom product composition.
behavior, while under noisy conditions, the predictions are distributed evenly around the
actual process behavior. Even though the randomization introduced in the dynamic
simulator to simulate the "real world" experience tends to increase the process-model
mismatch at any one time instant, the neural network is able to control the process
satisfactorily because the neural network model is closer to the idealized process behavior.
The noise and other random fluctuations do not have any adverse influence on the
network predictions. The networks have successfully extracted the phenomenological
process characteristics from the training data during the learning process.
7.3. "It's the Gain Prediction, Stupid!"
From the above discussion on process-model mismatch it is clear that there is a fair
degree of mismatch between the processes and the models. However, it has been shown
from the results in Chapter VI that the controllers perform satisfactorily under various
operating conditions and were able to control the processes successfully. The process-
model mismatch shows only that the model is different from the process, and yet we have
shown good controller performance. It does not explain why using the model has
enabled better controller performance. One reason for using a model is to decouple the
interactions between the controlled and the manipulated variables. Another important
reason is that the model captures the nonlinear process behavior, and therefore is a better
representation of the real process. One of the most important functions of a model that is
critical to its success as a good controller model is its ability to predict process gain
changes. From a process control standpoint, it is the gain prediction that matters the
most (Riggs, 1993). Gain prediction is defined here as the change in the manipulated variable
for a given change in the controlled variable. Gain predictions have two components: the
magnitude and the direction. While it is important that the magnitude of the change
approximate the real process gain change, it is the direction that is more critical. If a
model is able to point in the right direction with a reasonably approximate magnitude of
change, the model has the potential to make good control decisions. For satisfactory
control of any process it is essential that the model used to infer the control action be able
to predict the process gains with sufficient accuracy (Riggs, 1993).
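One way to compare the gains of "process" and "model" is to estimate each by central finite differences from any steady-state inverse mapping (the dynamic simulator run to steady state, or the trained network) and then plot one against the other. The function below is an illustrative sketch; the step size, argument order, and function name are assumptions.

```python
def steady_state_gain(inverse_model, F, z, xD, xB, dxD=1e-4):
    """Central finite-difference estimate of the steady-state gain
    dL/dx_D from a steady-state inverse mapping
    (F, z, x_D, x_B) -> (L, V). Applying this to both the process
    and the model at the same operating point quantifies how well
    the model tracks the process gain (magnitude and sign)."""
    L_lo, _ = inverse_model(F, z, xD - dxD, xB)
    L_hi, _ = inverse_model(F, z, xD + dxD, xB)
    return (L_hi - L_lo) / (2.0 * dxD)
```

A scatter of model gain versus process gain over many operating points, as in Figures 7.3 and 7.4, then shows whether the model at least gets the direction right.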
Figure 7.3a shows the process gains for the reflux rate with respect to the overhead
composition, ΔL/Δx_D, and Figure 7.3b shows the process gain for the boilup rate with
respect to the bottom composition, ΔV/Δx_B, for the lab column. Figures 7.4a and b
show similar results for the high-purity distillation column. Despite the fact that the
"process" and the "model" are distinctly different from each other, the model is able to
describe the process gain changes. However, it must be noted that even though we
believe that an essential feature for a good process control model is the ability to predict
the gain changes accurately, there could be other aspects that determine the fitness of a
model for control applications.
Figure 7.3. Steady-state process gains for the lab column. (a) Change in reflux rate with overhead product composition.
Figure 7.3. Continued. (b) Change in boilup rate with bottom product composition.
Figure 7.4. Steady-state process gains for the high-purity column. (a) Change in reflux rate with overhead product composition.
161
-0.002
-0.01 -0.008 -0.006 -0.004 d(V)/d(XB)-Process, (Ibmols/h/mf MeOH)
-0.002
Figure 7.4. Continued, (b) Change in boilup rate with bottom product composhion.
CHAPTER VIII
DISCUSSION, CONCLUSION, AND
RECOMMENDATIONS
8.1. On Using Neural Network Steady-State Process Inverse Models
The idea of using neural networks to develop steady-state inverse models is simple in
concept. The use of a neural network representing the inverse of the process to calculate
explicitly the manipulated variables needed to follow a reference system is not only
extremely appealing but also highly practicable. The inverse models can be developed
using neural networks simply by choosing the right set of input-output variables at the
network-training stage. Most process engineers have access to some approximate
steady-state model (CAD package, analytical equations, etc.) of their process or plant
which they employ to optimize and evaluate its performance. These steady-state models
could readily be used to generate all the data required to develop neural network models.
Once a neural network model is developed, it is ideally suited for control purposes. The
method of implementing the multivariable controller using steady-state inverse models is
simple, and follows standard industrial practice for tuning controllers. Therefore, we think
the technique can find relatively frequent application in practice.
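The data-generation step described above amounts to running the steady-state model forward and storing the input-output pairs in inverse order, so the network learns the mapping from desired compositions to the required flows. A minimal sketch, with `steady_state_sim`, the function name, and the sampling ranges all as placeholders:

```python
import random

def inverse_training_data(steady_state_sim, n_samples, ranges):
    """Build inverse-model training pairs from a forward steady-state
    simulator: sample (F, z, L, V), run the simulator to get the
    resulting (x_D, x_B), then store ((F, z, x_D, x_B) -> (L, V))
    so the network is trained as the process inverse."""
    data = []
    for _ in range(n_samples):
        F = random.uniform(*ranges["F"])
        z = random.uniform(*ranges["z"])
        L = random.uniform(*ranges["L"])
        V = random.uniform(*ranges["V"])
        xD, xB = steady_state_sim(F, z, L, V)  # forward simulation
        # inputs and outputs swapped relative to the forward model
        data.append(((F, z, xD, xB), (L, V)))
    return data
```

Any standard feedforward training procedure applied to these pairs then yields the steady-state process inverse used by the controller.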
Another issue is that of robustness of the model in the face of data uncertainty, also
known as the fault tolerance characteristics of the model. Neural networks are known to be
highly fault tolerant (Harmon, 1992). This is an important advantage of using neural
network models for process control, and even though it is not noticed immediately, it has
been mentioned from time to time in the neural network literature. The issue deals with the
ability of models to handle faulty, corrupted, or physically meaningless data. We
experienced one aspect of the fault tolerance of neural network models by serendipity.
163
A cursory glance at Figure 6.1d shows that the steady-state target values for the
bottom composition set-point, x_B,SS, calculated by the control law in Equation 5.15, take
values that are not only outside the training range (0.02-0.07 mole fraction methanol), but
are also physically meaningless (< 0.0 mole fraction methanol) for a short period of time.
Despite these obvious infractions, the neural network model does not show any adverse
behavior on receiving such spurious data. The reason is that even when the network
has been trained over specified ranges for the input variables, the network output variables
(L and V) are always bounded between the minimum and maximum values for the outputs
as determined from the training data. Also, the nature of the sigmoidal transfer function
(the hyperbolic tangent, in this case) is such that for any input in the range ±∞, the
transformed output is always bounded (±1, in this case), thus ensuring that even when
"garbage" goes in, "garbage" does not come out. The issue of tolerating faulty or
corrupted data is a real-world problem extremely relevant to process control because most
field instruments are electrical or electronic devices that transmit information from remote
locations to a central data acquisition station and are susceptible to random influences that
can corrupt the information very easily. A phenomenological model, for instance, would
fail under similar circumstances, and would require proper safeguards to be built into the
system to prevent the model from "crashing." This is just one aspect of fault tolerance,
and is by no means an attempt to state that neural networks can handle all types of faults.
There are several different types of faults that can adversely affect a multivariable
controller, and this research does not aim to evaluate the fault tolerance of neural network
models. However, the fault tolerance aspect of neural network models is worth exploring.
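The bounded-output property discussed above can be demonstrated with a one-neuron Python sketch. The weight, bias, and assumed training-range bounds on L are invented for illustration; the point is only that a tanh output rescaled to the training range cannot leave [y_min, y_max], no matter how spurious the input.

```python
import math

# With a tanh output rescaled to the training range, any input -- even a
# physically meaningless value such as a negative mole fraction -- yields
# an output within [L_MIN, L_MAX]. Weights and bounds are placeholders.
L_MIN, L_MAX = 50.0, 150.0      # assumed training-range bounds on reflux L

def bounded_output(net_input, w=2.0, b=0.1):
    t = math.tanh(w * net_input + b)          # t is always in (-1, 1)
    return L_MIN + (L_MAX - L_MIN) * (t + 1.0) / 2.0

for garbage in (-0.5, 0.0, 1e6, -1e6):        # includes out-of-range inputs
    y = bounded_output(garbage)
    assert L_MIN <= y <= L_MAX
```

"Garbage" in thus cannot produce unbounded "garbage" out; the prediction saturates at the training-data extremes.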
8.2. On Optimal Training of Neural Networks
Determining the "right" neural network model does require some experience. Like all
other regression techniques, coming up with an appropriate model is a trial-and-error
164
procedure with no guaranteed method to assure the "right" model. While techniques have
been investigated to determine initial network configuration and a set of initial weights
(Scott and Ray, 1993), these are, at best, estimates that enable more efficient training with
an algorithm such as backpropagation. While overfitting is an issue that cannot be
ignored, it becomes of increasing importance when the number of weights is of the order
of the number of training examples (such networks are called oversized networks). Also,
overfitting becomes more important when the gradient learning process (backpropagation)
is used for weight adjustment. However, with the optimization technique used here for
determining each new set of weights, the presence or absence of the validation set did not
make much difference to the overall network prediction characteristics. Other network
architectures with three, four, five, six and seven hidden nodes were also trained, but the
network with five hidden nodes gave the best overall performance based on the
normalized root mean square error for all patterns in the test data set.
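The selection criterion just described can be sketched as follows; the target and prediction values below are invented, and the real comparison was computed over all patterns in the test data set for each candidate hidden-layer size.

```python
import math

# Architecture selection by normalized root-mean-square error over the
# test patterns: the hidden-node count with the lowest NRMSE wins.
def nrmse(targets, predictions):
    n = len(targets)
    mse = sum((t - p) ** 2 for t, p in zip(targets, predictions)) / n
    span = max(targets) - min(targets)        # normalize by target range
    return math.sqrt(mse) / span

# illustrative test-set targets and predictions from two candidate networks
targets = [0.02, 0.03, 0.05, 0.07]
preds_5_hidden = [0.021, 0.029, 0.051, 0.069]
preds_3_hidden = [0.025, 0.027, 0.056, 0.064]
assert nrmse(targets, preds_5_hidden) < nrmse(targets, preds_3_hidden)
```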
While it would be desirable to include as many data points as possible to constitute a
training data set, it would be "best" to obtain a "reasonably good" model using the
minimum amount of data, from an engineering viewpoint. Our reason for using neural
networks differs distinctly from classical Connectionist ideology. Most Connectionists
prefer to use neural networks to model processes where the phenomenological behavior is
not quite clear. In such cases, it is advisable to gather as
much data as possible to train and test the network. In the present case, the process
phenomenon of distillation is older than the science of chemical engineering, and is quite
well understood with volumes written about its design and operation (Kister, 1992; Kister,
1990). We wish to use neural networks as a modeling tool because they offer the
advantages of simplicity and tremendous computational speed, two important necessities
for process control applications, among the other advantages already mentioned earlier.
Therefore, the aim is to develop as good a model with as little data as possible. The
165
lab-column model required 81 data points, while the high-purity column model required
375 data points. Using the standard rule-of-thumb from statistical regression
methodology, if n is the number of unknowns (weights), then at least 2n data points were
selected to constitute the training set. The training of the high-purity column model was
tried with 225 data points (3 data points each for F and z, and 5 for each of x_D and x_B; i.e.,
3x3x5x5 = 225 points), and with 375 points (3 for F, and 5 for each of z, x_D, and x_B; i.e.,
3x5x5x5 = 375 points), but these data sets did not yield as good a testing error as that
obtained when the data set comprised 375 data points obtained by selecting 3 data points
for z, and 5 for each of F, x_D, and x_B. The dynamic simulations confirmed that upsets due
to feed flowrate changes were more critical than those due to feed composition changes,
and hence, more data points were needed to capture the changes due to the former. The
high-purity column operation is more nonlinear than the lab column, and hence, a more
"fine-grain" description of the input-output behavior is needed. However, no effort was made
to determine an optimum number of training data points in this study.
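The factorial grids described above can be generated mechanically; in the sketch below the 0.02-0.07 range for x_B follows the text, while the other ranges are purely illustrative.

```python
from itertools import product

# Factorial training-grid construction: 3 levels for feed composition z
# and 5 each for F, xD, and xB gives 3 x 5 x 5 x 5 = 375 input patterns.
def levels(lo, hi, n):
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

F_grid  = levels(80.0, 120.0, 5)   # illustrative feed-rate range
z_grid  = levels(0.40, 0.60, 3)    # illustrative feed-composition range
xD_grid = levels(0.93, 0.97, 5)    # illustrative overhead range
xB_grid = levels(0.02, 0.07, 5)    # bottoms range stated in the text

training_inputs = list(product(F_grid, z_grid, xD_grid, xB_grid))
assert len(training_inputs) == 3 * 5 * 5 * 5   # 375 patterns
```

Each grid point would then be fed to the steady-state CAD simulation to obtain the corresponding L and V targets for training.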
8.3. In Conclusion
The novel approach presented in this study shows that neural networks can indeed be
used to model the steady-state process inverse of complex systems (see Section 3.3), such
as distillation columns. The neural network models, when coupled with a simple reference
system synthesis can be used to formulate a very simple multivariable controller (see
Section 5.2). The control strategy and the controller structure is simple to implement and
offers a practicable solution to a difficult control problem. The simplicity and directness
of the approach in addressing issues such as obtaining training and testing data from
CAD packages (see Section 3.3), training the neural networks with a more robust and
efficient nonlinear least-squares algorithm (see Sections 2.7 and 2.8), incorporating the
model in the feedback controller (see Section 5.2), and the use of steady-state models
166
(see Section 5.2.1) make it distinct from, and superior to, a conventional PI
control strategy.
The study was carried out using dynamic simulations of two different methanol-water
distillation columns: a lab-scale system and a high-purity industrial system. The neural
network models were trained on data obtained using steady-state CAD simulations for the
two methanol-water columns, and the CAD simulations were kept intentionally different
from the dynamic simulations to introduce mismatch between the actual process and the
model of the process. The neural network model-based controllers show good
performance for both servo and regulatory modes of operation. The neural network
models are extremely portable because the only thing that distinguishes one model from
another is the number and the numerical values of the weights. There is no difference in the
internal structure and working of two feedforward neural networks once their
architectures (number of input, hidden, and output nodes, and the type of transfer
function) have been specified. It is the set of weights that defines the model. This is a
distinct advantage while considering the practical implementation of such systems.
Neural networks are not, by any means, a panacea. They have their advantages and
disadvantages just like any other methodology. Neural networks should be considered as
a tool that has its place among other tools in a modeler's kit. Neural networks offer the
advantages of computational simplicity with the added ability to model complex systems
with enormous processing power, speed and generality, and their development requires
little engineering effort. The disadvantages are that developing neural network models
requires some expertise. Data capturing the essence of the input-output relationship is
critical. In our case, we had a reasonably good understanding of the underlying process
phenomena; but, there are many problems of practical importance where there is little or
no understanding of the physical process. For such situations, neural networks do offer a
possible opportunity for modeling, but the model development is at best a trial-and-error
167
procedure, and will require extensive testing and validation by an experienced "coach"
before it can be used with any confidence. Neural networks come in different shapes and
sizes with respect to neuron structure, architecture, and training techniques, much like ice
cream flavors. What is the best neuron structure? What is the right architectural size?
What is the best training technique for a given structure and type? These questions are all
highly debatable, and much research is focused on finding answers to them. We
have not attempted to determine the "best" answer to a given problem. Instead, our
attempt was to determine if a particular type of neural network can be utilized to solve a
practical problem in an efficient manner. It is our opinion that neural networks indeed do
a remarkable job and have great potential, especially in the area of modeling complex
chemical process systems for model-based process control applications.
8.4. Recommendations
No study can ever be said to be complete, and this one certainly is not. The main
purpose of any study is to open further doors and avenues to explore. Here are some
issues that are important, in my opinion, from both a scientific and a technological
viewpoint, in the further development of neural networks and their application in process
control.
8.4.1. Control of Systems with Higher Degrees of Complexity
The present study was concerned with demonstration of using neural network models
on two methanol-water distillation columns. Even though the binary system has a
nonideal VLE, and distillation operation is nonlinear, interactive, and nonstationary, it still
remains a fairly simple system from a phenomenological viewpoint. As the number of
components increases, the process phenomena become increasingly complex. The real
advantage of neural networks can be realized in control of complex process systems such
168
as multi-component distillation with nonideal VLE, and reactor systems with multi-phase
reactions. Presently, control of such systems is at best nominal, and relies heavily on
operator experience.
8.4.2. Constraint Control
Most processes are affected by operating constraints. The issue of constraint
control is becoming more important from an industrial practice perspective because almost
all processes are pushed to operate close to their designed capacity. Under such operating
conditions, often most of the "knobs" available to the operator are close to their
"saturation" values. For instance, in the case of a distillation column, it may be desirable to
operate the column at the highest possible throughput for a specified separation. The
reboiler heat duty and the overhead condenser cooling duty may be close to their
maximum capacity. The heating medium in the reboiler and the cooling medium in the
condenser may be constrained with respect to their flowrates. When one of the
manipulated variables is constrained, it can no longer be used as a manipulated variable,
but remains fixed and invariant until the constraint is no longer active. This study did not
address the issue of constraint control using neural network model-based controllers.
However, some conceptual thoughts on using neural networks in a model-based control
environment that would permit dealing with both constrained and unconstrained
operating conditions are presented here.
Figure 8.1 illustrates a possible structure for a neural network model-based control
that enables handling constraints (Rhinehart, 1993). The neural network model in this
case is a "process" model and not a "process inverse" model, i.e., it predicts the process
outputs when given the process inputs. In the case of a specific distillation column, a process
model could predict the overhead and bottom product compositions given the feed
flowrate and composition, reflux and boilup rates. The neural network model-based
169
[Block diagram. Labeled elements: Disturbances; Setpoints; Manipulated Variables from
Constrained Optimizer; Neural Network Process Model; Constraints; Constrained
Optimization Routine; Process Outputs (CVs); Steady-State Targets for CVs; Network
Predicted CVs; Error between Actual and Network Predictions for the CVs.]
Figure 8.1. Proposed structure for constrained neural network model-based control.
170
controller could comprise the neural network process model along with a reference system
synthesis with assumed first order dynamics such as the GMC law, and a constrained
optimization algorithm such as the Marquardt method. The error between the steady-state
targets for the controlled variables predicted by the reference system and the values
predicted by the neural network process model can be minimized to determine the
manipulated variables. A suitable objective function could be defined as

    φ = η (y_1,SS - y_1,NN)² + (1 - η) (y_2,SS - y_2,NN)² ,    (8.1)

where y_1,SS and y_2,SS are the steady-state target values for the controlled variables, y_1,NN
and y_2,NN are the neural network predictions for the controlled variables, and η (0 < η < 1)
is a weighting factor on the objective function to decide which controlled variable is more
important. The optimization algorithm can be used to search for the manipulated variables
that minimize the error function under both unconstrained and constrained operating
conditions.
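A sketch of the proposed scheme follows. A hypothetical linear stand-in replaces the trained neural network process model, and a crude bounded grid search replaces the Marquardt-type constrained optimizer, but the objective is exactly the weighted form of Equation 8.1.

```python
# Minimizing the Equation 8.1 objective over the manipulated variables
# (L, V), subject to simple bound constraints. The process model below is
# a hypothetical linear stand-in; a real application would use the trained
# network and a Marquardt-type constrained optimizer.
def nn_process_model(L, V):
    # placeholder: predicts (y1, y2) = (xD, xB) from (L, V)
    return 0.90 + 0.0005 * (L - 0.2 * V), 0.10 - 0.0004 * (V - 0.1 * L)

def objective(L, V, y1_ss, y2_ss, eta=0.5):
    y1_nn, y2_nn = nn_process_model(L, V)
    return eta * (y1_ss - y1_nn) ** 2 + (1.0 - eta) * (y2_ss - y2_nn) ** 2

# crude constrained search within the allowed operating ranges
best = min(((objective(L, V, 0.95, 0.05), L, V)
            for L in range(50, 151) for V in range(60, 181)),
           key=lambda t: t[0])
phi, L_opt, V_opt = best
assert 50 <= L_opt <= 150 and 60 <= V_opt <= 180   # constraints respected
```

Because the search is confined to the admissible (L, V) box, the returned move respects the constraints by construction; an active constraint simply pins the corresponding variable at its bound.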
There are several issues that need to be addressed with respect to developing and
using the process model in the manner described above. While development of the
process model should not be much different from that for developing process inverse
models, one has to be careful in using the models. Care should be taken to check for input
multiplicity, which can cause a great deal of trouble. The fault tolerance and robustness of
such a system also need to be carefully examined. The above control structure, however,
does have the promise of handling both unconstrained and constrained operations.
8.4.3. Comparison of Various Model-Based Controllers with Similar Control Structures
Advanced control using conventional PI controllers has been the focus of a number of
studies on distillation columns (Skogestad et al., 1990), and it is understood that, while
using PI controllers, selection of the control configuration is critical to the success of the
171
application. McAvoy (1983) has shown using steady-state RGA analysis that, for dual-
composition control in distillation columns, the energy balance scheme (x_D-L and x_B-V)
gives the most coupled controlled variable-manipulated variable pairings, and the double
ratio scheme (x_D-L/D, x_B-V/B) gives the most decoupled scheme, while the two ratio schemes
(x_D-L/D, x_B-V and x_D-L, x_B-V/B) give intermediate coupling. The PI controller
comparisons made in this study use the x_D-L and x_B-V scheme, which happens to be the
most difficult PI control configuration. To make a more reasonable comparison, one has
to choose a suitable PI control configuration and test it against a model-based control
structure of the same type. Such a study would enable making some qualitative and
quantitative assessment about conventional control versus advanced control strategies.
8.4.4. Optimize Neural Network Structure
The neural network models developed in this study have shown that such models can
be easily developed and are suitable for use as process control models. The models are
not the "best" neural network models. Even though the models that have been developed
do a great job of predicting actual process gain changes, there could still be some room for
improvement. Optimal network training addresses issues of finding (training) the "best"
neural network for a given problem. Many such optimal training schemes have been
suggested in literature (Weigand et al., 1990; Le Cun et al., 1990). In my opinion, the
issue of optimal training becomes more critical when the networks become oversized (i.e.,
when the number of weights approaches the number of data points), and if a training
algorithm such as backpropagation is used for weight adjustment. With the more robust
Marquardt method used for weight adjustment, it does not appear to be that critical. However, a suitable
technique, such as using a separate validation set (Weigand et al., 1990) or using the
second-derivative information (Le Cun et al., 1990), may be useful in deciding when to
stop training.
172
8.4.5. Using Neural Networks on Systems with Little Process Understanding
In this study, neural networks have been used to develop models of processes where
the phenomenological understanding of the process is quite clear. Distillation operation is
quite well known and understood to a great extent. The main point in using neural
network models instead of phenomenological models was to reduce the computations
involved and, thereby, increase the speed of the controllers. But there are many chemical
systems where the phenomenological understanding is still not very clear. Neural
networks have the potential to model such systems provided enough data is available to
enable development of a neural network model. It is important to note that the data has to
capture the variations in the different independent variables and the effect these changes
have on the dependent variables. Oftentimes, in cases where the phenomenological
understanding is limited, the available data is repetitious and does not reflect the cause-
effect patterns. However, neural networks do have the potential to solve such problems
and with the help of suitably designed experiments neural network models can be
developed.
8.4.6. Experimental Demonstration of Neural Network Model-Based Control
Even though the present study focused only on implementing the neural network
model-based controllers on dynamic simulations of the actual processes, care was taken to
make these dynamic simulations as "real" as possible in order to study the controller
performance from a real-world perspective. In order to add to the credibility of the
technique, the neural network model-based controller has to be demonstrated on an
experimental system and, if possible, followed up with an industrial implementation.
173
Implementation with a real-time system requires addressing several new issues, such as
interfacing with the data acquisition system, and transportability of the models. The
experimental demonstration will verify some of the salient features of the neural network
models, such as fault tolerance and sensitivity to noisy data, that were observed from the
simulation studies.
8.4.7. Testing Neural Network Models with Different Steady-State Characteristics
This issue is specific to modeling distillation columns. In the neural network models
developed for controlling distillation columns, four inputs were used, i.e., F, z, x_D, and x_B,
to obtain predictions for two outputs, i.e., L and V. In certain cases, when the stage
efficiency in a column is not affected greatly by changes in vapor and liquid loading in the
column, an alternate structure can be formulated. On a steady-state basis, Equations 4.1,
4.2, and 4.3 can be written to eliminate the feed flowrate F from the above equations.
This is valid because now the effect of changing F is captured in the ratios L/F and V/F,
and L and V increase or decrease proportionally with F when all other conditions remain
unchanged. A neural network model can be developed to take three inputs, i.e., z, x_D, and
x_B, and predict the ratios L/F and V/F. The network now becomes simpler (fewer inputs,
and therefore, fewer weights, and perhaps, fewer data points for training).
However, one has to be extremely cautious in applying such a model because,
oftentimes, systems with large flowrates require larger internal liquid and vapor flowrates,
and in such cases, the stage efficiency is affected greatly by the loading in the column. The
idea may not be valid for all cases but, nevertheless, may be worth exploring.
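The alternate structure can be sketched briefly; the ratio_model below is a hypothetical placeholder for the trained three-input network, and the point is only that the recovered L and V scale proportionally with F.

```python
# Ratio-based structure: a (hypothetical) network predicts L/F and V/F
# from (z, xD, xB); the flows are recovered by multiplying by the measured
# feed rate F, so L and V scale proportionally with F.
def ratio_model(z, xD, xB):
    # placeholder for the trained three-input network
    return 1.2 + 0.5 * (xD - xB), 1.5 + 0.4 * (xD - xB)   # (L/F, V/F)

def flows(F, z, xD, xB):
    lf, vf = ratio_model(z, xD, xB)
    return lf * F, vf * F

L1, V1 = flows(100.0, 0.5, 0.95, 0.05)
L2, V2 = flows(200.0, 0.5, 0.95, 0.05)    # doubling F doubles L and V
assert abs(L2 - 2 * L1) < 1e-9 and abs(V2 - 2 * V1) < 1e-9
```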
174
8.4.8. On-line Adaptation of the Neural Network Models
In the present case, all the neural network models used for control purposes were
static models, i.e., the models do not change or adapt to changing process conditions.
Any mismatch or offset between the process and the model is eliminated by the integral
terms in the nonlinear PMBC control laws. The basic difference between a neural network
model and a phenomenological model is that the latter can be adjusted or adapted either
periodically (i.e., at steady-state conditions) or continuously (i.e., at every sampling
interval) to keep the model as close as possible to the process at all times. With the
present neural network models, no attempt was made to adapt the model. If the model
represents the actual process behavior closely, it can be used for process optimization
purposes with a greater degree of confidence. There are several possible schemes to adapt
neural network models, and all the procedures usually involve adjusting at least one
parameter that captures most of the process uncertainties. For instance, stage efficiency in
distillation columns is one quantity that introduces the greatest uncertainty. One possible
scheme for adaptation of the neural network process-inverse model could be as follows:
use a separate neural network model that takes F, z, x_D, x_B, L, and V as inputs and predicts
the stage efficiency, η, as the output. This network can be used to obtain predictions for
the stage efficiency under the current operating conditions defined by F, z, x_D, x_B, L, and V.
The new η can then be used in the neural network process-inverse model, which now takes
F, z, x_D, x_B, and η as inputs and predicts L and V. The adaptation procedure can be
applied either periodically (Riggs and Rhinehart, 1990) or in an incremental on-line
manner (Rhinehart and Riggs, 1991).
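One adaptation step of the proposed scheme might look as follows; both networks are hypothetical placeholders with invented coefficients, standing in for the trained efficiency estimator and the η-aware process-inverse model.

```python
# One periodic adaptation step: an auxiliary network estimates the stage
# efficiency eta from current operating data, and the updated eta is fed
# to the process-inverse model along with the composition targets.
def efficiency_model(F, z, xD, xB, L, V):
    # placeholder for the trained efficiency-estimating network
    return max(0.0, min(1.0, 0.7 + 0.001 * (V - L)))

def inverse_model(F, z, xD_sp, xB_sp, eta):
    # placeholder for the eta-aware process-inverse network
    L = F * (0.8 + 0.5 * (xD_sp - xB_sp)) / eta
    V = F * (1.0 + 0.4 * (xD_sp - xB_sp)) / eta
    return L, V

# estimate eta at the current operating point, then recompute L and V
F, z, xD, xB, L, V = 100.0, 0.5, 0.95, 0.05, 120.0, 150.0
eta = efficiency_model(F, z, xD, xB, L, V)
L_new, V_new = inverse_model(F, z, 0.95, 0.05, eta)
assert 0.0 < eta <= 1.0
```

Repeating this pair of evaluations at steady state (periodic adaptation) or at every sampling interval (incremental adaptation) would keep the inverse model tracking a drifting stage efficiency.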
175
BIBLIOGRAPHY
Astrom, K. J., and McAvoy, T. J. "Intelligent Control," in J. Proc. Cont., Vol. 2, No. 3, 1992, 115-127.
Balchen, J. G., Lie, B., and Solberg, I. "Internal Decoupling in Non-Linear Process Control," in Model. Ident. Control, Vol. 9, 1988, 137-148.
Barnard, E. "Optimization for Training Neural Networks," in IEEE Trans. on Neural Networks, Vol. 3, No. 2, 1992, 232-240.
Bartusiak, R. D., Georgakis, C., and Reilly, M. "Nonlinear Feedforward/Feedback Control Structures Designed by Reference System Synthesis," in Chem. Eng. Sci., Vol. 44, 1989, 1837-1851.
Battiti, R. "Accelerated Back-propagation Learning: Two Optimization Methods," in Complex Syst., Vol. 3, 1989, 331-342.
Battiti, R. "First- and Second-Order Methods for Learning: Between Steepest Descent and Newton's Method," in Neural Computation, Vol. 4, 1992, 141-166.
Bequette, B. W. "Nonlinear Control of Chemical Processes: A Review," in Ind. Eng. Chem. Res., Vol. 30, 1991, 1391-1413.
Bhagat, P. "An Introduction to Neural Networks," in Chem. Eng. Prog., Vol. 86, No. 8, 1990, 55-60.
Bhat, N., and McAvoy, T. J. "Use of Neural Nets for Dynamic Modeling and Control of Chemical Process Systems," in Comp. and Chem. Eng., Vol. 14, No. 4/5, 1990, 573-583.
Bhat, N. V., Minderman, P., McAvoy, T. J., and Wang, N. S. "Modeling Chemical Process Systems Via Neural Computation," in IEEE Cont. Syst. Mag., Vol. 10, No. 3, 1990, 24-30.
Bosley, J. R., Edgar, T. F., Patwardhan, A. A., and Wright, G. T. "Model-Based Control: A Survey," in Advanced Control of Chemical Processes. Eds. K. Najim and E. Dufour, IFAC Symposium Series No. 8, 1992, 127-136.
Bristol, E. H. "On a Measure of Interactions for Multivariable Control," in IEEE Trans. Auto. Cont., AC-11, 1966, 133.
176
Broyden, C. G., Dennis, J. E., and More, J. J. "On the Local and Superlinear Convergence of Quasi-Newton Methods," in J.I.M.A., Vol. 12, 1973, 223-246.
Cooper, D. J., Hinde, Jr., R. F., and Megan, L. "Pattern-Based Adaptive Process Control," in Comp. and Chem. Eng., Vol. 14, 1990, 1339-1350.
Cooper, D. J., Megan, L., and Hinde, Jr., R. F. "Comparing Two Neural Networks for Pattern-Based Adaptive Process Control," in AIChE J., Vol. 38, No. 1, 1992a, 41-55.
Cooper, D. J., Megan, L., and Hinde, Jr., R. F. "Disturbance Pattern Classification and Neuro-Adaptive Control," in IEEE Cont. Syst. Mag., Vol. 12, No. 2, 1992b, 42.
Cott, B. J., Reilly, P. M., and Sullivan, G. R. "Selection Techniques for Process Model-Based Controllers." Paper presented at AIChE Spring National Meeting, Houston, TX, 1985.
Cutler, C. R., and Ramaker, B. L. "Dynamic Matrix Control-A Computer Control Algorithm," in Proc. Amer. Cont. Conf., San Francisco, CA, 1980, Paper WP5-B.
Cybenko, G. "Continuous Valued Neural Networks with Two Hidden Layers are Sufficient." Report, Department of Computer Science, Tufts University, Medford, 1988.
Cybenko, G. "Approximations by Superpositions of a Sigmoidal Function," in Math. Control Signal Systems, Vol. 2, 1989, 303-314.
Ding, S. S., and Luyben, W. L. "Control of Heat-Integrated Complex Distillation Configuration," in Ind. Eng. Chem. Res., Vol. 29, 1990, 1240-1249.
Elaahi, A., and Luyben, W. L. "Control of an Energy-Conservative Complex Configuration of Distillation Columns for Four-Component Separations," in Ind. Eng. Chem. Process Des. Dev., Vol. 24, 1985, 368-376.
Finco, M. V., Luyben, W. L., and Polleck, R. E. "Control of Distillation Columns with Low Relative Volatilities," in Ind. Eng. Chem. Res., Vol. 28, 1989, 75-83.
Fruehauf, P. S., and Mahoney, D. P. "Improve Distillation-Column Control Design," in Chem. Eng. Progress, Vol. 90, No. 3, 1994, 75-83.
Garcia, C. E., and Morari, M. M. "Internal Model Control. 1. A Unifying Review and Some New Results," in Ind. Eng. Chem. Process Des. Dev., Vol. 21, 1982, 308-323.
Grossberg, S. Classical and Instrumental Learning by Neural Networks in Progress in Theoretical Biology. Vol. 3, Academic Press, New York, 1977, 51-141.
177
Grossberg, S. Studies of Mind and Brain: Neural Principles of Learning, Perception, Development, Cognition, and Motor Control. Reidel Press, Boston, MA, 1982.
Guez, A., Eilbert, J. A., and Kam, M. "Neural Network Architecture for Control," in IEEE Cont. Sys. Mag., Vol. 8, No. 2, 1988, 22-25.
Harmon, P. "Neural Networks: Hot Air or Hot Technology? Part I," in Intelligent Software Strategies, Ed. Paul Harmon, Vol. VIII, No. 4, 1992, 1-12.
Hebb, D. O. The Organization of Behavior, a Neuropsychological Theory. John Wiley, New York, 1949.
Hecht-Nielsen, R. "Counterpropagation Networks," in Appl. Opt., Vol. 26, No. 23, 1987, 4979-4984.
Henley, E. J., and Rosen, E. M. Material and Energy Balance Computations. John Wiley & Sons, New York, 1969.
Henley, E. J., and Seader, J. D. Equilibrium-Stage Separation Operations in Chemical Engineering. John Wiley, New York, 1981.
Henson, M. E., and Seborg, D. E. "A Critique of Differential Geometric Approach to Nonlinear Process Control." Presented at the IFAC World Congress, Tallinn, Estonia, 1990.
Himmel, C. D., and May, G. S. "Advantages of Plasma Etch Modeling Using Neural Networks Over Statistical Techniques," in IEEE Transactions on Semiconductor Manufacturing, Vol. 6, No. 2, 1993, 103-111.
Hinde, R. F., and Cooper, D. J. "Using Pattern Recognition in Controller Adaptation and Performance Evaluation," in Proc. Amer. Cont. Conf., San Francisco, CA, 1993, 74-78.
Hokanson, D. A., and Gerstle, J. G. "Dynamic Matrix Control Multivariable Controllers," in Practical Distillation Control. Ed. W. L. Luyben, Van Nostrand Reinhold, New York, 1992, Chapter 12, 248-271.
Hopfield, J. J. "Neural Networks and Physical Systems with Emergent Collective Computational Abilities," in Proc. Natl. Acad. Sci., Vol. 79, 1982, 2554-2558.
Hopfield, J. J. "Neurons with Graded Response Have Collective Computational Properties Like Those of Two State Neurons," in Proc. Natl. Acad. Sci., Vol. 81, 1984, 3088-3092.
178
Hsiung, J. T., Suewatanakal, W., and Himmelblau, D. M. "Should Backpropagation be Replaced by a More Effective Optimization Algorithm?" in Proc. IJCNN, Seattle, WA, 1991.
Humphrey, J. L., Seibert, A. F., and Koort, R. A. "Separation Technologies-Advances and Priorities." DOE Contract AC07-90ID 12920, February 1991.
Hush, D. R., and Sales, J. M. "Improving the Learning Rate of Backpropagation with the Gradient Reuse Algorithm," in Proc. IEEE Intl. Conf. of Neural Networks (2nd), July, 1988, 441-448.
Isidori, A. Nonlinear Control Systems. 2nd Edition, Springer-Verlag, New York, 1989.
Isidori, A., Krener, A. J., Gori-Giorgi, C., and Monaco, S. "Nonlinear Decoupling via Feedback: A Differential Geometric Approach," in IEEE Trans. Auto. Cont., AC-26, 1981, 331.
Kister, H. Z. Distillation Design. McGraw-Hill, New York, 1992.
Kister, H. Z. Distillation Operation. McGraw-Hill, New York, 1990.
Kollias, S., and Anastassiou, D. "Adaptive Training of Multilayer Neural Networks Using a Least Squares Estimation Technique," in Proc. IEEE Intl. Conf. of Neural Networks (2nd), Vol. I, 1988, 383-390.
Kramer, M. A., and Leonard, J. A. "Diagnosis Using Backpropagation Neural Networks-Analysis and Criticism," in Comp. and Chem. Eng., Vol. 14, No. 12, 1990, 1323-1338.
Kung, S. Y., and Hwang, J. N. "An Algebraic Projection Analysis of Optimal Hidden Units Size and Learning Rates in Backpropagation Learning," in Proc. IEEE Intl. Conf. of Neural Networks (2nd), Vol. I, 1988, 363-370.
Le Cun, Y., Denker, J. S., and Solla, S. A. "Optimal Brain Damage," in Advances in Neural Information Processing Systems. Ed. David S. Touretzky, Morgan Kaufmann, Vol. 2, 1990, 598-605.
Le Cun, Y. "HLM: A Multilayer Learning Network," in Proc. Connectionist Models Summer School, Pittsburgh, 1986, 169-177.
Lee, P. L. "Generic Model Control-The Basics," in Nonlinear Process Control: Applications of Generic Model Control. Ed. Peter L. Lee, Springer-Verlag London Ltd., UK, 1993, 7-42.
179
Lee, P. L., and Sullivan, G. R. "Generic Model Control," in Comp. and Chem. Eng., Vol. 12, No. 6, 1988, 573-580.
Leonard, J. A., and Kramer, M. A. "Improvement of the Backpropagation Algorithm for Training Neural Networks," in Comp. and Chem. Eng., Vol. 14, No. 3, 1990a, 337-341.
Leonard, J. A., and Kramer, M. A. "Limitation of the Backpropagation Approach to Fault-Diagnosis and Improvement with Radial Basis Functions." Presented at the AIChE Annual Meeting, Chicago, IL, 1990b, Paper 96e.
Luyben, W. L. Process Modeling, Simulation and Control for Chemical Engineers. 2nd Edition, McGraw-Hill Co., New York, 1990, 129-141.
Luyben, W. L. (Editor). Practical Distillation Control. Van Nostrand Reinhold, New York, 1992.
MacMurray, J., and Himmelblau, D. "Identification of a Packed Distillation Column for Control via Artificial Neural Networks," in Proc. Amer. Cont. Conf., San Francisco, CA, 1993, 1455-1459.
Marquardt, D. W. "An Algorithm for Least-Squares Estimation of Nonlinear Parameters," in J. Soc. Indust. Appl. Math., Vol. 11, No. 2, 1963, 431-441.
McAvoy, T. J. "Connection Between Relative Gain and Control Loop Stability and Design," in AIChE J., Vol. 27, 1981, 613-619.
McAvoy, T. J. Interaction Analysis: Principles and Application. ISA Monograph #6, Instrument Society of America, NC, 1983.
McClelland, T. L., and Rumelhart, D. E. Parallel Distributed Processing. PDP Research Group, MIT Press, Cambridge, MA, 1986.
Megan, L., and Cooper, D. J. "A Neural Network Approach to Adaptive Control of a Pilot Plant Distillation Column." Presented at the AIChE Annual Meeting, St. Louis, MO, 1993, Paper 145f.
Mehta, D. D., and Ross, D. E. "Optimize ICI Methanol Process," in Hydrocarbon Processing, November, 1970, 183-186.
Mehta, D. D., and Pan, W. W. "Purify Methanol This Way," in Hydrocarbon Processing, February, 1971, 115-120.
Minsky, M., and Papert, S. Perceptrons. MIT Press, Cambridge, MA, 1969.
Muhrer, C. A., Collura, M. A., and Luyben, W. L. "Control of Vapor Recompression Distillation Columns," in Ind. Eng. Chem. Res., Vol. 29, 1990, 59-71.
Nahas, E. P., Henson, M. A., and Seborg, D. E. "Nonlinear Internal Model Control Strategy for Neural Network Models," in Comp. and Chem. Eng., Vol. 16, No. 12, 1992, 1039-1057.
Namatane, A., and Kimata, Y. "Improving the Generalizing Capabilities of a Back-Propagation Network," in Neural Networks, Vol. 1, 1989, 86-93.
Narendra, K. S., and Parthasarathy, K. "Identification and Control of Dynamical Systems Using Neural Networks," in IEEE Trans. on Neural Networks, Vol. 1, No. 1, 1990, 4-27.
Pandit, H. G., and Rhinehart, R. R. "Experimental Demonstration of Constrained Process Model-Based Control of a Nonideal Binary Distillation Column," Proc. Amer. Cont. Conf., Chicago, IL, 1992, 630-631.
Pandit, H. G., Rhinehart, R. R., and Riggs, J. B. "Experimental Demonstration of Nonlinear Model-Based Control of a Nonideal Binary Distillation Column," Proc. Amer. Cont. Conf., Chicago, IL, 1992, 625-629.
Papastathopoulou, H. S., and Luyben, W. L. "Control of Binary Sidestream Distillation Columns," in Ind. Eng. Chem. Res., Vol. 30, 1991, 705-713.
Parker, D. B. "Optimal Algorithm for Adaptive Networks: Second Order Backpropagation, Second Order Direct Propagation, and Second Order Hebbian Learning," in Proc. IEEE Conf. on Neural Networks, Vol. II, 1987, 593-600.
Patwardhan, A. A., Rawlings, J. B., and Edgar, T. F. "Nonlinear Model Predictive Control," in Comp. Chem. Eng., Vol. 14, 1990, 123.
Peel, C., Willis, M. J., and Tham, M. T. "A Fast Procedure for the Training of Neural Networks," in J. Proc. Cont., Vol. 2, No. 4, 1992, 205-211.
Poh, I., and Jones, R. D. "A Neural Network Model for Prediction," in J. Amer. Stat. Assn., Vol. 89, No. 425, 1994, 117-121.
Pottman, M., and Seborg, D. E. "Identification of Nonlinear Processes Using Reciprocal Multiquadratic Functions," in J. Proc. Cont., Vol. 2, No. 4, 1992, 189-203.
Press, W. H., Teukolsky, S. A., Vetterling, W. T., and Flannery, B. P. Numerical Recipes in C: The Art of Scientific Computing. 2nd Edition, Cambridge University Press, England, 1992, 683-688.
Prett, D. M., and Garcia, C. E. Fundamental Process Control. Butterworths, Boston, MA, 1988.
Psichogios, D. C., and Ungar, L. H. "A Hybrid Neural Network-First Principles Approach to Process Modeling," in AIChE J., Vol. 38, No. 10, 1992, 1499-1511.
Raich, A., Wu, X., and Qmi, A. "Approximate Dynamic Models for Chemical Processes: A Comparative Study of Neural Networks and Nonlinear Time Series Modeling Techniques." Presented at the AIChE Annual Meeting, Los Angeles, CA, 1991, Paper 143b.
Ramchandran, S. "Marquardt Method-A Program for Nonlinear Optimization and Equation Solving." Technical Report, Department of Chemical Engineering, Texas Tech University, Lubbock, TX, 1993.
Ramchandran, B., Riggs, J. B., and Heichelheim, H. R. "Nonlinear Plant-Wide Control: Application to a Supercritical Fluid Extraction Process," in Ind. Eng. Chem. Res., Vol. 31, 1992, 290-300.
Rhiel, F. F. "Model-Based Control," in Practical Distillation Control. Ed. W. L. Luyben, Van Nostrand Reinhold, New York, 1992, Chapter 21, 440-450.
Rhinehart, R. R. Personal Communication, Department of Chemical Engineering, Texas Tech University, Lubbock, TX, 1993.
Rhinehart, R. R. and Riggs, J. B. "Two Simple Methods for On-line Incremental Model Parameterization," in Comp. and Chem. Eng, Vol. 15, No. 3, 1991, 181-189.
Rhinehart, R. R. and Riggs, J. B. "Process Control Through Nonlinear Modeling," in Control, Vol. 1, No. 7, 1990, 86-90.
Richalet, J., Rault, A., Testud, J. L., and Papon, J. "Model Predictive Heuristic Control: Applications to Industrial Processes," in Automatica, Vol. 14, 1978, 413.
Ricotti, L. P., Ragazzini, S., and Martinelli, G. "Learning Word Stress in a Sub-optimal Second Order Backpropagation Neural Network," in Proc. IEEE Conf. on Neural Networks, Vol. I, 1988, 355-361.
Rietman, E. A., and Lory, E. R. "Use of Neural Networks in Modeling Semiconductor Manufacturing Processes: An Example for Plasma Etch Modeling," in IEEE Transactions on Semiconductor Manufacturing, Vol. 6, No. 4, 1993, 343-347.
Riggs, J. B. "It's the Gain Prediction, Stupid!" Personal Communication, Department of Chemical Engineering, Texas Tech University, Lubbock, TX, 1993.
Riggs, J. B. "Nonlinear Process Model Based Control of a Propylene Sidestream Draw Column," in Ind. Eng. Chem. Res., Vol. 29, 1990, 2221-2226.
Riggs, J. B., Beauford, M., and Watts, J. "Using Tray-to-Tray Models for Distillation Control," in Nonlinear Process Control: Applications of Generic Model Control. Ed. Peter L. Lee, Springer-Verlag London Ltd., UK, 1993, 67-103.
Riggs, J. B., and Rhinehart, R. R. "Comparison Between Two Nonlinear Process-Model Based Controllers," in Comp. and Chem. Eng, Vol. 14, No. 10, 1990, 1075-1081.
Rosenblatt, F. "The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain," in Psych. Rev., Vol. 65, 1958, 386-408.
Rumelhart, D. E., Hinton, G. E., and Williams, R. J. "Learning Internal Representations by Error Propagation," in Parallel Distributed Processing. Eds. D. E. Rumelhart, J. L. McClelland, and the PDP Research Group, MIT Press, Vol. 1, Chap. 8, 1986, 318-362.
Scott, G. M., and Ray, W. H. "Creating Efficient Nonlinear Neural Network Process Models That Allow Model Interpretation," in J. Proc. Cont., Vol. 3, No. 3, 1993, 163-178.
Seborg, D. E., Edgar, T. F., and Shah, S. L. "Adaptive Control Strategies for Process Control: A Survey," in AIChE J., Vol. 32, No. 6, 1986, 881-913.
Skogestad, S., Lundstrom, P., and Jacobsen, E. W. "Selecting the Best Distillation Control Structure," in AIChE J., Vol. 36, No. 5, 1990, 753-764.
Thibault, J., and Grandjean, B. P. A. "Neural Networks in Process Control-A Survey," in Advanced Control of Chemical Processes. Eds. K. Najim and E. Dufour, IFAC Symposium Series No. 8, 1992, 251-260.
Treybal, R. E. Mass Transfer Operations. 3rd Edition, McGraw-Hill Co., New York, 1980, 342-473.
Tyreus, B. D., and Luyben, W. L. "Controlling Heat Integrated Distillation Columns," in Chem. Eng. Progress, Vol. 72, No. 9, 1976, 59-66.
Venkatasubramanian, V., Vaidyanathan, R., and Yamamoto, Y. "Process Fault Detection and Diagnosis Using Neural Networks-1. Steady-State Processes," in Comp. and Chem. Eng., Vol. 14, No. 7, 1990, 699-712.
Venkatasubramanian, V., and Chan, K. "A Neural Network Methodology for Process Fault Diagnosis," in AIChE J., Vol. 35, 1989, 1993-2005.
Watrous, R. L. "Learning Algorithms for Connections and Networks: Applied Gradient Methods for Nonlinear Optimization," in Proc. IEEE Conf. on Neural Networks, Vol. II, 1987, 619-627.
Weigend, A. S., Huberman, B. A., and Rumelhart, D. E. "Predicting the Future: A Connectionist Approach," in Intl. J. Neural Systems, Vol. 1, No. 3, 1990, 193-209.
Werbos, P. "Beyond Regression: New Tools for Prediction and Analysis in Behavioral Sciences," Ph.D. Dissertation, Harvard University, 1974.
White, H. "Some Asymptotic Results for Backpropagation," in Proc. IEEE Conf. on Neural Networks, Vol. III, 1987, 261-266.
Widrow, B. "Generalization and Information Storage in Networks of Adaline 'Neurons'," in Self-Organizing Systems. Eds. M. C. Jovitz, G. T. Jacobi, and G. Goldstein, Spartan Books, Washington, DC, 1962, 435-461.
Willis, M. J., Montague, G. A., Di Massimo, C., Tham, M. T., and Morris, A. J. "Artificial Neural Network Based Predictive Control," in Advanced Control of Chemical Processes. Eds. K. Najim and E. Dufour, IFAC Symposium Series No. 8, 1992, 261-266.
Willis, M. J., Di Massimo, C., Montague, G. A., Tham, M. T., and Morris, A. J. "On The Applicability of Neural Networks in Chemical Process Control." Presented at the AIChE Annual Meeting, Chicago, IL, 1990, Paper 16d.
Wood, R. K., and Berry, M. W. "Terminal Composition Control of a Binary Distillation Column," in Chem. Eng. Sci., Vol. 28, 1973, 1707-1717.
You, Y., and Nikolaou, M. "Dynamic Process Modeling with Recurrent Neural Networks," in AIChE J., Vol. 39, No. 10, 1993, 1654-1667.
Zurada, J. M. Introduction to Artificial Neural Systems. West Publishing Co., New York, 1992.
APPENDIX A
ERROR BACKPROPAGATION TRAINING
ALGORITHM
The following development of the error backpropagation training algorithm has been
adapted from the class notes of Dr. W. J. B. Oldham's graduate course on neural networks
at Texas Tech University (CS 5388: Neural Networks). Let us consider a three-layered
feedforward neural network with hyperbolic tangent transfer functions for all the nodes in
the hidden and output layers. The hyperbolic tangent transfer function is given as
ψ(x) = tanh(x),

which can also be written as

ψ(x) = (e^x - e^(-x)) / (e^x + e^(-x)),   (A.1)

and the derivative of ψ(x) is given as

ψ'(x) = 1 - [(e^x - e^(-x)) / (e^x + e^(-x))]^2,

i.e.,

ψ'(x) = 1 - (ψ(x))^2.   (A.2)

The hyperbolic tangent transfer function described in Equation A.1 is continuous and is
bounded between ±1 for the range -∞ < x < ∞.
Consider the three-layered network as shown in Figure A.1. The details of the
processes occurring at the 'i'th hidden node and the 'k'th output node are shown in
Figures A.2 and A.3, respectively. The summed input to the 'i'th node in the hidden
layer can be written as

z_i = Σ_j w1_{i,j} x_j,   (A.3)

and the transformed output from the 'i'th node in the hidden layer can be written as
Figure A.1. A 3-layered feedforward neural network with inputs X(1), ..., X(j), ..., X(n) and outputs Y(1), ..., Y(k), ..., Y(m).
Figure A.2. Processes in the 'i'th neuron in the hidden layer.
Figure A.3. Processes in the 'k'th neuron in the output layer.
h_i = ψ(z_i).   (A.4)

Similarly, for the 'k'th node in the output layer, the summed input is given as

r_k = Σ_i w2_{k,i} h_i,   (A.5)

and the transformed output is given as

Y_k = ψ(r_k).   (A.6)
Let Y_pl and D_pl be the actual (network-predicted) and desired responses, respectively,
at the 'l'th output node for the 'p'th input pattern, where 1 ≤ p ≤ P, and P is the total
number of input patterns. If E_p is defined as half of the squared error between the
desired and actual values, then

E_p = (1/2) Σ_{l=1..nl} (Y_pl - D_pl)^2,   (A.7)

where nl is the total number of outputs from the network. The total error over all P
patterns is then

E = Σ_{p=1..P} E_p,

or

E = (1/2) Σ_{p=1..P} Σ_{l=1..nl} (Y_pl - D_pl)^2.   (A.8)
In a least-squares method, the weights are adjusted to minimize E. In the error
backpropagation training algorithm (EBTA), the weights are adjusted to minimize E_p.
The change in the second layer of weights is given as

Δ_p w2_{l,i} = -α ∂E_p/∂w2_{l,i},   (A.9)

where α is the learning rate (step size). The partial derivative of the error function can be
obtained by differentiating Equation A.7 as shown below:
∂E_p/∂w2_{l,i} = (1/2) · 2 · (Y_pl - D_pl) ∂Y_pl/∂w2_{l,i},

i.e.,

∂E_p/∂w2_{l,i} = (Y_pl - D_pl) ∂Y_pl/∂w2_{l,i}.   (A.10)

But, according to Equation A.6,

Y_pl = ψ(r_pl).   (A.11)

Therefore,

∂Y_pl/∂w2_{l,i} = ψ'(r_pl) ∂r_pl/∂w2_{l,i}.   (A.12)

Substituting Equations A.2 and A.11 in Equation A.12 gives

∂Y_pl/∂w2_{l,i} = (1 - (Y_pl)^2) ∂r_pl/∂w2_{l,i}.   (A.13)

Equation A.10, after substitution from Equation A.13, becomes

∂E_p/∂w2_{l,i} = (Y_pl - D_pl)(1 - (Y_pl)^2) ∂r_pl/∂w2_{l,i}.   (A.14)

Now, differentiating Equation A.5 with respect to w2_{l,i} gives

∂r_pl/∂w2_{l,i} = h_pi.   (A.15)

Substituting Equation A.15 in Equation A.14 gives

∂E_p/∂w2_{l,i} = (Y_pl - D_pl)(1 - (Y_pl)^2) h_pi,   (A.16)

and substituting Equation A.16 in Equation A.9 yields

Δ_p w2_{l,i} = α (D_pl - Y_pl)(1 - (Y_pl)^2) h_pi.   (A.17)
With a similar treatment for the weights between the input and hidden layers, it can be
shown that

∂E_p/∂w1_{i,j} = Σ_{l=1..nl} (Y_pl - D_pl)(1 - (Y_pl)^2) ∂r_pl/∂w1_{i,j},   (A.18)

and

∂r_pl/∂w1_{i,j} = (∂r_pl/∂h_pi)(∂h_pi/∂z_pi)(∂z_pi/∂w1_{i,j}).   (A.19)

Differentiating Equation A.5 with respect to h_pi gives

∂r_pl/∂h_pi = w2_{l,i},   (A.20)

differentiating Equation A.4 with respect to z_pi gives

∂h_pi/∂z_pi = 1 - (h_pi)^2,   (A.21)

and differentiating Equation A.3 with respect to w1_{i,j} gives

∂z_pi/∂w1_{i,j} = x_pj.   (A.22)

Substituting Equations A.20, A.21, and A.22 in Equation A.19, and then substituting
Equation A.19 in Equation A.18, gives

∂E_p/∂w1_{i,j} = (1 - (h_pi)^2) x_pj Σ_{l=1..nl} (Y_pl - D_pl)(1 - (Y_pl)^2) w2_{l,i},

which can be expressed as

Δ_p w1_{i,j} = α (1 - (h_pi)^2) x_pj Σ_{l=1..nl} (D_pl - Y_pl)(1 - (Y_pl)^2) w2_{l,i}.   (A.23)

Equations A.17 and A.23 are the relations used to "backpropagate" the error from the
outputs to the inputs. In general, the EBTA weight adjustment for an M-layered neural
network can be shown to take the form given below:

Δ_p w^{M-l}_{i,j} = α δ^{M-l+1}_i x^{M-l}_j,   for l = 1, 2, ..., M-1,

where

δ^{M-l+1}_i = (D_i - Y_i)(1 - (Y_i)^2),   for l = 1,

and

δ^{M-l+1}_i = (1 - (x^{M-l+1}_i)^2) Σ_{k=1..n} δ^{M-l+2}_k w^{M-l+1}_{k,i},   for l = 2, 3, ..., M-1,

where x^m_j denotes the output of the 'j'th node in the 'm'th layer and n is the number of
nodes in the (M-l+2)th layer.
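As an illustration (not part of the original appendix), the per-pattern updates of Equations A.17 and A.23 can be sketched in Python; the network size, weight values, and learning rate below are arbitrary choices for demonstration.

```python
import math

def forward(x, w1, w2):
    """Forward pass through a 3-layered tanh network (Equations A.3-A.6)."""
    z = [sum(w1[i][j] * x[j] for j in range(len(x))) for i in range(len(w1))]
    h = [math.tanh(zi) for zi in z]                      # hidden outputs (A.4)
    r = [sum(w2[k][i] * h[i] for i in range(len(h))) for k in range(len(w2))]
    y = [math.tanh(rk) for rk in r]                      # network outputs (A.6)
    return h, y

def ebta_deltas(x, d, w1, w2, alpha):
    """Weight changes for one input pattern, per Equations A.17 and A.23."""
    h, y = forward(x, w1, w2)
    # second-layer change: alpha * (D - Y) * (1 - Y^2) * h            (A.17)
    dw2 = [[alpha * (d[k] - y[k]) * (1.0 - y[k] ** 2) * h[i]
            for i in range(len(h))] for k in range(len(w2))]
    # first-layer change: alpha * (1 - h^2) * x * backpropagated sum  (A.23)
    dw1 = [[alpha * (1.0 - h[i] ** 2) * x[j] *
            sum((d[k] - y[k]) * (1.0 - y[k] ** 2) * w2[k][i]
                for k in range(len(w2)))
            for j in range(len(x))] for i in range(len(w1))]
    return dw1, dw2

# small illustrative network: 2 inputs, 3 hidden nodes, 1 output
w1 = [[0.1, -0.2], [0.3, 0.05], [-0.15, 0.2]]
w2 = [[0.2, -0.1, 0.25]]
dw1, dw2 = ebta_deltas([0.5, -0.3], [0.4], w1, w2, alpha=0.1)
```

Because the updates equal -α ∂E_p/∂w, they can be checked against a finite-difference gradient of E_p, which is a convenient sanity test for any hand-coded backpropagation.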
APPENDIX B
THE MARQUARDT ALGORITHM
The Marquardt-Levenberg method (also known as the Marquardt method)
(Marquardt, 1963) is a nonlinear optimization and equation-solving technique. The
algorithm can be used to estimate unknown variables in sets of nonlinear equations where
the number of variables is less than or equal to the number of equations. Simple
constraints on the parameters may be used to keep the solution in bounds. The following
paragraphs give a description and explanation for the usage of a FORTRAN subroutine
Marquardt (Ramchandran, 1993). Examples are provided to illustrate the usage of the
program code.
B.1. Description
Consider a set of n equations in k unknown variables of the form:

f_1(x_1, x_2, ..., x_k) = y_1,
f_2(x_1, x_2, ..., x_k) = y_2,
...
f_n(x_1, x_2, ..., x_k) = y_n,

where x_i are the unknown variables, y_i are the known values, and f_i are the known
functions. Also n ≥ k, and (x_i)_min ≤ x_i ≤ (x_i)_max, where (x_i)_min and (x_i)_max
are the minimum and maximum constraints on the unknowns.
The algorithm seeks to find a set of x that will minimize a user-defined function, such
as the sum of squares error, φ, given by

φ = Σ_{i=1..n} (f_i - y_i)^2.
The algorithm can also be used to maximize a user-defined function and will also
handle weighted objective functions. The method of solution combines the best features
of the gradient and Newton-Raphson procedures by using a suitable weighting parameter
(the parameter is adjusted internally by the routine). The method has the stability of the
gradient procedure with respect to poor starting values, and at the same time, it possesses
the speed of convergence of the Newton-Raphson method when close to the final solution.
More details on the Marquardt method may be obtained from Press et al. (1992) and
Battiti (1992). The FORTRAN program code for the Marquardt method is attached in
Appendix B.
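As a sketch of the idea (this Python code is not the FORTRAN subroutine itself; the routine name, step sizes, and the test problem below are illustrative assumptions), the core Marquardt iteration blends the gradient and Newton-Raphson directions through the factor λ, shrinking λ after a successful step and growing it by the factor ν after a failed one:

```python
def solve_linear(A, b):
    """Gaussian elimination with partial pivoting for a small k x k system."""
    n = len(b)
    M = [row[:] + [bv] for row, bv in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for cc in range(c, n + 1):
                M[r][cc] -= factor * M[c][cc]
    xs = [0.0] * n
    for r in range(n - 1, -1, -1):
        xs[r] = (M[r][n] - sum(M[r][cc] * xs[cc]
                               for cc in range(r + 1, n))) / M[r][r]
    return xs

def marquardt(f, x0, y, lam=0.01, nu=10.0, tol=1e-12, max_iter=500):
    """Adjust x to drive the n functions f(x) toward the desired values y
    (n >= k), minimizing the sum of squares phi = sum (f_i - y_i)^2."""
    def phi(xv):
        return sum((fi - yi) ** 2 for fi, yi in zip(f(xv), y))
    x = list(x0)
    k = len(x)
    for _ in range(max_iter):
        fx = f(x)
        r = [yi - fi for yi, fi in zip(y, fx)]       # residuals y - f(x)
        h = 1e-7                                     # finite-difference Jacobian
        J = [[0.0] * k for _ in fx]
        for j in range(k):
            xp = list(x)
            xp[j] += h
            fp = f(xp)
            for i in range(len(fx)):
                J[i][j] = (fp[i] - fx[i]) / h
        # normal equations (J^T J + lam I) delta = J^T r
        A = [[sum(J[i][a] * J[i][b] for i in range(len(fx))) +
              (lam if a == b else 0.0) for b in range(k)] for a in range(k)]
        g = [sum(J[i][a] * r[i] for i in range(len(fx))) for a in range(k)]
        delta = solve_linear(A, g)
        if max(abs(d) for d in delta) < tol:
            break
        trial = [xi + di for xi, di in zip(x, delta)]
        if phi(trial) < phi(x):
            x = trial
            lam /= nu     # success: lean toward the Newton-Raphson direction
        else:
            lam *= nu     # failure: lean toward the steepest descent direction
    return x
```

For example, driving the two residuals of the Rosenbrock test problem to zero from the standard starting point (-1.2, 1.0) converges to (1, 1).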
B.2. Limitations
The routine may converge to a relative minimum in the sum of squares surface (rather
than the global minimum), get hung up on a ridge (when the ridge is very long and
narrow), or terminate due to round-off errors. The bounds are intended to keep the
solution inside a feasible region, but it is assumed that the solution does not lie on a
bound. If the answer does lie on a bound, then it may not be found. If the solution is on
one of the bounds, then the user might want to extend the bounds and retry.
B.3. Program Usage
The FORTRAN subroutine MARQUARDT calculates the derivatives numerically.
The program can, however, be modified to handle analytical derivatives as well. The user
is required to code the equations for the function evaluations in the calling program.
B.4. Declaratives
The following declaratives are needed for proper execution of the program:
DIMENSION B(2*K), Z(2*N), Y(N), BV(K), BMIN(K), BMAX(K), P(K*(N+2)+N),
A(KD,1), AC(KD,1), CC(6), INDEX(5), OUTPUT(5), where K and N are defined in the
argument list.
B.5. Argument List
The calling sequence is CALL MARQUARDT (K, B, N, Z, Y, CC, INDEX, BV,
BMIN, BMAX, OUTPUT, KD, P, A, AC, INDEX1), where

1. K-the number of independent variables (K > 1). (Input)

2. B-the vector of K unknowns. On first entry into the subroutine, initial estimates
must be supplied for B(1) through B(K). On each exit, the routine supplies a new
improved estimate of the unknowns. On the final exit, the vector contains the "best" point
found to date. Locations B(K+1) through B(2*K) always contain the "best" point (i.e.,
the base point) found to date. (Input and Output)

3. N-the number of equations to be solved (N ≥ K). (Input)

4. Z-a vector of the N computed function values calculated in the calling program
before first entry and on subsequent requests for function evaluations. Locations Z(N+1)
through Z(2*N) contain function values corresponding to B(K+1) through B(2*K).
(Input)

5. Y-a vector of N desired function values. (Input)

6. CC-a real data storage vector for the convergence criteria factors.

(i) CC(1)-initial value of ν. If CC(1) ≤ 0.0, ν is set internally to 10.0. ν is the factor
used to change λ by multiplication or division. For a finer one-dimensional search, set ν
to a smaller value, say 2.0. (Input)
(ii) CC(2)-initial value of λ. If CC(2) ≤ 0.0, λ is set internally to 0.01. The value will
automatically change as computation continues. λ is the factor that is used to combine the
moves from the gradient and Newton-Raphson methods. When λ is large (i.e., 1.0), the
search is primarily in the negative gradient direction. When λ is small (i.e., 0.00001), it is
primarily in the Newton-Raphson direction. (Input)

(iii) CC(3)-initial value of τ. If CC(3) ≤ 0.0, τ is set internally to 0.001. τ is used in
the convergence test (explained under INDEX(3)). (Input)

(iv) CC(4)-initial value of ε. If CC(4) ≤ 0.0, ε is set internally to 0.00002. ε is used
in the convergence test. (Input)

(v) CC(5)-initial value of φ_min. If CC(5) ≤ 0.0, φ_min is set internally to 0.0. When
φ < φ_min, the partial derivatives from the previous iteration are used instead of
computing them again. (Input)

(vi) CC(6)-error limit, set to 1.0E-10. (Input)
7. INDEX-an integer storage vector.

(i) INDEX(1)-used to control the sequence of operations internally. Must be set to
1 on initial entry into MARQUARDT. It is reset by MARQUARDT after initial entry.
Table B.1 describes the values that INDEX(1) can have and their corresponding meaning.
(Input and Output)

(ii) INDEX(2)-used to determine if a function or derivative needs to be calculated, or
if a new base point is being reported. It is set by MARQUARDT. Table B.2 describes the
values that INDEX(2) can take and their corresponding meaning. (Output)

(iii) INDEX(3)-indicates the status of the search at a new base point. INDEX(3) is set
to K initially. Table B.3 describes the values that INDEX(3) can take and their
corresponding meaning. (Output)

(iv) INDEX(4)-iteration counter. (Output)
Table B.1. Values for INDEX(1) and their Corresponding Meaning

Value   Meaning
  1     Must be set on initial entry. (Input)
  2     Analytical derivative mode (not applicable in this version). (Output)
  3     Numerical derivative mode. (Output)
  4     Search mode. (Output)
  5     New base point mode. (Output)
 -1     Search cannot continue. (Output)
Table B.2. Values for INDEX(2) and their Corresponding Meaning

Value   Meaning
  0     Calculate the function, Z(X), for all new values of X.
  1     Calculate the derivative vector of the function with respect to X(J),
        where J is given in INDEX1.
 -1     A new base point has been found (the starting point is a new base
        point). Examine INDEX(3) for convergence.
Table B.3. Values for INDEX(3) and their Corresponding Meaning

Value   Meaning
 > 0    Gives the number of variables not satisfying the convergence
        criterion, i.e., those for which |Δx_i| > τ|x_i| + ε, where τ and ε are
        specified in CC(3) and CC(4). Recall MARQUARDT.
   0    All parameters satisfy the convergence criterion.
 < 0    Discussed under Error Returns.
8. BV-a vector indicating which of the B variables are actually to be varied by the
program. It may be varied (by the user) after each new base point. If BV(I) = 0.0, hold
B(I) constant. If BV(I) = 1.0, allow B(I) to vary by using numerical derivatives. (Input)

9. BMIN-a vector containing the lower bounds on all B variables. It may be varied
(by the user) after each new base point has been found. (Input)

10. BMAX-a vector containing the upper bounds on all B variables. It may be varied
(by the user) after each new base point has been found. (Input)
11. OUTPUT-an output vector of real variables which is reported at each new base
point. (Output)

(i) OUTPUT(1)-φ, the value of the user-defined objective function at the current base
point.

(ii) OUTPUT(2)-γ, the angle in degrees between the step actually taken and the
steepest descent direction at the last base point.

(iii) OUTPUT(3)-a counter for the number of times a return from MARQUARDT is
made. It is set to zero on first entry and incremented by 1 on each exit.

(iv) OUTPUT(4)-a counter for the number of function evaluations required by
MARQUARDT. It is set to 1 on initial entry (to count the initial function evaluation)
and incremented by 1 each time a return from a function evaluation is made.

(v) OUTPUT(5)-a counter for the number of derivative evaluations. It is set to 0 on
the first entry and incremented by one each time a return from a partial derivative request
is made.
12. KD-the number of rows of the storage matrices A and AC in the calling program;
KD must be greater than or equal to K. Generally, KD = K+2. (Input)

13. P-a scratch vector used to store the values of all the partial derivatives computed
in the calling program. The first N*K locations contain the partial derivatives stored
columnwise:
∂z_1/∂x_1   ∂z_1/∂x_2   ...   ∂z_1/∂x_k
∂z_2/∂x_1   ∂z_2/∂x_2   ...   ∂z_2/∂x_k
   ...         ...               ...
∂z_n/∂x_1   ∂z_n/∂x_2   ...   ∂z_n/∂x_k
The partial derivatives in the P vector are calculated using finite difference approximations.
The space in the P vector is used to store the following data:

P(1) - P(N*K) - the N*K Jacobian matrix
P(N*K+1) - P(N*K+K) - current value of B
P(N*K+K+1) - P(N*K+2K) - value of B at each new base point + ΔB
P(N*K+2K+1) - P(N*K+2K+N) - value of Z corresponding to the current B
14. A-a scratch matrix used internally, of dimension (K, KD).

15. AC-a scratch matrix used internally, of dimension (K, KD).

16. INDEX1-a dummy counter to store the index of B when evaluating functions.
B.6. Error Returns

INDEX(2) = -1 implies that either a new base point has been found (i.e., INDEX(1) =
5) or that the search cannot be continued (i.e., INDEX(1) = -1). If the search is to be
terminated, either INDEX(3) has to be 0 (i.e., the convergence criteria have been satisfied)
or INDEX(3) is negative. Table B.4 describes the values that INDEX(3) can take and
their corresponding meaning.
Table B.4. Values for INDEX(3) Under Error Returns and their Corresponding Meaning

Value   Meaning
 -1     A new base point has been found, but λ > 1 and γ > 90°, and the
        convergence criteria have not been met. This implies that numerical
        difficulties are present.
 -2     There are more unknowns than equations (N < K).
 -3     The total number of variables to be varied is zero, as indicated in the
        BV vector.
 -4     The convergence criteria have been met (same as INDEX(3) = 0), but
        λ > 1 and γ > 45°. This generally means that progress has been very
        slow, perhaps due to the presence of a ridge.
 -5     On entry the value of INDEX(1) was 0 or negative.
 -6     One of the variables was out of the stated range of BMAX and BMIN
        on entry.
 -7     The value of λ > 10 but the convergence criteria have not been met.
        This implies that the convergence tolerance may be too small.
 -8     The convergence criteria have been met in equation solving, but φ is
        greater than the error limit, CC(6). This implies the existence of a
        relative minimum which is not an exact solution.
B.7. Examples for Use of the Marquardt Method for Equation Solving and Optimization
The examples discussed below illustrate the use of the FORTRAN subroutine
MARQUARDT as an equation solver and an optimizer.
B.7.1. Example 1: Multiple Reaction System (Example 8-12.2 from Henley and Rosen, 1969)
Synthesis gas manufacture involves the following reactions:

CH4 + H2O ⇌ CO + 3H2   (I)

CO + H2O ⇌ CO2 + H2   (II)

The K_a values for these reactions are 0.59 and 2.49, respectively. If the ideal gas law and
Dalton's law apply, how many moles of each component are present at equilibrium if
initially 6 moles of CH4 and 5 moles of H2O are charged? The pressure is 1 atmosphere.
Solution: The equation relating the equilibrium constant to the standard free energy of
reaction and the liquid- or gas-phase composition can be written as

(ln K_{a,T})_j = Σ_{i=1..N} a_{ij} ln(x_i p_i) = Σ_{i=1..N} a_{ij} ln(y_i P) = -(ΔG°_{j,T} / RT),

where j = 1, 2, ..., M reactions; x_i is the mole fraction of component i in the liquid
phase; p_i is the vapor pressure; K_{a,T} is the reaction equilibrium constant at
temperature T; y_i is the mole fraction of component i in the gas phase; a_{ij} is the
stoichiometric coefficient of component i in reaction j; ΔG°_{j,T} is the standard free
energy change for the reaction at temperature T; and R is the gas constant.
We seek a solution for the extents of reaction, e_1 and e_2, that make f_1 and f_2 equal
to zero, where f_1 and f_2 are given by

f_1(e_1, e_2) = -ln 0.59 + Σ_{i=1..5} a_{i1} ln(y_i · 1),

f_2(e_1, e_2) = -ln 2.49 + Σ_{i=1..5} a_{i2} ln(y_i · 1).
The stoichiometric matrix, a_{ij}, and initial moles are as given below:

Component   Reaction (I)   Reaction (II)   Initial Moles
CH4             -1              0               6
H2O             -1             -1               5
CO               1             -1               0
H2               3              1               0
CO2              0              1               0
If n_{i0} is the initial number of moles of component i, then the moles present at any set
of extents are given as

n_i = n_{i0} + Σ_{j=1..2} a_{ij} e_j,   i = 1, 2, ..., 5.

Using the above equation, the values of n_i are calculated, f_1 and f_2 are evaluated, and
e_1 and e_2 are adjusted by MARQUARDT to minimize f_1 and f_2. The FORTRAN
program code for Example 1 is attached in Appendix D along with the corresponding
program output after execution.
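For comparison (an illustration, not part of the original appendix), the same equilibrium problem can be solved with a small damped Newton iteration in Python standing in for MARQUARDT; the starting point, step damping, and tolerances below are assumed values, and the K values are those quoted in the problem statement.

```python
import math

def moles(e1, e2):
    """Moles of CH4, H2O, CO, H2, CO2 at extents e1, e2, from the
    stoichiometric matrix and initial charge given above."""
    return [6.0 - e1, 5.0 - e1 - e2, e1 - e2, 3.0 * e1 + e2, e2]

def residuals(e1, e2, K1=0.59, K2=2.49):
    """f1 and f2: log-form equilibrium relations for reactions (I) and (II)
    at 1 atmosphere."""
    n = moles(e1, e2)
    N = sum(n)                       # total moles = 11 + 2*e1
    y = [ni / N for ni in n]
    f1 = (math.log(y[2]) + 3.0 * math.log(y[3])
          - math.log(y[0]) - math.log(y[1]) - math.log(K1))
    f2 = (math.log(y[4]) + math.log(y[3])
          - math.log(y[2]) - math.log(y[1]) - math.log(K2))
    return [f1, f2]

def solve_extents(e1=2.0, e2=0.5, tol=1e-10):
    """Damped Newton iteration with a finite-difference Jacobian."""
    for _ in range(200):
        f = residuals(e1, e2)
        if max(abs(fi) for fi in f) < tol:
            break
        h = 1e-7
        fa = residuals(e1 + h, e2)
        fb = residuals(e1, e2 + h)
        J = [[(fa[i] - f[i]) / h, (fb[i] - f[i]) / h] for i in range(2)]
        det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
        d1 = (-f[0] * J[1][1] + f[1] * J[0][1]) / det   # Cramer's rule for
        d2 = (-f[1] * J[0][0] + f[0] * J[1][0]) / det   # J * delta = -f
        s = 1.0                # damp the step so all mole numbers stay positive
        while min(moles(e1 + s * d1, e2 + s * d2)) <= 1e-6:
            s *= 0.5
        e1 += s * d1
        e2 += s * d2
    return e1, e2
```

The converged extents then give the equilibrium moles of each component through moles(e1, e2).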
B.7.2. Example 2: Find the solution to the following set of equations:

3x_1 + x_2 + 2x_3^2 - 3 = 0

-3x_1 + 5x_2^2 + 2x_1 x_3 - 1 = 0

25x_1 x_2 + 20x_3 + 12 = 0

Starting values are (1.0, 1.0, 1.0). The equations are scaled to the form f(x) = 1, and
the y vector is set to 1.0. Using numerical derivatives, the solution was found after 28
evaluations of the function. The routine was entered 34 times. The angle between the
actual and steepest descent direction for the last step was 51.16°. The FORTRAN
program code for Example 2 is attached in Appendix D along with the corresponding
program output after execution.
B.7.3. Example 3: It is desired to determine the parameters a_1, a_2, a_3, and a_4 that fit
an equation of the form

y_i = a_1 e^{a_2 t_i} + a_3 e^{a_4 t_i}

to a set of nine data points in a least-squares sense. Starting values, and maximum and
minimum values for the a's, are given in the DATA statements. The FORTRAN program
code for Example 3 is attached in Appendix D along with the corresponding program
output after execution.
APPENDIX C
EMPIRICAL CORRELATIONS FOR THE METHANOL-
WATER SYSTEM
The dynamic simulations for the methanol-water distillation columns require system
specific information that describes the vapor-liquid equilibrium (VLE) for the methanol-
water system under the chosen operating conditions of pressure and temperature, the
liquid and vapor enthalpy data, and the liquid and vapor density data. While the
differential equations that describe the material and energy balances (discussed in Chapter
IV, Section 4.2) are standard for any distillation column, the behavior of the distillation
column, both from a steady state as well as a dynamic viewpoint, is dependent on the
thermodynamic properties of the system under consideration.
The system under consideration is a methanol-water system which is a nonideal binary
mixture. In order to get an accurate description of the thermodynamic behavior of the
system, one has to account for the nonidealities in the vapor and liquid phases. Since the
operating pressures are close to atmospheric pressure, one can assume that the vapor
phase behaves ideally, and that the nonideal behavior is essentially in the liquid phase. The
liquid-phase nonideality can be described with the help of an activity coefficient model,
such as the Wilson model or the NRTL model. Even though the activity models describe
the thermodynamic behavior more accurately, they tend to be more complicated from a
computational standpoint. Also, the parameters of the activity model are values
regressed from experimental data obtained over a specific range of conditions. Instead of
using a detailed thermodynamic model in the dynamic simulations, empirical relations
were developed using experimental data to correlate the VLE, vapor- and liquid-phase
enthalpies, and densities as functions of the liquid-phase composition. The empirical
correlations are standard polynomial regression equations that provide sufficient accuracy
with the advantage of high computational speed.
C.1. Correlations for the Lab-Column Dynamic Simulator

The lab column operates essentially at atmospheric pressure, and the vapor-liquid
equilibrium for the methanol-water system at one atmosphere absolute pressure
(Henley and Seader, 1981) is reported in Table C.1, along with the corresponding
equilibrium temperature. Table C.2 shows enthalpy data for the same system for one
atmosphere absolute (Henley and Seader, 1981). Data for the average molecular weight,
saturated liquid and vapor density, and density of liquid subcooled to 120°F were obtained
from a steady-state process simulation package (HYSIM) using the NRTL thermodynamic
model for the methanol-water system at one atmosphere absolute pressure. Table C.3
shows the above mentioned data.
C.1.1. Correlation for Vapor-Liquid Equilibrium

The composition of the vapor phase y, in mole fraction methanol, in terms of the
liquid-phase composition x, in mole fraction methanol, is given by the equation

y = 0.0207 + 5.6509x - 20.2753x^2 + 37.8756x^3 - 33.4747x^4 + 11.2092x^5.   (C.1)

Figure C.1 shows the fit obtained by the above equation.
C.1.2. Correlation for Saturation Temperature

The saturation temperature T, in degrees Fahrenheit, at any liquid-phase composition
x (mole fraction methanol), is given by the equation

T = 210.76 - 243.45x + 515.74x^2 - 547.70x^3 + 213.33x^4.   (C.2)

Figure C.2 shows the fit obtained by the above equation.
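As an illustrative check (not part of the original appendix; the function names are arbitrary), Equations C.1 and C.2 can be coded directly and compared against the tabulated equilibrium data:

```python
def vle_y(x):
    """Equation C.1: vapor-phase mole fraction methanol at liquid-phase
    mole fraction x (fifth-order polynomial correlation)."""
    return (0.0207 + 5.6509 * x - 20.2753 * x**2 + 37.8756 * x**3
            - 33.4747 * x**4 + 11.2092 * x**5)

def sat_temperature(x):
    """Equation C.2: saturation temperature in deg F at liquid-phase
    mole fraction x."""
    return (210.76 - 243.45 * x + 515.74 * x**2
            - 547.70 * x**3 + 213.33 * x**4)
```

At x = 0.5, Equation C.1 gives y ≈ 0.770 versus 0.78 in Table C.1, and Equation C.2 gives about 162.8 deg F versus the tabulated 73.1 deg C (about 163.6 deg F), consistent with the regression quality shown in Figures C.1 and C.2.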
Table C.1. Vapor-Liquid Equilibrium for Methanol-Water System at 1 atma (from Henley and Seader, 1981).

X (mf MeOH)   Y (mf MeOH)   T (deg. C)
0.00          0.00          100.00
0.02          0.13           96.40
0.04          0.23           93.50
0.06          0.30           91.20
0.08          0.37           89.30
0.10          0.42           87.70
0.15          0.52           84.40
0.20          0.58           81.70
0.30          0.67           78.00
0.40          0.73           75.30
0.50          0.78           73.10
0.60          0.83           71.20
0.70          0.87           69.30
0.80          0.92           67.60
0.90          0.96           66.00
0.95          0.98           65.00
1.00          1.00           64.50
Table C.2. Enthalpy Data for Methanol-Water System at 1 atma (Henley and Seader, 1981).

X or Y (mf MeOH)   HV (BTU/lbmol)   hL (BTU/lbmol)
0.00               20720.00         3240.00
0.05               20520.00         3070.00
0.10               20340.00         2950.00
0.15               20160.00         2850.00
0.20               20000.00         2760.00
0.30               19640.00         2620.00
0.40               19310.00         2540.00
0.50               18970.00         2470.00
0.60               18650.00         2410.00
0.70               18310.00         2370.00
0.80               17980.00         2330.00
0.90               17680.00         2290.00
1.00               17930.00         2250.00
Table C.3. Transport Property Data for Methanol-Water System at 1 atma.

X (mf MeOH)   AMW (lb/lbmol)   T (deg. F)   rho_L (lb/ft^3)   rho_V (lb/ft^3)   rho_L_SC (lb/ft^3)
0.00          18.02            212.00       59.18             0.0367            61.75
0.05          18.72            198.84       58.12             0.0383            60.37
0.10          19.42            190.01       57.07             0.0399            59.11
0.15          20.12            183.65       56.07             0.0414            57.96
0.20          20.82            178.80       55.12             0.0431            56.90
0.25          21.52            174.94       54.24             0.0447            55.93
0.30          22.22            171.75       53.41             0.0463            55.04
0.35          22.93            169.04       52.65             0.0480            54.21
0.40          23.63            166.67       51.93             0.0497            53.45
0.45          24.33            164.55       51.26             0.0514            52.73
0.50          25.03            162.63       50.64             0.0532            52.07
0.55          25.73            160.85       50.07             0.0550            51.46
0.60          26.43            159.19       49.53             0.0568            50.89
0.65          27.13            157.62       49.03             0.0586            50.35
0.70          27.83            156.12       48.57             0.0605            49.85
0.75          28.54            154.98       48.13             0.0625            49.39
0.80          29.24            153.29       47.73             0.0644            48.95
0.85          29.94            151.94       47.36             0.0664            48.55
0.90          30.64            150.62       47.01             0.0683            48.17
0.95          31.34            149.33       46.69             0.0703            47.81
1.00          32.04            148.07       46.40             0.0722            47.48
Figure C.1. Vapor-liquid equilibrium for methanol-water system at 1 atma (fifth-order polynomial fit of Equation C.1; r^2 = 0.9991).
Figure C.2. Saturated liquid temperature versus liquid-phase composition for methanol-water system at 1 atma.
C.1.3. Correlation for Saturated Liquid Density
The saturated liquid density rho_L, in lb/ft^3, at any liquid-phase composition x (mole fraction methanol), is given by the equation
rho_L = 59.215 - 23.01x + 13.298x^2 - 3.104x^3.    (C.3)
Figure C.3 shows the fit obtained by the above equation.

C.1.4. Correlation for Saturated Vapor Density
The saturated vapor density rho_V, in lb/ft^3, at any vapor-phase composition y (mole fraction methanol), is given by the equation
rho_V = 3.68×10^-2 + 3.02×10^-2 y + 5.57×10^-3 y^2 - 2.02×10^-4 y^3.    (C.4)
Figure C.4 shows the fit obtained by the above equation.

C.1.5. Correlation for Liquid Density at 120°F
The liquid density at 120°F, rho_L_SC, in lb/ft^3, at any liquid-phase composition x (mole fraction methanol), is given by the equation
rho_L_SC = 61.696 - 27.477x + 19.492x^2 - 6.269x^3.    (C.5)
Figure C.5 shows the fit obtained by the above equation.

C.1.6. Correlation for Liquid Enthalpy
The liquid enthalpy hL, in BTU/lbmol, at any liquid-phase composition x (mole fraction methanol), is given by the equation
hL = 3218.5 - 2918.9x + 3631.7x^2 - 1692.5x^3.    (C.6)
Figure C.6 shows the fit obtained by the above equation.
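Because Equations C.3 through C.6 are simple polynomials in composition, they can be evaluated directly. The sketch below (in Python rather than the dissertation's FORTRAN, for illustration only) evaluates the four 1-atma correlations with Horner's rule and can be spot-checked against the tabulated data.

```python
def horner(coeffs, z):
    """Evaluate a polynomial whose coefficients are listed in
    increasing power of z."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * z + c
    return result

# 1 atma correlations (Equations C.3-C.6); x, y in mole fraction methanol
def rho_L(x):       # saturated liquid density, lb/ft^3 (Eq. C.3)
    return horner([59.215, -23.01, 13.298, -3.104], x)

def rho_V(y):       # saturated vapor density, lb/ft^3 (Eq. C.4)
    return horner([3.68e-2, 3.02e-2, 5.57e-3, -2.02e-4], y)

def rho_L_SC(x):    # liquid density subcooled to 120 deg F, lb/ft^3 (Eq. C.5)
    return horner([61.696, -27.477, 19.492, -6.269], x)

def hL(x):          # saturated liquid enthalpy, BTU/lbmol (Eq. C.6)
    return horner([3218.5, -2918.9, 3631.7, -1692.5], x)
```

For example, at x = 1.0 the liquid-density correlation returns 46.399 lb/ft^3, matching the Table C.3 value of 46.40 lb/ft^3.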
Figure C.3. Saturated liquid density versus liquid-phase composition for methanol-water system at 1 atma. [Figure not reproduced; fitted curve den_L = 59.215 - 23.01X + 13.298X^2 - 3.104X^3, r^2 = 1.000.]
Figure C.4. Saturated vapor density versus liquid-phase composition for methanol-water system at 1 atma. [Figure not reproduced; fitted curve den_V = 3.68E-2 + 3.02E-2X + 5.57E-3X^2 - 2.02E-4X^3, r^2 = 1.00.]
Figure C.5. Subcooled liquid density versus liquid-phase composition for methanol-water system at 1 atma. [Figure not reproduced; fitted curve den_SCL = 61.696 - 27.477X + 19.492X^2 - 6.269X^3, r^2 = 1.000.]
Figure C.6. Saturated liquid enthalpy versus liquid-phase composition for methanol-water system at 1 atma. [Figure not reproduced; fitted curve hL = 3218.5 - 2918.9X + 3631.7X^2 - 1692.5X^3, r^2 = 0.9987.]
217
C. 1.7. Correlation for Vapor Enthalpy
The vapor enthalpy Hy, in BTU/lbmol, at any vapor-phase composhion >' (mole
fraction methanol) is given by the equation
/ / j . =20669.1-3338.3>'. (C.7)
Figure C.7 shows the fit obtained by the above equation.
C. 1.8. Correlation for Average Molecular Weight
The average molecular weight AMW, in Ib/lbmol, at any liquid-phase composition x
(mole fraction methanol), is given by the equation
/}A/W^ = 18.015+14.027X. (C.8)
Figure C.8 shows the fit obtained by the above equation.
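Since Equations C.6 and C.7 share a mole-fraction-methanol basis, their difference gives a rough molar heat of vaporization, a quantity useful in column energy balances. The sketch below (Python, illustrative only; the `latent_heat` function is not taken from the dissertation's simulator code) evaluates Equations C.6 through C.8.

```python
def HV(y):    # saturated vapor enthalpy at 1 atma, BTU/lbmol (Eq. C.7)
    return 20669.1 - 3338.3 * y

def AMW(x):   # average molecular weight, lb/lbmol (Eq. C.8)
    return 18.015 + 14.027 * x

def hL(x):    # saturated liquid enthalpy at 1 atma, BTU/lbmol (Eq. C.6)
    return 3218.5 - 2918.9 * x + 3631.7 * x**2 - 1692.5 * x**3

def latent_heat(z):
    """Rough molar latent heat (BTU/lbmol) at equal liquid and vapor
    composition z; an illustrative use of Eqs. C.6 and C.7 only."""
    return HV(z) - hL(z)
```

For instance, latent_heat(0.5) gives roughly 16,500 BTU/lbmol, between the values for pure water and pure methanol, as expected.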
C.2. Correlations for the High-Purity Column Dynamic Simulator
The high-purity column operates at approximately two atmospheres absolute pressure. The vapor-liquid equilibrium for the methanol-water system at two atmospheres absolute (computed with HYSIM using the NRTL thermodynamic model) is reported in Table C.4, along with the corresponding equilibrium temperature. Table C.5 shows enthalpy data for the same system at two atmospheres absolute. Data for the saturated liquid and vapor densities, and for the density of liquid subcooled to 120°F, were also obtained from the steady-state process simulation package (HYSIM) using the NRTL thermodynamic model for the methanol-water system at two atmospheres absolute; Table C.6 shows these data.
Figure C.7. Saturated vapor enthalpy versus vapor-phase composition for methanol-water system at 1 atma. [Figure not reproduced; fitted line HV = 20669.1 - 3338.3Y, r^2 = 0.9993.]
Figure C.8. Average molecular weight versus liquid-phase composition for methanol-water system. [Figure not reproduced; plot of AMW (lb/lbmol) vs. X (mf MeOH).]
Table C.4. VLE for Methanol-Water System at 2 atma.

X          Y          T
(mf MeOH)  (mf MeOH)  (deg. F)
0.000   0.0000   242.26
0.025   0.1442   235.04
0.050   0.2517   229.10
0.075   0.3346   224.14
0.100   0.4003   219.94
0.150   0.4977   213.22
0.200   0.5669   208.06
0.250   0.6194   203.95
0.300   0.6612   200.55
0.350   0.6962   197.67
0.400   0.7263   195.17
0.450   0.7533   192.94
0.500   0.7781   190.91
0.550   0.8014   189.04
0.600   0.8236   187.29
0.650   0.8454   185.63
0.700   0.8668   184.04
0.750   0.8881   182.51
0.800   0.9096   181.03
0.850   0.9315   179.60
0.900   0.9537   178.20
0.925   0.9651   177.51
0.950   0.9765   176.83
0.975   0.9882   176.16
1.000   1.0000   175.49
Table C.5. Enthalpy Data for Methanol-Water System at 2 atma.

X or Y     HV           hL           hL_SC
(mf MeOH)  (BTU/lbmol)  (BTU/lbmol)  (BTU/lbmol)
0.000   21301.89   3799.75   1588.99
0.025   21246.98   3723.40   1609.98
0.050   21191.66   3664.80   1630.98
0.075   21135.91   3622.16   1651.97
0.100   21079.71   3591.20   1672.96
0.150   20965.94   3555.07   1714.95
0.200   20850.24   3542.90   1756.93
0.250   20732.43   3546.72   1798.92
0.300   20612.40   3561.36   1840.90
0.350   20490.01   3583.56   1882.89
0.400   20365.11   3610.82   1924.87
0.450   20237.52   3641.58   1966.86
0.500   20107.09   3674.63   2008.84
0.550   19973.70   3709.23   2050.83
0.600   19836.14   3745.47   2092.81
0.650   19698.13   3780.43   2134.80
0.700   19556.36   3816.28   2176.78
0.750   19412.15   3851.94   2218.77
0.800   19269.19   3889.61   2260.75
0.850   19126.26   3922.06   2302.74
0.900   18985.26   3956.29   2344.72
0.925   18915.70   3973.18   2365.71
0.950   18846.84   3989.92   2386.71
0.975   18778.67   4006.53   2407.70
1.000   18711.21   4022.88   2428.69
Table C.6. Transport Property Data for Methanol-Water System at 2 atma.

X          rho_L      rho_V      rho_L_SC
(mf MeOH)  (lb/ft^3)  (lb/ft^3)  (lb/ft^3)
0.000   58.28   0.0622   59.38
0.025   57.75   0.0635   58.94
0.050   57.21   0.0648   58.52
0.075   56.68   0.0662   58.10
0.100   56.16   0.0675   57.68
0.150   55.15   0.0702   56.87
0.200   54.20   0.0729   56.10
0.250   53.31   0.0757   55.35
0.300   52.47   0.0785   54.62
0.350   51.69   0.0813   53.93
0.400   50.96   0.0842   53.26
0.450   50.29   0.0872   52.63
0.500   49.65   0.0901   52.02
0.550   49.07   0.0932   51.44
0.600   48.50   0.0963   50.89
0.650   48.01   0.0994   50.36
0.700   47.53   0.1026   49.87
0.750   47.09   0.1059   49.40
0.800   46.69   0.1092   48.96
0.850   46.29   0.1125   48.55
0.900   45.93   0.1157   48.17
0.925   45.77   0.1174   47.99
0.950   45.60   0.1190   47.81
0.975   45.45   0.1206   47.65
1.000   45.30   0.1222   47.49
C.2.1. Correlation for Vapor-Liquid Equilibrium
The composition of the vapor phase y, in mole fraction methanol, in terms of the liquid-phase composition x, in mole fraction methanol, for the range 0.0 < x < 0.1 is given by the equation
y = 0.000142 + 6.596215x - 36.315685x^2 + 103.809028x^3.    (C.9)
Figure C.9 shows the fit obtained by the above equation. For the range 0.1 < x < 0.98, the relationship is given by the equation
y = 0.118525 + 3.658772x - 9.584915x^2 + 14.423370x^3 - 10.870606x^4 + 3.256565x^5,    (C.10)
and for the range 0.98 < x < 1.0, the relationship is
y = 0.758018 + 0.006244x + 0.235730x^2.    (C.11)
Figures C.10 and C.11 show the fits obtained by Equations C.10 and C.11, respectively.
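The piecewise evaluation of Equations C.9 through C.11 can be sketched as follows (Python, for illustration; the simulator itself is written in FORTRAN):

```python
def y_equilibrium(x):
    """Vapor composition (mf MeOH) vs. liquid composition x for the
    methanol-water system at 2 atma, piecewise per Eqs. C.9-C.11."""
    if x <= 0.1:        # Eq. C.9
        return (0.000142 + 6.596215 * x - 36.315685 * x**2
                + 103.809028 * x**3)
    elif x <= 0.98:     # Eq. C.10
        return (0.118525 + 3.658772 * x - 9.584915 * x**2
                + 14.423370 * x**3 - 10.870606 * x**4 + 3.256565 * x**5)
    else:               # Eq. C.11
        return 0.758018 + 0.006244 * x + 0.235730 * x**2
```

For example, y_equilibrium(0.05) returns about 0.252 against the Table C.4 value of 0.2517, and y_equilibrium(1.0) returns 0.999992, effectively the pure-methanol limit.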
C.2.2. Correlation for Saturation Temperature
The saturation temperature T, in degrees Fahrenheit, at any liquid-phase composition x (mole fraction methanol), is given by the equation
T = 240.96 - 249.05x + 518.71x^2 - 545.46x^3 + 210.95x^4.    (C.12)
Figure C.12 shows the fit obtained by the above equation.

C.2.3. Correlation for Saturated Liquid Density
The saturated liquid density rho_L, in lb/ft^3, at any liquid-phase composition x (mole fraction methanol), is given by the equation
rho_L = 58.322 - 23.133x + 13.125x^2 - 3.017x^3.    (C.13)
Figure C.13 shows the fit obtained by the above equation.
Figure C.9. Vapor-liquid equilibrium for 0.0 < x < 0.1 for methanol-water system at 2 atma. [Figure not reproduced; plot of Y vs. X (mf MeOH).]
Figure C.10. Vapor-liquid equilibrium for 0.1 < x < 0.98 for methanol-water system at 2 atma. [Figure not reproduced; fitted curve Y = 0.118525 + 3.658772X - 9.584915X^2 + 14.423370X^3 - 10.870606X^4 + 3.256565X^5, r^2 = 0.999961.]
Figure C.11. Vapor-liquid equilibrium for 0.98 < x < 1.0 for methanol-water system at 2 atma. [Figure not reproduced; plot of Y vs. X (mf MeOH).]
Figure C.12. Saturated liquid temperature versus liquid composition for methanol-water system at 2 atma. [Figure not reproduced; fitted curve T = 240.96 - 249.05X + 518.71X^2 - 545.46X^3 + 210.95X^4, r^2 = 0.999.]
Figure C.13. Saturated liquid density versus liquid composition for methanol-water system at 2 atma. [Figure not reproduced; fitted curve rho_L = 58.322 - 23.133X + 13.125X^2 - 3.017X^3, r^2 = 1.000.]
C.2.4. Correlation for Saturated Vapor Density
The saturated vapor density rho_V, in lb/ft^3, at any vapor-phase composition y (mole fraction methanol), is given by the equation
rho_V = 0.0623 + 0.0506y + 0.0118y^2 - 0.0024y^3.    (C.14)
Figure C.14 shows the fit obtained by the above equation.

C.2.5. Correlation for Liquid Density at 120°F
The liquid density at 120°F, rho_L_SC, in lb/ft^3, at any liquid-phase composition x (mole fraction methanol), is given by the equation
rho_L_SC = 59.380 - 17.554x + 5.663x^2.    (C.15)
Figure C.15 shows the fit obtained by the above equation.

C.2.6. Correlation for Liquid Enthalpy
The liquid enthalpy hL, in BTU/lbmol, at any liquid-phase composition x (mole fraction methanol), is given by the equation
hL = 3793.26 - 3025.26x + 12111.98x^2 - 19761.89x^3 + 15960.15x^4 - 5058.99x^5.    (C.16)
Figure C.16 shows the fit obtained by the above equation.

C.2.7. Correlation for Vapor Enthalpy
The vapor enthalpy HV, in BTU/lbmol, at any vapor-phase composition y (mole fraction methanol), is given by the equation
HV = 21296.22 - 2059.04y - 786.02y^2 + 252.15y^3.    (C.17)
Figure C.17 shows the fit obtained by the above equation.
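The 2-atma property correlations of Equations C.14 through C.17 can likewise be evaluated directly; the sketch below (Python, illustrative only) reproduces them so that the endpoint values can be checked against Tables C.5 and C.6.

```python
def rho_V(y):       # saturated vapor density, lb/ft^3 (Eq. C.14)
    return 0.0623 + 0.0506 * y + 0.0118 * y**2 - 0.0024 * y**3

def rho_L_SC(x):    # liquid density subcooled to 120 deg F, lb/ft^3 (Eq. C.15)
    return 59.380 - 17.554 * x + 5.663 * x**2

def hL(x):          # saturated liquid enthalpy, BTU/lbmol (Eq. C.16)
    return (3793.26 - 3025.26 * x + 12111.98 * x**2 - 19761.89 * x**3
            + 15960.15 * x**4 - 5058.99 * x**5)

def HV(y):          # saturated vapor enthalpy, BTU/lbmol (Eq. C.17)
    return 21296.22 - 2059.04 * y - 786.02 * y**2 + 252.15 * y**3
```

At the composition extremes the correlations return, for example, rho_V(0.0) = 0.0623 and rho_V(1.0) = 0.1223 lb/ft^3, against the tabulated 0.0622 and 0.1222.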
Figure C.14. Saturated vapor density versus vapor composition for methanol-water system at 2 atma. [Figure not reproduced; fitted curve den_V = 0.0623 + 0.0506Y + 0.0118Y^2 - 0.0024Y^3, r^2 = 1.0000.]
Figure C.15. Subcooled liquid density versus liquid composition for methanol-water system at 2 atma. [Figure not reproduced; fitted curve rho_L_SC = 59.380 - 17.554X + 5.663X^2, r^2 = 1.000.]
Figure C.16. Saturated liquid enthalpy versus liquid composition for methanol-water system at 2 atma. [Figure not reproduced; fitted curve hL = 3793.26 - 3025.26X + 12111.98X^2 - 19761.89X^3 + 15960.15X^4 - 5058.99X^5, r^2 = 1.000.]
Figure C.17. Saturated vapor enthalpy versus vapor composition for methanol-water system at 2 atma. [Figure not reproduced; fitted curve HV = 21296.22 - 2059.04Y - 786.02Y^2 + 252.15Y^3, r^2 = 1.000.]
C.2.8. Correlation for Liquid Enthalpy at 120°F
The liquid enthalpy subcooled to 120°F, hL_SC, in BTU/lbmol, at any liquid-phase composition x (mole fraction methanol), is given by the equation
hL_SC = 1588.99 + 839.7x.    (C.18)
Figure C.18 shows the fit obtained by the above equation.
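One place a subcooled-liquid enthalpy enters a column simulation is in accounting for the sensible heat removed when the condensed overhead is cooled from saturation to 120°F. The sketch below is a hedged illustration of that bookkeeping using Equations C.16 and C.18 (Python; the `subcooling_duty` function is illustrative and not taken from the dissertation's code):

```python
def hL_sat(x):   # saturated liquid enthalpy at 2 atma, BTU/lbmol (Eq. C.16)
    return (3793.26 - 3025.26 * x + 12111.98 * x**2 - 19761.89 * x**3
            + 15960.15 * x**4 - 5058.99 * x**5)

def hL_SC(x):    # liquid enthalpy subcooled to 120 deg F, BTU/lbmol (Eq. C.18)
    return 1588.99 + 839.7 * x

def subcooling_duty(x):
    """Sensible heat removed per lbmol of condensed overhead when
    subcooling from saturation to 120 deg F (BTU/lbmol)."""
    return hL_sat(x) - hL_SC(x)
```

At a distillate composition of x = 0.95, for example, the correlation gives hL_SC = 2386.7 BTU/lbmol, matching Table C.5, and a subcooling duty of roughly 1600 BTU/lbmol.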
Figure C.18. Subcooled liquid enthalpy versus liquid composition for methanol-water system at 2 atma. [Figure not reproduced; fitted line hL_SC = 1588.99 + 839.7X, r^2 = 1.000.]
APPENDIX D
FORTRAN PROGRAMS FOR EXAMPLES IN
APPENDIX B
Listed below are the FORTRAN program codes for Examples 1, 2, and 3 discussed in Appendix B, Sections 7.1, 7.2, and 7.3, respectively.

D.1. FORTRAN Code for Example 1
C***********************************************************************
C
C     PROGRAM    : EXAMPLE1.FOR
C     DATE       : DECEMBER 20, 1993
C     VERSION    : 1.0
C     PROGRAMMER : SOUNDAR RAMCHANDRAN, DEPT. OF CHEM. ENG.,
C                  TEXAS TECH UNIVERSITY
C
C***********************************************************************
C
C     PROGRAM TO TEST SUBROUTINE MARQUARDT TO ILLUSTRATE
C     EXAMPLE 1 FROM APPENDIX B, SECTION 7.1
C
C***********************************************************************
C
      IMPLICIT REAL*8 (A-H,O-Z)
      DIMENSION B(4),Z(4),Y(2),BV(2),BMIN(2),BMAX(2),P(10),A(2,4),
     *          AC(2,4),CC(6),INDEX(5),OUTPUT(5)
      COMMON XN(5)
C
      OPEN (6,FILE = "EX1.PRN")
C
      IFV = 0
      KD = 2
      KK = 2
      NN = 2
      DO 10 J = 1,KK
         Y(J) = 0.0
         B(J) = 0.0
         BV(J) = 1.0
         BMIN(J) = -100.0
         BMAX(J) = 100.0
   10 CONTINUE
C
      INDEX(1) = 0
      INDEX(2) = 0
      INDEX(3) = KK
      INDEX(4) = 0
C
      CC(1) = 0.0
      CC(2) = 0.0
      CC(3) = 0.0
      CC(4) = 0.0
      CC(5) = 0.0
      CC(6) = 1.0E-10
C
      WRITE (6,3000) (B(J), J = 1,KK)
      WRITE (6,3001)
C
  125 CALL MARQUARDT(KK,B,NN,Z,Y,CC,INDEX,BV,BMIN,BMAX,
     *               OUTPUT,KD,P,A,AC,INDEX1)
      IF (INDEX(4).GT.100) THEN
         WRITE (6,*) 'ITER GREATER THAN 100'
         GO TO 200
      ENDIF
      IF (INDEX(3).NE.0) THEN
         IF (INDEX(3).LT.0) THEN
            WRITE (6,*) 'ICON IS LESS THAN ZERO'
            STOP
         ENDIF
         IF (INDEX(1).EQ.0) THEN
            WRITE (6,3002) IFV,INDEX(3),(B(J), J = 1,KK),OUTPUT(1)
            GO TO 125
         ENDIF
         IF (INDEX(3).GT.0) THEN
            CALL FNTX (B,Z,IFV)
            GO TO 125
         ENDIF
      ENDIF
C
  200 WRITE (6,3002) IFV,INDEX(3),(B(J), J = 1,KK),OUTPUT(1)
      WRITE (6,3003) (XN(J), J = 1,5)
      STOP
C
 3000 FORMAT (' STARTING VALUES =',6F8.3//)
 3001 FORMAT (' EVALUATIONS        ICON           B1             B2',
     *        '          ERROR')
 3002 FORMAT (1X,I10,I12,3E15.5)
 3003 FORMAT (3X,'MOLES PRESENT ='/1X,5E14.5)
      END
C
C***********************************************************************
C
C     SUBROUTINE FNTX
C
C***********************************************************************
C
      SUBROUTINE FNTX (B,Z,IFV)
C
      IMPLICIT REAL*8 (A-H,O-Z)
      DIMENSION AR(5,3),Y(5),B(2),Z(2)
      COMMON XN(5)
      DATA AR/-1.0,-1.0,1.0,3.0,0.0,0.0,-1.0,-1.0,1.0,1.0,
     *        6.0,5.0,0.0,0.0,0.0/
C
      IFV = IFV+1
      SUM = 0.0
      DO 10 J = 1,5
         XN(J) = AR(J,3)+AR(J,1)*B(1)+AR(J,2)*B(2)
         SUM = SUM+XN(J)
   10 CONTINUE
C
      DO 20 J = 1,5
         Y(J) = XN(J)/SUM
         IF (Y(J).LE.1.0E-10) Y(J) = 1.0E-10
   20 CONTINUE
C
      SUM2 = 0.0
      SUM3 = 0.0
      DO 30 J = 1,5
         SUM2 = SUM2+AR(J,1)*LOG(Y(J))
         SUM3 = SUM3+AR(J,2)*LOG(Y(J))
   30 CONTINUE
C
      Z(1) = -LOG(0.54D0)+SUM2
      Z(2) = -LOG(2.49D0)+SUM3
C
      RETURN
      END
D.2. Solution For Example 1
C***********************************************************************
C
C     SOLUTION FOR EXAMPLE1.FOR
C
C     SOUNDAR RAMCHANDRAN
C     DEPT. OF CHEM. ENG., TEXAS TECH UNIVERSITY, LUBBOCK, TX
C     79409-3121
C
C***********************************************************************

 STARTING VALUES =    .000    .000

 EVALUATIONS  ICON       B1            B2           ERROR
          1     2   .00000E+00    .00000E+00    .86526E+04
          4     2   .14507E-04    .11834E-04    .25993E+04
          7     2   .32512E-03    .65127E-04    .14439E+04
         10     2   .33480E-02    .77429E-03    .80367E+03
         13     2   .26166E-01    .68779E-02    .39366E+03
         16     2   .15052E+00    .45418E-01    .15824E+03
         19     2   .60114E+00    .20816E+00    .45444E+02
         22     2   .15364E+01    .58457E+00    .60978E+01
         25     2   .23077E+01    .85912E+00    .71153E-01
         28     2   .24200E+01    .84402E+00    .31884E-04
         31     0   .24219E+01    .84218E+00    .74327E-11

 MOLES PRESENT =
  .35781E+01  .17359E+01  .15797E+01  .81079E+01  .84218E+00
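Example 1 is a two-reaction chemical-equilibrium problem, and the converged reaction coordinates are B1 = 2.4219 and B2 = 0.84218. As an independent check, the same residuals coded in subroutine FNTX can be driven to zero with a damped Newton iteration. The sketch below is in Python and uses Newton's method with a finite-difference Jacobian in place of subroutine MARQUARDT; it is an illustration, not the dissertation's code.

```python
import math

# Stoichiometric data from subroutine FNTX: the first two entries of each
# row are the coefficients of the extents b1, b2; the third is initial moles.
AR = [(-1.0,  0.0, 6.0),
      (-1.0, -1.0, 5.0),
      ( 1.0, -1.0, 0.0),
      ( 3.0,  1.0, 0.0),
      ( 0.0,  1.0, 0.0)]
K1, K2 = 0.54, 2.49   # equilibrium constants embedded in Z(1) and Z(2)

def moles(b1, b2):
    return [a3 + a1 * b1 + a2 * b2 for a1, a2, a3 in AR]

def residuals(b1, b2):
    xn = moles(b1, b2)
    if min(xn) <= 0.0:
        return None                      # infeasible: negative moles
    total = sum(xn)
    logy = [math.log(n / total) for n in xn]
    z1 = -math.log(K1) + sum(a[0] * ly for a, ly in zip(AR, logy))
    z2 = -math.log(K2) + sum(a[1] * ly for a, ly in zip(AR, logy))
    return z1, z2

def solve(b1=1.0, b2=0.5, tol=1e-12, h=1e-7):
    """Damped Newton iteration on the two equilibrium residuals."""
    for _ in range(200):
        z = residuals(b1, b2)
        if max(abs(z[0]), abs(z[1])) < tol:
            break
        za = residuals(b1 + h, b2)       # perturb b1
        zb = residuals(b1, b2 + h)       # perturb b2
        j11, j21 = (za[0] - z[0]) / h, (za[1] - z[1]) / h
        j12, j22 = (zb[0] - z[0]) / h, (zb[1] - z[1]) / h
        det = j11 * j22 - j12 * j21
        db1 = (-z[0] * j22 + z[1] * j12) / det
        db2 = ( z[0] * j21 - z[1] * j11) / det
        step = 1.0                       # halve the step until feasible
        while residuals(b1 + step * db1, b2 + step * db2) is None:
            step *= 0.5
        b1, b2 = b1 + step * db1, b2 + step * db2
    return b1, b2
```

Running solve() returns approximately (2.4219, 0.84218), and moles(*solve()) reproduces the MOLES PRESENT line of the solution above.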
D.3. FORTRAN Code for Example 2
C***********************************************************************
C
C     PROGRAM    : EXAMPLE2.FOR
C     DATE       : DECEMBER 20, 1993
C     VERSION    : 1.0
C     PROGRAMMER : SOUNDAR RAMCHANDRAN, DEPT. OF CHEM. ENG.,
C                  TEXAS TECH UNIVERSITY
C
C***********************************************************************
C
C     PROGRAM TO TEST SUBROUTINE MARQUARDT TO ILLUSTRATE
C     EXAMPLE 2 FROM APPENDIX B, SECTION 7.2
C
C***********************************************************************
C
      IMPLICIT REAL*8 (A-H,O-Z)
      DIMENSION B(6),Z(6),Y(3),BV(3),BMIN(3),BMAX(3),P(18),A(3,5),
     *          AC(3,5),CC(6),INDEX(5),OUTPUT(5),XO(3)
C
      OPEN (6, FILE = "EX2.PRN")
C
      IFV = 0
      KD = 3
      KK = 3
      NN = 3
      DO 10 J = 1,KK
         Y(J) = 1.0
         XO(J) = 1.0
         B(J) = XO(J)
         BV(J) = 1.0
         BMIN(J) = -20.0
         BMAX(J) = 20.0
   10 CONTINUE
C
      INDEX(1) = 0
      INDEX(2) = 0
      INDEX(3) = KK
      INDEX(4) = 0
C
      CC(1) = 0.0
      CC(2) = 0.0
      CC(3) = 0.0
      CC(4) = 0.0
      CC(5) = 0.0
      CC(6) = 1.0E-10
C
      WRITE (6,401) (B(J), J = 1,KK)
C
  125 CALL MARQUARDT(KK,B,NN,Z,Y,CC,INDEX,BV,BMIN,BMAX,OUTPUT,
     *               KD,P,A,AC,INDEX1)
      IF (INDEX(4).GT.100) THEN
         WRITE (6,*) 'ITER GREATER THAN 100'
         GO TO 200
      ENDIF
      IF (INDEX(3).NE.0) THEN
         IF (INDEX(3).LT.0) THEN
            WRITE (6,*) 'ICON IS LESS THAN ZERO; ICON = ',INDEX(3)
            STOP
         ENDIF
         IF (INDEX(1).EQ.0) THEN
            WRITE (6,150) IFV,INDEX(3),(B(J), J = 1,KK),OUTPUT(1)
  150       FORMAT (I5,I5,3(F15.5),E15.5)
            GO TO 125
         ENDIF
         IF (INDEX(3).GT.0) THEN
            CALL FNTX (B,Z,IFV)
            GO TO 125
         ENDIF
      ENDIF
C
  200 WRITE (6,*)
      WRITE (6,403) INDEX(3)
      WRITE (6,404) OUTPUT(1)
      WRITE (6,405) OUTPUT(2)
      WRITE (6,406) IFV
      WRITE (6,407)
      DO 300 I = 1,KK
         WRITE (6,408) I,XO(I),B(I),Z(I)
  300 CONTINUE
C
      STOP
C
  401 FORMAT (' STARTING VALUES =',3F8.3/)
  403 FORMAT (' ICON =',I12)
  404 FORMAT (' SUM OF SQUARES =',E12.5)
  405 FORMAT (' ANGLE =',F12.2)
  406 FORMAT (' NUMBER OF FUNCTION EVALUATIONS =',I12/)
  407 FORMAT (' NUMBER   INITIAL X      FINAL X      VALUE OF Z'/)
  408 FORMAT (I5,2X,E12.5,2X,E12.5,2X,E12.5)
      END
C
C***********************************************************************
C
C     SUBROUTINE FNTX
C
C***********************************************************************
C
      SUBROUTINE FNTX (B,Z,IFV)
C
      IMPLICIT REAL*8 (A-H,O-Z)
      DIMENSION B(3),Z(3)
C
      IFV = IFV+1
      Z(1) = (3.0*B(1)+B(2)+2.0*B(3)**2)/3.0
      Z(2) = (-3.0*B(1)+5.0*B(2)**2+2.0*B(1)*B(3))
      Z(3) = (25.0*B(1)*B(2)+20.0*B(3))/(-12.0)
C
      RETURN
      END
D.4. Solution For Example 2
C***********************************************************************
C
C     SOLUTION FOR EXAMPLE2.FOR
C
C     SOUNDAR RAMCHANDRAN
C     DEPT. OF CHEM. ENG., TEXAS TECH UNIVERSITY, LUBBOCK, TX
C     79409-3121
C
C***********************************************************************

 STARTING VALUES =   1.000   1.000   1.000

    1    3        1.00000        1.00000        1.00000    .32563E+02
    5    3       -1.47169         .20160        2.25949    .27396E+02
   12    3       -1.60991        1.19071        1.86049    .24308E+02
   16    3       -2.23996         .89063        2.12826    .18164E+00
   20    3       -2.41790         .91513        2.16088    .14245E-03
   24    3       -2.41352         .91465        2.15939    .17442E-09

 ICON =           0
 SUM OF SQUARES = .46675E-16
 ANGLE =       44.98
 NUMBER OF FUNCTION EVALUATIONS =          28

 NUMBER   INITIAL X      FINAL X      VALUE OF Z
     1   .10000E+01   -0.24135E+01   .10000E+01
     2   .10000E+01    .91464E+00    .10000E+01
     3   .10000E+01    .21594E+01    .10000E+01
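The converged root reported above can be verified independently: substituting the FINAL X values into the three model functions from subroutine FNTX should reproduce the target values Y(J) = 1.0. A quick check (Python, outside the original FORTRAN):

```python
b1, b2, b3 = -2.4135, 0.91464, 2.1594   # FINAL X values reported above

# The three functions from subroutine FNTX of Example 2
z1 = (3.0 * b1 + b2 + 2.0 * b3**2) / 3.0
z2 = -3.0 * b1 + 5.0 * b2**2 + 2.0 * b1 * b3
z3 = (25.0 * b1 * b2 + 20.0 * b3) / (-12.0)

for z in (z1, z2, z3):
    assert abs(z - 1.0) < 2e-3   # each matches its target Y(J) = 1.0
```

Each function evaluates to 1.000 to within the rounding of the printed parameter values.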
D.5. FORTRAN Code for Example 3
C***********************************************************************
C
C     PROGRAM    : EXAMPLE3.FOR
C     DATE       : DECEMBER 20, 1993
C     VERSION    : 1.0
C     PROGRAMMER : SOUNDAR RAMCHANDRAN, DEPT. OF CHEM. ENG.,
C                  TEXAS TECH UNIVERSITY
C
C***********************************************************************
C
C     PROGRAM TO TEST SUBROUTINE MARQUARDT TO ILLUSTRATE
C     EXAMPLE 3 FROM APPENDIX B, SECTION 7.3
C
C***********************************************************************
C
      IMPLICIT REAL*8 (A-H,O-Z)
      DIMENSION B(8),Z(18),Y(9),BV(4),BMIN(4),BMAX(4),P(53),A(4,9),
     *          AC(4,9),CC(6),INDEX(5),OUTPUT(5),BO(4),X(9)
C
      DATA Y/0.173,0.292,0.369,0.429,0.465,0.486,0.504,0.521,0.535/
      DATA BO/0.11400253,0.6856597E-3,-1.7036566,-0.53485967E-3/
      DATA BMIN/0.0,0.0,-1.8,-0.1/
      DATA BMAX/0.5,0.005,0.0,0.0/
C
      OPEN (6, FILE = "EX3.PRN")
C
      X(1) = 540.0
      X(2) = 900.0
      X(3) = 1260.0
      X(4) = 1800.0
      X(5) = 2340.0
      X(6) = 2880.0
      X(7) = 3600.0
      X(8) = 4500.0
      X(9) = 5400.0
C
      IFV = 0
      KD = 4
      KK = 4
      NN = 9
      DO 10 J = 1,KK
         BV(J) = 1.0
         B(J) = BO(J)
   10 CONTINUE
C
      INDEX(1) = 0
      INDEX(2) = 0
      INDEX(3) = KK
      INDEX(4) = 0
C
      CC(1) = 0.0
      CC(2) = 0.0
      CC(3) = 0.0
      CC(4) = 0.0
      CC(5) = 0.0
      CC(6) = 1.0E-10
C
      WRITE (6,401) (B(J), J = 1,KK)
C
  125 CALL MARQUARDT(KK,B,NN,Z,Y,CC,INDEX,BV,BMIN,BMAX,OUTPUT,
     *               KD,P,A,AC,INDEX1)
      IF (INDEX(4).GT.100) THEN
         WRITE (6,*) 'ITER GREATER THAN 100'
         GO TO 200
      ENDIF
      IF (INDEX(3).NE.0) THEN
         IF (INDEX(3).LT.0) THEN
            WRITE (6,*) 'ICON IS LESS THAN ZERO; ICON = ',INDEX(3)
            STOP
         ENDIF
         IF (INDEX(1).EQ.0) THEN
            WRITE (6,150) IFV,INDEX(3),(B(J), J = 1,KK),OUTPUT(1)
  150       FORMAT (I4,I4,4(E12.5),E12.5)
            GO TO 125
         ENDIF
         IF (INDEX(3).GT.0) THEN
            CALL FNTX (X,B,Z,IFV,KK,NN)
            GO TO 125
         ENDIF
      ENDIF
C
  200 WRITE (6,*)
      WRITE (6,403) INDEX(3)
      WRITE (6,404) OUTPUT(1)
      WRITE (6,405) OUTPUT(2)
      WRITE (6,406) IFV
      WRITE (6,407)
C
      DO 300 I = 1,KK
         WRITE (6,408) I,BO(I),B(I)
  300 CONTINUE
C
      STOP
C
  401 FORMAT (' STARTING VALUES =',4F8.3/)
  403 FORMAT (' ICON = ',I12)
  404 FORMAT (' SUM OF SQUARES = ',E12.5)
  405 FORMAT (' ANGLE = ',F12.2)
  406 FORMAT (' NUMBER OF FUNCTION EVALUATIONS = ',I12/)
  407 FORMAT (' NUMBER   INITIAL A      FINAL A'/)
  408 FORMAT (I5,2X,E12.5,2X,E12.5)
C
      END
C
C***********************************************************************
C
C     SUBROUTINE FNTX
C
C***********************************************************************
C
      SUBROUTINE FNTX (X,B,Z,IFV,K,N)
C
      IMPLICIT REAL*8 (A-H,O-Z)
      DIMENSION B(K),Z(N),X(N)
C
      IFV = IFV+1
      DO 20 I = 1,N
         Z(I) = B(1)*EXP(B(2)*X(I))+B(3)*EXP(B(4)*X(I))
   20 CONTINUE
C
      RETURN
      END
D.6. Solution For Example 3
C***********************************************************************
C
C     SOLUTION FOR EXAMPLE3.FOR
C
C     SOUNDAR RAMCHANDRAN
C     DEPT. OF CHEM. ENG., TEXAS TECH UNIVERSITY, LUBBOCK, TX
C     79409-3121
C
C***********************************************************************

 STARTING VALUES =    .114    .001  -1.704   -.001

   1   4  .11400E+00  .68566E-03 -0.17037E+01 -0.53486E-03  .24099E+02
   6   4  .47283E-01  .62574E-03 -0.11165E+00 -0.82820E-03  .12072E+01
  11   4  .81681E-01  .37763E-03 -0.34022E+00 -0.79134E-02  .34520E+00
  16   4  .27235E+00  .00000E+00  0.00000E+00  0.00000E+00  .31137E+00
  21   3  .26610E+00  .22271E-03 -0.62521E-02  0.00000E+00  .18267E+00
  31   4  .24239E+00  .10595E-03 -0.13442E-01 -0.10335E-01  .13343E+00
  36   4  .34488E+00  .87630E-04  0.00000E+00  0.00000E+00  .47967E-01
  42   3  .50000E+00  .87169E-04 -0.23513E+00  0.00000E+00  .39940E-01
  48   4  .46836E+00  .00000E+00 -0.28848E+00 -0.89157E-03  .29281E-01
  53   4  .40264E+00  .48475E-04 -0.52887E+00 -0.17857E-02  .45120E-02
  58   4  .44786E+00  .32769E-04 -0.58655E+00 -0.12414E-02  .20659E-02
  63   4  .45331E+00  .30995E-04 -0.60999E+00 -0.13904E-02  .18546E-04
  68   4  .45537E+00  .30176E-04 -0.61610E+00 -0.13966E-02  .90993E-05
  73   4  .45541E+00  .30141E-04 -0.61616E+00 -0.13966E-02  .90933E-05

 ICON =            0
 SUM OF SQUARES =  .90933E-05
 ANGLE =        44.98
 NUMBER OF FUNCTION EVALUATIONS =           84

 NUMBER   INITIAL A      FINAL A
     1   0.11400E+00   0.45541E+00
     2   0.68566E-03   0.30139E-04
     3  -0.17037E+01  -0.61616E+00
     4  -0.53486E-03  -0.13966E-02
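The quality of the converged fit can be checked by evaluating the two-exponential model at the reported FINAL A parameters and accumulating the sum of squared residuals, which should reproduce the printed value of about .90933E-05. A sketch in Python (not the original FORTRAN):

```python
import math

# The nine data points from the main program of Example 3
x = [540.0, 900.0, 1260.0, 1800.0, 2340.0, 2880.0, 3600.0, 4500.0, 5400.0]
y = [0.173, 0.292, 0.369, 0.429, 0.465, 0.486, 0.504, 0.521, 0.535]

# FINAL A values reported above
a1, a2, a3, a4 = 0.45541, 0.30139e-4, -0.61616, -0.13966e-2

# Model from subroutine FNTX: z = a1*exp(a2*x) + a3*exp(a4*x)
ssq = sum((a1 * math.exp(a2 * xi) + a3 * math.exp(a4 * xi) - yi) ** 2
          for xi, yi in zip(x, y))
```

The accumulated sum of squares comes out near 9e-6, consistent with the converged value printed by the program.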