imposing steady-state performance on identified nonlinear polynomial models by means of constrained...

6
Imposing steady-state performance on identified nonlinear polynomial models by means of constrained parameter estimation L.A. Aguirre, M.F.S. Barroso, R.R. Saldanha and E.M.A.M. Mendes Abstract: The authors present a procedure that permits the use of steady-state information to constrain the identification of nonlinear polynomial models. Such a procedure has three main steps. First, a general framework is provided that relates the static function of nonlinear global polynomial models to their terms and parameters. Second, using standard nonlinear programming techniques, a rational function is fitted to the system static function, which is assumed to be known and is used as auxiliary information. Finally, the information gathered in the first two steps is used to write a set of equality constraints that are exactly satisfied by a standard constrained least-squares algorithm used to estimate the parameters of the identified model. It is shown that the resulting model will always have the specified static nonlinearity and will use additional degrees of freedom to fit the dynamics underlying the observed data. 1 Introduction The issue of recovering static nonlinearities has received considerable attention in recent years [1–4]. A possible, but certainly not the only, application of this technique is the analysis and design of intelligent control systems [5]. A closely related field of research is the use of auxiliary information in identification problems. In this context, a key issue is to define which kind of knowledge is usable and how to incorporate such information into the model. To this end, some alternatives have been put forward such as writing equations based on first principles and using black-box structures to cater for mismatches and uncertainties [6]. Also, more general approaches have been developed where the prior knowledge comes in the form of the static gain sign, pole locations and stability for linear models [7], crossover frequency and gain for linear systems [8] and partially or completely known models, stability, linearity and steady-state regimes in the case of nonlinear models [9]. In such cases the prior knowledge is translated into equalities and inequalities that are used by constrained optimisation techniques during parameter estimation. The term ‘grey’ points to the fact that the procedure is not purely ‘black’, but of course there are many possible ‘shades of grey’. In particular, ‘clear grey’ techniques use first principles to derive the model structure. Auxiliary infor- mation will now be used to constrain parameter estimation and this seems to configure a ‘dark grey’ procedure. Our use of the term grey box should be understood in this context. A related approach has been recently proposed for fuzzy models in [10]. However, the static function is only approximately retained by the identified model. That means to say that there is no guarantee that the auxiliary information is correctly taken into account. As a matter of fact, as demonstrated in [10] the performance of the procedure, in what concerns steady-state behaviour, varies greatly. Moreover, in the case of fuzzy models the information is distributed among various local models in contrast to the present approach where the overall static function is related to the global nonlinear model structure, which is usually a more difficult problem [11]. An obvious difficulty is to relate ‘engineering-type prior knowledge’ to the model structure and parameters. In [12], physically meaningful parameters were constrained to remain within a particular range during estimation. Of course, the problem of constraining parameters becomes particularly difficult in the case of nonlinear black-box representations for which the parameters do not have a clear and direct physical interpretation. As pointed out by Johansen ‘more work is needed to develop practical engineering procedures for the coding of prior knowl- edge…’ [9, p. 353]. The present work aims at developing a procedure by which a nonlinear autoregressive model with exogenous inputs (NARX) can be obtained in such a way that: (i) it fits dynamical data in a least-squares sense (as in black-box identification); and (ii) it has a predefined static nonlinearity. In order to attain the aforementioned goals it is necessary to translate the auxiliary information about the static nonlinearity of the system into precisely defined constraints that can be directly used by a standard constrained least- squares algorithm. Thus, our main contributions are to present, in a closed form, a general formula that relates steady-state behaviour to a global NARX polynomial, and also to show, based on that formula, how to write the restrictions. 2 Statement of the problem It is assumed that for a given system two sets of data are available, namely: (i) the M values of the steady-state inputs " u ¼f " u 1 ; ... ; " u M g; and the M corresponding values of the q IEE, 2004 IEE Proceedings online no. 20040102 doi: 10.1049/ip-cta:20040102 The authors are with Programa de Po ´s Graduac ¸a ˜o em Engenharia Ele ´trica, Universidade Federal de Minas Gerais, Av. Anto ˆnio Carlos 6627, 31270- 901 Belo Horizonte, M.G., Brazil Paper received 16th April 2003 IEE Proc.-Control Theory Appl., Vol. 151, No. 2, March 2004 174

Upload: emam

Post on 19-Sep-2016

212 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Imposing steady-state performance on identified nonlinear polynomial models by means of constrained parameter estimation

Imposing steady-state performance on identifiednonlinear polynomial models by means ofconstrained parameter estimation

L.A. Aguirre, M.F.S. Barroso, R.R. Saldanha and E.M.A.M. Mendes

Abstract: The authors present a procedure that permits the use of steady-state information toconstrain the identification of nonlinear polynomial models. Such a procedure has three main steps.First, a general framework is provided that relates the static function of nonlinear global polynomialmodels to their terms and parameters. Second, using standard nonlinear programming techniques, arational function is fitted to the system static function, which is assumed to be known and is used asauxiliary information. Finally, the information gathered in the first two steps is used to write a set ofequality constraints that are exactly satisfied by a standard constrained least-squares algorithm usedto estimate the parameters of the identified model. It is shown that the resulting model will alwayshave the specified static nonlinearity and will use additional degrees of freedom to fit the dynamicsunderlying the observed data.

1 Introduction

The issue of recovering static nonlinearities has receivedconsiderable attention in recent years [1–4]. A possible, butcertainly not the only, application of this technique is theanalysis and design of intelligent control systems [5].

A closely related field of research is the use of auxiliaryinformation in identification problems. In this context, a keyissue is to define which kind of knowledge is usable and howto incorporate such information into the model. To this end,some alternatives have been put forward such as writingequations based on first principles and using black-boxstructures to cater for mismatches and uncertainties [6].Also, more general approaches have been developed wherethe prior knowledge comes in the form of the static gainsign, pole locations and stability for linear models [7],crossover frequency and gain for linear systems [8] andpartially or completely known models, stability, linearityand steady-state regimes in the case of nonlinear models [9].In such cases the prior knowledge is translated intoequalities and inequalities that are used by constrainedoptimisation techniques during parameter estimation. Theterm ‘grey’ points to the fact that the procedure is not purely‘black’, but of course there are many possible ‘shades ofgrey’. In particular, ‘clear grey’ techniques use firstprinciples to derive the model structure. Auxiliary infor-mation will now be used to constrain parameter estimationand this seems to configure a ‘dark grey’ procedure. Our useof the term grey box should be understood in this context.

A related approach has been recently proposed for fuzzymodels in [10]. However, the static function is only

approximately retained by the identified model. Thatmeans to say that there is no guarantee that the auxiliaryinformation is correctly taken into account. As a matter offact, as demonstrated in [10] the performance of theprocedure, in what concerns steady-state behaviour, variesgreatly. Moreover, in the case of fuzzy models theinformation is distributed among various local models incontrast to the present approach where the overall staticfunction is related to the global nonlinear model structure,which is usually a more difficult problem [11].

An obvious difficulty is to relate ‘engineering-type priorknowledge’ to the model structure and parameters. In [12],physically meaningful parameters were constrained toremain within a particular range during estimation. Ofcourse, the problem of constraining parameters becomesparticularly difficult in the case of nonlinear black-boxrepresentations for which the parameters do not have a clearand direct physical interpretation. As pointed out byJohansen ‘more work is needed to develop practicalengineering procedures for the coding of prior knowl-edge…’ [9, p. 353]. The present work aims at developing aprocedure by which a nonlinear autoregressive model withexogenous inputs (NARX) can be obtained in such a waythat: (i) it fits dynamical data in a least-squares sense (as inblack-box identification); and (ii) it has a predefined staticnonlinearity.

In order to attain the aforementioned goals it is necessaryto translate the auxiliary information about the staticnonlinearity of the system into precisely defined constraintsthat can be directly used by a standard constrained least-squares algorithm. Thus, our main contributions are topresent, in a closed form, a general formula that relatessteady-state behaviour to a global NARX polynomial, andalso to show, based on that formula, how to write therestrictions.

2 Statement of the problem

It is assumed that for a given system two sets of data areavailable, namely: (i) the M values of the steady-state inputs�uu ¼ f�uu1; . . . ; �uuMg; and the M corresponding values of the

q IEE, 2004

IEE Proceedings online no. 20040102

doi: 10.1049/ip-cta:20040102

The authors are with Programa de Pos Graduacao em Engenharia Eletrica,Universidade Federal de Minas Gerais, Av. Antonio Carlos 6627, 31270-901 Belo Horizonte, M.G., Brazil

Paper received 16th April 2003

IEE Proc.-Control Theory Appl., Vol. 151, No. 2, March 2004174

Page 2: Imposing steady-state performance on identified nonlinear polynomial models by means of constrained parameter estimation

output �yy ¼ f�yy1; . . . ; �yyMg; and (ii) a set of dynamical datafuðkÞ; yðkÞg; k ¼ 1; . . . ;N: The objective is to obtain aNARX polynomial model with dynamical performancecompatible with the data fuðkÞ; yðkÞg; k ¼ 1; . . . ;N and asteady-state performance that is coherent with the infor-mation in f �uu; �yyg:

In some cases, steady-state measurements can beretrieved, directly or after some data analysis, fromhistorical records. M can be rather small (less than ten),but it is desirable that f�uu1; . . . ; �uuMg span a wide range ofinput values. The dynamical data should be obtainedfollowing the well known procedures of dynamical testing.

3 Constrained least-squares techniques

Consider the NARX model [13]:

yðkÞ ¼ F‘½yðk � 1Þ; . . . ; yðk � nyÞ;uðk � dÞ . . . uðk � d � nu þ 1Þ; eðkÞ ð1Þ

where u(k) and y(k) are respectively the input and outputsignals and e(k) accounts for uncertainties, possible noiseand unmodelled dynamics. We will assume that F‘½· is apolynomial-type function with nonlinearity degree ‘ 2 Z

þ:Expressing (1) in linear regression form yields:

yðkÞ ¼ cTðk � 1Þuuþ �ðkÞ ð2Þwhere cðk � 1Þ is the regressors vector which containslinear and nonlinear combinations of output, input and noiseterms up to and including time k � 1: It is well known thatleast-squares techniques provide an (unconstrained) esti-mate of uu by minimising the norm of the vector of residuals,jTj; where jT ¼ ½�ð1Þ �ð2Þ . . . �ðNÞ [14]. The movingaverage (MA) part of the model is used during parameterestimation to reduce bias [15]. By model structure we meanthe particular set of multinomials (regressors) that composec; see (2). The structure selection stage comprises thechoice, among the set of candidate terms, of those terms thatfor a given situation and a given criterion are best suited tocompose the model. The choice of model structure is criticaland can be chosen using orthogonal techniques [16].Improvements to such techniques can be found in [17]. Inthe following Sections it is assumed that some criterion willbe used to choose the model structure.

An interesting alternative to parameter estimation, thatwe will further explore, is the constrained least-squares(CLS) algorithm [18]. In this case, the starting point is stillthe linear regression form of (2) but jTj can be minimisedconstrained to another linear function in the parametervector, say, c ¼ Su; where vector c and matrix S are knownat the estimation stage. As will become clear later, vector cis composed of constants taken from a nonlinear staticmodel and matrix S can be easily obtained by inspectionafter choosing the structure of the final model. Therefore,the constraints take the form:

c ¼ Su ð3ÞThe solution to the problem of finding a vector u thatminimizes jTj and (exactly) satisfies the set of constraintsc ¼ Su; that is:

uuCLS ¼ arg min ½jTju : c ¼ Su

ð4Þ

is given by [18]:

uuCLS ¼ ðCTCÞ�1CTy � ðCTCÞ�1ST

� ½SðCTCÞ�1ST�1ðSuuLS � cÞ ð5Þ

where C is the regressors matrix (each column of whichcorresponds to each element of c taken along the datarecords, similar to vector y and y(k), see (2)) and uuLS is thestandard least-squares solution which is given by the firstterm on the right-hand side of (5).

This procedure has recently been used by Pearson andPottmann [3] to estimate linear models with unity DC-gainin the context of block-oriented model identification. In thatcase, c is a scalar c ¼ 1 and S is easily obtained by applyingthe definition of steady-state gain to a linear autoregressivemodel with exogenous inputs (ARX).

One of our aims is to go a step further and to show howsuch an algorithm can be used to constrain parameterestimation of a NARX model. Of course, the first question toanswer is: what kind of useful constraints can be written inthe form c ¼ Su? The second question is: how to obtainsuch restrictions? Having understood these issues, (5) can bereadily applied to great benefit. Thus, the two aforemen-tioned questions are addressed in the following Section.

4 Static nonlinearities in NARX polynomialmodels

Equation (1), when F‘½· is a polynomial of degree ‘; can beexpanded as the summation of terms with degrees ofnonlinearity in the range ½1–‘: Each ð p þ mÞth-order termcan contain a pth-order factor in yðk � niÞ and mth-orderfactor in uðk � niÞ and is multiplied by a coefficientcp;mðn1; . . . ; nmÞ as follows.

yðkÞ ¼X‘

m¼0

X‘�m

p¼0

Xny;nu

n1;nm

cp;mðn1; . . . ; nmÞ

�Yp

i¼1

yðk � niÞYm

i¼1

uðk � niÞ þ eðkÞ ð6Þ

where

Xny;nu

n1;nm

�Xny

n1¼1

Xny

n2¼1

Xnu

nm¼1

ð7Þ

and the upper limit is ny if the summation refers to factors inyðk � niÞ or nu for factors in uðk � niÞ: Equation (6) can bewritten in steady-state conditions as:

�yy ¼X‘

m¼0

X‘�m

p¼0

Xny;nu

n1;nm

cp;mðn1; . . . ; nmÞ�yy p �uum ð8Þ

The constantsPny;nu

n1;nmcp;mðn1; . . . ; nmÞ in (8) are the

coefficients of the term clusters Oy pum ; which containterms of the form y pðk � iÞumðk � jÞ for 0 � m þ p � ‘;where i and j are any time lags. Such coefficients are calledcluster coefficients and are represented as

Py pum [19]. In

words, a term cluster is a set of terms of the same type andthe respective cluster coefficient is the summation of thecoefficients of all the terms of the corresponding cluster.Hence, terms of the same cluster explain the same typeof nonlinearity. For instance, the terms yðk � 1Þuðk � 1Þ;yðk � 1Þuðk � 3Þ and yðk � 2Þuðk � 2Þ all pertain to theterm cluster Oyu:

In steady-state conditions �yy ¼ yðk � 1Þ ¼ yðk � 2Þ ¼ ¼ yðk � nyÞ; �uu ¼ uðk � 1Þ ¼ uðk � 2Þ ¼ ¼uðk � nuÞ and (6) can be rewritten using the definition ofterm clusters as:

�yy ¼ S0 þ Sy �yy þ Su �uu þX‘�1

m¼1

X‘�m

p¼1

Sy pum �yy p �uum

þX‘

p¼2

Sy p �yy p þX‘

m¼2

Sum �uum ð9Þ

IEE Proc.-Control Theory Appl., Vol. 151, No. 2, March 2004 175

Page 3: Imposing steady-state performance on identified nonlinear polynomial models by means of constrained parameter estimation

where the clustered terms were separated as: constant term,S0; linear terms in y, Sy �yy; linear terms in u, Su �uu; cross-termsP‘�1

m¼1

P‘�mp¼1 Sy pum �yy p �uum; nonlinear terms in y,

P‘i¼1 Syi �yyi

and nonlinear terms in uP‘

i¼1 Sui �uui: The solution of (9) willyield the fixed points equilibria of model (6) for theparticular value of the input being used.

From (9) the model static ratio Ks can be readilyobtained as:

Ks¼�yy

�uu¼ S0=�uuþSu þ

P‘m¼2Sum �uum�1

1�Sy �P‘�1

m¼1

P‘�mp¼1 Sypum �yyp�1 �uum �

P‘p¼2Syp �yyp�1

ð10ÞIn what follows it will be helpful to distinguish between thestatic function of a model (which is obviously related to thegain) and its dynamics. As a matter of fact this is preciselywhat is done when block-oriented models such as those ofHammerstein and also Wiener are used to describenonlinear systems. Equation (10) is the counterpart of thestatic nonlinearity of block-oriented models. It can beshown that terms from Ou j ; j > 1 do not influence thedynamics whereas they do directly affect the static ratio.The numerator of (10) can be expressed in terms of variablesxK that solely influence Ks: Conversely, terms from theclusters Oyu and Oy not only influence the dynamics but alsoaffect Ks [20], as can be seen directly from the denominatorof (10). Such variables will be generally denoted as xt:Using this notation, (10) can be expressed as:

Ksð�xxti;�xxKi

Þ ¼S0=�uu þ Su þ

PnK

i¼1 SxKiu �xxKi

1 � Sy �Pnt

i¼1 Syxti�xxti

ð11Þ

where the summation in the denominator includes all termsfrom clusters Oy i ; i > 1 and Oyu i ; i > 0 as seen in (10).Also, it is important to realise that the numbers Syxti

and

SxKiu are the coefficients of the clusters Oyxti

and OxKiu;

respectively, and are not to be confused with tilesummations.

It is helpful to use the following notation. KKSð·Þ ¼ �yy=�uu isthe estimated static ratio and �yy ¼ f ð�yy; �uuÞ is the staticfunction. The solutions of �yy � f ð�yy; �uuÞ ¼ 0 are the systemfixed points for the given input �uu: Clearly, f ð�yy; �uuÞ ¼ KKS �uu:Thus, if the static function depends only on the input, that is,f ð�uuÞ; as in a Hammerstein model, then terms from theclusters Oyiu j ; i ¼ 2; . . . ; p; j ¼ 0; . . . ;m cannot be includedin the model in order to have a single steady state for eachvalue of �uu: Some of these ideas will be illustrated in thefollowing Section.

5 Using auxiliary information

5.1 General procedure

The following procedure is suggested to use the availableauxiliary information during identification.

Step 1: Take the steady-state data f �uu; �yyg and fit an algebraicfunction to it, �yy ¼ f ð�yy; �uuÞ: Equation (11) together with someprior knowledge (more on this later) should be used tomotivate the form of the static function f.Step 2: Using (11) and the estimate of f obtained in step 1,write the constraints on cluster coefficients in the form of (3).Step 3: Finally, use dynamical data fuðkÞ; yðkÞg; k ¼1; . . . ;N and (5) to estimate a model, with static functionf, that fits the dynamical data in a (constrained) least-squaressense.

The last step can be readily accomplished simply by using(5), but the first two steps require a little more judgement.These steps are detailed below.

5.2 Guidelines for choosing the staticnonlinear function

Theoretical support for the guidelines provided in thisSection can be found in [20, 21].

If it is not possible to fit a particular function f to the data,the model structure or even the model type (representation)might have to be changed. A poor fit of f will probably resultin a final model with poor steady-state characteristics.

In choosing the steady-state function f, two questionsshould be answered: (i) does the system display multiplesteady states? and (ii) do the system dynamics varyaccording to the operating point or are they constant as ina Hammerstein model? In the following steps the answers tothese questions are used to determine some of the features inthe structure of f.

The answers to the aforementioned questions can be usedas detailed below.

(a) If the system has Ns simultaneous fixed points, then allclusters of the form Oy pum ; with p > Ns (regardless the valueof m) should be deleted from the set of candidate terms (seeafter (2)).(b) If the system dynamics do not vary with the operatingpoint (for instance, as in a Hammerstein model), then allclusters of the form Oy pum with p 6¼ 0 and m 6¼ 0; that is,all the cross-term clusters should be deleted from the set ofcandidate terms.

In many situations where some prior information about thesystem exists, as indicated by the questions above, thecourse of action suggested is effective in significantlyreducing the number of potential model structures at theoutset. Subsequently, the important thing to do is to fit afunction f to the steady-state data. In order to do so, it shouldbe noticed that (11) is the most general form for such afunction, given a NARX polynomial. Such an equation canbe simplified, according to the answers given to thequestions above, as follows. For systems with just onesteady state, the static function will be:

�yy ¼ f ð�uuÞ ¼

P‘m¼0

am �uum

1 � b1

8 b1 6¼ 1 ð12Þ

if the dynamics do not vary with operating point (in otherwords, there are no cross-terms in the model and thedenominator of (11) is a constant) or:

�yy ¼ f ð�uuÞ ¼

P‘m¼0

am �uum

1 � b1 �P‘�1

m¼1

bmþ1 �uum

ð13Þ

if the dynamics do vary with operating point. In (12) and(13), a0;1;2; ;‘ and b1;2; ;‘ are constants directly related to thecluster coefficients, as can be clearly seen by comparingsuch equations to (11).

Assuming the system has more than one steady state, forthe model to be able to reproduce such a feature, it musthave terms from Oy pum ; for any value of m and p > 1: In thiscase the static function will be of the form �yy ¼ f ð�yy; �uuÞ; sincethere will be y-monomials (possibly y2 and higher-ordermonomials) in the denominator of (11).

As it is assumed that there are steady-state data available,such data can be used to fit the function f. In this work aquasi-Newton BFGS algorithm [22] has been successfullyused, as will be illustrated in the following Section, but ofcourse other options could be used.

IEE Proc.-Control Theory Appl., Vol. 151, No. 2, March 2004176

Page 4: Imposing steady-state performance on identified nonlinear polynomial models by means of constrained parameter estimation

The estimated function f is an approximation to thesteady-state behaviour of the system, as conveyed by thesteady-state data f �uu; �yyg: In order to take this into accountduring the identification of a dynamical model, a connectionbetween f and such a model must be established. Asdiscussed in Section 4, this can be readily done using theconcepts of term clusters and cluster coefficients. In fact, therelationship between f and the parameters of the dynamicalmodel can be written in the form of (3), where the elementsof c are the coefficients of function f. For instance, if f is likein (13) then c ¼ ½a0 a1 a2 b1 b2 T and S is a matrixwith zeros and ones placed in such a way that all theelements of u that correspond to a given cluster are added upand equated to either an ai or bi: Having reached this point,the procedure is easily accomplished by applying (5).

5.3 Remarks on statistical properties

It should be realised that there are two different issuesinvolved, namely the estimation of the dynamical modelparameters and the estimation of the static nonlinearity,which is a function of such parameters. Starting the analysiswith the static function, it becomes clear that the issue is: arethe cluster coefficients unbiased, that is, does E½Suu CLS �Su ¼ 0 hold? Using (5) in this expectation yields:

E½SðCTCÞ�1CTy � SuuLS þ c � Su ¼ E½c � Su ¼ 0

ð14Þwhich is true by definition, see (3). The elements of c are thecoefficients of a static model. We take such coefficients to bedeterministic even if in practice the static model is estimatedby some stochastic, procedure. A similar simplification wasalso used in [11]. On the one hand, this will greatly facilitatethe analysis; however, on the other hand, this means that allthe results must be interpreted with respect to the value of cactually used.

It goes without saying that if the static nonlinearity is inerror, that is, if the cluster coefficients specified in c do notquite agree with the process steady state, it will still beimposed on the final model.

As for the variance of the estimated cluster coefficients,which are the parameters of the model taken in steady-stateconditions, the following result holds

cov½SuuCLS ¼ Scov½uuCLSST ¼ SE½ðuuCLS�uÞðuuCLS�uÞTST

¼E½ðSuu CLS� cÞðSuuCLS� cÞT

Because the estimated parameter vector uuCLS is forced toexactly satisfy SuuCLS ¼ c it follows that cov½SuuCLS ¼ 0:The previous results indicate, as expected, that (5) willguarantee that the final estimated dynamical nonlinearmodel will always have exactly the specified staticnonlinearity. It should be appreciated that whereas uuCLS isa vector of random variables, SuuCLS is deterministic andequal to c by construction. In practice this means that theproposed procedure while ensuring that the identifiedmodels will have exactly the imposed steady-state perform-ance, SuuCLS is deterministic, the estimated parameter vectorstill has many degrees of freedom, uuCLS is a vector ofrandom variables, to better adjust to the dynamics of theprocess and to reduce the effect of noise.

It remains to consider the properties related to theparameter vector uuCLS; which are the parameters of thedynamical model. These are estimated using the extendedleast-squares estimator. To this end C; S and u are extendedwith moving average terms [15]. As the moving averagepart of the model will not affect the input-output steady-state

relation of the model, c remains unchanged and S is simplycompleted with columns of zeros in order to maintain thecorrect dimensionality. By comparison with the uncon-strained estimator, which is known to be unbiased [15], theCLS estimator is necessarily biased in the sense that E½uuCLSto some degree pushed away from E½uuLS ¼ u by theconstraints. It is vital to see that in the class of problemscurrently addressed the aforementioned bias is welcome. Itis assumed that the steady-state information is, for somereason, blurred in the dynamical data and therefore it has tobe imposed using hard constraints. Without such constraintsthe parameters are unbiased but the final model does nothave the desired steady-state behaviour.

6 A numerical example

In order to illustrate the procedure, a simple laboratorysystem was considered. For this system the followinginformation was used: (i) static tests revealed that, in therange of input signals used, the static nonlinear function f isbasically quadratic without saturation; (ii) the system doesnot present two possible states for the same input; and (iii) adynamical test was performed by applying a random inputto produce identification and validation data.

Because the system does not display multiple steadystates, term clusters of the form Oy p with p > 1 need not beconsidered, see item (a) in Section 5.2. From the datacollected during steady-state tests (see Fig. 1) it was inferredthat the overall shape of the static function is basicallyquadratic. Consequently, term clusters of the form Oum withm > 2 were discarded. This action is obviously subjective asit was based on a visual analysis of Fig. 1. Omitting thisaction would result in a processing time 1 or 2 s longer.

Using both the identification data and the ERR algorithm[14] to define the priority order of inclusion of each term inthe model followed by Akaike’s information criterion [23]to determine the number of terms, and finally the orthogonalextended least-squares algorithm to estimate parameters, thefollowing model was identified:

yðkÞ ¼ 1:2929yðk � 1Þ þ 0:0101uðk � 2Þuðk � 1Þþ 0:0407uðk � 1Þ2 � 0:3779yðk � 2Þ� 0:1280uðk � 2Þyðk � 1Þþ 0:0957uðk � 2Þyðk � 2Þ þ 0:0051uðk � 2Þ2

ð15Þ

Fig. 1 The static function predictions for models (19) and (15)compared to the measured values

IEE Proc.-Control Theory Appl., Vol. 151, No. 2, March 2004 177

Page 5: Imposing steady-state performance on identified nonlinear polynomial models by means of constrained parameter estimation

It is now desired to obtain another model with the samestructure but that will have a specified steady-state response.As suggested in Section 5, the starting point (step 1, inSection 5.1) is to fit a static function to the steady-state data.Steady-state (or cluster) analysis of model (15) yields thefollowing algebraic equation.

�yy ¼ Su2 �uu2

1 � Sy � Syu �uuð16Þ

which, of course, is nonlinear in the parameters. Using aquasi-Newton FBGS algorithm and taking as initialconditions the unconstrained least-squares solution, that is,the static function of (15), the following values for thecluster coefficients in (16) were estimated:

c ¼ ½Su2 Syu SyT ¼ ½0:0615 � 0:0360 0:9128T ð17Þ

In step 2 (see Section 5.1), the following set of constraints tothe optimisation problem, in the form c ¼ Su (see (3)), iswritten:

0:0615

�0:0360

0:9128

24

35 ¼

0 1 1 0 0 0 1

0 0 0 0 1 1 0

1 0 0 1 0 0 0

24

35

�1

�2

�3

�4

�5

�6

�7

2666666664

3777777775

ð18Þ

Hence, (5) takes into account (18) and yields (step 3 ofSection 5.1) the parameters of the following model:

yðkÞ ¼ 1:2796yðk � 1Þ þ 0:0178uðk � 2Þuðk � 1Þþ 0:0408uðk � 1Þ2 � 0:3668yðk � 2Þ� 0:2565uðk � 2Þyðk � 1Þþ 0:2205uðk � 2Þyðk � 2Þ þ 0:0029uðk � 2Þ2

ð19Þ

which by design has the desired static function:

�yy ¼ 0:0615�uu2

1 � 0:9128 þ 0:0360�uuð20Þ

The steady-state and dynamical performances of model (19)are shown in Figs. 1 and 2, respectively. The normalised

root-mean-squared errors of models (15) and (19) are,respectively, 0.0094 and 0.0073. In order to compute theseindices the free-run predictions were used and normalisedby the standard deviation of the measured data. Clearly,whereas both models have similar dynamical performances,only the model obtained by constrained least squares has thespecified steady-state performance which is closer to thestatic data collected from the system, as seen in Fig. 1.

7 Discussion and conclusions

The main challenge in black-box modelling is to obtain avalid model for a system exclusively from measureddynamical data. In many cases this will be necessarily thefirst approach to system modelling. However, it oftenhappens that other pieces of information about the systemare also available. The question then is: can this informationbe incorporated into the final model? In fact, the questionhas at least a three-fold aspect in practice. First, we shouldattempt to answer what kind of information can be used toimprove model building. Second, how does this type ofinformation relate to the model (structure and parameters)?Finally, if the first two aspects have been properlyunderstood and answered, it becomes vital to learn howsuch model-related information can be used effectively.

We have discussed and suggested answers to these threequestions for the case of nonlinear discrete-time polynomialmodels. As for the kind of information, we have usedknowledge of the system steady-state behaviour, which isassumed to be available in the form of measured data. Ofcourse, if this information is available in the form of analgebraic equation, e.g. derived from first principles, all thebetter, but this is surely a less frequent situation. Knowingwhether or not the system dynamics change with theoperating point is another piece of information that can alsobe used, but in a less stringent way. This knowledge is notassumed to be available, but if it is, we have suggested away of using it.

It has been shown that the aforementioned kind ofinformation relates to the model structure and parametersvia term clusters and cluster coefficients thus providing alink between the auxiliary information and the model. Forthe type of models considered such relations are veryamenable to constrained parameter estimation. In fact, it hasbeen shown how such relations can be employed to writedown a set of constraints in such a way that auxiliaryinformation can be used effectively by well knownconstrained least-squares techniques. Statistical propertiesof the new procedure have been discussed.

Choosing model terms from spurious term clusters willoften yield models with unwanted features. For instance, wemay end up with a model with dynamics that depend on theoperating point for a Hammerstein-type process, or themodel may have multi-stationary outputs for a mono-stationary process or vice-versa. With this in view, themodel representation considered can be thought of as aparsimonious nonlinear discrete-time polynomial model incontrast to the full-polynomial counterparts which arecriticised for having too many parameter and extrapolatingpoorly beyond training data.

The suggested identification procedure would be particu-larly interesting in modelling Hammerstein-type processesfor which dynamical data are available only around aspecific operating point. Hence, such data contain ‘all’ thedynamical information. In several cases, steady-stateinformation can be either gathered independently bymeans of steady-state tests or even by screening historicalrecords. If a purely black-box modelling procedure was used

Fig. 2 The free-run predictions of model (19) compared to themeasured data. The performance of model (15) is very similar tothe one on this Figure and is therefore omitted

IEE Proc.-Control Theory Appl., Vol. 151, No. 2, March 2004178

Page 6: Imposing steady-state performance on identified nonlinear polynomial models by means of constrained parameter estimation

in this setting, the process nonlinearity would most likelynot be correctly taken into account in the model, since thedynamical data have only ‘almost linear dynamicalinformation’. The nonlinear information can be imposedon the model as detailed, thus yielding a global nonlineardynamical model. If the process dynamics change with theoperating point, then a single set of dynamical data limitedto one operating point will not convey all the dynamicalinformation required to build a global model. In this case,the described procedure can still be used but the dynamicaltest would have to excite the process over a much widerrange in order to reveal dynamical nonlinearities.

8 Acknowledgments

The authors gratefully acknowledge the financial supportprovided by both CNPq and FAPEMIG.

9 References

1 Greblicki, W.: ‘Nonlinearity estimation in Hammerstein systems basedon ordered observations’, IEEE Trans. Signal Process., 1996, 44, (5),pp. 1224–1233

2 Pottmann, M., and Pearson, R.K.: ‘Back-oriented NARMAX modelswith output multiplicities’, AIChE J., 1998, 44, (1), pp. 131–140

3 Pearson, R.K., and Pottmann, M.: ‘Gray-box identification of block-oriented nonlinear models’, J. Process Control, 2000, 10, pp. 301–315

4 Aguirre, L.A., Donoso-Garcia, P.F., and Santos-Filho, R.: ‘Use ofa priori information in the identification of global nonlinear models —A case study using a Buck converter’, IEEE Trans. Circuits Syst. I,Fundam. Theory Appl., 2000, 47, (7), pp. 1081–1085

5 Hippe, P., and Wurmthaler, C.: ‘Systematic closed-loop design in thepresence of input saturations’, Automatica, 1999, 35, pp. 689–695

6 Thompson, M.L., and Kramer, M.A.: ‘Modeling chemical processesusing prior knowledge and neural networks’, Automatica, 1994, 40, (8),pp. 1328–1340

7 Tulleken, H.J.A.F.: ‘Grey-box modelling and identification usingphysical knowledge and Bayesian techiniques’, Automatica, 1993, 29,(2), pp. 285–308

8 Eskinat, E., Johnson, S.H., and Luyben, W.L.: ‘Use of auxiliaryinformation in system identification’, Ind. Eng. Chem. Res., 1993, 32,pp. 1981–1992

9 Johansen, T.A.: ‘Identification of non-linear systems using empiricaldata and prior knowledge - an optimization approach’, Automatica,1996, 32, p. 337

10 Abonyi, J., Babuska, R., and Szeifert, F.: ‘Fuzzy modeling withmultivariate membership functions: gray-box identification and controldesign’, IEEE Trans. Syst. Man Cybern. B, Cybern., 2001, 31, (5),pp. 755–767

11 Banerjee, A., and Arkun, Y.: ‘Model predictive control of planttransitions using a new identification technique for interpolatingnonlinear models’, J. Process Control, 1998, 8, (5,6), pp. 441–457

12 Murakami, K., and Seborg, D.E.: ‘Constrained parameter estimationwith applications to blending opearations’, J. Process Control, 2000,10, pp. 195–202

13 Leontaritis, I.J., and Billings, S.A.: ‘Input-output parametric models fornonlinear systems part II: Stochastic nonlinear systems’, Int. J. Control,1985, 41, (2), pp. 329–344

14 Chen, S., Billings, S.A., and Luo, W.: ‘Orthogonal least squaresmethods and their application to nonlinear system identification’, Int.J. Control, 1989, 50, (5), pp. 1873–1896

15 Soderstrom, T., and Stoica, P.: ‘System Identification’ (Prentice HallInternational, London, 1989)

16 Korenberg, M.J., Billings, S.A., Liu, Y.P., and McIlroy, P.J.:‘Orthogonal parameter estimation algorithm for nonlinear stochasticsystems’, Int. J. Control, 1988, 48, (1), pp. 193–210

17 Mendes, E.M.A.M., and Billings, S.A.: ‘An alternative solution to themodel structure selection problem’, IEEE Trans. Syst. Man Cybern.A. Syst. Humans, 2001, 36, (21), pp. 597–608

18 Draper, N.R., and Smith, H.: ‘Applied Regression Analysis’ (Wiley,New York, 1998, 3rd edn.)

19 Aguirre, L.A., and Billings, S.A.: ‘Improved structure selection fornonlinear models based on term clustering’, Int. J. Control, 1995, 62,(3), pp. 569–587

20 Aguirre, L.A., Correa, M.V., and Cassini, C.C.S.: ‘Nonlinearities inNARX polynomial models: representation and estimation’, IEE Proc.,Control Theory Appl., 2002, 149, (4), pp. 343–348

21 Aguirre, L.A., and Mendes, E.M.: ‘Nonlinear polynomial models:Structure, term clusters and fixed points’, Int. J. Bifurcation ChaosAppl. Sci. Eng., 1996, 6, (2), pp. 279–294

22 Bazaraa, M.S., Sherali, H.D., and Shetty, C.M.: ‘Nonlinear Program-ming – Theory and Algorithms’ (Wiley, New York, 1993)

23 Akaike, H.: ‘A new look at the statistical model identification’, IEEETrans. Autom. Control, 1974, 19, (6), pp. 716–723

IEE Proc.-Control Theory Appl., Vol. 151, No. 2, March 2004 179