
Automatica, Vol. 24, No. 5, pp. 629-641, 1988. Printed in Great Britain.

0005-1098/88 $3.00 + 0.00 Pergamon Press plc

© 1988 International Federation of Automatic Control

Modeling of Nonlinear Discrete-time Systems from Input-Output Data*

HERNANDO DIAZ† and ALAN A. DESROCHERS†

Polynomial difference equations are used to identify state variable models for nonlinear discrete-time systems from input-output data.

Key Words--Nonlinear systems; modeling; discrete-time systems; identification; polynomials.

Abstract--An algorithm is presented that produces a polynomial state affine model for a discrete-time nonlinear system. The method uses only input-output information and is based on a difference equation approximation to the input-output map. From the difference equation, a behavior matrix is constructed, to be used in the state affine realization algorithm. The difference equation approach requires the estimation of fewer parameters than the usual Volterra series. A state space model is then obtained directly from the difference equation. As a byproduct, a method for calculating the Volterra series directly from the difference equation is obtained. Practical applications to nonlinear systems are included.

INTRODUCTION

THE PROBLEM under investigation here is the modeling of nonlinear discrete-time systems from a set of input-output data. This is often the only approach to modeling since, in most cases, only external (i.e. input-output) data are available.

The proposed method fits a polynomial difference equation (relating present and past values of inputs and outputs) to the input-output information. Once the difference equation is obtained, a state space realization is constructed from it.

The central contribution of this work is a method for generating behavior or generalized Hankel matrices, directly from a difference equation description. This behavior matrix is then used as the basis for a state affine realization algorithm, that provides a polynomial state space model of the system.

As a result of this approach, a method for computing the Volterra series of the system directly from the difference equation is obtained. This, when combined with an approximation algorithm that fits a polynomial difference equation to the input-output data, provides a new method for the calculation of the Volterra series for a discrete-time nonlinear system. The method requires only the estimation of the coefficients of the difference equation (a relatively easy task). This is in contrast to the usual methods for computing Volterra series, which estimate the kernels directly from the input-output data. This approach allows the use of smaller data sets.

* Received 5 January 1987; revised 13 January 1988; received in final form 5 February 1988. The original version of this paper was presented at the 10th IFAC World Congress which was held in Munich, F.R.G. during July 1987. The Published Proceedings of this IFAC Meeting may be ordered from: Pergamon Press plc, Headington Hill Hall, Oxford OX3 0BW, England. This paper was recommended for publication in revised form by Associate Editor V. Utkin under the direction of Editor H. Kwakernaak.

† Electrical, Computer and Systems Engineering Department, Rensselaer Polytechnic Institute, Troy, NY 12180-3590, U.S.A.

The main objective of this paper is to present a realization algorithm that provides a state space model of a nonlinear system described by a polynomial difference equation. This is achieved without the intermediate step of determining a complete Volterra series description.

The next section reviews several related works. Then the theoretical framework for the development of the modeling method is established. Next, we consider the approximating properties of state affine systems, and proceed to find conditions for the existence of difference equation approximations of nonlinear systems. This provides a theoretical basis for the modeling algorithm.

The input-output description of nonlinear systems is considered next. We concentrate on the description of a system by a difference equation, relating present and past values of the input and the output functions. The Kronecker matrix tensor product notation (introduced in Appendix A) is used to describe these difference equations. The tensor products are then used to obtain a Volterra series for the system directly from the polynomial difference equation.

Finally, the method is applied to a biological system.


INPUT-OUTPUT DESCRIPTION OF NONLINEAR SYSTEMS

Continuous-time systems

Generally, the works that use an input-output approach to the modeling of nonlinear systems employ the Volterra series representation.

Wiener was the first to use Volterra series to analyze the response of nonlinear systems to white noise excitations (Wiener, 1958). After some initial work with Volterra series, enthusiasm began to fade away, because of the difficulty of obtaining the Volterra kernels for practical systems. The question of convergence was also largely unanswered.

In 1976, Brockett (Brockett, 1976) developed convergence criteria for systems of the form:

$$\dot{x}(t) = f(t, x(t)) + u(t)\,g(t, x(t)) \tag{1}$$

$$y(t) = h(t, x(t)) \tag{2}$$

$$x(0) = x_0 \tag{3}$$

where $f(\cdot,\cdot)$, $g(\cdot,\cdot)$ and $h(\cdot,\cdot)$ are real analytic functions defined in some neighborhood of the free response [i.e. with $u(t) = 0$ for all $t$]. Using some results from differential geometry and Lie algebra theory, Brockett proved that if the system has no finite escape time for $u(t) \equiv 0$, then it admits a Volterra series representation.

Brockett also devised a general algorithm to find the Volterra kernels. His method is based on the so-called Carleman linearization (Krener, 1974) and followed a paper (d'Alessandro et al., 1974) on bilinear systems. Lesiak and Krener (Lesiak and Krener, 1978) extended the results of Brockett to cases in which the functions $f(\cdot,\cdot)$ and $h(\cdot,\cdot)$ are only of class $C^{k+1}$ and $g(\cdot)$ is $C^k$. They also considered the problem of the uniqueness of the Volterra series representation.

The existence and uniqueness results were generalized by Boyd and Chua (Boyd and Chua, 1984), who introduced the concept of fading memory operators. They showed that if a system has the fading memory property, its response can be approximated with arbitrary precision for all bounded input functions with a Lipschitz constant $Q$, by a Volterra series operator $N$. The results can also be applied to discrete-time systems (Boyd and Chua, 1984).

Discrete-time systems

The input-output description of a single input-single output nonlinear system can be expressed as a series:

$$y(t) = \sum_{i=0}^{\infty} L_{t,i}(u(0), \ldots, u(t-1)) \tag{4}$$

where $L_{t,i}$ is a homogeneous polynomial of degree $i$ in the inputs $u(0), \ldots, u(t-1)$, and $t$ is now a discrete-time variable.

This type of representation is generally confronted with the same problems as the continuous-time version: the calculation of the kernels constitutes a very difficult problem.

In order to avoid this problem, several techniques have been utilized. Sontag (Sontag, 1979a, b) studied a more general type of input-output relation known as a response map.

Using the response map description, Sontag developed a very general realization theory for a class of nonlinear systems called state affine systems defined by

$$X(t+1) = F(u(t))\,X(t) + G(u(t))$$

$$y(t) = H X(t) \tag{5}$$

where F(.) and G(.) are polynomial matrices and H is a linear map.

Many related results were also obtained by Sontag and will be discussed later.

In an effort to find a method for the calculation of the Volterra kernels, the principles of noncommutative power series have also been applied (Fliess, 1978). The application has been limited to bilinear systems and the extension to more general models has not been accomplished.

The failure of the noncommutative coding led to the definition of a generalized generating series which can be constructed via tensor products of functions. Using the generating series, Fliess and co-workers have been able to obtain several controllability results, given in terms of rank conditions on a Lie algebra (Normand-Cyrot, 1982). The technique uses some tools from formal differential groups, developed by Ritt (Ritt, 1950).

Normand-Cyrot has also developed an explicit formula for the discrete Volterra series, in terms of tensor products of differential operators (Normand-Cyrot, 1981).

de Figueiredo et al. (de Figueiredo and Dwyer, 1980; de Figueiredo, 1983) considered a Volterra series as an element of a Hilbert space made up of an infinite tensor product of the functional space on which the input functions are defined. This space is known as a Fock space. Every element of the Fock space can be identified with a Volterra series, so the problem of optimal modeling of Volterra series becomes a linear approximation problem in a Hilbert space. The resultant approximate system is a linear time-varying system with a nonlinear readout map. The time-varying model can be a serious drawback.

Another method based on input-output data was proposed by Costanza and Dickinson (Dickinson, 1983; Costanza et al., 1983). It uses orthogonal polynomials in the inputs to approximate the input-output behavior. A sequence of realizations is thus obtained, until adequate error bounds are achieved.

Our approach to the problem of modeling nonlinear systems centers on a description of the input-output map by means of a difference equation. This has the advantage of requiring fewer parameters than a full Volterra series description.

Once the difference equation model has been determined, a state affine realization algorithm to find a state space model for the system is employed. All the necessary calculations for the application of the realization step are based on the coefficients of the difference equation, as opposed to other approaches (Costanza et al., 1983) which estimate all the necessary matrices for the realization algorithm directly from the input-output data.

On the approximation of nonlinear systems by difference equations

In this section, the question of what systems can be approximately modeled by regression-type difference equations, in particular polynomial equations, is considered. It will be proven that all finitely realizable continuous input-output maps can be arbitrarily well approximated by a polynomial regression equation. It is assumed that the systems are single input-single output; extension to multi input-multi output systems does not pose any serious theoretical problem, although the notation becomes cumbersome.

Let $U$ (set of input values) and $Y$ (set of outputs) be subsets of the real line, $\mathbb{R}$. $U^t$ denotes the set of sequences $w = \{w(1), \ldots, w(t)\}$ of length $t$, with $w(i)$ in $U$. $U^+$ (respectively, $U^*$) is the union of all $U^t$, $t \ge 1$ (respectively, $t \ge 0$).

A sequence will also be denoted by $w = w(1)w(2) \cdots w(t)$ [not to be confused with the product of the $w(i)$s].

A response is a map $f : U^+ \to Y$. The response map gives the present value of the output as a function of the past values of the input.

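Let $\{\delta_i\}_{i=0}^{r}$ be a set of functions from $U$ into $\mathbb{R}$. Let $J_r = \{0, 1, \ldots, r\}$ and $J_r^t = J_r \times J_r \times \cdots \times J_r$ ($t$ times). The elements of $J_r^t$ are multi-indices $\alpha = \alpha_1 \alpha_2 \cdots \alpha_t$, with $\alpha_i$ in $J_r$. For any $\alpha$ in $J_r^t$, let $\delta_\alpha(w)$ denote the product

$$\delta_\alpha(w) = \delta_{\alpha_1}(w(1))\,\delta_{\alpha_2}(w(2)) \cdots \delta_{\alpha_t}(w(t)).$$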

A response $f$ is bounded of type $J = \{\delta_0, \ldots, \delta_r\}$, with $\delta_i$ as defined above, if for each $t \ge 1$ there are finitely many numbers $a_\alpha$ in $Y$ such that:

$$f(w) = \sum_{\alpha} a_\alpha\, \delta_\alpha(w) \tag{6}$$

for all $w$ in $U^t$. The summation runs over $\alpha$ in $J_r^t$.

Without loss of generality, it will be assumed that $\delta_0 \equiv 1$, and that the functions in $J$ are linearly independent.

The definition of bounded response map includes discrete-time systems of the form: linear, bilinear, multilinear, polynomial and homogeneous of degree $p$. For instance, if the $\delta_i$s are given by $J = \{1, u, u^2, \ldots, u^r\}$, then we obtain a polynomial response. Generalization to multivariable systems can be done easily.

A system $\Sigma = (X, P, Q, \hat{x})$ consists of a vector space $X$ (the state space), maps $P : X \times U \to X$ (the state transition map) and $Q : X \times U \to Y$ (the output or readout map), and an $\hat{x}$ in $X$, called the initial state. This definition corresponds to a set of difference equations:

$$x(t+1) = P(x(t), u(t)) \tag{7}$$

$$y(t) = Q(x(t), u(t)). \tag{8}$$

A special, very important, case is the state affine (SA) system,

$$x(t+1) = F(u(t))\,x(t) + G(u(t))$$

$$y(t) = H(u(t))\,x(t) + I(u(t))$$

where $F(u)$ and $H(u)$ are linear operators, $G(u)$ is a vector, and $I(u)$ is a real number.

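An SA system is of finite type $J = \{\delta_0, \ldots, \delta_r\}$ if there are linear maps $F_i$, $G_i$, $H_i$, $I_i$ such that

$$x(t+1) = \sum_{i=0}^{r} \delta_i(u(t))\,F_i\,x(t) + \sum_{i=0}^{r} \delta_i(u(t))\,G_i$$

$$y(t) = \sum_{i=0}^{r} \delta_i(u(t))\,H_i\,x(t) + \sum_{i=0}^{r} \delta_i(u(t))\,I_i. \tag{9}$$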
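As an illustration of the finite-type state affine form, the following minimal sketch iterates the two recursions for given functions and matrices. The function name `simulate_state_affine`, the scalar example system, and its numerical values are assumptions made for this illustration and do not come from the paper.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def mat_vec(m: Matrix, v: Vector) -> Vector:
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def simulate_state_affine(
    deltas: Sequence[Callable[[float], float]],
    F: Sequence[Matrix],
    G: Sequence[Vector],
    H: Sequence[Vector],
    I: Sequence[float],
    x0: Vector,
    inputs: Sequence[float],
) -> List[float]:
    """Iterate the finite-type SA recursions and return y(0), y(1), ..."""
    x = list(x0)
    outputs = []
    for u in inputs:
        weights = [d(u) for d in deltas]
        # The output uses the current state and the current input.
        y = sum(
            w * (sum(h * xi for h, xi in zip(Hi, x)) + Ii)
            for w, Hi, Ii in zip(weights, H, I)
        )
        outputs.append(y)
        # State update: sum_i delta_i(u) * (F_i x + G_i).
        x_next = [0.0] * len(x)
        for w, Fi, Gi in zip(weights, F, G):
            Fx = mat_vec(Fi, x)
            for k in range(len(x)):
                x_next[k] += w * (Fx[k] + Gi[k])
        x = x_next
    return outputs


# Hypothetical scalar example with delta_0(u) = 1 and delta_1(u) = u:
#   x(t+1) = 0.5 x(t) + u(t) x(t) + u(t),   y(t) = x(t)
deltas = [lambda u: 1.0, lambda u: u]
outputs = simulate_state_affine(
    deltas,
    F=[[[0.5]], [[1.0]]],
    G=[[0.0], [1.0]],
    H=[[1.0], [0.0]],
    I=[0.0, 0.0],
    x0=[0.0],
    inputs=[1.0, 1.0, 0.0],
)
print(outputs)  # [0.0, 1.0, 2.5]
```

The choice of the family {1, u} makes this a bilinear system, one of the special cases covered by the definition of a bounded response.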

Now we define a concept that will be the center of our work, namely, a realization of an input-output map. This requires the concepts of reachability and observability maps.

The extended transition map $P^* : X \times U^* \to X$ is defined recursively by $P^*(x, e) := x$, where $e$ is the empty sequence, and $P^*(x, wu) := P(P^*(x, w), u)$ for $w$ in $U^*$ and $u$ in $U$, where $wu$ indicates the concatenation $wu := w(1) \cdots w(t)u$. The reachability map of $\Sigma$ is defined as $g : U^* \to X$, by $g(w) := P^*(\hat{x}, w)$. Therefore, $g$ gives the resulting state after the application of an input sequence $w$, starting from $\hat{x}$. The image of the map $g$ is called the set of reachable states.

For any $w = vu$ in $U^+$, with $v$ in $U^*$ and $u$ in $U$, define $Q(x, w) := Q(P^*(x, v), u)$. This is called the observability map.

The response map associated to $\Sigma$, $f_\Sigma : U^+ \to Y$, is defined for $w = vu$ by:

$$f_\Sigma(w) = Q(P^*(\hat{x}, v), u). \tag{10}$$
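The recursive definition of the extended transition map and the response map can be sketched in a few lines. The names `extended_transition` and `response`, and the scalar maps `P` and `Q` used in the example, are hypothetical and chosen only to make the definitions concrete.

```python
from typing import Callable, Sequence


def extended_transition(
    P: Callable[[float, float], float], x: float, w: Sequence[float]
) -> float:
    """P*(x, w): apply the transition map P along the input sequence w.

    The empty sequence leaves the state unchanged, matching P*(x, e) = x.
    """
    for u in w:
        x = P(x, u)
    return x


def response(
    P: Callable[[float, float], float],
    Q: Callable[[float, float], float],
    x_init: float,
    w: Sequence[float],
) -> float:
    """Response map: Q(P*(x_init, v), u) for a nonempty sequence w = vu."""
    if len(w) == 0:
        raise ValueError("the response map is defined on nonempty sequences")
    v, u = w[:-1], w[-1]
    return Q(extended_transition(P, x_init, v), u)


# Hypothetical scalar system: x(t+1) = 0.5 x(t) + u(t), y(t) = x(t) u(t).
P = lambda x, u: 0.5 * x + u
Q = lambda x, u: x * u
value = response(P, Q, 0.0, [1.0, 2.0, 3.0])
print(value)  # 7.5
```

Here the state after the prefix v = (1, 2) is 2.5, and the readout map is then evaluated with the last input u = 3.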

The system $\Sigma$ is said to be a realization of $f_\Sigma$.

Now we define the concepts of reachability and observability. $\Sigma$ is span reachable if $X$ is the smallest affine manifold (translate of a linear subspace) that contains the set of reachable states. It is called observable if the functions $Q(x, \cdot) : U^+ \to Y$, for $x$ in $X$, are all distinct. Finally, the system is span canonical if it is both span reachable and observable.

The following result establishes the connection between bounded responses and finite-type systems.

Theorem. A response $f$ is bounded, of type $J$, iff its span canonical realization $\Sigma_f$ is of finite type $J$.

For a proof of this result see Sontag (1979a, b).

Two other properties of polynomial systems that are essential to our work will now be summarized. The results are taken from Sontag (1979a, b), where proofs can be found.

Theorem. Any response map $f$ has a span canonical state affine realization $\Sigma_f$, and this realization is unique, up to isomorphisms.

Now we define a difference equation description of a discrete-time nonlinear system which is very important in the development to follow.

A response is said to satisfy a polynomial difference equation (of order $s$) if

$$\Theta(y(t), y(t-1), \ldots, y(t-s), u(t-1), \ldots, u(t-s)) = 0 \tag{11}$$

where $\Theta$ is a polynomial in $y(t), \ldots, y(t-s)$, $u(t), \ldots, u(t-s)$, holds for all input-output pairs $(y, u)$, and $y(t) = f(u(1), \ldots, u(t))$, $t > s$.

The equation is (output) affine, if it is of the form:

$$\sum_{i=0}^{s} b_i(u(t), \ldots, u(t-s))\, y(t-i) + b_{s+1}(u(t), \ldots, u(t-s)) \tag{12}$$

with $b_i$ a polynomial, and $b_0 \neq 0$.

Theorem. A polynomial response satisfies an output affine difference equation iff it is SA finitely realizable. In this case, it also satisfies an output-linear equation (i.e. one with $b_{s+1} = 0$).

This theorem is an important result because it implies the existence of difference equation descriptions of polynomial systems.

Now we proceed to prove that every response map can be approximated by a difference equation. To prove this result we need a very important theorem due to Fliess and Normand-Cyrot (1982).

Theorem (Fliess and Normand-Cyrot). On a finite time interval and with bounded inputs, any input-output map in which the output depends continuously on the input can be approximated arbitrarily well by state affine systems.

Proof. To prove the theorem one can show that all conditions of the Stone-Weierstrass theorem are satisfied. A modification of the proof by Fliess and Normand-Cyrot (1982) can be found in Diaz (1986).

Now consider the question of approximating the input-output map of polynomial systems by means of polynomial regression-type difference equations of the form:

$$y(t) = p(y(t-1), \ldots, y(t-s), u(t-1), \ldots, u(t-s)). \tag{13}$$

According to the previous theorems, a finitely realizable SA system satisfies an equation of the form:

$$0 = b_1(y(t-1), \ldots, y(t-s), u(t-1), \ldots, u(t-s)) + b_0(u(t-1), \ldots, u(t-s))\, y(t). \tag{14}$$

The existence of a regression-type equation is tied to the possibility of solving the last equation for $y(t)$. One possibility is to use the implicit function theorem to find a solution in some open set. Or, even better, we can divide through by $b_0$:

$$y(t) = -\frac{b_1(y(t-1), \ldots, y(t-s), u(t-1), \ldots, u(t-s))}{b_0(u(t-1), \ldots, u(t-s))}. \tag{15}$$

The last expression, however, is only valid at points where $b_0 \neq 0$ (this is also the Jacobian condition of the implicit function theorem).

Since $b_0$ is a polynomial in $u(t-1), \ldots, u(t-s)$, it can only have a finite number of zeros. Thus the last expression gives a rational regression-type difference equation valid for all points, except at the zeros of the polynomial $b_0$. Therefore we have proved the following lemma.

Lemma. Any finitely realizable SA system satisfies a regression-type difference equation, almost everywhere.

Finally, consider the approximation of the rational function in equation (15) by polynomials. Assume that the inputs are all bounded and that s (the order of the difference equation) is finite. By continuity of the input-output map, the outputs will also be bounded.

Select a compact set in the $2s$-dimensional space of the $y$s and the $u$s, where the Weierstrass theorem can be applied to give a polynomial approximation:

$$y(t) = p(y(t-1), \ldots, y(t-s), u(t-1), \ldots, u(t-s)) \tag{16}$$

to the rational difference equation. This approximation is valid everywhere on a compact set that does not include the zeros of $b_0$.

Combining this argument with the theorems of this section, we have the following result.

Theorem. Every continuous input-output map with bounded inputs can be approximated arbitrarily well, over a finite period of time, by a system satisfying a regression-type polynomial equation, except in the neighborhood of a finite number of points.

Remarks

(1) The result of this theorem constitutes the justification for developing a modeling method for nonlinear systems based on difference equations.

(2) From a practical point of view, the restrictions of the last theorem do not appear too serious. Since the modeling has to be done over a finite number of samples, the finite time constraint is always satisfied. The second limitation, namely, that there are points where the approximation is not valid, is more serious. However, since we are dealing with a finite number of points, it is unlikely (though certainly not impossible) that the problem will appear.

MODELING OF NONLINEAR SYSTEMS

The usual description of the input-output characteristics of a single input-single output nonlinear system is a discrete Volterra series of the form:

$$y(t) = h_0 + \sum_{i \ge 0} h_{1i}\,u(t-i) + \sum_{i \ge 0} \sum_{j \ge 0} h_{2ij}\,u(t-i)\,u(t-j) + \cdots \tag{17}$$

where $h_0$, $h_{1i}$, etc., are the Volterra kernels.

For a Volterra series description of a nonlinear system, consider that, in order to obtain a good approximation to the first-order kernels, it is necessary to estimate about 30 points (Billings, 1984). To obtain the same accuracy for the second-order terms requires $(30 \times 30)/2 = 450$ points, taking into account the symmetry. The third-order kernels require $(30 \times 30 \times 30)/3 = 9000$ points. Thus, a third-order approximation to the input-output map requires the estimation of more than 9000 points. Consequently, the number of input-output data required to estimate all the parameters is extremely large. The estimation process itself requires an enormous amount of computation.

To circumvent these problems, we propose to describe the input-output behavior of a system by an equation of the form:

$$y(t) = p(y(t-1), \ldots, y(t-s), u(t-1), \ldots, u(t-s)). \tag{18}$$

It is basically a generalization of the concept of an ARMA (autoregressive moving average) process. The rationale behind the choice of this model is the fact that the output contains a great deal of information about the state of the system.

Models of type (18) have been proposed in the literature, and have been called NARMA (nonlinear ARMA) (Billings, 1984). Previous research does not consider the relationship between difference equations and Volterra series, or state space realizations, which is the major focus of this work.

One of the main advantages of the difference equation description is the small number of parameters it requires. Generally, fewer than 30 parameters suffice for a polynomial difference equation. Hence, from a practical point of view, the difference equation is more convenient than the Volterra series. In addition, as we shall see in the next section, once the difference equation description is known, the Volterra kernels can be computed easily.
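To make the comparison of parameter counts concrete, the sketch below counts the distinct monomials of degree 1 to m in the 2s regressors of a polynomial difference equation and sets that against the kernel point counts quoted from Billings (1984). The helper `polynomial_coefficient_count` and the choice s = 2 are assumptions for illustration; a fitted model may well use only a subset of these candidate coefficients.

```python
from math import comb


def polynomial_coefficient_count(s: int, m: int) -> int:
    """Distinct monomials of degree 1..m in the 2s variables y(t-1..t-s), u(t-1..t-s)."""
    n = 2 * s
    return comb(n + m, m) - 1  # subtract the constant term


# Kernel point counts quoted in the text for orders one, two and three.
volterra_points = [30, (30 * 30) // 2, (30 * 30 * 30) // 3]
total_volterra = sum(volterra_points)

print(total_volterra)                      # 9480
print(polynomial_coefficient_count(2, 2))  # 14
print(polynomial_coefficient_count(2, 3))  # 34
```

Even a full cubic in four regressors has a few dozen candidate coefficients, against several thousand kernel values for a third-order Volterra description.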

First, assume that the initial state is zero, that $u(t) = 0$ for $t < 0$, and that the system is described by a polynomial difference equation of the form (18). The next result provides a theoretical justification for the proposed procedure.

Theorem. If a nonlinear system satisfies a polynomial difference equation of type (18) with degree $m$, then the response of the system to an input $\alpha u(t)$, where $\alpha$ is an arbitrary real parameter, is given by a polynomial in $\alpha$ as

$$y(t) = \alpha y_1(t) + \alpha^2 y_2(t) + \cdots + \alpha^r y_r(t) \tag{19}$$

of degree $r$ ($r < \infty$), where $y_i(t)$ is a function of $t$, to be determined.

Proof. For $t = 0$, the condition is trivially satisfied. For $t = 1$,

$$y(1) = p(y(0), 0, \ldots, 0, \alpha u(0), 0, \ldots, 0)$$

and from (18),

$$y(1) = \alpha y_1(1) + \alpha^2 y_2(1) + \cdots + \alpha^m y_m(1). \tag{20}$$

For constant $y(0)$ and $u(0)$, this is a polynomial in $\alpha$. Also, no constant term is needed, because when $\alpha = 0$ (zero input), the system remains in equilibrium.

This also gives a polynomial in $\alpha$ for $y(2)$, etc. Using induction, we will now prove that the result holds for all $t$.

Assume that the result is true for $t-1, \ldots, t-s$, i.e. $y(t-1), \ldots, y(t-s)$ are each a polynomial in $\alpha$. Obviously, $\alpha u(t-1), \ldots, \alpha u(t-s)$ are also polynomials in $\alpha$.

Each monomial in $p[y(t-1), \ldots, y(t-s), \alpha u(t-1), \ldots, \alpha u(t-s)]$ then has the form:

$$a\,\alpha^{m}\, y(t-1)^{n_1} \cdots y(t-s)^{n_s}\, u(t-1)^{m_1} \cdots u(t-s)^{m_s} \tag{21}$$

where $m = \sum_{i=1}^{s} m_i$.

Because the $y$s are polynomials in $\alpha$, this expression is a product of polynomials in $\alpha$, which is itself a polynomial in $\alpha$.

Therefore, $y(t) = p(y(t-1), \ldots, y(t-s), \alpha u(t-1), \ldots, \alpha u(t-s))$ is a polynomial of the form:

$$y(t) = \alpha y_1(t) + \alpha^2 y_2(t) + \alpha^3 y_3(t) + \cdots \tag{22}$$

as asserted. $\square$

In what follows, we exploit this property to obtain the Volterra series of a system, from its difference equation description.

Consider a general polynomial regression equation of the form:

$$y(t) = p(y(t-1), y(t-2), \ldots, y(t-s), u(t-1), \ldots, u(t-s)) \tag{23}$$

with $p$ a polynomial function of, say, degree $m$.

In order to simplify the analysis to follow, define $Z(t)$ to be the vector:

$$Z(t) \equiv [y(t-1), \ldots, y(t-s), u(t-1), \ldots, u(t-s)]^T. \tag{24}$$

This notation allows the polynomial $p$ to be expressed as

$$y(t) = F_1 Z(t) + F_2 Z^{(2)}(t) + \cdots + F_m Z^{(m)}(t) \tag{25}$$

where $Z^{(i)}(t)$ is the $i$-th Kronecker power of $Z(t)$, defined in Appendix A, and each $F_i$ is a coefficient matrix of dimension $1 \times (2s)^i$.

In a practical application, the coefficient matrices $F_i$ are obtained from the input-output data.
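The Kronecker-power form of the polynomial can be evaluated directly, as the following sketch suggests. The helpers `kron`, `kron_power` and `evaluate_polynomial`, as well as the coefficient values, are hypothetical; the example takes s = 1, so that Z = [y(t-1), u(t-1)] and the second Kronecker power is [y², yu, uy, u²].

```python
from typing import List, Sequence


def kron(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Kronecker product of two vectors stored as flat lists."""
    return [x * y for x in a for y in b]


def kron_power(z: Sequence[float], i: int) -> List[float]:
    """i-fold Kronecker power of z, a vector with len(z)**i entries."""
    out = list(z)
    for _ in range(i - 1):
        out = kron(out, z)
    return out


def evaluate_polynomial(F: Sequence[Sequence[float]], z: Sequence[float]) -> float:
    """Evaluate F_1 Z + F_2 Z^(2) + ... for row vectors F_i of length len(z)**i."""
    return sum(
        sum(c * v for c, v in zip(F_i, kron_power(z, i)))
        for i, F_i in enumerate(F, start=1)
    )


# Hypothetical coefficients: p = y + u + y^2 + u^2 with Z = [y(t-1), u(t-1)].
z = [2.0, 3.0]
F = [[1.0, 1.0], [1.0, 0.0, 0.0, 1.0]]
value = evaluate_polynomial(F, z)
print(value)  # 18.0
```

Note that the Kronecker power repeats mixed products such as yu and uy, so the row vectors carry some redundancy relative to the number of distinct monomials.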

Now, consider the response of the system when the input $\alpha u(t)$ is applied, where $\alpha$ is an arbitrary real parameter. This response can be expressed as a polynomial in $\alpha$ of the form:

$$y(t) = \alpha y_1(t) + \alpha^2 y_2(t) + \alpha^3 y_3(t) + \cdots \tag{26}$$

with $y_i(t)$ a function of $t$, yet to be determined.

Replacing this expression for $y$ in the input-output equation (25), an equality of two polynomials in $\alpha$ is obtained. Since $\alpha$ is arbitrary, the coefficients of similar powers of $\alpha$ should be equal. In this way it is possible to obtain a sequence of difference equations for $y_1(t)$, $y_2(t)$, etc.

We will illustrate the calculation of the first few kernels. When the input is $\alpha u(t)$, the corresponding vector $Z(t)$ is:

$$Z(t) = [y(t-1), \ldots, y(t-s), \alpha u(t-1), \ldots, \alpha u(t-s)]^T \tag{27}$$

which, in view of equation (26), may be expressed as:

$$Z(t) = \alpha Z_1(t) + \alpha^2 Z_2(t) + \alpha^3 Z_3(t) + \cdots \tag{28}$$

with

$$Z_1(t) = [y_1(t-1), \ldots, y_1(t-s), u(t-1), \ldots, u(t-s)]^T \tag{29}$$

and

$$Z_i(t) = [y_i(t-1), \ldots, y_i(t-s), 0, \ldots, 0]^T, \quad i = 2, 3, \ldots \tag{30}$$

These expressions, together with equation (25), will provide a method for the calculation of the Volterra kernels. But first, we need some Kronecker products in the development to follow. Consider the following:

$$Z^{(2)} = (\alpha Z_1 + \alpha^2 Z_2 + \alpha^3 Z_3 + \cdots) \otimes (\alpha Z_1 + \alpha^2 Z_2 + \alpha^3 Z_3 + \cdots) \tag{31}$$

or,

$$Z^{(2)} = \alpha^2 Z_1^{(2)} + \alpha^3 (Z_1 \otimes Z_2 + Z_2 \otimes Z_1) + \alpha^4 (Z_1 \otimes Z_3 + Z_2^{(2)} + Z_3 \otimes Z_1) + \cdots. \tag{32}$$

Similarly,

$$Z^{(3)} = \alpha^3 Z_1^{(3)} + \cdots. \tag{33}$$


Using these expressions in equation (25), along with equation (28), equating it to (26), and then keeping terms up to third order in $\alpha$,

$$\alpha y_1(t) + \alpha^2 y_2(t) + \alpha^3 y_3(t) + \cdots = \alpha F_1 Z_1 + \alpha^2 (F_1 Z_2 + F_2 Z_1^{(2)}) + \alpha^3 [F_1 Z_3 + F_2 (Z_1 \otimes Z_2 + Z_2 \otimes Z_1) + F_3 Z_1^{(3)}] + \cdots \tag{34}$$

where the dots indicate terms of degree higher than three in $\alpha$.

Equating coefficients of like powers of $\alpha$:

$$y_1(t) = F_1 Z_1(t) \tag{35}$$

$$y_2(t) = F_1 Z_2(t) + F_2 Z_1^{(2)}(t) \tag{36}$$

$$y_3(t) = F_1 Z_3(t) + F_2 (Z_1 \otimes Z_2 + Z_2 \otimes Z_1)(t) + F_3 Z_1^{(3)}(t). \tag{37}$$

Recalling the definition of $Z_i$, $i = 1, 2, 3$, and replacing them in the equations above:

$$y_1(t) = F_1^y y_1(t-1) + \cdots + F_s^y y_1(t-s) + F_1^u u(t-1) + \cdots + F_s^u u(t-s) \tag{38}$$

where we have defined

$$F_j^y = F_{1,j} \tag{39}$$

and

$$F_j^u = F_{1,j+s}. \tag{40}$$

This gives a linear difference equation in $y_1(t)$.

Similarly,

$$y_2(t) = F_1^y y_2(t-1) + \cdots + F_s^y y_2(t-s) + F_2 Z_1^{(2)}(t). \tag{41}$$

The last equation may be written as:

$$y_2(t) - F_1^y y_2(t-1) - \cdots - F_s^y y_2(t-s) = F_2 Z_1^{(2)}(t) \tag{42}$$

which is another linear difference equation with a forcing term that is already known [from the solution of the equation for $y_1(t)$].

In a similar way, for $\alpha^3$,

$$y_3(t) - F_1^y y_3(t-1) - \cdots - F_s^y y_3(t-s) = F_3 Z_1^{(3)}(t) + F_2 (Z_1 \otimes Z_2 + Z_2 \otimes Z_1)(t). \tag{43}$$

Each of the expressions for $y_i(t)$, $i = 1, 2, 3$, is a linear difference equation with a forcing term depending on $y_{i-1}(t)$, $y_{i-2}(t)$, etc. These equations can be solved in closed form to produce the expressions for the Volterra kernels. Thus, the problem of obtaining the Volterra kernels has been reduced to the solution of a sequence of linear equations. The coefficients, $F_j^y$, of these equations are obtained from the original difference equation description.
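The sequence of linear difference equations can be checked numerically. The sketch below, which assumes a hypothetical degree-two equation with s = 1 and made-up coefficients, computes the first two homogeneous components by the recursions for y_1 and y_2 and compares them with the odd and even parts of the simulated response to the inputs αu and -αu for a small α. The agreement is only approximate, since higher-order components contribute terms of relative order α².

```python
from typing import List, Sequence, Tuple


def kron(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Kronecker product of two vectors stored as flat lists."""
    return [x * y for x in a for y in b]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def regressor(y: Sequence[float], u: Sequence[float], t: int, s: int) -> List[float]:
    """[y(t-1..t-s), u(t-1..t-s)], with zero values before t = 0."""
    def past(seq: Sequence[float], k: int) -> float:
        return seq[k] if k >= 0 else 0.0
    return [past(y, t - i) for i in range(1, s + 1)] + \
           [past(u, t - i) for i in range(1, s + 1)]


def simulate(F1, F2, u, s) -> List[float]:
    """Full response of y(t) = F1 Z(t) + F2 Z^(2)(t) from rest."""
    y = [0.0] * len(u)
    for t in range(len(u)):
        Z = regressor(y, u, t, s)
        y[t] = dot(F1, Z) + dot(F2, kron(Z, Z))
    return y


def first_two_components(F1, F2, u, s) -> Tuple[List[float], List[float]]:
    """y_1 and y_2 from the linear recursions for the first two orders."""
    n = len(u)
    y1, y2, zeros = [0.0] * n, [0.0] * n, [0.0] * n
    for t in range(n):
        Z1 = regressor(y1, u, t, s)      # past y_1 values and past inputs
        Z2 = regressor(y2, zeros, t, s)  # past y_2 values, zero input part
        y1[t] = dot(F1, Z1)
        y2[t] = dot(F1, Z2) + dot(F2, kron(Z1, Z1))
    return y1, y2


# Hypothetical coefficients for Z = [y(t-1), u(t-1)].
F1 = [0.5, 1.0]
F2 = [0.1, 0.0, 0.0, 0.2]
u = [1.0, -0.5, 0.25, 1.0, 0.0, 0.5]
alpha = 1e-3

y_plus = simulate(F1, F2, [alpha * v for v in u], 1)
y_minus = simulate(F1, F2, [-alpha * v for v in u], 1)
y1, y2 = first_two_components(F1, F2, u, 1)

# Odd part / alpha approximates y_1; even part / alpha^2 approximates y_2.
max_err_y1 = max(abs((p - q) / (2 * alpha) - a) for p, q, a in zip(y_plus, y_minus, y1))
max_err_y2 = max(abs((p + q) / (2 * alpha ** 2) - b) for p, q, b in zip(y_plus, y_minus, y2))
print(max_err_y1, max_err_y2)
```

The symmetric differences are used here only as an independent check; the components themselves come from the linear recursions and need no numerical differentiation.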

In order to find a closed form expression for $y_1(t)$, we consider a state space realization of the difference equation (38) as follows.

Let us define the shifting operator $E$ by $E y(t) = y(t+1)$. Using this operator, the difference equation (38) may be written as:

$$y_1(t) = E^{-1}\bigl[F_1^y y_1(t) + F_1^u u(t) + E^{-1}\bigl\{F_2^y y_1(t) + F_2^u u(t) + \cdots + E^{-1}\bigl(F_s^y y_1(t) + F_s^u u(t)\bigr) \cdots \bigr\}\bigr]. \tag{44}$$

Now introduce the state variables:

$$x_1(t) = y_1(t)$$

$$x_2(t) = -F_1^y x_1(t) - F_1^u u(t) + x_1(t+1)$$

$$\vdots$$

$$x_s(t) = -F_{s-1}^y x_1(t) - F_{s-1}^u u(t) + x_{s-1}(t+1).$$

Using these expressions in the original difference equation (38) yields,

$$x_s(t+1) = F_s^y x_1(t) + F_s^u u(t).$$

These equations provide the so-called first canonical form state realization of the difference equation. It may be described by the following matrices:

$$A = \begin{bmatrix} F_1^y & 1 & 0 & \cdots & 0 \\ F_2^y & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \ddots & \vdots \\ F_s^y & 0 & 0 & \cdots & 0 \end{bmatrix} \tag{45}$$

$$B = [F_1^u, \ldots, F_s^u]^T$$

$$C = [1, 0, \ldots, 0], \qquad D = 0.$$

With this model, $y_1(t)$ is given by:

$$y_1(t) = \sum_{j=0}^{t-1} C A^{t-j-1} B\, u(j). \tag{46}$$

Because of the special structure we are assuming for the different matrices, this expression becomes:

$$y_1(t) = \sum_{j=0}^{t-1} g(t-j)\, u(j) \tag{47}$$

where

$$g(j) = C A^{j-1} B = \sum_{n=1}^{s} (A^{j-1})_{1n} F_n^u \tag{48}$$

with $(A^j)_{1n}$ the $(1, n)$-th entry of $A^j$.

To find a closed form solution for $y_2(t)$ we need to compute $F_2 Z_1^{(2)}(t)$, which is given by the


following expression:

$$F_2 Z_1^{(2)}(t) = \sum_{i=1}^{s} \sum_{j=1}^{s} F_{ij}^{yy}\, y_1(t-i)\, y_1(t-j) + \sum_{i=1}^{s} \sum_{j=1}^{s} F_{ij}^{yu}\, y_1(t-i)\, u(t-j) + \sum_{i=1}^{s} \sum_{j=1}^{s} F_{ij}^{uu}\, u(t-i)\, u(t-j) \tag{49}$$

where we have defined $F_{ij}^{yy}$, $F_{ij}^{yu}$ and $F_{ij}^{uu}$ as elements of $F_2$, whose meaning should be clear from this equation. To further illustrate the meaning of these symbols, consider for example that $s = 2$. Then, $F_2 Z_1^{(2)}(t)$ is given by:

$$F_2 Z_1^{(2)}(t) = [F_{11}^{yy},\ F_{12}^{yy},\ F_{11}^{yu},\ \ldots,\ F_{22}^{uu}] \times \begin{bmatrix} y_1(t-1)\,y_1(t-1) \\ y_1(t-1)\,y_1(t-2) \\ y_1(t-1)\,u(t-1) \\ y_1(t-1)\,u(t-2) \\ \vdots \\ u(t-2)\,u(t-2) \end{bmatrix}$$

Replacing the expression obtained before for $y_1(t)$, we obtain a quadratic polynomial in the $u(i)$s.

Next, we solve the (linear) difference equation for $y_2(t)$, using a state realization with $A$ as defined in equation (45) and,

$$B_2 = [F_1^y, F_2^y, \ldots, F_s^y]^T$$

$$C = [1, 0, \ldots, 0], \qquad D = 1.$$

This state space model may be obtained in the same way as the one for $y_1(t)$. Solution of the state equations (with zero initial conditions) gives:

$$y_2(t) = \sum_{j=0}^{t-1} h(t-j)\, F_2 Z_1^{(2)}(j) + F_2 Z_1^{(2)}(t) \tag{50}$$

where

$$h(t) = C A^{t-1} B_2 = \sum_{n=1}^{s} (A^{t-1})_{1n} F_n^y. \tag{51}$$

The last expression may be simplified somewhat, if we define $h(0) = 1$, which gives,

$$y_2(t) = \sum_{j=0}^{t} h(t-j)\, F_2 Z_1^{(2)}(j). \tag{52}$$
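The convolution form of the second-order component can be compared with the direct solution of the linear difference equation it comes from. In the sketch below the forcing sequence stands in for the values F_2 Z_1^{(2)}(j), and h is generated by a scalar recursion with h(0) = 1, which is assumed here to be equivalent to the matrix-power definition of h. The function names and numerical values are hypothetical.

```python
from typing import List, Sequence


def h_sequence(F_y: Sequence[float], n: int) -> List[float]:
    """h(0) = 1 and h(j) = sum_k F_k^y h(j-k) for j = 1..n."""
    h = [1.0]
    for j in range(1, n + 1):
        h.append(sum(F_y[k - 1] * h[j - k] for k in range(1, min(len(F_y), j) + 1)))
    return h


def solve_by_recursion(F_y: Sequence[float], forcing: Sequence[float]) -> List[float]:
    """y(t) = sum_k F_k^y y(t-k) + forcing(t), starting from rest."""
    y: List[float] = []
    for t, v in enumerate(forcing):
        acc = v
        for k in range(1, min(len(F_y), t) + 1):
            acc += F_y[k - 1] * y[t - k]
        y.append(acc)
    return y


def solve_by_convolution(F_y: Sequence[float], forcing: Sequence[float]) -> List[float]:
    """y(t) = sum_{j=0}^{t} h(t-j) forcing(j), with h(0) = 1."""
    h = h_sequence(F_y, len(forcing) - 1)
    return [sum(h[t - j] * forcing[j] for j in range(t + 1)) for t in range(len(forcing))]


# Hypothetical values with s = 2; forcing(0) = 0 because Z_1(0) = 0.
F_y = [0.5, -0.25]
forcing = [0.0, 1.0, 2.0, -1.0, 0.5]
by_recursion = solve_by_recursion(F_y, forcing)
by_convolution = solve_by_convolution(F_y, forcing)
print(by_recursion)    # [0.0, 1.0, 2.5, 0.0, -0.125]
print(by_convolution)  # same values
```

Both routes give the same sequence, which is the content of the simplification obtained by setting h(0) = 1.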

As seen before, $F_2 Z_1^{(2)}(j)$ is a polynomial in the inputs $u(i)$. Hence, $y_2(t)$ is a homogeneous quadratic polynomial in the $u$s.

By replacing $y_1(t)$ in $Z_1$, and substituting this in the expression for $y_2(t)$, one obtains expressions for the second-order Volterra kernels.

Higher order Volterra kernels can be calculated in the same way. In Appendix B, expressions for the second- and third-order kernels are provided.

Remarks

All the elements of the Volterra kernels can be obtained using this approach. This computation is based only on the coefficients of the original difference equation.

The method requires only the computation of the matrix power $A^k$ (where $A$ is an $s \times s$ matrix). All calculations involve these matrices and the coefficient matrices $F_k$ of the difference equation. Since $s$ is usually small (it is the order of the difference equation), $A^k$ can be obtained by direct multiplication without any loss of accuracy, for reasonably small values of $k$.

In our implementation of the algorithm, the matrix $A$ is never formed. Instead, the vectors $g(t)$ and $h(t)$ are computed recursively from the vectors $F^y$ and $F^u$, using equations (48) and (51).
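The paper does not spell out the recursion used in place of the matrix powers. One possible version, given here as an assumption, treats g as the impulse response of the first-order difference equation, so that g(j) equals the sum of F_k^y g(j-k) plus F_j^u for j up to s. The sketch compares this with the entries obtained from the companion matrix; all function names and numbers are hypothetical.

```python
from typing import List, Sequence


def companion(F_y: Sequence[float]) -> List[List[float]]:
    """First canonical form matrix: F^y in the first column, ones above the diagonal."""
    s = len(F_y)
    A = [[0.0] * s for _ in range(s)]
    for i in range(s):
        A[i][0] = F_y[i]
        if i + 1 < s:
            A[i][i + 1] = 1.0
    return A


def kernel_by_matrix_powers(F_y: Sequence[float], B: Sequence[float], n: int) -> List[float]:
    """[C A^(j-1) B for j = 1..n] with C = [1, 0, ..., 0]."""
    A = companion(F_y)
    v = list(B)
    out = []
    for _ in range(n):
        out.append(v[0])  # C picks the first entry
        v = [sum(a * x for a, x in zip(row, v)) for row in A]
    return out


def kernel_by_recursion(F_y: Sequence[float], B: Sequence[float], n: int) -> List[float]:
    """Same sequence without forming A (assumed scalar recursion)."""
    s = len(F_y)
    k = [0.0] * (n + 1)  # k[0] is a zero placeholder for index convenience
    for j in range(1, n + 1):
        acc = B[j - 1] if j <= s else 0.0
        for i in range(1, min(s, j - 1) + 1):
            acc += F_y[i - 1] * k[j - i]
        k[j] = acc
    return k[1:]


# Hypothetical coefficients with s = 2.
F_y = [0.5, -0.25]
F_u = [1.0, 2.0]
g_matrix = kernel_by_matrix_powers(F_y, F_u, 4)
g_recursive = kernel_by_recursion(F_y, F_u, 4)
print(g_matrix)     # [1.0, 2.5, 1.0, -0.125]
print(g_recursive)  # [1.0, 2.5, 1.0, -0.125]

# The same routine with B replaced by F^y gives h(1), h(2), ...
h_recursive = kernel_by_recursion(F_y, F_y, 4)
```

The recursion costs on the order of s operations per kernel value, whereas each matrix-vector product costs on the order of s² in this dense form.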

Example 1. Consider a system described by the following input-output equation:

$$y(t+1) = a\,y(t) + d\,y(t-1)\,u(t-1) + b\,u(t), \qquad y(0) = y(-1) = 0. \tag{53}$$

The response of this system to an input $\alpha u(t)$ is given by

$$y(t) = \alpha y_1(t) + \alpha^2 y_2(t) + \alpha^3 y_3(t) + \cdots. \tag{54}$$

When this expression is replaced in equation (53) and the coefficients of equal powers of $\alpha$ are equated, the following sequence of difference equations is obtained:

$$y_1(t+1) - a\,y_1(t) = b\,u(t)$$

$$y_2(t+1) - a\,y_2(t) = d\,y_1(t-1)\,u(t-1)$$

$$y_3(t+1) - a\,y_3(t) = d\,y_2(t-1)\,u(t-1).$$

In general,

$$y_i(t+1) - a\,y_i(t) = d\,y_{i-1}(t-1)\,u(t-1).$$

Solving for $y_1(t)$:

$$y_1(t) = \sum_{j=1}^{t} a^{t-j}\, b\, u(j-1)$$

and for $y_2(t)$:

$$y_2(t) = \sum_{l=1}^{t} a^{t-l}\, d\, y_1(l-2)\, u(l-2)$$

or,

$$y_2(t) = \sum_{l=1}^{t} \sum_{j=1}^{l-2} a^{t-j-2}\, d\, b\, u(j-1)\, u(l-2)$$

which is a term of second order in $u$.

Similarly,

$$y_3(t) = \sum_{m=1}^{t} \sum_{l=1}^{m-2} \sum_{j=1}^{l-2} a^{t-j-4}\, d^2\, b\, u(j-1)\, u(l-2)\, u(m-2).$$

All successive terms can be obtained in a similar fashion. The output of the system for an input $u(t)$ is given by

$$y(t) = y_1(t) + y_2(t) + \cdots$$

which has the required Volterra series form. Additional transformations on the kernels give any required form (triangular, symmetric, regular, etc.).
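The decomposition in Example 1 can be checked numerically. The sketch below simulates equation (53) directly, computes the homogeneous terms $y_i(t)$ from their own difference equations, and evaluates the double-sum expression for $y_2(t)$. The function names, parameter values and input sequence are hypothetical choices made for illustration.

```python
def simulate(a, b, d, u):
    """Simulate equation (53); returns y(0), ..., y(len(u))."""
    y = [0.0] * (len(u) + 1)
    for t in range(len(u)):
        bilinear = d * y[t - 1] * u[t - 1] if t >= 1 else 0.0
        y[t + 1] = a * y[t] + bilinear + b * u[t]
    return y


def homogeneous_terms(a, b, d, u, order):
    """Return [y_1, ..., y_order], each a list over t = 0..len(u)."""
    terms = []
    for i in range(1, order + 1):
        yi = [0.0] * (len(u) + 1)
        for t in range(len(u)):
            if i == 1:
                drive = b * u[t]                         # first-order equation
            elif t >= 1:
                drive = d * terms[-1][t - 1] * u[t - 1]  # driven by y_{i-1}
            else:
                drive = 0.0
            yi[t + 1] = a * yi[t] + drive
        terms.append(yi)
    return terms


def y2_closed_form(a, b, d, u, t):
    """Double-sum expression for y_2(t); empty sums contribute zero."""
    total = 0.0
    for l in range(1, t + 1):
        for j in range(1, l - 1):                        # j = 1, ..., l - 2
            total += a ** (t - j - 2) * d * b * u[j - 1] * u[l - 2]
    return total


a, b, d = 0.5, 1.0, 0.2
u = [1.0, -0.5, 2.0, 0.3, -1.2, 0.7]
y = simulate(a, b, d, u)
terms = homogeneous_terms(a, b, d, u, order=len(u))
```

Because $y_i(t)$ is zero for $t < 2i - 1$ in this example, only finitely many terms contribute over a finite time interval, so the sum of the computed terms reproduces the simulated output up to rounding.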

One important observation at this point is the fact that the difference equation description of the system requires far fewer parameters than the Volterra series. The Volterra series can later be obtained from the difference equation, if necessary.

A MODELING ALGORITHM FOR NONLINEAR SYSTEMS

In this section, we propose a new method for modeling a nonlinear system. It uses a polynomial difference equation to approximate the input-output behavior of the system. This polynomial may be obtained by least-squares approximation or any other means.
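As an illustration of the least-squares option, the following sketch fits a polynomial difference equation with NumPy. It is not the authors' regression program. The regressor layout (all monomials up to a given degree in $y(t-1), \ldots, y(t-s), u(t-1), \ldots, u(t-s)$, without a constant term) and the function name are assumptions made here.

```python
import itertools
import numpy as np


def fit_polynomial_difference_equation(u, y, s, degree=2):
    """Least-squares fit of y(t) as a polynomial in s past outputs and inputs.

    Returns a dict mapping each monomial, written as a tuple of regressor
    names such as ("y2", "u2") for y(t-2)*u(t-2), to its estimated coefficient.
    """
    names = [f"y{i}" for i in range(1, s + 1)] + [f"u{i}" for i in range(1, s + 1)]
    monomials = [
        idx
        for deg in range(1, degree + 1)
        for idx in itertools.combinations_with_replacement(range(2 * s), deg)
    ]
    rows, targets = [], []
    for t in range(s, len(y)):
        # Regressor vector: past outputs followed by past inputs.
        z = [y[t - i] for i in range(1, s + 1)] + [u[t - i] for i in range(1, s + 1)]
        rows.append([np.prod([z[k] for k in idx]) for idx in monomials])
        targets.append(y[t])
    theta, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    return {tuple(names[k] for k in idx): c for idx, c in zip(monomials, theta)}
```

With $s$ lags and degree 2, this layout has $2s + s(2s+1)$ coefficients, which grows much more slowly with the memory length than the number of entries in a second-order Volterra kernel.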

Once the difference equation is obtained, we proceed to generate an SA realization using a modification of an algorithm proposed by Costanza et al. (1983), based on the work of Sontag.

It is assumed that the system is initially at rest, that the input $u(t) = 0$ for $t < 0$, and that $(u = 0, x = 0)$ is an equilibrium point. Let the system be described by a finite sequence of input-output pairs $\{(u(t), y(t)),\ t = 0, 1, \ldots\}$. The input-output response map of the system is designated as $f$. The modeling algorithm proceeds in the following way.

(1) Identify a polynomial regression-type difference equation that approximates the given input-output data. The response map described by this difference equation will be denoted by $\hat{f}$.

(2) The behavior matrix $B(f)$ of the system will be approximated by $B(\hat{f})$ (the behavior matrix corresponding to $\hat{f}$). The elements of $B(\hat{f})$ can be obtained from the difference equation using the same procedure employed for the calculation of the Volterra kernels. Some simplifications are possible at this stage: one can retain only terms of order less than some predetermined number.

(3) Construction of an SA realization. Since the rank of $B(\hat{f})$ is unknown, the algorithm starts with a submatrix $\phi_n$ of rank $n = 1$, and produces an (approximate partial) realization of dimension 1. The input-output behavior of this realization is compared to the original data. Next, the algorithm searches for a submatrix $\phi_{n+1}$ with rank $n + 1$. The process is iterated until the input-output data are well approximated, or until the rank stops increasing.

This algorithm generates a sequence of SA models of increasing dimension. The term approximate partial realization (APR) refers to these realizations, obtained by approximating $B(f)$ by $B(\hat{f})$.

The problem of finding a nonsingular matrix of a given dimension is very sensitive to numerical inaccuracies. Therefore, a very stable algorithm developed by de Jong (1978), and first applied to SA systems by Costanza et al. (1983), was used.

The definition of the behavior matrix $B(f)$ is based on the following expression for the response of the system to an input sequence $w$:

$$f(w) = \sum_{\alpha} \delta_{\alpha}(w)\, a_{\alpha}.$$

In our approach, the functions $\delta_i(u)$ are given as:

$$\delta_i(u) = u^i, \qquad i = 0, 1, \ldots, r.$$

Therefore, at any time $t + 1$, $a_{\alpha}$ is the coefficient of $u(0)^{\alpha_0} u(1)^{\alpha_1} \cdots u(t)^{\alpha_t}$ in the expression for the output $y(t + 1)$, where $\alpha = (\alpha_0, \ldots, \alpha_t)$.

The maximum degree $r$ of the approximations is a small integer that may be specified (usually 2 or 3).

The calculation of the entries of B(f) requires that the response of the system be determined in closed form, as a function of the input variables. This may be obtained directly from the difference equation, using the procedure developed in the previous section, to obtain y(t). The expressions given in Appendix B may be used for this purpose. For example,

$$y(2) = a_{10}u(0) + a_{01}u(1) + a_{11}u(0)u(1) + a_{20}u(0)^2 + a_{02}u(1)^2 + a_{21}u(0)^2u(1) + \cdots \tag{55}$$

using equation (B.1) and subsequent equations from Appendix B.

This provides all the necessary elements of the behavior matrix, simply by taking successive values of time.
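To make the idea of a closed-form response concrete, the sketch below propagates the difference equation of Example 1 symbolically, storing each $y(t)$ as a polynomial in $u(0), u(1), \ldots$ and discarding monomials above a chosen degree. It does not use the Appendix B expressions. The dictionary representation (exponent tuple mapped to coefficient) and the function names are assumptions made for illustration.

```python
import math


def poly_scaled_add(target, poly, scale):
    """target += scale * poly, for polynomials stored as {exponents: coeff}."""
    for exps, coeff in poly.items():
        target[exps] = target.get(exps, 0.0) + scale * coeff


def poly_mul_var(poly, k, max_degree):
    """Multiply poly by u(k), dropping monomials above max_degree."""
    out = {}
    for exps, coeff in poly.items():
        if sum(exps) + 1 <= max_degree:
            new = list(exps)
            new[k] += 1
            out[tuple(new)] = out.get(tuple(new), 0.0) + coeff
    return out


def response_polynomials(a, b, d, T, max_degree):
    """y(0..T) of equation (53) as polynomials in u(0), ..., u(T-1)."""
    y = [{} for _ in range(T + 1)]                 # y(0) = 0 (empty polynomial)
    for t in range(T):
        nxt = {}
        poly_scaled_add(nxt, y[t], a)              # a * y(t)
        unit = [0] * T
        unit[t] = 1
        poly_scaled_add(nxt, {tuple(unit): 1.0}, b)  # b * u(t)
        if t >= 1:                                 # d * y(t-1) * u(t-1)
            poly_scaled_add(nxt, poly_mul_var(y[t - 1], t - 1, max_degree), d)
        y[t + 1] = nxt
    return y


def evaluate(poly, u):
    """Evaluate a stored polynomial at a numerical input sequence u."""
    return sum(c * math.prod(x ** e for x, e in zip(u, exps))
               for exps, c in poly.items())


polys = response_polynomials(a=0.5, b=1.0, d=0.2, T=3, max_degree=3)
```

In this representation the key `(1, 1, 0)` of `polys[3]` holds the coefficient of $u(0)u(1)$ in $y(3)$, which plays the role of one coefficient $a_\alpha$ in the notation above.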

In order to determine the rank of $B(f)$ in a numerically stable manner, we start with a nonsingular submatrix of dimension $1 \times m$ [a nonzero row of a submatrix of $B(f)$]. Use is made of the following property of matrices.

Theorem. If $\operatorname{rank}(H_{k,m}) = k$, then $H_{k+1,m}$ has a decomposition

$$M_{k+1,k+1}\, H_{k+1,m}\, P = R_{k+1,m} \tag{56}$$

satisfying: $M_{k+1,k+1}$ is an orthogonal matrix, $P$ is a permutation matrix and $R_{k+1,m}$ is upper trapezoidal,

$$R_{k+1,m} = \begin{bmatrix}
* & * & * & * & * & \cdots & * \\
0 & * & * & * & * & \cdots & * \\
0 & 0 & * & * & * & \cdots & * \\
\vdots & & & \ddots & & & \vdots \\
0 & 0 & 0 & 0 & * & \cdots & *
\end{bmatrix} \tag{57}$$

where the stars ($*$) indicate (possibly) nonzero elements. In addition, the last row of $R_{k+1,m}$ is zero iff $\operatorname{rank}(R_{k+1,m})$ is $k$. Otherwise, the rank is $k + 1$.

The purpose of the matrix P is to guarantee that the element with the largest absolute value is on the diagonal. This is very important from a numerical point of view.

Thus, we start with the first nonzero row and add rows, one at a time, until a maximum dimension is reached, while maintaining a decomposition of the form (57). If the rank increases, the row is kept in $R$; otherwise, it is discarded and $R$ is restored to its previous value.

Also necessary is a procedure to update the decomposition (56) when a new row is appended. This is required in order to avoid recomputing the complete decomposition every time a new row is added. Starting with the decomposition (57), apply the same transformation to the new matrix $H_{k+1,m}$:

$$\begin{bmatrix} M_{k,k} & 0 \\ 0 & 1 \end{bmatrix} H_{k+1,m}\, P = \begin{bmatrix} R_{k,m} \\ q_{k+1} \end{bmatrix} = R'_{k+1,m}. \tag{58}$$

Now we eliminate the elements of the vector $q_{k+1}$ under the diagonal, using plane (Givens) rotations. Let $S$ be the product of the plane rotations needed to bring $R'_{k+1,m}$ into the upper trapezoidal form, and let

$$M_{k+1,k+1} = S \begin{bmatrix} M_{k,k} & 0 \\ 0 & 1 \end{bmatrix} \tag{59}$$

which is also an orthogonal matrix and transforms $H_{k+1,m}$ into the required form. If the new row in the decomposition is not zero, the rank increases. Otherwise, the matrices $M$ and $R$ are restored to the values they had before adding the row. An orthogonal transformation $M$ is used because it does not change the norm of the columns of the matrix.
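The row-by-row update can be sketched in a few lines. This is a simplified illustration and not the de Jong (1978) algorithm itself: the column permutation $P$ is kept implicitly as a list of pivot columns, the matrix $M$ is not accumulated, and an absolute tolerance `tol` (a hypothetical parameter) decides whether the rotated row counts as zero.

```python
import math


def add_row(R, pivots, row, tol=1e-10):
    """Try to append `row` to the trapezoidal factor R.

    R is a list of rows; pivots[i] is the pivot column of R[i].
    Returns True if the rank increased, otherwise restores R and returns False.
    """
    saved = [r[:] for r in R]                 # copy so R can be restored
    q = list(row)
    for i, p in enumerate(pivots):
        a, bq = R[i][p], q[p]
        if bq == 0.0:
            continue
        r = math.hypot(a, bq)
        c, s = a / r, bq / r                  # Givens rotation zeroing q[p]
        Ri = R[i]
        R[i] = [c * x + s * z for x, z in zip(Ri, q)]
        q = [-s * x + c * z for x, z in zip(Ri, q)]
        q[p] = 0.0                            # remove rounding residue
    # Pivot on the largest remaining entry (the role of the permutation P).
    p_new = max(range(len(q)), key=lambda j: abs(q[j]))
    if abs(q[p_new]) <= tol:
        R[:] = saved                          # rank did not increase
        return False
    R.append(q)
    pivots.append(p_new)
    return True


R, pivots = [], []
rows = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [0.0, 1.0, 1.0], [1.0, 3.0, 4.0]]
flags = [add_row(R, pivots, r) for r in rows]
```

In this example the second row is a multiple of the first and the fourth is the sum of the first and third, so only two rows are retained. Since only rotations are applied, $R^T R$ equals $H^T H$ for the retained rows of $H$.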

The exact steps of the algorithm are as follows.

(1) Use the input-output data to identify a polynomial difference equation of the form of equation (23). This is then put into the form of equation (25).

(2) Use the coefficients of the difference equation to generate $B(f)$, the behavior matrix, as explained in equation (55).

(3) Find a nonzero row of $B(f)$ and define $\phi$ and $\phi_i$, $i = 1, \ldots, r$. Set $n = 1$ (dimension of system = 1). Let $M_{11} = 1$.

(4) Compute an SA model (Sontag, 1979b).

(5) Save matrices $R$ and $M$. Consider the next row of $B(f)$. Use equation (58) to add the row to $H$.

(6) Eliminate elements under the diagonal of $H$ to produce an upper trapezoidal matrix.

(7) Permute columns of $H$ to ensure that $H_{n+1,n+1}$ has the maximum absolute value.

(8) If $H_{n+1,n+1} = 0$ then restore $R$ and $M$ to their original values and go to step (5). Else, let $n = n + 1$. If there are rows not yet considered go to step (4).

(9) End.

EXAMPLE OF THE APPLICATION OF THE ALGORITHM

Example 2. In this example, a dog was subjected to a treatment in which a drug (nitroprusside) was infused into the dog's blood to control the blood pressure.

The input function in the set of input-output data is the drug infusion rate in ml h$^{-1}$. The output is the mean arterial pressure (MAP) of the dog, measured in mm Hg.

In the actual experiment, the mean arterial pressure was measured over a period of time, prior to the administration of the drug. These measurements were then averaged and subtracted from all subsequent measurements.

We used the resulting set of input-output data pairs to obtain a state space model of this system. Figures 1 and 2 show the input and the output of the experiment.

A polynomial difference equation was then obtained from the input-output data:

$$\begin{aligned}
y(t) ={}& 0.712\,y(t-1) + 0.084\,y(t-2) + 0.055\,y(t-3) \\
& + 0.019\,u(t-1) - 0.118\,u(t-2) - 0.0003\,u(t-3) \\
& + 0.014\,y(t-1)^2 + 0.018\,y(t-1)y(t-2) - 0.066\,y(t-1)y(t-3) \\
& + 0.011\,y(t-2)^2 + 0.007\,y(t-2)y(t-3) - 0.003\,y(t-3)^2 \\
& - 0.014\,u(t-1)y(t-1) + 0.007\,u(t-1)y(t-2) - 0.022\,u(t-1)y(t-3) \\
& - 0.009\,u(t-2)y(t-1) + 0.004\,u(t-2)y(t-2) - 0.002\,u(t-2)y(t-3) \\
& + 0.018\,u(t-3)y(t-1) - 0.021\,u(t-3)y(t-2) - 0.001\,u(t-3)y(t-3) \\
& + 0.001\,u(t-1)^2 - 0.002\,u(t-1)u(t-3) - 0.003\,u(t-2)^2 + 0.002\,u(t-2)u(t-3).
\end{aligned}$$

FIG. 1. Drug infusion rate (ml h$^{-1}$) versus time (dog experiment input function).

FIG. 2. Responses of the nonlinear model (dashed line) and original system (solid line), in mm Hg, versus time.

From this difference equation, the modeling algorithm computed the following state space model:

$$x(t+1) = [F_0 + u(t)F_1 + u(t)^2 F_2]\,x(t) + u(t)\,G_1 + u(t)^2 G_2$$

$$y(t) = [H_0 + u(t)H_1]\,x(t)$$

where

$$F_0 = \begin{bmatrix} 0.8088 & 1.0 & 0.3614 \\ 0.0857 & 0.0 & -0.296 \\ -0.1692 & 0.0 & 0.0898 \end{bmatrix}$$

I 0 . 0 2 4 7 - 0 . 0 2 4 1 0.00491

O OlO o ooo4/ -ooo -ooo -OOOl j

[ oooo oooo1 oo] -o.ooo o.ooo o.o

-0.0002 0.0001 0.0

$G_1 = [0, 1, 0]^T$

$G_2 = [0.0151, -0.0289, 0.0085]^T$

$H_0 = [-0.1024, 0.019, -0.0539]$

$H_1 = [-0.0031, -0.002, -0.0004].$

We simulated the response of this nonlinear model to the input sequence given. The results are shown in Fig. 2.
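A simulation of a state affine model of this form can be sketched as follows. The function name and argument layout are assumptions, and the scalar model in the usage lines is a hypothetical example, not the model identified above.

```python
def simulate_state_affine(F, G, H, u, x0=None):
    """Simulate x(t+1) = [sum_k u^k F[k]] x(t) + sum_k u^(k+1) G[k],
                y(t)   = [sum_k u^k H[k]] x(t).

    F: list of n x n matrices (F_0, F_1, ...); G: list of n-vectors (G_1, ...);
    H: list of n-vectors (H_0, H_1, ...). The system starts at rest by default.
    """
    n = len(F[0])
    x = list(x0) if x0 is not None else [0.0] * n
    outputs = []
    for ut in u:
        # Output row H_0 + u H_1 + ... evaluated at the current input.
        h_row = [sum(Hk[j] * ut ** k for k, Hk in enumerate(H)) for j in range(n)]
        outputs.append(sum(hj * xj for hj, xj in zip(h_row, x)))
        x_next = []
        for i in range(n):
            acc = sum(Gk[i] * ut ** (k + 1) for k, Gk in enumerate(G))
            for j in range(n):
                acc += sum(Fk[i][j] * ut ** k for k, Fk in enumerate(F)) * x[j]
            x_next.append(acc)
        x = x_next
    return outputs


# Hypothetical scalar model: x(t+1) = (0.5 + 0.1 u) x + u,  y = x.
F = [[[0.5]], [[0.1]]]
G = [[1.0]]
H = [[1.0], [0.0]]
y = simulate_state_affine(F, G, H, [1.0, 1.0, 1.0])
```

Passing three matrices in `F`, two vectors in `G` and two vectors in `H` gives the structure of the model above, which is quadratic in $u(t)$ in the state equation and affine in $u(t)$ in the output equation.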

CONCLUSIONS

The problem of modeling nonlinear systems from input-output data has been considered. The main contribution is the relationship between the polynomial difference equation description of a nonlinear system and the corresponding Volterra series. Also considered was the problem of generating a state space model of the nonlinear system from a difference equation description.

The number of parameters required by the difference equation approach is significantly smaller than the number necessary for the Volterra series description.

We proved that the difference equations can be used to approximate the input-output behavior of a large class of nonlinear systems, over a finite interval of time. The conditions under which this approximation is possible were also established. The class of systems that can be described is more restricted than the class of systems described by Volterra series.

A method for obtaining the Volterra kernels from a difference equation description of a system was also introduced. This, combined with a parameter estimation method, constitutes a new method for the determination of the Volterra kernels for a given system.

The implementation of the modeling method consists of two basic blocks: a nonlinear regression algorithm to estimate the coefficients of the difference equation from the input-output data, and a state affine realization algorithm, to compute a state space model. The time and memory requirements of the realization step are very small, compared to the regression program.

REFERENCES

D'Alessandro, P., A. Isidori and A. Ruberti (1974). Realization and structure theory of bilinear systems. SIAM J. Control, 12, 517-535.

Billings, S. A. (1984). Identification of nonlinear systems. In Billings, S. and J. Gray (Eds), Nonlinear System Design, Chapter 2. Peter Peregrinus, London.

Boyd, S. and L. Chua (1984). Fading memory and the problem of approximating nonlinear operators with Volterra series. Memo No. UCB/ERL M84/96, Electronics Research Laboratory, University of California, Berkeley, November 1984.

Brockett, R. (1976). Volterra series and geometric control theory. Automatica, 12, 167-176.

Costanza, V., B. Dickinson and E. Johnson (1983). Universal approximations of discrete-time control systems over finite time. IEEE Trans. Aut. Control, AC-28, 439-452.

Diaz, H. (1986). Modeling of nonlinear systems from input-output data. Ph.D. Thesis, Electrical, Computer and Systems Engineering Department, Rensselaer Polytechnic Institute, Troy, NY 12180-3590.

Dickinson, B. (1983). Modeling of nonlinear systems from input-output data. Proc. 22nd Conf. Decision and Control, pp. 641-642. San Antonio, Texas.

de Figueiredo, R. (1983). A generalized Fock space framework for nonlinear system and signal analysis. IEEE Trans. Ccts Syst., CAS-30, 637-647.

de Figueiredo, R. and T. Dwyer (1980). A best approximation framework and implementation for simulation of large-scale nonlinear systems. IEEE Trans. Ccts Syst., CAS-27, 1005-1014.

Fliess, M. (1978). Un codage non commutatif pour certains systèmes échantillonnés non linéaires. Inf. Control, 38, 264-287.

Fliess, M. and D. Normand-Cyrot (1982). On the approximation of nonlinear systems by some simple state-space models. IFAC Symp. on Identification and System Parameter Identification, Washington, DC.

de Jong, L. S. (1978). Numerical aspects of recursive realization algorithms. SIAM J. Control Opt., 16, 646-659.

Krener, A. J. (1974). Linearization and bilinearization of control systems. Proc. 1974 Allerton Conf. Circuit and System Theory.

Lesiak, C. and A. Krener (1978). The existence and uniqueness of Volterra series for nonlinear systems. IEEE Trans. Aut. Control, AC-23, 1090-1095.

Normand-Cyrot, D. (1981). A group theoretic approach to the input-output description of nonlinear discrete-time systems. Proc. 20th Conf. on Decision and Control, San Diego, pp. 551-557.

Normand-Cyrot, D. (1982). An algebraic approach to the input-output description of nonlinear discrete-time systems. Proc. American Control Conf., Arlington, pp. 466-471.

Pitt, J. F. (1950). Differential groups and formal Lie theory for an infinite number of variables. Ann. Math., 52, 708-726.

Rugh, W. (1981). Nonlinear System Theory, The Volterra/Wiener Approach. The Johns Hopkins University Press, Baltimore, MD.

Schetzen, A. (1980). The Volterra and Wiener Theories of Nonlinear Systems. John Wiley, New York.

Sontag, E. (1979a). Realization theory of discrete-time nonlinear systems: 1. The bounded case. IEEE Trans. Ccts Syst., CAS-26, 342-356.

Sontag, E. (1979b). Polynomial Response Maps, Lecture Notes in Control and Information Sciences, Vol. 13. Springer, Berlin.

Wiener, N. (1958). Nonlinear Problems in Random Theory. MIT Press, Cambridge, MA.

APPENDIX A: KRONECKER PRODUCT NOTATION

In order to describe more general polynomial input-output equations, a more convenient notation is necessary. The Kronecker tensor product notation has been used extensively in the past for this purpose.

Let $A$ and $B$ be matrices of dimensions $n \times m$ and $k \times l$, respectively. The Kronecker product of $A$ and $B$ is defined as the $nk \times ml$ matrix:

$$A \otimes B = \begin{bmatrix} a_{11}B & \cdots & a_{1m}B \\ \vdots & & \vdots \\ a_{n1}B & \cdots & a_{nm}B \end{bmatrix} \tag{A.1}$$

This operation has some desirable properties, such as:

(1) $A \otimes (B \otimes C) = (A \otimes B) \otimes C$ (associativity)

(2) $(A + B) \otimes (C + D) = A \otimes C + B \otimes C + A \otimes D + B \otimes D$

(3) $(AB) \otimes (CD) = (A \otimes C)(B \otimes D)$

(4) $A \otimes B = 0$ if and only if $A = 0$ or $B = 0$.

The Kronecker notation permits a description of polynomials and power series in several variables in a simple way, as follows.

Let $X = [x_1, \ldots, x_n]^T$, then,

$$X \otimes X = [x_1^2,\ x_1x_2,\ \ldots,\ x_1x_n,\ x_2x_1,\ x_2^2,\ \ldots,\ x_n^2]^T$$

contains all possible second-degree monomials in the components of $X$. Similarly, $X \otimes X \otimes X$ contains all possible third-degree terms, etc.

Let us define the $p$-fold Kronecker product of $X$ as:

$$X^{(p)} = X \otimes X \otimes \cdots \otimes X \quad (p \text{ times}). \tag{A.2}$$

Notice that $X^{(2)}$ contains some repeated terms, e.g. $x_1x_2$ and $x_2x_1$. This does not add to the complexity of our equations, since these terms are always additive. Their coefficients may be assigned arbitrarily, as long as their sum gives the correct result. We will assume they are symmetric, i.e. the coefficient of $x_1x_2$ is equal to that of $x_2x_1$. The same will hold for higher-order terms.
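The notation can be illustrated with a short sketch that builds Kronecker products of nested lists and the $p$-fold power of a vector. The helper names `kron`, `kron_power` and `matmul` and the numerical values are chosen here for illustration only.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def kron_power(x, p):
    """p-fold Kronecker product of a vector x (p >= 1), as a flat list."""
    out = list(x)
    for _ in range(p - 1):
        out = [a * b for a in out for b in x]
    return out


def matmul(A, B):
    """Plain matrix product, used to check the mixed-product property."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


x = [2.0, 3.0]
x2 = kron_power(x, 2)            # [x1*x1, x1*x2, x2*x1, x2*x2]

# Quadratic form with symmetric coefficients on the repeated terms.
coeffs = [1.0, 0.5, 0.5, 2.0]
value = sum(c * m for c, m in zip(coeffs, x2))
```

Here `x2` contains the repeated entries $x_1x_2$ and $x_2x_1$, and `coeffs` splits the coefficient of the cross term equally between them, following the symmetry convention adopted above.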

Kronecker products also simplify multivariable Taylor series expansions. If $f(X)$ is a real analytic function, then its Taylor series expansion around $X = 0$ is given by:

$$f(X) = f(0) + F_1 X + F_2 X^{(2)} + F_3 X^{(3)} + \cdots \tag{A.3}$$

where the $F_i$ are matrices of appropriate dimension, expressing the $i$-th derivatives of the function $f$ evaluated at $X = 0$.

APPENDIX B: EXPRESSIONS FOR THE VOLTERRA SERIES

In this Appendix we present expressions for the Volterra series, up to third order.

In the next sections, all summations where the limits are not explicitly indicated run from 1 to $s$, e.g.

$$\sum_{i,j} \quad \text{means} \quad \sum_{i=1}^{s} \sum_{j=1}^{s}.$$

We also use the following definitions very often:

$$g(t) = \sum_{n=1}^{s} \left(A^{t-1}\right)_{1n} F^{u}_{n} \tag{B.1}$$

and

$$h(t) = \begin{cases} 1, & t = 0 \\ \displaystyle\sum_{n=1}^{s} \left(A^{t-1}\right)_{1n} F^{y}_{n}, & t > 0. \end{cases} \tag{B.2}$$

The Volterra series is given by the following expression:

$$y(t) = y_1(t) + y_2(t) + y_3(t) + \cdots \tag{B.3}$$

where the different components are defined in the next three paragraphs.

The first-order elements are given by the following expression:

$$y_1(t) = \sum_{r=0}^{t-1} g(t-r)\, u(r)$$

where g is as defined above. The second-order part of the Volterra series is as follows:

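$$y_2(t) = \sum_{r=0}^{t} h(t-r) \sum_{i,j} \left\{ F^{uu}_{ij}\, u(r-i)\, u(r-j) + \sum_{r_i=0}^{r-i-1} \left[ F^{yu}_{ij}\, g(r-i-r_i)\, u(r_i)\, u(r-j) + \sum_{r_j=0}^{r-j-1} F^{yy}_{ij}\, g(r-i-r_i)\, g(r-j-r_j)\, u(r_i)\, u(r_j) \right] \right\}.$$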

The third-order terms are given as:

Y3 = h(t - r) h(r - j -p )F~ u r=O i,j

× Y, t G ~ u ( , - i)u(p - il)u(p - / 0 tl,J1

+ p-~,-t g(P _ i, - rl)[Fi~Y/,u(r -i)u(ri)u(p -Jr) rico

rj=O r--i--1

rl =O

[,1 -- x ~ Fi,j,"" (~,)u(p ,Ou(p" -i ,) t],]l

P--/l-1 [ uy

+ ~ g ( p - i, - ri) Fit / l . (r l) . (r i)u(p - - / 1 ) q = o

+ g(P -J l - r/=O

+ ~ F~/~Uu(r -- i)u(r - - j lu(r -- k) k

r - i - I

2 { ' " " r - - t - - + Fi/k g( ri)u(rl)u(r -j)u(r - k) ri=O

• --j -- 1

+ ~ [F~[k"g(r -- i -- ri)g((r -- j -- rj)u r/=O

X (ri)u(r/)u(r -- k) r -k - I

~-~, yyy • + Fqk g ( r - t - ri)g(r - j - r/)g rk=O

These are the expressions used in our algorithm. Similar expressions can be obtained for higher-order terms.
