

JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS: Vol. 111, No. 3, pp. 501–527, December 2001 (© 2001)

Iterative Algorithms for Multiscale State Estimation, Part 1: Concepts^1

T. BINDER,^2 L. BLANK,^3 W. DAHMEN,^4 AND W. MARQUARDT^5

Communicated by H. J. Pesch

Abstract. The objective of the present investigation is to explore the potential of multiscale refinement schemes for the numerical solution of dynamic optimization problems arising in connection with chemical process systems monitoring. State estimation is accomplished by the solution of an appropriately posed least-squares problem. To offer at any instant of time an approximate solution, a hierarchy of successively refined problems is designed using a wavelet-based Galerkin discretization. In order to fully exploit at any stage the approximate solution obtained, also for an efficient treatment of the arising linear algebra tasks, we employ iterative solvers. In particular, we will apply a nested iteration scheme to the hierarchy of arising equation systems and adapt the Uzawa algorithm to the present context. Moreover, we show that, using wavelets for the formulation of the problem hierarchy, the largest eigenvalues of the resulting linear systems can be controlled effectively with scaled diagonal preconditioning. Finally, we deduce appropriate stopping criteria and illustrate the characteristics of the solver with a numerical example.

Key Words. Dynamic optimization, optimal control, wavelets, nested iterations, iterative linear algebra, preconditioning.

1. Introduction

Monitoring chemical process systems requires repetitive online estimation of the entire state, which is not accessible by measurement. Therefore,

^1 This work has been supported by the Deutsche Forschungsgemeinschaft under Grant MA1188/6.

^2 Research Scientist, Lehrstuhl für Prozesstechnik, RWTH, Aachen, Germany.
^3 Associate Research Scientist, Institut für Geometrie und Praktische Mathematik, RWTH, Aachen, Germany.
^4 Professor, Institut für Geometrie und Praktische Mathematik, RWTH, Aachen, Germany.
^5 Professor, Lehrstuhl für Prozesstechnik, RWTH, Aachen, Germany.

501    0022-3239/01/1200-0501$21.50/0 © 2002 Plenum Publishing Corporation


process states and other derived quantities must be determined via model-based state estimation employing measured data. State estimation can be formulated as a constrained optimization problem, where the difference between modeled outputs y and measured data z on a receding horizon is to be minimized (Refs. 1–3). Note that z is a continuous representation of the discrete measurements z(t_k) at sampling times t_k, which are not necessarily equidistant. Usually, the dynamic behavior of the plant is modeled by a large-scale system of differential-algebraic equations to predict, in addition to y, the states x for given controls u. Moreover, additive model correction terms are introduced into the model equations as functions û and w, which represent unmodeled phenomena. These functions have to be estimated as well. These model equations and additional inequalities, reflecting for example physical bounds on the states, are the constraints in the optimization problem. Actually, the introduction of model correction terms enhances ill-posedness; i.e., nonuniqueness as well as high sensitivity to inevitable measurement noise may occur (Refs. 4–5), and the generalized inverse is discontinuous (Ref. 6). In the context of dynamic data reconciliation (Ref. 4), one major regularization approach is the inclusion of û and w as quadratic penalty terms in the cost functional. In addition, the unknown initial values of the states could be included. This is related closely to Tikhonov regularization (Ref. 7). For a more detailed discussion of the form of the objective function, we refer to Ref. 4.

The key requirement is to provide a reliable state estimate at any instant of time, since neither the time required to calculate a solution nor the time available in a multitasking workstation environment is known beforehand. Further, the estimation problem needs to be completed within the cycle time interval of the monitoring system. One faces the following difficulty when employing established solvers based on a fixed, a priori chosen discretization. On the one hand, to guarantee the availability of an estimate within an expected time interval, one may choose a discretization that then turns out to be too coarse, while the remaining time may not permit a restart with a refined discretization; on the other hand, if the resolution is chosen too fine initially, there may be no result at all within the cycle time interval of the estimates.

Therefore, as an alternative, we propose to develop a suitable hierarchy of optimization problems with increasing resolution. Thus, already after a hopefully very short period of time, the coarsest approximate solution can serve as a minimal response. During the remaining time interval, this initial solution is to be upgraded so that the full available time span is exploited in an optimal way. Moreover, for any discretization level, the corresponding discrete problem has to be solved only with an accuracy that is comparable


to the corresponding discretization error. Hence, in the very spirit of classical nested iteration, the current approximation can be exploited as an initial guess, and the current error has to be reduced only by a fixed factor when progressing to the next discretization level. Consequently, the use of iterative methods suggests itself. The optimization community can nowadays resort to a highly advanced supply of direct solvers for the linear systems of equations that ultimately arise (possibly after linearization) from the optimization problem (Refs. 8–9). The above aspect makes us believe that a thorough investigation of iterative schemes as an alternative solver for these linear systems is due in the present context.
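The nested-iteration idea can be made concrete on a toy problem. The sketch below is our own illustration (a 1D Poisson model problem solved by plain conjugate gradients, not the estimation problem of this paper): each level is solved only to an accuracy comparable to its discretization error, and the prolongated coarse solution serves as the initial guess on the next finer level.

```python
import numpy as np

def laplacian(n):
    # 1D Dirichlet Laplacian (-u'' ~ A u) on n interior grid points
    h = 1.0 / (n + 1)
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def cg(A, b, x0, tol):
    # plain conjugate gradients, stopped on the relative residual
    x = x0.copy()
    r = b - A @ x
    d = r.copy()
    rho = r @ r
    for _ in range(10 * len(b)):
        if np.sqrt(rho) <= tol * np.linalg.norm(b):
            break
        q = A @ d
        alpha = rho / (d @ q)
        x += alpha * d
        r -= alpha * q
        rho_new = r @ r
        d = r + (rho_new / rho) * d
        rho = rho_new
    return x

def prolongate(xc):
    # linear interpolation: n coarse interior points -> 2n+1 fine ones
    n = len(xc)
    xf = np.zeros(2 * n + 1)
    xf[1::2] = xc
    padded = np.concatenate([[0.0], xc, [0.0]])
    xf[0::2] = 0.5 * (padded[:-1] + padded[1:])
    return xf

# Nested iteration: each level is solved only to an accuracy comparable
# to its discretization error; the coarse result warm-starts the solver.
n, x = 7, None
for _ in range(4):
    A = laplacian(n)
    t = np.linspace(0.0, 1.0, n + 2)[1:-1]
    b = np.sin(np.pi * t)                      # smooth right-hand side
    x0 = np.zeros(n) if x is None else prolongate(x)
    x = cg(A, b, x0, tol=1.0 / (n + 1) ** 2)   # ~ discretization error
    n = 2 * n + 1
```

On the finest level, the computed vector approximates the exact solution sin(πt)/π² of −u'' = sin(πt) to within the discretization error, although no single level was solved to high accuracy from a cold start.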

Our approach is based on Galerkin discretizations. Consequently, the use of hierarchically organized bases for realizing the above concept is quite natural. In particular, we employ wavelets. In the context of iterative solvers, they are a cornerstone for constructing a good preconditioner leading to condition numbers independent of the discretization level. Also, they facilitate adequate adaptive refinement strategies.

Of course, the development and validation of such a concept is a rather complex task. Our first experiments quickly revealed a multitude of different interfering effects. Therefore, we found it necessary to scale down the problem to an extent where those features can be identified. Hence, we will address here only linear problems and suppress inequality constraints. A successful treatment of this problem class is essential to the subsequent handling of more realistic problem formulations.

The objective of this paper is to develop the basic ingredients for the above program. The outline of the paper is as follows. In Section 2, we first describe the model problem in more detail and discuss the sources of ill-conditioning. We proceed in Section 3 with our multiscale discretization. In particular, the corresponding Karush–Kuhn–Tucker system (KKT system) is given in wavelet coordinates. For the solution of the optimization problem on each discretization level, we construct and analyze a simplified Uzawa algorithm exploiting the specific problem structure in Section 4. Conceptually, this method is the application of the PCG method to the Schur complement, which is given explicitly in Section 4.2. Section 5 addresses the ill-conditioning of the KKT systems and its specific sources. We shall see that wavelet concepts can be used to control large eigenvalues. In fact, a simple diagonal scaling, referred to as (approximate) Jacobi preconditioner, will turn out to provide condition numbers bounded independently of the discretization level. In Section 6, we sketch the nested iteration scheme for the solution of the sequence of systems. Moreover, we derive relevant stopping criteria based upon the preconditioned systems. We conclude this section by formulating the complete algorithm and an application


to a simple example. In Section 7, we briefly summarize the results and draw some conclusions.

In Part 2 of this paper (Ref. 10), the proposed simplified Uzawa scheme and the Jacobi preconditioner will be applied to test examples with typical obstructions. In particular, the interplay between state normalization, choice of regularization parameters, and preconditioning will be studied.

2. Estimation Problem

Our main objective in this paper is the efficient solution of the systems of discrete equations arising from the optimization problems described below. In particular, the solvers should support the multiscale concepts outlined in the previous section. Since the same issues arise after linearization of a nonlinear problem, we confine the discussion here to a linear model problem with equality constraints only.

The receding horizon will always be scaled to [0, 1]. Since all functions will refer to this interval, we will omit explicit references to this domain in the sequel. Then, for given $z \in (L_2)^{n_y}$, $u \in (L_2)^{n_u}$, and for unknown functions $x \in (H^1)^{n_x}$, $y \in (L_2)^{n_y}$, $\hat u \in (L_2)^{n_{\hat u}}$, $w \in (L_2)^{n_w}$ with $n_w \le n_x$, $n_{\hat u} \le n_y$, the resulting linear optimization problem has the form

$$\min_{x,y,\hat u,w}\ \int_0^1 \{(y-z)^T Q (y-z) + \hat u^T R_{\hat u}\,\hat u + w^T R_w\,w\}\,dt, \qquad (1)$$

$$\text{s.t.}\quad \dot x - Ax - Ww = Bu, \quad t \in [0,1], \qquad (2)$$

$$y - Cx - V\hat u = 0. \qquad (3)$$

In principle, the weights $Q, R_{\hat u}, R_w$ could be operators, for example chosen in such a way that the cost functional is equivalent to the square of Sobolev norms. In this way, regularity considerations could be included elegantly in the estimation problem. Often, Q is preferably chosen as a constant diagonal matrix reflecting confidence in the measurements (Ref. 11). The right choice of the regularization terms $R_{\hat u}$ and $R_w$ is still open, at least for nonlinear problems; see Ref. 3 for a thorough discussion. Therefore, diagonal matrices are usually employed for pragmatic reasons. The matrices V and W are identity matrices, possibly extended by zero rows.

For solving the optimization problem, we reformulate the constraints in a weak sense. Here and in the following, we denote by $\langle\cdot,\cdot\rangle$ the $L_2$ scalar product,

$$\langle s, u\rangle = \int_0^1 s(t)\,u(t)\,dt, \qquad \text{for } s, u \in L_2.$$


For convenience, we introduce for vector functions $s, u \in (L_2)^n$ the scalar product notation

$$\langle s, u\rangle = \int_0^1 s^T(t)\,u(t)\,dt$$

and the notation

$$\langle\langle s, u\rangle\rangle = \int_0^1 s(t)\,u^T(t)\,dt$$

for the matrix of all possible scalar products $(\langle s_i, u_j\rangle)_{(i,j)}$. In these terms, the constraints take the form

$$\langle \zeta_1,\ \dot x - Ax - Ww\rangle = \langle \zeta_1, Bu\rangle, \quad \text{for all } \zeta_1 \in (L_2)^{n_x}, \qquad (4)$$

$$\langle \zeta_2,\ y - Cx - V\hat u\rangle = 0, \quad \text{for all } \zeta_2 \in (L_2)^{n_y}. \qquad (5)$$

For convenience, we gather the unknowns in

$$\upsilon := (x^T, y^T, \hat u^T, w^T)^T \in \Upsilon$$

and the test functions in

$$\zeta = (\zeta_1^T, \zeta_2^T)^T \in M,$$

where

$$\Upsilon := (H^1)^{n_x} \times (L_2)^{n_y + n_{\hat u} + n_w}, \qquad M := (L_2)^{n_x + n_y}.$$

Thus, denoting by $a(\upsilon, \upsilon')$ the bilinear form with $\upsilon, \upsilon' \in \Upsilon$ resulting from the cost functional in (1), and writing $b(\zeta, \upsilon)$ for the combined bilinear forms resulting from the left-hand sides of (4)–(5), i.e.,

$$a(\upsilon, \upsilon') = \langle y, Qy'\rangle + \langle \hat u, R_{\hat u}\hat u'\rangle + \langle w, R_w w'\rangle,$$

$$b(\zeta, \upsilon) = \langle \zeta_1,\ \dot x - Ax - Ww\rangle + \langle \zeta_2,\ y - Cx - V\hat u\rangle,$$

the weak formulation of the necessary conditions of the minimization problem (1)–(3) reads as follows: Find $\upsilon \in \Upsilon$ and the Lagrange multiplier $\lambda = (\mu^T, \nu^T)^T \in M$ such that

$$a(\upsilon, \upsilon') + b(\lambda, \upsilon') = \langle 2Qz, \upsilon'\rangle, \quad \text{for all } \upsilon' \in \Upsilon, \qquad (6)$$

$$b(\zeta, \upsilon) = \langle \zeta_1, Bu\rangle, \quad \text{for all } \zeta \in M. \qquad (7)$$

In view of the Riesz representation theorem, the weak formulation (6)–(7) is equivalent to the operator equation

$$L\begin{pmatrix}\upsilon\\ \lambda\end{pmatrix} = \begin{pmatrix}h(z,u)\\ g(z,u)\end{pmatrix}, \qquad (8)$$


with unknowns $\upsilon \in \Upsilon$ and $\lambda \in M$. Thus, in this setting, the solution λ of the dual optimization problem is automatically part of the solution.

The fact that the problem formulation (1)–(3) is well-posed means that the mapping L is a topological isomorphism from $\Upsilon \times M$ to the dual $\Upsilon' \times M'$; i.e., there exist positive constants $\underline c, \bar c$ such that

$$\underline c\,(\|\upsilon\|_\Upsilon^2 + \|\lambda\|_M^2) \le \left\| L\begin{pmatrix}\upsilon\\ \lambda\end{pmatrix}\right\|^2_{((H^1)^{n_x})' \times (L_2)^{n_x + 2n_y + n_{\hat u} + n_w}} \le \bar c\,(\|\upsilon\|_\Upsilon^2 + \|\lambda\|_M^2). \qquad (9)$$

Note that some components of the unknowns are now measured in stronger norms than the $L_2$ norm. Of course, the quotient $\bar c/\underline c$ may be very large, which means that the uniquely solvable, continuous estimation problem will exhibit a very large condition number. Essentially, there are three interfering potential origins of large condition numbers. Firstly, the original problem of state estimation without regularization (i.e., $R_{\hat u} = 0$ and $R_w = 0$) may be ill-posed, and the way of regularization affects the constants in (9). Secondly, the underlying state equations can be very stiff. Last but not least, the observability measure of the model equations may be very low. Observability measures quantify the degree to which the state x can be reconstructed from the observations. There exist quite a number of different observability measures (see Part 2 and Ref. 12). Which observability measure best quantifies the influence on the condition numbers is not yet clear. Moreover, the regularization alters the extent to which the observability measure affects the condition of the optimization problem. A quantitative understanding of this interplay is still lacking.

As discretization technique, we will apply the Galerkin method based on wavelets. The wavelet-based Galerkin method is not new in principle, but has essentially been confined to elliptic boundary-value problems and boundary integral equations; see Refs. 13–14 and the references therein. In the context of simulation and dynamic optimization of ordinary differential equations, several related wavelet approaches have appeared recently in the literature (Refs. 15–17). However, to our knowledge, wavelet-based Galerkin methods for state estimation have so far not been used and studied in the context of nested iterations.

3. Discretization and Resulting KKT System

In order to discretize the weak formulation (6)–(7) of the optimization problem (1)–(3) with a Galerkin method, we have to choose appropriate finite-dimensional subspaces $\Upsilon_\Lambda, M_\Lambda$ of $\Upsilon, M$. The idea of the refinement


approach presented in the introduction is based on the use of hierarchically organized bases. In particular, wavelets offer the following advantages: (i) they permit a stable representation of data in a multiscale format, where the involved coefficients represent updates with increasing resolution; (ii) they support preconditioning. For an introduction to and overview of wavelet concepts, we refer the reader to Refs. 14, 18, 19.

To ensure possibly sparse matrix patterns, we will employ in this paper wavelets with possibly small support and hence minimal regularity. Since only first-order derivatives appear in the constraints (4), globally continuous piecewise linear spline wavelets are appropriate for discretizing the functions $x, y, \hat u$. More precisely, we employ wavelets $\psi_{j,k}$ determined by a dual multiresolution of order 2, which are adapted to the interval [0, 1]; see Ref. 20. Thus, both the primal wavelets $\psi_{j,k}$ as well as the dual wavelets $\tilde\psi_{j,k}$ have second-order vanishing moments. Moreover, collecting for convenience all functions $\psi_{j,k}$ and $\tilde\psi_{j,k}$ in the arrays $\Psi$ and $\tilde\Psi$ respectively, they satisfy

$$\langle\langle \Psi, \tilde\Psi\rangle\rangle = I.$$

Note that the corresponding primal multiresolution is generated by the classical hat function,

$$\varphi(x) := 1 - |x|, \quad \text{for } x \in [-1, 1], \qquad \varphi(x) := 0, \quad \text{elsewhere.}$$

For the last component w of the space Υ, we choose the Haar basis $\Psi^H$. Thus, the generating scaling function is simply $\varphi^H := \chi_{[0,1)}$. Note that, since the Haar basis is orthonormal, it equals its dual; i.e.,

$$\langle\langle \Psi^H, \Psi^H\rangle\rangle = I.$$
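The orthonormality relation above can be verified numerically, since all Haar functions are piecewise constant on a sufficiently fine dyadic grid; midpoint sums then evaluate the Gramian exactly. The following small sketch is our own illustration:

```python
import numpy as np

def haar_system(jmax):
    # Sample chi_[0,1) and the Haar wavelets psi_{j,k} up to level jmax
    # at the midpoints of a dyadic grid fine enough that every function
    # is constant on each cell, so sums integrate products exactly.
    n = 2 ** (jmax + 1)
    t = (np.arange(n) + 0.5) / n
    rows = [np.ones(n)]                      # scaling function chi_[0,1)
    for j in range(jmax + 1):
        for k in range(2 ** j):
            s = 2.0 ** j * t - k
            rows.append(2.0 ** (j / 2.0)
                        * (((0.0 <= s) & (s < 0.5)).astype(float)
                           - ((0.5 <= s) & (s < 1.0)).astype(float)))
    return np.array(rows), 1.0 / n

Psi_H, h = haar_system(3)
gram = h * Psi_H @ Psi_H.T    # Gramian <<Psi^H, Psi^H>>, computed exactly
```

The computed Gramian equals the identity matrix, confirming that the scaling function together with all wavelets up to the chosen level forms an orthonormal system.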

Likewise, all components of the first group of Lagrange multipliers μ are discretized with the aid of the Haar basis $\Psi^H$, while the dual wavelets in $\tilde\Psi$ are used for the second group of components of the Lagrange multipliers ν in $M_\Lambda$. Thus, the state equations (2) are tested by the Haar wavelets $\psi^H_{j,k}$, and the output equations (3) are tested by the wavelets $\tilde\psi_{j,k}$ dual to the piecewise linear wavelets $\psi_{j,k}$.

Note that the chosen collections of wavelets form bases for the relevant function spaces. Hence, we obtain an infinite-dimensional but discretized problem formulation, which is still equivalent to (6)–(7). The restriction to a finite index set of employed basis functions

$$\Lambda \subseteq \{(j, k)\,:\, j \ge j_0,\ k \in I_j\}$$

leads to the finite-dimensional problem required for the numerical treatment.


The above-mentioned orthogonality and biorthogonality relations ensure that the Galerkin matrices obtained when restricting (6)–(7) to such choices of the subspaces $\Upsilon_\Lambda$ of Υ and $M_\Lambda$ of M exhibit the sparseness structure illustrated in Fig. 1. Of course, Λ is comprised here of finite index sets $\Lambda_{x_i}, \Lambda_{y_i}, \Lambda_{\hat u_i}, \Lambda_{w_i}$. In general, these index sets will result from an adaptive refinement of the trial spaces for each scalar unknown and may therefore develop, in principle, independently of each other. However, in order to simplify the technical exposition, we will confine the following discussion to the case where all the states $x_i$, $i = 1, \dots, n_x$, have the same index set $\Lambda_x$, all the outputs have the same index set $\Lambda_y$, and likewise for the index sets of the remaining groups of components. Moreover, to ensure the stability of the discretizations, the spaces $\Upsilon_\Lambda, M_\Lambda$ have to satisfy the LBB condition (Ref. 21). This means that $M_\Lambda$ may not be too rich relative to $\Upsilon_\Lambda$. Specifically, this requires that the index sets $\Lambda'_x$ for the first group of Lagrange multipliers μ acting on (2) have to be contained in $\Lambda_x$. Likewise, we will

Fig. 1. KKT matrix $L_\Lambda$.

Page 9: Iterative Algorithms for Multiscale State Estimation, Part 1: · PDF filepermit a restart with a refined discretization; on the other hand, if the resolution is chosen too fine initially,

JOTA: VOL. 111, NO. 3, DECEMBER 2001 509

make sure always that $\Lambda_w \subset \Lambda_x$ and $\Lambda_{\hat u} \subseteq \Lambda_y$. For a detailed discussion of an adaptive generation of the index sets, we refer to Ref. 2.

Let us now collect in $d_x$ all the wavelet coefficients $d_x = (d_{\Lambda_{x_i}})_{i=1,\dots,n_x}$ of the states $x_\Lambda$, and likewise for $d_y, d_{\hat u}, d_w$, where we have suppressed the index Λ for convenience. Inserting now the ansatz

$$\upsilon_\Lambda = \big(\Psi^T_{\Lambda_x} d_x,\ \Psi^T_{\Lambda_y} d_y,\ \Psi^T_{\Lambda_{\hat u}} d_{\hat u},\ (\Psi^H_{\Lambda_w})^T d_w\big) \in \Upsilon_\Lambda,$$

$$\lambda_\Lambda = \big((\Psi^H_{\Lambda'_x})^T d_\mu,\ \tilde\Psi^T_{\Lambda_y} d_\nu\big) \in M_\Lambda$$

into (6)–(7) and testing with all the basis functions in $\Upsilon_\Lambda$ and $M_\Lambda$ provides a linear system of equations, the KKT system. Under the above conventions on the index sets, the discretization matrix can be described in a rather condensed form with the aid of Kronecker products. First, recall our shorthand notation for Gramian matrices,

$$\langle\langle \Psi_\Lambda, \Psi_{\Lambda'}\rangle\rangle = [\langle \psi_{(j,k)}, \psi_{(j',k')}\rangle]_{(j,k)\in\Lambda,\ (j',k')\in\Lambda'}.$$

Moreover, the Kronecker product of a matrix $A \in \mathbb{R}^{n\times m}$ with a matrix B is defined by

$$A \otimes B := (a_{i,j}B)_{i=1,\dots,n;\ j=1,\dots,m}.$$

Defining now

$$Q_\Lambda = Q \otimes \langle\langle \Psi_{\Lambda_y}, \Psi_{\Lambda_y}\rangle\rangle, \quad R_{\hat u\Lambda} = R_{\hat u} \otimes \langle\langle \Psi_{\Lambda_{\hat u}}, \Psi_{\Lambda_{\hat u}}\rangle\rangle, \quad R_{w\Lambda} = R_w \otimes I_{\Lambda_w}, \qquad (10)$$

$$X_\Lambda = I_{n_x} \otimes \langle\langle \Psi^H_{\Lambda'_x}, \dot\Psi_{\Lambda_x}\rangle\rangle - A \otimes \langle\langle \Psi^H_{\Lambda'_x}, \Psi_{\Lambda_x}\rangle\rangle, \quad W_\Lambda = W \otimes \langle\langle \Psi^H_{\Lambda'_x}, \Psi^H_{\Lambda_w}\rangle\rangle, \qquad (11)$$

$$I_\Lambda = I_{n_y} \otimes I_{\Lambda_y}, \quad C_\Lambda = C \otimes \langle\langle \tilde\Psi_{\Lambda_y}, \Psi_{\Lambda_x}\rangle\rangle, \quad V_\Lambda = V \otimes \langle\langle \tilde\Psi_{\Lambda_y}, \Psi_{\Lambda_{\hat u}}\rangle\rangle, \qquad (12)$$

$$B_\Lambda = B \otimes \langle\langle \Psi^H_{\Lambda'_x}, \Psi^H_{\Lambda_u}\rangle\rangle, \qquad (13)$$

and bearing in mind that, on account of orthonormality and biorthogonality, the wavelet mass matrices $\langle\langle \Psi^H_{\Lambda'_x}, \Psi^H_{\Lambda_w}\rangle\rangle$ and $\langle\langle \tilde\Psi_{\Lambda_y}, \Psi_{\Lambda_{\hat u}}\rangle\rangle$ are identity matrices possibly enlarged by a zero block, while the others exhibit a finger structure, the restriction of (6)–(7) to $(\Upsilon_\Lambda, M_\Lambda)$ yields the following KKT system:

$$\begin{pmatrix}
 & & & & X_\Lambda^T & -C_\Lambda^T \\
 & 2Q_\Lambda & & & & I_\Lambda \\
 & & 2R_{\hat u\Lambda} & & & -V_\Lambda^T \\
 & & & 2R_{w\Lambda} & -W_\Lambda^T & \\
X_\Lambda & & & -W_\Lambda & & \\
-C_\Lambda & I_\Lambda & -V_\Lambda & & &
\end{pmatrix}
\begin{pmatrix} d_x \\ d_y \\ d_{\hat u} \\ d_w \\ d_\mu \\ d_\nu \end{pmatrix}
=
\begin{pmatrix} 0 \\ 2Q_\Lambda d_z \\ 0 \\ 0 \\ B_\Lambda d_u \\ 0 \end{pmatrix}, \qquad (14)$$

where blank entries denote zero blocks.


This linear system is simply the Galerkin discretization of (8) with respect to the finite-dimensional spaces $\Upsilon_\Lambda, M_\Lambda$. Of course, this system also arises from the necessary KKT conditions for the discrete minimization problem

$$\min_{d_x, d_y, d_{\hat u}, d_w}\ (d_y - d_z)^T Q_\Lambda (d_y - d_z) + d_{\hat u}^T R_{\hat u\Lambda} d_{\hat u} + d_w^T R_{w\Lambda} d_w,$$

$$\text{s.t.}\quad X_\Lambda d_x - W_\Lambda d_w = B_\Lambda d_u,$$

$$I_\Lambda d_y - C_\Lambda d_x - V_\Lambda d_{\hat u} = 0,$$

obtained by restricting the minimization in (1)–(3) to the finite-dimensional space $\Upsilon_\Lambda$.

The linear system (14) has the following block structure:

$$L_\Lambda \begin{pmatrix}\upsilon_\Lambda\\ \lambda_\Lambda\end{pmatrix} := \begin{pmatrix} A_\Lambda & B_\Lambda^T \\ B_\Lambda & 0 \end{pmatrix}\begin{pmatrix}\upsilon_\Lambda\\ \lambda_\Lambda\end{pmatrix} = \begin{pmatrix} h_\Lambda \\ g_\Lambda\end{pmatrix}, \qquad (15)$$

where the constraints are represented by $B_\Lambda$ and the vector $\lambda_\Lambda$ is comprised of the Lagrange parameters. However, here the matrix $A_\Lambda$ reflecting the objective function is only positive semidefinite, since the states x are not regularized at this point. This is also illustrated by Fig. 1, which displays the nonzero pattern of the discretization matrix for a system with four states and one output function discretized on an equidistant grid of mesh size $2^{-7}$.
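The consequences of the semidefinite block can be seen on a tiny numeric example (our own illustration with arbitrary numbers, not the paper's data): as long as the constraints have full row rank and fix the directions in which the objective block vanishes, a saddle-point matrix of the form (15) remains nonsingular, although it is indefinite.

```python
import numpy as np

rng = np.random.default_rng(0)
# The objective block is only positive SEMIdefinite: the first two
# unknowns (playing the role of the unregularized states) carry no
# quadratic term, mirroring the structure of A_Lambda in (15).
A = np.diag([0.0, 0.0, 2.0, 3.0, 1.0])
B = rng.standard_normal((2, 5))            # full-row-rank constraints
K = np.block([[A, B.T],
              [B, np.zeros((2, 2))]])      # saddle-point matrix as in (15)
sol = np.linalg.solve(K, np.ones(7))       # succeeds: K is nonsingular
```

The indefiniteness (K has negative eigenvalues) is what rules out applying plain CG, or the standard Uzawa scheme with this singular A, directly to the full system.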

4. Numerical Algorithm

In this section, we outline the solution of the optimization problem (1) subject to the constraints (2)–(3), or equivalently, the corresponding operator equation (8). We employ a refinement approach to provide a state estimate at any instant of time. Therefore, we consider the sequence of finite-dimensional linear systems (15),

$$L_{\Lambda_l}\begin{pmatrix}\upsilon_{\Lambda_l}\\ \lambda_{\Lambda_l}\end{pmatrix} = \begin{pmatrix}h_{\Lambda_l}\\ g_{\Lambda_l}\end{pmatrix}, \quad \text{for } l = l_0, l_0+1, \dots, \qquad (16)$$

and apply to the whole sequence of equations a nested iteration scheme as introduced for example in Ref. 21; see Section 6 for a sketch. Moreover, each equation system is solved by an iterative solver, which we describe below.

4.1. Simplified Uzawa Algorithm. To obtain an efficient iterative solver for a fixed level l, the structure of the system given by the original problem has to be exploited. The Uzawa algorithm is known as a classical scheme for saddle-point problems of the form (15); see Refs. 21, 23, 24. However, a direct application of the Uzawa technique is prohibited by the rank deficiency of the upper left block $A_\Lambda$ in (15). Nevertheless, due to the simplicity of the nonzero part of $A_\Lambda$, we can easily reduce the system (14) by eliminating the wavelet coefficients $d_y, d_{\hat u}, d_w$. These coefficients can be expressed in terms of the dual solution $d_\mu, d_\nu$. The resulting equations are

$$\begin{pmatrix}
\tfrac12\,(Q_\Lambda^{-1} + V_\Lambda R_{\hat u\Lambda}^{-1} V_\Lambda^T) & 0 & C_\Lambda \\
0 & \tfrac12\,W_\Lambda R_{w\Lambda}^{-1} W_\Lambda^T & -X_\Lambda \\
C_\Lambda^T & -X_\Lambda^T & 0
\end{pmatrix}
\begin{pmatrix} d_\nu \\ d_\mu \\ d_x \end{pmatrix}
=
\begin{pmatrix} d_z \\ -B_\Lambda d_u \\ 0 \end{pmatrix} \qquad (17)$$

and

$$2Q_\Lambda d_y = -d_\nu + 2Q_\Lambda d_z, \qquad (18)$$

$$2R_{\hat u\Lambda} d_{\hat u} = V_\Lambda^T d_\nu, \qquad (19)$$

$$2R_{w\Lambda} d_w = W_\Lambda^T d_\mu. \qquad (20)$$

Equations (18)–(20) can be solved easily by exploiting the rule

$$(E \otimes F)^{-1} = E^{-1} \otimes F^{-1}$$

for the inversion of $Q_\Lambda, R_{\hat u\Lambda}, R_{w\Lambda}$. Moreover, recall from (10)–(13) that the weight matrices $Q, R_{\hat u}, R_w$ are in general diagonal. Furthermore, due to the stability of the wavelet bases, the matrices $\langle\langle \Psi_{\Lambda_y}, \Psi_{\Lambda_y}\rangle\rangle$ and $\langle\langle \Psi_{\Lambda_{\hat u}}, \Psi_{\Lambda_{\hat u}}\rangle\rangle$ are well-conditioned, so that the systems involving these latter matrices can be solved very efficiently by an iterative scheme.

The system (17) has again the structure (15),

$$\tilde L_\Lambda\begin{pmatrix}\tilde\lambda_\Lambda\\ \tilde\upsilon_\Lambda\end{pmatrix} := \begin{pmatrix}\tilde A_\Lambda & \tilde B_\Lambda^T\\ \tilde B_\Lambda & 0\end{pmatrix}\begin{pmatrix}\tilde\lambda_\Lambda\\ \tilde\upsilon_\Lambda\end{pmatrix} = \begin{pmatrix}\tilde h_\Lambda\\ \tilde g_\Lambda\end{pmatrix}, \qquad (21)$$

where

$$\tilde\lambda_\Lambda = (d_\nu^T, d_\mu^T)^T \quad\text{and}\quad \tilde\upsilon_\Lambda = d_x.$$

The corresponding infinite-dimensional system is denoted by

$$\tilde L\begin{pmatrix}\tilde\lambda\\ \tilde\upsilon\end{pmatrix} = \begin{pmatrix}\tilde h\\ \tilde g\end{pmatrix}. \qquad (22)$$

Considering $\tilde A_\Lambda$ in more detail, we observe that $Q_\Lambda^{-1} + V_\Lambda R_{\hat u\Lambda}^{-1} V_\Lambda^T$ has full rank and is positive definite. So is $W_\Lambda R_{w\Lambda}^{-1} W_\Lambda^T$, provided that W has the same rank as A ($n_w = n_x$). However, since the KKT matrix in (14) always has full rank for a correctly posed problem, the following ideas can be extended to the more general case.


Now, $\tilde A_\Lambda$ is positive definite, so that the following Uzawa technique can be applied. Recall that this algorithm is based on the reformulation of (21) by block elimination to

$$\tilde A_\Lambda \tilde\lambda_\Lambda = \tilde h_\Lambda - \tilde B_\Lambda^T \tilde\upsilon_\Lambda, \qquad (23)$$

$$\tilde B_\Lambda \tilde A_\Lambda^{-1} \tilde B_\Lambda^T \tilde\upsilon_\Lambda = \tilde B_\Lambda \tilde A_\Lambda^{-1} \tilde h_\Lambda - \tilde g_\Lambda. \qquad (24)$$

The system (24), involving the Schur complement $\tilde B_\Lambda \tilde A_\Lambda^{-1} \tilde B_\Lambda^T$, is solved by the PCG method. However, since the Schur complement generally involves the inverse of the upper left block $\tilde A_\Lambda$, it is usually not computed explicitly. Instead, each PCG step for (24) requires the solution of a linear system involving $\tilde A_\Lambda$, which is done again by applying the PCG method as an inner iteration. Moreover, this algorithm for (24) also yields in each step the search direction for solving (23). In this sense, the Uzawa algorithm combines the solution of (24) and (23), avoiding an additional application of the PCG method to (23).

In our particular case, this procedure can be simplified, since the inverse $\tilde A_\Lambda^{-1}$ can be computed explicitly by exploiting the Kronecker product. In fact, one has

$$\tilde A_\Lambda^{-1} = 2\begin{pmatrix}(Q^{-1} + V R_{\hat u}^{-1} V^T)^{-1} \otimes \langle\langle \Psi_{\Lambda_y}, \Psi_{\Lambda_y}\rangle\rangle & 0\\ 0 & (W R_w^{-1} W^T)^{-1} \otimes I_{\Lambda'_x}\end{pmatrix}. \qquad (25)$$

Note that the inverses $(Q^{-1} + V R_{\hat u}^{-1} V^T)^{-1}$ and $(W R_w^{-1} W^T)^{-1}$ can be computed once and for all in advance, since they are independent of the current discretization.

This leads to the following simplified Uzawa algorithm. As in the classical version, the PCG method is applied to the Schur complement

$$\tilde S_\Lambda = \tilde B_\Lambda \tilde A_\Lambda^{-1} \tilde B_\Lambda^T.$$

Of course, the matrix $\tilde S_\Lambda$ is not assembled. Instead, only matrix–vector multiplications involving the matrices $\tilde A_\Lambda^{-1}, \tilde B_\Lambda, \tilde B_\Lambda^T$ are performed, as detailed below in (26). The inner iterations of the Uzawa algorithm, namely the application of the PCG method to the systems involving $\tilde A_\Lambda$, are substituted by a matrix multiplication with $\tilde A_\Lambda^{-1}$. Therefore, in this algorithm, preconditioning refers only to the Schur complement. As for the PCG method, it requires an appropriate positive-definite matrix M applied to the current residuals, as shown in (26) below. Thus, we obtain the following algorithm for the system (21). To avoid a severe cluttering of indices, we suppress here the subscript Λ and the superscript tilde.


Initialization. Let $\upsilon_0$ be given. Then,

$$\lambda_0 = A^{-1}(h - B^T \upsilon_0),$$
$$r_0 = g - B\lambda_0,$$
$$q_0 = M r_0,$$
$$d_0 = -q_0,$$
$$\rho_0 = r_0^T q_0.$$

Iterations. For $k = 0, 1, \dots$, compute

$$p_k = B^T d_k, \qquad (26a)$$
$$s_k = A^{-1} p_k, \qquad (26b)$$
$$\alpha_k = \rho_k / (p_k^T s_k), \qquad (26c)$$
$$\upsilon_{k+1} = \upsilon_k + \alpha_k d_k, \qquad (26d)$$
$$\lambda_{k+1} = \lambda_k - \alpha_k s_k, \qquad (26e)$$
$$r_{k+1} = g - B\lambda_{k+1}, \qquad (26f)$$
$$q_{k+1} = M r_{k+1}, \qquad (26g)$$
$$\rho_{k+1} = r_{k+1}^T q_{k+1}, \qquad (26h)$$
$$d_{k+1} = -q_{k+1} + (\rho_{k+1}/\rho_k)\,d_k. \qquad (26i)$$

Note that this algorithm iteratively determines not only $\tilde\upsilon_\Lambda = d_x$, but also the unknown

$$\tilde\lambda_\Lambda = (d_\nu^T, d_\mu^T)^T.$$

To obtain a rough estimate of the computational cost of this scheme, let nz(A) denote the number of nonzero entries of a matrix A. Then, for k iterations, the cost of the algorithm in terms of additions is

$$k\,[2\,\mathrm{rank}\,\tilde B - 2 + \mathrm{nz}(\tilde A^{-1}) + 2\,\mathrm{nz}(\tilde B) + \mathrm{nz}(M)] - 2\,\mathrm{rank}\,\tilde B - \mathrm{rank}\,\tilde A + 1,$$

and in terms of multiplications is

$$k\,[3\,\mathrm{rank}\,\tilde B + 2\,\mathrm{rank}\,\tilde A + 2 + \mathrm{nz}(\tilde A^{-1}) + 2\,\mathrm{nz}(\tilde B) + \mathrm{nz}(M)] - 2\,\mathrm{rank}\,\tilde B - 2\,\mathrm{rank}\,\tilde A - 2.$$


For our application, we have

$$\mathrm{rank}\,\tilde B = n_x\,\#\Lambda_x \quad\text{and}\quad \mathrm{rank}\,\tilde A = n_y\,\#\Lambda_y + n_x\,\#\Lambda'_x.$$

Furthermore, taking into account that $\tilde B$ consists of blocks with finger structure originating from the wavelet discretization, $\mathrm{nz}(\tilde B)$ is of order $n_S\,O(\#\Lambda_x \log(\#\Lambda_x))$ (see Ref. 14), where $n_S$ is at most $\mathrm{nz}(A) + \mathrm{nz}(C) + n_x$. In general, $n_S$ is of order $n_x$ due to the sparsity of the model matrices, but in the worst case it can be $n_x^2$. Assuming that the preconditioning matrix M is sparse, more specifically that it has at most $c\,\mathrm{rank}\,\tilde B$ nonzero entries, the method requires $n_S\,O(\#\Lambda_x \log(\#\Lambda_x))$ flops per iteration.

Of course, one hopes to find a sufficiently good preconditioner M to ensure that the number k of iterations needed to achieve a desired accuracy stays far below the number of unknowns, which here is

$$\mathrm{rank}\,\tilde B = n_x\,\#\Lambda_x.$$

In fact, one has the following classical estimate for the iterates $d_x^{(k)} = \tilde\upsilon_\Lambda^{(k)}$ in terms of the spectral condition number $\kappa_2 := \mathrm{cond}_2(M^{1/2}\tilde S_\Lambda M^{1/2})$:

$$\|d_x^{(k)} - d_x\|_{\tilde S_\Lambda} \le 2\,\|d_x^{(0)} - d_x\|_{\tilde S_\Lambda}\,\delta^k, \qquad (27)$$

where

$$\delta = (\sqrt{\kappa_2} - 1)/(\sqrt{\kappa_2} + 1)$$

and where the energy norm $\|d_x\|_A^2$ is defined as $d_x^T A\,d_x$ (Ref. 25).

As a consequence of the norm equivalence for wavelets (Ref. 14),

c_1 ‖d_f‖_{l_2} ≤ ‖f‖_{L_2} ≤ c_2 ‖d_f‖_{l_2}, (28)

the above estimate immediately yields the following error bound for the iterates x_Λ^{(k)} = Ψ_Λ^T d_x^{(k)}, as approximations to the function x_Λ = Ψ_Λ^T d_x, in the energy norm

‖x‖_{S̃_Λ}^2 := ⟨x, S̃_Λ x⟩ = ‖S̃_Λ^{1/2} x‖_{L_2^{n_x}}^2,

which is induced by the corresponding Schur complement S̃_Λ:

‖x_Λ^{(k)} − x_Λ‖_{S̃_Λ} ≤ 2c ‖x_Λ^{(0)} − x_Λ‖_{S̃_Λ} δ^k. (29)

The constant

c = cond(⟨Ψ, Ψ⟩)

is less than 12 for our choice of wavelets. The design of preconditioners that give rise to possibly small reduction factors δ, and hence to small iteration numbers, will be explained later; see Section 5.


4.2. Schur Complement. To prepare the ground for discussing the issue of preconditioning, we consider the Schur complement in more detail. The reduction of the system (14) to the unknowns d_x gives the Schur complement

S_Λ = 2[X_Λ^T (W_Λ R_{w,Λ}^{-1} W_Λ^T)^{-1} X_Λ + C_Λ^T (Q_Λ^{-1} + V_Λ R_{v,Λ}^{-1} V_Λ^T)^{-1} C_Λ], (30)

which is positive definite. The corresponding linear system is

(1/2) S_Λ d_x = X_Λ^T (W_Λ R_{w,Λ}^{-1} W_Λ^T)^{-1} B_Λ d_u + C_Λ^T (Q_Λ^{-1} + V_Λ R_{v,Λ}^{-1} V_Λ^T)^{-1} d_z. (31)

Abbreviating

Q̃ := (Q^{-1} + V R_v^{-1} V^T)^{-1} and R̃_w := (W R_w^{-1} W^T)^{-1},

and exploiting the rules for the Kronecker product, we derive from (10)–(13) and (25) the following representation of S_Λ:

(1/2) S_Λ = R̃_w ⊗ (⟨Ψ̇_{Λ_x}, Ψ̃_{Λ′_x}⟩⟨Ψ̃_{Λ′_x}, Ψ̇_{Λ_x}⟩)
 − {(A^T R̃_w) ⊗ (⟨Ψ_{Λ_x}, Ψ̃_{Λ′_x}⟩⟨Ψ̃_{Λ′_x}, Ψ̇_{Λ_x}⟩)}
 − {(A^T R̃_w) ⊗ (⟨Ψ_{Λ_x}, Ψ̃_{Λ′_x}⟩⟨Ψ̃_{Λ′_x}, Ψ̇_{Λ_x}⟩)}^T
 + (A^T R̃_w A) ⊗ (⟨Ψ_{Λ_x}, Ψ̃_{Λ′_x}⟩⟨Ψ̃_{Λ′_x}, Ψ_{Λ_x}⟩)
 + (C^T Q̃ C) ⊗ (⟨Ψ_{Λ_x}, Ψ_{Λ_y}⟩⟨Ψ_{Λ_y}, Ψ_{Λ_y}⟩⟨Ψ_{Λ_y}, Ψ_{Λ_x}⟩). (32)
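The Kronecker manipulations behind (32) rest on the mixed-product rule (E ⊗ F)(G ⊗ H) = (EG) ⊗ (FH). A quick NumPy check of this identity, with arbitrary illustrative sizes not taken from the paper:

```python
import numpy as np

# Mixed-product rule used to expand the Schur complement into (32):
# (E kron F)(G kron H) = (E G) kron (F H), whenever the products are defined.
rng = np.random.default_rng(1)
E = rng.standard_normal((2, 3)); G = rng.standard_normal((3, 2))
F = rng.standard_normal((4, 5)); H = rng.standard_normal((5, 4))
assert np.allclose(np.kron(E, F) @ np.kron(G, H), np.kron(E @ G, F @ H))
```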

Note that all the wavelet matrices appearing in (32) are singular, except those corresponding to C^T Q̃ C. In fact, the necessary variation of the initial values requires reserving at least one degree of freedom in each state variable x_i in excess of the test conditions. Moreover, in general, the last summand is also singular due to the singularity of C. To illustrate a typical situation, Fig. 2 exhibits the nonzero entries of S_Λ corresponding to the KKT system in Fig. 1.

Fig. 2. Schur complement S_Λ; see (30).

5. Preconditioning

A successful application of the simplified Uzawa scheme hinges on finding an efficient preconditioner. Thus, it is important to understand two characteristic features of the mapping L in (8) and its finite-dimensional approximation L_Λ in (15). Let σ_max(Λ) and σ_min(Λ) denote the largest and smallest singular values of L_Λ, respectively. Usually, one faces the following facts:

(I) σ_max(Λ) → ∞ as #Λ → ∞.
(II) It may happen that σ_min(Λ) is very small for all Λ.

Hence, the system (14) and its modifications are generally very ill-conditioned. Fact I is simply a consequence of the fact that the constraints involve the derivatives of the states. Thus, although according to (9) L is boundedly invertible as a mapping from Υ × M onto its dual, it does not give rise to a well-posed operator equation on L_2. On the other hand, its discrete approximation L_Λ acts on the wavelet coefficients, which by the Riesz basis property relate the Euclidean norm directly to the L_2 norm. Hence, the spectral norms of the operators L_Λ will grow as the dimension of the trial spaces increases. However, as we will see in the following, based upon norm equivalences for Sobolev spaces with properly-scaled wavelet coefficients [see (33), (34)], one can choose correspondingly scaled wavelets, leading to a discretization of L with a bounded spectral norm. This scaling can also be interpreted as a symmetric preconditioning of L_Λ, as we do in the context of iterative solvers.

The reason for Fact II is completely different. The original process model often has a low observability measure: large changes in x may correspond to only very small changes in y and, respectively, in z. This requires additional effort concerning preconditioning and its combination with a suitable choice of the regularization parameters Q, R_v, R_w. First approaches are given in Part 2 (Ref. 10).

To this end, recall from Ref. 20 that the piecewise linear wavelet bases used in Section 3 for the discretization of the states induce the following isomorphism between l_2 and H^1. There exist positive bounded constants c_3, c_4 such that, for any f ∈ H^1 whose wavelet expansion is

f = Σ_{j,k} d_{j,k} ψ_{j,k},

i.e.,

d_{j,k} = ⟨f, ψ̃_{j,k}⟩,

one has

c_3 ‖{2^j d_{j,k}}_{j,k}‖_{l_2} ≤ (‖f‖_{L_2}^2 + ‖f′‖_{L_2}^2)^{1/2} = ‖f‖_{H^1} ≤ c_4 ‖{2^j d_{j,k}}_{j,k}‖_{l_2}. (33)

By duality, for elements f ∈ (H^1)′ of the dual space and their dual wavelet expansions

f = Σ_{j,k} d̃_{j,k} ψ̃_{j,k},

the following norm equivalence holds:

c_5 ‖{2^{-j} d̃_{j,k}}_{j,k}‖_{l_2} ≤ ‖f‖_{(H^1)′} ≤ c_6 ‖{2^{-j} d̃_{j,k}}_{j,k}‖_{l_2}. (34)

Now, let D_Λ denote the diagonal matrix with diagonal entries 2^j for (j, k) ∈ Λ_x. We will indicate next that these norm equivalences imply bounds on the condition numbers of properly-scaled Schur complements; see Refs. 14, 26 for related discussions.
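As an illustration, this diagonal scaling is cheap to form from the level index j of each wavelet coefficient; the level vector below is a hypothetical example, not data from the paper:

```python
import numpy as np

# D_Lambda has diagonal entry 2^j for each index (j, k) in Lambda_x.
# `levels` is an illustrative vector listing the level j of every coefficient.
levels = np.array([0, 1, 1, 2, 2, 2, 2])
D = np.diag(2.0 ** levels)
D_inv = np.diag(2.0 ** (-levels))
# Scaling a matrix S as D^{-1} S D^{-1} only reweights its rows and columns:
S = np.eye(len(levels))
assert np.allclose(D_inv @ S @ D_inv, np.diag(4.0 ** (-levels)))
```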

Theorem 5.1. If the operator L is well-posed with respect to Υ × M, and provided the Galerkin scheme used is stable, then

cond_2(D_Λ^{-1} S_Λ D_Λ^{-1}) = O(1), (35)

independently of Λ.

Proof. The operator L [see (22)] maps (L_2^{n_y} × L_2^{n_w}) × (H^1)^{n_x} into its dual (L_2^{n_y} × L_2^{n_w}) × ((H^1)^{n_x})′. Given the fact that L is well-posed with respect to Υ × M, one concludes from (9) that there exist positive constants c_7, c_8 for L such that

c_7 (‖λ‖_{L_2^{n_y} × L_2^{n_w}}^2 + ‖x‖_{(H^1)^{n_x}}^2)^{1/2}
 ≤ ‖L(λ, x)^T‖_{(L_2^{n_y} × L_2^{n_w}) × ((H^1)^{n_x})′}
 ≤ c_8 (‖λ‖_{L_2^{n_y} × L_2^{n_w}}^2 + ‖x‖_{(H^1)^{n_x}}^2)^{1/2}. (36)


Now, expand the functions λ, x in the respective infinite-dimensional wavelet bases, namely,

λ = (d_ν^T Ψ, d_μ^T Ψ̃) and x = d̄_x^T D^{-1} Ψ.

Note that we have scaled the basis for the x-component so that, in our earlier notation,

d̄_x = D d_x.

Invoking the Riesz basis property (28) for the first two components and employing (33) for the third component, the latter scaling of Ψ ensures precisely that

c_3 (‖d_ν‖_{l_2}^2 + ‖d_μ‖_{l_2}^2 + ‖d̄_x‖_{l_2}^2)^{1/2}
 ≤ (‖λ‖_{L_2}^2 + ‖x‖_{H^1}^2)^{1/2}
 ≤ c_4 (‖d_ν‖_{l_2}^2 + ‖d_μ‖_{l_2}^2 + ‖d̄_x‖_{l_2}^2)^{1/2}. (37)

Moreover, since the norm equivalences (28) and (34) give rise to

(1/c_9) ‖L(λ, x)^T‖_{L_2^{n_y} × L_2^{n_w} × ((H^1)^{n_x})′}
 ≤ ‖[A, B^T D^{-1}; D^{-1} B, 0] ((d_ν^T, d_μ^T), d̄_x^T)^T‖_{l_2}
 ≤ (1/c_10) ‖L(λ, x)^T‖_{L_2^{n_y} × L_2^{n_w} × ((H^1)^{n_x})′} (38)

with

c_9 = max{c_2, c_6} and c_10 = min{c_1, c_5},

we conclude from (36)–(38) that

(c_3 c_7 / c_9)(‖d_ν‖_{l_2}^2 + ‖d_μ‖_{l_2}^2 + ‖d̄_x‖_{l_2}^2)^{1/2}
 ≤ ‖[A, B^T D^{-1}; D^{-1} B, 0] ((d_ν^T, d_μ^T), d̄_x^T)^T‖_{l_2}
 ≤ (c_4 c_8 / c_10)(‖d_ν‖_{l_2}^2 + ‖d_μ‖_{l_2}^2 + ‖d̄_x‖_{l_2}^2)^{1/2}. (39)

This in turn means that the infinite matrix

[A, B^T D^{-1}; D^{-1} B, 0]

as well as its inverse are bounded in the Euclidean metric. Next, recall that the relation (39) remains valid for truncated finite sequences d_ν, d_μ, d̄_x determined by some finite index set Λ and for the corresponding finite matrices

[A_Λ, B_Λ^T D_Λ^{-1}; D_Λ^{-1} B_Λ, 0],

uniformly in Λ, provided that the Galerkin scheme associated with these trial spaces is stable. Hence, the matrices remain uniformly bounded. Consequently, we can deduce that D_Λ^{-1} S_Λ D_Λ^{-1} is uniformly bounded, which in turn proves the theorem. ∎

In the present context, the Galerkin scheme is stable if and only if, for the first group of variables λ on the one hand and for the second group of variables x on the other hand, the trial spaces satisfy the so-called LBB condition; see Refs. 21, 23. The validity of the LBB condition can be ensured by keeping the refinement level for the first group of variables (i.e., the index sets for the Lagrange multipliers) in a suitable relation to that of the second group (e.g., as chosen here).

The quantitative behavior of the above scaling D^{-1} of the wavelet coefficients, and consequently of the corresponding preconditioner M = D^{-2} for the Uzawa algorithm, can still be improved. Note that (33) means that a properly-scaled wavelet basis is a Riesz basis for the Sobolev space H^1. Therefore, one expects that the Riesz constants c_1, . . . , c_6 (and hence the effect on preconditioning) in (28), (33), (34) can be improved by normalizing the wavelet basis in H^1 or, better yet, in the relevant energy space defined by the Schur complement. This suggests employing as a diagonal preconditioner the matrix M_Λ^J obtained by taking the inverses of the diagonal entries of the Schur complement. Thus, wavelet theory proves that the application of the Jacobi preconditioner M_Λ^J is sufficient for bounded condition numbers. Recall that we wish to avoid expensive matrix multiplications. Therefore, we consider the option of approximating the diagonal entries of the Schur complement, while avoiding its explicit computation. The inverse A_Λ^{-1} involved [see (25)] is substituted by the diagonals of (Q_Λ^{-1} + V_Λ R_{v,Λ}^{-1} V_Λ^T)^{-1} and (W_Λ R_{w,Λ}^{-1} W_Λ^T)^{-1}. This yields the approximate Jacobi preconditioner M_Λ^{aJ}, with

(M_Λ^{aJ})_{i,i′} = [Σ_s (Q_Λ^{-1} + V_Λ R_{v,Λ}^{-1} V_Λ^T)_{(s,s)}^{-1} C_{Λ,(s,i)}^2 + Σ_s (W_Λ R_{w,Λ}^{-1} W_Λ^T)_{(s,s)}^{-1} X_{Λ,(s,i)}^2]^{-1} δ_{i,i′}. (40)
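A dense sketch of (40): only the diagonals of the two inverses enter, so each entry of M_Λ^{aJ} is obtained from squared columns of C_Λ and X_Λ. Dense NumPy arrays here stand in for the sparse wavelet blocks, and the argument names w_q, w_w are labels chosen for this sketch:

```python
import numpy as np

def approximate_jacobi(C, X, w_q, w_w):
    """Approximate Jacobi preconditioner of (40).
    w_q, w_w: diagonals of (Q^-1 + V R_v^-1 V^T)^-1 and (W R_w^-1 W^T)^-1;
    C, X: the discretized output and state constraint blocks."""
    # (sum_s w_q[s] * C[s, i]**2 + sum_s w_w[s] * X[s, i]**2)^-1 per column i
    diag = (C ** 2).T @ w_q + (X ** 2).T @ w_w
    return np.diag(1.0 / diag)
```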


Recall from (25) that the computation of the above diagonal matrices amounts simply to replacing the wavelet finger block matrices in (25) by diagonal matrices. Only four very sparse vector multiplications are necessary per entry to determine M_Λ^{aJ}.

Other scaling strategies with the same qualitative behavior are conceivable. Nevertheless, the numerical studies in Ref. 10 reveal that the above approximate Jacobi preconditioner is clearly superior to the other tested diagonal preconditioners. One could argue for the application of more expensive preconditioners. Nevertheless, as long as an indefinite variant of the KKT system is considered, incomplete LU and other incomplete factorization preconditioners without pivoting are ruled out, because zero entries appear on the diagonal; in turn, pivoting would destroy symmetry. In principle, such techniques apply to the Schur complements S_Λ, but they require their explicit computation. The same is true for preconditioners based on a sparse approximate inverse. Additionally, the necessary scaling effect is not guaranteed.

6. Stopping Criteria and Final Nested Iteration Algorithm

In the nested iteration process (Ref. 21), recall that the intermediate systems have to be solved only within some accuracy tolerance corresponding to the current discretization error. Thus, when using the previous coarse-level solution as the initial guess for an iteration on the next higher discretization level, the current error has to be reduced only by a constant error reduction factor α, whose size will be discussed below. Moreover, a projection of the initial guess into the next finer trial space is not necessary, because one only has to set the additional wavelet coefficients to zero. Furthermore, the matrices of the previous level only have to be extended, not reassembled.

In view of (27), the maximal number of iterations maxit on each level needed to realize the error reduction rate α depends on the condition number of the preconditioned Schur complement. Since we have seen that the wavelet-based discretization allows us to keep the condition numbers

κ_2 = cond(M_Λ^{1/2} S_Λ M_Λ^{1/2})

uniformly bounded independently of the scale [see (35)], the maximal number of iterations maxit on each level required by a fixed error reduction

‖x − x_{Λ_l}‖_S ≤ α ‖x − x_{Λ_{l−1}}‖_S (41)

is independent of the discretization level. Hence, denoting by υ_{Λ_l} and υ_{Λ_l}^{(k)} the exact solution of the system (16) and the kth iterate, respectively, and recalling that these arrays are the wavelet coefficients of the respective states x_{Λ_l} and x_{Λ_l}^{(k)}, we infer from (29) that it suffices to choose the number of iterations maxit so as to guarantee that

‖υ_{Λ_l} − υ_{Λ_l}^{(k)}‖_{S_{Λ_l}} ≤ (α/c) ‖υ_{Λ_l} − υ_{Λ_l}^{(0)}‖_{S_{Λ_l}}, (42)

where υ_{Λ_l}^{(0)} is obtained by appending zero entries for Λ_l \ Λ_{l−1} to the approximate solution υ_{Λ_{l−1}} from the previous level. Now, there are two ways to guarantee the validity of (42). First, we can conclude from (29) and (42) that the number of iterations needed is at most

maxit ≈ log(2c/α) / {−log[(√κ_2 − 1)/(√κ_2 + 1)]}. (43)

The second possibility is to use the estimate via the l_2 norm of the residuals of the preconditioned system,

‖υ_{Λ_l} − υ_{Λ_l}^{(k)}‖_{S_{Λ_l}} / ‖υ_{Λ_l} − υ_{Λ_l}^{(0)}‖_{S_{Λ_l}} ≤ √κ_2 ‖M_Λ^{1/2} res_{Λ_l}^{(k)}‖_{l_2} / ‖M_Λ^{1/2} res_{Λ_l}^{(0)}‖_{l_2}, (44)

with

res_{Λ_l}^{(k)} = S_{Λ_l}(υ_{Λ_l} − υ_{Λ_l}^{(k)}).

Consequently, we obtain the stopping criterion

‖M_Λ^{1/2} res_{Λ_l}^{(k)}‖_{l_2} / ‖M_Λ^{1/2} res_{Λ_l}^{(0)}‖_{l_2} ≤ α/(c √κ_2) =: α_rel. (45)

In general, the exact error reduction factor α is not computable. However, in our particular situation, α can be estimated by the following considerations. Since piecewise linear functions are used for the discretization of the states, the discretization error in the L_2 norm for a stepsize h is at best of order h^2 for smooth functions. Therefore, the discrete problem has to be solved only within that accuracy. Due to the differentiation involved in B, the approximation rate with respect to the energy norm induced by the (infinite-dimensional) Schur complement operator

S = B A^{-1} B^T

is then only of order one. For example, if we progress from one scale j to the next higher scale j+1 (i.e., by mesh bisection), the discretization error is expected to decrease at most by a factor 1/4 in the L_2 norm and by a factor 1/2 in the energy norm.

Hence, we fix α for scalewise refinement to α = 1/2. Moreover, to determine α_rel and maxit, an upper bound for κ_2 has to be estimated. Choosing α_rel = 10^{-3} for all j would cover condition numbers κ_2 up to 1700. Nevertheless, to play safe, we set α_rel = 10^{-4}. Moreover, since maxit predicts the worst case, we mostly use maxit = 20 as a pragmatic choice, which rigorously covers only condition numbers up to around 100. This pragmatic choice is motivated by the typical superconvergent behavior of the PCG method (Refs. 27–28). This superconvergence has been observed in all our numerical experiments, due to the clustering of the eigenvalues of the preconditioned systems, with only a few small eigenvalues as outliers.
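These coverage claims are easy to reproduce from (43) and (45); in the quick check below, the worst-case constant c = 12 is taken from the bound on cond(⟨Ψ, Ψ⟩) quoted after (29), and the exact numbers are illustrative only:

```python
import math

def maxit_bound(kappa2, alpha=0.5, c=12.0):
    # (43): iterations needed for error reduction alpha at condition kappa2
    delta = (math.sqrt(kappa2) - 1.0) / (math.sqrt(kappa2) + 1.0)
    return math.log(2.0 * c / alpha) / (-math.log(delta))

def alpha_rel(kappa2, alpha=0.5, c=12.0):
    # (45): relative residual tolerance alpha / (c * sqrt(kappa2))
    return alpha / (c * math.sqrt(kappa2))

print(maxit_bound(100.0))   # just under 20: maxit = 20 covers kappa2 ~ 100
print(alpha_rel(1700.0))    # about 1e-3: alpha_rel = 1e-3 covers kappa2 ~ 1700
```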

The algorithm for solving a sequence of systems (21) can then be summarized as follows:

set l := l_0 and solve (21) up to machine precision;
set l := l + 1;
update the representation of the denoised signals and the control functions from the previous level;
extend the discretization matrices of the previous level to the current level;
use the approximate solution of level l − 1 as the initial guess υ_{Λ_l}^{(0)} ← υ_{Λ_{l−1}}^{(k)};
apply the approximate Jacobi preconditioner M_{Λ_l}^{aJ};
iterate the simplified Uzawa algorithm (k := k + 1) until either (a) or (b) is satisfied:
(a) the relative residual error in the l_2 norm is below α_rel;
(b) the maximal iteration number k = maxit is reached;
go to the first step.
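The loop above can be sketched as a driver routine; `assemble`, `extend_guess`, and `solve_uzawa` are hypothetical callbacks standing in for the matrix extension, the zero-padding of wavelet coefficients, and the preconditioned Uzawa iteration of Section 4:

```python
import numpy as np

def nested_iteration(scales, assemble, extend_guess, solve_uzawa,
                     alpha_rel=1e-4, maxit=20):
    """Nested iteration driver: solve the coarsest problem accurately,
    then reuse each solution as the initial guess on the next level."""
    S, rhs, M = assemble(scales[0])
    v = np.linalg.solve(S, rhs)          # level l0: up to machine precision
    for l in scales[1:]:
        S, rhs, M = assemble(l)          # matrices extended, not reassembled
        v = extend_guess(v, l)           # append zeros for new coefficients
        v = solve_uzawa(S, rhs, M, v, alpha_rel, maxit)
    return v
```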

We illustrate the performance of this scheme by the following simple example and uniform mesh bisection; we refer to Part 2 (Ref. 10) for further studies. We choose

n_x = n_y = 4.

Furthermore, we allow model uncertainties in all state equations (2), i.e., W = I, but not in the output equations (3). This case is realized in our numerical tests with

n_v = n_y, R_v = I, V = 0,

which forces v to vanish. The model matrices,

A = diag(Â, Â) with Â = [−1, −10; −1, −1], B = (1, 0, 0, 0)^T, C = I,

are normalized with respect to the L_2 norm using the reference solutions x and y. These reference solutions are determined by solving the model (2)–(3) with w = 0, x(0) = 1, and a piecewise linear control function u, oscillating between 10 and −10, with u(0) = 10 and stepsize 0.25. As regularization matrices, we choose Q = I as well as R_w = I.

Table 1. Values of cond(M_Λ^{1/2} S_Λ M_Λ^{1/2}).

Scale   M_Λ = I     M_Λ = M_Λ^{aJ}   maxit
1       5.88E+01    49.2             14
2       1.32E+02    55.5             15
3       4.46E+02    60.3             15
4       1.64E+03    63.3             16
5       6.33E+03    65.4             16
6       2.48E+04    67.0             16
7       9.87E+04    68.2             16
8       3.93E+05    69.0             17
9       1.57E+06    69.7             17

The condition numbers of the Schur complements and of their preconditioned variants are recorded in Table 1, along with the iteration numbers maxit according to (43) obtained for α = 1/2 and M_Λ = M_Λ^{aJ}. Obviously, the condition numbers cond(S_Λ) increase roughly by a factor of 4 in each refinement step. In contrast, one observes the predicted boundedness of the condition number cond(M_Λ^{1/2} S_Λ M_Λ^{1/2}).

Table 2 displays the following information on the above nested iteration algorithm: the number of iterations k on scale j; the error e_j = ‖d_{x,Λ_j}^k − d_x^{ref}‖_{l_2} between the wavelet coefficients of the obtained approximation and the reference solution; and the quotient e_{j−1}/e_j, where j denotes the current scale. We measure the error here in the l_2 norm instead of the energy norm. According to the above discussion, we expect a quotient of up to 4. Moreover, to provide an idea of the size of the problems, we also include the size dim of S_Λ, its nonzero entries nz(S_Λ), and the residuals ‖res‖_{l_2} = ‖S_Λ d_{x,Λ}^{appr} − rhs_Λ‖_{l_2} on each scale, where rhs_Λ denotes the right-hand side of (31). The last line gives the results for solving the system on scale 9 only, with 0 as the initial guess in the simplified Uzawa algorithm.

Table 2. Results corresponding to nested iteration.

Scale j   k    e_j          e_{j−1}/e_j   dim    nz(S_Λ)   ‖res‖_{l_2}
1         16   9.4525E−01                 12     70        6.0166E−15
2         17   8.8248E−01   1.07          20     230       4.0636E−04
3         17   2.4626E−01   3.58          36     730       4.9402E−03
4         17   6.6437E−02   3.70          68     2264      3.4785E−03
5         17   1.7124E−02   3.87          132    6776      2.6048E−03
6         17   4.3262E−03   3.95          260    18846     2.3966E−03
7         15   1.0811E−03   4.00          516    48556     2.2025E−03
8         14   2.6419E−04   4.09          1028   118454    2.5533E−03
9         10   6.3861E−05   4.13          2052   276688    2.1355E−03
9         34   5.8380E−05                 2052   276688    2.4164E−03

Due to the geometric progression of the problem sizes, we see that the computational cost of nested iteration up to scale 9 corresponds to 20 or 21 iterations of the simplified Uzawa algorithm applied to the system on scale 9 only. Perhaps more important than the expected higher efficiency of the nested iteration is the fact that it provides successively improved approximations at a much earlier stage. We can also see that it is not necessary to determine the exact solution on each scale: the best approximation error is closely attained although the residuals are still quite large, in full agreement with the size of the condition numbers in this example. The quotient e_{j−1}/e_j is as good as an exact solver could provide, producing meaningful results associated with the discretization accuracy.

7. Conclusions

In order to meet the online requirements of state estimation, i.e., to offer at any instant of time an approximate solution whose quality increases in time, we have proposed a nested iteration scheme based on a hierarchy of wavelet Galerkin discretizations. For each discretization level, we exploit extensively the structure of the underlying system of equations to derive a well-tailored iterative scheme, termed the simplified Uzawa algorithm. The number of iterations needed on each level to reduce the current error by only a fixed factor in the context of nested iteration depends on the condition numbers. These condition numbers can be kept bounded with the aid of wavelet concepts using an inexpensive diagonal preconditioner, namely an approximate Jacobi preconditioner. Hence, we can conclude that the presented iteration scheme provides at any instant of time an improved state estimate. Moreover, depending on the system-inherent features, the nested iteration process can be more efficient than solving the discrete problem directly at the highest level of resolution.

However, in spite of the accomplished asymptotic boundedness, large condition numbers may still arise due to system-inherent features, such as a poor observability measure, stiff differential equations, or inadequate regularization. These issues will be addressed in more detail in Part 2 for more challenging test examples. In particular, the interplay between preconditioning and regularization will be studied there.


The extension of the proposed algorithm to nonlinear systems with inequality constraints is possible. In particular, the linear systems arising when active-set methods are employed have the form (15), and the elimination process introduced above is not affected by the nonlinearity. However, although the condition numbers will still be bounded independently of the refinement, the quantitative bounds are not clear.

Moreover, we would like to mention that the hierarchy of problem discretizations can also be exploited for the purpose of regularization in connection with noisy data. On the one hand, regularization by discretization can be realized by determining an appropriate adaptively refined discretization level (Ref. 6). On the other hand, level-dependent regularization parameters can be incorporated. For example, Ref. 29 gives a related discussion of wavelet-accelerated Tikhonov regularization under the assumption of an a priori known noise level.

Last but not least, it should be noted that the numerical concepts carry over in a straightforward manner to the solution of real-time optimal control problems as they arise in model predictive control applications (Ref. 30).

References

1. MICHALSKA, H., and MAYNE, D. Q., Moving-Horizon Observers, IFAC Symposium NOLCOS'92, 1992.

2. MUSKE, K. R., and RAWLINGS, J. B., Nonlinear Receding-Horizon State Estimation, Methods of Model-Based Control, Edited by R. Berber, NATO ASI Series, Kluwer Press, Dordrecht, Netherlands, 1995.

3. ROBERTSON, D., LEE, J. H., and RAWLINGS, J. B., A Moving Horizon-Based Approach for Least-Squares Estimation, AIChE Journal, Vol. 42, pp. 2209–2223, 1996.

4. BINDER, T., BLANK, L., DAHMEN, W., and MARQUARDT, W., Toward Multiscale Dynamic and Data Reconciliation, Nonlinear Model-Based Process Control, Edited by R. Berber and C. Kravaris, NATO ASI Series, Kluwer Academic Publishers, Dordrecht, Netherlands, pp. 623–665, 1998.

5. HIRSCHHORN, R. M., Invertibility of Multivariable Nonlinear Control Systems, IEEE Transactions on Automatic Control, Vol. 24, pp. 855–865, 1979.

6. BINDER, T., BLANK, L., DAHMEN, W., and MARQUARDT, W., On the Regularization of Dynamic Data Reconciliation Problems, Journal of Process Control (to appear).

7. TIKHONOV, A., and ARSENIN, V., Solutions of Ill-Posed Problems, Wiley, New York, NY, 1977.

8. BIEGLER, L. T., Efficient Solution of Dynamic Optimization and NMPC Problems, Nonlinear Model Predictive Control, Edited by F. Allgower and A. Zheng, Birkhauser Verlag, Basel, Switzerland, pp. 219–243, 2000.

9. BOCK, H. G., DIEHL, M. M., LEINEWEBER, D. B., and SCHLODER, J. P., A Direct Multiple-Shooting Method for Real-Time Optimization of Nonlinear DAE Processes, Nonlinear Model Predictive Control, Edited by F. Allgower and A. Zheng, Birkhauser Verlag, Basel, Switzerland, pp. 245–267, 2000.

10. BINDER, T., BLANK, L., DAHMEN, W., and MARQUARDT, W., Iterative Algorithms for Multiscale State Estimation, Part 2: Numerical Investigations, Journal of Optimization Theory and Applications, Vol. 111, pp. 531–553, 2001.

11. BARD, J., Nonlinear Parameter Estimation, Academic Press, New York, NY, 1974.

12. KAILATH, T., Linear Systems, Prentice Hall, Englewood Cliffs, New Jersey, 1980.

13. COHEN, A., and MASSON, R., Adaptive Wavelet Methods for Second-Order Elliptic Problems: Preconditioning and Adaptivity, SIAM Journal on Scientific Computing, Vol. 21, pp. 1006–1026, 1999.

14. DAHMEN, W., Wavelet and Multiscale Methods for Operator Equations, Acta Numerica, Vol. 7, pp. 55–228, 1997.

15. CHENG, Z., and PILKEY, W. D., Wavelet-Based Limiting Performance Analysis of Mechanical Systems Subject to Transient Disturbances, Finite Elements in Analysis and Design, Vol. 33, pp. 233–245, 1999.

16. HSIAO, C. H., and WANG, W. J., Optimal Control of Time-Varying Systems via Haar Wavelets, Journal of Optimization Theory and Applications, Vol. 103, pp. 641–655, 1999.

17. ZHOU, D., CAI, W., and ZHANG, W., An Adaptive Wavelet Method for Nonlinear Circuit Simulation, IEEE Transactions on Circuits and Systems, I: Fundamental Theory and Applications, Vol. 46, pp. 931–938, 1999.

18. CHUI, C. K., An Introduction to Wavelets, Academic Press, Boston, Massachusetts, 1992.

19. DAUBECHIES, I., Ten Lectures on Wavelets, SIAM, Philadelphia, Pennsylvania, 1992.

20. DAHMEN, W., KUNOTH, A., and URBAN, K., Biorthogonal Spline Wavelets on the Interval: Stability and Moment Conditions, Applied and Computational Harmonic Analysis, Vol. 6, pp. 132–196, 1999.

21. BRAESS, D., Finite Elemente, Springer Verlag, Berlin, Germany, 1992.

22. BINDER, T., BLANK, L., DAHMEN, W., and MARQUARDT, W., An Adaptive Multiscale Method for Real-Time Moving-Horizon Optimization, Proceedings of the American Control Conference, Chicago, Illinois, 2000; Omnipress, Madison, Wisconsin, pp. 4234–4238, 2000.

23. BREZZI, F., and FORTIN, M., Mixed and Hybrid Finite Element Methods, Springer Verlag, Berlin, Germany, 1991.

24. GLOWINSKI, R., Numerical Methods for Nonlinear Variational Problems, Springer, New York, NY, 1984.

25. GOLUB, G. H., and VAN LOAN, C. F., Matrix Computations, Johns Hopkins Press, London, England, 1996.

26. DAHMEN, W., and KUNOTH, A., Multilevel Preconditioning, Numerische Mathematik, Vol. 63, pp. 315–344, 1992.

27. AXELSSON, O., Iterative Solution Methods, Cambridge University Press, New York, NY, 1994.

28. MEURANT, G., Computer Solution of Large Linear Systems, Elsevier Science, Amsterdam, Netherlands, 1999.

29. MAASS, P., and RIEDER, A., Wavelet-Accelerated Tikhonov–Phillips Regularization with Applications, Inverse Problems in Medical Imaging and Nondestructive Testing, Edited by H. W. Engl, A. K. Louis, and W. Rundell, Springer, Wien, Austria, pp. 134–158, 1997.

30. MORARI, M., and LEE, J. H., Model Predictive Control: Past, Present, and Future, Computers and Chemical Engineering, Vol. 23, pp. 667–682, 1999.