
JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS: Vol. 78, No. 3, SEPTEMBER 1993

Novel Approach to Accelerating Newton's Method for Sup-Norm Optimization Arising in $H^\infty$-Control$^1$

J. W. HELTON$^2$ AND O. MERINO$^3$

Communicated by D. Q. Mayne

Abstract. We present algorithms for solving general sup-norm minimization problems over spaces of analytic functions, such as those arising in $H^\infty$ control. We also give an analysis and some theory of these algorithms. Part of this is specific to analytic optimization, while part holds for general sup-norm optimization. In particular, we propose a Newton-type algorithm which actually uses very high-order terms. The novel feature is that the higher-order terms can be chosen in many ways while still maintaining a second-order convergence rate. A clever choice of higher-order terms then greatly reduces computation time. Conceivably, this technique can be modified to accelerate Newton algorithms in some other circumstances. Estimates of the order of convergence as well as results of numerical tests are also presented.

Key Words. Analytic disks, optimization, $H^\infty$-control, higher-order Newton methods.

1. Introduction

This article concerns numerical optimization in the supremum norm. We analyze the order of convergence of Newton and higher-order methods with more generality and detail than previous treatments (cf. Ref. 1). We use this analysis to propose a new class of algorithms for a particular category of problems called $H^\infty$-optimization problems. Indeed, this case

$^1$This work was partially supported by the Air Force Office of Scientific Research and the National Science Foundation.

$^2$Professor, Department of Mathematics, University of California at San Diego, La Jolla, California.

$^3$Visiting Assistant Professor, Department of Mathematics, Texas Tech University, Lubbock, Texas.


0022-3239/93/0900-0553$07.00/0 © 1993 Plenum Publishing Corporation


of $H^\infty$-optimization, a subject central to worst-case frequency-domain design in engineering (Refs. 2-3), motivated much of the study here, and the numerical experiments in this paper pertain to it. Before specializing to $H^\infty$-optimization, we mention a general implication of our paper.

Since our new numerical method is described late in the paper (see Section 3), we now indicate the new idea involved. Instead of being at all precise, we mention an analogy. In conventional Newton's method on $\mathbb{R}^n$, one has local second-order convergence. Newton's method is based on successive quadratic Taylor approximations to the objective function. If one were to take higher-order (e.g., 4th-order) approximations and actually solve the basic optimization problem for the 4th-order subproblem, then one could have an algorithm with 4th-order local convergence. Also, one guesses that many serious modifications of the third-order and fourth-order terms still yield a second-order algorithm. Consequently, if one has a way of modifying these terms so as to yield a fourth-order subproblem which is much easier to solve than the original quadratic problem, then the running time for each iteration is much reduced, while the order of convergence remains two.
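To make the analogy concrete, here is a minimal one-dimensional sketch (ours, not from the paper): we minimize $f(x) = \cosh x$ by repeatedly minimizing a fourth-order model whose first- and second-order terms are the true Taylor coefficients, but whose third- and fourth-order coefficients c3, c4 are chosen almost arbitrarily at every step. Because the model agrees with f to second order, the error still roughly squares from one iteration to the next.

```python
import numpy as np

def minimize_quartic_model(f1, f2, c3, c4):
    # Minimize m(h) = f1*h + (f2/2)*h**2 + c3*h**3 + c4*h**4 over h, with c4 > 0.
    # Stationary points solve m'(h) = f1 + f2*h + 3*c3*h**2 + 4*c4*h**3 = 0.
    roots = np.roots([4.0 * c4, 3.0 * c3, f2, f1])
    h = roots[np.abs(roots.imag) < 1e-9].real          # a real root always exists
    m = f1 * h + 0.5 * f2 * h**2 + c3 * h**3 + c4 * h**4
    return h[np.argmin(m)]                             # global minimizer of the model

x = 0.3                                                # start near the minimizer x* = 0
for k in range(6):
    c3, c4 = 2.0 * np.sin(7.0 * k + 1.0), 2.0          # "mangled" higher-order terms
    x += minimize_quartic_model(np.sinh(x), np.cosh(x), c3, c4)
    print(k, abs(x))                                   # error roughly squares asymptotically
```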

This description was just an analogy. We have a particular situation in $H^\infty$-optimization and a Newton-like method. A small class of subproblems (see the Nehari problem below) is easy for us to solve, while we cannot solve most natural subproblems. Thus, every subproblem must be converted to one of a special type. We find that a trick of the type described above results in a powerful class of methods. One wonders if there are other circumstances where this general principle might be applied.

1.1. Optimization over Spaces of Analytic Functions. While many of the results of this paper are presented at a high level of generality, the main motivation and focus of the results is the basic problem of $H^\infty$-optimization:

(OPT) Given a map $F$ from $T \times \mathbb{C}^M$ to $\mathbb{R}$, find $s^* \geq 0$ and $f^* \in E$ such that

$$s^* = \sup_\theta F\big(e^{i\theta}, f^*(e^{i\theta})\big) = \inf_{f \in E}\, \sup_\theta F\big(e^{i\theta}, f(e^{i\theta})\big).$$

Here, E is a subset of $H^\infty$, the space of $\mathbb{C}^M$-valued measurable functions on the unit circle T in $\mathbb{C}$ which continue analytically to bounded functions on the unit disk. Typically, $E = A^M$, where $A^M$ denotes those functions in $H^\infty$ which are continuous on the closed unit disk.

Computational efforts which could be applied directly to problem (OPT) reside almost entirely within the engineering community, as opposed to the several complex variables or numerical analysis communities. A partial list of groups is indicated by Refs. 3-8. It is the goal of this paper to propose


a natural class of algorithms and to analyze it to the extent that formal analysis can be expected to pay practical dividends quickly.

1.2. Engineering Motivation. Problem (OPT) is central to the design of a system where specifications are given in the frequency domain and stability is a key issue. Suppose that our objective is to design a system, part of which we are forced to use and part of which is designable; see Fig. 1.

The objective of the design is to find the admissible f which gives the best performance. Once one selects the design, the performance at frequency $\omega$ is $F(\omega, f(i\omega))$. The worst case is the frequency $\omega$ at which $\sup_\omega F(\omega, f(i\omega))$ occurs. One wants to minimize this over all admissible f.

The stipulation that the designable part of the circuit be stable amounts to requiring that f have no poles in the right half-plane (r.h.p.). This is exactly problem (OPT) for the r.h.p. (the problem stated above was based on the unit disk). Even when parts of the system other than the designable part are not in $H^\infty$, one can frequently reparametrize to get problem (OPT). Consequently, problem (OPT) arises in a large class of problems (cf. Ref. 2). We believe that this and kindred problems will be one of the main concerns in the next generation of $H^\infty$ control work.

In addition to engineering, problem (OPT) has recently been found to occur solidly in the field of several complex variables. We shall not describe this here, but we refer the reader to Ref. 9, which points out these connections.

1.3. Nehari Problem. A classical special case is described below.

$(H_{\mathrm{inf}})$ Given continuous scalar-valued functions $r_l, k_l$, $l = 1, \ldots, M$, with $r_l > 0$, find $h^* = (h_1^*, \ldots, h_M^*)$ in $A^M$ such that

$$\inf_{h \in A^M}\, \sup_\theta\, \sum_{l=1}^{M} r_l(e^{i\theta})\, \big|k_l(e^{i\theta}) - h_l(e^{i\theta})\big|^2 = \sup_\theta\, \sum_{l=1}^{M} r_l(e^{i\theta})\, \big|k_l(e^{i\theta}) - h_l^*(e^{i\theta})\big|^2.$$

Fig. 1. A system composed of a given part and a designable part f.


Theory tells us that solutions exist and are unique. Moreover, in the last decade a tremendous effort by several hundred workers in the control and associated mathematical communities has gone into solving this problem and its matrix-valued generalizations. Papers on this subject probably number nearly a thousand, so we refer the reader only to Ref. 5, which lists books and further references on the subject.

1.4. General Theory. The techniques used to develop a general theory of (OPT) come from classical complex analysis and several complex variables and do not fall within the repertoire of most engineers working in $H^\infty$-control. Known theoretical results on existence, uniqueness, and smoothness of solutions $f^*$ to (OPT) are surveyed in Ref. 10; a thorough study of optimality conditions is Ref. 5; and the issues of strong uniqueness for the (OPT) and Nehari problems are treated in this paper.

An important notion in sup-norm optimization is that of strong uniqueness. We maintain that, for infinite-dimensional problems, what we call strong directional uniqueness is even more important. We say that a solution $f^*$ to (OPT) for F is p-directionally unique provided that, given h, there is a constant $\alpha(h) > 0$ so that

$$\text{(pDU)}\qquad \sup_\theta F\big(e^{i\theta}, f^*(e^{i\theta}) + t h(e^{i\theta})\big) \geq s^* + |t|^p\, \alpha(h),$$

for all small real numbers t. Strong directional uniqueness is defined to be (pDU) with p = 1. The solution $f^*$ is locally strongly unique provided that we can take p = 1 and $\alpha(h) = \alpha_0\,\|h\|_{A^M}$, for some $\alpha_0 > 0$ and all h. Clearly, in an optimization problem on a finite-dimensional space, if $\alpha(h)$ is a continuous function, then strong directional uniqueness implies local strong uniqueness. For the proof, set

$$\alpha_0 = \min_{\|h\|_{\mathbb{R}^n} = 1} \alpha(h).$$
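In symbols (our gloss): $\alpha$ is continuous and positive on the compact unit sphere, so $\alpha_0 > 0$, and for $\|h\|_{\mathbb{R}^n} = 1$ and small t,

$$\sup_\theta F\big(e^{i\theta}, f^*(e^{i\theta}) + t h(e^{i\theta})\big) \geq s^* + |t|\,\alpha(h) \geq s^* + \alpha_0\,\|t h\|_{\mathbb{R}^n},$$

which is local strong uniqueness with constant $\alpha_0$.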

On infinite-dimensional spaces, the two definitions are not equivalent. For problem (OPT) over $A^M$, we shall prove the following theorem.

Theorem 1.1. Strong Uniqueness Theorem. Let F be of class $C^1$, and let $f^*$ in $A^M$ be a strict local solution to (OPT) such that

$$a(e^{i\theta}) = (\partial F/\partial z)\big(e^{i\theta}, f^*(e^{i\theta})\big)$$

is of class $C^1$ and does not vanish on T. Then:

(a) $f^*$ is not a locally strongly unique solution to (OPT), for any $M > 0$;
(b) if $M = 1$, then $f^*$ is a strongly directionally unique solution to (OPT);


(c) if $M > 1$ and at least two entries of a are not identically 0, then $f^*$ is not a strongly directionally unique solution to (OPT).

The proof of Theorem 1.1 appears in Section 3. The practical implications of this theorem will be presented in Section 3, but we summarize them now. For our algorithms, strong uniqueness implies second-order convergence, but strong directional uniqueness alone implies nothing. Practically speaking, however, we have run our code extensively and always found that strong directional uniqueness corresponds to second-order convergence, while when $M \geq 2$ we get only linear convergence. More is said to justify this in Section 3.

We finish this section by emphasizing the bang-bang principle that solutions $f^* \in A^M$ to (OPT) satisfy. In short, $F(e^{i\theta}, f^*(e^{i\theta}))$ is constant in $\theta$ (cf. Ref. 5). This turns out to be very important in our results concerning (OPT).

2. General Algorithm for Sup-Norm Optimization

Now, we turn to a description of algorithms and error estimates. Proofs are in Section 4. We operate at a high level of generality, which includes (OPT) as a special case. Let V be a normed vector space, and let $\gamma: V \to \mathscr{C}_N(X)$ be a twice continuously differentiable map of V into the space $\mathscr{C}_N(X)$ of continuous $\mathbb{R}^N$-valued functions on X, a compact metric space. Of course, $\mathscr{C}_N(X)$ has the norm

$$\|q\|_\infty = \sup_{x \in X} \|q(x)\|_{\mathbb{R}^N}.$$

The main problem that we study is now formulated.

(P) Given $\gamma$ as above, find $f^* \in V$ and $s^* \geq 0$ such that

$$s^* = \inf_{f \in V}\, \sup_{x \in X} \|\gamma(f)\|_{\mathbb{R}^N} = \sup_{x \in X} \|\gamma(f^*)\|_{\mathbb{R}^N}.$$

Note that, to simplify notation, we write $\gamma(f)$ for $\gamma(f)(x)$. Also, by $\|\cdot\|_{\mathbb{R}^N}$ and $\|\cdot\|_V$ we denote the norms on $\mathbb{R}^N$ and on V. Throughout this paper, we assume that solutions $f^*$ exist for the functions $\gamma$ considered, so that (P) always makes sense.

One key to analyzing this problem is the peak set $E(f)$ for $f \in V$,

$$E(f) = \{x \in X : \|\gamma(f)(x)\|_{\mathbb{R}^N} = \|\gamma(f)\|_\infty\},$$

and this plays a prominent role in our results.


We shall consider a class of Newton-like descent algorithms called higher-order algorithms. Given $f^k \in V$, we wish to update it to $f^{k+1}$. Consider the approximation

$$Q_{H^k}(h, f^k) = \gamma(f^k) + \gamma'(f^k)[h] + H^k[h, h] \tag{1}$$

to $\gamma(f^k + h)$ for small h. This agrees to first order with its Taylor expansion in h. Here, $H^k$ is an element of the space $\mathscr{H}$ of all continuous bilinear symmetric maps $H: V \times V \to \mathscr{C}_N(X)$.

The update $f^{k+1}$ is produced by the following steps at iteration k.

Step A1. Find $h^k \in V$ such that

$$\sup_{x \in X} \|Q_{H^k}(h^k, f^k)(x)\|_{\mathbb{R}^N} = \inf_{h \in V}\, \sup_{x \in X} \|Q_{H^k}(h, f^k)(x)\|_{\mathbb{R}^N}.$$

Step A2. Find $t \geq 0$ that minimizes $\sup_{x \in X} \|\gamma(f^k + t h^k)\|_{\mathbb{R}^N}$.

Step A3. Set $f^{k+1} = f^k + t h^k$.

The line search in Step A2 is included to improve performance. It is not hard to prove that, if the function

$$\Phi(h, x) = \|Q_{H^k}(h, f^k)(x)\|_{\mathbb{R}^N}$$

is convex in h for every $x \in X$, then

$$\sup_X \|\gamma(f^k + t h^k)\| < \sup_X \|\gamma(f^k)\|,$$

for all small enough $t > 0$. Thus, solutions $h^k$ of Step A1 give descent directions for the value of the supremum.
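As a schematic rendering (ours, not the authors' code) of one full iteration in a finite-dimensional discretization (V = $\mathbb{R}^n$, X a finite grid, N = 1), with user-supplied routines `gamma`, `dgamma` (the Jacobian of $\gamma$ at f), and `H` (returning $H^k[h, h]$ on the grid):

```python
import numpy as np
from scipy.optimize import minimize

def higher_order_iteration(f, gamma, dgamma, H):
    """One pass of Steps A1-A3 for problem (P) with N = 1 on a finite grid X.

    f        : current iterate, a float array of length n
    gamma(f) : array of gamma(f)(x) over the grid
    dgamma(f): Jacobian, shape (len(X), n)
    H(f, h)  : array of H^k[h, h](x) over the grid
    """
    g, J = gamma(f), dgamma(f)

    # Step A1: minimize sup_x |Q(h)(x)| over h (here by a generic derivative-free
    # method; the paper instead exploits special structure of the subproblem).
    def model_sup(h):
        return np.max(np.abs(g + J @ h + H(f, h)))
    h = minimize(model_sup, np.zeros_like(f), method="Nelder-Mead").x

    # Step A2: crude line search in t >= 0 on the true sup-norm objective.
    ts = np.linspace(0.0, 2.0, 41)
    t = ts[np.argmin([np.max(np.abs(gamma(f + t * h))) for t in ts])]

    # Step A3: update.
    return f + t * h
```

The generic solver in Step A1 above is only a stand-in; the whole point of Section 3 is to choose $H^k$ so that this subproblem has special, quickly solvable structure.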

In our analysis of higher-order algorithms, the line-search parameter in Step A2 will be assumed to be equal to 1, in order to obtain several local convergence estimates. This of course corresponds to a modification of the higher-order algorithms, but we have affirmed that t = 1 occurs in practice in many numerical experiments (when the space V consists of scalar-valued analytic functions). This is certainly the case in our application in Section 3, and it is consistent with the behavior of standard Newton algorithms for smooth function minimization on finite-dimensional spaces. However, this is surprising in that sup-norm optimization on $\mathbb{R}^n$ can yield step sizes not close to 1 (Maratos effect).

As we shall see in Section 3, $H^\infty$-optimization problems require special $Q_{H^k}$ for the subproblem of Step A1 to be solvable with known techniques. For example, $H^k = 0$ will not do. However, higher-order algorithms contain our numerical $H^\infty$-methods of Refs. 11 and 14, and significant improvements on them appear in Section 3.


The higher-order class of algorithms contains many existing algorithms as special cases. When $H^k \equiv 0$ for all k, $V = \mathbb{R}^n$, and $\mathscr{C}(X) = \mathbb{R}^m$, we obtain the Gauss-Newton algorithm, which is found in Watson (Ref. 1), who proves second-order convergence under weak hypotheses; Watson gives plenty of references. An article analyzing many different optimization algorithms is by Powell (Ref. 12); for example, he analyzes algorithms in which the basic subproblem of Step A1 is of the form

$$\text{(A1P)}\qquad \Big(\sup_{x \in X} \big|\gamma(f^k) + \gamma'(f^k)[h]\big|\Big) + \Lambda^k[h, h],$$

where X is a finite set and $\Lambda^k$ is a positive bilinear function. Our algorithms and these coincide, of course, when $H^k = 0 = \Lambda^k$; otherwise, they are different, and as we shall see, we have on occasion a different order-of-convergence theory. A very interesting approach to sup-norm optimization is in Murray and Overton's article (Ref. 13), which treats it as a constrained optimization problem and applies SQP. However, in our paradigm problem (OPT), all (infinitely many) constraints are active at the optimum, which suggests that serious modification of such techniques would be required.

The main theoretical results of this paper concern the order of convergence of higher-order algorithms.

Recall that $f^* \in V$ is a locally strongly unique solution to (P) if there exist $\alpha > 0$, $\eta > 0$ such that

$$\sup_{x \in X} \|\gamma(f^* + h)\|_{\mathbb{R}^N} \geq \sup_{x \in X} \|\gamma(f^*)\|_{\mathbb{R}^N} + \alpha\,\|h\|_V, \qquad \|h\|_V < \eta.$$

The following Theorems 2.1 and 2.2 are particular cases of Theorems 4.3 and 4.2, respectively.

Theorem 2.1. Let $\{H^k\} \subset \mathscr{H}$ be uniformly bounded. Let $f^1, f^2, f^3, \ldots$ be a sequence in V generated by a higher-order algorithm with full step $t_k = 1$ for all k. If $f^*$ is a locally strongly unique solution to (P), then there exist constants $\eta > 0$, $c_1 > 0$, $\epsilon \in [0, 1]$, such that, if $\|f^k - f^*\|_V < \eta$ for all k, or if $\|f^1 - f^*\|_V < \eta$ and V is finite-dimensional, then

$$\|f^{k+1} - f^*\|_V \leq c_1\,\|f^k - f^*\|_V^{2+\epsilon}, \qquad \forall k \in \mathbb{N}.$$

Moreover, if $H^k = \tfrac{1}{2}\gamma''(f^k)$, then one can take $\epsilon = 1$.

The $\dim V < \infty$ case of Theorem 2.1 is well known; for example, see Theorem 10.5 in Ref. 1. The constant $\epsilon$ depends on the sequence $\{H^k\}$, in a sense that will be made more explicit in Section 4. The hypothesis in Theorem 2.1 that all iterates $f^k$ are within a small distance of $f^*$ can be relaxed given good behavior of the $H^k$'s, as shown in Theorem 4.5.


Our main application is to suggest improved sup-norm algorithms. In practice, second-order convergence suffices, so Theorems 2.1 and 2.2 say that many ways of choosing $H^k$ give a good algorithm. This allows one to choose $H^k$ to reduce the computational effort in the subproblem of Step A1. Also, note that $H^k$ can depend on $f^k$, the current iterate.

As stated in the introduction, for our main example of optimization over the space A of scalar-valued continuous analytic functions on the unit disk, the hypothesis of Theorem 2.1 is in principle not true, as one can see in Theorem 1.1. However, in computer experiments we observed that, for scalar-valued as opposed to vector-valued functions, the conclusion of Theorem 2.1 holds; so Theorem 2.1 is a good predictor of what happens. Our advice in analyzing higher-order algorithms for scalar-valued functions in A is to assume strong uniqueness and obtain an estimate as in Theorem 2.1. Then, one expects this to predict what happens.

A result with weaker hypotheses and conclusions than Theorem 2.1 does actually hold in the examples of Section 3. While local strong uniqueness is not true for them, they do have the property that the peak set $E(f^*)$ equals all of X. Under this hypothesis, we obtain the convergence estimate in Theorem 2.2 below.

Theorem 2.2. Suppose that N = 1 and that $\gamma(f) \geq 0$ for all $f \in V$. Assume further that the sequence $\{H^k\} \subset \mathscr{H}$ is uniformly bounded, that

$$\gamma''(f^*)[h, h](x) \geq 0, \qquad \forall h \in V,\ \forall x \in X,$$

$$H^k[h, h](x) \geq 0, \qquad \forall h \in V,\ \forall x \in X,$$

and that $E(f^*) = X$, i.e., $\gamma(f^*)$ has constant value on X. Then, there exist constants $\eta > 0$, $c > 0$ such that, if $\|f^k - f^*\|_V < \eta$, then

$$\sup_{x \in X} \gamma'(f^k)[f^{k+1} - f^*] \leq c\,\|f^k - f^*\|_V^2. \tag{2}$$

Note that, under the hypothesis of Theorem 2.2, local strong uniqueness (see Lemma 4.4) implies the existence of a constant $\alpha > 0$ such that

$$\alpha\,\|h\|_V \leq \sup_{x \in X} \gamma'(f^k)[h], \qquad \forall h \in V,$$

whenever $\|f^k - f^*\|_V$ is small enough, so the left-hand side of (2) dominates $\|f^{k+1} - f^*\|_V$ and at least a second-order convergence rate is guaranteed.
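Explicitly, combining the last two displays (our restatement):

$$\alpha\,\|f^{k+1} - f^*\|_V \leq \sup_{x \in X} \gamma'(f^k)[f^{k+1} - f^*] \leq c\,\|f^k - f^*\|_V^2 \quad\Longrightarrow\quad \|f^{k+1} - f^*\|_V \leq (c/\alpha)\,\|f^k - f^*\|_V^2.$$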

Section 4 proves these estimates and theorems. Section 3 treats $H^\infty$-optimization and presents our accelerated Newton algorithms.


The paper focuses on order of convergence and does not address the issue of actual convergence of our algorithms. We feel that this is a much less important issue practically speaking. Indeed, for problem (OPT), experience shows that: (i) the algorithms do converge; (ii) the speed of convergence is predicted by Theorems 2.1, 2.2, and 3.1.

Since the (OPT) problem is infinite-dimensional, actually proving convergence would be an extremely formidable technical exercise and would depend on the details of how the infinite-dimensional space of analytic functions is approximated by a finite-dimensional one.

For finite-dimensional spaces, convergence of lower-order algorithms is proved in Ref. 1, provided the subproblem has a strongly unique solution. Our higher-order algorithms also converge in this special case. Even when strong uniqueness breaks down, Ref. 12 proves that, if its algorithm (with a restricted line search) converges to a point $f^*$, then $f^*$ satisfies

$$\sup_{x \in X} \|\gamma(f^*)\|_{\mathbb{R}^N} = \inf_{\|h\|_V \leq 1}\, \sup_{x \in X} \|\gamma(f^*) + \gamma'(f^*)[h]\|_{\mathbb{R}^N},$$

a necessary condition for optimality (see Ref. 1). Convergence is not discussed in Ref. 12, other than to say that it usually happens.

3. Optimization over Spaces of Analytic Functions

3.1. Algorithms for (OPT). The possibility of choosing the $H^k$ term to suit one's needs is exemplified well in $H^\infty$-optimization. In this case, the subproblem of Step A1 takes the following form:

$(S_{\mathrm{opt}})$ At step k of the higher-order algorithm, solve

$$\inf_{h \in A^M}\, \sup_\theta\, \Big\{ g^k + 2 \sum_l \mathrm{Re}(a_l h_l) + H^k[h, h] \Big\}$$

for h; here, $g^k(\cdot) = F(\cdot, f^k(\cdot))$ and $a_l(\cdot) = (\partial F/\partial z_l)(\cdot, f^k(\cdot))$.

Not all problems of this type are quasicircular. Indeed, the quasicircular problems are precisely the ones where $H^k$ has the form

$$\text{(QC)}\qquad H^k[h_1, h_2] = \mathrm{Re}\big(\bar h_1^{T} \Lambda\, h_2\big),$$

where $\Lambda$ is a continuous function on T whose values are self-adjoint, positive-definite $M \times M$ matrices. For problem (OPT) with $H^k$ like this, the subproblem in Step A1 gives a solution h that is a descent direction.

Recall from the introduction that quasicircular problems (OPT) have been studied extensively and successfully, so they are an excellent choice for


a subproblem which must be solved repeatedly. Henceforth, we shall use the fact that the Nehari problem, denoted $(H_{\mathrm{inf}})$ in the introduction, has a unique solution and is easily solved.

To use the $(H_{\mathrm{inf}})$ code on the optimization problem $(S_{\mathrm{opt}})$, with $H^k$ of the form

$$H^k[h, h](e^{i\theta}) = \sum_{l=1}^{M} t_l(e^{i\theta})\, |h_l(e^{i\theta})|^2,$$

we complete the square in $(S_{\mathrm{opt}})$ to get

$$(S'_{\mathrm{opt}})\qquad \inf_{h \in A^M}\, \sup_\theta\, \Big\{ g^k - \sum_{l=1}^{M} |a_l|^2/t_l + \sum_{l=1}^{M} t_l\, \big|\bar a_l/t_l + h_l\big|^2 \Big\}.$$

Observe that $e^{i\theta}$ has been suppressed in $(S'_{\mathrm{opt}})$ to simplify notation. Now, solving $(H_{\mathrm{inf}})$ does not immediately solve $(S_{\mathrm{opt}})$. To do this, we select a trial value $\gamma_m^k$, then find $\lambda$ defined by

$$\lambda = \inf_{h \in A^M}\, \sup_\theta\, \sum_{l=1}^{M} \frac{t_l}{\gamma_m^k - g^k + \sum_{j=1}^{M} |a_j|^2/t_j}\, \big|\bar a_l/t_l + h_l\big|^2,$$

and stop if $\lambda$ is close enough to one. Otherwise, we obtain an update $\gamma_{m+1}^k$ by increasing $\gamma_m^k$ if $\lambda > 1$, or decreasing $\gamma_m^k$ otherwise. Thus, our program has an inner loop which optimizes $\gamma^k$ as well as an outer loop which moves us from $f^k$ to $f^{k+1}$. In this way, one can solve $(S_{\mathrm{opt}})$ by repeatedly solving $(H_{\mathrm{inf}})$.
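Schematically, the inner loop is a scalar iteration on the trial value $\gamma$ (our sketch; `nehari_value(r, k)` stands for a hypothetical interface to the $(H_{\mathrm{inf}})$ code that returns the optimal value $\lambda$ and minimizer h of $\inf_h \sup_\theta \sum_l r_l |k_l - h_l|^2$):

```python
import numpy as np

def solve_subproblem(g, a, t, nehari_value, gamma_lo, gamma_hi, tol=1e-3):
    """Bisect on the trial optimal value gamma of (S_opt) until lambda ~ 1.

    g: array (nx,); a, t: arrays (M, nx), samples on a grid of the circle;
    t_l > 0 are the weights in H^k; [gamma_lo, gamma_hi] must bracket the optimum.
    """
    kdat = -np.conj(a) / t                     # Nehari data: k_l = -conj(a_l)/t_l
    c = np.sum(np.abs(a) ** 2 / t, axis=0)     # sum_l |a_l|^2/t_l, pointwise
    for _ in range(100):
        gamma = 0.5 * (gamma_lo + gamma_hi)
        r = t / (gamma - g + c)                # weights r_l(e^{i*theta})
        lam, h = nehari_value(r, kdat)
        if abs(lam - 1.0) < tol:
            return gamma, h                    # h solves (S_opt) to tolerance
        if lam > 1.0:
            gamma_lo = gamma                   # lambda > 1: raise the trial value
        else:
            gamma_hi = gamma
    raise RuntimeError("bisection on gamma did not converge")
```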

We tested the following choices for $H^k[h_1, h_2]$ with M = 1:

$$H^k[h_1, h_2] = 1 \cdot \mathrm{Re}(h_1 \bar h_2), \tag{3a}$$

$$H^k[h_1, h_2] = \partial^2 F/\partial z\, \partial \bar z(\cdot, f^k) \cdot \mathrm{Re}(h_1 \bar h_2), \tag{3b}$$

$$H^k[h_1, h_2] = \big[ |\partial F/\partial z(\cdot, f^k)|^2 / F(\cdot, f^k) \big] \cdot \mathrm{Re}(h_1 \bar h_2). \tag{3c}$$

While choice (3b) incorporates some information from the Hessian, choice (3a) is simpler and in some cases requires less computational effort than (3b) when solving $(S_{\mathrm{opt}})$. These choices were suggested in Ref. 2, and some properties of their performance are described and analyzed in Ref. 14. Finally, choice (3c) is the new method that we are emphasizing in this article. The main point is that, with choice (3c), $(S_{\mathrm{opt}})$ becomes

$$(S''_{\mathrm{opt}})\qquad \inf_{h \in A^M}\, \sup_\theta\, \sum_{l=1}^{M} \big(|a_l|^2/g^k\big)\, \big|g^k/a_l + h_l\big|^2.$$

This is a Nehari-type problem, directly solvable by the codes mentioned above; i.e., no inner iterations are needed, in contrast to (3a) and (3b). Thus, judicious selection of $H^k$ greatly simplifies the computation of $(S_{\mathrm{opt}})$.
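For M = 1, the algebra is a one-line completion of the square (our verification, with $g = g^k > 0$ and $a = a_1$):

$$g + 2\,\mathrm{Re}(a h) + \frac{|a|^2}{g}\,|h|^2 = \frac{|a|^2}{g}\left( \frac{g^2}{|a|^2} + \frac{2g}{|a|^2}\,\mathrm{Re}(a h) + |h|^2 \right) = \frac{|a|^2}{g}\,\Big| \frac{g}{a} + h \Big|^2,$$

so the constant term drops out entirely and the subproblem is already in Nehari form.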


3.2. Experimental Comparison of Algorithms. Recall that the peak set $E(f^*)$ defined in Section 2 equals all of T. Thus, the hypothesis of Theorem 2.2 holds for all algorithms in Section 3.1, and we know that they have the convergence property (2). While this is weaker than true second-order convergence as in Theorem 2.1, it is still a powerful conclusion.

As we saw, the hypotheses of Theorem 2.1 which guarantee true second-order convergence fail. However, the following numerical experiments give a good idea of what we observed in practice. They are the source of our contention that, in the context of $H^\infty$-optimization, strong directional uniqueness (p = 1) implies second-order convergence, while p = 2 implies first-order convergence.

In Tables 6-8, $f^*$ was computed by hand, so the exact value is used in the diagnostic $d^k$. In Tables 1-5, $f^*$ could not be calculated in closed form, so it was obtained in advance numerically on a larger (512-point) grid to high accuracy. Tests were conducted on a Sun 3-280 workstation in double-precision arithmetic.

Example 3.1. Here,

$$F(e^{i\theta}, z) = \big[\mathrm{Re}(1/e^{i\theta} + z)\big]^2 + \big(|e^{i\theta} - 0.5|/4\big)\, \big[\mathrm{Im}(1/e^{i\theta} + z)\big]^2.$$

For each $e^{i\theta}$, the level sets in z are ellipses in the complex plane with center $-e^{-i\theta}$.

Tables 1-3 correspond to higher-order algorithms with the three choices of Hessian term given by (3a), (3b), and (3c). In each case, our initial guess at $f^*$ is $f^0 = 0$.

The order of convergence in all computer runs in Tables 1-3 can be seen to be two, as can be read from the error column. Note that the computer time of the runs for Tables 1 and 2 is three to four times that of the run for Table 3. Observe also how quickly the line-search parameter becomes 1.
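For instance, fitting $e_{k+1} \approx e_k^p$ to Table 3's error column (a quick check we add here):

```python
import numpy as np

errs = np.array([0.3104, 0.324e-1, 0.705e-3, 0.135e-6, 0.456e-14])   # Table 3
print(np.log(errs[1:]) / np.log(errs[:-1]))   # ~[2.9, 2.1, 2.2, 2.1]: order ~ 2
```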

In Tables 1 and 2, the error tolerance for solving the subproblem is a small constant. To require extreme accuracy when solving the subproblem is wasteful in the earlier iterations. In Tables 4 and 5, computer runs are reported where the error tolerance for the subproblem is automatically updated so as to minimize unnecessary effort; this requires less accuracy in the earlier iterations.

Note that in Tables 4-5 the running time is decreased with respect to Tables 1-2, but not enough to compete with Table 3. Also note that the order of convergence is still 2. As a practical aside, we remark that the


Table 1. Results for Example 3.1, $H^k[h, h] = |h|^2$.

Iter   Inner iter   $\|f^k - f^*\|_\infty$   Time (sec)
0      --           0.3104                   --
1      12           0.282E-01                223.87
2      8            0.341E-03                169.39
3      5            0.461E-07                131.45
4      4            0.390E-15                83.11

Table 2. Results for Example 3.1, $H^k[h, h] = \partial^2 F/\partial z\,\partial\bar z(\cdot, f^k) \cdot |h|^2$.

Iter   Inner iter   $\|f^k - f^*\|_\infty$   Time (sec)
0      --           0.3104                   --
1      10           0.299E-01                183.03
2      7            0.315E-03                144.09
3      5            0.392E-07                108.73
4      4            0.763E-15                84.20

Table 3. Results for Example 3.1, $H^k[h, h] = [|\partial F/\partial z(\cdot, f^k)|^2 / F(\cdot, f^k)] \cdot |h|^2$.

Iter   Inner iter   $\|f^k - f^*\|_\infty$   Time (sec)
0      --           0.3104                   --
1      1            0.324E-01                40.34
2      1            0.705E-03                41.44
3      1            0.135E-06                41.68
4      1            0.456E-14                36.26

Table 4. Results for Example 3.1, $H^k[h, h] = |h|^2$.

Iter   Inner iter   $\|f^k - f^*\|_\infty$   Time (sec)
0      --           0.3104                   --
1      7            0.289E-01                145.70
2      3            0.137E-02                70.42
3      4            0.311E-06                94.86
4      3            0.162E-12                76.20
5      4            0.179E-15                80.58


Table 5. Results for Example 3.1, $H^k[h, h] = \partial^2 F/\partial z\,\partial\bar z(\cdot, f^k) \cdot |h|^2$.

Iter   Inner iter   $\|f^k - f^*\|_\infty$   Time (sec)
0      --           0.3104                   --
1      6            0.278E-01                124.10
2      3            0.102E-02                77.65
3      3            0.119E-05                78.95
4      3            0.145E-10                76.35
5      4            0.198E-15                77.20

Table 6. Results for Example 3.2, $H^k[h, h] = 1 \cdot |h|^2$.

Iter   Inner iter   $\|f^k - f^*\|_\infty$   Time (sec)
0      --           0.5527                   --
1      12           0.153E+00                930.21
2      9            0.153E-01                1033.52
3      8            0.222E-03                1028.16
4      7            0.579E-07                839.36
5      6            0.298E-13                642.09

Table 7. Results for Example 3.2, $H^k[h, h] = [|\partial F/\partial z(\cdot, f^k)|^2 / F(\cdot, f^k)] \cdot |h|^2$.

Iter   Inner iter   $\|f^k - f^*\|_\infty$   Time (sec)
0      --           0.5527                   --
1      1            0.153E+00                99.95
2      1            0.195E-01                83.05
3      1            0.405E-03                84.80
4      1            0.184E-06                82.66
5      1            0.570E-13                89.36

Table 8. Results for Example 3.2, $H^k[h, h] = |h|^2$.

Iter   Inner iter   $\|f^k - f^*\|_\infty$   Time (sec)
0      --           0.5527                   --
1      8            0.153E+00                651.82
2      4            0.150E-01                575.05
3      3            0.279E-03                534.57
4      3            0.893E-07                502.88
5      5            0.238E-13                580.21


running times reported here are slower than what one would accept in practice. A simple grid-refinement scheme speeds up this type of procedure tremendously. Grid refinement, however, is not the subject of this paper.

Example 3.2. Here,

$$F(e^{i\theta}, z) = \big|0.8 - (1/e^{i\theta} + z)^2\big|^2.$$

The level sets are lemniscates with center at $-e^{-i\theta}$. These are not convex sets.

For Example 3.2, algorithms (3b) and (3c) are identical, since the identity

$$\partial^2 F/\partial z\,\partial \bar z = |\partial F/\partial z|^2 / F$$

holds. Consequently, we present only two tables (Tables 6-7). Algorithm (3b) with no inner iterations (Table 7) was 10 times as fast as algorithm (3a) in Table 6.
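This can be checked directly (our verification): writing $F = |G|^2 = G \bar G$ with G analytic in z,

$$\frac{\partial F}{\partial z} = G'\,\bar G, \qquad \frac{\partial^2 F}{\partial z\, \partial \bar z} = G'\,\overline{G'} = |G'|^2 = \frac{|G' \bar G|^2}{|G|^2} = \frac{|\partial F/\partial z|^2}{F},$$

and Example 3.2 has exactly this form.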

Again, we tried to improve the run times for algorithm (3a) by using the adaptive stopping tolerance (5). This is reported in Table 8 and produces a modest improvement over Table 6 (65 percent). Thus, the algorithm with no inner loops is still a factor-of-6 improvement over this version of (3a).

3.3. Strong Directional Uniqueness vs. Order of Convergence. Strong uniqueness is an essential hypothesis for Theorem 2.1, but Theorem 1.1 says that it does not hold in $H^\infty$-optimization.

Proof of Theorem 1.1.

(a) It is enough to do the case M = 1, since M > 1 follows from (c). As we shall see in Lemma 4.1, local strong uniqueness of $f^*$ is equivalent to the existence of constants $\eta > 0$, $\alpha > 0$ so that

$$\alpha\,\|h\|_\infty \leq \sup_\theta \mathrm{Re}(a \cdot h),$$

for all h satisfying $\|h\|_\infty < \eta$. For each $l \in \mathbb{N}$, let $f_l$ be the one-to-one analytic map from the closed unit disk onto the region $R_l$ in the complex plane bounded by the ellipse with vertices at $\pm i$, $\pm 1/l$, normalized by $f_l(0) = 0$, $f_l'(0) > 0$. If n is the winding number of a about 0, Theorem 2.1 in Ref. 15 says that $n > 0$. Pick continuous, periodic, real-valued functions u, v such that

$$a(e^{i\theta}) = (e^{i\theta})^n\, e^{u(\theta) + iv(\theta)} = (e^{i\theta})^n\, e^{u(\theta) + v^*(\theta)} \cdot e^{-v^*(\theta) + iv(\theta)}.$$


Here, $v^*$ denotes the harmonic conjugate of v, so the function

$$b(e^{i\theta}) = e^{-v^*(\theta) + iv(\theta)}$$

can be extended to a function which is analytic and nonzero on the closed unit disk, hence invertible there. Now, let

$$h_l(z) = f_l(z^n) / \big(l \cdot z^n \cdot b(z)\big),$$

where $h_l$ is defined by continuity at 0. Then, for $z = e^{i\theta}$, we have

$$\sup_\theta\, 2\,\mathrm{Re}(a h_l)/\|h_l\|_\infty = \sup_\theta\, 2\,\mathrm{Re}\big(e^{u + v^*} f_l(z^n)\big) \Big/ \sup_\theta \big|f_l(z^n)/b(z)\big| \leq \big(\|b\|_\infty / \|f_l\|_\infty\big)\, \sup_\theta\, 2 e^{u + v^*}\, \mathrm{Re}\big(f_l(z^n)\big) \leq 2\,\|b\|_\infty\, \|e^{u + v^*}\|_\infty / l.$$

Hence,

$$\sup_\theta\, 2\,\mathrm{Re}(a h_l)/\|h_l\|_\infty \to 0, \qquad \text{as } l \to \infty,$$

so no $\alpha > 0$ as above can exist; this proves (a).

(b) If $h \in A^1 \setminus \{0\}$, then, by Lemma 2.9 in Ref. 5, we have that

$$q := \sup_{e^{i\theta} \in T} \mathrm{Re}\big\{a(e^{i\theta})\, h(e^{i\theta})\big\} > 0.$$

Therefore,

$$\sup_{e^{i\theta} \in T} F\big(e^{i\theta}, f^*(e^{i\theta}) + t h(e^{i\theta})\big) = \sup_{e^{i\theta} \in T}\big\{F(e^{i\theta}, f^*(e^{i\theta})) + 2t\,\mathrm{Re}\{a(e^{i\theta})\, h(e^{i\theta})\} + t^2\, O(|h(e^{i\theta})|^2)\big\} = s^* + 2tq + O(t^2), \tag{4}$$

so $f^*$ is an SDU solution of (OPT). Here, SDU stands for strong directional uniqueness.

(c) Since M > 1, by Theorem 2.11 in Ref. 5, the set

$$\mathcal{N}(a) := \{h \in A^M : a(e^{i\theta}) \cdot h(e^{i\theta}) = 0,\ \forall e^{i\theta} \in T\}$$

contains nonzero elements. Using an expansion like the one in (4), we obtain, for $h \in \mathcal{N}(a)$,

$$F\big(e^{i\theta}, f^*(e^{i\theta}) + t h(e^{i\theta})\big) = s^* + O(t^2).$$

Thus, $f^*$ is not an SDU solution of (OPT). □


Our numerical experiments have convinced us that, in practice, the conclusion of Theorem 2.1 is valid when M = 1. We suspect that the reason is that certain pathologies seldom occur, and we now try to formulate this idea. Any computer program must replace the infinite-dimensional space $A^M$ with some parameter set $\mathfrak{A}$ contained in $A^M$. For example, our program does the simplest thing and takes $\mathfrak{A}$ to be M copies of the space $P_m$ of polynomials of degree $\leq m$. Often, $m \approx 64$.

We say that a cone $\mathfrak{A}$ is a set of SDU directions provided that SDU holds when the set of admissible directions h is restricted to $\mathfrak{A}$.

Theorem 3.1. Under the hypotheses of Theorem 1.1, the following results hold:

(a) If M = 1, then $P_m$ is a set of SDU directions for all $m \in \mathbb{N}$.
(b) If M > 1, then $P_m$ is not a set of SDU directions for all large enough $m \in \mathbb{N}$.

Proof. Suppose M > 1. Let $a(\cdot)$ be as in Theorem 1.1. It is enough to consider the case where $a(\cdot)$ has rational entries, since the set of such functions is dense in the set of all continuous functions. Thus, let $a_l = p_l/q_l$, where $p_l, q_l$ are analytic polynomials. Pick two entries of a which are not identically zero (say, $a_1, a_2$), and set

$$h = (q_1 p_2,\ -q_2 p_1,\ 0, \ldots, 0).$$

For m large enough, $h \in P_m \times \cdots \times P_m$, and also

$$a(e^{i\theta}) \cdot h(e^{i\theta}) = 0, \qquad \forall e^{i\theta} \in T.$$

This proves (b). To prove (a), we claim that it is enough to prove that $\varphi: P_m \to \mathbb{R}$, defined by

$$\varphi(h) = \sup_\theta \mathrm{Re}(a h),$$

is a norm. This is so because all norms on $P_m$ are equivalent, and one can then proceed as in the proof of part (b) of Theorem 1.1. We show first that $\varphi$ is nonnegative. If $h \in P_m$ does not have zeros on T, we have from Ref. 5 that

$$\mathrm{wind}(a h;\ 0) \geq \mathrm{wind}(a;\ 0) > 0,$$

which implies that, for this h, $\varphi(h) > 0$. Here, $\mathrm{wind}(q; 0)$ is the winding number of a complex-valued function q with respect to 0. Since polyno-


mials are continuous functions of their zeros, it follows that $\varphi(h) \geq 0$ for all $h \in P_m$.

If now $h \in P_m$ is such that $\varphi(h) = 0$, then h has zeros $\alpha_1, \ldots, \alpha_l$ on T, for some $l \leq m$. Consider the polynomial

$$p(z) = \sum_{j=1}^{l} c_j \prod_{k = 1, \ldots, l,\ k \neq j} (z - \alpha_k),$$

where the constant $c_j$ is chosen so that

$$\mathrm{Re}\big[a(\alpha_j)\, p(\alpha_j)\big] < 0.$$

Now, for every $\varepsilon > 0$,

$$h + \varepsilon p \in P_m$$

and

$$\mathrm{Re}\big[a(\alpha_j)\big(h(\alpha_j) + \varepsilon p(\alpha_j)\big)\big] < 0, \qquad j = 1, \ldots, l.$$

Let $K \subset T$ be a compact neighborhood of $\{\alpha_1, \ldots, \alpha_l\}$ and $c_0 > 0$ be such that

$$\mathrm{Re}\big(a(z)\, p(z)\big) < 0, \qquad \text{for every } z \in T \cap K,$$

and pick $\varepsilon_0 > 0$ so small that

$$\mathrm{Re}\big(a\,(h + \varepsilon_0 p)\big) < -c_0, \qquad \text{on } T \setminus K.$$

Hence,

$$\sup_\theta\, \mathrm{Re}\big(a (h + \varepsilon_0 p)\big) = \max\Big\{ \sup_{z \in K} \mathrm{Re}\big(a(h + \varepsilon_0 p)\big),\ \sup_{z \in T \setminus K} \mathrm{Re}\big(a(h + \varepsilon_0 p)\big) \Big\} < 0,$$

and this contradicts the fact that $\varphi$ is nonnegative. Hence, $h = 0$ and $\varphi$ is a norm. □

4. Estimates of Order of Convergence

This section proves the theorems stated in Section 2 on the order of convergence of higher-order algorithms and also gives additional results. Our first result gives a basic relationship between the behavior of the kth subproblem in a higher-order algorithm and the error at the kth iterate.


Recall that the objective function in the kth subproblem is denoted by $Q_{H^k}(h, f^k)$ and that the kth subproblem is Step A1 in Section 2. We need to define acceptable behavior of the $H^k$'s. As we shall see, very little restriction is required. Given $\gamma$ as before, $f^* \in V$, $\epsilon \in [0, 1]$, and sequences $\{f^k\} \subset V$, $\{H^k\} \subset \mathscr{H}$, we say that $\{H^k\}$ is $\epsilon$-convergent with respect to $\{f^k\}$ and $f^*$ if there exists a constant $c_0 > 0$ such that

$$\sup_{x \in X} \big\|H^k[h_1, h_2] - (1/2)\,\gamma''(f^k)[h_1, h_2]\big\|_{\mathbb{R}^N} \leq c_0\,\|f^k - f^*\|_V^\epsilon\, \|h_1\|_V\, \|h_2\|_V, \qquad \forall h_1, h_2 \in V.$$
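In particular (our remark), the exact choice $H^k = \tfrac{1}{2}\gamma''(f^k)$ makes the left-hand side vanish identically,

$$H^k = \tfrac{1}{2}\,\gamma''(f^k) \;\Longrightarrow\; \sup_{x \in X} \big\| H^k[h_1, h_2] - (1/2)\,\gamma''(f^k)[h_1, h_2] \big\|_{\mathbb{R}^N} = 0,$$

so this sequence is $\epsilon$-convergent for every $\epsilon \in [0, 1]$; this is the choice invoked in the last sentence of Theorem 2.1.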

Note that, when $\epsilon = 0$, this means that $\{H^k\}$ is a uniformly bounded sequence in $\mathscr{H}$. Our first error estimate is given by the following theorem.

Theorem 4.1. Let $f^* \in V$, and let $f^1, f^2, \ldots$ be a sequence generated by a full-step higher-order algorithm in which the $H^k$'s are $\epsilon$-convergent. There exist positive constants c, $\eta$ such that

$$\sup_{x \in X} \|Q_{H^k}(f^{k+1} - f^k, f^k)\|_{\mathbb{R}^N} \leq \sup_{x \in X} \|\gamma(f^*)\|_{\mathbb{R}^N} + c\,\|f^k - f^*\|_V^{2+\epsilon}, \tag{5}$$

whenever $\|f^k - f^*\|_V < \eta$.

Proof. Expand $\gamma(f^k)$ in (1) about $f^*$ to obtain

$$Q_{H^k}(f - f^k, f^k) = \gamma(f^*) + \gamma'(f^*)[f^k - f^*] + (1/2)\,\gamma''(f^*)[f^k - f^*, f^k - f^*] + \gamma'(f^k)[f - f^k] + H^k[f - f^k, f - f^k] + G_1(f^k),$$

where $G_1: V \to \mathscr{C}_N(X)$ is such that

$$\sup_{x \in X} \|G_1(f^k)\|_{\mathbb{R}^N} \leq c_1\,\|f^k - f^*\|_V^3,$$

if $\|f^k - f^*\|_V$ is small enough. We get

$$Q_{H^k}(f - f^k, f^k) = \gamma(f^*) + \gamma'(f^k)[f - f^*] + \gamma'(f^*)[f^k - f^*] - \gamma'(f^k)[f^k - f^*] + (1/2)\,\gamma''(f^*)[f^k - f^*, f^k - f^*] + H^k[f - f^k, f - f^k] + G_1(f^k).$$


Since

$$\gamma'(f^*)[f^k - f^*] - \gamma'(f^k)[f^k - f^*] = -\gamma''(f^*)[f^k - f^*, f^k - f^*] + G_2(f^k),$$

for some function $G_2: V \to \mathscr{C}_N(X)$ such that

$$\sup_{x \in X} \|G_2(f^k)\|_{\mathbb{R}^N} \leq c_2\,\|f^k - f^*\|_V^3,$$

we can write

$$Q_{H^k}(f - f^k, f^k) = \gamma(f^*) + \gamma'(f^k)[f - f^*] + H^k[f - f^k, f - f^k] - (1/2)\,\gamma''(f^*)[f^k - f^*, f^k - f^*] + G_1(f^k) + G_2(f^k). \tag{6}$$

Set $f = f^*$ in (6); then, for some constant $c > 0$,

$$\sup_{x \in X} \|Q_{H^k}(f^* - f^k, f^k)\|_{\mathbb{R}^N} \leq \sup_{x \in X} \|\gamma(f^*)\|_{\mathbb{R}^N} + c\,\|f^k - f^*\|_V^{2+\epsilon}. \tag{7}$$

Also, by the definition of $f^{k+1}$,

$$\sup_{x \in X} \|Q_{H^k}(f^{k+1} - f^k, f^k)\|_{\mathbb{R}^N} \leq \sup_{x \in X} \|Q_{H^k}(f^* - f^k, f^k)\|_{\mathbb{R}^N}. \tag{8}$$

The result follows from (7) and (8). □

The following Theorem 4.2 applies to the situation described in Section 3 ($H^\infty$-optimization). A small modification of its proof yields Theorem 2.2.

Theorem 4.2. Under the hypothesis of Theorem 4.1, suppose that N = 1, $\gamma(f)(x) > 0$ for all $f \in V$ and all $x \in X$, and that the peak set $E(f^*)$ is all of X. Then, there exist constants $\eta > 0$, $c > 0$ such that, if k is such that $\|f^k - f^*\|_V < \eta$, then

$$\sup_{x \in X} \big\{ \gamma'(f^k)[f^{k+1} - f^*] + H^k[f^{k+1} - f^*,\ f^{k+1} - 2f^k + f^*] \big\} \leq c\,\|f^k - f^*\|_V^{2+\epsilon}. \tag{9}$$

Proof. From Theorem 4.1 and the definition of $f^{k+1}$, there exists a positive constant $c_1$ such that

$$\sup_{x \in X} Q_{H^k}(f^{k+1} - f^k, f^k) \leq \sup_{x \in X} \gamma(f^*) + c_1\,\|f^k - f^*\|_V^{2+\epsilon},$$


whenever $\|f^k - f^*\|_V$ is small enough. Now, (6) implies that, for some constant $c_2 > 0$ and $\|f^k - f^*\|_V$ small,

$$\sup_{x \in X} \big\{ \gamma(f^*) + \gamma'(f^k)[f^{k+1} - f^*] + H^k[f^{k+1} - f^k, f^{k+1} - f^k] - (1/2)\,\gamma''(f^*)[f^k - f^*, f^k - f^*] \big\} \leq \sup_{x \in X} \gamma(f^*) + c_2\,\|f^k - f^*\|_V^{2+\epsilon}. \tag{10}$$

Note that $\gamma(f^*)$ cancels on both sides of (10), since it is constant on X. On the other hand, since $H^k$ is bilinear and symmetric,

$$H^k[f^{k+1} - f^k, f^{k+1} - f^k] = H^k[f^{k+1} - f^*,\ f^{k+1} - 2f^k + f^*] + H^k[f^k - f^*, f^k - f^*]. \tag{11}$$
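Indeed, (11) is just the symmetric bilinear expansion (our restatement), with $A = f^{k+1} - f^*$ and $B = f^k - f^*$:

$$H^k[A - B, A - B] = H^k[A, A] - 2 H^k[A, B] + H^k[B, B] = H^k[A, A - 2B] + H^k[B, B].$$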

Finally, (10), (11), and the $\epsilon$-convergence of the $H^k$'s give the result. □

If the sequence $\{f^k\}$ in Theorem 4.1 remains in a small enough neighborhood of $f^*$, Theorem 4.3 asserts that it has to converge at a rate which is at least of order 2. Of course, this is not known a priori.

Theorem 4.3. Let $\{f^k\}$, $\{H^k\}$, and $\epsilon$ be as in Theorem 4.1, and suppose that $f^*$ is a locally strongly unique solution to (P). Then, there exist $\eta_0 > 0$, $c > 0$ such that, if $\|f^k - f^*\|_V < \eta_0$ for all k, then

$$\|f^{k+1} - f^*\|_V \leq c\,\|f^k - f^*\|_V^{2+\epsilon}.$$

Note that, in Theorem 4.3, we can take $\epsilon = 0$, a fact that we use in Theorem 2.1.

Before proving Theorem 4.3, we need a lemma.

Lemma 4.1. Let $f^* \in V$. The following three statements are equivalent:

(I) There exists $\alpha > 0$ such that

$$\sup_{x \in E(f^*)} \langle \gamma(f^*), \gamma'(f^*)[h] \rangle \geq \alpha\,\|h\|_V, \qquad \forall h \in V. \tag{12}$$

(II) There exist $\alpha > 0$, $\eta > 0$ such that

$$\sup_{x \in E(f^*)} \|\gamma(f^*) + \gamma'(f^*)[h]\|_{\mathbb{R}^N} \geq \sup_{x \in X} \|\gamma(f^*)\|_{\mathbb{R}^N} + \alpha\,\|h\|_V, \qquad \forall h,\ \|h\|_V < \eta.$$


(III) There exist $\alpha > 0$, $\eta > 0$ such that

$$\sup_{x \in E(f^*)} \|\gamma(f^* + h)\|_{\mathbb{R}^N} \geq \sup_{x \in X} \|\gamma(f^*)\|_{\mathbb{R}^N} + \alpha\,\|h\|_V, \qquad \forall h,\ \|h\|_V < \eta.$$

These statements are equivalent to statements (I′), (II′), (III′), obtained from (I), (II), (III) by taking the suprema over the set X instead of the set $E(f^*)$; i.e., all of them are different formulations of the local strong uniqueness of $f^*$. If N = 1 and $\gamma(f) > 0$ for all $f \in V$, then (I) is also equivalent to the following statement:

(IV) There exists $\alpha > 0$ such that

$$\sup_{x \in E(f^*)} \gamma'(f^*)[h] \geq \alpha\,\|h\|_V, \qquad \forall h \in V.$$

Proof of Theorem 4.3. Following steps similar to the ones in the proof of Theorem 4.2, we see that there exist constants $c_1, \eta_1 > 0$ such that, whenever $0 < \eta < \eta_1$ and $\|f^k - f^*\|_V < \eta$,

$$\sup_{x \in E(f^*)} \big\| \gamma(f^*) + \gamma'(f^k)[f^{k+1} - f^*] + H^k[f^{k+1} - f^*,\ f^{k+1} - 2f^k + f^*] \big\|_{\mathbb{R}^N} \leq \sup_{x \in X} \|\gamma(f^*)\|_{\mathbb{R}^N} + c_1\,\|f^k - f^*\|_V^{2+\epsilon}. \tag{13}$$

Note that, if $\|f^{k+1} - f^*\|_V < \eta$ also holds, then

$$\sup_{x \in E(f^*)} \big\| \gamma'(f^k)[f^{k+1} - f^*] - \gamma'(f^*)[f^{k+1} - f^*] + H^k[f^{k+1} - f^*,\ f^{k+1} - 2f^k + f^*] \big\|_{\mathbb{R}^N}$$
$$\leq c_2\,\eta\,\|f^{k+1} - f^*\|_V + \big(\|f^{k+1} - f^*\|_V + 2\eta\big)\,\|f^{k+1} - f^*\|_V \leq \eta\,(c_2 + 3)\,\|f^{k+1} - f^*\|_V. \tag{14}$$

Here, $c_2 > 0$ is a constant such that

$$\sup_{x \in X} \|\gamma'(f)[h] - \gamma'(f^*)[h]\|_{\mathbb{R}^N} \leq c_2\,\|h\|_V\, \|f - f^*\|_V,$$

for all $h \in V$ and all f in a neighborhood of $f^*$ of radius $\eta$. Let $\alpha$ be the constant in (II) of Lemma 4.1, and pick $\eta$ so small that $\eta\,(c_2 + 3) < \alpha/2$. Then, from (13) and (14),

$$\sup_{x \in E(f^*)} \|\gamma(f^*) + \gamma'(f^*)[f^{k+1} - f^*]\|_{\mathbb{R}^N} - (\alpha/2)\,\|f^{k+1} - f^*\|_V \leq \sup_{x \in X} \|\gamma(f^*)\|_{\mathbb{R}^N} + c_1\,\|f^k - f^*\|_V^{2+\epsilon}.$$


Now, Lemma 4.1 implies that

$$\sup_{x \in X} \|\gamma(f^*)\|_{\mathbb{R}^N} + (\alpha/2)\,\|f^{k+1} - f^*\|_V \leq \sup_{x \in X} \|\gamma(f^*)\|_{\mathbb{R}^N} + c_1\,\|f^k - f^*\|_V^{2+\epsilon},$$

and the theorem follows. □

Proof of Lemma 4.1. Let

$$s^* = \sup_{x \in X} \|\gamma(f^*)\|_{\mathbb{R}^N},$$

and assume (I). Let $\alpha_1 = 2\alpha/3s^*$, and find $\eta > 0$ so small that

$$\alpha_1^2\,\|h\|_V^2 + \sup_{x \in X} \|\gamma'(f^*)[h]\|_{\mathbb{R}^N}^2 \leq \alpha_1 s^*\,\|h\|_V, \qquad \|h\|_V < \eta.$$

Then,

$$2 s^* \alpha_1\,\|h\|_V + \alpha_1^2\,\|h\|_V^2 + \sup_{x \in E(f^*)} \|\gamma'(f^*)[h]\|_{\mathbb{R}^N}^2 \leq 3\alpha_1 s^*\,\|h\|_V \leq 2\alpha\,\|h\|_V \leq 2 \sup_{x \in E(f^*)} \langle \gamma(f^*), \gamma'(f^*)[h] \rangle.$$

Hence,

$$(s^* + \alpha_1\,\|h\|_V)^2 \leq s^{*2} + \sup_{x \in E(f^*)} 2\langle \gamma(f^*), \gamma'(f^*)[h] \rangle \leq \sup_{x \in E(f^*)} \big\{ s^{*2} + 2\langle \gamma(f^*), \gamma'(f^*)[h] \rangle + \|\gamma'(f^*)[h]\|_{\mathbb{R}^N}^2 \big\} = \sup_{x \in E(f^*)} \|\gamma(f^*) + \gamma'(f^*)[h]\|_{\mathbb{R}^N}^2,$$

and this proves (II).

Let $\alpha$, $\eta$ be as in (II), and take $\eta_1 < \eta$ so small that

$$\sup_{x \in E(f^*)} \|e(h)\|_{\mathbb{R}^N} < (\alpha/2)\,\|h\|_V, \qquad \|h\|_V < \eta_1,$$

where by definition

$$e(h) = \gamma(f^* + h) - \gamma(f^*) - \gamma'(f^*)[h].$$

Then (III) follows, with constant $\eta_1$ as above and $\alpha$ replaced by $\alpha/2$. Note that a similar proof can be used to show that (III) implies (II).

Also, the same proofs with only trivial changes show that (I′) implies (II′) and that (II′) is equivalent to (III′). To finish the proof of the


lemma, except for (IV), it is enough to show that (II′) implies (I) and that (II) implies (I′). Suppose that (II) holds. Then, if $\|h\|_V < \eta$,

$$(s^* + \alpha\,\|h\|_V)^2 \leq \sup_{x \in E(f^*)} \|\gamma(f^*) + \gamma'(f^*)[h]\|_{\mathbb{R}^N}^2 \leq s^{*2} + \sup_{x \in X} 2\langle \gamma(f^*), \gamma'(f^*)[h] \rangle + \sup_{x \in X} \|\gamma'(f^*)[h]\|_{\mathbb{R}^N}^2.$$

Hence,

$$2 s^* \alpha\,\|h\|_V + \alpha^2\,\|h\|_V^2 \leq \sup_{x \in X} 2\langle \gamma(f^*), \gamma'(f^*)[h] \rangle + \sup_{x \in X} \|\gamma'(f^*)[h]\|_{\mathbb{R}^N}^2.$$

Now, pick $\eta_1 < \eta$ such that, if $\|h\|_V < \eta_1$, then

$$\Big| \alpha^2\,\|h\|_V^2 - \sup_{x \in X} \|\gamma'(f^*)[h]\|_{\mathbb{R}^N}^2 \Big| < s^* \alpha\,\|h\|_V,$$

so that in this case one has

$$s^* \alpha\,\|h\|_V \leq \sup_{x \in X} 2\langle \gamma(f^*), \gamma'(f^*)[h] \rangle;$$

i.e., (I′) holds with constants $\eta_1$ and $\alpha_1 = s^*\alpha/2$. Finally, assume (II′), and fix $h \in V$ with $\|h\|_V < \eta$. Then, for each $n \geq 1$,

$$\sup_{x \in X} \|\gamma(f^*) + \gamma'(f^*)[(1/n)h]\|_{\mathbb{R}^N} \geq s^* + (\alpha/n)\,\|h\|_V. \tag{15}$$

For each $n \in \mathbb{N}$, let $x_n \in X$ be such that the supremum in (15) is attained at $x_n$; i.e.,

$$\|\gamma(f^*)(x_n) + \gamma'(f^*)[(1/n)h](x_n)\|_{\mathbb{R}^N} \geq s^* + (\alpha/n)\,\|h\|_V. \tag{16}$$

Passing to a subsequence, if necessary, we may assume that the sequence $\{x_n\}$ converges to an element $x^*$ in X. Letting $n \to \infty$ in (16), we see that $\|\gamma(f^*)(x^*)\| \geq s^*$; and since this inequality holds trivially in the other direction, we obtain that $x^* \in E(f^*)$. Also, squaring both sides of Inequality (16), we get

$$2\langle \gamma(f^*)(x_n), \gamma'(f^*)[h](x_n) \rangle + (1/n)\,\|\gamma'(f^*)[h](x_n)\|_{\mathbb{R}^N}^2 \geq 2\alpha s^*\,\|h\|_V + (\alpha^2/n)\,\|h\|_V^2, \tag{17}$$

and we let $n \to \infty$ in (17) to get

$$\langle \gamma(f^*)(x^*), \gamma'(f^*)[h](x^*) \rangle \geq \alpha s^*\,\|h\|_V. \tag{18}$$


Now, h can be rescaled to obtain (I), since both sides of (18) are homogeneous in h. Also note that, under the hypothesis of (IV), statement (I) is just another formulation of (IV). □

Theorem 4.4. Let $\{f^k\}$, $\{H^k\}$, and $\epsilon > 0$ be as in Theorem 4.1. Suppose that $f^*$ satisfies the following strengthened form of local strong uniqueness: there exists $\alpha > 0$ such that, for all $h \in V$ and all k,

$$\sup_{x \in X} \|\gamma(f^*) + \gamma'(f^*)[h] + H^k[h, h]\|_{\mathbb{R}^N} \geq \sup_{x \in X} \|\gamma(f^*)\|_{\mathbb{R}^N} + \alpha\,\|h\|_V. \tag{19}$$

Then, there exist constants $\eta > 0$, $c > 0$ such that, if k is such that $\|f^k - f^*\|_V < \eta$, then

$$\|f^{k+1} - f^*\|_V \leq c\,\|f^k - f^*\|_V^{2+\epsilon}. \tag{20}$$

Proof. From (13), for some constants $c_1, \eta_1$, whenever $\|f^k - f^*\|_V < \eta_1$, we have

$$\sup_{x \in X} \|\gamma(f^*) + \gamma'(f^*)[f^{k+1} - f^*] + H^k[f^{k+1} - f^*, f^{k+1} - f^*]\|_{\mathbb{R}^N}$$
$$-\ \sup_{x \in X} \|\gamma'(f^k)[f^{k+1} - f^*] - \gamma'(f^*)[f^{k+1} - f^*] - 2 H^k[f^{k+1} - f^*, f^k - f^*]\|_{\mathbb{R}^N}$$
$$\leq \sup_{x \in X} \|\gamma(f^*)\|_{\mathbb{R}^N} + c_1\,\|f^k - f^*\|_V^{2+\epsilon}.$$

Now, (19), the continuity of $\gamma'$ in a neighborhood of $f^*$, and the uniform boundedness of the sequence $\{H^k\}$ give

$$\sup_{x \in X} \|\gamma(f^*)\|_{\mathbb{R}^N} + \alpha\,\|f^{k+1} - f^*\|_V - c_2\,\|f^{k+1} - f^*\|_V\, \|f^k - f^*\|_V \leq \sup_{x \in X} \|\gamma(f^*)\|_{\mathbb{R}^N} + c_1\,\|f^k - f^*\|_V^{2+\epsilon}, \tag{21}$$

whenever $\|f^k - f^*\|_V$ is smaller than some constant, say $\eta_1 > 0$. It is clear now that, if

$$\|f^k - f^*\|_V < \eta = \min\{\eta_1,\ \alpha/2c_2\},$$

Inequality (20) follows from (21). □

Inequality (19) is more restrictive than local strong uniqueness, since the term $\sup_{x \in X} \|H^k[h, h]\|_{\mathbb{R}^N}$ is small with respect to $\|h\|_V$ when the latter is small. Our next result gives a sufficient condition for (19) to hold.


Theorem 4.5. Let $f^* \in V$, $H \in \mathscr{H}$, and $s^* = \sup_{x \in X} \|\gamma(f^*)\|_{\mathbb{R}^N}$. Assume that there exists $\alpha > 0$ such that, for every $h \in V$, there is $x_0 \in E(f^*)$ with

(i) $\langle \gamma(f^*)(x_0), \gamma'(f^*)[h](x_0) \rangle \geq \alpha s^*\,\|h\|_V$;
(ii) $\langle \gamma(f^*)(x_0), H[h, h](x_0) \rangle \geq (1/2)\,\alpha^2\,\|h\|_V^2$.

Then (19) holds.

Proof. Pick any $h \in V$, and let $x_0 \in E(f^*)$ satisfy both (i) and (ii). Then,

$$\Big( \sup_{x \in X} \|\gamma(f^*) + \gamma'(f^*)[h] + H[h, h]\|_{\mathbb{R}^N} \Big)^2 \geq \|\gamma(f^*)(x_0) + \gamma'(f^*)[h](x_0) + H[h, h](x_0)\|^2$$
$$= s^{*2} + 2\langle \gamma(f^*)(x_0), \gamma'(f^*)[h](x_0) + H[h, h](x_0) \rangle + \|\gamma'(f^*)[h](x_0) + H[h, h](x_0)\|_{\mathbb{R}^N}^2$$
$$\geq s^{*2} + 2\alpha s^*\,\|h\|_V + \alpha^2\,\|h\|_V^2 = (s^* + \alpha\,\|h\|_V)^2. \qquad \Box$$

Note that (i) in Theorem 4.5 states precisely the fact that $f^*$ is a locally strongly unique solution to (P). Condition (ii) in the same theorem is a positivity-type condition, and in particular it implies that $H[h, h]$ cannot be zero unless $h = 0$.

References

1. WATSON, G. A., Approximation Theory and Numerical Methods, Wiley, New York, New York, 1980.

2. HELTON, J. W., Worst Case Analysis in the Frequency Domain: An $H^\infty$ Approach to Control, IEEE Transactions on Automatic Control, Vol. 30, pp. 1154-1170, 1985.

3. DOYLE, J., The $\mu$-Analysis and Synthesis Toolbox for Use with Matlab (to appear).

4. FAN, M., KONINCKX, J., TITS, A., and WANG, L., Documentation for Console: A CAD Tandem for Optimization-Based Design Interacting with Arbitrary Simulators, Report, Systems Research Center, University of Maryland, 1992.

5. HELTON, J. W., and MERINO, O., Conditions for Optimality over $H^\infty$, SIAM Journal on Control and Optimization (to appear).

6. MAYNE, D. Q., NYE, W. T., POLAK, E., and WU, T., DELIGHT.MIMO: An Interactive, Optimization-Based Multivariable Control System Design Package, Computer-Aided Control Systems Engineering, Edited by M. Jamshidi and C. J. Herget, North-Holland, Amsterdam, Holland, 1985.


7. STREIT, R., Solution of Systems of Complex Linear Equations in the $L^\infty$ Norm with Constraints on the Unknowns, SIAM Journal on Scientific and Statistical Computing, Vol. 7, pp. 132-149, 1986.

8. SIDERIS, T., Robust Feedback Synthesis via Conformal Mappings and $H^\infty$ Optimization, PhD Thesis, University of Southern California, 1985.

9. HELTON, J. W., Optimal Frequency Domain Design vs. an Area of Several Complex Variables, Plenary Address, International Symposium on Mathematical Theory of Networks and Systems, 1989.

10. HELTON, J. W., and MARSHALL, D. E., Frequency Domain Design and Analytic Selections, Indiana University Mathematics Journal, Vol. 39, pp. 157-184, 1990.

11. BENCE, J., HELTON, J. W., and MERINO, O., Worst Case Analysis in the Frequency Domain, Part 2 (manuscript).

12. POWELL, M. J. D., General Algorithms for Discrete Nonlinear Approximation Calculations, Approximation Theory IV, Edited by C. K. Chui, L. L. Schumaker, and J. D. Ward, Academic Press, New York, New York, 1983.

13. MURRAY, W., and OVERTON, M. L., A Projected Lagrangian Algorithm for Nonlinear Minimax Optimization, SIAM Journal on Scientific and Statistical Computing, Vol. 1, pp. 345-370, 1980.

14. BENCE, J., HELTON, J. W., and MARSHALL, D. E., Optimization over $H^\infty$, Proceedings of the Conference on Decision and Control, Athens, Greece, 1986.

15. HELTON, J. W., Optimization over Spaces of Analytic Functions and the Corona Problem, Journal of Operator Theory, Vol. 15, pp. 359-375, 1986.