IIASA COLLABORATIVE PROCEEDINGS SERIES
PROGRESS IN NONDIFFERENTIABLE OPTIMIZATION
IIASA COLLABORATIVE PROCEEDINGS SERIES
LARGE-SCALE LINEAR PROGRAMMING Proceedings of an IIASA Workshop G.B. Dantzig, M.A.H. Dempster, and M.J. Kallio, Editors
THE SHINKANSEN PROGRAM: TRANSPORTATION, RAILWAY, ENVIRONMENTAL, REGIONAL, AND NATIONAL DEVELOPMENT ISSUES A. Straszak, Editor
HUMAN SETTLEMENT SYSTEMS: SPATIAL PATTERNS AND TRENDS Selected Papers from an IIASA Conference T. Kawashima and P. Korcelli, Editors
RISK: A SEMINAR SERIES H. Kunreuther, Editor
THE OPERATION OF MULTIPLE RESERVOIR SYSTEMS Proceedings of an International Workshop, Jodlowy Dwor, Poland Z. Kaczmarek and J. Kindler, Editors
NONPOINT NITRATE POLLUTION OF MUNICIPAL WATER SUPPLY SOURCES: ISSUES OF ANALYSIS AND CONTROL Proceedings of an IIASA Task Force Meeting K.-H. Zwirnmann, Editor
MODELING AGRICULTURAL-ENVIRONMENTAL PROCESSES IN CROP PRODUCTION Proceedings of an IIASA Task Force Meeting G. Golubev and I. Shvytov, Editors
LIQUEFIED ENERGY GASES FACILITY SITING: INTERNATIONAL COMPARISONS H. Kunreuther, J. Linnerooth, and R. Starnes, Editors
ENVIRONMENTAL ASPECTS IN GLOBAL MODELING Proceedings of the 7th IIASA Symposium on Global Modeling G. Bruckmann, Editor
PROGRESS IN NONDIFFERENTIABLE OPTIMIZATION E.A. Nurminski, Editor
PROGRESS IN NONDIFFERENTIABLE OPTIMIZATION
E.A. Nurminski, Editor
INTERNATIONAL INSTITUTE FOR APPLIED SYSTEMS ANALYSIS Laxenburg, Austria
1982
International Standard Book Number 3-7045-0050-X
Collaborative papers in this Special series sometimes report work done at the International Institute for Applied Systems Analysis and sometimes work done elsewhere. They are reviewed at IIASA, but receive only limited external review, and are issued after limited editorial attention. The views or opinions they express do not necessarily represent those of the Institute, its National Member Organizations, or other organizations supporting the work.
Copyright © 1982 International Institute for Applied Systems Analysis
All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher.
PREFACE
The System and Decision Sciences Area plays a dual role at the International Institute for Applied Systems Analysis (IIASA), both providing methodological assistance to other research groups and carrying out fundamental research on new methods and models for use in applied systems analysis. The Optimization Task of the System and Decision Sciences (SDS) Area contributes in both of these fields by helping to solve optimization problems arising in applied areas and also by providing an international forum for large-scale and dynamic linear programming and for nondifferentiable optimization. In the second of these capacities, SDS sponsors annual task-force meetings on various aspects of optimization, bringing together research workers from both East and West to discuss advances in methodology and implementation.

This volume grew out of the second meeting on nondifferentiable optimization, a field whose most important applications lie in treating problems of decision-making under uncertainty. Many important advances were made between the first meeting in 1977 and the second in 1978: new results were obtained in the theory of optimality conditions, and there was more understanding of the relationships between various classes of nondifferentiable functions. All of these new developments were discussed
at the meeting, the reports presented by the participants covering the theory of generalized differentiability, optimality conditions, and the numerical testing and applications of algorithms.
After the meeting the participants prepared extended versions of their contributions; these revised papers form the core of this volume, which also contains a bibliography of over 300 references to published work on nondifferentiable optimization, prepared by the Editor.

It is hoped that this volume will be of use to those already working in nondifferentiable optimization and will stimulate the interest of those currently unfamiliar with this new and rapidly expanding field.
CONTENTS
Introduction
The Editor

Methods of nondifferentiable and stochastic optimization and their applications
Yu.M. Ermoliev

Acceleration in the relaxation method for linear inequalities and subgradient optimization
J.L. Goffin

Numerical experiments in nonsmooth optimization
C. Lemaréchal

Convergence of a modification of Lemaréchal's algorithm for nonsmooth optimization
R. Mifflin

Subgradient method for minimizing weakly convex functions and ε-subgradient methods of convex optimization
E.A. Nurminski

Favorable classes of Lipschitz-continuous functions in subgradient optimization
R.T. Rockafellar

Nondifferentiable functions in hierarchical control problems
A. Ruszczyński

Lagrangian functions and nondifferentiable optimization
A. Wierzbicki

Bibliography on nondifferentiable optimization
E.A. Nurminski
INTRODUCTION
E. Nurminski (Editor)
International Institute for Applied Systems Analysis, Laxenburg, Austria
IIASA's interest in nondifferentiable optimization (NDO) is based on the great practical value of NDO techniques. This new field of mathematical programming provides specialists in applied areas with tools for solving non-traditional problems arising in their work and with new approaches and ideas for treating traditional problems. Nondifferentiable optimization is concerned with the new type of optimal decision problems which have objectives and constraints resulting from the behavior of different complex subsystems, the solutions of auxiliary extremum problems, and so on. A common feature of these problems is that the objectives and constraints inevitably have poor analytical properties.

Good analytical properties are essential both for performing comprehensive theoretical analysis and for producing efficient computational methods which are acceptable in practice. The most important of these analytical properties are the existence and continuity of derivatives of various orders.

Unfortunately, derivatives are very sensitive to manipulation: many standard operations and representations used in economics or operations research destroy the property of differentiability.
As an example, consider the piecewise method of representing the response function for different ranges of variables. This type of approximation often has a discontinuity in the first-order or higher-order derivatives at the boundary between consecutive intervals.
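This phenomenon is easy to exhibit numerically. A minimal sketch (the response function, its breakpoint, and its slopes are illustrative assumptions, not taken from the text): the piecewise-linear fit is continuous, but one-sided difference quotients reveal the jump in the first derivative at the boundary between the two ranges.

```python
# A piecewise-linear "response function" is continuous, but its first
# derivative jumps at the boundary between ranges.
# Breakpoint (x = 1) and slopes (2.0 and 0.5) are illustrative assumptions.

def response(x):
    """Piecewise-linear approximation with a breakpoint at x = 1."""
    return 2.0 * x if x <= 1.0 else 2.0 + 0.5 * (x - 1.0)

def one_sided_derivs(x, h=1e-6):
    """Left and right difference approximations of the derivative at x."""
    left = (response(x) - response(x - h)) / h
    right = (response(x + h) - response(x)) / h
    return left, right

left, right = one_sided_derivs(1.0)
print(round(left, 3), round(right, 3))   # the two slopes disagree at x = 1
```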
The incorporation of uncertainty in environmental parameters or system characteristics by means of the minimax principle provides another example. In this case, the resulting criteria and constraints will almost certainly have discontinuous derivatives, regardless of how well-behaved the initial equations may have been.

Many of the procedures used in multiobjective optimization create an auxiliary nondifferentiable problem in the search for a compromise solution, and game-theoretical approaches, equilibrium formulations and decomposition are also important sources of nondifferentiable problems.

The absence of derivatives leads to many theoretical difficulties and numerous practical failures in solving certain problems in operations research and systems analysis. The lack of continuous derivatives makes it very difficult to predict with a good degree of accuracy the effect of small changes in control variables, and this hinders the performance of many numerical algorithms.

These, then, were the main motivations for the study of nondifferentiable functions, that is, functions for which derivatives do not exist in the traditional sense of the word. The problem was approached from many angles: these included the study of generalized differentiability and properties of generalized derivatives, the analysis of extremum problems and optimality conditions, and the development of computer algorithms.

The first IIASA meeting on nondifferentiable optimization was concerned mainly with the development of algorithms. It summarized past developments in both East and West, outlined the fields of application, provided test examples, and gave a comprehensive bibliography compiled by participants and other contributors throughout the world.
Since then many important results have been obtained in the general theory of differentiability, and these and other results are discussed in the eight papers contained in this volume.

In the first paper, Yu.M. Ermoliev considers the fundamental connection between nondifferentiable and stochastic optimization, and the various questions raised by this relationship.

J.L. Goffin's paper is concerned with acceleration in the relaxation method for linear inequalities. This is closely related to acceleration in subgradient optimization procedures, and it is shown that there are certain features in the performance of these methods which are not explained by current theory. Experiments conducted by the author show that subgradient optimization techniques perform better than existing theory would predict on the basis of the worst-case estimates. Using arguments taken from the theory of successive over-relaxation, rates of convergence are shown to improve in selected experiments.

The paper by C. Lemaréchal is devoted to numerical experiments in which various types of algorithms are applied to a number of test problems. The algorithms considered range from those which have a good performance in smooth cases to the robust ellipsoid algorithm.

The paper contributed by R. Mifflin describes an algorithm for the minimization of certain semismooth functions defined by the author. These functions are quite general and are likely to cover most practical applications. The algorithm combines the idea of the cutting plane method with a quadratic term which the author suggests as a second-order approximation of the Lagrangian associated with the optimal multipliers of the subproblems.

A number of algorithms for convex optimization are based on the idea of ε-subgradients, which have certain remarkable theoretical properties. The paper by E. Nurminski discusses general aspects of the use of ε-subgradients in nondifferentiable convex optimization.

The latest results in the theory of optimality conditions and differentiability were presented at the meeting by
R.T. Rockafellar. His paper on this subject deals with the refinement of the properties of generalized gradients for functions which satisfy regularity requirements in addition to the Lipschitz condition. Additional properties of this type make it possible to establish useful connections between the directional differentiability of nondifferentiable functions and the monotonicity properties of subdifferential mappings.

The links between nondifferentiable optimization and structured decision-making problems are considered in the paper by A. Ruszczyński. A two-stage decision problem is shown to give rise to nondifferentiable problems with specific types of nondifferentiability for which simple subgradient-type algorithms are proposed. An important feature of this approach is that it also allows random factors to be included in the formulation of the problem, and this makes it more realistic in terms of applications.

A. Wierzbicki discusses the theoretical and computational possibilities connected with the use of augmented Lagrangian functions in the last paper of this volume.
METHODS OF NONDIFFERENTIABLE AND STOCHASTIC OPTIMIZATION AND THEIR APPLICATIONS
Yu.M. Ermoliev
International Institute for Applied Systems Analysis, Laxenburg, Austria
Optimization methods are of great practical importance in systems analysis. They allow us to find the best behaviour of a system, determine the optimal structure and compute the optimal parameters of the control system, etc. The development of nondifferentiable and stochastic optimization allows us to state, and effectively solve, new complex optimization problems which are impossible to solve by classical optimization methods.

The term nondifferentiable optimization (NDO) was introduced by Balinski and Wolfe [1975] for extremal problems with an objective function and constraints that are continuous but have no continuous derivatives. This term is now also used for problems with discontinuous functions, although in these cases it might be better to use the terms nonsmooth optimization (NSO) or, in particular, discontinuous optimization (DCO).

The term stochastic optimization (STO) is used for stochastic extremal problems or for stochastic methods that solve deterministic or stochastic extremal problems.

Nondifferentiable and stochastic optimization are natural developments of classic optimization methods. Some important
classes of nondifferentiable and stochastic optimization problems are well known and were investigated long ago: problems of Chebyshev approximations, game theory and mathematical statistics. It should also be noted that, from the conventional viewpoint, there is no major difference between functions with continuous gradients which change rapidly and functions with discontinuous gradients. Each of the above-mentioned classes was investigated by its own "homemade" methods. General approaches (extremum conditions, numerical methods) were developed at the beginning of the 1960s. The main purpose of this article is to review briefly some important applications and nondescent procedures of nondifferentiable and stochastic optimization. Clearly, the interests of the author have influenced the content of this article.

Let us consider some applied problems which require nondifferentiable and stochastic optimization methods.
2. OPTIMIZATION OF LARGE-SCALE SYSTEMS
Many applied problems lead to complex extremal problems with a great number of variables and constraints. For example, there are linear programming problems in which the number of variables or constraints is of the order of 100,000. Formally, such problems have the following form:

here Y is a given discrete set. For example, the use of duality theory for solving discrete programming problems [Balinski and Wolfe 1975; Lasdon 1970] necessitates the minimization of
nondifferentiable functions of the kind

where Y is some discrete set. This problem reduces to problems of the kind (1)-(3).
Clearly in this case the total number of constraints may be equal to 100,000. However, these constraints have a form which does not impose heavy demands on the computer core, and one can try to find their solution with the finite methods of linear programming. But the number of vertices of the feasible polyhedral set for such problems is so large that the application of the conventional simplex method, or its variants, yields very small steps at each iteration and consequently very slow convergence. Moreover, the known finite methods are not robust against computational errors.

The use of nondifferentiable optimization has made it possible to develop easily implementable iterative decomposition schemes of the gradient type. These approaches do not use the basic solution of the linear programming problem, which enables one to start the computational process from any point, and leads to computational stability. Nondifferentiable decomposition techniques (see, for instance, Shor [1967] and Ermoliev and Ermolieva [1973]) are based on the following ideas.
Let the linear programming problem have the form
We assume that for fixed x it is easy to find a solution y(x) with respect to y. For example, the matrix D may have a block diagonal structure, with x being the connecting variable. The main difficulty here is to find the value x* for the optimal
solution (x*, y(x*)). The search for x* is equivalent to the minimization of the nonsmooth function
Another approach is to consider the dual problem:
Let us examine the Lagrangian function
(u,b) + (c - uA, x) = (c,x) + (u, b - Ax)
subject to constraints
In this case the search for x* is equivalent to the minimization of the nonsmooth function
f(x) = (c,x) + max {(u, b - Ax) : uD ≤ d, u ≥ 0}   for x ≥ 0 .
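The structure of such functions suggests a cheap subgradient: if u*(x) attains the inner maximum, then c - Aᵀu* is a subgradient of f at x. A minimal numerical sketch under a simplifying assumption, namely that the dual feasible polyhedron is replaced by a small finite list of hypothetical candidate points (all data below are illustrative, not from the text):

```python
# Sketch: for f(x) = (c,x) + max_u (u, b - Ax), a subgradient at x is
#   c - A^T u*,   where u* attains the inner maximum.
# Assumption for self-containment: the dual feasible set {uD <= d, u >= 0}
# is replaced by a finite list U of hypothetical candidate points.

c = [1.0, 2.0]
b = [1.0]
A = [[1.0, 1.0]]               # one coupling constraint, two variables

U = [[0.0], [1.0], [3.0]]      # assumed candidate dual points

def f_and_subgrad(x):
    # residual r = b - A x
    r = [b[i] - sum(A[i][j] * x[j] for j in range(len(x)))
         for i in range(len(A))]
    # inner maximization over the finite candidate set
    u_star = max(U, key=lambda u: sum(ui * ri for ui, ri in zip(u, r)))
    val = sum(ci * xi for ci, xi in zip(c, x)) + \
          sum(ui * ri for ui, ri in zip(u_star, r))
    # subgradient g = c - A^T u*
    g = [c[j] - sum(A[i][j] * u_star[i] for i in range(len(A)))
         for j in range(len(c))]
    return val, g

val, g = f_and_subgrad([0.0, 0.0])
print(val, g)                  # 3.0 [-2.0, -1.0]
```

Because the inner maximum picks different u* in different regions of x, the resulting g changes discontinuously, which is exactly the nonsmoothness discussed above.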
The well-known Dantzig-Wolfe decomposition is also based on this
principle. A subproblem of minimization with respect to variables
u, subject to
is solved easily because the matrix D is assumed to have a special structure.
A parametric decomposition method [Ermoliev and Ermolieva 1973] reduces linear programming problems, which may not have block diagonal structure, to nondifferentiable optimization problems by introducing additional parameters. In this case, there is the possibility of splitting the linear programming problem into arbitrary parts, in particular, of singling out subproblems which correspond to blocks of nonzero elements in the constraint matrix.
Let us analyse the general idea of the method using the
concrete example
y = min
where
Let it be necessary to cut this problem, for example, into
three parts as shown in constraints (8).
Consider the following subproblem: for the given variable x = (x_1, ...) ≥ 0 find y_1 ≥ 0, y_2 ≥ 0, y_3 ≥ 0 for which

y = min
This problem produces three subproblems with the desired structure. If the minimum value y is denoted as f(x) then it is easy to show that solving the problem (7)-(8) is equivalent to solving (9) for the value of x which minimizes the nondifferentiable function f(x) under the constraints:
3. MINIMAX PROBLEMS, PROBLEMS OF GAME THEORY
The problem (4) is the simplest minimax problem. More general deterministic minimax problems are formulated as follows [Danskin 1967; Demyanov and Malozemov 1974].

For a given function

it is necessary to minimize

for x ∈ X. Independently of the smoothness of g(x,y), the function f(x), as a rule, has no continuous derivatives. A particular class of minimax problems thus arises in approximation theory, e.g., in problems of the best Chebyshev approximation, in approximation by splines, and in mathematical statistics.
The solution of a system of inequalities
for g (x, y) = d (x) , y E Y = ( 1 , 2 , . . . ,m) can also be found by Y
mlnmization of the function (11). The solution of tile general
problem of nonlinear programming,
min { f^0(x) | f^i(x) ≤ 0, i = 1, ..., m, x ∈ X } ,

is also reduced to this problem, if it is assumed that
In game theory more complex problems arise in the minimization of the function

for x ∈ X, where y(x) is such that

Independently of the smoothness of the functions g(x,y), h(x,y), the function f(x) in any given case may have no continuous derivatives and may be discontinuous at any point. For

h(x,y) = xy ,   g(x,y) = x + y ,   Y = [-1, 1] ,

we obtain

The function h(x,y(x)) = x y(x) = |x| is continuous, but does not have continuous derivatives at the point x = 0. The function f(x) = x + y(x) may be regarded as discontinuous at x = 0.
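The example above can be checked numerically. A short sketch (taking y(0) = 1 as an arbitrary convention for the tie at x = 0):

```python
# y(x) maximizes h(x, y) = x*y over Y = [-1, 1], so y(x) = sign(x),
# with y(0) = 1 chosen here by convention.

def y_of_x(x):
    return 1.0 if x >= 0 else -1.0   # argmax of x*y over [-1, 1]

def h_along(x):                       # h(x, y(x)) = x * y(x) = |x|
    return x * y_of_x(x)

def f(x):                             # f(x) = g(x, y(x)) = x + y(x)
    return x + y_of_x(x)

eps = 1e-9
print(h_along(-eps), h_along(eps))    # both near 0: continuous at x = 0
print(f(-eps), f(eps))                # a jump of size ~2 across x = 0
```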
4. OPTIMIZATION OF PROBABILISTIC SYSTEMS
Taking the influence of uncertain random factors into account, even in the simplest extremal problems, leads to complex extremal problems with nonsmooth functions. For example, for deterministic w a set of solutions to the inequality

where w, x are scalars, defines a semi-axis. If w is a random variable it is natural to consider the function

and to find x which maximizes f(x). If w = ±1 with probability 0.5, then f(x) is a discontinuous function (see Figure 1).
Figure 1.
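Since the inequality itself is not reproduced in this transcript, the sketch below assumes, purely for illustration, the constraint wx ≥ 1, so that f(x) = P{wx ≥ 1} with w = ±1 equiprobable; the jump in f is a discontinuity of the same kind as the one shown in Figure 1.

```python
import random

# Assumed inequality (illustration only): w*x >= 1, with w = +1 or -1,
# each with probability 0.5.  Then f(x) = P{w*x >= 1} jumps at x = 1.

def f_exact(x):
    # exactly 0.5 per branch of w; booleans count as 0/1
    return 0.5 * ((x >= 1.0) + (-x >= 1.0))

def f_monte_carlo(x, n=100_000, seed=0):
    # the estimator one would use when only samples of w are available
    rng = random.Random(seed)
    hits = sum(rng.choice((-1.0, 1.0)) * x >= 1.0 for _ in range(n))
    return hits / n

print(f_exact(0.999), f_exact(1.0))   # 0.0 0.5: a jump, so gradient-based
                                      # maximization of f fails at x = 1
```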
A quite general stochastic programming (stochastic optimization) problem can be formulated as follows [Ermoliev and Nekrilova 1967; Ermoliev 1976]:

min { F^0(x) | F^i(x) ≤ 0, i = 1, ..., m, x ∈ X } ,

where
Here f^j(x,ω), j = 0, ..., m, are random functions, and ω is a random factor which we shall consider as an element of the probability space (Ω, A, P). For example, conditions like

become constraints of the type (13)-(14) if we assume that

The problem (13)-(14) is more difficult than the conventional nonlinear programming problem. The main difficulty, besides the nondifferentiability, is connected with the condition (14). As a rule, it is practically impossible to compute the precise values of the integrals (14) and therefore one cannot calculate the precise values of the functions F^i(x). For example, it is only rarely possible, for special kinds of distributions and functions g^i(x,ω), to find the expression P{g^i(x,ω) ≥ 0} as a function of x. Usually only values of the random quantities f^j(x,ω) are available, not values of F^j(x). To determine whether the point x satisfies the constraints

becomes a complicated problem of verifying the statistical hypothesis that the mathematical expectation of the random quantities f^i(x,ω) is nonpositive.
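The verification just described can be sketched with a crude two-standard-error rule on a sample mean (the noise model, sample size, and threshold are illustrative assumptions; a real test would fix a significance level and distributional assumptions):

```python
import random
import statistics

# With only samples f_i(x, w) available, checking E f_i(x, w) <= 0
# becomes a statistical test.  Crude rule used here: accept the
# constraint when (sample mean + 2 standard errors) is nonpositive.

def constraint_satisfied(sample_f, n=2000, seed=1):
    rng = random.Random(seed)
    xs = [sample_f(rng) for _ in range(n)]
    mean = statistics.fmean(xs)
    se = statistics.stdev(xs) / n ** 0.5
    return mean + 2.0 * se <= 0.0

# illustrative noise: true mean -0.5 should pass, true mean +0.5 should fail
print(constraint_satisfied(lambda r: r.gauss(-0.5, 1.0)))
print(constraint_satisfied(lambda r: r.gauss(+0.5, 1.0)))
```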
5. ON EXTREMUM CONDITIONS
The difference between nondifferentiable and stochastic optimization problems on the one hand, and the classic problem of deterministic optimization on the other, is apparent in the optimality conditions. If f(x) is a convex differentiable function then the necessary and sufficient conditions for the
minimum have the form:

where

In the nondifferentiable case this condition transforms into the requirement (Figure 2)

where

is a set (the subdifferential) of generalized gradients (the subgradients). These vectors f_x(x) satisfy the inequality
Figure 2.
It should be noted that the notation f_x(x) for a subgradient
used here is convenient in cases where a function depends on
several groups of variables and the subgradient is to be taken
with respect to one of them.
The complexity of nondifferentiable optimization problems
results from the impossibility of using (16) in practice to
discover whether a specific point x is a point corresponding to
the minimum of f(x).
This condition requires one to test whether the 0-vector
belongs to the set {f_x(x)}, which usually has no constructive
description. A further complication arises from checking the
conditions (15) and (16) in stochastic optimization problems.
Generally speaking, even checking the conditions (15) in the
stochastic case leads to a verification of the statistical
hypothesis that for fixed x the mathematical expectation of the
random vector f_x(x,ω) is 0, that is,
In such cases, the development of direct numerical procedures
for finding optimal solutions becomes extremely important.
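To make this membership test concrete, here is a minimal sketch (not from the original text; the one-dimensional data are illustrative): for a piecewise linear convex f(x) = max_i(a_i x + b_i), the subdifferential at x is the interval of slopes of the active pieces, and x is a minimum exactly when 0 lies in that interval.

```python
def subdifferential(a, b, x, tol=1e-9):
    # For f(x) = max_i(a[i]*x + b[i]), the subdifferential at x is the
    # interval [min, max] of the slopes of the active (maximal) pieces.
    vals = [ai * x + bi for ai, bi in zip(a, b)]
    fmax = max(vals)
    active = [ai for ai, v in zip(a, vals) if v >= fmax - tol]
    return min(active), max(active)

def is_minimum(a, b, x):
    # Optimality test of type (16): 0 must belong to the subdifferential.
    lo, hi = subdifferential(a, b, x)
    return lo <= 0.0 <= hi

# f(x) = max(x, -x) = |x|: at x = 0 the subdifferential is [-1, 1]
a, b = [1.0, -1.0], [0.0, 0.0]
print(is_minimum(a, b, 0.0))   # True: 0 ∈ [-1, 1], so x = 0 is the minimizer
print(is_minimum(a, b, 0.5))   # False: the subdifferential at 0.5 is {1}
```

For the general multidimensional subdifferential the same test is a convex feasibility problem, which is exactly why the text calls it nonconstructive.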
6. DETERMINISTIC METHODS OF NONDIFFERENTIABLE OPTIMIZATION
There are two different classes of nondifferentiable
optimization methods: the nondescent methods, which started their
development in the early 60's at the Institute of Cybernetics in
Kiev [Shor 1964; Ermoliev 1966], and the descent methods, which
appeared in the western scientific literature in the 70's (see
Balinski and Wolfe [1975] for a bibliography).
Let us discuss briefly the basic ideas of these two
approaches.
Let us attempt to generalize the known gradient methods of
the kind
where x^s is an approximate solution at the s-th iteration, and
ρ_s are step-size multipliers, for convex functions f(x) with a
discontinuous gradient. Difficulties arise connected with the
choice of the step multipliers ρ_s in the similar procedure
or more generally
where f_x(x^s) is a subgradient of f(x) at x = x^s and π_X(·) is
a projection operator on the set X.
In practice, it is difficult to review the whole set of
subgradients and choose the one which lies in the opposite
direction to the domain of smaller values of the objective
function (see Figure 3). Usually one can get only one
subgradient, and therefore there is no guarantee that a step
according to procedure (18) will lead into the domain of smaller
values of f(x). The nondescent procedure (18) was proposed by
N.Z. Shor [1964] and called the method of generalized gradients.
Figure 3.
It allows the use of any subgradient in the subdifferential.
General conditions for its convergence were first independently
obtained by Ermoliev [1966] and by Polyak [1967], where ρ_s
satisfies the conditions
These conditions are very natural as (18), (19) are nondescent
processes, i.e., the value of the objective function
does not necessarily decrease from iteration to iteration even
for arbitrarily small ρ_s.
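As an illustration of procedure (18) under step-size conditions of this kind, a minimal sketch in Python (the objective f(x) = |x_1| + |x_2| and the steps ρ_s = 1/(s+1), which satisfy ρ_s → 0 and Σρ_s = ∞, are assumptions chosen for the example, not from the text):

```python
# Sketch of Shor's method of generalized gradients (a nondescent process):
#   x^{s+1} = x^s - rho_s * g_s,  g_s a subgradient of f at x^s.

def f(x):
    return abs(x[0]) + abs(x[1])          # nonsmooth, minimum at (0, 0)

def subgrad(x):
    # One subgradient of |t| is sign(t); any value in [-1, 1] works at t = 0.
    sgn = lambda t: (t > 0) - (t < 0)
    return [sgn(x[0]), sgn(x[1])]

x = [3.0, -2.0]
for s in range(2000):
    rho = 1.0 / (s + 1)                   # rho_s -> 0, sum rho_s = infinity
    g = subgrad(x)
    x = [xi - rho * gi for xi, gi in zip(x, g)]

print(round(f(x), 4))
```

The iterates are not monotone in f, as the text stresses: each coordinate overshoots zero and oscillates, but with amplitude bounded by the shrinking step, so the sequence still approaches the minimizer.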
The influence and close relationship of I.I. Eremin's
research on solutions of systems of inequalities and nonsmooth
penalty functions on this area of work should be noted [Eremin
1965].
More recently the method (18) has been developed further
(see Shor [1976] for a review) and rates of convergence have been
studied.
E.A. Nurminski [1973] studied the convergence of methods of
type (18) for functions satisfying the following condition:
Moreover, he proposed a new technique for proving convergence
based on the reductio ad absurdum argument; he then adapted
this technique for studying the convergence of nondescent methods
for nonconvex, nonsmooth optimization.
As has already been said, the algorithms constructed on the
basis of (18) are simple and require relatively little storage.
Thus let us consider an application of the method (19) to the
development of iterative schemes of decomposition. For the
function (5) one of the generalized gradients at point x^s is
where u^s are dual variables corresponding to y(x^s). Therefore,
the iterative scheme of decomposition according to the procedure
(19) has the form
For the problem (7)-(8), if y^s is an approximate solution of the
subproblem (9) for x = x^s = (x_i^s) and u^s are dual variables
corresponding to y^s, then
where π_X(·) is the projection operator on the set (10). There is
a very simple algorithm for obtaining π_X(w) on sets of type (10).
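The set (10) is not reproduced in this copy; if it is of simplex type {x : Σ_i x_i = b, x_i ≥ 0}, as is common in decomposition schemes, the projection π_X(w) can indeed be computed by a simple sort-and-threshold algorithm. A sketch under that assumption:

```python
# Euclidean projection of w onto the simplex {x : sum(x) = b, x >= 0}.
# Sort-and-threshold algorithm, O(n log n). That set (10) has this simplex
# form is an assumption made for illustration only.

def project_simplex(w, b=1.0):
    u = sorted(w, reverse=True)
    css, theta = 0.0, 0.0
    for i, ui in enumerate(u, start=1):
        css += ui
        t = (css - b) / i
        if ui - t > 0:          # piece i still contributes a positive entry
            theta = t           # keep the threshold of the last valid prefix
    return [max(wi - theta, 0.0) for wi in w]

print(project_simplex([2.0, 0.0]))  # → [1.0, 0.0]
```

Each coordinate of the projection is max(w_i - θ, 0), with the single threshold θ chosen so that the result sums to b.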
For the minimax problem (12), in the case when g(x,y) for
each y ∈ Y is a convex function with respect to x, the subgradient
is defined as f_x(x) = g_x(x,y)|_{y=y(x)} = g_x(x, y(x)). If
g(x,y) is continuously differentiable with respect to x then
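As an illustration of this rule, a sketch for a finite set Y (a simplifying assumption; the text allows a general Y): the subgradient of f(x) = max_y g(x,y) is the x-derivative of g evaluated at a maximizing y(x).

```python
def g(x, y):
    return (x - y) ** 2          # illustrative g(x, y), convex in x

def g_x(x, y):
    return 2.0 * (x - y)         # partial derivative of g with respect to x

def subgrad_f(x, Y):
    # f(x) = max_{y in Y} g(x, y); a subgradient is g_x at a maximizer y(x)
    y_star = max(Y, key=lambda y: g(x, y))
    return g_x(x, y_star)

Y = [-1.0, 2.0]
print(subgrad_f(0.0, Y))   # y(0) = 2 since g(0,2) = 4 > g(0,-1) = 1; f_x = -4.0
```

Only one maximizer y(x) is needed per iteration, which is what makes the rule usable inside procedure (19).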
It should be noted that the above-mentioned nondifferentiable
procedures of decomposition make it relatively simple to take into
account the special structure of a given problem. For example,
consider the linear problems of optimal control: to find a
control x = (x(0),...,x(N-1)) and a trajectory z = (z(0),...,z(N))
satisfying the state equations:
the constraints
and minimizing the objective function
where x(k) ∈ R^n, z(k) ∈ R^m. The difficulty of this problem is
connected with the state constraints. If matrix G(k) ≡ 0, we
can solve this problem with the help of Pontryagin's maximum
principle.
The dual problem is to find a dual control λ = (λ(N-1),...,λ(0))
and a dual trajectory p = (p(N),...,p(0)), subject to the state
equations
and the constraints
p(k+1)B(k) + λ(k)D(k) ≥ d(k)
λ(k) ≥ 0 ,  k = N-1,...,0,
which minimize
We have the following analog of the iterative scheme of
decomposition considered above:
x^{s+1}(k) = max{0, x^s(k) - ρ_s[p^s(k+1)B(k) + λ^s(k)D(k) - d(k)]},
where λ^s(k), p^s(k), k = N-1,...,0 is a solution of the
subproblem: minimize the linear function
under constraints
We may use the well-known Pontryagin maximum principle for solving
this problem. Its solution is reduced to the solution of N simple
static linear programming problems.
Some original work carried out individually by Wolfe and
Lemarechal (see Balinski and Wolfe [1975]) on descent methods is,
on the one hand, a generalization of the ε-steepest descent
algorithms studied by Demyanov [1974] and, on the other hand,
formally similar to conjugate gradient algorithms, coinciding with
them in the differentiable case.
Since it is impossible to obtain the whole set {f_x(x^s)} at
the point x^s, Wolfe and Lemarechal tried to construct it
approximately at each iteration. The further development of
subgradient schemes resulted in the creation of ε-subgradients,
which were
introduced by Rockafellar [1970]. The early results in this field
are due to Rockafellar [1970], Bertsecas and Mitter [1973],
Lemarechal [1975], and Nurminski and Zhelikhovski [1977]. Recent
research revealed properties of ε-subgradient mappings, such as
Lipschitz continuity, which make ε-subgradient methods attractive
in both theoretical and practical respects.
7. STOCHASTIC METHODS
Two classes of deterministic methods have been discussed:
nondescent methods and descent methods. The first class is easy
to use on the computer but does not result in a monotonic
decrease in the objective function. The second class gives
monotonic descent but has a complex logic. Both classes have a
common shortcoming: they require the exact computation of a
subgradient (in the differentiable case this would be the
gradient). Often, however, there are problems in which the
computation of subgradients is practically impossible. Random
search directions provide a simple alternative way of constructing
stochastic descent procedures for nondifferentiable optimization
that do not require the exact computation of a subgradient and
are easy to use on the computer.
There are various ideas on how to construct stochastic
descent methods in deterministic problems which only require the
exact values of objective and constraint functions. One of the
simplest methods is as follows: from the point x^s, the direction
of descent is chosen at random and the motion in this direction
is made with a certain step. The length of this step may
be chosen in various ways, in particular such that:
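The displayed step-length condition is not reproduced in this copy of the text; a minimal sketch of the scheme, assuming the common variant in which a random direction is accepted only if the exact objective value decreases (the objective and the slowly decreasing step rule are illustrative assumptions):

```python
# Sketch of stochastic descent by random search: draw a random direction,
# take the step only if the exact objective value decreases.
import random

def f(x):
    return abs(x[0] - 1.0) + abs(x[1] + 2.0)   # nonsmooth, minimum at (1, -2)

random.seed(0)
x, step = [0.0, 0.0], 1.0
for s in range(4000):
    d = [random.uniform(-1.0, 1.0), random.uniform(-1.0, 1.0)]
    trial = [xi + step * di for xi, di in zip(x, d)]
    if f(trial) < f(x):                  # descent test uses exact values only
        x = trial
    step = max(step * 0.999, 1e-3)       # slowly decreasing step length

print(round(f(x), 3))
```

Only function values are used, never subgradients, which is exactly the informational situation described above.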
Stochastic nondescent methods of random search (stochastic
optimization) are of prime importance in the solution of the most
difficult problems arising in stochastic programming, in which it is
impossible to compute either subgradients or exact values of
objective and constraint functions. The presence of random
components in the search directions of nondescent procedures
allows one to overcome local minima, points of discontinuity, etc.
A quite general scheme of nondescent methods in stochastic
optimization was studied in Ermoliev and Nekrilova [1967] and
Ermoliev [1976] under the name stochastic quasigradient (SQG)
methods. SQG methods generalize the well-known stochastic
approximation methods for optimization of the expectation of
random functions to problems involving general constraints with
differentiable and nondifferentiable functions. For deterministic
nonlinear optimization problems these methods can be regarded as
methods of random search. Consider the problem
min {F^0(x) : F^i(x) ≤ 0, i = 1,...,m, x ∈ X} .
We assume here that F^ν(x), ν = 0,...,m, are convex functions, and
X is a convex set. Let F_x^ν denote a subgradient of the function
F^ν(x):
In stochastic quasigradient (SQG) methods the sequence of
approximations x^0, x^1,...,x^s,... is constructed with the help
of random vectors ξ^ν(s) and random quantities η^ν(s) which are
statistical estimates of the values of the subgradients F_x^ν(x^s)
and of the functions F^ν(x^s):
where a^ν(s) is a vector and b^ν(s) is a number depending upon
x^0, x^1,...,x^s,..., where usually a^ν(s) → 0, b^ν(s) → 0 (in
some sense) for s → ∞. Thus in these methods, instead of exact
values of
F_x^ν(x^s), F^ν(x^s), the quantities ξ^ν(s), η^ν(s) are used. For
further understanding, it is important to see that the random
quantities η^ν(s) and vectors ξ^ν(s) are easily calculated. For
example, if
then η^ν(s) = f^ν(x^s, ω^s), where the ω^s result from mutually
independent draws of ω. We have
If the functions f^ν(x,ω) are differentiable with respect to x and
then under reasonable assumptions we will have
It should be stressed that SQG methods are applicable not only to
stochastic programming problems, but also to deterministic NDO
problems, without having to compute values of subgradients. For
example, for the deterministic minimax problem (12) consider the
vector
where Δ_s > 0 and h^s is the result of independent random draws of
the random vector h = (h_1,...,h_n) whose components are
independently and uniformly distributed over [-1,1]. (22)
satisfies the condition
where f_x(x^s) is a subgradient of the function (12) and
||a(s)|| ≤ const·Δ_s if g(x,y) has uniformly bounded second
derivatives with respect to x ∈ X. It is remarkable that,
independent of the dimensionality of the problem, the vectors
(22) can be found by calculating the function g(x,y) at two
points only. This is particularly important for extremal problems
of large dimensionality.
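The formula (22) itself is not reproduced in this copy. A sketch of a two-evaluation estimator of this type, under the assumption that ξ(s) = (3/Δ_s)[f(x^s + Δ_s h^s) - f(x^s)]h^s (the factor 3 compensates E h_i² = 1/3 for h uniform on [-1,1]^n; the exact constants in (22) may differ):

```python
# Random-direction finite-difference estimate of a gradient: two function
# evaluations, regardless of the dimension n.
import random

def fd_random_estimate(f, x, delta, rng):
    h = [rng.uniform(-1.0, 1.0) for _ in x]          # h uniform on [-1,1]^n
    x_shift = [xi + delta * hi for xi, hi in zip(x, h)]
    scale = 3.0 * (f(x_shift) - f(x)) / delta        # 3 compensates E h_i^2 = 1/3
    return [scale * hi for hi in h]

# Averaging many estimates recovers the gradient of a smooth test function.
rng = random.Random(1)
f = lambda x: x[0] ** 2 + 2.0 * x[1]                 # true gradient at (1,0): (2, 2)
x0, delta, n_rep = [1.0, 0.0], 1e-3, 20000
acc = [0.0, 0.0]
for _ in range(n_rep):
    g = fd_random_estimate(f, x0, delta, rng)
    acc = [a + gi / n_rep for a, gi in zip(acc, g)]
print(acc)  # approaches the true gradient (2, 2)
```

A single draw is a noisy but nearly unbiased estimate, which is all the nondescent iterations require.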
Let us now discuss briefly one particular type of SQG
method: the stochastic quasigradient projection method.
Let it be required to minimize the convex function F^0(x),
x ∈ X, where X is a convex set.
The method is defined by the relations:
where π_X(·) is a projection operation on X and the ρ_s are step
multipliers. The method (23) was proposed in Ermoliev and
Nekrilova [1967]. The characteristic requirements under which the
sequence {x^s} converges with probability 1 to the solution are:
if ||x^k|| ≤ B, k = 0,...,s, then E{||ξ^0(s)||² | x^0,...,x^s} ≤ C_B,
where B, C_B are constants; the ρ_s are step multipliers which may
depend upon x^0, x^1,...,x^s and satisfy
ρ_s ≥ 0 ,  Σ_{s=0}^∞ ρ_s = ∞  with probability 1,
In the special case when the ρ_s are deterministic and independent
of (x^0,...,x^s) then, under (24), (25), we obtain from method (23)
using the random direction (22) that
The important application of SQG methods is concerned with the
optimization of probabilistic systems by simulation. To obtain
the desired parameters of the probabilistic system, one can often
use a Monte-Carlo simulation technique. Let
be the outcome of the s-th simulation, where f^0(x, ω^s) is a
random quantity which depends on an unknown vector of parameters
x and on the observation noise ω^s. Suppose that it is desired
to choose the vector x = (x_1,...,x_n) which minimizes the
expected value of the performance function defined by the
expression
The main difficulty of this problem is that the distribution
P(dω) is unknown and we cannot get the precise value of the
function F^0(x) (it is theoretically impossible). A statistical
estimate ξ^0(s) of the gradient of a differentiable function
F^0(x) could be calculated analogously to (22):
where f^0(x^s + Δ_s h^s, ω^{s,1}) and f^0(x^s, ω^{s,0}) are
outcomes of simulations for x = x^s + Δ_s h^s and x = x^s.
If the second derivatives of the function F^0(x) are bounded
for x ∈ X, then
since E h_i h_j = 0 if i ≠ j and E h_i² = 1/3. The step-size Δ_s
of the finite difference approximation could be chosen according
to conditions (26).
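Putting the pieces together, a sketch of an SQG projection iteration driven by finite-difference estimates from noisy simulation outcomes; the simulated system, the box X, the 3/Δ scaling, and the step rules for ρ_s and Δ_s are all illustrative assumptions:

```python
# Sketch: SQG projection iterations fed by two noisy simulation outcomes
# per step. Each simulate(x) call draws fresh, independent noise.
import random

rng = random.Random(7)
n = 2

def simulate(x):
    # One Monte-Carlo outcome f0(x, w); the true expectation to be
    # minimized is (x1 - 1)^2 + (x2 + 1)^2, minimized at (1, -1).
    return (x[0] - 1.0) ** 2 + (x[1] + 1.0) ** 2 + rng.gauss(0.0, 0.1)

def project_box(x, lo=-5.0, hi=5.0):
    # Projection onto the feasible set X = [lo, hi]^n.
    return [min(max(xi, lo), hi) for xi in x]

x = [4.0, 4.0]
for s in range(5000):
    rho = 5.0 / (s + 10)               # step multipliers: rho_s -> 0, sum = inf
    delta = 1.0 / (s + 10) ** 0.25     # finite-difference step Delta_s -> 0
    h = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    xs = [xi + delta * hi for xi, hi in zip(x, h)]
    scale = 3.0 * (simulate(xs) - simulate(x)) / delta
    xi_vec = [scale * hi for hi in h]  # stochastic quasigradient estimate
    x = project_box([a - rho * g for a, g in zip(x, xi_vec)])

print(x)  # drifts toward the true minimizer (1, -1)
```

Note that only simulation outcomes are used: neither F^0 nor its gradient is ever computed exactly, matching the informational setting of this section.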
8. CONCLUSION
Some applied NDO and STO problems have been briefly
discussed in this work. Deterministic, stochastic, descent and
nondescent methods were considered. Each one requires some
definite information about objective and constraint functions.
Deterministic descent methods use the exact values of these
functions and their subgradients; stochastic descent methods use
only the exact values of functions; deterministic nondescent
methods require only the exact values of subgradients; stochastic
nondescent methods use neither the exact values of functions nor
the exact values of their subgradients. Obviously, every method
reveals its advantages in a specific class of extremum problems:
for instance, complex stochastic programming problems are soluble
only by stochastic nondescent methods.
REFERENCES
Balinski, M.L., and P. Wolfe (Eds.) 1975. Nondifferentiable Optimization. Mathematical Programming Study 3. Amsterdam: North Holland Publishing Co.
Bertsecas, D.P., and S.K. Mitter. 1973. A descent numerical method for optimization problems with nondifferentiable cost functionals. SIAM Journal on Control 11:637-652.
Danskin, J.M. 1967. The Theory of Min-Max. Berlin: Springer- Verlag.
Demyanov, V.F., and V.N. Malozemov. 1974. Introduction to Minimax. New York: John Wiley and Sons.
Eremin, I. 1965. A generalization of the Motzkin-Agmon relaxation method. Uspekhi Matematicheskikh Nauk 20:183-187.
Ermoliev, Yu.M. 1966. Methods of solution of nonlinear extremal problems. Kibernetika 2(4).
Ermoliev, Yu.M. 1970. On the stochastic penalty functions method. Kibernetika 1.
Ermoliev, Yu.M. 1976. Methods of Stochastic Programming. Moscow: Nauka (in Russian) .
Ermoliev, Yu.M., and L.G. Ermolieva. 1973. A method of parametric decomposition. Kybernetika 9(2).
Ermoliev, Yu.M., and Z. Nekrilova. 1967. Stochastic Gradient Methods and Their Application in Optimal Decision Theory. Notes, Seminar on the Theory of Optimal Solution. Academy of Sciences of the U.S.S.R., Kiev.
Lasdon, L.S. 1970. Optimization Theory for Large Systems. London: Macmillan.
Lemarechal, C. 1975. Nondifferentiable Optimization: Subgradient and ε-Subgradient Methods. Lecture Notes on Optimization and Operations Research.
Nurminski, E.A. 1973. The quasigradient method for solving nonlinear programming problems. Cybernetics 9(1):145-150. New York/London: Plenum Publishing Co.
Nurminski, E.A., and A.A. Zhelikhovski. 1977. ε-Quasigradient method for solving nonsmooth extremal problems. Cybernetics 13(1):109-11. New York/London: Plenum Publishing Co.
Polyak, B.T. 1967. A general method for solving extremal problems. Doklady Akademii Nauk SSSR 174:33-36.
Rockafellar, R.T. 1970. Conjugate convex functions in optimal control and the calculus of variations. Journal of Mathematical Analysis and Applications 32:174-222.
Rockafellar, R.T. 1973. The multiplier method of Hestenes and Powell applied to convex programming. Journal of Optimization Theory and Applications 12.
Shor, N.Z. 1964. On a Structure of the Algorithms for Numerical Solution of the Optimal Planning and Designing. Dissertation, Institute of Cybernetics, Kiev.
Shor, N.Z. 1967. Application of the generalized gradient method in block programming. Kibernetika 3(3).
Shor, N.Z. 1976. Generalizations of gradient methods for nonsmooth functions and their applications to mathematical programming. Economics and Mathematical Methods 12(2):337-356.
ACCELERATION IN THE RELAXATION METHOD FOR LINEAR INEQUALITIES AND SUBGRADIENT OPTIMIZATION*
J.L. Goffin
McGill University
Montreal, Canada
1. INTRODUCTION
Subgradient optimization has been shown in many experiments
to be an effective solution technique for the problem of
maximizing piecewise linear concave functions defined through the
use of the Dantzig-Wolfe decomposition principle applied to some
combinatorial problems. This effectiveness showed up in some
problems of rather respectable size [10,12,15], even though it
can be shown to perform arbitrarily badly in two-dimensional
problems.
A convergence theory developed by Shor [21,22] and the
author [7] quantifies the rates of convergence of subgradient
optimization as a function of condition numbers which measure the
good behaviour of the function to optimize. When applied to a
quadratic function, subgradient optimization can achieve the
rate of convergence of the steepest ascent method, provided that
the function be not too well conditioned, that the condition
number (or the eigenvalue ratio of the matrix defining the
*This research was supported in part by the D.G.E.S. (Quebec) and the N.R.C. of Canada under grant A3152.
quadratic function) be known, and that an overestimate of the
distance between the initial point and the solution point be
known.
When applied to piecewise linear problems, subgradient
optimization always performed significantly better than the
theory of [22] and [7] indicated, the gap in performance between
experience and theory being wide enough to justify further study.
One key fact to notice is that subgradient optimization is
closely related to the relaxation method for solving systems of
inequalities, to the extent that the rates of convergence that
have been proved for both methods are identical [8,9], under some
assumptions (namely that the function not be too well conditioned,
that the relaxation parameter be equal to one, and that every
nonzero extreme point of every subdifferential of the function
have the same norm).
A second key fact to notice is that the relaxation method
for solving systems of linear inequalities is related to the
Kaczmarz projection method, the successive overrelaxation method
and the Southwell method for solving systems of linear equalities.
It has been observed, and proved for some cases, that the S.O.R.
method can be accelerated by using values of the relaxation
parameter greater than one (if this parameter is equal to one,
then the S.O.R. method is known as the Gauss-Seidel, or Nekrasov,
method). This indicates that acceleration is conceivable for the
relaxation method for linear inequalities, and thus also for
subgradient optimization.
A study of the acceleration of the relaxation method for a
system of two inequalities done in [6] and [9] shows that the
same value of the relaxation parameter which most accelerates
the S.O.R. technique in a related linear system of equations
also accelerates the convergence of the relaxation method for
inequalities, but that this accelerated rate is still bad if the
system is badly conditioned (even if the rate is improved by an
"order of magnitude"). Experiments suggest that if the relaxation
parameter goes above the optimal value given by S.O.R.
theory, the convergence accelerates still more, for reasons
different from the ones used in the S.O.R. theory. And thus
acceleration, even if it could be proved satisfactorily, is only
part of the explanation of the effectiveness of subgradient
optimization.
The other part of the explanation will have to relate the
combinatorial structure of the original problem studied to the
condition number of the function defined through the Dantzig-Wolfe
decomposition principle. This paper will attempt to show ("indicate"
or "allude to" might be more appropriate) by using experiments
and some theory that:
1. The relaxation method for linear inequalities, and
subgradient optimization, can be accelerated (and in some
cases faster than the S.O.R. theory would predict).
2. For some classes of combinatorial problems, uniform
lower bounds (dependent upon the dimension of the problem)
on the condition number of the associated functions
probably exist.
3. The "harder" the combinatorial problem is, the better
the condition number of the associated function is.
Given the rather incomplete nature of the theory of
acceleration in the S.O.R. method, it should be clear that a
general proof of these three points is probably unattainable.
2. SUBGRADIENT OPTIMIZATION AND THE RELAXATION METHOD
Let f(x) be a finite concave function defined on R^n and
∂f(x) be its subdifferential at x, i.e.,
For every x, ∂f(x) is a convex, compact set, and if f is
differentiable then ∂f(x) = {∇f(x)}, the gradient of f.
In this paper f(x) will always be a piecewise linear
function:
where I is a finite set, a^i ∈ R^n, b_i ∈ R.
Letting I(x) = {i ∈ I : f(x) = <a^i,x> + b_i}, then
is the convex hull of the set {a^i : i ∈ I(x)}.
It will be assumed that the optimal value f* = max {f(x) : x ∈ R^n} is reached on the optimal set P = {x ∈ R^n : f(x) = f*}. It is well known that P = {x ∈ R^n : 0 ∈ ∂f(x)}. The projection of x on P will be denoted by x*(x), and the distance from x to P by d(x) = ||x − x*(x)||, where ||·|| means the Euclidean norm.
The term subgradient optimization will be used if f* is not known, and the term relaxation method will be used if f* is known. The reason for this is that if f* is known, the optimal set can be defined by a system of linear inequalities:

P = {x ∈ R^n : <a^i, x> + b_i ≥ f* , i ∈ I} .

A description of both subgradient optimization and of the relaxation method follows:
1. Choose x^0 ∈ R^n.
2. Compute a subgradient of f at x^q: u^q ∈ ∂f(x^q) (or it could be restricted to u^q ∈ {a^i : i ∈ I(x^q)}). If u^q = 0, an optimal point has been found (also if f(x^q) = f* in the case of the relaxation method).
3. The next point x^{q+1} of the sequence will be obtained by moving from x^q in the direction of u^q by a certain step size. Go back to 2 with q+1 replacing q.
The selection of the step size depends on whether f* is known or not:

x^{q+1} = x^q + λ_q u^q / ||u^q||        subgradient optimization (f* unknown)   (2.1)

x^{q+1} = x^q + σ_q (f* − f(x^q)) u^q / ||u^q||²        relaxation method (f* known).   (2.2)
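In code, the two step rules (2.1) and (2.2) can be sketched as follows (a minimal illustration of the updates, not code from the paper; the function names are ours):

```python
import numpy as np

def subgradient_step(x, u, lam):
    """One step of (2.1): f* unknown. Move a fixed distance lam
    along the normalized subgradient u of the concave objective."""
    return x + lam * u / np.linalg.norm(u)

def relaxation_step(x, u, f_x, f_star, sigma):
    """One step of (2.2): f* known. With sigma = 1 this projects x onto
    the hyperplane where the active linear piece reaches the value f*."""
    return x + sigma * (f_star - f_x) * u / np.linalg.norm(u) ** 2
```

For example, for f(x) = −|x| in one dimension (f* = 0), a relaxation step with σ = 1 from x = 2 along the subgradient u = −1 lands exactly on the optimum x = 0.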
The sequences {λ_q} and {σ_q} that will be studied are given by:

λ_q = λ_0 ρ^q ,   λ_0 > 0 ,   ρ ∈ (0,1) ,

and

σ_q = σ , constant, with σ ∈ (0,2] .
In both cases a convergence theory is available: in [22], [7] for subgradient optimization and in [1], [6] and [9] for the relaxation method. Rates of convergence depend upon condition numbers which are defined below. For subgradient optimization one defines:

μ(x) = min { <u, x*(x) − x> / (||u|| ||x*(x) − x||) : u ∈ ∂f(x) }   for every x ∉ P

and

μ = inf {μ(x) : x ∉ P} ,

and for the relaxation method:

ν(x) = min { (f* − f(x)) / (||u|| ||x*(x) − x||) : u ∈ ∂f(x) }   for every x ∉ P

and

ν = inf {ν(x) : x ∉ P} .

The concavity of f implies that μ(x) ≥ ν(x), and thus μ ≥ ν.
The convergence theorems are as follows:

Theorem 2.1  For subgradient optimization (where λ_q = λ_0 ρ^q), let z(μ), C and D be the quantities defined in [7]. Then

(A) ρ ≥ z(μ) and d(x^0) ∈ [λ_0 C, λ_0 D] implies that for all q: d(x^q) ≤ d(x^0) ρ^q ,

(B) ρ ≥ z(μ) and d(x^0) < λ_0 C implies that for all q: d(x^q) ≤ λ_0 C ρ^q ,

(C) ρ < z(μ) or d(x^0) > λ_0 D may lead to the convergence of {x^q} to a non-optimal point [7].
Theorem 2.2  For the relaxation method:

d(x^{q+1})² ≤ (1 − σ(2 − σ) ν²) d(x^q)² ,

and furthermore if dim P = n, there exists a σ* ∈ [1,2) such that if σ ∈ (σ*, 2], then convergence is finite [9].
It should be clear that in general μ and ν are different, but also there is a neighbourhood of P such that μ(x) = ν(x). One could define condition numbers close to the optimal set (say μ_c and ν_c: note that they are equal): the number ν_c could be used in Theorem 2.2 instead of ν, provided that q is large enough, so that the iterates are close to the optimal set; a similar statement cannot be made easily about Theorem 2.1, because the proof technique is quite different (and more global in nature).
Thus the rates of convergence of both methods involve condition numbers which are related to one another.

In one case of interest [8], where all the extreme points of all the subdifferentials of f have the same norm (unless the norm is zero), then μ = ν = μ_c = ν_c, and thus in that case the rates of convergence (as given by the available theories) of the relaxation method with σ = 1, and of subgradient optimization with ρ = z(μ) (and d(x^0) ∈ [λ_0 C, λ_0 D]), are identical. In this case it also follows that condition numbers far from the optimal set can be no worse than close to the optimal set. The condition that all norms be equal is met by the relaxation method of Agmon under its maximal distance implementation (and all norms are one): it simply means that all a^i have been normalized.
This whole theory, though correct, is incomplete, in the sense that it fails to account for the interaction between successive iterates: the theory only considers the worst step-to-step behaviour of the sequences, putting bounds on this, but ignoring the fact that it is impossible for all successive iterates to behave according to this worst bound. This reasoning is very similar in nature to the one used in proving the convergence, and the acceleration, of the S.O.R. technique for solving systems of linear equalities: the linear operators that relate an iterate to the next one all have a spectral radius of one, but the product of n operators (which defines a cycle of n iterations in our terminology, or one iteration according to the S.O.R. terminology) has a spectral radius less than one.
In this paper we will attempt to extend the ideas of the acceleration of the S.O.R. method to the relaxation method for inequalities and to subgradient optimization.
3. EXAMPLES AND EXPERIMENTATION
Example 1: Systems of linear equalities

<a^i, x> + b_i = 0 ,   i ∈ I   (3.1)

(where usually I contains n elements).
Solving this system is equivalent to maximizing

f(x) = min { <a^i, x> + b_i , −<a^i, x> − b_i : i ∈ I }   (3.2)

where f* = 0 if and only if the system has a solution.
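Concretely, f in (3.2) is the negative of the largest absolute residue of the system, and the gradient of any active linear piece is a subgradient. A minimal sketch (our own naming; the columns of A are the a^i):

```python
import numpy as np

def f_and_subgradient(x, A, b):
    """For the system A.T @ x + b = 0 (columns of A are the a^i), return
    f(x) = -max_i |<a^i, x> + b_i| from (3.2) and one subgradient of f,
    namely the gradient of a linear piece active at x."""
    r = A.T @ x + b                  # residues <a^i, x> + b_i
    i = np.argmax(np.abs(r))         # equation with the largest residue
    u = -np.sign(r[i]) * A[:, i]     # gradient of the active piece
    return -np.abs(r).max(), u
```

For instance, with A = I and b = (−1, −2) at x = 0, the largest residue is |−2|, so f(x) = −2 and the subgradient points along the second coordinate axis.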
Example 2: Dual of a transportation problem

f(x) = Σ_{j=1,…,n} s_j x_j + Σ_{k=1,…,m} d_k min_{j=1,…,n} (c_jk − x_j)   (3.3)

where Σ_{j=1,…,n} s_j = Σ_{k=1,…,m} d_k .
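Evaluating (3.3) together with one of its subgradients amounts to assigning each demand node to its currently cheapest supply node. A sketch under our own naming and data layout (c stored as an n×m array):

```python
import numpy as np

def dual_transport(x, c, s, d):
    """Objective (3.3) of the transportation dual, plus one subgradient.
    c: (n, m) costs, s: (n,) supplies, d: (m,) demands, x: (n,) dual prices."""
    reduced = c - x[:, None]                 # c_{jk} - x_j, shape (n, m)
    jmin = reduced.argmin(axis=0)            # cheapest supply node for each k
    f = s @ x + d @ reduced[jmin, np.arange(len(d))]
    u = s.astype(float)                      # gradient of the linear term
    np.subtract.at(u, jmin, d)               # u_j = s_j - sum of d_k assigned to j
    return f, u
```

On the tiny symmetric instance c = [[0,1],[1,0]], s = d = (1,1), the point x = 0 is already dual optimal: the returned subgradient is zero.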
Example 3: Dual of an assignment problem

In example 2, let m = n, s_j = 1, d_k = 1, ∀j,k .   (3.4)
We experimented on TR 48, the dual of a transportation problem with n = m = 48, the data of which can be found in [17]. This problem has the peculiarity that the optimal solution is "unique" (i.e., up to an additive constant), and thus we were able to compute μ(x^q) for the successive iterates. We observed that for all experiments performed inf_q μ(x^q) < .0021, and thus μ and ν are less than .0021, from which it follows that z(μ) = z(ν) ≥ .999498. Actual experimentation with the relaxation method with σ = 1 showed an asymptotic rate of convergence of .999986 (quite disastrous, as it indicates that in order to improve the accuracy of the optimum by one digit, one should go through something like 150,000 iterations). This pessimistic result should be tempered by the observation that the methods (subgradient optimization and relaxation) behaved very well in the opening game (the early iterations) before slowing down.

Experiments were performed in order to measure the actual convergence rates of the relaxation method as a function of σ. The theory says that the rate of convergence is √(1 − σ(2 − σ)ν²), i.e., it worsens when σ goes from one to two.
Figure 1. Rate of convergence of the relaxation method applied to TR 48 as a function of σ.
The results, shown in Figure 1, clearly indicate that acceleration takes place, and that a best rate of convergence of about .9986 occurs for σ ∈ (1.36, 1.98). This is an improvement by a factor of one hundred over what happened when σ = 1. Closer examination of the convergence showed that:

1. Convergence was very good in the first few hundred iterations: good opening game.
2. It worsened considerably after this (to a rate of the order of .9998): weak middle game.
3. It improved again, starting about when all the subgradients used define linear pieces which go through the optimum point, thus showing that acceleration is an asymptotic behaviour; it also shows that the ν(x^q) encountered are all in the range .16 to .26: good end
game.

It seemed logical then to experiment with subgradient optimization; we tried with λ_0 = 3000, ρ = .999, x^0 = 0 (where d(x^0) ≈ 1978.), for 30,000 iterations. Even though we had some worries
about the middle game, it converged at a rate given by ρ (on that the theory is clear: if it converges, it must be at a rate ρ). It is not clear at this point whether this is luck or an indication of an underlying theory of acceleration of subgradient optimization (my own feeling is that getting over the middle game is luck, but that acceleration in the final game is probably provable). A study of the acceleration phenomenon in the relaxation method for inequalities will be attempted in the next sections on a few very particular examples.
4. SYSTEMS OF LINEAR EQUALITIES
Let a system of linear equalities be given by:

<a^i, x> + b_i = 0 ,   i = 1,…,n   (4.1)

where the a^i are linearly independent column vectors. Another notation is of course A^T x + b = 0, where A = (a^1, …, a^n) and b = (b_1, …, b_n)^T.

Let D be a diagonal matrix, with D_ii = ||a^i||²; then (4.1) can be written as

Ā^T x + b̄ = 0 ,   where Ā = A D^{-1/2} and b̄ = D^{-1/2} b .   (4.2)

If one defines η and ξ by x = Ā ξ = A η, then equivalent systems may be defined by

Γ̄ ξ + b̄ = 0 ,   where Γ̄ = Ā^T Ā ,   (4.3)

and

Γ η + b = 0 ,   where Γ = A^T A .   (4.4)

If x* denotes the (unique) solution to (4.1) and (4.2), and ξ* and η* the (unique) solutions to (4.3) and (4.4), clearly x* = Ā ξ* = A η*. The matrix Γ is the Grammian of the vectors a^i, while Γ̄ is the Grammian constructed on the vectors of unit norm ā^i = a^i/||a^i||. Clearly Γ̄ is simply the matrix of the cosines of the angles between the vectors a^i, i = 1,…,n. Of course Γ and Γ̄ are both positive definite symmetric matrices.
The S.O.R. technique (or extrapolated Gauss-Seidel) can be written as follows:

1. ξ^0 arbitrary (q = 0)
2. define i by i = q (mod n) + 1
3. ξ^{q+1} − ξ* = (I − σ E_i Γ̄)(ξ^q − ξ*) ; q ← q+1 and go to 2.

E_i is a matrix with a one in position (i,i) and zeros elsewhere; note also that ξ^{q+1} is computable in function of ξ^q (ξ* is not known but Γ̄ ξ* = −b̄ is).
If we let r̄(ξ) = Γ̄ ξ + b̄, the residue, then also

r̄(ξ^{q+1}) = (I − σ Γ̄ E_i) r̄(ξ^q) .

In the S.O.R. theory one defines a sequence by ξ̄^k = ξ^{kn}, k = 0,1,2,…; the "iteration" from ξ̄^k to ξ̄^{k+1} will be characterized here as a cycle of n iterations. The reason for this is of course that the "iteration" from ξ̄^k to ξ̄^{k+1} is given by a linear operator, and thus the convergence can be studied, and rates of convergence can be identified with the spectral radius of this operator. If we define L̄ and Ū by Γ̄ = I − L̄ − Ū, where −L̄ (−Ū) are, respectively, the strictly lower
(upper) part of Γ̄, then

ξ^{kn+n} − ξ* = (I − σ L̄)^{-1} ((1 − σ) I + σ Ū) (ξ^{kn} − ξ*) ,

the classical formula; clearly (I − σL̄)^{-1}((1 − σ)I + σŪ) is equal to (I − σE_nΓ̄)(I − σE_{n−1}Γ̄) … (I − σE_1Γ̄). If we let R̄ = (I − σL̄)^{-1}((1 − σ)I + σŪ), then also

r̄(ξ^{kn+n}) = Γ̄ R̄ Γ̄^{-1} r̄(ξ^{kn}) .
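The equality between the product of the n single-coordinate error operators and the classical S.O.R. operator is easy to confirm numerically; the following sketch (our own construction, using a random Grammian of unit vectors) checks it for n = 4 and σ = 1.5:

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma = 4, 1.5

# random unit-norm columns a^i -> Grammian of cosines (unit diagonal)
A = rng.standard_normal((n, n))
A /= np.linalg.norm(A, axis=0)
G = A.T @ A

I = np.eye(n)
L = -np.tril(G, -1)                 # G = I - L - U
U = -np.triu(G, 1)

# classical S.O.R. operator
R = np.linalg.solve(I - sigma * L, (1 - sigma) * I + sigma * U)

# product of the single-coordinate operators (I - sigma E_i G), i = 1,...,n
P = I.copy()
for i in range(n):
    E = np.zeros((n, n)); E[i, i] = 1.0
    P = (I - sigma * E @ G) @ P     # step i applied after steps 1,...,i-1
assert np.allclose(P, R)
```

Since the Grammian is symmetric positive definite and σ ∈ (0, 2), the spectral radius of R is below one, as the text asserts.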
Now let us define the Kacmarz (extrapolated) method applied to (4.1) or (4.2):

1. x^0 arbitrary (q = 0)
2. i defined by i = q (mod n) + 1
3. x^{q+1} − x* = (I − σ ā^i (ā^i)^T)(x^q − x*) ; q ← q+1 and go to 2.

Again, as <ā^i, x*> = −b̄_i, x^{q+1} is computable in function of x^q.
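One cycle of the method, with the indices in the fixed order 1,…,n, can be sketched as follows (illustrative code, our names; σ = 1 gives the pure orthogonal-projection version):

```python
import numpy as np

def kaczmarz_cycle(x, A, b, sigma=1.0):
    """One cycle (n steps) of the extrapolated Kaczmarz method for the
    system A.T @ x + b = 0 (columns of A are the a^i), relaxation sigma."""
    for i in range(A.shape[1]):
        norm = np.linalg.norm(A[:, i])
        a, bi = A[:, i] / norm, b[i] / norm   # normalized row <a-bar^i, x> + b-bar_i = 0
        x = x - sigma * (a @ x + bi) * a      # step along a-bar^i
    return x
```

When the a^i are mutually orthogonal, a single cycle with σ = 1 already solves the system exactly, since the n projections do not interfere.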
If we let r̄(x) = Ā^T x + b̄ and r(x) = A^T x + b, the residues, then it can be checked that:

r̄(x^{q+1}) = (I − σ Γ̄ E_i) r̄(x^q) .

Observe that:

Ā^T (I − σ ā^i (ā^i)^T) = (I − σ Γ̄ E_i) Ā^T

and thus

r̄(x^{kn+n}) = (I − σΓ̄E_n) … (I − σΓ̄E_1) r̄(x^{kn}) .

Also, r̄(x^{kn+n}) = Γ̄ R̄ Γ̄^{-1} r̄(x^{kn}).
We have thus shown the following "theorem".

Theorem 4.2

If we use the Kacmarz method applied to (4.1), (4.2), or the Gauss-Seidel method applied to (4.3) and (4.4), with the same relaxation parameter σ and starting points x^0 (for 4.1 and 4.2), η^0 (for 4.4), ξ^0 (for 4.3) satisfying

x^0 = Ā ξ^0 = A η^0 ,

then at every iteration it is also true that

x^q = Ā ξ^q = A η^q .

This fact was noticed by Kahan [14]. And thus the rates of convergence of both methods are identical, and given by the spectral radius of R̄ (which is known to be less than one if σ ∈ (0,2)). As to every positive definite symmetric matrix Γ one can associate (real) Cholesky factors by Γ = Λ Λ^T, it is also clear that the S.O.R. technique applied to Γ η + b = 0 is equivalent to the extrapolated Kacmarz method applied to Λ x + b = 0.

All of this can be found in Nicolaides [18].
The following theorem characterizes the convergence of the various methods in terms of the asymptotic behaviour of the directions pointing from the iterates to the optimal point.

Assume that the largest modulus eigenvalue λ of R̄ is real, positive and the unique root of modulus λ of the characteristic polynomial, with unit (right) eigenvector e_λ; then the sequence generated by the S.O.R. method applied to (4.3) will, unless ξ^0 − ξ* is perpendicular to the left eigenvector e'_λ corresponding to λ, satisfy:

lim_{k→∞} (ξ^{kn} − ξ*) / ||ξ^{kn} − ξ*|| = δ e_λ   (where δ is either +1 or −1)

and

lim_{k→∞} ||ξ^{kn+n} − ξ*|| / ||ξ^{kn} − ξ*|| = λ .
Let R̄ = S J S^{-1}, where J is the Jordan canonical form associated to R̄. Then there is only one Jordan canonical box corresponding to λ, and it is of dimension 1. Let (0,0) be the location of λ in J.

Classically, J^q = λ^q E_0 + o(λ^q), where o(λ^q) represents an (n×n) matrix which goes to zero faster than λ^q. Thus

R̄^q = S J^q S^{-1} = λ^q s_0 (s'_0)^T + o(λ^q) ,

where (s'_0)^T is the 0-th row of S^{-1}, and s_0 is the 0-th column of S, which can be shown to be proportional to e_λ (it could have been chosen equal to e_λ), while s'_0 is proportional to e'_λ.

The theorem follows provided that (e'_λ, ξ^0 − ξ*) ≠ 0.
To summarize, the convergence of ξ^q to ξ* takes place along n one-sided asymptotes, used in a cyclic order; the convergence is geometric, with a crisper definition of this concept than in general (the decrease of the distance over n steps tends as a limit to λ, and thus the average rate per iteration is the n-th root of λ). Of course an almost identical theorem holds for the Kacmarz method.

If the eigenvalue(s) of largest modulus is complex, then the rate of convergence of the Kacmarz and of the S.O.R. method is still given by the n-th root of the modulus, but the existence of asymptotes depends on discussions of the rationality of the argument (in degrees) of the root of largest modulus. If the eigenvalue of largest modulus were negative, then there would be n two-sided asymptotes.
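When the dominant eigenvalue is real and positive, the one-sided asymptote is easy to watch numerically. In the 2×2 case with σ = 1 the cycle operator has eigenvalues 0 and g² (g being the cosine of the angle between the two unit vectors), so the error direction settles on the dominant eigenvector after one cycle (a small illustration, our own construction):

```python
import numpy as np

g = 0.6                                  # cosine of the angle between a^1, a^2
G = np.array([[1.0, g], [g, 1.0]])       # Grammian of the unit vectors
I = np.eye(2)
L = -np.tril(G, -1); U = -np.triu(G, 1)  # G = I - L - U
R = np.linalg.solve(I - L, U)            # cycle operator for sigma = 1

e = np.array([1.0, 1.0])                 # error xi^0 - xi^*
prev = e.copy()
for _ in range(20):                      # 20 cycles of n = 2 iterations
    prev, e = e, R @ e

# direction settles on one one-sided asymptote (the dominant eigenvector)...
assert np.allclose(e / np.linalg.norm(e), prev / np.linalg.norm(prev))
# ...and the per-cycle contraction tends to the dominant eigenvalue g**2
assert np.isclose(np.linalg.norm(e) / np.linalg.norm(prev), g**2)
```

Here the per-cycle rate is g² = .36, i.e., an average rate of g per iteration, in agreement with the n-th-root statement above.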
Now the relaxation method of Agmon defined in (2.2) and applied to (3.1) is identical in spirit to the Kacmarz method (a similar argument could be made to compare Southwell's relaxation method applied to (4.3) and the S.O.R. method applied to (4.1)); the difference is that the sequence of indices used is not given by repeating the cycle 1,2,3,…,n but by choosing at each iteration the index of the largest residue, in absolute value. The same linear operators are used to go from one iteration to the next, but in an order, not necessarily uniquely defined, determined in a different fashion; the order is not defined a priori, but is dependent on the iteration sequence.

It might be worth pointing out that in Agmon's relaxation method of (2.2) (and in Southwell's method), the normalization of the a^i, i = 1,…,n, changes the sequences generated, as the index which gives the maximum residue depends on the normalization.
The reason for this section is given by the observation that if, in the relaxation method of Agmon (2.2) applied to (3.1), the indices chosen form cycles repeating the integers 1,2,3,…,n in sequence, then the convergence theory of the Kacmarz method and of the S.O.R. method applies exactly. It also indicates why the discussion of 5.2 is needed; if in the relaxation method (2.2) the cycle 1,…,n has repeated itself often enough for the non-dominating eigenvalues to have lost their powers, then this cycle will repeat itself ad infinitum, so that an exact asymptotic theory of (2.2) would be available.
If the relaxation method of (2.2) used the index i(q) at iteration q = 0,1,2,…, and if there exist an index q* and an integer p such that i(q+p) = i(q) for all q ≥ q* (i.e., the indices used are cyclic), then all that has been said above holds, mutatis mutandis: the S.O.R. theory can be applied with a Grammian constructed on the vectors a^{i(q*)}, a^{i(q*+1)}, …, a^{i(q*+p−1)} (in that order), where the Grammian might be positive semidefinite if some indices are repeated within a cycle, or if the a^i used are not linearly independent.
The study of when a cyclic order of indices will appear (in an asymptotic sense) has not been done yet, except for a two-dimensional linear system [9].
5. EXAMPLES OF APPLICATION OF THE THEORY OF ACCELERATION OF THE RELAXATION METHOD
1. A two-dimensional example was studied in [9] quite extensively. Let a^1 = (cos α, sin α)^T, a^2 = (cos α, −sin α)^T, b_1 = b_2 = 0, and α ∈ (0, π/4); a study of the relaxation method of Agmon (2.2) applied to <a^i, x> + b_i ≥ 0, i = 1,2, x ∈ R², was done, and it is already quite messy. A short summary follows.

Let σ* = 2/(1 + sin 2α); then if σ ∈ (σ*, 2], finite convergence occurs for any initial point; if σ ∈ (1, σ*], then infinite convergence occurs along 2 (one-sided) asymptotes for all starting points in an open angle, while finite convergence occurs for all other starting points (except for the boundary of that open angle, where unstable infinite convergence is theoretically possible).
If one deals with the system of equalities <a^i, x> + b_i = 0, i = 1,2, then if σ ∈ (1, σ*], the results are essentially identical (except for the finite convergence part). If one uses the Kacmarz method for this system of equalities, and if σ ∈ (1, σ*), identical results can be shown. In the relaxation method applied to both equalities or inequalities (for σ ∈ (1, σ*]) one uses the constraints one after the other (maybe after a few iterations in the case of equalities; in the case of inequalities, this applies only if convergence is not finite, and if so, the cyclic order starts at the first iteration). And thus for σ ∈ (1, σ*], the convergence theory is identical to that given by the S.O.R. technique. We should mention that σ* is the optimal value of the relaxation parameter, as predicted by the S.O.R. theory.
If σ ∈ (σ*, 2), then the S.O.R. theory shows a rate of convergence of σ − 1, which is the modulus of the two complex eigenvalues of the operator R̄, as defined in section 4. If σ ∈ (σ*, 2), then the case of the relaxation method of Agmon applied to the equalities has not been studied in a satisfactory manner: the indices of the equalities are not used in any easily recognizable cycles, and thus the whole theory of linear operators does not seem to help. Experiments have been performed (and also with subgradient optimization), but will be reported elsewhere. Improvements of rates of convergence over the S.O.R. theory seem typical.
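The S.O.R. half of this discussion is easy to reproduce: for the 2×2 Grammian of the unit vectors (off-diagonal entry cos 2α), σ* = 2/(1 + sin 2α) is the classical optimal parameter, and the spectral radius equals σ − 1 beyond it. A quick numerical check (our own code):

```python
import numpy as np

alpha = np.pi / 8                       # any alpha in (0, pi/4)
g = np.cos(2 * alpha)                   # cosine of the angle between a^1, a^2
G = np.array([[1.0, g], [g, 1.0]])      # Grammian of the unit vectors

def sor_radius(sigma):
    """Spectral radius of the S.O.R. cycle operator for Grammian G."""
    I = np.eye(2)
    L = -np.tril(G, -1); U = -np.triu(G, 1)
    R = np.linalg.solve(I - sigma * L, (1 - sigma) * I + sigma * U)
    return max(abs(np.linalg.eigvals(R)))

sigma_star = 2.0 / (1.0 + np.sin(2 * alpha))
# past sigma*, the two eigenvalues are complex with modulus sigma - 1
for sigma in (sigma_star + 0.1, sigma_star + 0.2):
    assert np.isclose(sor_radius(sigma), sigma - 1)
# the S.O.R.-optimal parameter sigma* beats sigma = 1
assert sor_radius(sigma_star) < sor_radius(1.0)
```

With α = π/8 this gives σ* ≈ 1.17 and a radius of σ* − 1 ≈ 0.17 at the optimum, against 0.5 (the square of the Jacobi radius) at σ = 1.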
2. An assignment problem will be studied in some detail. It has been chosen because we think that it is the hardest one from the point of view of the relaxation method (note again that, when we talk of the relaxation method, it implies that we make the unrealistic assumption that f* is known).

This problem is defined by a cost matrix which is zero: c_ij = 0, i,j = 1,…,n; we will denote this assignment problem by A_n^0.
Clearly:

f(x) = Σ_{j=1,…,n} x_j + Σ_{k=1,…,n} min_{j=1,…,n} (−x_j) = Σ_{j=1,…,n} x_j − n max_{j=1,…,n} x_j .

Note that these two expressions are not defined in terms of the same set of linear pieces.

In the first case

f(x) = min { <a^i, x> : i ∈ Ī }   (5.1)

where a^i ∈ R^n is any vector satisfying (a^i, e) = 0 and (a^i, e^k) an integer less than or equal to one, k = 1,…,n (e is a vector of ones, e^k is a vector with one in position k and zeros elsewhere).

In the second case

f(x) = min { (v^i, x) : i = 1,…,n }   (5.2)

where v^i = e − n e^i, i = 1,…,n.
The implications of this distinction are not earth-shaking, but they may lead to distinct sequences; this is related to the question: when there is more than one subgradient, which one is chosen? This depends on the exact way subproblems are formulated and solved.
The optimal set is P = {αe : α real}; notice that (∂f(x), e) = 0, ∀x ∈ R^n, and thus with either the relaxation method or subgradient optimization (x^0, e) = (x^q, e) = (x̄, e), where x̄, the point to which the sequence converges (if it does converge), is given by:

x̄ = ((x^0, e)/n) e .

To simplify the notation, we will assume that (x^0, e) = 0, and thus "the" optimal point is x* = 0. The whole sequence takes place in the (n−1)-dimensional subspace S = {x ∈ R^n : (e, x) = 0}; and thus the problem (5.1) or (5.2) is effectively a problem in (n−1) dimensions.
If x ∈ S then f(x) = −n max_{i=1,…,n} x_i, and thus d(x) = ||x||.

It is possible to check that:

1. The extreme points of ∂f(0) are v^i, i = 1,…,n.
2. (v^i, v^j) = −n if i ≠ j, and ||v^i||² = n(n−1).
3. The solution of the n−1 equations (v^i, x) = −n, i = 1,…,j−1, j+1,…,n, such that x ∈ S, is given by v^j.
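Properties 2 and 3 (with property 3 read as (v^i, x) = −n for i ≠ j) are quick to verify numerically, e.g. for n = 5 (illustrative check):

```python
import numpy as np

n = 5
e = np.ones(n)
V = np.array([e - n * np.eye(n)[i] for i in range(n)])  # rows are v^i = e - n e^i

G = V @ V.T
# property 2: (v^i, v^j) = -n for i != j, and |v^i|^2 = n(n-1)
assert np.allclose(G - np.diag(np.diag(G)), -n * (1 - np.eye(n)))
assert np.allclose(np.diag(G), n * (n - 1))

# property 3: x = v^j lies in S and satisfies (v^i, x) = -n for every i != j
j = 2
assert np.isclose(V[j] @ e, 0.0)
assert np.allclose([V[i] @ V[j] for i in range(n) if i != j], -n)
```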
Proof.

μ = inf_x inf_{u ∈ ∂f(x)} (u, x*(x) − x) / (|u| |x*(x) − x|)

The cosine function is quasiconcave (on the domain where it is positive), and thus the inner infimum is attained at extreme points of ∂f(x). But for every x, every extreme point of ∂f(x) has norm √(n(n−1)).
Thus

μ = inf { n max_{i=1,…,n} x_i / √(n(n−1)) : Σ_{j=1}^n x_j = 0, |x| = 1 } = 1/(n−1)

(the inf max is attained for x_i = −(n−1)/√(n(n−1)), x_j = 1/√(n(n−1)), j ≠ i, i = 1,…,n). Q.E.D.
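The value μ = 1/(n−1) can be checked numerically. This sketch (n = 6 is an arbitrary choice) verifies that the extremal point above attains 1/(n−1), and that random unit vectors in S never do better.

```python
# Check mu = 1/(n-1): the extremal point attains it, and random unit
# vectors projected onto S never achieve a smaller value.
import math, random
random.seed(0)
n = 6
norm_v = math.sqrt(n * (n - 1))        # |v^i| = sqrt(n(n-1))

def cosine(x):
    # n * max_i x_i / (sqrt(n(n-1)) * |x|), the quantity being minimized
    nrm = math.sqrt(sum(t * t for t in x))
    return n * max(x) / (norm_v * nrm)

# extremal point: one coordinate -(n-1)/sqrt(n(n-1)), the rest 1/sqrt(n(n-1))
x_star = [1.0 / norm_v] * n
x_star[0] = -(n - 1) / norm_v
assert abs(sum(x_star)) < 1e-12                        # x in S
assert abs(sum(t * t for t in x_star) - 1.0) < 1e-12   # |x| = 1
assert abs(cosine(x_star) - 1.0 / (n - 1)) < 1e-12     # attains 1/(n-1)

for _ in range(2000):                  # random search never beats it
    x = [random.gauss(0.0, 1.0) for _ in range(n)]
    m = sum(x) / n
    x = [t - m for t in x]             # project onto S
    assert cosine(x) >= 1.0 / (n - 1) - 1e-12
```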
5. The minimum (μ or ν) is attained for x = λv^j + Ke, ∀λ > 0, ∀K, j = 1,…,n; the bound on the right is tight at x = −λv^i + Ke, ∀λ > 0, ∀K, i = 1,…,n, and on the left if x = λv^j + Ke, ∀λ > 0, ∀K, j = 1,…,n. Thus the rate of convergence, as proved in [7] for the relaxation method, is √(1 − 1/(n−1)²), and the sustainable rate of convergence [7] for subgradient optimization is given by z(ρ).
This is not very good. A better theory certainly exists.

We will try to show that improvements in the theory can probably be made by using the comparison with the S.O.R. technique, provided that all subgradients (except one) are used
in a cyclical order. What we will do here is to exhibit a sequence generated by the relaxation method (2.2) applied to A_n^H which is cyclical, and then indicate that this sequence is stable.
Let the initial point be:

Note that x^0_1 > x^0_2 > … > x^0_{n−1} > x^0_n, and that Σ_{k=1}^n x^0_k = (x^0, e) = 0. (We assume ρ ≠ 1.)
At the first iteration, the subgradient v^1 = e − n e^1 is chosen, and the second iterate will be:

x^1 is simply ρx^0 with a shift of one down of all component indices, except the n-th (which remains last) and the 1-st (which becomes the (n−1)-st). Condition (5.3) simply expresses the fact that the formula is consistent and that the sequence will repeat itself ((5.3) is an equation between ρ and σ).
where P_{nn} = 1, P_{i,i−1} = 1 for i = 2,…,n−1, P_{1,n−1} = 1, and P_{i,j} = 0 otherwise; and also x(k+n−1) = ρ^{n−1} P^{n−1} x(k) = ρ^{n−1} x(k).
The subgradients are used in the cyclical sequence

v^1, v^2, …, v^{n−1}, v^1, v^2, …, v^{n−1}, …
It can be checked that, for k = 0,…,n−2:

ŵ_i = x^k_i − x^k_n > 0 , ∀i = 1,…,n−1   (5.5)

The vector (ŵ_i, i = 1,…,n−1) is an eigenvector of the S.O.R. operator R = (I − σL̃)^{-1}((1 − σ)I + σŨ), where L̃ and Ũ are the strictly lower and upper triangular parts of the Gramian constructed on (v^1,…,v^{n−1})/√(n(n−1)). The operator R of the S.O.R. method has a characteristic equation (in variable ω) which is:
or

    det | ω + σ − 1     −σ/(n−1)     …  |
        | −σω/(n−1)   ω + σ − 1     …  |  =  0
        |     ⋮                 ⋱        |
which can be computed as (5.6).

This polynomial can be related to (5.3), which is obtained by letting ρ^{n−1} = ω and taking the (n−1)-th (real, positive) root of (5.6); this gives (5.7).

Every root ρ of (5.7) gives rise to a root of (5.6) given by ω = ρ^{n−1}. If (5.7) has (n−1) distinct roots ρ_i such that the ω_i = ρ_i^{n−1} are distinct, then ω_i, i = 1,…,n−1, are the (n−1) roots of (5.6).
It is not too surprising that (5.3) is identical to (5.7). In order that the sequence of points given by (5.2), (5.3), (5.4) be observable in practice, the stability of that sequence must be proved. Stability follows if two conditions are met:
Condition 5.1. The largest modulus root of (5.6), say ω⁺, must be real, positive, unique and larger in modulus than all other roots of (5.6) (the same statement holds mutatis mutandis for ρ⁺ = (ω⁺)^{1/(n−1)} if all roots of (5.7) are distinct); the vector ŵ⁰ given by (5.5) is the eigenvector of the operator R corresponding to ω⁺.

Condition 5.2. For a neighbourhood of ŵ⁰ (given by (5.5)), one needs that the same cyclic use of subgradients as given by (5.2), (5.3) and (5.4) be preserved for all iterations. A sketch of the long and uneventful proof follows (if one assumes all eigenvalues to be distinct so that a full set of eigenvectors exists): let ŵ = ŵ⁰ + ε⁰, where ŵ⁰ is given by (5.5) and ε⁰ is a linear combination of the eigenvectors of R (excluding ŵ⁰); the equations expressing that at iteration k + l(n−1) (where 0 ≤ k ≤ n−1) the subgradient to be used is v^k, are linear inequalities in terms of
variables, which are the components (possibly complex) of ε⁰ in terms of the nondominating eigenvectors of R, and are such that the coefficients of the variables are proportional to the l-th power of the corresponding eigenvalues (multiplied by constants which depend only on k and R), while the constant term is proportional to (ω⁺)^l (times constants which depend on k and R). Every one of these inequalities, for every k = 0,…,n−1 and l = 0,…,∞, contains zero, and hence contains a neighbourhood of zero. Thus, if ε⁰ belongs to some neighbourhood of zero, the indices are used in the cyclical order 1, 2, …, n−1, 1, 2, …, n−1, ….

A proof of this second condition for stability would become interesting if some useful characterization of these neighbourhoods of attraction into a given cyclical order could be given.
We will now move to experimentation on A_11^H. The number 11 was chosen because the routine (Jenkins-Traub) we had available to us broke down, for n rather small, on the polynomials given by (5.6) or (5.7).
Figure 2. Roots of (5.7) and experimental rates of convergence of the relaxation method applied to A_11^H.
For n = 11, Figure 2 shows the moduli of the roots of (5.7) for σ ∈ [1, 2]. It was checked that the 10-th powers of these roots are the roots of (5.6), because they are distinct (except at isolated points where, by continuity, it does not matter); in fact (5.6) was solved and this was confirmed (note: the routine we used broke down on (5.6) for values of σ above 1.7). The roots of (5.7) are shown here because they represent what has been called the rate of convergence of the relaxation method (2.2), while the roots of (5.6) represent the rate of convergence of cycles (n−1 iterations) of the S.O.R. method; comparisons with experiments reported in other works can be made.
It thus follows that we have shown that for σ ∈ [1, 1.43] the rate of convergence of the relaxation method (2.2) applied to A_11^H, as a function of σ, and for a set of initial points of dimension 11, is given by the graph of the largest root.
A similar graph could be drawn for any value of n. Experiments and the theory of polynomials show that (for σ ∈ (1, 2)):

1. (5.7) has either two, one double, or no positive real roots.
2. (5.7) has one negative root if n is even.
3. The complex roots of (5.7) become extremely close in modulus, but not identical (except, of course, for the conjugate pairs), as σ increases from 1 to 2, while the negative root (for n even) is always less in modulus than the complex roots, but comes very close in modulus as σ goes from 1 to 2.
4. The two complex roots which originate from the positive real roots, when σ increases, remain, in modulus, significantly less than the other complex roots.
5. The largest modulus root of (5.7) has a modulus greater than or equal to (σ−1)^{1/(n−1)} (this is classical in the S.O.R. theory).

It should be noted that if σ < 1, then R is a positive matrix, and thus the existence of a dominating positive real root follows from the Perron-Frobenius theorem.
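Item 5 can be illustrated numerically. The sketch below (with assumed values n = 11, σ = 1.2; the construction of R follows the definition above, with the normalized Gramian having unit diagonal and off-diagonal entries −1/(n−1)) applies R repeatedly and estimates its spectral radius from the growth rate; since det R = (1−σ)^{n−1}, the estimate for ω must be at least σ − 1, i.e. ρ ≥ (σ−1)^{1/(n−1)}.

```python
# Estimate the spectral radius of the S.O.R. operator
# R = (I - sigma*L)^(-1) ((1-sigma)I + sigma*U) for the normalized
# Gramian (unit diagonal, off-diagonal entries -1/(n-1)).
import math

n = 11          # as in the experiments
m = n - 1       # R acts on the (n-1)-dimensional space
sigma = 1.2     # an arbitrary relaxation parameter in (1, 2)
c = 1.0 / (n - 1)

def apply_R(x):
    # Solve (I - sigma*L) y = ((1-sigma)I + sigma*U) x by forward
    # substitution; L, U are the strictly lower/upper parts (entries +c).
    b = [(1.0 - sigma) * x[i] + sigma * c * sum(x[i + 1:]) for i in range(m)]
    y = [0.0] * m
    for i in range(m):
        y[i] = b[i] + sigma * c * sum(y[:i])
    return y

x = [1.0] * m
for _ in range(50):
    x = apply_R(x)
nrm50 = math.sqrt(sum(t * t for t in x))
for _ in range(50):
    x = apply_R(x)
nrm100 = math.sqrt(sum(t * t for t in x))
omega = (nrm100 / nrm50) ** (1.0 / 50)   # growth rate ~ spectral radius
assert sigma - 1 <= omega < 1.0          # item 5 lower bound; convergent
```

The upper bound ω < 1 reflects the fact that the underlying matrix is positive definite, so S.O.R. converges for σ ∈ (0, 2).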
Experiments to verify all of this were made, and the observed rate of convergence as a function of σ is plotted (Figure 2). For σ ∈ [1, 1.43] the experiments confirm the theory quite exactly (if σ ∈ (0, 1) the same would be true). What the experiments also indicate is that, after a few initial iterations, n−1 subgradients are used in a cycle (of order n−1); of course any (n−1) subgradients could be used in any cycle of order n−1. In this problem, cycling seems to be an attracting behaviour for all starting points (if σ ≤ 1.43), in that cycling occurs after some iterations.
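This experiment can be imitated in a few lines. The sketch below is not the paper's exact routine: it assumes the update x⁺ = x − σ(v^i, x)/|v^i|² · v^i with f* = 0, the index i chosen by maximum residual, and an arbitrary starting point; it checks that the iterates stay in S and that the distance to the optimum x* = 0 contracts geometrically.

```python
# Simulate a relaxation step with assumed update
# x+ = x - sigma*(v^i, x)/|v^i|^2 * v^i, i chosen by maximum residual.
import math

n = 5
sigma = 1.0

def step(x):
    i = max(range(n), key=lambda k: x[k])   # maximum residual index
    r = -n * x[i]                           # (v^i, x) on S
    coef = sigma * r / (n * (n - 1))        # sigma*(v^i, x)/|v^i|^2
    return [x[k] - coef * (1.0 - (n if k == i else 0.0)) for k in range(n)]

x = [4.0, 2.0, 1.0, -3.0, -4.0]             # sums to zero, so x in S
start = math.sqrt(sum(t * t for t in x))
for _ in range(500):
    x = step(x)
    assert abs(sum(x)) < 1e-9               # iterates stay in S
assert math.sqrt(sum(t * t for t in x)) < 1e-6 * start   # contraction
```

The per-step contraction follows from |x⁺|² = |x|² − σ(2−σ)(v^i, x)²/|v^i|² together with the bound max_i x_i ≥ |x|/√(n(n−1)) on S.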
Note that the rate of convergence obtained for σ = 1.43 is about .93, while the rate given by previous theories (for σ = 1) is ≈ .995; and thus the theory of acceleration is a significant improvement.
We should mention that this example is very particular: the matrix R depends, in general, on the set of subgradients used in a cycle, and on the ordering of the cycle; but in this case, because (v^i, v^j) = constant if i ≠ j, R does not change. This implies that, in general, a whole family of matrices R (and of polynomials) would need to be studied; and of course that the idea of one rate of convergence disappears (there might be many). Figure 2 also clearly shows that if σ ∈ (1.43, 2), the rate of convergence is quite different from the moduli of the eigenvalues of R (in fact, the concept of rate of convergence has not been exactly defined in this case).
The whole set of n subgradients is used in a noncyclical way, with no recognizable pattern. This also means that the theory of linear operators does not seem to provide any insights into convergence theory (of course convergence takes place at a rate of at most (1 − σ(2−σ)μ²)^{1/2}, but this is a rate which ignores the interactions between successive iterates). It also shows that in this case, when noncycling behaviour occurs, an extra acceleration (beyond the acceleration given by the S.O.R. theory) takes place. At this point it is quite unclear how this could be studied theoretically.
Almost all that was said in the previous section can be interpreted in terms of the solution of the system of linear equations (v^i, x) = 0, i = 1,…,n−1, where x ∈ S = {x ∈ R^n : Σ_{j=1}^n x_j = 0}. It shows that for values of σ below the optimal relaxation parameter (1.43), choosing indices according to the maximum residual gives rise to the same rate of convergence as a cyclic choice of indices.
Before ending this section, we would like to emphasize that we have dealt with only a few examples, and that not all that has been observed can be extended verbatim. A general theory might not be unattainable, but it seems to be exceedingly complex.
6. SUBGRADIENT OPTIMIZATION
In most optimization problems the optimal value of f is not known, and thus the relaxation method (2.2) is not implementable. Part of the interest in it comes from the fact that the two theories on rates of convergence (subgradient optimization and the relaxation method) appear to be related. Bounds on the rates of convergence of the two methods were shown to be identical, under some conditions, in [8]; those bounds did not account for any acceleration properties.

No theory has yet been developed which would show the acceleration of subgradient optimization. So back to experimentation on the example of section (5.2) (A_11^H).
Note that in the sequence of iterates given by (5.3), it is easy to see that |x(k+1) − x(k)| is proportional to ρ^k, and thus that if one started subgradient optimization from the point x(0), with geometrically decreasing stepsizes and with σ and ρ related by (5.3), then subgradient optimization could theoretically generate the same sequence of points (5.4) as the relaxation method. This did not happen in practice, because rounding errors made the sequence unstable.
What was observed in practice is that the sustainable rate of convergence of subgradient optimization was around .92, even
though the sequence of iterates used all n subgradients in a pattern with no discernible regularity. With subgradient optimization, rates of convergence between .92 and .9 were sometimes achieved, but this required that λ_0 be chosen very carefully, as convergence failed for high values of λ_0 as well as for low values of λ_0. This example (as well as TR48) clearly indicates a relationship between the rates of convergence of subgradient optimization and of the relaxation method; and thus that some form of acceleration occurs in subgradient optimization.
7. CONCLUSIONS, CONJECTURES AND AREAS OF FUTURE RESEARCH
Subgradient optimization has worked well in practice on quite a few problems generated from combinatorial problems. We think that a complete explanation of this practical effectiveness will require four parts.
Part 1
The theory of acceleration of the classical S.O.R. method extends, sometimes, to the relaxation method for inequalities and also to subgradient optimization. This is essentially what has been done in this paper, through a mixture of theory, examples and experiments. A general theory would need to add a few layers of complexity to the S.O.R. theory (in cases where it is not very complete, i.e., where matrices are not positive definite), and, if the behaviour is noncyclical, then the S.O.R. theory does not seem to help.
Part 2
For some classes of combinatorial problems there are universal bounds on μ and ν. For instance, we conjecture that for any assignment problem, μ = ν = 1/(n−1).
This conjecture is a bit misleading: the values of the condition numbers close to the optimal set (μ_c and ν_c) could be much above 1/(n−1). In this paper we showed that μ, ν, μ_c or ν_c are not very accurate indicators of the rate of convergence; so we will also conjecture that the assignment problem A_n^H (for which μ = ν = 1/(n−1)) will give the lowest rate of convergence
for both the relaxation method and subgradient optimization, if the acceleration phenomenon is taken into account.
Part 3
To every problem one could associate a rate of convergence achievable by subgradient optimization. We could thus make probabilistic statements about the rates of convergence of subgradient optimization if the data of a problem vary according to certain probability distributions.
In fact, every experiment with subgradient optimization could be viewed as a simulation experiment designed to estimate the probability distribution of the convergence rates. One should point out that the experiments reported in the published literature are a very biased sample, as it is a practice (sometimes unfortunate) not to report failed experiments.

For instance, problem A_48^H was studied, as a few assignment
problems of this size have been studied by Held, Wolfe and Crowder [12]. In [7], it was shown that for A48 (an assignment problem studied also in [12]) the best rate of convergence was .85. For A_48^H, the roots of (5.7) were studied, even though the routines for solving polynomials broke down. We found, by using Rouché's theorem (from the theory of complex variables), that the minimum value of ρ, subject to the condition that it be the strictly dominating real positive root of (5.7), was around .9926, for a value of σ around 1.693. (Rouché's theorem was used to show that all roots of (5.7), except .9926, are in a complex circle |z| ≤ .9926 − ε.) Comparing .9926 to .85 seems to indicate that "random" assignment problems will give rise to rates of convergence which are reasonably good, on the average.
Other classes of combinatorial problems should be studied; in fact, an assignment problem was chosen here because it is reasonably easy to study, and we should at least say that it would be quite absurd to solve assignment problems by using subgradient optimization.
We will also conjecture that the hardest problems from the point of view of subgradient optimization not only have low probability, but also are the easiest from a combinatorial point of view.

For instance, A_n^H is trivial, and yet it is hard from the point of view of subgradient optimization. It should be possible to relate μ_c and ν_c to the nature of the solution of the primal assignment problem.
This also indicates why subgradient optimization works quite well within a branch and bound framework [10]: subgradient optimization seems to work better when the combinatorial nature of the problem is hard.
REFERENCES
1. Agmon, Shmuel. 1954. The relaxation method for linear inequalities. Canadian J. Math., Vol. 6, pp. 382-392.

2. Allen, D.N. de G. 1954. Relaxation Methods in Engineering and Science. McGraw-Hill, New York.

3. Camerini, P.M., L. Fratta and F. Maffioli. 1975. On improving relaxation methods by modified gradient techniques. In Nondifferentiable Optimization, Mathematical Programming Study 3, edited by M.L. Balinski and P. Wolfe, North-Holland, Amsterdam, pp. 26-34.

4. Eremin, I.I. 1962. An iterative method for Chebyshev approximations of incompatible systems of linear inequalities. Dokl. Akad. Nauk SSSR, Vol. 143, pp. 1254-1256. English translation: Soviet Math. Dokl., Vol. 3, pp. 570-572.

5. Eremin, I.I. 1966. On systems of inequalities with convex functions on the left sides. Izv. Akad. Nauk SSSR Ser. Mat., Vol. 30, pp. 265-278. English translation: Amer. Math. Soc. Transl., Vol. 98, pp. 67-83.

6. Goffin, J.L. 1971. On the finite convergence of the relaxation method for solving systems of inequalities. Operations Research Center report ORC 71-36, University of California at Berkeley.

7. Goffin, J.L. 1977. On convergence rates of subgradient optimization methods. Mathematical Programming, Vol. 13, pp. 329-347.
8. Goffin, J.L. 1978. Nondifferentiable optimization and the relaxation method. In Nonsmooth Optimization, edited by C. Lemaréchal and R. Mifflin, Pergamon Press, Oxford, pp. 31-50.

9. Goffin, J.L. 1977. The relaxation method for solving systems of inequalities. Working paper 77-54, Faculty of Management, McGill University, Montreal, Quebec.

10. Held, M. and R.M. Karp. 1971. The traveling-salesman problem and minimum spanning trees: part II. Mathematical Programming, Vol. 1, pp. 6-25.

11. Held, M., R.M. Karp and P. Wolfe. 1972. Large-scale optimization and the relaxation method. In Proceedings of the 25th National ACM Meeting, Boston, Mass., August 1972.

12. Held, M., P. Wolfe and H. Crowder. 1974. Validation of subgradient optimization. Mathematical Programming, Vol. 6, pp. 62-88.

13. Herman, G.T., A. Lent and S.W. Rowland. 1973. ART: mathematics and applications. A report on the mathematical foundations and on the applicability to real data of the algebraic reconstruction techniques. J. Theor. Biol., Vol. 42, pp. 1-32.

14. Kahan, W. 1958. Gauss-Seidel methods of solving large systems of linear equations. Doctoral thesis, University of Toronto.

15. Kennington, J. and M. Shalaby. 1977. An effective subgradient procedure for minimal cost multicommodity flow problems. Management Science, Vol. 23, No. 9.

16. Motzkin, Th. and I.J. Schoenberg. 1954. The relaxation method for linear inequalities. Canadian J. Math., Vol. 6, pp. 393-404.

17. Lemaréchal, C. and R. Mifflin (Editors). 1978. Nonsmooth Optimization. Proceedings of an IIASA Workshop, 28 March to 8 April 1977. Pergamon Press, Oxford.

18. Nicolaides, R.A. 1974. On a geometrical aspect of S.O.R. and the theory of consistent ordering for positive definite matrices. Numer. Math., Vol. 23, pp. 99-104.

19. Oettli, W. 1972. An iterative method, having linear rate of convergence, for solving a pair of dual linear programs. Mathematical Programming, Vol. 3, pp. 302-311.

20. Polyak, B.T. 1969. Minimization of nonsmooth functionals. Zh. Vychisl. Mat. Mat. Fiz. (USSR Computational Mathematics and Mathematical Physics), Vol. 9, pp. 509-521.
21. Shor, N.Z. 1968. The rate of convergence of the generalized gradient descent method. Kibernetika, Vol. 4(3), pp. 98-99. English translation: Cybernetics, Vol. 4(3), pp. 79-80.

22. Shor, N.Z. 1976. Generalized gradient methods for nonsmooth functions and their applications to mathematical programming problems. Ekonomika i Matematicheskie Metody, Vol. 12(2), pp. 337-356 (in Russian).

23. Southwell, R.V. 1940. Relaxation Methods in Engineering Science. Oxford University Press, Oxford.
NUMERICAL EXPERIMENTS IN NONSMOOTH OPTIMIZATION

Claude Lemaréchal
INRIA, France
1. INTRODUCTION

The aim of this paper is to show the behavior of various methods for nonsmooth optimization, applied to various problems. In Section 2 we present the methods, in Section 3 the problems and the results. In Section 4 we draw some conclusions.
Notation. The function to be minimized (without constraints) is f(x), x ∈ R^N, and we denote by (·,·) (resp. |·|) the scalar product (resp. the norm) in R^N. The sequence of iterates generated by an algorithm is x_1,…,x_n, and g_1,…,g_n denote the subgradients that it uses; objective values and subgradients are computed simultaneously at each execution of a subprogram, characteristic of each problem but the same for each algorithm. When no confusion is possible, we will denote by x the current iterate x_n, and by x+ the forthcoming iterate x_{n+1}; the same notation g and g+ will hold for the subgradients; i will index iterations performed before the current one. The direction moved from x_n will be d or d_n, and the stepsize a or a_n.
2. THE METHODS

The first two methods are heuristic, in that they are supposed to be applicable only to differentiable objective functions.
2.1 BFGS. In this method the direction is d = -Hg and the next iterate is

x+ = x + ad,

where H is the N x N quasi-Newton matrix computed by the Broyden-Fletcher-Goldfarb-Shanno formula (Powell 1975). One has H_1 = I, the identity matrix, and, setting y = g+ - g and s = ad = x+ - x,

H+ = H + [1 + (y, Hy)/(s, y)] s s^T/(s, y) - (s y^T H + H y s^T)/(s, y),

where T denotes the transpose. The stepsize a is computed as follows. One starts from an initial guess a > 0; we take the one suggested by Fletcher (Wolfe 1975): a = 2(f_n - f_{n-1})/(g_n, d), i.e., the stepsize that would minimize f along d if f were quadratic and if its corresponding decrease were equal to the decrease obtained at the previous iteration.
One then tests whether the stepsize meets a certain stopping criterion, namely, whether

(g+, d) ≥ m_1 (g, d), and
f+ ≤ f + m_2 a (g, d),                                    (1)

where 0 < m_2 < m_1 < 1 are preassigned coefficients (in fact m_1 = 0.7, m_2 = 0.1).

If a is not convenient, one performs a series of adjustments until an a is found that satisfies (1). Noting that, at a point x+ = x_n + a d_n, (g(x_n + a d_n), d_n) is the derivative with respect to a of the one-dimensional function f(x_n + a d_n), (1) means that the objective has sufficiently decreased, and the derivative has sufficiently increased, relative to the initial slope (g_n, d_n) = (g, d).
The way the adjustments are made is rather standard and not significant enough to be described here. Suffice it to say that it is a safeguarded cubic interpolation.

These characteristics make BFGS equivalent to the subroutine VA13A of the Harwell Library. In nonsmooth optimization, such a method is heuristic. In particular, d_n may be uphill. Therefore, the only possible stopping criterion is the case of failure: the algorithm is run until no positive a satisfying (1) can be found. This event means that x_n is (very close to) a kink. However, it is likely to occur only when x_n is close to a minimum, thanks to the fact that the line search is by no means an attempt to minimize f along d_n. This intuitive statement is supported by numerical validation, which makes quasi-Newton methods, well-known and easy to program, a reasonable "faute de mieux" for nonsmooth optimization.
2.2 SHOR. The second method is Shor's (1971) dilation of the space along the difference of two successive gradients. Here d is again of the form -Hg, and the matrix H is a product of orthogonal affinities along the directions g_i - g_{i-1}. We refer to (Shor and Shabashova 1972) for the complete statement of the algorithm.

In this method there is no line-search; the stepsize is computed off-line as a geometric sequence:

a_n = a_1 q^(n-1),

where the parameters a_1 and q have to be tuned. Unfortunately we do not know any method to do it properly, so, in the present study, we had to do it empirically by running the algorithm with several values of a_1 and q--for each problem--and taking the combination that gives apparently the best results. Of course, this is possible only when the optimal solution is known. Therefore, we cannot really speak of an implementable algorithm.
However, a s w i t h 2 . 1 , n u m e r i c a l e v i d e n c e w i l l d e m o n s t r a t e
e x c e l l e n t b e h a v i o r o f t h i s method.
The n e x t methods a r e a l l based o n t h e u s e o f s - s u b g r a d i e n t s .
2.3 EPSDES. In this method, which could be called the method of ε-descent (Lemaréchal 1974), the direction is computed by orthogonal projection of the origin onto a set of subgradients, and the line-search is an approximate one-dimensional minimization.

If G = {g_1, ..., g_k} is a finite set in R^N, we denote by Nr G the unique solution of

min { |g| : g ∈ conv G }.

In EPSDES, a number ε > 0 is managed along the iterations. It is normally kept fixed, and it is divided by 10 when it is recognized that x_n (approximately) minimizes f within ε.

An iteration consists of finding a direction of ε-descent d_n, i.e., such that there exists a > 0 satisfying

f(x_n + a d_n) ≤ f(x_n) - ε,

and then x_n is updated to x_{n+1} such that a decrease of ε is obtained.

Constructing such a direction is itself a subalgorithm, made of a series of line-searches along trial directions d_n^1, ..., d_n^k until the proper decrease is obtained. At x_n, having generated g_n^1, ..., g_n^k (one starts with g_n^1 = g_n, the subgradient
at x_n), one computes

d_n^k = -Nr {g_n^1, ..., g_n^k}.

Then a line-search is performed along d_n^k:

. if a decrease by ε is obtained, then the iteration is finished;
. if not, a new g_n^{k+1} is obtained, that is approximately a subgradient at the minimum of f along d_n^k; g_n^{k+1} is added to G and a new direction d_n^{k+1} is computed.

Of course, the line-search is possible only if d_n^k ≠ 0. Hence, a convergence parameter η > 0 is used and the test

|d_n^k| ≤ η                                               (2)

is checked. If it is met, one has the approximate optimality condition

f(y) ≥ f(x_n) - ε - η|y - x_n|   for all y

(this is true only if f is convex; in the nonconvex case, the method is heuristic).
We refer to (Lemaréchal 1974) for an accurate statement of the algorithm. We consider it rather special. It strongly relies upon convexity and appears extremely heavy. Only a coarse experimental version has been programmed, and numerical results will show that, although it is very reliable, the convergence is usually very slow.
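The projection Nr G used by this method (and by the bundle methods below) can be sketched with a simple Frank-Wolfe loop over the unit simplex. This is only an illustrative stand-in; any small quadratic-programming routine would do in practice:

```python
import numpy as np

def nr(G, iters=4000):
    """Approximate Nr G: the minimum-norm point of conv{columns of G},
    by Frank-Wolfe on min |G @ lam|^2 over {lam >= 0, sum(lam) = 1}."""
    k = G.shape[1]
    lam = np.full(k, 1.0 / k)        # start at the barycenter
    for t in range(iters):
        s = G @ lam                  # current point of the hull
        scores = G.T @ s             # (g_j, s): linearized objective per vertex
        j = int(np.argmin(scores))   # vertex minimizing the linearization
        step = 2.0 / (t + 2.0)       # classical diminishing stepsize
        lam *= 1.0 - step
        lam[j] += step
    return G @ lam

# hull of (1,0) and (0,1): the minimum-norm point is (0.5, 0.5)
s = nr(np.array([[1.0, 0.0], [0.0, 1.0]]))
```

The returned point is -d, i.e., the projection of the origin onto the convex hull of the stored subgradients.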
2.4 CHAINE. This method is essentially that of (Lemaréchal 1975), but with a line-search based on (Wolfe 1975). This line-search is rather fundamental because it is a direct extension of 2.1, and it is identical for 2.4, 2.5, and 2.6. Therefore we expose it first. The only difference with (Wolfe 1975) is the test for null step.
At x_n are given the direction d_n and the two coefficients m_1 = 0.2, m_2 = 0.1. Also a tolerance ε' > 0 is given, and one has on hand a number p < 0 that estimates f'(x_n; d_n) (normally p = f'(x_n; d_n) = (d_n, g_n)). The aim of the line-search is to produce a point y = x_n + a d_n and a subgradient g at this point such that

(g, d_n) ≥ m_1 p,                                         (3)

and either

f(y) ≤ f(x_n) + m_2 a p                                   (4)

or

f(x_n) - f(y) + a (g, d_n) ≤ ε'.                          (5)

In cases (3), (4) a normal descent step is made from x_n to x_{n+1} = y = x_n + a d_n. It is a serious step (compare with (1)). If x_n is (very close to) a kink, it may happen that m_2 p < f'(x_n; d_n), and then (4) is impossible to obtain. Then a (sufficiently small) stepsize is found satisfying (3), (5): x_{n+1} is taken as x_n, and only the new subgradient is used for the forthcoming iteration. It is a null step.
As for the direction, it is d = -Nr G, where G = {g_1, ..., g_n} is reinitialized when a certain test is met. The convergence parameter ε is managed throughout the algorithm. When the direction is computed, the parameters for the line-search are p = -|d_n|^2 and ε' = 0.1 ε. The set G is reinitialized on two more occasions:

. when |d| ≤ η (see (2)); then ε is also divided by 10,
. when the number of subgradients to be stored exceeds a
preassigned limit; this prevents the algorithm from needing an infinite amount of storage.
Note that this algorithm is apparently similar to 2.3. However, there are two characteristics that make it fundamentally different:

. the test for descent,
. the test for reinitializing G

(in 2.3, G is reinitialized every time a descent is made).
2.5 DYNEPS. This algorithm was first presented in (Lemaréchal 1976). See also (Lemaréchal 1979). Its line-search is identical to that of 2.4, and a convergence parameter ε is also managed. The direction is computed in a slightly different way than in 2.3 and 2.4.

Call y_i the point at which the subgradient g_i has been computed (observe that y_i = x_i if the step a_{i-1} was a serious one; for each null step, y_i = x_{i-1} + a_{i-1} d_{i-1} ≠ x_i; for i = 1, y_1 = x_1). Then at each iteration define the numbers α_i^n, i = 1, ..., n, as follows:

α_1^1 = 0 and, for n = 1, ...,

α_{n+1}^{n+1} = 0 if a_n is a serious step,
α_{n+1}^{n+1} = |f(x_n) - f(y_{n+1}) + a_n (g_{n+1}, d_n)| if it is a null step.

For each serious iteration n, update, for i = 1, ..., n,

α_i^{n+1} = α_i^n + |f(x_{n+1}) - f(x_n) - (g_i, x_{n+1} - x_n)|

(note that this formula would leave α_i unchanged if it were applied at a null iteration). These formulae allow one to define the α_i^n recursively at each iteration. There is at least one i for which α_i^n = 0 (it is the index of the last serious iteration performed before the present one), and α_{n+1}^{n+1} ≤ ε' (see (5)).
If the absolute values were neglected, one would have

α_i^n = f(x_n) - f(y_i) - (g_i, x_n - y_i),

which is the error made at x_n when f is linearized at y_i. In the convex case, these absolute values do not play any part.

The direction-finding problem is now d_n = -Nr G(ε), where

G(ε) = { Σ λ_i g_i : λ_i ≥ 0, Σ λ_i = 1, Σ λ_i α_i^n ≤ ε }.

More precisely, the algorithm is as follows:

x_1 and g_1 ∈ ∂f(x_1) are given, together with the tolerances ε_min > 0, ε_max ≥ ε_min, η > 0, and the coefficients m_1 = 0.2, m_2 = 0.1. Set n = 1, α_1^1 = 0. Choose some ε in [ε_min, ε_max].

Step 1. Solve

min { |Σ λ_i g_i|^2 : λ_i ≥ 0, Σ λ_i = 1, Σ λ_i α_i^n ≤ ε }

and let s = Σ λ_i g_i be the solution. Also call μ ≥ 0 the multiplier of the last constraint.

Step 2. If |s| > η then set d_n = -s and go to Step 3. Otherwise, if ε ≤ ε_min, then stop. Otherwise take ε = max(ε_min, 0.1 ε) and go to Step 1.

Step 3. Apply the line-search of 2.4 with p = -[|d_n|^2 + με] and ε' = 0.1 ε. Obtain y = x_n + a_n d_n and g_{n+1} ∈ ∂f(y). In cases (3), (4) go to Step 4. In case (3), (5) go to Step 5.
S t e p 4. ( s e r i o u s i t e r a t i o n ) . S e t x ~ + ~ = y , a = 0. Change n+ 1 t h e , x i ' s i = 1 , . . . , n; c h o o s e a new = E [ g , F ] . S e t n = n + 1
and g o t o S t e p 1 .
S t e p 5. ( n u l l ~ t e r a t i o n ) . S e t x ~ + ~ = x n ; compute 2 n + l f Set n = n + 1 and g o t o S t e p 1 .
T h i s a l g o r i t h m is n o t t o t a l l y d e f i n e d b e c a u s e t h e c h o i c e
o f c a t S t e p U i s somewhat a r b i t r a r y . W e h a v e s t u d i e d s e v e r a l
p o s s i b i l i t i e s f o r t h i s c h o i c e :
E i s n o t changed i n S t e p U ( t h u s t h e o n l y o c c a s i o n a t j6)
which E is changed i s i n S t e p 2 , when it i s d i v i d e d by 1 0 ) ,
E + = A f whe re .lf > 0 i s t h e d e c r e a s e o f f o b t a i n e d d u r i n g
t h e l a s t s e r i o u s i t e r a t i o n , ( 9 )
< + = k [ f ( x n ) - min £ 1 where k i s a f i x e d number i n ] 0 , 11
( t h i s l a s t r u l e s u p p o s e d t h a t t h e o p t i m a l c o s t min f i s
known; we h a v e t e s t e d k = 0.1 , k = 0 . 5 and k = 1 ) . ( 1 0 )
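The weights α_i that drive this direction-finding problem are, up to the absolute values, linearization errors. A small sketch, recomputing them directly rather than recursively, with a hypothetical smooth convex f of our choosing:

```python
import numpy as np

def linearization_errors(f, ys, gs, x_n):
    """alpha_i = f(x_n) - f(y_i) - (g_i, x_n - y_i): the error at x_n of the
    linearization of f built at y_i; nonnegative whenever f is convex."""
    fx = f(x_n)
    return np.array([fx - f(y) - g @ (x_n - y) for y, g in zip(ys, gs)])

# toy data: f(x) = |x|^2, whose gradient at a point y is 2y
f = lambda x: float(x @ x)
ys = [np.array([1.0, 0.0]), np.array([0.0, 2.0])]
gs = [2.0 * y for y in ys]
alphas = linearization_errors(f, ys, gs, np.array([1.0, 1.0]))  # -> [1.0, 2.0]
```

A subgradient with a small α_i is a good ε-subgradient at the current iterate, which is why the constraint Σ λ_i α_i^n ≤ ε restricts the hull to "trustworthy" information.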
2.6 BFEPS. Finally, we have tested a rough version of the algorithm presented in (Lemaréchal 1978). As in 2.1, a quasi-Newton matrix H is updated at each serious iteration, by the BFGS formula. Then d = -Hs, where s = Σ λ_i g_i is the solution of

min { (1/2) (Σ λ_i g_i, H Σ λ_i g_i) + Σ λ_i α_i : λ_i ≥ 0, Σ λ_i = 1 }.

As for the line-search, it is the same as in 2.4 except that, instead of (5), a slightly different test is used for the null step.
3. THE TESTS

We now present the application of these algorithms to the test problems of (Lemaréchal and Mifflin, eds., 1979). Since each algorithm is applied to each test problem, the study is comparative. However, the results should not be considered reliable enough to allow for accurate comparative conclusions. The reason is that the algorithms tested here are experimental, only CHAINE being a polished product. Thus our presentation is more an illustration of the behavior of various methods than a normative analysis of their respective performances. Only very large differences (in speed of convergence, for example) are conclusive.

The speed of a method is characterized by two numbers:

. the number of line-searches (i.e., the number of times a direction is computed),
. the number of computations of function-gradient.

We think that the second is probably the more significant, since nonsmooth optimization seems normally devoted to problems in which function-gradient are expensive to compute.

We now review the problems and give the results.
3.1 MAXQUAD. In this problem, x ∈ R^N and f is the maximum of five convex quadratics:

f(x) = max { (A_k x, x) - (b_k, x) : k = 1, ..., 5 }.

See (Lemaréchal and Mifflin, eds., 1979) for the definition of A_k and b_k.
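An oracle of this shape, returning the value and one subgradient (the gradient of an active piece), can be sketched as follows; the random A_k, b_k are placeholders, not the actual MAXQUAD data of Lemaréchal and Mifflin:

```python
import numpy as np

def max_quad(x, As, bs):
    """f(x) = max_k (A_k x, x) - (b_k, x); a subgradient of f at x is the
    gradient 2 A_k x - b_k of any piece achieving the max."""
    vals = [x @ A @ x - b @ x for A, b in zip(As, bs)]
    k = int(np.argmax(vals))
    return vals[k], 2.0 * As[k] @ x - bs[k]

# five random symmetric positive definite pieces (placeholder data)
rng = np.random.default_rng(0)
Ms = [rng.standard_normal((3, 3)) for _ in range(5)]
As = [M @ M.T + np.eye(3) for M in Ms]
bs = [rng.standard_normal(3) for _ in range(5)]
val, sg = max_quad(np.zeros(3), As, bs)   # at 0 every piece has value 0
```

This is exactly the "subprogram" of the Notation paragraph: one call yields both the objective value and a subgradient.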
Table 1 displays the results for methods 2.1 to 2.6. Each line of the table corresponds to an iteration (line-search) and gives, for each method, the cumulative number of times function-gradient have been computed, and the current value of the objective function.

The parameters of 2.2 are a_1 = 10, q = 0.95. For method 2.5, the rule for ε at Step 4 is (9). Other rules for the same method are exhibited in Table 2, which reads as Table 1.
Table 1.- MAXQUAD Comparative tests (the table data are not legible in this transcription)
3.2 SHELL. An example of exact penalty is given with the second problem of Colville, where all the constraints are eliminated by an L1 penalty. See (Lemaréchal and Mifflin, eds., 1979) for the precise definition of the problem.

Method 2.2 uses the parameters a_1 = 10, q = 0.97.

It turns out that, for the present problem, most methods fail to converge to the solution (this bad behavior is apparently due to nonconvexity). Therefore, it is meaningless to compare speeds of convergence, so we rather compare robustness by recording, for each method, the best objective value obtained when the method stops. In order to have more illustrative results, we have made two series of experiments, where, in the second series, function-gradient are computed in double precision.

Table 3 shows, for each experiment, the number of line-searches, the number of computations of function-gradient, and the final objective value. Except for method 2.2, the behavior is very bad, and double precision does not substantially improve the situation.

It would be frustrating to limit this illustration of L1 penalty to such negative results, so Table 4 (which reads as Table 1) shows the same kind of experiments on the first problem of Colville (Shell primal).
3.3 EQUIL. This is a set of 3 examples of computation of economic equilibria. In terms of nonsmooth optimization they can be written

min max { f_k(x) : k = 1, ..., N' },                       (11)

where N = 5, 8 and 10 respectively (and, as usual, x ∈ R^N). See (Lemaréchal and Mifflin, eds., 1979) for the definition of the f_k. The functions f_k are defined only for x > 0 and tend to +∞ if a coordinate of x tends to 0. Therefore we extend each f_k by
Table 3.- SHELL DUAL Comparative tests

                  DOUBLE precision            SINGLE precision
              nb.    nb.    final         nb.    nb.    final
              iter.  obj.   obj.val.      iter.  obj.   obj.val.
BFGS           38    104    84.3           43    137    65.7
SHOR          365    365    32.4          199    199    32.6
EPSDES        747   3001    32.5          980   3562    32.7
CHAINE        193    787    38.2          244    979    39.6
DYNEPS (6)     92    329    33.9           63    224    36.5
DYNEPS (7)     69    306    34.4          111    463    34.1
DYNEPS (8)    105    376    33.6           56    246    34.1
DYNEPS (9)     73    271    37.2           85    337    33.8
DYNEPS 0.1    103    404    33.2           56    187    36.7
DYNEPS 0.5     42    161    34.4           86    307    34.7
DYNEPS 1      113    370    34.1           78    251    35.1
BFEPS           9     34  1648             38    112    49.7
+∞ outside the feasible domain. Furthermore, we do not really consider f_k but rather its restriction to the manifold Σ x_i = 1 (the gradient of such a restricted function is the projection of the original gradient onto the subspace Σ g_i = 0--see (Lemaréchal and Mifflin, eds., 1979)). As a result the minimax problem is unconstrained and methods 2.1-2.6 are applied as they stand.

For method 2.2, the parameters are a_1 = 1 and q = 0.95.
In addition to 2.5, we have tested a variant, which we call ECONEW, similar to DYNEPS except that the set G(ε) is larger: to the vectors g_1, ..., g_n that are subgradients of f at y_1, ..., y_n, one appends the vectors

g_{n+k} = ∇f_k(x_n), with the coefficients α_{n+k} = f(x_n) - f_k(x_n).

This makes d_n similar to the direction of the Newton method, when applied to the minimax problem (11). It so happens that (11) is a Haar problem, so that the Newton method has a superlinear rate of convergence (and so should have ECONEW).

Table 5 shows the results for the 3 problems. For each method it displays the number of line-searches, of computations of function-gradient, and the final value of the max-function f. Figure 1 illustrates the relative behavior of DYNEPS and ECONEW on the third problem (N = 10), with ε in Step 4 given by (6). It exhibits the better asymptotic behavior of ECONEW.
3.4 TR48. This is the dual of a transportation problem in R^48; the function to be minimized is defined in (Lemaréchal and Mifflin, eds., 1979). It is a piecewise linear function, with a very large number of linear pieces. The optimum value is -638565 but, working in single precision, one should be satisfied with objective values around -638500.
Table 5.- EQUIL Comparative tests (for each of the 3 problems: nb. iter., nb. obj., final obj.val.; the data are not legible in this transcription)
Figure 1.- EQUIL 10x5, DYNEPS & ECONEW, ε constant (6). Objective value as a function of the number of function-gradient evaluations (dotted curve: ECONEW; dashed curve: DYNEPS).
For method 2.2, the parameters are a_1 = 1000, q = 0.985. Table 6 (which reads as Table 3) shows the results, and demonstrates the excellent behavior of method 2.2. Method 2.3 was so slow that we could not consider that it converged in a reasonable amount of CPU time.
3.5 HILBERT. Finally, for curiosity, we have tested these algorithms on an ill-conditioned quadratic function:

f(x) = (1/2)(x, Ax) - (b, x),   x ∈ R^50,

where A is the Hilbert matrix a_ij = 1/(i+j-1), and b is such that the solution is (1, ..., 1).

From properties of Hilbert matrices (reasonable norm but small coercivity constant) it is easy to identify the optimal cost, but impossible to obtain the optimal solution up to a reasonable precision. Therefore we measure the quality of a solution x, not by its cost f(x), but by its deviation from the optimum: Σ_{i=1}^{50} (x_i - 1)^2.
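This test problem is easy to reproduce; a minimal sketch (the function names are ours):

```python
import numpy as np

def hilbert_problem(n=50):
    """f(x) = 1/2 (x, Ax) - (b, x) with A the n x n Hilbert matrix
    a_ij = 1/(i+j-1) and b = A @ (1,...,1), so that the minimizer is (1,...,1)."""
    i = np.arange(1, n + 1)
    A = 1.0 / (i[:, None] + i[None, :] - 1.0)   # Hilbert matrix
    ones = np.ones(n)
    b = A @ ones                                 # makes (1,...,1) the solution
    f = lambda x: 0.5 * x @ A @ x - b @ x
    grad = lambda x: A @ x - b
    deviation = lambda x: float(((x - ones) ** 2).sum())  # quality measure of 3.5
    return f, grad, deviation
```

Because A is extremely ill-conditioned, iterates with nearly optimal cost can still have a large deviation, which is why the deviation and not the cost is reported.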
Slnce the oblective functlon is differentiable, we have
tested--in additlon to algorithms of i2--two versions of the
conjugate gradient method, namely the subroutines VA08A and
VA14A of the Harwell library.
Table 7 shows, for each method, the number of llne searches,
of computatlons of function-qradlent, and the final devlatlon
from the optimal soluclon. A note ( ' ) indicates that the method
has stopped through roundoff errors in the line-search; other-
wise the stopplng crlterlon has normally worked.
It has been practically lmposslble to implement method 2.2
because, In thls example, we could not really know whlch crlterlon
tc use for ad]usclng parameters a, and p.
The resuits are qenerally modest !the contrary would have
been a Sla surprise) but lt 1s astonlsnlng to see that the most
preclse algorithm for estimatlna the optimal solutlon is 2.3,
whereas thls algorl~hm is devlsed to estimate the optlmal cost.
Table 7.- HILBERT Comparative tests

              nb.    nb.    final
              iter.  obj.   deviation
BFGS            8     16    .098 (1)
VA08A           5     15    .100 (1)
VA14A           8     20    .099 (1)
SHOR          100    100    .139
EPSDES         70    276    .004
CHAINE         16     35    .013
DYNEPS (6)     15     38    .010 (1)
DYNEPS (7)     15     35    .054 (1)
DYNEPS (8)     15     31    .018
DYNEPS (9)     13     32    .078 (1)
DYNEPS 0.1     20     38    .013
DYNEPS 0.5     11     29    .091 (1)
DYNEPS 1       14     31    .034 (1)
BFEPS          11     21    .018
4. CONCLUSION

The aim of this paper was first to demonstrate the validity of some recently proposed algorithms for nonsmooth optimization. We think that the variety of experiments (although they are purely academic) shows that these algorithms do behave consistently, even if their convergence is not always very fast. Failures have been recorded in only one instance (Shell dual), but some improvements of ε-subgradient methods are under study to better cope with nonconvexity, and more satisfactory results have already been obtained.

We have also exhibited the fact that it can be good practice to use a quasi-Newton method in nonsmooth optimization. The convergence is rather rapid, and often a reasonably good approximation of the optimum is found; this, in our opinion, is essentially due to the fact that inaccurate line-searches are made. Of course, there is no theoretical possibility to prove convergence to the right point (in fact counterexamples exist), neither are there any means to assess the results.
In terms of rapidity of convergence, Shor's dilation of the space along the difference of two successive gradients is an excellent method. However, it must be recalled that the question of stepsize is not yet solved, and this prevents the method from being really implementable.

Finally, it is rather amusing to compare the results on problems 3.4 and 3.5. Since 3.4 is piecewise linear, one should expect better results with methods 2.4 and 2.5 (which are based on piecewise linear approximations). On the other hand, problem 3.5 being quadratic, it is with methods 2.1 and 2.2 (which are based on quadratic approximations) that better results should be expected. Tables 6 and 7 show that it is the contrary that happens, and this raises the question: is there a well-defined frontier between quadratic and piecewise linear functions, or, more generally, between smooth and nonsmooth functions?
APPENDIX: Experiments with the ellipsoid algorithm

Because exceptional attention has recently been given to Shor's method of dilation of the space along the gradient (the so-called ellipsoid algorithm, popularized by Khachiyan), we have also programmed this method, as described in:

N.Z. Shor: Cut-off method with space extension in convex programming problems, Cybernetics 13 (1977) 94-96.

In this method, a parameter R must be given, which estimates the distance from the initial iterate x_1 to the optimal solution x*. Because x* is known in each of the present examples, we just set R = |x_1 - x*|.
T a b l e a shows t h e r e s u l t s by t h i s me thod , w i t h t h e 7 test- p r o b l e m s described i n S e c t i o n 3 .
Table 8. Experiments with the ellipsoid algorithm.

                  Number of calculations     Final value of
                  of function-gradient       objective function

MAXQUAD                  1383                  -.84114 (1)
SHELL DUAL               4399                  32.35   (2)
EQUIL 5 x 3               160                   0.03
EQUIL 8 x 5               398                   0.03
EQUIL 10 x 5              714                   0.09   (3)
HILBERT                   241                   0.003  (4)

(1) Value after 550 calculations: -.82
(2) Value after 1000 calculations: 83
(3) Starts diverging after that iteration
(4) Final value of |x - x*|^2
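The appendix does not reproduce the update rule itself; as a hedged illustration, the following sketch implements the standard ellipsoid step (a textbook formulation, not code from the paper), including the role of the parameter R as a bound on the distance from the starting point to a minimizer. The test function and all names are our own.

```python
import numpy as np

def ellipsoid_min(f, subgrad, x0, R, iters=200):
    """Sketch of the ellipsoid method (Shor's dilation along the gradient):
    keep an ellipsoid {y : (y-x)^T P^{-1} (y-x) <= 1} known to contain a
    minimizer, cut it through x using a subgradient, and replace it by the
    minimum-volume ellipsoid containing the remaining half.  R is the
    parameter described in the appendix: a bound on |x0 - x*|."""
    n = len(x0)
    x = np.asarray(x0, dtype=float)
    P = (R ** 2) * np.eye(n)                  # initial ball of radius R
    best_x, best_f = x.copy(), f(x)
    for _ in range(iters):
        g = subgrad(x)
        Pg = P @ g
        gPg = float(g @ Pg)
        if gPg <= 0.0:                        # g = 0 (x optimal) or collapse
            break
        x = x - Pg / ((n + 1) * np.sqrt(gPg))
        P = (n * n / (n * n - 1.0)) * (P - (2.0 / (n + 1)) * np.outer(Pg, Pg) / gPg)
        fx = f(x)
        if fx < best_f:
            best_x, best_f = x.copy(), fx
    return best_x, best_f

# Nonsmooth convex test function f(x) = |x1| + 2|x2|, minimum 0 at the origin.
f = lambda x: abs(x[0]) + 2.0 * abs(x[1])
sg = lambda x: np.array([np.sign(x[0]), 2.0 * np.sign(x[1])])
x0 = np.array([3.0, -2.0])
xb, fb = ellipsoid_min(f, sg, x0, R=float(np.linalg.norm(x0)))
```

As in the experiments above, only the best value found so far is retained, since the iterates of the ellipsoid method are not monotone in f.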
REFERENCES
Lemaréchal, C. (1974) An Algorithm for Minimizing Convex Functions. In Rosenfeld, ed., Proceedings of the IFIP Congress, Stockholm, 553-556. North-Holland Publishing Company.

Lemaréchal, C. (1975) An extension of "Davidon" methods to nondifferentiable problems. In Balinski and Wolfe, eds., Nondifferentiable Optimization, Mathematical Programming Study 3:95-109. North-Holland Publishing Company.

Lemaréchal, C. (1976) Combining Kelley's and Conjugate Gradient Methods. In Prékopa, ed., Abstracts of the IX Symposium on Mathematical Programming, Budapest.

Lemaréchal, C. (1979) Bundle methods in nondifferentiable optimization. In Lemaréchal and Mifflin, eds., Nonsmooth Optimization, 79-102. Pergamon Press.

Lemaréchal, C. (1978) Nonsmooth Optimization and Descent Methods. RR-78-4. Laxenburg, Austria: International Institute for Applied Systems Analysis.

Powell, M.J.D. (1975) Some Global Convergence Properties of a Variable Metric Algorithm for Minimization Without Exact Line-Searches. AERE Report CSS15. Harwell.

Shor, N.Z. (1971) A minimization method using the operation of dilatation of the space in the direction of the difference of two successive gradients. Cybernetics 3:450-459.

Shor, N.Z. and L.P. Shabashova (1972) Solution of minimax problems by the method of generalized gradient descent with dilation of the space. Cybernetics 1:88-94.

Wolfe, P. (1975) A method of conjugate subgradients for minimizing nondifferentiable functions. In Balinski and Wolfe, eds., Nondifferentiable Optimization, Mathematical Programming Study 3:145-173. North-Holland Publishing Company.

(1979) A set of nonsmooth optimization test-problems. In Lemaréchal and Mifflin, eds., Nonsmooth Optimization, 151-165. Pergamon Press.
CONVERGENCE OF A MODIFICATION OF LEMARÉCHAL'S ALGORITHM FOR NONSMOOTH OPTIMIZATION*

Robert Mifflin
Department of Pure and Applied Mathematics
Washington State University
Pullman, Washington USA
1. INTRODUCTION
We consider the problem of minimizing a locally Lipschitz
continuous function f on Rⁿ. We give a modification of an al-
gorithm due to Lemaréchal [2] and show convergence to a station-
ary point of f if f also satisfies a weak "semismoothness" [3,4]
hypothesis that is most likely satisfied by continuous functions
arising in practical problems. The method combines a general-
ized cutting plane idea with quadratic approximation of a La-
grangian. Even for the case of a convex f, as considered in [2],
this version differs from the original method, because of its
rules for line search termination and the associated updating of
the search direction finding subproblem. More specifically, our
version does not require a user-specified uniform lower bound on
the line search stepsizes.

A point x̄ ∈ Rⁿ is stationary if 0 ∈ ∂f(x̄), where ∂f is the generalized gradient [1] of f, i.e., ∂f(x) is the convex hull of all limits of sequences of the form {∇f(xk) : {xk} → x and f is

*This material is based upon work supported by the National Science Foundation under Grant No. MCS 78-06716.
differentiable at each xk}. Important properties of the mapping
∂f(·) are upper semicontinuity and local boundedness. If f is
convex (concave), ∂f equals the subdifferential (superdifferential)
of f, and if f is continuously differentiable (C¹), ∂f equals
the (ordinary) gradient {∇f}. It is possible to determine ∂f, or
at least to give one element of ∂f(x) at each x, for many other
functions f, such as those that are pieced together from C¹ func-
tions (for example, via maximization and/or minimization opera-
tions occurring in decomposition, relaxation, duality and/or
exact penalty approaches to solving optimization problems).
In order to implement the algorithm we suppose that we have
a subroutine that can evaluate a function g(x) ∈ ∂f(x) for each
x ∈ Rⁿ. Of course, we are especially interested in the case where
g is discontinuous at stationary points of f. Associated with f
(and g) let α : Rⁿ × Rⁿ → R₊ be a nonnegative-valued function satisfying

α(x,y) → 0 if x → x̄ and y → x̄, (1a)

α(z,y) - α(x,y) → 0 if x → x̄, z → x̄ and y → ȳ, (1b)

and

if x → x̄, y → x̄, α(x,y) → 0 and g(y) → ḡ, then ḡ ∈ ∂f(x̄). (1c)

α(x,y) is intended to be an indication of how much g(y) ∈ ∂f(y)
deviates from being a generalized gradient at x. If f is con-
vex we take

α(x,y) = f(x) - f(y) - <g(y), x - y>,

which measures the deviation from linearity of f between y and
x. For a nonconvex f a possibility is to take α(x,y) to be some
"distance" between x and y, in which case property (1c) follows
from the upper semicontinuity of ∂f.
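For the convex case the nonnegativity of the linearization error α and its vanishing when x = y can be checked numerically; a minimal sketch (the piecewise linear test function and the particular subgradient rule are our own choices, not from the paper):

```python
import numpy as np

# Convex piecewise-linear example f(x) = max(x, -2x) with one subgradient rule.
f = lambda x: max(x, -2.0 * x)
g = lambda y: 1.0 if y > 0 else -2.0   # a valid element of the subdifferential

def alpha(x, y):
    """Linearization error alpha(x, y) = f(x) - f(y) - <g(y), x - y>."""
    return f(x) - f(y) - g(y) * (x - y)

# For convex f and any valid subgradient choice, alpha(x, y) >= 0 everywhere,
# and alpha(x, x) = 0, consistent with properties (1a)-(1c).
vals = [alpha(x, y) for x in np.linspace(-2, 2, 41) for y in np.linspace(-2, 2, 41)]
```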
2. THE ALGORITHM
Given a positive integer k, n-vectors xk, g(xk), yi and
g(yi) for i = 1,2,...,k, and a positive definite n×n matrix Ak,
solve for (d,v) = (dk,vk) ∈ Rⁿ⁺¹ the kth quadratic programming sub-
problem:

minimize 1/2 <d, Ak d> + v

subject to <g(xk), d> ≤ v,

-α(xk, yi) + <g(yi), d> ≤ v for i = 1,2,...,k.

If vk = 0 stop.

Otherwise (vk < 0 and dk ≠ 0) perform a line search from xk
along dk to find (if possible) two (possibly the same) stepsizes
tL ≥ 0 and tR > 0 and two corresponding points, xL = xk + tL dk and
yR = xk + tR dk, such that

f(xL) ≤ f(xk) + mL tL vk (2a)

and

-α(xL, yR) + <g(yR), dk> ≥ mR vk, (2b)

where mL and mR are fixed parameters satisfying 0 < mL < mR < 1.

If the line search is successful, repeat the above procedure with
the (k+1)st subproblem defined by setting

xk+1 = xL and yk+1 = yR,

replacing in the subproblem constraints

g(xk) by g(xL) and α(xk, yi) by α(xk+1, yi) for i = 1,2,...,k,

appending the constraint

-α(xk+1, yk+1) + <g(yk+1), d> ≤ v,

and replacing Ak by a positive definite matrix Ak+1.
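For the smallest nontrivial case, a bundle of two elements with Ak = I, the direction-finding subproblem can be solved in closed form through its dual. The following sketch is our own construction for illustration (it is not the solver proposed in the paper, which extends the constrained least squares method of [5]); it exhibits the optimal multipliers and the fact that vk < 0 and dk ≠ 0 unless the bundle certifies stationarity.

```python
import numpy as np

def direction_step(g0, g1, alpha1):
    """Solve  min 1/2|d|^2 + v  s.t.  <g0,d> <= v,  -alpha1 + <g1,d> <= v
    (Ak = I, two bundle elements) via its dual over lam in [0, 1]."""
    diff = g0 - g1
    denom = float(diff @ diff)
    if denom == 0.0:
        lam = 1.0
    else:
        lam = float(np.clip((alpha1 - diff @ g1) / denom, 0.0, 1.0))
    d = -(lam * g0 + (1.0 - lam) * g1)     # convex combination of subgradients
    v = -(d @ d) - (1.0 - lam) * alpha1    # dual relation for the optimal v
    return d, v, lam

# Hypothetical data: two subgradients and a positive linearization error.
g0 = np.array([1.0, 2.0])
g1 = np.array([-1.0, 0.5])
d, v, lam = direction_step(g0, g1, 0.1)
```

For this data both constraints are active at the optimum, v is strictly negative, and d is a descent direction candidate, which is exactly the situation in which the line search is initiated.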
3. REMARKS ON THE ALGORITHM
For the starting point x1 we may set y1 = x1. If this is the
case then α(x1,y1) = 0 and g(x1) = g(y1), so that the two starting
constraints are the same. In general it may be the case that
xk = yi for some i ∈ {1,2,...,k}. In this case the first constraint
may be disregarded during subproblem solution, because it is
included among the other constraints.

Elsewhere we will report on how to extend the numerically
stable constrained least squares algorithm in [5] for solving
more general quadratic programming problems. This will result
in a reliable method for solving the dual of the subproblem
given here.

The scalar vk can be interpreted as an approximation to the
directional derivative of f at xk in the direction dk. By con-
vex quadratic programming duality theory, as in [2],

Ak dk = -[λ0k g(xk) + Σ_{i=1}^{k} λik g(yi)] (3a)

and

vk = -<dk, Ak dk> - Σ_{i=1}^{k} λik α(xk, yi), (3b)

where λik ≥ 0 for i = 0,1,...,k are dual variables (multipliers)
associated with the kth subproblem such that

Σ_{i=0}^{k} λik = 1. (3c)

By the first subproblem constraint, the Cauchy-Schwarz inequal-
ity, (3a), the nonnegativity of λik and α, and the positive
definiteness of Ak we have

-|g(xk)| |dk| ≤ <g(xk), dk> ≤ vk ≤ -<dk, Ak dk> ≤ 0. (4)
Therefore if vk ≠ 0 then vk < 0 and dk ≠ 0 and, hence, the line
search may be initiated.

Let γk > 0 be the smallest eigenvalue of Ak. Then from (4)

vk ≤ -γk |dk|², (5)

and we have the following:

Lemma 1. If vk = 0 then xk is stationary.

Proof. If vk = 0 then, from (5), dk = 0 and, by (3),

Σ_{i=1}^{k} λik α(xk, yi) = 0 (6a)

and

λ0k g(xk) + Σ_{i=1}^{k} λik g(yi) = 0. (6b)

The nonnegativity of λik and α and (6a) imply that α(xk,yi) = 0
for each i such that λik > 0. Thus, by property (1c), g(yi) is
an element of the convex set ∂f(xk) for each i such that
λik > 0. Now, stationarity of xk follows from (6b) and (3c).

Relative to vk and dk, the line search termination crite-
rion (2a) guarantees sufficient function value decrease, while
(2b) along with the definition of α provides sufficient (approx-
imate) directional derivative increase. More specifically,
because mR < 1, (2b) causes (dk,vk) to be infeasible in sub-
problem k+1. If f is weakly uppersemismooth (see Appendix)
then, because of (1a) and the parameter inequality mL < mR, a
simple search procedure, such as in [3], can be designed to find
tL and tR or to generate an increasing sequence {tℓ} such that
{f(xk + tℓ dk)} → -∞. In order to deal effectively with regions on
which f is smooth it is recommended that mL < 1/2.

Note that due to nondifferentiability of f it is possible
that tL = 0, so xk+1 = xk and the first constraint is not changed.
But in this case, since tR > 0, yk+1 ≠ xk+1, so the appended
constraint is clearly different from the first constraint and
its inclusion improves our approximate knowledge of ∂f(xk). In
fact, this difference holds in general when tL ≠ tR, and this is
what detects and deals with discontinuities of g.

We do not discuss the important question of updating Ak
here. We conjecture that Ak should be chosen to converge to the
Hessian of some Lagrangian associated with the limiting optimal
multipliers of the subproblems. This is the subject of ongoing
research where we are developing tests to identify iteration
indices k where we may simultaneously make a variable metric
update of Ak and reduce or aggregate the constraint bundle.
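The search procedure of [3] is not reproduced here; as a hedged sketch under our own simplifications, the following doubling/bisection routine looks for a single stepsize t = tL = tR satisfying a decrease condition of type (2a) and, since the α term vanishes when xL = yR, a subgradient condition of type (2b). All names and the 1-D test function are our own.

```python
import numpy as np

def wolfe_bisection(f, g, x, d, v, mL=0.25, mR=0.9, max_iter=50):
    """Find t with  f(x + t d) <= f(x) + mL*t*v   (decrease, as in (2a))
    and  g(x + t d)*d >= mR*v   (with tL = tR the alpha term in (2b) is 0).
    A sketch only; not the procedure of reference [3]."""
    tlo, thi, t = 0.0, np.inf, 1.0
    for _ in range(max_iter):
        y = x + t * d
        if f(y) > f(x) + mL * t * v:
            thi = t                       # too long: decrease failed
        elif g(y) * d < mR * v:
            tlo = t                       # too short: subgradient test failed
        else:
            return t
        t = 2.0 * t if thi == np.inf else 0.5 * (tlo + thi)
    return t

# 1-D convex example f(x) = |x|, starting at x = 2 with d = -1 and v = -1.
f = lambda x: abs(x)
g = lambda x: 1.0 if x > 0 else -1.0
t = wolfe_bisection(f, g, 2.0, -1.0, -1.0)
```

Note that mL = 0.25 < 1/2 < mR = 0.9, consistent with the parameter recommendations above.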
4. CONVERGENCE

In this section we establish three lemmas that prove
part (a) of the following convergence theorem. Part (b) follows
from part (a), because a stationary point for a semiconvex func-
tion (see Appendix) is a minimizing point [4] and because every
accumulation point of {xk} has the same f-value due to the mono-
tonicity of {f(xk)}.

Theorem. Suppose {xk}, {yk} and {Ak} are uniformly bounded with
{Ak} uniformly positive definite. Then

(a) at least one of the accumulation points of {xk} is
stationary, and

(b) if f is semiconvex on Rⁿ, every accumulation point of
{xk} minimizes f.

Remark. If {x : f(x) ≤ f(x1)} is bounded then {xk} is bounded,
and {yk} can be made bounded by choosing an additional parameter
θ > 0 and imposing the additional line search requirement that

(tR - tL) |dk| ≤ θ. (7)

For weakly uppersemismooth functions it is possible to simulta-
neously satisfy (2) and (7) after a finite number of line search
steps.
Consider the following assumption, which is trivially satis-
fied if the matrix sequence {Ak} is uniformly positive definite:

If {dk} has no zero accumulation point then
{γk} has no zero accumulation point. (8)

Lemma 2. Suppose (8) holds and {xk} and {yk} are uniformly bounded.
Then {dk} has at least one zero accumulation point.

Proof. Suppose, for purposes of a proof by contradiction, that there
exists a positive number δ such that

|dk| ≥ δ > 0 for all k.

Then, by (8), there exists a positive number γ such that

γk ≥ γ > 0 for all k

and, by (5),

vk ≤ -γδ² < 0 for all k. (9)

Thus, since {xk} is assumed bounded and ∂f(·) is locally bounded,
{g(xk)} and, hence, {vk} and {dk} are bounded. Let v̄ and d̄ be
accumulation points of {vk} and {dk}, respectively. Then,
by (9),

v̄ ≤ -γδ² < 0. (10)

By (2a) and (9),

f(xL) - f(xk) ≤ mL tL vk ≤ -mL tL γδ |dk|

or, since xk+1 = xL = xk + tL dk,

f(xk+1) - f(xk) ≤ -mL γδ |xk+1 - xk|. (11)
For any p ≥ k+1, (11) and the triangle inequality imply

f(xp) - f(xk) ≤ -mL γδ |xp - xk|. (12)

As f is continuous and {xk} is assumed bounded, the monotone
nonincreasing sequence {f(xk)} is bounded from below and, hence,
convergent, so by (12)

|xp - xk| → 0 as p, k → ∞.

These facts together with property (1b) and the assumed bounded-
ness of {yk} imply that

α(xp, yk+1) - α(xk+1, yk+1) → 0 as p, k → ∞. (13)

Also, for any p ≥ k+1 we have, by the pth subproblem feasi-
bility, that

-α(xp, yk+1) + <g(yk+1), dp> ≤ vp

and, by (2b) with xL = xk+1 and yR = yk+1, that

-α(xk+1, yk+1) + <g(yk+1), dk> ≥ mR vk.

Subtracting the latter from the former inequality gives

<g(yk+1), dp - dk> ≤ vp - mR vk + α(xp, yk+1) - α(xk+1, yk+1).

Now choose p and k in K, an infinite set of integers where
{dk}k∈K → d̄ and {vk}k∈K → v̄, so that from (12), (13) and the
boundedness of {g(yk+1)} we have

0 ≤ v̄ - mR v̄ = (1 - mR) v̄.
Since mR < 1, this implies that v̄ ≥ 0, which contradicts (10)
and completes the proof.

Lemma 3. Suppose that K is such that {dk}k∈K → 0, {Ak dk}k∈K → 0
and {<g(xk), dk>}k∈K → 0. Then {vk}k∈K → 0,

{Σ_{i=1}^{k} λik α(xk, yi)}k∈K → 0

and

{λ0k g(xk) + Σ_{i=1}^{k} λik g(yi)}k∈K → 0,

where the λik ≥ 0 satisfy (3).

Proof. The conclusions follow from the hypotheses, (4), (3a) and (3b).

Lemma 4. In addition to the hypotheses of Lemma 3, suppose that
{xk}k∈K and {yk}k∈K are uniformly bounded and let x̄ be any
accumulation point of {xk}k∈K. Then x̄ is stationary.
Proof. As in the proof of Thm. 5.2 in [3], depending on the
local boundedness and upper semicontinuity of ∂f and the proper-
ties of convex combinations, Lemma 3 implies the existence of a
positive integer m ≤ n+1, an infinite subset J ⊂ K and convergent
subsequences

{(xk, g(xk))}k∈J → (x̄, g⁰) ∈ Rⁿ × ∂f(x̄),

{(yⁱk, g(yⁱk))}k∈J → (yⁱ, gⁱ) ∈ Rⁿ × Rⁿ for i = 1,2,...,m,

and

{λik}k∈J → λ̄i ≥ 0 for i = 0,1,...,m,

such that

λ̄0 + Σ_{i=1}^{m} λ̄i = 1 and λ̄0 g⁰ + Σ_{i=1}^{m} λ̄i gⁱ = 0,

and for i = 1,2,...,m

λ̄i α(x̄, yⁱ) = 0.

Now, stationarity of x̄ follows from property (1c) as in the
proof of Lemma 1.
5. EXTENSION TO CONSTRAINED PROBLEMS
Finally, we remark that, using ideas in [3], the algorithm
can be extended to become a feasible point method for dealing
with constrained optimization problems involving semismooth
functions.
6. APPENDIX
A function f : Rⁿ → R is weakly uppersemismooth [3] at
x ∈ Rⁿ if

(a) f is Lipschitz continuous on a ball about x, and

(b) for each d ∈ Rⁿ and for any sequences {tk} ⊂ R₊ and
{gk} ⊂ Rⁿ such that {tk} ↓ 0 and gk ∈ ∂f(x + tk d), it follows
that

lim inf_{k→∞} <gk, d> ≥ lim sup_{t↓0} [f(x + td) - f(x)]/t.

It can be shown that the right-hand side of the above inequality
is in fact equal to

lim_{t↓0} [f(x + td) - f(x)]/t = f'(x; d),

the directional derivative of f at x in the direction d.
The class of weakly uppersemismooth functions strictly
contains the class of semismooth [4] functions. This latter
class is closed under composition and contains convex, concave,
C¹ and many other locally Lipschitz functions, such as ones that
result from piecing together C¹ functions.

A function f : Rⁿ → R is semiconvex [4] at x ∈ Rⁿ if

(a) f is Lipschitz continuous on a ball about x;

and for each d ∈ Rⁿ, f'(x; d) exists and satisfies

(b) f'(x; d) = max{<g, d> : g ∈ ∂f(x)};

(c) f'(x; d) ≥ 0 implies f(x + d) ≥ f(x).

An example of a nondifferentiable nonconvex function that
is both semismooth and semiconvex is log(1 + |x|) for x ∈ R.
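The example can be checked numerically: at x = 0 the generalized gradient of f(x) = log(1 + |x|) is the interval [-1, 1], so f'(0; d) = max{g*d : g in [-1, 1]} = |d|, and properties (b) and (c) of semiconvexity are easy to observe. A small sketch of our own:

```python
import math

# f(x) = log(1 + |x|): nondifferentiable at 0, nonconvex, and both
# semismooth and semiconvex.  df(0) = [-1, 1], so f'(0; d) = |d|.
f = lambda x: math.log(1.0 + abs(x))

def fprime(x, d, t=1e-8):
    """One-sided difference quotient approximating f'(x; d)."""
    return (f(x + t * d) - f(x)) / t

# Property (b) at x = 0: f'(0; 1) = f'(0; -1) = 1 = |d| for |d| = 1.
# Property (c) at x = 0: f'(0; d) = |d| >= 0 and indeed f(d) >= f(0) = 0.
```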
REFERENCES
1. F.H. Clarke. 1975. Generalized Gradients and Applications. Trans. Amer. Math. Soc. 205, pp. 247-262.

2. C. Lemaréchal. 1978. Nonsmooth Optimization and Descent Methods. RR-78-4. International Institute for Applied Systems Analysis, Laxenburg, Austria.

3. R. Mifflin. 1977. An Algorithm for Constrained Optimization with Semismooth Functions. Mathematics of Operations Research 2, pp. 191-207.

4. R. Mifflin. 1977. Semismooth and Semiconvex Functions in Constrained Optimization. SIAM J. Control and Optimization 15, pp. 959-972.

5. R. Mifflin. 1979. A Stable Method for Solving Certain Constrained Least Squares Problems. Mathematical Programming 16, pp. 141-158.
SUBGRADIENT METHOD FOR MINIMIZING WEAKLY CONVEX FUNCTIONS AND ε-SUBGRADIENT METHODS OF CONVEX OPTIMIZATION

E.A. Nurminski
International Institute for Applied Systems Analysis
Laxenburg, Austria

1. INTRODUCTION
Optimization methods are very important in systems analysis.
Not all systems analysis problems are optimization problems, of
course, but in any systems problem optimization methods are im-
portant and useful tools. The power of these methods and their
ability to handle different problems makes it possible to analyze
and construct very complicated systems. Economic planning, for
instance, would be greatly limited without the use of linear
programming (LP) techniques.

However, linear programming is not the only method of opti-
mization. Problems including factors such as uncertainty, only
partial knowledge of the system, and conflicting goals require
more sophisticated methods for their solution--methods such as
nondifferentiable optimization.

This paper considers the common situation which arises when
the outcomes of particular decisions cannot be estimated without
solving a difficult auxiliary problem. The solution of this aux-
iliary problem can be very time-consuming and may limit the
analysis of different decisions in the original problem. This
paper develops methods of optimal decision making which avoid
the direct comparison of decisions and which use only information
which is readily accessible from a computational point of view.
2. THE PROBLEM

This paper deals with the finite-dimensional unconditional
extremum problem

min f(x), x ∈ Eⁿ, (1)

where the objective function has no continuous derivatives with
respect to the variable x = (x1,...,xn). Various methods to solve
problem (1) with many types of nondifferentiable objective
functions have been discussed and suggested in the relevant
literature. The bibliography published in [1] gives a fairly good
notion of these works. It should be emphasized that the nondif-
ferentiability of the objective function in problem (1) is, as a
rule, due to the complexity of the function's structure. A repre-
sentative example is minimax problems, where the objective function
f(x) is the result of maximization of some function g(x,y) with
respect to the variables y:

f(x) = max_{y∈Y} g(x,y). (2)

In this case even a simple computation of the value of f
at some fixed point may be quite a time-consuming task which
requires, strictly speaking, an infinite number of operations.
With this in mind, it seems interesting from the standpoint
of theory and practice to investigate the feasibility of solving
problem (1) with approximate computation of the function f(x)
and of its subgradients (if the latter are determined for the
given type of nondifferentiability). To the best of our
understanding, ε-subgradients of functions of the form (2),
introduced by R.T. Rockafellar [2], are quite convenient for
constructing numerical methods, and so we offer here some
results generalizing efforts in this direction [3-5].
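The minimax construction (2) can be made concrete: replacing the inner maximization by a finite grid gives an approximate (ε-accurate) value of f, and differentiating g with respect to x at the maximizing y gives an approximate subgradient. The particular inner function g(x,y) below is our own choice for illustration.

```python
import numpy as np

# Inner problem: f(x) = max_{y in [0,1]} g(x, y) with g(x, y) = x*y - y**2.
# Exactly, f(x) = x**2/4 for 0 <= x <= 2, attained at y* = x/2.
def f_approx(x, m=1001):
    ys = np.linspace(0.0, 1.0, m)     # a grid replaces the exact maximization
    vals = x * ys - ys ** 2
    j = int(np.argmax(vals))
    value = float(vals[j])            # f(x) up to a grid-dependent error
    subgrad = float(ys[j])            # d/dx g(x, y) = y at the maximizing y
    return value, subgrad

v, s = f_approx(1.0)
```

Even this crude scheme shows why an exact evaluation of f is, strictly speaking, an infinite computation: accuracy is bought only with ever finer inner maximizations.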
3. WEAKLY CONVEX FUNCTIONS

The discussion of a class of nondifferentiable functions
broader than the convex functions enables us to gain sub-
stantially in generality at the expense of a minor increase in
complexity. Properties of the class which will be treated
are described by the following definition [6]:

Definition. The continuous function f(x) is called a
weakly convex function if for each x there exists at least one
vector g such that

f(y) - f(x) ≥ (g, y - x) + r(x,y) (3)

for all y, and the residual term r(x,y) satisfies the condition
of uniform smallness with respect to ||x - y|| in each compact sub-
set of Eⁿ, i.e., in any compact set K ⊂ Eⁿ for any ε > 0 there
exists δK > 0 such that for ||x - y|| ≤ δK, x,y ∈ K,

|r(x,y)| ≤ ε ||x - y||.

Notice that no constraints are imposed on the sign of the
residual term r(x,y). Furthermore, strengthening (3), it is pos-
sible to add to r(x,y) any expression of the form φ(||x - y||), where

φ(t) ≥ 0, φ(t) t⁻¹ → 0 for t → +0.

The term weakly convex functions is suggested by analogy
to the strongly convex functions studied by B.T. Polyak [7].

We will call the vector g satisfying (3) the subgradient
of the function f(x) and will denote the set of subgradients at
the point x by ∂f(x).

We will now describe some simple properties of weakly
convex functions and of their subgradients.
Lemma 1. With respect to x, ∂f(x) is a convex, closed, bounded
and upper semicontinuous multivalued mapping.

The proof of these properties presents no special problems.

Lemma 2. Let f(x,a) be continuous with respect to a and
weakly convex with respect to x for each a belonging to the
compact topological space A. That is,

f(y,a) - f(x,a) ≥ (g_a, y - x) + r_a(x,y) (4)

for all y, where r_a(x,y) satisfies the condition of uniform
smallness uniformly with respect to a ∈ A. Then

f(x) = max_{a∈A} f(x,a) (5)

is a weakly convex function.

The proof is rather simple. Let

A(x) = {a : f(x,a) = f(x)}.

Then, considering (4) for a ∈ A(x), we obtain

f(y) - f(x) ≥ f(y,a) - f(x,a) ≥
≥ (g_a, y - x) + r_a(x,y) ≥
≥ (g_a, y - x) + r̄(x,y),

where

r̄(x,y) = inf{r_a(x,y) : a ∈ A}.

It is easily seen that r̄(x,y) satisfies the necessary conditions of
uniform smallness, and the lemma is proved.
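Lemma 2 can be illustrated numerically: f(x) = max(sin x, cos x) is a maximum of two weakly convex (indeed C¹) branches, the derivative of any active branch is a subgradient, and inequality (3) holds with the residual choice r(x,y) = -0.5(y - x)², which is uniformly small relative to |x - y|. The example and all names below are our own.

```python
import math

# f(x) = max(sin x, cos x); a subgradient at x is the derivative of an
# active branch.  Since |f''| <= 1 on each branch, inequality (3) holds
# with r(x, y) = -0.5*(y - x)**2.
f = lambda x: max(math.sin(x), math.cos(x))
g = lambda x: math.cos(x) if math.sin(x) >= math.cos(x) else -math.sin(x)

def gap(x, y):
    """f(y) - f(x) - g(x)*(y - x) - r(x, y), with r(x, y) = -0.5*(y - x)**2.
    Weak convexity with this residual choice means gap(x, y) >= 0."""
    return f(y) - f(x) - g(x) * (y - x) + 0.5 * (y - x) ** 2

pts = [-3.0 + 6.0 * i / 600 for i in range(601)]
worst = min(gap(x, x + s) for x in pts for s in (-0.3, -0.01, 0.01, 0.3))
```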
The proof of Lemma 2 helps in understanding the procedure of
calculation of subgradients of weakly convex functions.
Specifically, for functions of the form (5) the vector
g_a ∈ ∂f_a(x), a ∈ A(x), is a subgradient of the function f(x) at
the point x. It follows from Lemma 1 that an arbitrary vector

g ∈ G(x) = conv{g_a : a ∈ A(x)}

is also a subgradient. The finding of even one element of the set G(x) may be a
non-trivial problem and, ignoring efforts spent to calculate for
fixed a the subgradient g_a ∈ ∂f_a(x), it can be said that the prob-
lems of computing f(x) and of its subgradient g ∈ ∂f(x) are equal
in complexity.

In establishing necessary extremum conditions for weakly
convex functions, of great importance is the existence of direc-
tional derivatives and a formula for their computation in terms
of subgradients.

Lemma 3. The weakly convex function f(x) is directionally
differentiable and

∂f(x)/∂e = max{(g, e) : g ∈ ∂f(x)}.

Proof. Let

φ(h) = f(x + he) - f(x).

It is easily seen that φ(h) as a function of h is weakly
convex. Denote the set of subgradients of φ(h) by ∂φ(h). Assume
the contrary of what the lemma asserts:

a̲ = lim inf_{h→+0} φ(h)/h < lim sup_{h→+0} φ(h)/h = ā,
and let {τk} = τ and {σk} = σ be sequences of values of h such
that

φ(τk)/τk → ā, φ(σk)/σk → a̲, τk → +0, σk → +0.

Furthermore, we have:

φ(0) - φ(τk) ≥ -g_{τk} τk + r(τk, 0), (6)

where

g_{τk} ∈ ∂φ(τk), r(τk, 0)/τk → 0 for k → ∞, τk → +0.

Without loss of generality it may be assumed that

g_{τk} → g_τ.

Dividing (6) by τk and passing to the limit for k → ∞ we
obtain

g_τ ≥ ā.

By virtue of Lemma 1, g_τ ∈ ∂φ(0), therefore

φ(σk) - φ(0) ≥ g_τ σk + r(0, σk). (7)

Dividing (7) by σk and passing to the limit when k → ∞ we
have a̲ ≥ g_τ ≥ ā, a contradiction that proves the differentiability
in any direction. By virtue of the weak convexity of f it is easy to
obtain

∂f(x)/∂e ≥ max{(g, e) : g ∈ ∂f(x)}.
Now let

   x_k = x + t_k e ,   t_k → +0 ,

and

   g_k ∈ ∂f(x_k) .

Then

   f(x) ≥ f(x_k) + (g_k, x − x_k) + r(x_k, x) .

The division of the above inequality by t_k and the pass to the limit when k → ∞ yield:

   f'(x;e) ≤ max { (g,e) : g ∈ ∂f(x) } ,

and thus the proof is completed.
Lemma 3 implies that the necessary condition for the point x* to be extremal is

   0 ∈ ∂f(x*) ;                                    (8)

however, unlike the case of the convex function, this condition is insufficient.
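A one-dimensional smooth example already shows the gap: any C^1 function is weakly convex in this sense, and for it the subgradient set reduces to the derivative, yet a stationary point need not be an extremum. A quick check with the illustrative choice f(x) = x^3:

```python
# f(x) = x**3 is C^1, hence weakly convex; its only subgradient at 0 is
# f'(0) = 0, so condition (8) holds at x* = 0 -- yet 0 is not an extremum.

def f(x):
    return x ** 3

def df(x):
    return 3.0 * x ** 2

stationary = (df(0.0) == 0.0)                # 0 lies in the subgradient set at 0
not_extremal = f(-1e-3) < f(0.0) < f(1e-3)   # f takes smaller and larger values nearby
```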
Local properties of the weakly convex functions do not differ from those of the convex functions, but their global properties are radically dissimilar. Specifically, the weakly convex functions lack the salient feature of subgradients that enables us to prove the convergence of the subgradient method, i.e., the positivity of the scalar product of an arbitrary subgradient at some point x with the direction from the extremum point x*:

   (g, x − x*) > 0   for an arbitrary g ∈ ∂f(x) .         (9)

This, and the fact that a shift in the direction of the antigradient does not assure a decrease in the value of the function being optimized, both for the weakly convex and the convex functions, complicate tangibly the proof of the subgradient method convergence.
Difficulties that present themselves in proving the convergence of non-relaxation algorithms are common knowledge. However, in a number of cases they pay off, opening new possibilities. In the following chapter we will describe certain criteria of convergence of iterative algorithms which make it possible to prove the convergence of a number of algorithms whose behaviour is substantially non-monotonic.
4. CONVERGENCE OF ITERATIVE METHODS OF NON-LINEAR PROGRAMMING
General conditions of convergence of iterative procedures have received the attention of many researchers. The most fundamental results appear to belong to W.I. Zangwill, who suggested necessary and sufficient conditions of convergence of iterative methods of mathematical programming [7]. However, the convergence theorems derived by W.I. Zangwill do not exhaust the investigations conducted in this field, and many authors have formulated other conditions that characterize the convergence of iterative procedures. In spite of the fact that the later approaches are less general and universal, they proved to be more helpful in investigations of specific algorithms. Take [7-9] as an example. It should be emphasized that in the majority of cases these works deal with the convergence of algorithms whose objective function decreases monotonically as the process goes on and, therefore, they are not applicable, in principle, to the case in hand. These and other reasons served as the starting point in the elaboration of conditions of convergence of iterative procedures with weakened properties of monotonic variation of the objective function in the progress of the solution of an extremum problem. The approach set forth below is based on the author's paper [12].
We will consider an algorithm of mathematical programming as a certain rule of construction of a sequence {x^s} of points of an n-dimensional Euclidean space E^n. Conditions of convergence of this sequence will be formulated in terms of properties of this sequence and of a certain subset X* of the space E^n which we will call the solution set. The algorithm will be thought of as a convergent algorithm if each limit point of a sequence generated by it belongs to the set X*.
The basic convergence theorem is formulated as follows:

Theorem 1. Let the sequence {x^s} and the set X* be such that

A1) If x^{s_k} → x* ∈ X*, then ||x^{s_k + 1} − x^{s_k}|| → 0.

A2) There exists a compact set K such that {x^s} ⊂ K.

A3) If x^{s_k} → x' ∉ X*, then there exists ε₀ > 0 such that for all ε ≤ ε₀ and any k there exists a point x^{t_k}, t_k > s_k, such that ||x^{t_k} − x^{s_k}|| > ε. We will assume

   t_k = min { t : ||x^t − x^{s_k}|| > ε , t > s_k } .

A4) There exists a continuous function W(x) such that

   lim_{k→∞} W(x^{t_k}) < lim_{k→∞} W(x^{s_k})

for arbitrary sequences {s_k}, {t_k} satisfying condition A3.
A5) The function W assumes on X* an everywhere incomplete (i.e., nowhere dense) set of values.

Then all limit points of the sequence {x^s} belong to the set X*.

This theorem is proved in [12]. The version of the conditions given there varies to some extent from that given above; however, the proofs of both theorems are practically similar. An assertion weaker than Theorem 1 is also of interest.

Theorem 2. Under conditions A1-A4 of Theorem 1 there exists a limit point of the sequence {x^s} which belongs to the set X*.

The proof of this theorem employs the same arguments as those of the proof of Theorem 1.
5. MINIMIZATION OF WEAKLY CONVEX FUNCTIONS
In this section we shall study the convergence of the recurrent procedure

   x^{s+1} = x^s − ρ_s g^s ,   s = 0, 1, ...          (10)

for finding the unconditional minimum of the weakly convex function f. In the above relation ρ_s > 0 are step multipliers and g^s ∈ ∂f(x^s) is the subgradient of the objective function f at the point x^s. Requirements placed upon the sequence of step multipliers will be stipulated in what follows.
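Procedure (10) can be sketched directly; the test function f(x) = |x_1| + |x_2|, the subgradient choice sign(x), and the divergent-series steps ρ_s = 1/(s+1) are illustrative assumptions, not prescribed here:

```python
import numpy as np

def subgradient_method(subgrad, x0, steps=500):
    """Recurrence (10): x^{s+1} = x^s - rho_s g^s, g^s in the subgradient set."""
    x = np.asarray(x0, dtype=float)
    for s in range(steps):
        rho = 1.0 / (s + 1)     # rho_s > 0, rho_s -> 0, sum of rho_s diverges
        x = x - rho * subgrad(x)
    return x

# f(x) = |x_1| + |x_2| is convex (hence weakly convex); np.sign(x) selects
# one subgradient at each point. The iterates are non-monotone in f, but
# their oscillation around the minimizer 0 is damped by the vanishing steps.
x_final = subgradient_method(np.sign, [3.0, -2.0])
```

Note that the function values need not decrease at every step, which is exactly why the non-monotone convergence criteria of Section 4 are required.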
To prove the convergence of procedure (10) an auxiliary geometrical lemma is required. In a simplified form such a lemma was first proved in [6].

Lemma 4. Let D be a convex compact set which does not contain zero and let {y^n} be an arbitrary sequence of vectors from D. By means of a sequence of numbers σ_n such that

   0 < σ_n ≤ 1 ,   σ_n → 0 ,   Σ_{n=0}^{∞} σ_n = ∞ ,       (11)

let us form a sequence of vectors {z^n} as follows:

   z^0 = y^0 ,   z^{n+1} = z^n + σ_n (y^{n+1} − z^n) .

Denote by {n_k} a sequence of indexes such that

   (z^{n_k}, y^{n_k + 1}) ≥ γ .                            (12)

Then for some γ > 0 such a sequence exists and is infinite.

Proof. It is obvious that {z^n} ⊂ D. Since 0 ∉ D, constants d > 0 and Δ exist such that

   d ≤ ||y|| ≤ Δ   for all y ∈ D .
Let us consider now the changes in the length of the vectors z^n:

   ||z^{n+1}||² = ||z^n + σ_n (y^{n+1} − z^n)||² = ||z^n||² + 2σ_n (z^n, y^{n+1} − z^n) + σ_n² ||y^{n+1} − z^n||² .

If for all sufficiently large n

   (z^n, y^{n+1}) < γ = d²/2 ,

then, since ||z^n|| ≥ d and ||y^{n+1} − z^n|| ≤ 2Δ,

   ||z^{n+1}||² ≤ ||z^n||² + 4Δ² σ_n² − d² σ_n .

Since σ_n → 0, for sufficiently large n

   ||z^{n+1}||² ≤ ||z^n||² − (d²/2) σ_n .

Summing the above inequality with respect to n from N to N + M − 1 we obtain

   ||z^{N+M}||² ≤ ||z^N||² − (d²/2) Σ_{n=N}^{N+M−1} σ_n .

The pass to the limit when M → ∞ leads to a contradiction with supposition (11). It follows that there exists an infinite sequence {n_k} for which (12) holds with γ = d²/2, which completes the proof.
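The averaging recursion of Lemma 4 is easy to simulate. With D a box bounded away from zero and σ_n = 1/(n+2) (both illustrative assumptions), every index here satisfies (z^n, y^{n+1}) ≥ γ with γ = d²/2, so the sequence {n_k} is certainly infinite:

```python
import random

random.seed(0)

# D = [0.5, 1.5] x [0.5, 1.5]: convex, compact, and 0 is not in D,
# with d = dist(0, D) = sqrt(0.5). Take gamma = d**2 / 2 = 0.25.
def draw():
    return (random.uniform(0.5, 1.5), random.uniform(0.5, 1.5))

gamma = 0.25
z = draw()                       # z^0 = y^0
hits = 0                         # indexes n with (z^n, y^{n+1}) >= gamma
for n in range(2000):
    y = draw()                   # y^{n+1} drawn from D
    if z[0] * y[0] + z[1] * y[1] >= gamma:
        hits += 1
    s = 1.0 / (n + 2)            # sigma_n -> 0, sum sigma_n = infinity
    z = (z[0] + s * (y[0] - z[0]), z[1] + s * (y[1] - z[1]))
```

Since z stays in D by convexity, every inner product here is at least 0.5, so hits counts all 2000 indexes; the lemma only guarantees an infinite subsequence in general.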
The main result which will be proved here is the proposition about the convergence of procedure (10). At first the solution set will be defined using the necessary extremum condition (8):

   X* = { x* : 0 ∈ ∂f(x*) } .
The following theorem is valid:

Theorem 3. Let

   ρ_s > 0 ,   ρ_s → 0 ,   Σ_s ρ_s = ∞ ,

and let the sequence {x^s} defined by (10) be bounded. Then all limit points of this sequence belong to the set X*.
Proof. In proving this theorem we shall employ the general conditions of convergence described in Section 4.

The objective function f(x) is chosen as W(x) and it is demonstrated that conditions A1-A4 will be satisfied. For simplicity, we will assume that condition A5 is satisfied. It is obvious that the satisfaction of conditions A1, A2 follows directly from the assumptions of the theorem.
Let {x^{n_k}} be a convergent subsequence and

   lim_{k→∞} x^{n_k} = x' ∉ X* .

In this case 0 ∉ G(x') and, by virtue of G(x) being upper semicontinuous, it is possible to choose δ > 0 so small that 0 ∉ G(x) for all x with ||x − x'|| ≤ δ. This is also true for the ε-subgradients: it is always possible to choose ε, δ > 0 so small that 0 ∉ G_ε(x) for ||x − x'|| ≤ δ.
Then, if condition A3 is not satisfied, for k's large enough all points x^s, s ≥ n_k, remain in this δ-neighbourhood of x', and by virtue of separation theorems there exists a vector e such that

   (g, e) ≥ a > 0   for all g ∈ G_ε(x) , ||x − x'|| ≤ δ .

The above inequality implies, because of our assumptions, an unlimited growth of the inner product (x^s, e). This implication obviously contradicts the boundedness assumption and, therefore, proves that condition A3 is satisfied.

Let for some small ε̄ > 0

   m_k = min { m : ||x^m − x^{n_k}|| > ε̄ , m > n_k } .
Requirements placed on ε̄ will be refined later.

We meet the dominant difficulty at the following step of the proof: the estimation of the decrease in the objective function when passing from the point x^{n_k} to the point x^{m_k}. As the directions −g^s are, generally speaking, not directions of decrease of the function f(x), the problem of estimating the decrease of the function is fairly difficult and rather unwieldy in view of the large amount of computation.
Let us fix a sufficiently large k and examine the difference

   f(x^m) − f(x^{n_k}) ≥ (g^m, x^m − x^{n_k}) + r(x^{n_k}, x^m) ,   m > n_k .

Let us estimate with greater precision the addend on the right side of this inequality. The vectors

   z^m_k = ( Σ_{s=n_k}^{m} ρ_s )^{-1} Σ_{s=n_k}^{m} ρ_s g^s

can be obtained by means of the recurrent formula

   z^{m+1}_k = z^m_k + σ^{(k)}_m (g^{m+1} − z^m_k)

with the initial condition

   z^{n_k}_k = g^{n_k}

and coefficients σ^{(k)}_m equal to

   σ^{(k)}_m = ρ_{m+1} / Σ_{s=n_k}^{m+1} ρ_s .

It is easily seen that 0 ≤ σ^{(k)}_m ≤ 1.
Then by virtue of Lemma 4 there exists a sequence of indexes {s^k_i , i = 1, 2, ..., l_k} such that

   (z^{s^k_i}_k , g^{s^k_i + 1}) ≥ γ ,

and here γ > 0 does not depend on k. Choose from the sequence {s^k_i , i = 1, ..., l_k} the maximum index whose value does not exceed the index m_k and denote it by ν^k_1. From the inequality (Lemma 4)

   Σ_{s = ν^k_1}^{m_k − 1} σ^{(k)}_s ≤ C ε̄

it follows for sufficiently large k's an estimate of the scalar products along the trajectory, which implies an estimate of the decrease of the objective function. The above inequality may be put in another form, with

   q = 1 − p < 1 .

Summing up, it is possible to say that we have constructed as a result the point x^{ν^k_1} such that

   f(x^{ν^k_1}) − f(x^{n_k}) ≤ − γ Σ_{s=n_k}^{ν^k_1 − 1} ρ_s − r(x^{n_k}, x^{ν^k_1}) .        (13)
" 1 I f i n a s i m i l a r r e a s o n i n g t h e p o i n t x 1s considered a s t h e
i n l t i a l o n e , t h a n i t i s p o s s i b l e t o show t h e e x i s t e n c e o f a p o i n t k
J "
x s u c h t h a t
k k "1 " 2 - r ( x , x )
s=v k
1
and
Let us fix an arbitrary small T > 0 and repeat this process a required number of times in order to construct a sequence of points {x^{ν^k_i} , i = 1, 2, ..., M} such that for each i inequalities similar to (13)-(14) are satisfied:

   f(x^{ν^k_{i+1}}) − f(x^{ν^k_i}) ≤ − γ Σ_{s=ν^k_i}^{ν^k_{i+1} − 1} ρ_s − r(x^{ν^k_i}, x^{ν^k_{i+1}}) ,   (15)

and q^M ≤ T. It obviously suffices to repeat the above reasoning no more than M = [log_q T] + 1 times. Summing (15) with respect to i from zero to M − 1 we obtain (assuming ν^k_0 = n_k and denoting ν^k_M = t_k):

   f(x^{t_k}) − f(x^{n_k}) ≤ − γ Σ_{s=n_k}^{t_k − 1} ρ_s − Σ_{i=0}^{M−1} r(x^{ν^k_i}, x^{ν^k_{i+1}}) .     (16)

The addend in the right part of the inequality is evaluated as follows:
   Σ_{i=0}^{M−1} |r(x^{ν^k_i}, x^{ν^k_{i+1}})| ≤ M sup_{||x − x^{n_k}|| ≤ ε̄} |r(x^{n_k}, x)| .

For the k's that are large enough ||x^{n_k} − x'|| ≤ ε̄, therefore

   sup_{||x − x^{n_k}|| ≤ ε̄} |r(x^{n_k}, x)| ≤ ε̄ δ(ε̄) ,   where δ(ε̄) → 0 for ε̄ → 0 .

Finally we obtain:

   f(x^{t_k}) − f(x^{n_k}) ≤ − γ Σ_{s=n_k}^{t_k − 1} ρ_s + M ε̄ δ(ε̄) ,

where ε̄ may be assumed to be so small that the last term does not dominate.
In doing so we obtain the required estimate of the total decrease. Furthermore, since each step of (10) has length at most Cρ_s while ||x^{t_k} − x^{n_k}|| > ε̄,

   Σ_{s=n_k}^{t_k − 1} ρ_s ≥ ε̄ / C .

Substituting this estimate into (16) we obtain:

   f(x^{t_k}) − f(x^{n_k}) ≤ − γ ε̄ / C + M ε̄ δ(ε̄) .

It may always be assumed that ε̄ is so small that M δ(ε̄) < γ / (2C); hence

   f(x^{t_k}) ≤ f(x^{n_k}) − (γ / 2C) ε̄ .

Passing to the limit when k → ∞ we obtain:

   lim_{k→∞} W(x^{t_k}) < lim_{k→∞} W(x^{n_k}) ,

which is what was required to prove. As a result, the convergence of algorithm (10) is a consequence of the satisfaction of conditions A1-A5 of Theorem 1.
6. CONVEX CASE

For the problem of convex minimization some results can be obtained describing the behavior of process (10) in the case when ε_s = ε = const.

Theorem 4. Let the objective function f(x) be convex and

   ρ_s > 0 ,   ρ_s → 0 ,   Σ_s ρ_s = ∞ .

Then, if the sequence {x^s} is bounded, there exists a convergent subsequence {x^{n_k}},

   x^{n_k} → x̄   for k → ∞ ,

and

   f(x̄) ≤ min_{x ∈ E^n} f(x) + ε .
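The role of ε can be illustrated on f(x) = |x|: a subgradient computed at a nearby point x' is an ε-subgradient at x, with the smallest admissible ε given by the linearization gap at x'. The particular numbers below are illustrative assumptions:

```python
# For convex f and g in the subdifferential of f at x', one has for all y
# f(y) >= f(x) + g*(y - x) - eps with eps = f(x) - f(x') + g*(x' - x),
# i.e. g is an eps-subgradient at x. Check for f(x) = |x|.

def f(x):
    return abs(x)

x, delta = -0.2, 0.3
x_prime = x + delta
g = 1.0 if x_prime > 0 else -1.0             # subgradient of |.| at x' = 0.1
eps = f(x) - f(x_prime) + g * (x_prime - x)  # smallest valid eps; here 0.4

ok = all(f(y) >= f(x) + g * (y - x) - eps - 1e-12
         for y in [k * 0.1 for k in range(-50, 51)])
```

This is why inexact evaluation points translate directly into the additive ε in the bound of Theorem 4.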
Proof. The proof will be based on the same formalism as in Theorem 3. Let

   X* = { x* : f(x*) = min_{x ∈ E^n} f(x) }

and

   X*_ε = { x : f(x) ≤ min_{x ∈ E^n} f(x) + ε } .

Denote

   W(x) = min_{x* ∈ X*} ||x − x*||² .
In our case the role of the set of solutions will be played by X*_ε. Let us verify whether conditions A1-A4 from Section 4 can be satisfied. It is obvious that on no account can condition A5 be satisfied in this case and, therefore, it is possible to prove only a weakened convergence of process (10) in the spirit of Theorem 2.

Conditions A1, A2 are obviously satisfied under the assumptions of this theorem; let us verify whether condition A3 is satisfied. Let there be some subsequence

   x^{n_k} → x' ∉ X*_ε ,

that is,

   f(x') > min_{x ∈ E^n} f(x) + ε .
Assume the contrary to condition A3, that is, that the points x^s, s ≥ n_k, remain in a small neighbourhood of x'. Then for an arbitrary δ > 0, for a sufficiently large k,

   ||x^s − x'|| ≤ δ   for s ≥ n_k .

Choose δ > 0 in such a way that the set

   U_δ(x') = { x : ||x − x'|| ≤ δ }

does not intersect with the set X*_ε : U_δ(x') ∩ X*_ε = ∅. Then, under the suppositions of the proof, for an arbitrary x* ∈ X* and s > n_k:
   ||x^{s+1} − x*||² = ||x^s − x*||² + ρ_s² ||g^s||² − 2ρ_s (g^s, x^s − x*) .      (17)

Since g^s is an ε-subgradient of f at x^s,

   f(x*) ≥ f(x^s) + (g^s, x* − x^s) − ε ,

then

   (g^s, x^s − x*) ≥ f(x^s) − f(x*) − ε ,

whence, by the choice of δ, we have for s > n_k

   (g^s, x^s − x*) ≥ γ > 0 .

Substituting the above inequality into (17) and noting that the subgradients are bounded while ρ_s → 0, we obtain for sufficiently large k

   W(x^{s+1}) ≤ W(x^s) − γ ρ_s .                                                   (18)

Summing (18) with respect to s from n_k to m − 1 we obtain:

   W(x^m) ≤ W(x^{n_k}) − γ Σ_{s=n_k}^{m−1} ρ_s .                                   (19)

Passing in the above inequality to the limit when m → ∞ we have a contradiction to the boundedness of the continuous function W(x) on U_δ(x'). The obtained contradiction proves that condition A3 is satisfied. Let

   m_k = min { m : ||x^m − x^{n_k}|| > ε̄ , m > n_k } .
For k's that are large enough

   ||x^{m_k} − x'|| ≤ δ ,

therefore the estimate of (19) is also valid for m = m_k. However,

   Σ_{s=n_k}^{m_k − 1} ρ_s ≥ ε̄ / C .

By means of the above estimate we finally obtain:

   W(x^{m_k}) ≤ W(x^{n_k}) − γ ε̄ / C ,

and passing to the limit when k → ∞

   lim_{k→∞} W(x^{m_k}) < lim_{k→∞} W(x^{n_k}) ,

which, by virtue of Theorem 2, proves our proposition.
In all probability the assertion of this theorem cannot be strengthened unless additional hypotheses concerning the choice of the vectors g^s from the appropriate sets G_ε(x^s) of ε-subgradients are involved.
It is also of interest to estimate the deviation of the limit points of the sequence {x^s} from the set of solutions X*_ε. If we denote

   d = sup_{x*_ε ∈ X*_ε} inf_{x* ∈ X*} ||x* − x*_ε|| ,

then from geometrical considerations it is easily shown that all limit points of the sequence {x^s} occur in the set

   X* + d S ,

where S is a unit ball and the addition is meant in the Minkowski sense.
7. APPLICATIONS AND GENERALIZATIONS
An essential feature that distinguishes the result of Theorem 3 from that obtained earlier in [13] is, as applied to minimax problems of the type

   min_x max_y f(x,y) ,                     (20)

the possibility of ridding oneself of the check of the exactness of the solution of the auxiliary problem of finding the internal maximum:

   Φ(x) = max_y f(x,y) .

This enables us to justify the application of the Arrow-Hurwicz method

   x^{s+1} = x^s − ρ_s f_x(x^s, y^s) ,      (21)

   y^{s+1} = y^s + δ_s f_y(x^s, y^s)        (22)

in the solution of problem (20) on the basis of broader assumptions than the common assumptions of strict convexity-concavity or similar ones. Under some of them, concerning the relation between the step multipliers, it proves possible to consider the iterative relation (21) as the ε-subgradient method of minimization of the
function Φ(x). The convergence of method (21)-(22) is here an implication of Theorem 3. Results obtained in this field are described in more detail in [14]. Of great practical interest is also the development of methods for regulating the step multipliers in procedure (10). Basically, Theorem 3 asserts that the ε-subgradient methods converge under the same assumptions as the subgradient methods. In all probability, the ideas that underlie the subgradient methods are applicable to the ε-subgradient methods when their step multipliers are regulated, and furthermore, the computational effect is also the same.

A non-formal requirement here consists in giving up the exact computation of the objective function, as stated earlier in the introduction to this paper. For instance, the generalization to the case of the ε-subgradient method of the step regulation of [11] presents no difficulties.
REFERENCES

[1] Balinski, M.L. and P. Wolfe, eds., Nondifferentiable Optimization, Mathematical Programming Study 3, North-Holland Publishing Co., Amsterdam, 1975.
[2] Rockafellar, R.T., Convex Analysis, Princeton University Press, Princeton, N.J., 1970.
[3] Bertsekas, D.P. and S.K. Mitter, A Descent Numerical Method for Optimization Problems with Nondifferentiable Cost Functionals, SIAM Journal on Control, Vol. 11, 2 (1973).
[4] Lemarechal, C., Nondifferentiable Optimization; Subgradient and ε-Subgradient Methods, Lecture Notes: Numerical Methods in Optimization and Operations Research, Springer Verlag, August 1975, 191-199.
[5] Rockafellar, R.T., The Multiplier Method of Hestenes and Powell Applied to Convex Programming, JOTA, Vol. 12, 6 (1974).
[6] Nurminski, E.A., The Quasigradient Method for the Solving of Nonlinear Programming Problems, Cybernetics, Vol. 9, 1 (Jan-Feb 1973), 145-150, Plenum Publishing Corporation, N.Y., London.
[7] Zangwill, W.I., Convergence Conditions for Nonlinear Programming Algorithms, Management Science, Vol. 16, 1 (1969), 1-13.
[8] Wolfe, P., Convergence Theory in Nonlinear Programming, North-Holland Publishing Company, 1970, 1-36.
[9] Meyer, G.G.L., A Systematic Approach to the Synthesis of Algorithms, Numerische Mathematik, Vol. 24, 5 (1975), 277-290.
[10] Rheinboldt, W.C., A Unified Convergence Theory for a Class of Iterative Processes, SIAM Journal on Numerical Analysis, Vol. 5, 1 (1968).
[11] Nurminski, E.A. and A.A. Zhelikhovski, Investigation of One Regulating Step, Cybernetics, Vol. 10, 5 (Nov-Dec 1974), 1027-1031, Plenum Publishing Corporation, New York.
[12] Nurminski, E.A., Convergence Conditions for Nonlinear Programming Algorithms, Kibernetika, 5 (1972), 73-81 (in Russian).
[13] Nurminski, E.A. and A.A. Zhelikhovski, ε-Quasigradient Method for Solving Nonsmooth Extremal Problems, Cybernetics, Vol. 13, 1 (1977), 109-114, Plenum Publishing Corporation, N.Y., London.
[14] Nurminski, E.A. and P.I. Verchenko, Convergence of Algorithms for Finding Saddle Points, Cybernetics, Vol. 13, 3, 430-434, Plenum Publishing Corporation, N.Y., London.
FAVORABLE CLASSES OF LIPSCHITZ-CONTINUOUS FUNCTIONS IN SUBGRADIENT OPTIMIZATION

R. Tyrrell Rockafellar
Department of Mathematics
University of Washington
Seattle, Washington, USA
1. INTRODUCTION
A function f : R^n → R is said to be locally Lipschitzian if for each x ∈ R^n there is a neighborhood X of x such that, for some λ ≥ 0,

   |f(x'') − f(x')| ≤ λ |x'' − x'|   for all x' ∈ X , x'' ∈ X .        (1.1)

Examples include continuously differentiable functions, convex functions, concave functions, saddle functions, and any linear combination or pointwise maximum of a finite collection of such functions.
Clarke (1975 and 1980) has shown that when f is locally Lipschitzian, the generalized directional derivative

   f°(x;v) = lim sup_{x' → x, t ↓ 0} [f(x' + tv) − f(x')] / t           (1.2)

is for each x a finite, sublinear (i.e., convex and positively homogeneous) function of v. From this it follows by classical convex analysis that the set

   ∂f(x) = { y : y·v ≤ f°(x;v) for all v }                             (1.3)
is nonempty, convex, compact, and satisfies

   f°(x;v) = max { y·v : y ∈ ∂f(x) }   for all v ∈ R^n .               (1.4)
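For the illustrative choice f(x) = |x| at x = 0, formula (1.2) gives f°(0;v) = |v| and ∂f(0) = [-1, 1], so the max in (1.4) is attained at y = sign(v). A crude finite-difference sketch (the grid sizes are arbitrary assumptions):

```python
# Numerical sketch of the generalized directional derivative (1.2) for
# f(x) = |x| at x = 0: a grid approximation of the lim sup gives
# f°(0; 1) = 1 = max { y * 1 : y in [-1, 1] }, in agreement with (1.4).

def f(x):
    return abs(x)

def clarke_dd(x, v, radius=1e-3, n=50):
    # Approximate the lim sup over x' near x and t decreasing to 0
    # of (f(x' + t v) - f(x')) / t on a finite grid.
    best = float("-inf")
    for i in range(-n, n + 1):
        xp = x + radius * i / n
        for t in (1e-4, 1e-5, 1e-6):
            best = max(best, (f(xp + t * v) - f(xp)) / t)
    return best

est = clarke_dd(0.0, 1.0)
```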
The elements of ∂f(x) are what Clarke called "generalized gradients" of f at x, but we shall call them subgradients. As Clarke has shown, they are the usual subgradients of convex analysis when f is convex or concave (or, for that matter, when f is a saddle function). When f is continuously differentiable, ∂f(x) reduces to the singleton {∇f(x)}.

In subgradient optimization, interest centers on methods for minimizing f that are based on being able to generate for each x at least one (but not necessarily every) y ∈ ∂f(x), or perhaps just an approximation of such a vector y. One of the main hopes is that by generating a number of subgradients at various points in some neighborhood of x, the behavior of f around x can roughly be assessed. In the case of a convex function f this is not just wishful thinking, and a number of algorithms, especially those of bundle type (e.g., Lemarechal 1975 and Wolfe 1975), rely on such an approach. In the nonconvex case, however, there is the possibility, without further assumptions on f than local Lipschitz continuity, that the multifunction ∂f : x → ∂f(x) may be rather bizarrely disassociated from f. An example given at the end of this section has f locally Lipschitzian, yet such that there exist many other locally Lipschitzian functions g, not merely differing from f by an additive constant, for which ∂g(x) = ∂f(x) for all x. Subgradients alone cannot discriminate between the properties of these different functions and therefore cannot be effective in determining their local minima.
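For a convex f the hope expressed above can be realized quite literally: with the illustrative choice f(x) = |x|, gradients sampled on both sides of the kink at 0 recover the interval ∂f(0) = [-1, 1] (the sampling points are arbitrary assumptions):

```python
# f(x) = |x| is differentiable away from 0 with gradient sign(x).
# Sampling gradients in a neighborhood of 0 and taking the interval
# they span recovers the subgradient set at the kink: [-1, 1].

def grad_abs(x):
    return 1.0 if x > 0 else -1.0    # gradient of |x| for x != 0

samples = [grad_abs(0.01 * k) for k in range(-5, 6) if k != 0]
lo, hi = min(samples), max(samples)  # interval hull of sampled gradients
```

Bundle methods exploit exactly this: nearby gradients, aggregated, stand in for the full subgradient set.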
Besides the need for conditions that imply a close connection between the behavior of f and the nature of ∂f, it is essential to ensure that ∂f has adequate continuity properties for the construction of "approximate" subgradients and in order to prove the convergence of various algorithms involving subgradients. The key seems to lie in postulating the existence of the ordinary directional derivatives

   f'(x;v) = lim_{t ↓ 0} [f(x + tv) − f(x)] / t                        (1.5)
![Page 135: PROGRESS IN NONDIFFERENTIABLE OPTIMIZATIONpure.iiasa.ac.at/2019/1/CP-82-708.pdf · system, determine the optimal structure and compute the optimal parameters of the control system,](https://reader035.vdocument.in/reader035/viewer/2022070821/5f1f51a37d44e973e66b044c/html5/thumbnails/135.jpg)
and some sort of relationship between them and ∂f. Mifflin (1977a and 1977b), most notably, has worked in this direction.
In the present article we study the relationship between fʹ and ∂f for several special classes of locally Lipschitzian functions that suggest themselves as particularly amenable to computation. First we give some new results about continuity properties of fʹ when f belongs to the rather large class of functions that are "subdifferentially regular". Next we pass to functions f that are lower-Cᵏ for some k, 1 ≤ k ≤ ∞, in the following sense: for each point x̄ ∈ Rⁿ there is for some open neighborhood X of x̄ a representation

    f(x) = max_{s∈S} F(x,s)  for all x ∈ X ,        (1.6)

where S is a compact topological space and F : X × S → R is a function which has partial derivatives up to order k with respect to x and which, along with all these derivatives, is continuous not just in x but jointly in (x,s) ∈ X × S. We review the strong results obtained by Spingarn (forthcoming) for lower-C¹ functions, which greatly illuminate the properties treated by Mifflin (1977b), and we go on to show that for k ≥ 2 the classes of lower-Cᵏ functions all coincide and have a simple characterization.
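To make the definition concrete, here is an illustrative sketch (our own example, not from the paper): with S = [0,1] and F(x,s) = −(x − s)², which is C∞ in x and continuous in (x,s), the max-function f of (1.6) is lower-C∞. We evaluate it by discretizing the compact set S:

```python
# A lower-C-infinity function in the sense of (1.6): S = [0,1] compact,
# F(x, s) = -(x - s)^2 smooth in x and jointly continuous in (x, s).
def f(x, grid=2001):
    # approximate max over s in S by a max over a fine grid of S
    return max(-(x - k / (grid - 1.0)) ** 2 for k in range(grid))

# Closed form for comparison: f = 0 on [0,1], -x^2 for x < 0,
# and -(x-1)^2 for x > 1.
vals = [f(-2.0), f(0.5), f(3.0)]
```

The grid here is only a computational stand-in for the compact parameter set; the definition itself requires the exact maximum over all of S.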
Before proceeding with this, let us review some of the existence properties of fʹ and continuity properties of ∂f that are possessed by any locally Lipschitzian function. This will be useful partly for background but also to provide contrast between such properties, which are not adequate for purposes of subgradient optimization, and the refinements of them that will be featured later.

Local Lipschitz continuity of a function f : Rⁿ → R implies by a classical theorem of Rademacher (see Stein 1970) that for almost every x ∈ Rⁿ, f is differentiable at x, and moreover that the gradient mapping ∇f, on the set where it exists, is locally bounded.
Given any x ∈ Rⁿ, a point where f may or may not happen to be differentiable, there will in particular be in every neighborhood of x a dense set of points xʹ where ∇f(xʹ) exists, and for any sequence of such points converging to x, the corresponding sequence of gradients will be bounded and have cluster points, each of which is, of course, the limit of some convergent subsequence. Clarke demonstrated in Clarke (1975) that ∂f(x) is the convex hull of all such possible limits:
    ∂f(x) = co {lim ∇f(xʹ) | xʹ → x, f differentiable at xʹ} .        (1.7)
Two immediate consequences (also derivable straight from properties of f°(x;v) without use of Rademacher's theorem) are first that ∂f is locally bounded: for every x one has that

    ∪_{xʹ∈X} ∂f(xʹ) is bounded for some neighborhood X of x ,        (1.8)

and second that ∂f is upper semicontinuous in the strong sense: for any ε > 0 there is a δ > 0 such that

    ∂f(xʹ) ⊂ ∂f(x) + εB  whenever |xʹ − x| ≤ δ ,        (1.9)

where

    B = closed unit Euclidean ball = {x | |x| ≤ 1} .        (1.10)
The case where ∂f(x) consists of a single vector y is the one where f is strictly differentiable at x with ∇f(x) = y, which by definition means

    lim_{xʹ→x, t↓0} [f(xʹ + tv) − f(xʹ)]/t = y·v  for all v ∈ Rⁿ .        (1.11)

This is pointed out in Clarke (1975). From (1.7) it is clear that this property occurs if and only if x belongs to the domain of ∇f, and ∇f is continuous at x relative to its domain.
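Formula (1.7) can be illustrated numerically (an illustrative sketch of ours, not part of the paper): for f(x) = |x| at x = 0, the gradients at nearby points of differentiability are ±1, and their convex hull recovers the interval ∂f(0) = [−1, 1]:

```python
# Formula (1.7) for f(x) = |x| at x = 0: f'(x') = sign(x') wherever
# x' != 0, so the cluster values of gradients along x' -> 0 are -1 and
# +1, and their convex hull is the subdifferential [-1, 1].
def fprime(x):
    return 1.0 if x > 0 else -1.0   # f is differentiable for x != 0

# approach 0 from both sides and collect the limiting gradient values
cluster = sorted({fprime(x0 / 2.0 ** k) for x0 in (-1.0, 1.0) for k in range(40)})
subdiff = (min(cluster), max(cluster))   # convex hull of the limits
```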
We conclude this introduction with an illustration of the abysmal extent to which ∂f could in general, without assumptions beyond local Lipschitz continuity, fail to agree with ∇f on the domain of ∇f and thereby lose contact with the local properties of f.

Counterexample

There is a Lipschitzian function f : Rⁿ → R such that

    ∂f(x) = [−1,1]ⁿ  for all x ∈ Rⁿ .        (1.12)

To construct f, start with a measurable subset A of R such that for every nonempty open interval I ⊂ R, both mes[A ∩ I] > 0 and mes[I \ A] > 0. (Such sets do exist and are described in most texts on Lebesgue measure.) Define h : R → R by

    h(t) = ∫₀ᵗ ψ(τ) dτ ,  where ψ(t) = 1 if t ∈ A, ψ(t) = −1 if t ∉ A .

Since |ψ| ≡ 1, h is Lipschitzian on R with Lipschitz constant λ = 1. Hence hʹ(t) exists for almost every t, and |hʹ(t)| ≤ 1. In fact hʹ = ψ almost everywhere, from which it follows by the choice of A that the sets {t | hʹ(t) = 1} and {t | hʹ(t) = −1} are both dense in R. Now let

    f(x) = Σᵢ₌₁ⁿ h(xᵢ)  for x = (x₁, ..., xₙ) .

Then f is Lipschitzian on Rⁿ with gradient

    ∇f(x) = (hʹ(x₁), ..., hʹ(xₙ))

existing if and only if hʹ(xᵢ) exists for i = 1, ..., n. Therefore ∇f(x) ∈ [−1,1]ⁿ whenever ∇f(x) exists, and for each of the corner points e of [−1,1]ⁿ the set {x | ∇f(x) = e} is dense in Rⁿ. Formula (1.7) implies then that (1.12) holds.

Note that every translate g(x) = f(x − a) has ∂g ≡ ∂f, because ∂f is constant, and yet g − f may be far from constant.
2. SUBDIFFERENTIALLY REGULAR FUNCTIONS
A locally Lipschitzian function f : Rⁿ → R is subdifferentially regular if for every x ∈ Rⁿ and v ∈ Rⁿ the ordinary directional derivative (1.5) exists and coincides with the generalized one in (1.2):

    fʹ(x;v) = f°(x;v)  for all x,v .

Then in particular fʹ(x;v) is a finite, subadditive function of v; this property in itself has been termed the quasidifferentiability of f at x by Pshenichnyi (1971).
THEOREM 1 (Clarke 1975). If f is convex or lower-Cᵏ on Rⁿ for some k ≥ 1, then f is not only locally Lipschitzian but subdifferentially regular.
Clarke did not study lower-Cᵏ functions as such but proved in Clarke (1975) a general theorem about the subgradients of "max functions" represented as in (1.6) with F(x,s) not necessarily differentiable in x. His theorem says in the case of lower-Cᵏ functions that

    ∂f(x) = co {∇ₓF(x,s) | s ∈ I(x)} ,

where

    I(x) = arg max_{s∈S} F(x,s) .

It follows from this, (1.4), and the definition of subdifferential regularity, that

    fʹ(x;v) = max {∇ₓF(x,s)·v | s ∈ I(x)}

for lower-C¹ functions, a well known fact proved earlier by Danskin (1967). The reader should bear in mind, however, that Theorem 1 says considerably more in the case of lower-Cᵏ functions than just this.
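The max-formula for fʹ can be checked numerically in a small illustrative sketch (the particular pieces F(x,0) = x² and F(x,1) = 2 − x are our own hypothetical example): at a point where both pieces are active, the one-sided directional derivative agrees with the maximum of the active gradients, Danskin-style.

```python
# Hypothetical max-function with S = {0, 1}: F(x,0) = x^2, F(x,1) = 2 - x.
# At x = 1 both pieces are active, I(1) = {0, 1}, and the formula gives
# f'(1; v) = max(2v, -v) using the active gradients 2x = 2 and -1.
def f(x):
    return max(x * x, 2.0 - x)

def dir_deriv(f, x, v, t=1e-7):
    # forward difference approximation of the one-sided derivative (1.5)
    return (f(x + t * v) - f(x)) / t

danskin = lambda v: max(2.0 * v, -1.0 * v)
pairs = [(dir_deriv(f, 1.0, v), danskin(v)) for v in (1.0, -1.0, 0.5)]
```

Note that fʹ(1;v) is nonlinear in v precisely because I(1) contains more than one point.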
By asserting the equality of fʹ and f°, it implies powerful things about the semicontinuity of fʹ and strict differentiability of f. We underline this with the new result which follows.

THEOREM 2. For a function f : Rⁿ → R, the following are equivalent:

(a) f is locally Lipschitzian and subdifferentially regular;

(b) fʹ(x;v) exists finitely for all x,v, and is upper semicontinuous in x.
Proof. (a) ⇒ (b). This is the easy implication; since fʹ = f° under subdifferential regularity, we need only apply (1.4) and (1.3).

(b) ⇒ (a). For any xʹ and v the function Q(t) = f(xʹ + tv) has both left and right derivatives at every t by virtue of (b):

    Q₊ʹ(t) = fʹ(xʹ + tv; v) ,   Q₋ʹ(t) = −fʹ(xʹ + tv; −v) .

Moreover, the upper semicontinuity in (b) implies that for any fixed x and v there is a convex neighborhood X of x and a constant λ ≥ 0 such that

    fʹ(xʹ + tv; v) ≤ λ  and  −fʹ(xʹ + tv; −v) ≥ −λ  when xʹ + tv ∈ X .        (2.5)

Since Q has right and left derivatives everywhere and these are locally bounded, it is the integral of these derivatives (cf. Saks (1937)):

    Q(t″) − Q(tʹ) = ∫_{tʹ}^{t″} Q₊ʹ(t) dt .

From this and (2.5) it follows that

    |f(xʹ + tv) − f(xʹ)| ≤ λ|t|  when xʹ ∈ X, xʹ + tv ∈ X .

Thus the local Lipschitz property (1.1) holds as long as x″ − xʹ is some multiple of a fixed v. To complete the argument, consider not just one v but a basis v₁, ..., vₙ for Rⁿ.
Each x ∈ Rⁿ has convex neighborhoods Xᵢ and constants λᵢ ≥ 0 such that

    |f(xʹ + tvᵢ) − f(xʹ)| ≤ λᵢ|t|  when xʹ ∈ Xᵢ, xʹ + tvᵢ ∈ Xᵢ .        (2.6)

Then there is a still smaller neighborhood X of x and a constant α ≥ 0 such that for xʹ ∈ X and x″ ∈ X one has

    x″ = xʹ + t₁v₁ + t₂v₂ + ... + tₙvₙ

with xʹ and xʹ + t₁v₁ ∈ X₁, xʹ + t₁v₁ and (xʹ + t₁v₁) + t₂v₂ ∈ X₂, and so forth, and

    |tᵢ| ≤ α|x″ − xʹ|  for i = 1, ..., n .

Then by (2.6)

    |f(x″) − f(xʹ)| ≤ |f(xʹ + t₁v₁) − f(xʹ)| + |f(xʹ + t₁v₁ + t₂v₂) − f(xʹ + t₁v₁)| + ...
                    ≤ (λ₁ + ... + λₙ)α|x″ − xʹ| .

In other words, f satisfies the Lipschitz condition (1.1) with λ = (λ₁ + ... + λₙ)α. Thus f is locally Lipschitzian.

We argue next that f°(x;v) ≥ fʹ(x;v) for all x,v by (1.2), and therefore via (1.7) that

    f°(x;v) = lim sup_{xʹ→x} fʹ(xʹ;v) .        (2.7)

The "lim sup" in (2.7) is just fʹ(x;v) under (b), so we conclude that fʹ(x;v) = f°(x;v). Thus (b) does imply (a), and the proof of Theorem 2 is complete.
COROLLARY 1. Suppose f is locally Lipschitzian and subdifferentially regular on Rⁿ, and let D be the set of all points where f happens to be differentiable. Then at each x ∈ D, f is in fact strictly differentiable. Furthermore, the gradient mapping ∇f is continuous relative to D.

COROLLARY 2. If f is locally Lipschitzian and subdifferentially regular on Rⁿ, then ∂f is actually single-valued at almost every x ∈ Rⁿ.
These corollaries are immediate from the facts about differentiability of f that were cited in §1 in connection with formula (1.7). The properties they assert have long been known for convex functions but have not heretofore been pointed out as properties of all lower-Cᵏ functions. They hold for such functions by virtue of Theorem 1.
COROLLARY 3. Suppose f is locally Lipschitzian and subdifferentially regular on Rⁿ. If g is another locally Lipschitzian function on Rⁿ such that ∂g = ∂f, then g = f + const.

Proof. By Corollary 2, ∂g is single-valued almost everywhere. Recalling that g is strictly differentiable wherever ∂g is single-valued, we see that at almost every x ∈ Rⁿ the function h = g − f is strictly differentiable with ∇h(x) = ∇g(x) − ∇f(x) = 0. Since h is locally Lipschitzian, the fact that ∇h(x) = 0 for almost all x implies h is a constant function.
COROLLARY 4. Suppose f is locally Lipschitzian and subdifferentially regular on Rⁿ. Then for every continuously differentiable mapping c : R → Rⁿ, the function Q(t) = f(c(t)) has right and left derivatives Q₊ʹ(t) and Q₋ʹ(t) everywhere, and these satisfy

    Q₊ʹ(t) = lim sup_{τ→t} Q₊ʹ(τ) = lim sup_{τ→t} Q₋ʹ(τ) .        (2.8)
Proof. The function Q is itself locally Lipschitzian and subdifferentially regular (cf. Clarke 1980). Apply Theorem 2 to Q, noting that Q₊ʹ(t) = Qʹ(t;1) = Q°(t;1) and Q₋ʹ(t) = −Qʹ(t;−1) = −Q°(t;−1), and hence also ∂Q(t) = [Q₋ʹ(t), Q₊ʹ(t)]. The reason Q₊ʹ(τ) and Q₋ʹ(τ) can appear interchangeably in (2.8) is that by specialization of (1.7) to Q, as well as the characterizations of Q₊ʹ and Q₋ʹ just mentioned, one has

    Q₊ʹ(τ) = lim sup_{τʹ→τ} Qʹ(τʹ) ,   Q₋ʹ(τ) = lim inf_{τʹ→τ} Qʹ(τʹ) ,

where the limits in this case are over the values τʹ where Qʹ(τʹ) exists.
The multifunction ∂f : Rⁿ ⇉ Rⁿ is said to be monotone if

    (xʹ − x″)·(yʹ − y″) ≥ 0  whenever yʹ ∈ ∂f(xʹ), y″ ∈ ∂f(x″) .        (3.1)

This is an important property of long standing in nonlinear analysis, and we shall deal with it in §4. In this section our aim is to review results of Spingarn (forthcoming) on two generalizations of monotonicity and their connection with subdifferentially regular functions and lower-C¹ functions. The generalized properties are as follows: ∂f is submonotone if

    lim inf_{xʹ→x, xʹ≠x, yʹ∈∂f(xʹ)} (xʹ − x)·(yʹ − y)/|xʹ − x| ≥ 0  for all x and y ∈ ∂f(x) ,        (3.2)

and it is strictly submonotone if

    lim inf_{xʹ→x, x″→x, x″≠xʹ, yʹ∈∂f(xʹ), y″∈∂f(x″)} (x″ − xʹ)·(y″ − yʹ)/|x″ − xʹ| ≥ 0  for all x .        (3.3)
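The difference quotients appearing in these definitions are easy to probe numerically. In the following illustrative sketch (our own, with f(x) = |x| and g(x) = −|x| as test cases), the quotient stays nonnegative for |x| — whose subdifferential is even monotone — but tends to −2 for −|x|, so the latter's subdifferential is not submonotone:

```python
sgn = lambda t: float(t > 0) - float(t < 0)

def quotient(subgrad, y_at_0, xp):
    # the submonotonicity difference quotient at x = 0, with y in df(0)
    yp = subgrad(xp)
    return (xp - 0.0) * (yp - y_at_0) / abs(xp)

# f(x) = |x|: df = sign is monotone, so every quotient is >= 0.
f_quots = [quotient(sgn, 0.0, 10.0 ** -k) for k in range(1, 8)]

# g(x) = -|x|: taking y = 1 in dg(0) = [-1, 1] and x' -> 0+, where
# dg(x') = {-1}, the quotient is identically -2, so dg is not submonotone.
g_quots = [quotient(lambda x: -sgn(x), 1.0, 10.0 ** -k) for k in range(1, 8)]
```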
To state the results, we adopt Spingarn's notation:

    ∂f(x)ᵥ = {y ∈ ∂f(x) | y·v = f°(x;v)} .        (3.4)

Thus ∂f(x)ᵥ is a certain face of the compact convex set ∂f(x), the one consisting of all the points y at which v is a normal vector. Let us also recall the notion of semismoothness of f introduced by Mifflin (1977b): this means that

    whenever xʲ → x, vʲ → v, tⱼ ↓ 0, yʲ → y, with yʲ ∈ ∂f(xʲ + tⱼvʲ), then one has y·v = fʹ(x;v) .        (3.5)
THEOREM 3 (Spingarn (forthcoming)). The following properties of a locally Lipschitzian function f : Rⁿ → R are equivalent:

(a) f is both subdifferentially regular and semismooth;

(b) ∂f is submonotone;

(c) ∂f is directionally upper semicontinuous in the sense that for every x ∈ Rⁿ, v ∈ Rⁿ and ε > 0, there is a δ > 0 such that

    ∂f(x + tvʹ) ⊂ ∂f(x)ᵥ + εB  when |vʹ − v| < δ and 0 < t < δ .        (3.6)
THEOREM 4 (Spingarn (forthcoming)). The following properties of a locally Lipschitzian function f : Rⁿ → R are equivalent:

(a) f is lower-C¹;

(b) ∂f is strictly submonotone;

(c) ∂f is strictly directionally upper semicontinuous in the sense that for every x ∈ Rⁿ, v ∈ Rⁿ and ε > 0, there is a δ > 0 such that

    (y″ − yʹ)·vʹ ≥ −ε  when |xʹ − x| < δ, |vʹ − v| < δ, 0 < t < δ, yʹ ∈ ∂f(xʹ) and y″ ∈ ∂f(xʹ + tvʹ) .        (3.7)
Spingarn has further given a number of valuable counterexamples in his forthcoming paper. These demonstrate that

    ∂f submonotone ⇏ ∂f strictly submonotone ,        (3.8)

    f subdifferentially regular ⇏ f lower-C¹ ,        (3.9)

    f quasidifferentiable and semismooth ⇏ f subdifferentially regular .        (3.10)

Comparing Theorems 3 and 4, we see that lower-C¹ functions have distinctly sharper properties than the ones of quasidifferentiability and semismoothness on which Mifflin, for instance, based his minimization algorithm (1977a). In perhaps the majority of applications of subgradient optimization the functions are actually lower-C¹, or even lower-C∞. This suggests the possibility of developing improved algorithms which take advantage of the sharper properties. With this goal in mind, we explore in the next section what additional characteristics are enjoyed by lower-Cᵏ functions for k > 1.
The properties of lower-Cᵏ functions for k ≥ 2 turn out, rather surprisingly, to be in close correspondence with properties of convex functions. It is crucial, therefore, that we first take a look at the latter. We will have an opportunity at the same time to verify that convex functions are special examples of lower-C∞ functions. The reader may have thought of this as obvious, because a convex function can be represented as a maximum of affine (linear-plus-a-constant) functions, which certainly are C∞. The catch is, however, that a representation must be constructed in terms of affine functions which depend continuously on a parameter s ranging over a compact set, if the definition of lower-C∞ is to be satisfied.
We make use now of the concept of monotonicity of ∂f defined at the beginning of §3.

THEOREM 5. For a locally Lipschitzian function f : Rⁿ → R, the following properties are equivalent:

(a) f is convex;

(b) ∂f is monotone;

(c) for each x̄ ∈ Rⁿ there is a neighborhood X of x̄ and a representation of f as in (1.6) with S a compact topological space, F(x,s) affine in x and continuous in s.
Proof. (a) ⇒ (c). In terms of the conjugate f* of the convex function f, we have the formula

    f(x) = max_{y∈Rⁿ} {y·x − f*(y)}  for all x ,        (4.1)

where the maximum is attained at y if and only if y ∈ ∂f(x) (see Rockafellar 1970, §23). Any x̄ has a compact neighborhood X on which ∂f is bounded. The set

    S = {(y,α) | y ∈ ∂f(x) for some x ∈ X, α = f*(y)}

is then compact, and we have as a special case of (4.1)

    f(x) = max_{(y,α)∈S} {y·x − α}  for all x ∈ X .

This is a representation of the desired type with s = (y,α), F(x,s) = y·x − α.
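The conjugate representation (4.1) can be verified numerically in a small illustrative sketch (our own example, not from the paper): for f(x) = x² one has f*(y) = y²/4, and maximizing the affine functions y·x − f*(y) over a fine grid of y recovers f.

```python
# Conjugate representation (4.1) for f(x) = x^2, whose conjugate is
# f*(y) = sup_x {y*x - x^2} = y^2/4: the affine functions y*x - y^2/4
# envelope f from below, with the max attained at y = 2x, i.e. y in df(x).
def f(x):
    return x * x

def conj(y):
    return y * y / 4.0

def f_from_affine_envelope(x, lo=-10.0, hi=10.0, n=20001):
    # discretize the compact parameter set of slopes y
    ys = [lo + (hi - lo) * k / (n - 1) for k in range(n)]
    return max(y * x - conj(y) for y in ys)

err = max(abs(f_from_affine_envelope(x) - f(x)) for x in (-3.0, 0.0, 1.7, 4.2))
```

The grid of slopes plays the role of the compact set S; on the compact neighborhood |x| ≤ 5 the relevant slopes y ∈ ∂f(x) = {2x} all lie within the chosen range.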
(c) ⇒ (a). The representations in (c) imply certainly that f is convex relative to some neighborhood of each point. Thus for any fixed x and v the function Q(t) = f(x + tv) has left and right derivatives Q₋ʹ and Q₊ʹ which are nondecreasing in some neighborhood of each t. These derivatives are then nondecreasing relative to t ∈ (−∞,∞), and it follows from this that Q is a convex function on (−∞,∞) (cf. Rockafellar 1970, §24). Since this is true for every x and v, we are able to conclude that f itself is convex.

(a) ⇒ (b). This is well-known (cf. Rockafellar 1970, §24).
(b) ⇒ (a). A direct argument could be given, but we may as well take advantage of Theorem 3. Monotonicity of ∂f trivially implies submonotonicity, so we know from Theorem 3 that f is subdifferentially regular. Fixing any x and v, we have by the monotonicity of ∂f that

    ((x + tʹv) − (x + t″v))·(yʹ − y″) ≥ 0  when tʹ ≥ t″, yʹ ∈ ∂f(x + tʹv), y″ ∈ ∂f(x + t″v) .

This implies

    inf_{yʹ∈∂f(x+tʹv)} yʹ·v ≥ sup_{y″∈∂f(x+t″v)} y″·v  when tʹ ≥ t″ ,

or equivalently (by (1.4) and subdifferential regularity)

    fʹ(x + tʹv; v) ≥ −fʹ(x + t″v; −v)  when tʹ ≥ t″ .        (4.2)

Since also

    −fʹ(xʹ; −v) ≤ fʹ(xʹ; v)  for all xʹ, v ,

by the sublinearity of fʹ(xʹ;·), (4.2) tells us that the function Q(t) = f(x + tv) has left and right derivatives which are everywhere nondecreasing in t ∈ (−∞,∞). Again as in the argument that (c) implies (a), we conclude from this fact that f is convex on Rⁿ.
COROLLARY 5. Every convex function f : Rⁿ → R is in particular lower-C∞.
Proof. In the representation in (c) we must have F(x,s) = a(s)·x − α(s) for certain a(s) ∈ Rⁿ and α(s) ∈ R that depend continuously on s. This is the only way that F(x,s) can be affine in x and continuous in s. Then, of course, F(x,s) has partial derivatives of all orders with respect to x, and these are all continuous in (x,s). □
Let us now define two notions parallel to Spingarn's submonotonicity and strict submonotonicity: ∂f is hypomonotone if

    lim inf_{xʹ→x, xʹ≠x, yʹ∈∂f(xʹ)} (xʹ − x)·(yʹ − y)/|xʹ − x|² > −∞  for all x and y ∈ ∂f(x) ,        (4.3)

and strictly hypomonotone if

    lim inf_{xʹ→x, x″→x, x″≠xʹ, yʹ∈∂f(xʹ), y″∈∂f(x″)} (x″ − xʹ)·(y″ − yʹ)/|x″ − xʹ|² > −∞  for all x .        (4.4)

Clearly hypomonotone implies submonotone, and strictly hypomonotone implies strictly submonotone. We have little to say here about hypomonotonicity itself, but the importance of strict hypomonotonicity is demonstrated by the following result.
THEOREM 6. For a locally Lipschitzian function f on Rⁿ, the following properties are equivalent:

(a) f is lower-C²;

(b) ∂f is strictly hypomonotone;

(c) for every x̄ ∈ Rⁿ there is a convex neighborhood X of x̄ on which f has a representation

    f = g − h on X with g convex, h quadratic convex ;        (4.5)

(d) for every x̄ ∈ Rⁿ there is a neighborhood X of x̄ and a representation of f as in (1.6) with S a compact topological space, F(x,s) quadratic in x and continuous in s.
Proof. (a) ⇒ (c). Choose any x̄ and consider on some neighborhood X of x̄ a representation (1.6) of f as in the definition of f being lower-C²: F(x,s) has second partial derivatives in x, and these are continuous with respect to (x,s). Shrink X if necessary so that it becomes a compact convex neighborhood of x̄.

The Hessian matrix ∇ₓ²F(x,s) depends continuously on (x,s) ranging over the compact set X × S, so we have

    min_{(x,s)∈X×S, |v|=1} v·∇ₓ²F(x,s)v > −∞ .

Denote this minimum by −σ and let

    G(x,s) = F(x,s) + (σ/2)|x|² .        (4.6)

Then

    v·∇ₓ²G(x,s)v = v·∇ₓ²F(x,s)v + σ|v|² ≥ 0        (4.7)

for all (x,s) ∈ X × S when |v| = 1, and hence also in fact for all v ∈ Rⁿ, because both sides of (4.7) are homogeneous of degree 2 with respect to v. Thus ∇ₓ²G(x,s) is a positive semidefinite matrix for each (x,s) ∈ X × S, and G(x,s) is therefore a convex function of x ∈ X for each s ∈ S. The function

    g(x) = max_{s∈S} G(x,s)

is accordingly convex, and we have from (4.6) and (1.6) that (4.5) holds for this g and h(x) = (σ/2)|x|².
(c) ⇒ (d). Given a representation as in (c), we can translate it into one as in (d) simply by plugging in a representation of g of the type described in Theorem 5(c).

(d) ⇒ (a). Any representation of type (d) is a special case of the kind of representation in the definition of f being lower-C² (in fact lower-C∞); if a quadratic function of x depends continuously on s, so must all its coefficients in any expansion as a polynomial of degree 2.
(c) ⇒ (b). Starting from (4.5) we argue that ∂f(x) = ∂g(x) − ∂h(x) (cf. Clarke 1980, §3, and Rockafellar 1979, p.345), where ∂g happens to be monotone (Theorem 5) and ∂h is actually a linear transformation: y ∈ ∂h(x) if and only if y = Ax, where A is symmetric and positive semidefinite. For yʹ ∈ ∂f(xʹ), y″ ∈ ∂f(x″), we have yʹ + Axʹ ∈ ∂g(xʹ) and y″ + Ax″ ∈ ∂g(x″), so from the monotonicity of ∂g it follows that

    (x″ − xʹ)·((y″ + Ax″) − (yʹ + Axʹ)) ≥ 0 .        (4.8)

Choosing ρ > 0 large enough that

    v·Av ≤ ρ|v|²  for all v ∈ Rⁿ ,

we obtain from (4.8) that

    (x″ − xʹ)·(y″ − yʹ) ≥ −ρ|x″ − xʹ|²  when xʹ ∈ X, x″ ∈ X, yʹ ∈ ∂f(xʹ), y″ ∈ ∂f(x″) .        (4.9)

Certainly (4.4) holds then for x = x̄, and since x̄ was an arbitrary point of Rⁿ we conclude that ∂f is strictly hypomonotone.
(b) ⇒ (c). We are assuming (4.4), so for any x̄ we know we can find a convex neighborhood X of x̄ and a ρ > 0 such that (4.9) holds. Let g(x) = f(x) + (ρ/2)|x|², so that ∂g = ∂f + ρI (cf. Clarke 1980, §3, and Rockafellar 1979, p.345). Then by (4.9), ∂g is monotone on X, and it follows that g is convex on X (cf. Theorem 5; the argument in Theorem 5 is in terms of functions on all of Rⁿ, but it is easily relativized to convex subsets of Rⁿ). Thus (4.5) holds for this g and h(x) = (ρ/2)|x|².
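The decomposition in Theorem 6 lends itself to a quick numerical check. In this illustrative sketch (the test function max(sin x, cos x) is our own choice), each smooth piece has second derivative bounded below by −1, so adding (ρ/2)x² with ρ = 1 should yield a convex function, which we probe via midpoint convexity on a grid:

```python
import math

# f(x) = max(sin x, cos x) is lower-C^2 (finite max of smooth functions
# with second derivatives >= -1), so g(x) = f(x) + (1/2)x^2 with rho = 1
# should be convex, per the decomposition f = g - (rho/2)|x|^2.
f = lambda x: max(math.sin(x), math.cos(x))
g = lambda x: f(x) + 0.5 * x * x

def midpoint_convex_on_grid(g, lo=-5.0, hi=5.0, n=1001):
    # discrete convexity test: g at the midpoint of each consecutive pair
    # of grid points must not exceed the average of the endpoint values
    xs = [lo + (hi - lo) * k / (n - 1) for k in range(n)]
    return all(g((a + b) / 2.0) <= (g(a) + g(b)) / 2.0 + 1e-12
               for a, b in zip(xs, xs[2:]))

ok = midpoint_convex_on_grid(g)
```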
COROLLARY 6. If a function f : Rⁿ → R is lower-C², it is actually lower-C∞. Thus for 2 ≤ k ≤ ∞ the classes of lower-Cᵏ functions all coincide.

Proof. As noted in the proof that (d) ⇒ (a), any representation of the kind in (d) actually fits the definition of f being lower-C∞.

COROLLARY 7. Let f : Rⁿ → R be lower-C². Then at almost every x̄ ∈ Rⁿ, f is twice differentiable in the sense that there is a quadratic function q for which one has

    f(x) = q(x) + o(|x − x̄|²) .

Proof. This is a classical property of convex functions (cf. Alexandroff 1939), and it carries over to general lower-C² functions via the representation in (c).
Counterexample

Since the lower-Cᵏ functions are all the same for k ≥ 2, it might be wondered if the lower-C¹ functions are really any different either. But here is an example of a lower-C¹ function that is not lower-C². Let f(x) = −|x|^(3/2) on R. Then f is of class C¹, hence in particular lower-C¹. If f were lower-C², there would exist by characterization (d) in Theorem 6 numbers α, β, γ such that

    f(x) ≥ α + βx + γx²  for all x near 0 ,

with equality when x = 0. Then α = f(0) = 0 and −|x|^(3/2) ≥ βx + γx², from which it follows on dividing by |x| and taking the limits x ↓ 0 and x ↑ 0 that β = 0. Thus γ would have to be such that −|x|^(3/2) ≥ γ|x|² for all x sufficiently near 0, and this is impossible. Therefore f is not lower-C².
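The failure can also be seen through Theorem 6(b): no amount of quadratic regularization makes −|x|^(3/2) convex near 0. This illustrative sketch (our own) exhibits, for each candidate ρ, a midpoint-convexity violation of f(x) + (ρ/2)x² at the origin:

```python
# f(x) = -|x|^(3/2): for every rho, g(x) = f(x) + (rho/2)x^2 fails
# convexity near 0. The second difference g(-h) + g(h) - 2g(0) equals
# -2h^(3/2) + rho*h^2, which is negative whenever sqrt(h) < 2/rho.
f = lambda x: -abs(x) ** 1.5

def convexity_defect(rho, h):
    g = lambda x: f(x) + 0.5 * rho * x * x
    return g(-h) + g(h) - 2.0 * g(0.0)   # would be >= 0 if g were convex

# choose h = 1/rho^2 so that sqrt(h) = 1/rho < 2/rho for each rho
defects = [convexity_defect(rho, h=(1.0 / rho) ** 2) for rho in (1.0, 10.0, 100.0)]
```

Each defect equals −ρ⁻³ < 0, confirming that the required local decomposition f = g − (ρ/2)|x|² with g convex cannot exist.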
Alexandroff, A.D. 1939. Almost everywhere existence of the second differential of a convex function and some properties of convex surfaces connected with it. Leningrad State Univ. Ann., Math. Ser. 6:3-35 (Russian).

Clarke, F.H. 1975. Generalized gradients and applications. Trans. Amer. Math. Soc. 205:247-262.

Clarke, F.H. 1980. Generalized gradients of Lipschitz continuous functionals. Advances in Math.

Danskin, J.M. 1967. The Theory of Max-Min and its Applications to Weapons Allocation Problems. Springer-Verlag, New York.

Lemarechal, C. 1975. An extension of Davidon methods to nondifferentiable problems. Math. Programming Study 3: Nondifferentiable Optimization, 95-109. North-Holland.

Mifflin, R. 1977a. An algorithm for constrained optimization with semismooth functions. Math. of Op. Research 2:191-207.

Mifflin, R. 1977b. Semismooth and semiconvex functions in constrained optimization. SIAM J. Control Opt. 15:959-972.

Pshenichnyi, B.N. 1971. Necessary Conditions for an Extremum. Marcel Dekker, New York.

Rockafellar, R.T. 1970. Convex Analysis. Princeton Univ. Press.

Rockafellar, R.T. 1979. Directionally Lipschitzian functions and subdifferential calculus. Proc. London Math. Soc. 39:331-355.

Saks, S. 1937. Theory of the Integral. Hafner, New York.

Spingarn, J.E. (forthcoming). Submonotone subdifferentials of Lipschitz functions. Trans. Amer. Math. Soc.

Stein, E.M. 1970. Singular Integrals and Differentiability Properties of Functions. Princeton Univ. Press.

Wolfe, P. 1975. A method of conjugate subgradients for minimizing nondifferentiable functions. Math. Programming Study 3: Nondifferentiable Optimization, 145-173. North-Holland.
![Page 153: PROGRESS IN NONDIFFERENTIABLE OPTIMIZATIONpure.iiasa.ac.at/2019/1/CP-82-708.pdf · system, determine the optimal structure and compute the optimal parameters of the control system,](https://reader035.vdocument.in/reader035/viewer/2022070821/5f1f51a37d44e973e66b044c/html5/thumbnails/153.jpg)
NONDIFFERENTIABLE FUNCTIONS IN HIERARCHICAL CONTROL PROBLEMS

Andrzej Ruszczyński
Institute of Automatic Control
Technical University of Warsaw
Poland
1. INTRODUCTION
The aim of this paper is to highlight the importance of the theory of nondifferentiable optimization and of equations with nondifferentiable operators in a modern branch of control theory - the theory of hierarchical control systems.

The hierarchical approach usually applied to large-scale decision problems is based on the partial decentralization of the decision-making process. The typical decision problem involves an object and a decision maker. In the classical (centralized) approach the decision maker observes the object and chooses the decision variables according to certain preferences. Each such operation is connected with the solution of an optimization problem. In practice, however, we often have to deal with large-scale objects which are systems composed of several interconnected subsystems. In this situation the centralized approach, though mathematically correct, has several drawbacks due to the need for large-scale information processing, data transmission, etc. Therefore we aim at organizing the decision-making process in such a way that the decision variables related to a definite subsystem are chosen on the basis of information relevant for this subsystem. Thus, a decision maker is associated with each
subsystem and takes part in the decision-making process: each of
them observes his own subsystem and chooses his decision variables
by solving an optimization subproblem related to this subsystem.
It is clear that a simple decentralization of this type cannot be
considered effective if there are interactions between the sub-
systems, and for this reason a supreme decision maker (called the
coordinator) is introduced. The aim of the coordinator is to in-
fluence the lesser decision makers so as to make their decisions
consistent with the global objective. This is achieved through
the introduction (by the coordinator) of certain parameters (co-
ordination variables) into the lower-level problems and the modi-
fication of the set of preferences (i.e., objectives and/or con-
straints) used by the other decision makers. It should be stressed
that the coordinator has no direct access to the system and that
he may not override the decisions of the other decision makers.
It is also assumed that he receives aggregated information about
the effect of his decisions. The decision-making structure out-
lined above, which comprises the coordinator and several lower-
order decision makers, is called the hierarchical control system.
At this point it should be said that the above approach does not necessarily lead to strictly optimal decisions, and this is the main reason why it is difficult to convince a mathematician of the value of hierarchical control systems. The advantages of the hierarchical approach lie beyond the mathematics and are connected with adaptability, information privacy, data transmission and other preferences which cannot be formalized and which, obviously, will not be discussed in this paper (see, however, refs. 2, 4, and 7). However, once a definite hierarchical structure has been chosen, interesting and well-defined mathematical problems arise. These problems are connected with the coordinator's task, which is to optimize the performance of the whole system by the appropriate choice of values for the coordination variables. From the outline given above it will be evident to experts in optimization that the mapping from coordination variables to performance will be a nonsmooth function, even if the infimal problems involve differentiable functions. This is the main reason for including this paper in a volume concerned with nondifferentiable optimization.
In the following analysis we shall assume that the subsystems being controlled are described by the following equations:

    y_i = f_i(c_i, u_i, z_i) ,    i = 1,...,N ,    (1.1)

where i is an index representing a particular subsystem, y_i denotes the output influencing other subsystems, c_i denotes the local decision variables (control), u_i denotes the input originating from other subsystems, and z_i denotes disturbances. We assume that y_i ∈ Y_i, c_i ∈ C_i, u_i ∈ U_i, z_i ∈ Z_i, and f_i : C_i × U_i × Z_i → Y_i, where Y_i, C_i, U_i are subsets of finite-dimensional spaces. The interconnections between the subsystems are described by the following equations:

    u_i = Σ_{j=1}^{N} H_ij y_j ,    i = 1,...,N ,    (1.2)

where the H_ij are linear operators (usually given by 0-1 matrices). For brevity we use the following notation: y = (y_1,...,y_N), c = (c_1,...,c_N), u = (u_1,...,u_N), z = (z_1,...,z_N), Y = × Y_i, C = × C_i, U = × U_i, Z = × Z_i. Using this notation we can write equations (1.1), (1.2) in the compact form

    y = f(c, u, z) ,    (1.3)

    u = Hy ,    (1.4)

where f : C × U × Z → Y, H : Y → U, H_i : Y → U_i. It is assumed that for any c ∈ C and z ∈ Z the set of equations (1.3), (1.4) defines unique interactions u(c,z) ∈ U, u(c,z) = (u_1(c,z),...,u_N(c,z)).

We assume that there is a performance index q(c,z) defined for the system,

    q(c,z) = Σ_{i=1}^{N} q_i(c_i, u_i, z_i) ,    (1.5)

where u_i = u_i(c,z). Also, the controls c and inputs u must satisfy the constraints

    g_i(c_i, u_i, z_i) ≤ 0 ,    i = 1,...,N ,    (1.6)

where g_i : C_i × U_i × Z_i → R^{J_i}. Here J_i represents the dimensionality of this vector constraint.
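The assumption that equations (1.3)-(1.4) define unique interactions u(c,z) can be illustrated numerically. The sketch below is not from the paper: the maps f_i and the interconnection matrix H are invented, and contractivity of the map u → Hf(c,u,z) is assumed so that fixed-point iteration recovers u(c,z) for two scalar subsystems.

```python
import numpy as np

# Two scalar subsystems: y_i = f_i(c_i, u_i, z_i), interconnected by u = H y.
# The map below is hypothetical; its slope in u is < 1, so u -> H f(c, u, z)
# is a contraction and the interaction equations have a unique solution.
def f(c, u, z):
    return 0.5 * u + c + z

H = np.array([[0.0, 1.0],
              [1.0, 0.0]])   # each subsystem feeds the other (a 0-1 matrix)

def interactions(c, z, tol=1e-12, max_iter=1000):
    """Solve u = H f(c, u, z) by fixed-point iteration."""
    u = np.zeros_like(c)
    for _ in range(max_iter):
        u_next = H @ f(c, u, z)
        if np.linalg.norm(u_next - u) < tol:
            return u_next
        u = u_next
    raise RuntimeError("interaction equations did not converge")

c = np.array([1.0, 2.0])
z = np.array([0.1, -0.2])
u = interactions(c, z)
# Residual of the implicit equation vanishes at the fixed point.
print(np.linalg.norm(u - H @ f(c, u, z)))
```

A direct solve of the two linear equations gives the same u; the iteration is shown because it also applies when f is nonlinear.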
In the next two parts of the paper we shall describe two examples of the hierarchical approach to the problem of finding the value ĉ which minimizes the function (1.5) subject to (1.3), (1.4), and (1.6). These examples represent the two main ideas on which hierarchical systems are based - the primal approach, which involves a two-stage minimization, and the dual approach, which uses coordination variables (prices) to represent interactions.

However, before proceeding to these problems we should explain the notation used in the following sections. If φ : R^n → R^1 then ∇φ(x) and ∇²φ(x) denote the gradient and the hessian, respectively. If Φ : R^n → R^m, Φ(x) = (φ_1(x),...,φ_m(x)), then ∇Φ(x) denotes the matrix with columns ∇φ_i(x) (i = 1,...,m), and Φ_x(x) denotes the derivative, considered to be a linear operator. For a linear operator (matrix) A we use A* to denote the operator (matrix) conjugate to A. Finally, a function Φ : R^n → R^m is described as convex if all functions φ_i (i = 1,...,m) are convex.
2. THE PRIMAL APPROACH
2.1 The decentralized control system
In this section we assume that the disturbance z = (z_1,...,z_N) which influences the system is a random variable. We also assume that it is possible to measure z. The optimal decision rule is then simple: observe z and choose the value of c which minimizes q(c,z) (defined by (1.5)) subject to the constraints (1.3), (1.4), and (1.6). However, this approach may not be satisfactory for some particular reason and so we are going to develop a decision rule z → c which can be split into N rules of the form z_i → c_i. To achieve this we shall adopt the primal approach derived from large-scale optimization theory (see, for example, ref. 5), and analyzed in the context of control in refs. 2-4. Let us assume that the desired values of the interactions y = (y_1,...,y_N), u = Hy = (u_1,...,u_N) are fixed. Under this assumption we formulate N independent problems for local decision makers as follows: (DM_i) observe z_i and choose the value c_i which solves the problem

    min q_i(c_i, u_i, z_i)

subject to

    f_i(c_i, u_i, z_i) = y_i    (2.1)

and

    g_i(c_i, u_i, z_i) ≤ 0 .    (2.2)

Note that u_i and y_i are fixed parameters for DM_i. Let C_i(u_i,y_i,z_i) denote the feasible set for DM_i, defined by (2.1) and (2.2), and let ĉ_i(u_i,y_i,z_i) be the solution of DM_i.

Let C(u,y,z) = ×_{i=1}^{N} C_i(u_i,y_i,z_i) and ĉ(u,y,z) = (ĉ_1(u_1,y_1,z_1),...,ĉ_N(u_N,y_N,z_N)). We define the set

    Y_0 = { y ∈ Y : P{ C(Hy,y,z) ≠ ∅ } = 1 } ,

where P denotes the probability. We assume that Y_0 ≠ ∅, and that for any y ∈ Y_0, u = Hy, and for almost all z with respect to the measure P, the local problems DM_i have solutions. Thus, for any y ∈ Y_0, the set of problems DM_i (i ∈ {1,...,N}) defines a mapping z → ĉ(Hy,y,z), composed of local mappings z_i → ĉ_i(H_i y, y_i, z_i). Thus, any y ∈ Y_0 defines a decentralized control system of the type we were trying to achieve. It remains only to make the performance of this control system as good as possible by appropriate adjustment of the desired values of the interactions y (and u = Hy). This is the coordination task and we shall focus our attention on this problem.

Let φ : Y_0 × Z → R^1 be defined by

    φ(y,z) = q(ĉ(Hy,y,z), z)
and let

    Φ(y) = E{ φ(y,z) } ,

where E denotes the mathematical expectation. The coordination task is to solve the following optimization problem:

    min_{y ∈ Y_0} Φ(y) ,    (2.6)

which, when written explicitly, takes the form

    min_{y ∈ Y_0} E{ Σ_{i=1}^{N}  min_{c_i ∈ C_i(H_i y, y_i, z_i)} q_i(c_i, H_i y, z_i) } .    (2.7)

This is a large-scale two-stage stochastic programming problem [1] (see also the paper by Yu. M. Ermoliev in this volume). The solution of problems of this type presents certain serious difficulties. Firstly, in order to obtain the value of Φ at a given point y it is necessary to compute the integral Φ(y) = ∫ φ(y,z) P(dz). The analytical calculation of this integral is impossible (except in some trivial cases). In addition, the numerical computation of the integral appears to be difficult and time-consuming. Secondly, even in very simple cases, the function y → φ(y,z) is nonsmooth because its values are obtained from optimization problems dependent on the parameter y. Thus the function Φ is generally nondifferentiable. Thirdly, the feasible set Y_0 is defined indirectly and so it is difficult to verify whether a given y belongs to Y_0.

With a view to the above-mentioned difficulties we have developed a special algorithm which is able to tackle problem (2.7). This algorithm is based on the stochastic subgradient method due to Yu. M. Ermoliev and others.
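The two difficulties just listed, an intractable expectation and a nonsmooth integrand, are exactly the setting of the stochastic subgradient method. A minimal illustration (an invented example, not from the paper): for φ(y,z) = |y - z| with z uniform on (0,1), one sampled subgradient per step suffices, and the iterates drift to the minimizer of Φ, the median 0.5.

```python
import random

# Toy instance: phi(y, z) = |y - z| is nonsmooth, and Phi(y) = E|y - z| is an
# integral we never compute.  Each step uses one sample z^k and one subgradient.
random.seed(0)

def stochastic_subgradient(y, z):
    # subgradient of y -> |y - z|; any value in [-1, 1] is valid at y == z
    return 1.0 if y > z else -1.0

y = 3.0
for k in range(1, 20001):
    z = random.random()                 # z ~ U(0, 1); minimizer of E|y - z| is 0.5
    y -= (1.0 / k) * stochastic_subgradient(y, z)

print(y)   # close to 0.5 after many steps
```

The step sizes 1/k satisfy the usual divergent-sum / summable-squares conditions used later in Theorem 1.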
2.2 Properties of the coordination problem
In this section we shall investigate the essential characteristics of problem (2.6). We shall make the following assumptions.

1. There exists an open bounded set Ω ⊂ Y_0 such that the solution of problem (2.6) belongs to Ω.

2. The set ĉ(HΩ, Ω, Z) is bounded.

3. The functions (c,u) → f(c,u,z), (c,u) → g(c,u,z), (c,u) → q(c,u,z) are continuously differentiable for almost all z and all y ∈ Ω, u = Hy, c = ĉ(u,y,z), and the derivatives are uniformly bounded.

4. The functions z → (f(c,u,z), g(c,u,z), q(c,u,z)) are integrable.

5. For almost all z and all y ∈ Ω the solutions of the problems DM_i satisfy necessary optimality conditions. It is possible to choose Lagrange multipliers uniformly bounded on Ω × Z.

6. One of the following conditions is satisfied:

(a) The functions (c_i,u_i) → f_i(c_i,u_i,z_i) are linear and the functions (c_i,u_i) → g_i(c_i,u_i,z_i), (c_i,u_i) → q_i(c_i,u_i,z_i) are convex for almost all z_i.

(b) The function y → ĉ(Hy,y,z) is uniformly Lipschitz continuous on Ω for z ∈ Z.

For brevity we use the expressions "almost all" and "integrable" for "P-almost all" and "P-integrable", and we write "Z" instead of "Z \ Z_0, where Z_0 is a set of null measure".
We shall now prove that the function Φ is weakly convex and we shall develop formulae for the calculation of its subgradients (see ref. 8 and the paper by E.A. Nurminski in this volume for definitions of a weakly convex function and a subgradient).

Lemma 1. The function y → φ(y,z) is weakly convex on Ω for almost all z.

Proof. Let us define the Lagrange function for DM_i:

    L_i(c_i,u_i,y_i,z_i,λ_i,μ_i) = q_i(c_i,u_i,z_i) + <λ_i, f_i(c_i,u_i,z_i) - y_i> + <μ_i, g_i(c_i,u_i,z_i)> .

We denote by λ̂_i(u_i,y_i,z_i), μ̂_i(u_i,y_i,z_i) any multipliers corresponding to the solution of DM_i. Let

    L(c,u,y,z,λ,μ) = Σ_{i=1}^{N} L_i(c_i,u_i,y_i,z_i,λ_i,μ_i)

be the global Lagrange function and let λ̂(u,y,z) = (λ̂_1(u_1,y_1,z_1),...,λ̂_N(u_N,y_N,z_N)) and μ̂(u,y,z) = (μ̂_1(u_1,y_1,z_1),...,μ̂_N(u_N,y_N,z_N)).
Then for any y ∈ Ω, u = Hy, and for almost all z the following equality holds:

    φ(y,z) = L(ĉ(u,y,z), u, y, z, λ̂(u,y,z), μ̂(u,y,z)) .    (2.8)

Let ỹ ∈ Ω and ũ = Hỹ. For brevity we use the notation c = ĉ(u,y,z), c̃ = ĉ(ũ,ỹ,z), λ = λ̂(u,y,z), μ = μ̂(u,y,z). We then have

    φ(ỹ,z) - φ(y,z) ≥ L(c̃,ũ,ỹ,z,λ,μ) - L(c,u,y,z,λ,μ) .    (2.9)

By virtue of condition 3 the function (c,u,y) → L(c,u,y,z,λ,μ) is continuously differentiable for almost all z. Thus

    L(c̃,ũ,ỹ,z,λ,μ) - L(c,u,y,z,λ,μ) = <∇_c L(c,u,y,z,λ,μ), c̃ - c> + <∇_u L(c,u,y,z,λ,μ), ũ - u> + <∇_y L(c,u,y,z,λ,μ), ỹ - y> + r_L ,

where the residual term r_L is small with respect to ||(c̃,ũ,ỹ) - (c,u,y)||. Note that it follows from the necessary optimality conditions for DM_i that ∇_c L(c,u,y,z,λ,μ) = 0. Next, ũ - u = H(ỹ - y). If condition 6a holds then the function (c,u,y) → L(c,u,y,z,λ,μ) is convex and r_L ≥ 0. On the other hand, if condition 6b holds then r_L is small with respect to ||ỹ - y||. In both cases, for almost all z,

    φ(ỹ,z) - φ(y,z) ≥ <ξ, ỹ - y> + r_Z(ỹ,y) ,    (2.10)

where

    ξ = H*∇_u L(c,u,y,z,λ,μ) + ∇_y L(c,u,y,z,λ,μ) ,    (2.11)

and the residual term r_Z(ỹ,y) satisfies the condition of uniform smallness with respect to ||ỹ - y|| on compact subsets of Ω. Finally, the function y → φ(y,z) is continuous for almost all z. Consequently, taking into account (2.10) and (2.11), it is weakly convex, and the lemma has been proved.

Corollary. For any y ∈ Ω and for almost all z the vector ξ defined by (2.11) is a subgradient of the function y → φ(y,z) at the point y.
Lemma 2. The function Φ is weakly convex on Ω. For any y ∈ Ω the vector

    d = E{ H*∇_u L(c,u,y,z,λ,μ) + ∇_y L(c,u,y,z,λ,μ) } ,    (2.12)

with c = ĉ(Hy,y,z), u = Hy, λ = λ̂(Hy,y,z), μ = μ̂(Hy,y,z), is a subgradient of Φ at the point y, i.e., d ∈ ∂Φ(y).

Proof. Observe that by virtue of assumptions 3 and 4 the function z → φ(y,z) is integrable. Hence the function Φ is well-defined and continuous on Ω.

Now let us consider the inequalities (2.8), (2.9), and (2.10). By virtue of condition 5 we can always assume that the functions z → λ̂(u,y,z), z → μ̂(u,y,z), z → ĉ(u,y,z) are measurable (this is true if we choose solutions and multipliers according to a certain ordering rule, e.g., solutions and multipliers of minimum norm, if they are not unique). Thus the functions z → ∇_c L(ĉ(u,y,z),u,y,z,λ̂(u,y,z),μ̂(u,y,z)), z → ∇_u L(ĉ(u,y,z),u,y,z,λ̂(u,y,z),μ̂(u,y,z)), and z → ∇_y L(ĉ(u,y,z),u,y,z,λ̂(u,y,z),μ̂(u,y,z)) are measurable. The integrability of these gradients follows from assumptions 3 and 4 and the finite dimensionality of the c-, u- and y-spaces. Consequently the vector d is well-defined by (2.12). If condition 6a holds, then r_Z(ỹ,y) = 0 and the proposition of the lemma follows immediately from (2.10). If condition 6b holds, then it follows from the above considerations that the residual term r_L, evaluated at c = ĉ(Hy,y,z), c̃ = ĉ(Hỹ,ỹ,z), λ = λ̂(Hy,y,z), μ = μ̂(Hy,y,z), is integrable in z and uniformly small with respect to ||ỹ - y|| for z ∈ Z. Hence the function r_Z(ỹ,y) in (2.10) is integrable and E{r_Z(ỹ,y)} is small with respect to ||ỹ - y||. The proposition of the lemma follows from (2.10).
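The mechanism behind Lemma 2, Lagrange multipliers of the inner problem supplying subgradients of the optimal-value function, can be checked on a one-dimensional deterministic analogue invented for illustration: for φ(y) = min{c² : c ≥ y}, the multiplier of the coupling constraint equals the derivative of φ.

```python
# Deterministic analogue (not the paper's stochastic setting):
#   phi(y) = min_c { c**2 : c >= y }.
# For y > 0 the constraint is active, the KKT multiplier of y - c <= 0 is
# mu = 2y, and dL/dy = mu, which is exactly d(phi)/dy.
def inner_solution(y):
    c = max(y, 0.0)                       # minimizer of c**2 subject to c >= y
    mu = 2.0 * c if y > 0 else 0.0        # multiplier from 2c - mu = 0
    return c, mu

def phi(y):
    c, _ = inner_solution(y)
    return c * c

y = 1.5
c, mu = inner_solution(y)
eps = 1e-6
numeric = (phi(y + eps) - phi(y - eps)) / (2 * eps)   # central difference
print(abs(numeric - mu) < 1e-4)   # → True: the multiplier is the derivative
```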
2.3 The algorithm

In this section we shall assume that the set Ω satisfying condition 1 is known and that there is a point y^0 ∈ Ω such that Φ(y) ≥ Φ(y^0) + γ for all y ∈ Y_0 \ Ω and for some γ > 0.

Let n_i (i ∈ {1,...,N}) be binary random variables associated with the problems DM_i such that p_i = P{n_i = 1} > 0. Let n = (n_1,...,n_N).

We shall now develop a two-level stochastic algorithm for the solution of problem (2.6). The algorithm operates in conjunction with two random number generators, which produce the sequences {n^k} and {z^k} from a series of mutually independent observations over n and z. Based on these results, the algorithm constructs a random sequence {y^k}, which is supposed to converge to the solution of problem (2.6). The k-th iteration of the algorithm consists of the following operations:

1. Draw a value n^k = (n_1^k,...,n_N^k), independent of the previous draws n^j, z^j (j < k).

2. For a given y^k determine the parameters y_i^k, u_i^k = H_i y^k for the problems DM_i.

3. Draw a value z^k = (z_1^k,...,z_N^k), independent of all n^j (j ≤ k), z^j (j < k).

4. Solve those problems DM_i for which n_i^k = 1, and define

    c_i^k = ĉ_i(u_i^k, y_i^k, z_i^k) ,

    λ_i^k = λ̂_i(u_i^k, y_i^k, z_i^k) ,

    μ_i^k = μ̂_i(u_i^k, y_i^k, z_i^k) ,

    ξ_i^k = ∇_{u_i} q_i(c_i^k, u_i^k, z_i^k) + ∇_{u_i} f_i(c_i^k, u_i^k, z_i^k) λ_i^k + ∇_{u_i} g_i(c_i^k, u_i^k, z_i^k) μ_i^k .

5. Compute the direction vector d^k from the subgradient information ξ_i^k obtained in Step 4.

6. Compute

    y^{k+1} = y^k - s_k d^k  if  y^k - s_k d^k ∈ Ω ;  y^{k+1} = y^0 otherwise,

where s_k ≥ 0 and s_k depends only on (y^0,...,y^k).

Theorem 1. Let the following conditions be satisfied with probability 1:

    s_k ≥ 0 ,    (2.13)

    Σ_{k=0}^∞ s_k = ∞ .    (2.14)

Additionally, let

    Σ_{k=0}^∞ E{s_k^2} < ∞ .    (2.15)

Then all accumulation points of the sequence {y^k} are stationary (i.e., belong to the set Y* = {y ∈ Ω : 0 ∈ ∂Φ(y)}) with probability 1, and the sequence {Φ(y^k)} converges.
Proof. Observe that

    E{ d^k | y^0,...,y^k } = E{ H*∇_u L(c^k,u^k,y^k,z^k,λ^k,μ^k) + ∇_y L(c^k,u^k,y^k,z^k,λ^k,μ^k) | y^0,...,y^k } ,

with c^k = ĉ(Hy^k,y^k,z^k), u^k = Hy^k, λ^k = λ̂(Hy^k,y^k,z^k), μ^k = μ̂(Hy^k,y^k,z^k). Thus, by virtue of Lemma 2,

    E{ d^k | y^0,...,y^k } ∈ ∂Φ(y^k) .

Convergence of the sequence {y^k} satisfying (2.13)-(2.15) to the set Y* follows from results obtained by Nurminski and Ermoliev (see ref. 8 and ref. 1, Chapter IV, Theorem 5).
The above theorem is of great practical importance in computation. Observe that for a given sequence of draws {z^k}, {n^k} the algorithm generates a path {y^k(ω)} of the stochastic process {y^k} (ω denotes the event corresponding to one run of the algorithm). It follows from Theorem 1 that almost all of the paths taken by this process converge to the set Y*. Therefore, for any practical purposes, one can consider only a single trajectory of the process {y^k}. It is worth noting that neither the values nor the subgradients of Φ are necessary to generate the path {y^k(ω)}.

Finally, let us discuss briefly the role played by the binary variable n^k. If n^k = (1,...,1) for all k ≥ 0, then we have to solve all lower-level problems at each iteration of the algorithm. This can be costly if we are dealing with a large problem involving many subproblems DM_i. The random variable n has therefore been introduced in order to overcome this difficulty. At the k-th iteration of the algorithm we solve only those problems DM_i for which n_i^k = 1. In particular, if n^k always has only one nonzero component then only one subproblem is solved at each iteration of the algorithm.
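A sketch of the two-level scheme on a toy separable problem. Everything below is invented for illustration: quadratic local costs, uniform disturbances, p_i = 1/2. Only the subproblems with n_i^k = 1 are processed at step k; dividing their partial subgradients by p_i keeps the search direction an unbiased estimate.

```python
import random

# Toy separable problem: Phi(y) = sum_i E (y_i - z_i)**2, z_i ~ U(0, 1),
# so each coordinate's minimizer is 1/2.  At step k we "solve" only the
# subproblems selected by the Bernoulli draws n_i^k.
random.seed(1)
N = 4
p = [0.5] * N                 # P{n_i = 1}; about half the DM_i run per step
y = [2.0] * N

for k in range(1, 20001):
    n = [1 if random.random() < p[i] else 0 for i in range(N)]   # draw n^k
    z = [random.random() for _ in range(N)]                      # draw z^k
    step = 0.5 / k
    for i in range(N):
        if n[i]:
            xi_i = 2.0 * (y[i] - z[i])       # subgradient from subproblem i
            y[i] -= step * xi_i / p[i]       # importance-weighted update

print(y)   # all coordinates near 0.5
```

Setting one p_i close to 1 and the rest near 0 would mimic the extreme case where a single subproblem is solved per iteration, at the cost of noisier directions.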
2.4 The penalty method

It was assumed in the previous section that the set Ω which satisfies condition 1 is known. This assumption seems to be rather restrictive and we shall try to avoid the need to invoke it. A serious difficulty arises at this point; we cannot guarantee that all points y^k belong to the set Y_0. In order to overcome this difficulty we shall introduce artificial variables into the problems DM_i and formulate modified lower-level problems as follows: (DM̃_i) observe z_i and choose c_i, v_i, w_i which solve the problem

    min [ q_i(c_i, u_i, z_i) + R( ||v_i|| + ||w_i|| ) ]

subject to

    f_i(c_i, u_i, z_i) + v_i = y_i ,    (2.16)

    g_i(c_i, u_i, z_i) ≤ w_i ,  w_i ≥ 0 .    (2.17)

The variables v_i, w_i are the artificial variables, and the constant R > 0 is a penalty coefficient.

Let c̃_Ri(u_i,y_i,z_i), ṽ_Ri(u_i,y_i,z_i), w̃_Ri(u_i,y_i,z_i) be the solutions of DM̃_i. As before, let c̃_R(u,y,z) = (c̃_R1,...,c̃_RN). Let

    φ̃_R(y,z) = q(c̃_R(Hy,y,z), z) + R σ̃(Hy,y,z) ,

where σ̃(u,y,z) = Σ_{i=1}^{N} σ̃_i(u_i,y_i,z_i) and σ̃_i(u_i,y_i,z_i) = ||ṽ_Ri(u_i,y_i,z_i)|| + ||w̃_Ri(u_i,y_i,z_i)|| measures the constraint violation, and let

    Φ̃_R(y) = E{ φ̃_R(y,z) } .

Let us consider the problem

    min Φ̃_R(y) .    (2.19)

We assume that we know an open bounded set Ω̃ such that for R sufficiently large the solution of (2.19) belongs to Ω̃. We also assume that conditions 2-6 from Section 2.2 are satisfied for DM̃_i with Ω replaced by Ω̃ and ĉ replaced by c̃_R. Note that we no longer assume that Ω̃ ⊂ Y_0.
Using a method similar to that employed in Section 2.2, it may be shown that the function Φ̃_R is weakly convex on Ω̃. Moreover, we can solve the problem (2.19) using a modified version of the algorithm described in Section 2.3; at each iteration of the algorithm we solve the problems DM̃_i instead of DM_i, and we use multipliers corresponding to (2.16) and (2.17) instead of multipliers corresponding to (2.1) and (2.2). This algorithm will generate a sequence {y^k} which converges with probability one to the set of stationary points of problem (2.19). However, even for a very large penalty coefficient R the solutions of (2.19) are not necessarily the solutions of (2.6), although the exact penalty approach is used. This is due to the stochastic nature of the problem; the operation of averaging may smooth out the penalty function. Thus we have to study the relations between the solutions of (2.6) and the solutions of (2.19). Let ȳ be a solution of (2.6) and let ȳ_R ∈ Ω̃ be a solution of (2.19) for a sufficiently large value of R.

Lemma 3. Let the functions (c_i,z_i) → q_i(c_i,u_i,z_i) be uniformly bounded from below for u ∈ HΩ̃. Then for sufficiently large values of R

    E{ σ̃(Hȳ_R, ȳ_R, z) } ≤ const/R .    (2.20)

Proof. Note that Φ̃_R(y) ≤ Φ(y) for y ∈ Ω̃ ∩ Y_0. Thus

    Φ̃_R(ȳ_R) ≤ Φ̃_R(ȳ) ≤ Φ(ȳ) .    (2.21)

On the other hand,

    Φ̃_R(ȳ_R) ≥ R E{ σ̃(Hȳ_R, ȳ_R, z) } + const .    (2.22)

The inequality (2.20) follows from (2.21) and (2.22), and the lemma has been proved.
Corollary. For any ε > 0, P{ σ̃(Hȳ_R, ȳ_R, z) > ε } → 0 as R → ∞.

Now let us consider a sequence {y^s} such that y^s = ȳ_{R_s}, R_s → ∞.

Theorem 2. Any accumulation point of the sequence {y^s} is a solution of problem (2.6).

Proof. Let u^s = Hy^s, c^s(z) = c̃_{R_s}(u^s,y^s,z) = (c_1^s(z),...,c_N^s(z)), and σ^s(z) = (σ_1^s(z),...,σ_N^s(z)) with σ_i^s(z) = σ̃_i(u_i^s,y_i^s,z_i).

Let y* be a limit point of the sequence {y^s}. Obviously, we can always assume that y^s → y*. We shall prove that for almost all z the sequence {σ^s(z)} converges. It is easy to verify that the point c_i^s(z_i) is a solution of the problem

    min_{c_i} [ q_i(c_i, u_i^s, z_i) + R_s T_i(c_i, u_i^s, y_i^s, z_i) ] ,

where T_i(c_i, u_i, y_i, z_i) denotes the minimal value of the penalty term attainable with the control c_i, and T_i(c_i^s(z_i), u_i^s, y_i^s, z_i) = σ_i^s(z_i). Thus, for any k ≥ 0

    q_i(c_i^k(z_i), u_i^k, z_i) + R_k σ_i^k(z_i) ≤ q_i(c_i^s(z_i), u_i^k, z_i) + R_k T_i(c_i^s(z_i), u_i^k, y_i^k, z_i)

    ≤ q_i(c_i^s(z_i), u_i^k, z_i) + R_k σ_i^s(z_i) + R_k ( T_i(c_i^s(z_i), u_i^k, y_i^k, z_i) - T_i(c_i^s(z_i), u_i^s, y_i^s, z_i) ) .

Hence

    σ_i^k(z_i) ≤ σ_i^s(z_i) + κ_i^{s,k}(z_i) ,

where κ_i^{s,k}(z_i) → 0 as s,k → ∞. This property follows from the convergence of the sequences {u_i^s}, {y_i^s} and the continuity of
the function (u_i,y_i) → T_i(c_i,u_i,y_i,z_i). It follows from the last inequality that the sequence {σ^s(z)} converges with probability one. Consequently, from Lemma 3, lim σ^s(z) = 0 for almost all z. Let c*(z) be any accumulation point of the sequence {c^s(z)}. Then c*(z) ∈ C(Hy*, y*, z) for almost all z. Thus y* ∈ Y_0. On the other hand,

    Φ(y*) ≤ lim inf_{s→∞} Φ̃_{R_s}(y^s) ≤ Φ(ȳ) .

Consequently Φ(ȳ) = Φ(y*), and the theorem has been proved.

The above theorem shows that use of the penalty method with a sufficiently large penalty coefficient yields good approximations to the solution of problem (2.6). Thus we have overcome the last of the three difficulties mentioned at the end of Section 2.1.
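The role of the penalty coefficient R can be seen on a one-dimensional deterministic toy problem (invented; the paper's setting is stochastic, where averaging may destroy exactness). For min{c² : c ≥ 1}, the penalized objective c² + R·max(0, 1 - c) has the infeasible minimizer c = R/2 when R < 2 and the true solution c = 1 once R ≥ 2.

```python
# One-dimensional illustration of the exact-penalty idea:
#   original problem:  min c**2  subject to  c >= 1        (solution c = 1)
#   penalized problem: min c**2 + R * max(0.0, 1.0 - c)
def penalized(c, R):
    return c * c + R * max(0.0, 1.0 - c)

def argmin_grid(R, lo=-2.0, hi=2.0, n=40001):
    """Brute-force minimizer of the penalized objective on a fine grid."""
    best_c, best_v = lo, penalized(lo, R)
    for j in range(1, n):
        c = lo + j * (hi - lo) / (n - 1)
        v = penalized(c, R)
        if v < best_v:
            best_c, best_v = c, v
    return best_c

print(round(argmin_grid(R=1.0), 2))   # → 0.5: R too small, minimizer infeasible
print(round(argmin_grid(R=4.0), 2))   # → 1.0: R large enough, exact solution
```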
3. THE DUAL APPROACH
3.1 The interaction balance method with feedback
In this section we shall describe a variant of the dual method for large-scale optimization problems. For simplicity, we assume that z in (1.3) is constant and that the performance index (1.5) and constraints (1.6) do not depend on z. Thus we have to solve the following optimization problem:

    min q(c,u)

subject to

    u - Hf(c,u,z) = 0 ,    (3.1)

    g(c,u) ≤ 0 .    (3.2)

If the exact value of the parameter z is known and the functions involved are sufficiently regular, then the solution of the above problem can be found by well-known large-scale optimization techniques (see, for example, ref. 5). However, in some control and large-scale optimization problems we encounter situations in which these techniques are not applicable. In control problems we do not usually know the exact values of the parameters z; we can only observe the behavior of the controlled system. In some large-scale optimization problems we cannot handle the constraints (3.1), even though we know z, because the functions (c,u) → f(c,u,z) are nondifferentiable. In both cases we have to content ourselves with an approximate solution which must, however, satisfy the constraints (3.1) and (3.2). In this section we develop a hierarchical method which generates suboptimal solutions of this type. The method is based on the interaction balance principle suggested in ref. 2 and analyzed in refs. 6, 9, and 10.
Let us introduce a coordination variable (price) p such that
p = (pl,. . . ,pN), and let us specify the problems to be solved
by the local decision-making units associated with the subsystems as
follows:
min Li (ci ,ui ,p) (3.3)
subject to
where
We assume that the functions mi are continuously differentiable approximations of the functions (ci ,ui) + fi (ci,u. ,z. ) . Let c (p) =
1 1
= ( E l (p) , . . . ,EN(p) ) and E (p) = (El (p) , . . . ,GN(p) ) be the solutions of problem (3.3). When the control is applied to the real
system described by (3.1 ) it produces interactions u (; (p) , z) . The c o o r d i n a t i o n t a s k is to find price @(z) such that
![Page 170: PROGRESS IN NONDIFFERENTIABLE OPTIMIZATIONpure.iiasa.ac.at/2019/1/CP-82-708.pdf · system, determine the optimal structure and compute the optimal parameters of the control system,](https://reader035.vdocument.in/reader035/viewer/2022070821/5f1f51a37d44e973e66b044c/html5/thumbnails/170.jpg)
This principle is an extension of the dual method of optimization to the class of problems considered in this section. Indeed, observe that if φi(ci,ui) ≡ fi(ci,ui,zi) then the function L(x,p) = Q(x) + <p, F(x)> (see (3.6) below) is the Lagrange function for the original problem. Note also that condition (3.5) is equivalent to the condition ū(p) = Hf(c̄(p),ū(p),z), which is the optimality condition in the dual method of optimization [5,7]. However, if φi(ci,ui) ≠ fi(ci,ui,zi) then the solution given by (3.5) is not necessarily optimal, although it still satisfies constraints (3.1) and (3.2). Some bounds on the loss of optimality have been derived in ref. 6, but it is still difficult to convince a mathematician of the validity of the above approach since it is motivated by preferences which cannot be formalized. However, once the coordination problem (3.5) has been formulated, we are faced with a well-defined mathematical problem: find a method for the solution of equation (3.5). This is not a trivial matter since the functions p → c̄(p), p → ū(p) and c → u(c,z) are generally nondifferentiable.
In the next section we shall investigate in detail the properties of the function p → D*(x̄(p),z). In Section 3.3 we shall construct an iterative algorithm, based on the properties of this function, which can be used to solve equation (3.5). However, before proceeding to these problems we shall introduce some useful notation and make a number of assumptions.
1. For any c∈C the equation u = Hφ(c,u) defines a unique solution u0(c). The linear operator I − Hφu(c,u) is nonsingular for c∈C.
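Assumption 1 can be illustrated in a toy setting: if the model map u → Hφ(c,u) is a contraction, the equation u = Hφ(c,u) has a unique solution u0(c), computable by fixed-point iteration. The matrix H and the function φ below are hypothetical.

```python
import numpy as np

# Hypothetical contraction data; u = H*phi(c,u) then has a unique root u0(c).
H = np.array([[0.0, 0.5],
              [0.3, 0.0]])
phi = lambda c, u: c + 0.1 * u

def u0(c, iters=100):
    # fixed-point iteration u <- H*phi(c,u); converges since the map is a contraction
    u = np.zeros_like(c)
    for _ in range(iters):
        u = H @ phi(c, u)
    return u
```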
We introduce the following notation: x = (c,u), x∈X, Q(x) = Σ_{i=1}^{N} qi(ci,ui), F(x) = u − Hφ(c,u), D(x) = u − u0(c), D*(x,z) = u − u(c,z). Using this notation we can write the lower-level problems in the form

min [L(x,p) = Q(x) + <p, F(x)>]   (3.6)

subject to

g(x) ≤ 0 .
The coordination problem (3.5) takes the form

D*(x̄(p),z) = 0 ,   (3.7)

where x̄(p) = (c̄(p),ū(p)) is the solution of (3.6).

Let p0 be the solution of the equation F(x̄(p)) = 0, and let x0 = x̄(p0). This solution may easily be obtained by the classical dual method. Let Ωp, Ωx be neighborhoods of p0, x0 which satisfy the relation x̄(Ωp) ⊂ Ωx. We then make the following assumptions:

2. The functions Q, F, and g are twice differentiable on Ωx and the function g is convex.

3. There exists a constant ν > 0 such that ∇²xx L(x,p) ≥ νI for x∈Ωx, p∈Ωp (i.e., the operator ∇²xx L(x,p) − νI is positive semidefinite).
3.2 Properties of lower-level solutions

In order to investigate the properties of the function p → x̄(p), we have to study the effects of the scalar inequality constraints gij(x) ≤ 0 (i∈{1,...,N}, j∈{1,...,ni}) involved in problem (3.6). Let I0(p) denote the set of constraints active at x̄(p), and let Γx(p,x) be the operator formed by the gradients of the active constraints. We assume that for p∈Ωp and x = x̄(p) the linear operator Γx(p,x) is nonsingular. Under this assumption there exist unique Lagrange multipliers λ̂(p) which correspond to the constraints in problem (3.6). We define the operator W(p) = ∇²xx(L(x̄(p),p) + <λ̂(p), g(x̄(p))>). Obviously, from assumptions 2 and 3, W(p) ≥ νI for p∈Ωp.

Let p¹, p² ∈ Ωp and let p(t) = (1−t)p¹ + tp². We shall investigate the behavior of the function t → x̄(p(t)) for t∈[0,1]. Since the set I0(p(t)) may change as t ranges between 0 and 1, this function may be nondifferentiable.
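This kind of nondifferentiability already appears in the simplest parametric problems; the following one-dimensional toy example (not taken from the text) has a solution map with a kink exactly where the active set changes.

```python
# Toy problem: min_x (x - p)^2  s.t.  x <= 0.
# For p < 0 the constraint is inactive and xbar(p) = p; for p >= 0 it is
# active and xbar(p) = 0.  Hence xbar(p) = min(p, 0): continuous, but its
# derivative jumps from 1 to 0 where the active set I0(p) changes.
def xbar(p):
    return min(p, 0.0)

dxbar = lambda p: 1.0 if p < 0 else 0.0   # derivative away from the kink at p = 0
```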
Lemma 4. If the set I0(p(t)) is constant in a neighborhood of the point t0∈(0,1) then the function t → x̄(p(t)) is differentiable at t0, and

dx̄(p(t))/dt = −B(p(t)) Fx*(x̄(p(t))) (p² − p¹) ,   (3.8)

where

B(p) = W⁻¹ ,  if I0(p) = ∅ ;
B(p) = W⁻¹ − W⁻¹Γx*(Γx W⁻¹ Γx*)⁻¹ Γx W⁻¹ ,  otherwise.   (3.9)

For brevity we have written W(p) as W and Γx(p,x̄(p)) as Γx. Moreover, for any p∈Ωp, B(p) = B*(p) and

B(p)x = W⁻¹s ,  where s = x − Γx*(Γx W⁻¹ Γx*)⁻¹ Γx W⁻¹ x ,   (3.10)

<B(p)x, x> ≥ 0 ,  with equality only for x ∈ R(Γx*(p,x̄(p)))   (3.11)

(N(A) and R(A) denote the null space and the range of the operator A, respectively).

Proof. The propositions of the lemma follow immediately from the necessary optimality conditions for problem (3.6) and the assumption that the operator Γx(p,x̄(p)) is nonsingular.
Observe that the function t → B(p(t)) is in general discontinuous, and so the function t → dx̄(p(t))/dt is also discontinuous. However, if the set I0(p(t)) changes a countable number of times as t ranges between 0 and 1, then the functions t → x̄(p(t)), t → D(x̄(p(t))) are absolutely continuous and

D(x̄(p²)) − D(x̄(p¹)) = −∫₀¹ Dx(x̄(p(t))) B(p(t)) Fx*(x̄(p(t))) (p² − p¹) dt .   (3.12)

This condition does not seem to be very restrictive in practice and we shall impose it on our problem.

Since we may assume that D*(x,z) ≈ D(x), equation (3.12) gives a good approximation to the difference D*(x̄(p²),z) − D*(x̄(p¹),z). Thus it seems reasonable to construct an algorithm for the solution of (3.7) based on equation (3.12), in a way similar to that
used in ref. 12. However, this leads to a number of problems. If p¹ is close to p0 then Dx(x̄(p¹)) and Fx(x̄(p¹)) are close to Dx(x0) and Fx(x0), respectively, but no statement of this type can be made about the operator B(p). Moreover, it seems unreasonable to compute B(p), since this involves many time-consuming operations. In the next section we shall develop an algorithm for the solution of (3.7) which does not require the computation of B(p); this is due to the properties (3.10) and (3.11) of the operator B(p).
3.3 The algorithm

Let ε > 0 and let Λ be a self-conjugate positive definite operator such that Λ: X → X, mΛI ≤ Λ ≤ MΛI. We define operators D0 = Dx(x0), F0 = Fx(x0), and E = (D0 Λ⁻¹ F0*)⁻¹. Let us consider the algorithm

p^{k+1} = p^k + εE D*(x̄(p^k), z) .   (3.13)

It is easy to verify that the matrix E is well-defined, since it follows from assumption 1 that N(D0) = N(F0). We shall prove that the mapping p^k → p^{k+1} has a fixed point in a neighborhood of p0 and that the algorithm (3.13) converges to this fixed point. To do this we must make the following important assumption.

4. There exists δ > 0 such that for p∈Ωp and v∈R(F0*) the inequality d(v, R(Γx*(p,x̄(p)))) ≥ δ‖v‖ is satisfied, where d(v,M) denotes the distance between the point v and the set M.

We now introduce a new norm ‖·‖0 in U, derived from the scalar product

<v, w>0 = <Λ⁻¹ F0* v, F0* w> .

Since N(F0*) = {0}, the norm ‖·‖0 is well defined.
Lemma 5. Let V(p) = p + εE D(x̄(p)). Let ‖W(p)‖ ≤ MW for p∈Ωp. Finally, let the Lipschitz-type inequality (3.14), with constant L,
hold for any p¹, p² ∈ Ωp. Then for any p¹, p² ∈ Ωp

‖V(p²) − V(p¹)‖0 ≤ [α(ε) + ½εbL(‖p¹ − p0‖0 + ‖p² − p0‖0)]‖p² − p¹‖0 ,   (3.15)

where b = ‖E‖0 and α(ε) is defined by (3.16); for ε > 0 small enough, α(ε) < 1.
Proof. Let p(t) = (1−t)p¹ + tp². Let S = R(F0*) and let PS denote the projection on S. We define linear operators AS: S → S, BS(p): S → S, FS: S → U, DS: S → U as follows: AS = PSΛ|S, BS(p) = PSB(p)|S, FS = F0|S, DS = D0|S. Next we define

R = ∫₀¹ B(p(t)) dt .

It follows from (3.10) that for any x∈S

B(p(t))x = W⁻¹s ,

where s = x − Γx*(ΓxW⁻¹Γx*)⁻¹ΓxW⁻¹x. For brevity we have written W(p(t)) as W, and Γx(p(t),x̄(p(t))) as Γx. Hence, from (3.11) and assumption 4, we have

‖s‖ ≥ d(x, R(Γx*)) ≥ δ‖x‖ ,

and thus

<B(p(t))x, x> = <W⁻¹s, s> ≥ (1/MW)‖s‖² ≥ (δ²/MW)‖x‖²   (3.17)

for x∈S.

Let v̄ = (I − εE D0 R F0*)(p² − p¹). Since N(D0) = N(F0), then E⁻¹ = D0Λ⁻¹F0* = F0Λ⁻¹D0*. Thus
‖v̄‖0² = <p² − p¹, p² − p¹>0 − 2ε<p² − p¹, (F0*)⁻¹AS R F0*(p² − p¹)>0 + ε²<(F0*)⁻¹AS R F0*(p² − p¹), (F0*)⁻¹AS R F0*(p² − p¹)>0 .   (3.18)
We shall now estimate the components of (3.18). It follows from the definition of <·,·>0 and from (3.17) that the second term of (3.18) admits a lower bound; next, we obtain from (3.11) an upper bound for the third term; finally, it follows from the definition of the norm ‖·‖0 that the first term equals ‖p² − p¹‖0². Using the above three estimates, we obtain from (3.18) the inequality

‖v̄‖0 ≤ α(ε)‖p² − p¹‖0 ,

where α(ε) is defined by (3.16).

It is now easy to prove the inequality (3.15): it follows from (3.12) that V(p²) − V(p¹) can be expressed through v̄ plus a remainder, which yields the estimate (3.19); next, by virtue of (3.14), we obtain the estimate (3.20).
The inequality (3.15) follows from (3.19) and (3.20), and the lemma has been proved.

Now let us define the operation

δ(x,z) = D*(x,z) − D(x) = u0(c) − u(c,z) .

Theorem 3. Let the assumptions of Lemma 5 be satisfied. Let also

‖δ(x̄(p¹),z) − δ(x̄(p²),z)‖ ≤ ℓ‖p¹ − p²‖0   (3.21)

for p¹, p² ∈ Ωp. Next, let

‖E D*(x0,z)‖0 ≤ η .   (3.22)

We define the following constants: q̄ = (1 − α(ε))/ε, b = ‖E‖0, h = bLη, r = (q̄ − bℓ − [(q̄ − bℓ)² − 2h]^{1/2})/(bL). Let the following conditions be satisfied:

(q̄ − bℓ)² ≥ 2h ,   (3.23)

K(p0; r) ⊂ Ωp .   (3.24)

Then equation (3.7) has a unique solution p̂(z) in K(p0;r) and the sequence {p^k} generated by (3.13) converges to p̂(z).

Proof. We define the operation V*(p) = p + εE D*(x̄(p),z). It follows from Lemma 5 and (3.21) that

‖V*(p²) − V*(p¹)‖0 ≤ [α(ε) + εbℓ + ½εbL(‖p¹ − p0‖0 + ‖p² − p0‖0)]‖p² − p¹‖0   (3.25)

for any p¹, p² ∈ Ωp. We define the function ψ: R¹ → R¹,

ψ(t) = ½εbL t² + (α(ε) + εbℓ)t + εη .

It follows from (3.25) that for p¹, p² ∈ Ωp and t ≥ ½(‖p¹ − p0‖0 + ‖p² − p0‖0) the following inequality holds:

‖V*(p²) − V*(p¹)‖0 ≤ ψ′(t)‖p² − p¹‖0 .   (3.26)
Next,

‖V*(p0) − p0‖0 ≤ εη = ψ(0) .   (3.27)

In addition to the sequence {p^k} we shall consider the sequence {t_k} defined by the formula t_{k+1} = ψ(t_k), t_0 = 0. It is evident that t_k ↑ r, since r is the smaller of the two roots of the equation ψ(t) = t. We are going to prove that {t_k} is the majorizing sequence for the sequence {p^k}, i.e., that ‖p^{k+1} − p^k‖0 ≤ t_{k+1} − t_k. It follows from (3.27) that

‖p¹ − p0‖0 ≤ t_1 − t_0 .

Let us suppose that

‖p^l − p^{l−1}‖0 ≤ t_l − t_{l−1}

for 1 ≤ l ≤ k. Let p(τ) = (1−τ)p^{k+1} + τp^k, and let 0 = τ_0 ≤ τ_1 ≤ ... ≤ τ_N = 1. It follows from (3.26) that

‖p^{k+2} − p^{k+1}‖0 = ‖V*(p^{k+1}) − V*(p^k)‖0 ≤ Σ_{i=0}^{N−1} ‖V*(p(τ_{i+1})) − V*(p(τ_i))‖0 ,

where each term on the right can be estimated by (3.26). Approaching the limit as N → ∞, we find that sup_i |τ_{i+1} − τ_i| → 0 and we obtain the inequality

‖p^{k+2} − p^{k+1}‖0 ≤ ψ(t_{k+1}) − ψ(t_k) = t_{k+2} − t_{k+1} .
Thus, we have proved by induction that ‖p^{k+1} − p^k‖0 ≤ t_{k+1} − t_k for all k ≥ 0, and hence that the sequence {p^k} converges to the solution p̂ ∈ K(p0;r) of equation (3.7). In order to prove that this solution is unique, let us assume that V*(p̃) = p̃ for some p̃ ∈ K(p0;r). Let t̃_0 = ‖p̃ − p0‖0 and t̃_{k+1} = ψ(t̃_k). Adopting a method similar to that used above, it is possible to prove that ‖p^k − p̃‖0 ≤ t̃_k − t_k. But lim t̃_k = lim t_k = r, and hence p̃ = p̂. Finally, let us note that in a similar fashion it can be proved that the solution p̂ is unique in K(p0; r̃), where r̃ = (q̄ − bℓ + [(q̄ − bℓ)² − 2h]^{1/2})/(bL) is the greater of the two roots of the equation ψ(t) = t. Moreover, if the algorithm (3.13) is initialized with p^0 ∈ int K(p0; r̃) it generates a sequence {p^k} which converges to p̂. Thus the theorem has been proved.
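The majorizing-sequence argument can be checked numerically. In the sketch below all constants (ε, b, L, ℓ, α, η) are hypothetical values, not quantities from the text; the point is only that t_{k+1} = ψ(t_k), t_0 = 0, increases monotonically to r, the smaller root of ψ(t) = t.

```python
import math

# Hypothetical constants satisfying the discriminant condition (q - b*ell)^2 >= 2h.
eps, b, L, ell, alpha, eta = 0.5, 1.0, 2.0, 0.1, 0.2, 0.05
psi = lambda t: 0.5 * eps * b * L * t**2 + (alpha + eps * b * ell) * t + eps * eta

t = 0.0
for _ in range(200):
    t = psi(t)                    # monotone increase toward the fixed point

qbar = (1.0 - alpha) / eps
h = b * L * eta
r = (qbar - b * ell - math.sqrt((qbar - b * ell)**2 - 2.0 * h)) / (b * L)
# t has converged to r, the smaller root of psi(t) = t
```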
To sum up, we have solved equation (3.7), which involves the nondifferentiable functions p → x̄(p) and x → D*(x,z), using a Newton-like algorithm (3.13). The algorithm exploited a number of special features of the problem under consideration, the most important of which were: D*(x,z) ≈ D(x) (conditions (3.21) and (3.23)), N(D0) = N(F0) (assumption 1), and B > 0 on S (Lemma 4 and (3.17)). It is worth noting that if we replace (3.23) by a strict inequality then the operation V* has contraction mapping properties in K(p0;r). This feature is important for control problems in which z is a time-varying parameter, because it is then possible to trace the moving solution p̂(z(t)) of the nonstationary equation D*(x̄(p),z(t)) = 0 (see ref. 11).
REFERENCES

1. Yu.M. Ermoliev, Methods of Stochastic Programming (in Russian), Nauka, Moscow, 1976.
2. W. Findeisen, Multilevel Control Systems (in Polish), PWN, Warsaw, 1974 (German translation: Hierarchische Steuerungssysteme, Verlag Technik, Berlin, 1977).
3. W. Findeisen et al., On-line hierarchical control for steady-state systems, IEEE Trans. Automat. Contr. (Special Issue on Decentralized Control and Large Scale Systems), Vol. AC-23 (1978), 139-209.
4. W. Findeisen, F.N. Bailey, M. Brdys, K. Malinowski, P. Tatjewski, and A. Wozniak, Control and Coordination in Hierarchical Systems, Volume 9 in the IIASA International Series on Applied Systems Analysis, John Wiley, Chichester, 1980.
5. L.S. Lasdon, Optimization Theory for Large-Scale Systems, Macmillan, New York, 1970.
6. K. Malinowski and A. Ruszczynski, Application of interaction balance method to real process coordination, Control and Cybernetics, 4, 2 (1975).
7. M.D. Mesarovic et al., Theory of Hierarchical, Multilevel Systems, Academic Press, New York, 1970.
8. E.A. Nurminski, The quasigradient method for the solution of nonlinear programming problems, Cybernetics, 9, 1 (1974), 145-150.
9. A. Ruszczynski, An algorithm for the interaction balance method with feedback (in Polish), Archiwum Automatyki i Telemechaniki, 21, 1 (1976), 137-150.
10. A. Ruszczynski, Convergence conditions for the interaction balance algorithm based on an approximate mathematical model, Control and Cybernetics, 5, 4 (1976), 29-43.
11. A. Ruszczynski, Coordination of nonstationary systems, IEEE Trans. Automat. Contr., Vol. AC-24 (1979), 51-62.
12. A.I. Zinchenko, On the approximate solution of functional equations with nondifferentiable operators (in Russian), Matem. Fizika (1973), 55-58.
LAGRANGIAN FUNCTIONS AND NONDIFFERENTIABLE OPTIMIZATION
A.P. Wierzbicki International Institute for Applied Systems Analysis Laxenburg, Austria
1. INTRODUCTION
Rapid development of and intensive research into nondifferentiable optimization techniques (Balinski and Wolfe, 1975) have recently resulted in algorithms that are, in the differentiable case, closely related or even equivalent to known and effective techniques of differentiable optimization. A very interesting quasi-Newton technique for nondifferentiable optimization was proposed and partly investigated in Lemaréchal (1978). To
understand fully possible weak and strong points of quasi-Newton
methods in nondifferentiable optimization, a more exhaustive
study of various relations between nondifferentiable and differ-
entiable problems is needed. Because of the large variety of
nondifferentiable problems, this goal cannot be achieved in a
short paper. However, some theoretical insight can be obtained
by analyzing the most simple type of nondifferentiable problems:
minimize f(x) ,  f(x) = max_{i∈I} f_i(x) ,  x ∈ X ,   (1)

where X is a convex set with nonempty interior in R^n (possibly X = R^n) and I is a countable set of indexes (possibly finite). It is assumed that f is bounded from below on X and that max_{i∈I} f_i(x)
for each x ∈ X is attained at a finite subset A(x) = Â ⊂ I; the f_i: R^n → R¹ are twice-differentiable functions. It is not necessarily assumed that the Haar condition is satisfied, that is, if x̂ ∈ Arg min_{x∈X} max_{i∈I} f_i(x), then for any subset Ā ⊂ A(x̂), the matrix composed of f_ix(x̂) for i ∈ Ā has maximal rank. If this condition is satisfied, then x̂ is uniquely determined by f_ix(x̂), i ∈ A(x̂), only, and some efficient algorithms for solving the problem (1) are known (Madsen and Schjaer-Jacobsen, 1977); however, this condition is rarely satisfied in practical problems. Other conditions of second-order type resulting in the uniqueness of x̂ are further assumed to hold, together with conditions implying the uniqueness of barycentric coordinates in subdifferential sets.
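A toy instance of problem (1) makes the objects involved concrete; the smooth pieces f_i below are hypothetical, and the activity set A(x) simply collects the indices attaining the maximum.

```python
# Hypothetical smooth pieces for f(x) = max_i f_i(x), x in R^1.
fs = [lambda x: (x - 1.0) ** 2,
      lambda x: (x + 1.0) ** 2,
      lambda x: 0.5 + x ** 2]

def f(x):
    # the (generally nondifferentiable) max-function of problem (1)
    return max(fi(x) for fi in fs)

def active_set(x, tol=1e-9):
    # A(x): indices at which the maximum is attained (up to a tolerance)
    vals = [fi(x) for fi in fs]
    m = max(vals)
    return [i for i, v in enumerate(vals) if m - v <= tol]
```

At a point where two or more pieces are simultaneously active, f has a kink and its subdifferential is the convex hull of the active gradients.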
The functions f_i might be assumed convex or not; this problem is discussed in detail later. The assumptions of countability of I and finiteness of Â could actually be relaxed, although this generalization is beyond the scope of this paper. If the functions f_i are not differentiable, it is often possible to reformulate the problem (1) by enlarging the set I in such a way that the modified functions f_i are differentiable. It would seem, therefore, that the class of nondifferentiable problems considered could be extended to cover almost all problems encountered in practice. However, still more assumptions are needed for a theoretical investigation: that the activity set A(x) = Â can be determined explicitly for each x ∈ X and that the subdifferential ∂f(x) of f at x can also be fully determined:

∂f(x) = conv {f*_ix(x) : i ∈ A(x)} ,   (2)

where f_ix(x) are the gradients of the functions f_i at x (written as column vectors, hence the transposition sign *). This assumption is not always satisfied in practical problems of nondifferentiable optimization and can even be considered as contradictory to the very nature of nondifferentiable optimization techniques, where one of the main problems is to estimate the subdifferential ∂f(x) without knowing its full description. On the other hand,
in order to obtain a better theoretical insight, it is useful to proceed in two stages: first, investigate the implications when A = A(x) and ∂f(x) are known explicitly, then try to relax this assumption and check for which theoretical properties this assumption is crucial.

Under the assumption that A = A(x) and ∂f(x) are known explicitly, problem (1) is equivalent to a constrained differentiable optimization problem which can be studied by introducing a normal or an augmented Lagrangian function, depending on convexity assumptions. In this way, the relation of nondifferentiable techniques for solving problem (1) to known techniques of differentiable optimization can be investigated, sufficient conditions of optimality can be studied, and some strong (superlinear or even quadratic) convergence properties of a special variant of nondifferentiable quasi-Newton techniques can be deduced. These strong results substantiate the introduction of a special class of nondifferentiable optimization problems with explicitly known subdifferentials, which are in fact equivalent to differentiable problems and can be solved efficiently by appropriate techniques--provided the number of elements in the activity set, |A(x)|, is not too large to make the explicit definition of the subdifferential computationally cumbersome.
If A = A(x) and ∂f(x) are not known explicitly, or their explicit definition is computationally cumbersome, then only some subgradients g ∈ ∂f(x) can be computed, without specifying f_ix(x) and the barycentric coordinates y_i. This constitutes the opposite class of nondifferentiable optimization problems with implicit subdifferentials. In this class, it is very difficult to construct a quasi-Newton method that would converge superlinearly, since the subgradient g cannot be used to obtain a sufficiently accurate approximation of a Hessian matrix that would result in strong convergence properties; in fact, such a construction seems to be impossible. However, many algorithms with sublinear or linear convergence are known for the case of implicit subdifferentials, and these algorithms are in fact more practically efficient for problems in which the explicit definition of the subdifferential is computationally cumbersome.
2. A BASIC LEMMA

One of the fundamental problems in nondifferentiable optimization is as follows. Given the set ∂f(x) or an approximation G thereof, expressed by convex combinations of a set of vectors g_i ∈ R^n for i ∈ Â and by some accuracy parameters α_i ≥ 0, i ∈ Â (where Â = A(x), g_i = f_ix(x), and α_i = 0 if the set ∂f(x) = G is given explicitly), check whether 0 ∈ G, and if not, find the vector ĝ ∈ G of minimal norm, subject to accuracy corrections if α_i > 0. When using quasi-Newton methods of nondifferentiable optimization, the norm in which ĝ is minimized must be chosen according to some other properties of the problem. Therefore, denote ‖g‖²_{H⁻¹} = <g, H⁻¹g>, where H⁻¹: R^n → R^n is a given positive definite matrix. The basic problem can be stated as follows:

minimize ( ½‖g‖²_{H⁻¹} + Σ_{i∈Â} α_i y_i )  subject to  g = Σ_{i∈Â} y_i g_i ,  Σ_{i∈Â} y_i = 1 ,  y_i ≥ 0 .   (3)
The following lemma is actually only an extension of the results given in Lemaréchal (1978), but, since it is fundamental for the investigation of relations between nondifferentiable optimization methods and equivalent Lagrangian function approaches, it is presented here in detail.

Lemma 1. The problem (3) is equivalent to the following dual problem

minimize ( x0 + ½‖x‖²_H )  subject to  <g_i, x> − x0 − α_i ≤ 0 ,  i ∈ Â ,   (4)

where ‖x‖²_H = <x, Hx>. The equivalence is to be understood in the sense that, if ĝ, ŷ are solutions to the problem (3) with
Lagrange multipliers x̄ for the constraint g − Σ_{i∈Â} y_i g_i = 0 and x̄0 for the constraint Σ_{i∈Â} y_i − 1 = 0, then x̄0, x̄ are solutions of problem (4) with Lagrange multipliers ŷ_i for the constraints <g_i, x> − x0 − α_i ≤ 0, with x̄ = −H⁻¹ĝ and x̄0 = −‖ĝ‖²_{H⁻¹} − Σ_{i∈Â} α_i ŷ_i ≤ 0.

The following equivalences also hold: x̄ = 0 ⟺ ĝ = 0, and then x̄0 = −min_{i∈Â} α_i, so that x̄0 = 0 if any of the α_i = 0; generally, −x̄0 ≥ ‖x̄‖²_H ≥ 0 and −x̄0 − Σ_{i∈Â} α_i ŷ_i ≥ 0. Moreover, the solutions x̄0, x̄, ĝ of the problems (3), (4) are unique, whereas ŷ is unique if the vectors h_i = (−1, g_i) ∈ R^{n+1} are linearly independent for i ∈ Â; generally, ŷ ∈ Ŷ, where Ŷ is a compact convex set. Even if ŷ is not unique, it minimizes Σ_{i∈Â} α_i y_i over y ∈ Y = {y : y_i ≥ 0, Σ_{i∈Â} y_i = 1, Σ_{i∈Â} y_i g_i = ĝ}. If ŷ is unique, then, for any positive definite H⁻¹, the solutions x̄0, x̄, ĝ, ŷ depend Lipschitz-continuously on the data g_i, α_i.
Proof. Both problems are convex. Consider first the question of the uniqueness of their solutions. Problem (3) clearly has a unique solution ĝ in g and, if the h_i are linearly independent for i ∈ Â, a unique solution ŷ in y. Observe that the linear dependence of the vectors h_i = (−1, g_i) ∈ R^{n+1}, that is, the existence of λ_i ≠ 0 such that Σ_{i∈Â} λ_i h_i = 0, is equivalent to the existence of λ_i ≠ 0 such that Σ_{i∈Â} λ_i = 0, λ_j = −Σ_{i≠j} λ_i, and λ_j g_j = −Σ_{i≠j} λ_i g_i, which, in turn, is equivalent to the existence of λ̄_i = −(λ_i/λ_j) with Σ_{i≠j} λ̄_i = 1 and g_j = Σ_{i≠j} λ̄_i g_i. If the h_i are linearly independent, such a situation cannot occur and an arbitrary g_j, j ∈ Â, cannot be a convex combination of the other g_i; this implies the uniqueness of the barycentric coordinates ŷ_i. If the h_i are linearly dependent, choose a minimal subset Ā ⊂ Â such that ĝ = Σ_{i∈Ā} ȳ_i g_i, Σ_{i∈Ā} ȳ_i = 1,
ȳ_i > 0, and put ȳ_i = 0 for i ∉ Ā. If the choice of Ā is not unique, define such a ȳ for each Ā; define a set Ȳ as the convex hull of all such ȳ. Since all y ∈ Ȳ result in the same ĝ, the set of optimal ŷ is defined by Ŷ = Arg min_{y∈Ȳ} Σ_{i∈Â} α_i y_i; Ŷ is a compact and convex set. Problem (4) has a unique solution (x̂0, x̂), since x̂01 + ½‖x̂1‖²_H = x̂02 + ½‖x̂2‖²_H implies x̂01 = x̂02 if x̂1 = x̂2; if x̂1 ≠ x̂2, then x̄0 = θx̂01 + (1−θ)x̂02, x̄ = θx̂1 + (1−θ)x̂2 for θ ∈ (0,1) would yield a smaller value of x̄0 + ½‖x̄‖²_H.

Consider now problem (3) and define the Lagrangian function:

L(x, x0, y, g) = ½‖g‖²_{H⁻¹} + Σ_{i∈Â} α_i y_i + <x, g − Σ_{i∈Â} y_i g_i> + x0(Σ_{i∈Â} y_i − 1) .   (5)

Since the equality constraints in (3) are affine, each solution (ŷ, ĝ) of (3) together with the corresponding Lagrange multipliers (x̄, x̄0) form a saddle point of the function (5) under the additional constraint y_i ≥ 0. Hence, if X̂, X̂0 denote the sets of possible Lagrange multipliers x̄, x̄0 for (ŷ, ĝ) ∈ Ŷ × {ĝ}, then:

X̂ × X̂0 × Ŷ × {ĝ} = Arg max_{(x,x0)∈R^{n+1}} min_{y≥0, g∈R^n} L(x, x0, y, g) = Arg min_{y≥0, g∈R^n} max_{(x,x0)∈R^{n+1}} L(x, x0, y, g) ,   (6)

where Arg min max is the set of points resulting in min max, etc. Compute the value ĝ that minimizes L for given x̄, x̄0, ȳ; clearly, ĝ = ĝ(x̄) = −Hx̄; this implies that x̄ = −H⁻¹ĝ and X̂ = {x̄} is unique. Moreover, after easy computation,
L̄(x, x0, y) = min_{g∈R^n} L(x, x0, y, g) = −(x0 + ½‖x‖²_H) + Σ_{i∈Â} y_i(x0 + α_i − <g_i, x>) ,   (7)

and at the saddle point −x̄0 = ½‖x̄‖²_H + ½‖ĝ‖²_{H⁻¹} + Σ_{i∈Â} α_i ŷ_i; since ‖x̄‖²_H = ‖ĝ‖²_{H⁻¹}, this implies x̄0 = −‖ĝ‖²_{H⁻¹} − Σ_{i∈Â} α_i ŷ_i. Hence x̄0 is also unique, X̂0 = {x̄0}. Obviously, x̄ = 0 ⟺ ĝ = 0, and x̄0 = 0 ⟺ x̄ = 0, Σ_{i∈Â} α_i ŷ_i = 0.

The function L̄ in (7) is the Lagrangian function for problem (4). Observe that problem (4) satisfies the Slater condition, since x1 = 0, x01 > 0 are admissible for the problem and <g_i, x1> − x01 − α_i < 0 for all i ∈ Â. Moreover, it is well known that

Arg max_{(x,x0)∈R^{n+1}} min_{y≥0} L̄(x, x0, y) = {x̂0} × {x̂} × Ỹ ,

where x̂0, x̂ are the unique solutions of (4) and Ỹ is the set of corresponding Lagrange multipliers. But relation (7), together with (6), implies that this set coincides with {x̄0} × {x̄} × Ŷ. Hence Ỹ = Ŷ. If Ŷ = {ŷ}, the Lipschitz-continuity of x̄, x̄0, ŷ, and ĝ results from general properties of solutions of sets of equations and inequalities (see Szymanowski, 1977 and Wierzbicki, 1978). Moreover, since x̄0 is the solution of (4), x̄ = 0 implies −x̄0 ≤ α_i, i ∈ Â, and x̄0 = −min_{i∈Â} α_i; if any of the α_i = 0, then x̄ = 0 implies x̄0 = 0. Conversely, x̄0 = 0 implies ‖ĝ‖²_{H⁻¹} + Σ_{i∈Â} α_i ŷ_i = 0, hence ĝ = 0 and x̄ = 0.
A large part of the above lemma is given in Lemaréchal (1978), however, without the full interpretation of x̂_0, x̂ as Lagrange multipliers for (3) and without the uniqueness or Lipschitz-continuity arguments. It is also observed in Lemaréchal (1978) that problem (3) is easier to solve computationally than (4); in fact, the equation ĝ = Σ_{i∈A} ŷ_i g_i defines ĝ explicitly, and it is treated as a constraint in the lemma only in order to provide an interpretation for ĝ. There exist very efficient algorithms for solving (3) in ŷ and ĝ if a_i = 0 (see Wolfe, 1976 and Hohenbalken, 1978); these algorithms can also be adapted to the case when a_i > 0. Once ŷ and ĝ are defined, x̂ and x̂_0 are easily computed.
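The specialized nearest-point algorithms of Wolfe (1976) and Hohenbalken (1978) are not reproduced here; the sketch below solves the same subproblem (the case a_i = 0 of (3): project the origin onto the convex hull of given subgradients g_i) with a plain Frank-Wolfe iteration, which is simple though far less efficient. All data in the example are illustrative.

```python
import numpy as np

def min_norm_in_hull(G, iters=2000):
    """Frank-Wolfe sketch for the projection subproblem behind (3):
    find y on the unit simplex minimizing ||G y||, i.e. the point of
    the convex hull of the columns g_i of G closest to the origin."""
    m = G.shape[1]
    y = np.full(m, 1.0 / m)          # start at the barycenter
    Q = G.T @ G                      # Gram matrix of the subgradients
    for k in range(iters):
        grad = 2.0 * Q @ y           # gradient of ||G y||^2 in y
        j = int(np.argmin(grad))     # best simplex vertex (linear oracle)
        gamma = 2.0 / (k + 2.0)      # standard open-loop step size
        y = (1.0 - gamma) * y        # convex combination keeps y feasible
        y[j] += gamma
    return y, G @ y

# Two opposed gradients: the hull contains 0, so g_hat should nearly vanish
G = np.array([[1.0, -1.0], [0.0, 0.0]])
y_hat, g_hat = min_norm_in_hull(G)
```

The Frank-Wolfe iterate stays on the simplex by construction, which is why no explicit projection step is needed.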
Lemma 1 also allows a straightforward generalization to problems with infinitely (even uncountably) many variables and constraints in Hilbert spaces; but this generalization is beyond the scope of this paper.
3. NONDIFFERENTIABLE OPTIMIZATION WITH EXPLICIT SUBDIFFERENTIALS
3.1 Fundamentals
If the activity set A(x) and the subdifferential ∂f(x) are given explicitly at each x ∈ X, then the nondifferentiable problem (1) is equivalent to the following differentiable one:

    minimize x_0 over (x_0,x) ∈ X_0 = {(x_0,x) ∈ R^{n+1} : f_i(x) - x_0 ≤ 0, i ∈ I, x ∈ X} ,   (9)

with the activity set A(x) defined equivalently by

    A(x) = {i ∈ I : f_i(x) = f_0(x)} ,  f_0(x) = max_{i∈I} f_i(x) .
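As a quick numerical illustration of this equivalence, the sketch below sets up a two-function instance of problem (1) together with its activity set; the functions f1, f2, the grid, and the tolerance eps are illustrative choices, not taken from the text.

```python
import numpy as np

# A toy instance of problem (1): f(x) = max_i f_i(x) in one variable.
# The epigraph reformulation (9) replaces it by: minimize x0 subject to
# f_i(x) - x0 <= 0. Here we only locate the minimax solution on a grid
# and read off the activity set A(x) there.
f = [lambda x: (x - 1.0) ** 2, lambda x: (x + 1.0) ** 2]

def f0(x):                       # f0(x) = max_i f_i(x)
    return max(fi(x) for fi in f)

def activity_set(x, eps=1e-8):   # A(x) = {i : f_i(x) = f0(x)} up to eps
    top = f0(x)
    return [i for i, fi in enumerate(f) if top - fi(x) <= eps]

xs = np.linspace(-2.0, 2.0, 4001)
x_star = xs[np.argmin([f0(x) for x in xs])]   # minimax solution, here 0
```

At the solution both functions are active, which is exactly the situation where the subdifferential is a nontrivial convex hull of gradients.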
If the functions f_i are convex, then problem (9) is convex and clearly satisfies the Slater condition with any x_1 ∈ X and x_01 > f_0(x_1). Thus, the normal Lagrange function

    L(y,x_0,x) = x_0 + Σ_{i∈I} y_i (f_i(x) - x_0)   (11)

has a saddle-point (ŷ,x̂_0,x̂) at a solution (x̂_0,x̂) of problem (9) with a corresponding Lagrange multiplier ŷ, whereas x̂ is a solution of (1) and x̂_0 = f(x̂) = min_{x∈X} max_{i∈I} f_i(x) is the minimal value of f. It is assumed further that x̂ is a unique solution to (9) and an internal point of X.
If the number |I| of constraints in (9) is large, then a purely dual method for solving (9) by assuming arbitrary y = {y_i}_{i∈I}, y_i ≥ 0, and then minimizing the Lagrangian function (11) is clearly not efficient. But a primal-dual method for solving (9), which consists of determining the activity set A(x) or an approximation A thereof and eliminating inactive constraints by setting y_i = 0 for i ∈ I\A, might be quite efficient; it is shown further that one of these primal-dual methods is probably the most efficient algorithm for nondifferentiable optimization, if |A(x̂)| is not too large.
Suppose y_i ≥ 0 for i ∈ A are chosen in such a way that Σ_{i∈A} y_i = 1. Then L_x0(y,x_0,x) = 0 and

    L_x(y,x_0,x) = Σ_{i∈A} y_i f_ix(x) = g ∈ ∂f(x)   (if A = A(x)) .   (12)

Thus, if only A = A(x̂) and ŷ_i ≥ 0, i ∈ A(x̂), Σ_{i∈A(x̂)} ŷ_i = 1 such that Σ_{i∈A(x̂)} ŷ_i f_ix(x̂) = ĝ = 0 were known, then solving the equivalent problems (1), (9) would also be equivalent to minimizing the function:

    Σ_{i∈A(x̂)} ŷ_i f_i(x) .
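The reduction above -- with the correct baricentric coordinates ŷ_i, the minimax problem collapses to minimizing the smooth weighted sum Σ ŷ_i f_i(x) -- can be checked on an assumed example (f1, f2 and y_hat are illustrative choices, not from the text):

```python
import numpy as np

# With y_hat = (0.5, 0.5), the weighted sum 0.5*f1 + 0.5*f2 = x^2 + 1 is
# smooth and has the same minimizer as the nonsmooth max(f1, f2).
f1 = lambda x: (x - 1.0) ** 2
f2 = lambda x: (x + 1.0) ** 2
y_hat = np.array([0.5, 0.5])     # baricentric coordinates of 0 in the
                                 # hull of the two gradients at the solution

xs = np.linspace(-2.0, 2.0, 4001)
weighted = [y_hat[0] * f1(x) + y_hat[1] * f2(x) for x in xs]
minimax = [max(f1(x), f2(x)) for x in xs]
x_w = xs[np.argmin(weighted)]    # minimizer of the smooth weighted sum
x_m = xs[np.argmin(minimax)]     # minimizer of the nonsmooth max
```

Both grid minimizers coincide at the minimax solution x = 0, which is what makes knowing ŷ so valuable and what motivates the primal-dual schemes below.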
However, not only are the optimal values ŷ_i not known, but the activity set A(x) also changes, often in arbitrary neighborhoods of x̂. Also, the strong activity set

    S_A(y) = {i ∈ A(x) : y_i > 0}   (14)

might change in any neighborhood of (ŷ,x̂). These difficulties are not uniquely related to nondifferentiable problems; they are also known in constrained differentiable problems. A typical way of resolving them (see, e.g., Wierzbicki, 1978) is to construct approximations A of A(x̂) and S of S_A(ŷ) such that y_i = 0 for i ∉ A, y_i > 0 for i ∈ S, and

    [display (15) illegible in the scan]

and that, for (y,x) in some neighborhood of (ŷ,x̂):

    [display illegible in the scan]

A detailed way of constructing such an approximation is discussed in Section 4. Here note only that a measure of the distance from (ŷ,x̂) to (y,x) is useful when constructing such approximations. Define

    w = ‖(L_x, L_y)‖ ,  L_x = Σ_{i∈A} y_i f_ix(x) ,  L_{y_i} = y_i (f_i(x) - f_0(x)) for i ∈ A ,   (16a,b)

where the norm may be chosen arbitrarily.
Here L_{y_i} is not precisely the derivative of L in y_i, but it measures the violation of the Kuhn-Tucker necessary conditions for the optimality of (ŷ,x̂): if A(x̂) ⊂ A and L_x = 0, L_{y_i} = 0 for i ∈ A, then w = 0 and, if ŷ, x̂ are unique, clearly y = ŷ and x = x̂. A lemma on the estimation of ‖(y - ŷ, x - x̂)‖ by w is given in Section 4.
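The measure w can be computed directly from its ingredients L_x and L_{y_i}. A sketch on an assumed two-function example (f1, f2, their gradients, and the test points are illustrative): at the solution with the correct multipliers w vanishes, away from it w is positive.

```python
import numpy as np

# Optimality measure w from (16): the norm of the Lagrangian stationarity
# residual L_x and the complementarity residuals L_yi.
fs    = [lambda x: (x - 1.0) ** 2, lambda x: (x + 1.0) ** 2]
grads = [lambda x: 2.0 * (x - 1.0), lambda x: 2.0 * (x + 1.0)]

def w_measure(y, x, A):
    f0 = max(fi(x) for fi in fs)
    L_x = sum(y[i] * grads[i](x) for i in A)        # stationarity part
    L_y = [y[i] * (fs[i](x) - f0) for i in A]       # complementarity part
    return np.linalg.norm(np.concatenate(([L_x], L_y)))

w_opt = w_measure([0.5, 0.5], 0.0, [0, 1])   # at the solution: w = 0
w_off = w_measure([0.5, 0.5], 0.5, [0, 1])   # away from it: w > 0
```

This is the quantity that the stopping and activity-set tests of Section 4 are built on.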
3.2 Quadratic Approximations

Consider now an approximation of the subdifferential ∂f(x) by the set G:

    [display (17) illegible in the scan]

and assume that 0 = Σ_{i∈A} ŷ_i f_ix(x) ∈ G. Although G is only an approximation, the relation 0 ∈ G might imply x = x̂ provided that Σ_{i∈A} ŷ_i (f_0(x) - f_i(x)) = 0, since then L_x = 0, L_{y_i} = 0, and w = 0. This leads to a problem analogous to (3):

    minimize_{(y,g)∈Y_G} ( ½ ‖g‖²_{H⁻¹} + Σ_{i∈A} a_i y_i ) ;  a_i = f_0(x) - f_i(x) ;   (18a)

where H⁻¹ is a positive definite matrix, not chosen as yet. But, due to Lemma 1, (18a) is equivalent to:
    minimize_{(x̃_0,x̃)∈X̃_0} ( ½ <x̃,Hx̃> + x̃_0 ) ,   (18b)

    X̃_0 = {(x̃_0,x̃) ∈ R^{n+1} : <f_ix(x),x̃> - x̃_0 - f_0(x) + f_i(x) ≤ 0 , i ∈ A} ,

and the choice of H⁻¹ or H is now clear: (18b) is a well-known quadratic approximation problem for the Lagrangian function (11) (see, e.g., Szymanowski, 1977 and Wierzbicki, 1978), and the optimal choice of the matrix H is to approximate the Hessian of the Lagrangian function (11) as closely as possible, either by direct computation of second-order derivatives (a Newton-type method) or, for example, by variable metric techniques based on the data Σ_{i∈S} y_i f_ix(x) for (y,x) close to (ŷ,x̂) and S close to S_A(ŷ).
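The structure of (18a) can be illustrated on a small instance. In the sketch below all data are illustrative (f1 = (x-1)^2, f2 = (x+1)^2, the point x = 0.5, and H = I), and the dual subproblem is solved by brute force over the two-point simplex; recovering the primal step as -ĝ under H = I is an assumption made by analogy with Lemma 1, not a formula quoted from the text.

```python
import numpy as np

# Dual subproblem (18a) with H = I: minimize 0.5*g^2 + sum_i a_i*y_i over
# the unit simplex, where g = sum_i y_i*f_ix(x), a_i = f0(x) - f_i(x).
# With |A| = 2 the simplex is a segment, so a scan in t = y_1 suffices.
x = 0.5
G = np.array([2.0 * (x - 1.0), 2.0 * (x + 1.0)])    # gradients f_ix(x)
fv = np.array([(x - 1.0) ** 2, (x + 1.0) ** 2])
a = fv.max() - fv                                   # a_i = f0(x) - f_i(x)

ts = np.linspace(0.0, 1.0, 100001)                  # t = y_1, y_2 = 1 - t
g = G[0] * ts + G[1] * (1.0 - ts)
obj = 0.5 * g ** 2 + a[0] * ts + a[1] * (1.0 - ts)
t_hat = ts[np.argmin(obj)]
g_hat = G[0] * t_hat + G[1] * (1.0 - t_hat)

x_new = x - g_hat          # primal step with H = I (assumed identification)
```

Here the optimal weights are roughly (0.625, 0.375), ĝ ≈ 0.5, and the step x - ĝ lands at the minimax solution x = 0, which is the behavior a good quadratic approximation should exhibit near the solution.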
Another useful interpretation of problem (18a) results from its relation to the distance w. Observe that the norm used in (16) might be arbitrary and, after a slight redefinition of L_{y_i}, the following specific expression for w can be used:

    [display (19) illegible in the scan]

However, this coincides precisely with the minimized function in (18a) and can be interpreted as follows: given a point (y,x) and the set A, w or (w)² can be determined from (19). By solving (18a) in y, new ŷ, x̃, x̃_0 and (ŵ)²
are found. Clearly, (w)² ≥ (ŵ)². On the other hand, (ŵ)² can also be interpreted as an upper bound for a new (w)², obtained after x is changed to x + x̃ and y is changed to ŷ (here ŷ does not denote the optimal Lagrange multiplier for the original problem, but only for its approximation (18b))--see Section 4. Another interpretation of x̃_0 and (ŵ)² is that both approximate the gain f(x) - f(x + x̃) of the objective function f: -x̃_0 is a linear approximation of this gain and (ŵ)² a quadratic one. Clearly, the linear approximation is more optimistic than the quadratic one, but, because of convexity, the linear approximation can also give an estimation of the distance f(x) - f(x̂) from above, thus being more useful for some algorithmic purposes; moreover, -x̃_0 also gives an estimation from above for the new (w)², obtained after changing x to x + x̃ and y to ŷ. All these properties are discussed in more detail in Section 4; the above discussion only justifies the role of quadratic approximations in nondifferentiable optimization.
3.3 Sufficient and Necessary Conditions of Optimality

The basic necessary condition of optimality for nondifferentiable problems,

    0 ∈ ∂f(x̂) ,   (20)

is known to hold under various assumptions related to the definition of the subdifferential ∂f(x)--see Clarke (1975), Mifflin (1977), Nurminski (1973). In particular, if the functions f_i in the problem (1) are continuously differentiable, then the function f is weakly convex--see Nurminski (1973)--and
the necessary condition (20) holds with ∂f(x̂) defined by (2). If the functions f_i are convex, then the condition (20) is also sufficient. However, even if the functions f_i are convex, the condition (20) does not necessarily specify x̂ uniquely. To obtain uniqueness, we must either assume strong convexity of the f_i, or use the Haar condition. The Haar condition is sufficient for a unique solution of (20) even in the nonconvex case, but the requirements of the Haar condition are rather strong. In order to weaken these requirements, second-order approximations for nondifferentiable problems might be used.
The next three subsections present a more detailed discussion of the above-mentioned conditions. First, a geometric interpretation of the first-order conditions--the condition (20) and the Haar condition--is given, with an indication of possible generalizations of these conditions. Second, a second-order uniqueness condition for convex problems is derived in a natural way from the normal Lagrangian function. Third, second-order sufficient and necessary conditions for the optimality of a solution x̂ of the problem (1) in the nonconvex case are derived from augmented Lagrangian functions.
3.4 First-Order Conditions

The condition (20) can be interpreted geometrically if we consider vectors h_i ∈ R^{n+1} of the form:

    h_i = (-1, f_ix(x)) for i ∈ A(x) ;   ĥ_i = (-1, f_ix(x̂)) for i ∈ A(x̂) .   (21a,b)

We shall also use the following notation:

    e_0 = (-1,0) ∈ R^{n+1} ,
where 0 denotes the zero element in R^n. Taking into account (2), it immediately follows that the necessary condition (20) can be written equivalently as

    e_0 ∈ -K̂* = cone{ĥ_i : i ∈ A(x̂)} .   (21c)

The condition (21c) can also be used in infinite-dimensional spaces. In fact, if f : E → R¹ is a subdifferentiable function defined on a linear topological space E, and its subdifferential ∂f(x̂) ⊂ E* is defined, where E* is the dual space to E, then we can define -K̂* = {h ∈ R¹ × E* : h = a(-1,g), g ∈ ∂f(x̂), a ≥ 0}, and e_0 ∈ -K̂* is equivalent to 0 ∈ ∂f(x̂). Note also that the polar cone to -K̂*, defined by K̂ = {k ∈ R¹ × E : <h,k> ≤ 0 for all h ∈ -K̂*}, is a conical approximation of the epigraph of the function f at x̂. The particular sense of this approximation depends, clearly, on the type of definition of the subdifferential we use; however, it is not the goal of this paper to pursue possible generalizations in detail (see Wierzbicki, 1972, for a discussion of similar ideas in nondifferentiable dynamic optimization). If E = R^n and -K̂* is given by (21c), then K̂ is the tangent cone to the epigraph of f(x) = max_{i∈I} f_i(x) at x̂. The geometric interpretation of the first-order necessary condition (21c) is given in Figure 1: any element of the tangent cone must have a nonpositive scalar product with the downward-pointing vector e_0.
An interpretation and generalization of the first-order Haar sufficient condition is given in the following lemma.

Lemma 2. The Haar sufficient condition for a point x̂ satisfying (20) to be a unique (local in the nonconvex case) solution of the problem (1)--usually formulated as the requirement that the matrices F_Ā have maximal rank for all subsets Ā of A(x̂), where F_Ā is a matrix composed of the vectors f_ix(x̂) for i ∈ Ā--can be equivalently
Figure 1. A geometric interpretation of the first-order necessary condition (21c).
stated in the following form:

    e_0 ∈ -K̃* = {h ∈ -K̂* : <h,k> < 0 for all k ∈ K̂ , k ≠ 0} .   (22)

Before proving the lemma, observe that -K̃* is defined as the quasi-interior of the cone -K̂*, polar to the conical approximation K̂ of the epigraph of f. This suggests that (22) might be used as a generalization of the Haar condition in a linear topological space. However, this conjecture is not proven here; we limit the lemma to the equivalence of (22) and the Haar condition in R^n.
Observe that in such a case the quasi-interior is simply the interior of -K̂*.

Proof. Note that the Haar condition implies necessarily that there are at least n + 1 elements of A(x̂). Suppose that there are fewer, no more than n. According to (20) there exist nonzero ŷ_i such that Σ_{i∈A(x̂)} ŷ_i f_ix(x̂) = 0, and thus all the vectors f_ix(x̂) for i ∈ A(x̂) are linearly dependent; but if no more than n vectors of n elements each are linearly dependent, we cannot form matrices of full rank from all collections of these vectors. Thus, there must be at least n + 1 vectors f_ix(x̂) for i ∈ A(x̂), and each collection of, say, n of them must be linearly independent. This implies in turn that at least n + 1 baricentric coordinates ŷ_i must be positive, since, if |A(x̂)| = n + 1, then for each j we have ŷ_j f_jx(x̂) = -Σ_{i≠j} ŷ_i f_ix(x̂) ≠ 0 and ŷ_i ≥ 0, which implies ŷ_j > 0. But if at least n + 1 baricentric coordinates are positive, then e_0 = (-1,0) is in the interior of the cone -K̂* = cone{(-1, f_ix(x̂))}. Conversely, if e_0 is in the interior of -K̂*, then at least n + 1 of the vectors f_ix(x̂) sum up to zero with coefficients greater than zero, and each collection of n or fewer of these vectors is linearly independent.
The geometrical interpretation of this reformulated Haar condition is also given in Figure 1. However, we see from Lemma 2 that the Haar condition is very restrictive: at least n + 1 functions f_i(x) must be active at x̂ and, on the basis of the implicit function theorem, we can determine x̂ uniquely from the set of n equations f_i(x) = f_j(x), for any fixed j ∈ A(x̂) and for n chosen i ∈ A(x̂), i ≠ j. However, there are often cases when |A(x̂)| ≤ n; then the cone -K̂* does not have an interior, the Haar condition cannot be satisfied, and x̂ must be determined on the basis of additional information--this time using second-order conditions.
3.5 Second-Order Uniqueness Condition for the Convex Case

If all the functions f_i and thus the function f are convex, then the strong local convexity of the function f at x̂ suffices for the uniqueness of x̂. However, the function f is not differentiable and we cannot use the classical definition of strong convexity via second-order derivatives. On the other hand, we could use the Lagrangian function (11) for the equivalent problem (9) to characterize the strong convexity of the problem (1).
Lemma 3. If, at a solution x̂ of the problem (1) satisfying (20) with ∂f(x̂) defined by (2), the following matrix

    L_xx(ŷ,x̂_0,x̂) = Σ_{i∈A(x̂)} ŷ_i f_ixx(x̂)   (23)

is positive definite, where the ŷ_i are defined (not necessarily uniquely) as the baricentric coordinates of 0 in ∂f(x̂), then x̂ is a locally unique solution of (1). If, additionally, the functions f_i are convex, then x̂ is the globally unique solution. Moreover, if the vectors ĥ_i = (-1, f_ix(x̂)) for all i ∈ A(x̂) are linearly independent, then the baricentric coordinates ŷ_i are also defined uniquely.
Proof. It is known that if a Lagrangian function has a local or global minimum in the primal variables while the dual variables satisfy the Kuhn-Tucker conditions, then this minimum is also a local or global solution to the primal problem. But the positive definiteness of L_xx(ŷ,x̂_0,x̂) suffices for a local or, in the convex case, global minimum of the Lagrangian function (11) in x. In x_0, the condition Σ_{i∈A(x̂)} ŷ_i = 1 guarantees a (nonunique) minimum; however, x_0 is only an auxiliary variable. As for the Kuhn-Tucker conditions, they are satisfied by the definition of A(x̂), the definition of the baricentric coordinates ŷ_i and by the condition (23), since ŷ_i = 0 for i ∉ A(x̂), ŷ_i ≥ 0 for i ∈ A(x̂), Σ_{i∈A(x̂)} ŷ_i (f_i(x̂) - x̂_0) = Σ_{i∈I} ŷ_i (f_i(x̂) - x̂_0) = 0, and L_x(ŷ,x̂_0,x̂) = Σ_{i∈A(x̂)} ŷ_i f_ix(x̂) = 0, L_x0 = 1 - Σ_{i∈A(x̂)} ŷ_i = 0. The uniqueness of the baricentric coordinates ŷ_i if the ĥ_i are linearly independent has been established in the proof of Lemma 1.
Observe that the requirement that the vectors ĥ_i = (-1, f_ix(x̂)) be linearly independent is much less restrictive than the Haar condition; this requirement can also be satisfied if |A(x̂)| < n + 1. On the other hand, the requirement that L_xx(ŷ,x̂_0,x̂) be positive definite is rather restrictive and not really necessary: it would suffice for this Hessian matrix to be positive definite only in the subspace tangent to the strongly active constraints. This, however, is related to the more general form of sufficient conditions for nonconvex problems.
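Lemma 3 can be checked numerically on an assumed two-dimensional example (f1 = (x1-1)^2 + x2^2, f2 = (x1+1)^2 + x2^2, with solution x̂ = 0 and ŷ = (0.5, 0.5); all data are illustrative):

```python
import numpy as np

# At x_hat = (0,0) both functions are active, 0 = 0.5*grad_f1 + 0.5*grad_f2,
# and L_xx = sum_i y_i * Hess f_i should be positive definite, certifying
# local (here global, by convexity) uniqueness of the solution.
y_hat = np.array([0.5, 0.5])
grads = [np.array([-2.0, 0.0]), np.array([2.0, 0.0])]   # gradients at x_hat
hessians = [2.0 * np.eye(2), 2.0 * np.eye(2)]           # Hessians of f1, f2

g = y_hat[0] * grads[0] + y_hat[1] * grads[1]           # must vanish
L_xx = y_hat[0] * hessians[0] + y_hat[1] * hessians[1]
eigmin = np.linalg.eigvalsh(L_xx).min()                 # smallest eigenvalue
```

Note that here |A(x̂)| = 2 < n + 1 = 3, so the Haar condition fails, yet the second-order condition still yields uniqueness, illustrating the remark above.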
3.6 Second-Order Sufficient and Necessary Conditions for the Nonconvex Case

If the functions f_i are not even locally convex, then the normal Lagrangian function (11) for the equivalent problem (9) might have no saddle-points, but only an inflexion point in x at a solution x̂ of the problems (1), (9). However, an augmented Lagrangian function might have a saddle-point.

An augmented Lagrangian function for the problem (9) has the form

    Λ(y,ρ,x_0,x) = x_0 + (ρ/2) Σ_{i∈I} [ ((f_i(x) - x_0 + y_i/ρ)_+)² - (y_i/ρ)² ] ,   (24)
where the operation (·)_+ is defined by (z)_+ = max(0,z) and ρ ≥ 0 is a penalty coefficient; it can be shown that with ρ = 0 the function (24) reduces to the normal Lagrangian function (11). The sufficient conditions for an augmented Lagrangian function of type (24) to have a saddle-point, resulting in the optimality of x̂, were given by Rockafellar (1974, 1976). However, for the purpose of this paper, these results must be slightly modified, since the augmented Lagrangian (24) need not necessarily have a minimum in the auxiliary variable x_0.
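The displayed formula for (24) is lost in the scan, but the partial derivatives quoted in the proofs of Lemmas 4 and 5 pin down the standard Rockafellar-type form Λ(y,ρ,x_0,x) = x_0 + (ρ/2) Σ_i [((f_i(x) - x_0 + y_i/ρ)_+)² - (y_i/ρ)²]; the sketch below evaluates that reconstructed form on an assumed example (f1, f2 and all numbers are illustrative).

```python
# Augmented Lagrangian of type (24), reconstructed from the derivatives
# used in the proofs; as rho -> 0 it tends to the normal Lagrangian (11).
fs = [lambda x: (x - 1.0) ** 2, lambda x: (x + 1.0) ** 2]

def aug_lagrangian(y, rho, x0, x):
    pos = lambda z: max(0.0, z)
    return x0 + 0.5 * rho * sum(
        pos(fi(x) - x0 + yi / rho) ** 2 - (yi / rho) ** 2
        for fi, yi in zip(fs, y))

# At the saddle-point data y_hat=(0.5,0.5), x_hat=0, x0_hat=f(x_hat)=1,
# the y-stationarity test (f_i - x0 + y_i/rho)_+ = y_i/rho from the proof
# of Lemma 4 holds, for an arbitrary choice rho = 2:
rho, x0_hat, x_hat, y_hat = 2.0, 1.0, 0.0, (0.5, 0.5)
residuals = [max(0.0, fi(x_hat) - x0_hat + yi / rho) - yi / rho
             for fi, yi in zip(fs, y_hat)]
```

At this point the augmented terms cancel and Λ equals the optimal value x̂_0 = 1, as a saddle-point value should.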
Lemma 4. If x̂ = arg min_{x∈X} Λ(ŷ,ρ,x̂_0,x) with x̂_0 = max_{i∈I} f_i(x̂), ρ > 0, and ŷ = arg max_{y∈R^{|I|}} Λ(y,ρ,x̂_0,x̂), then x̂ is an optimal solution of the problems (1), (9), that is, max_{i∈I} f_i(x) ≥ x̂_0 for all x ∈ X.
Proof. Since Λ is a differentiable function of its arguments, the unrestricted maximization in y implies that (f_i(x̂) - x̂_0 + ŷ_i/ρ)_+ = ŷ_i/ρ for all i ∈ I, which can happen if and only if (see also Wierzbicki and Kurcyusz (1977) for possible generalizations of this equivalence) ŷ_i ≥ 0, f_i(x̂) = x̂_0 for i ∈ A(x̂) and ŷ_i = 0, f_i(x̂) < x̂_0 for i ∉ A(x̂). Thus, the ŷ_i are Lagrange multipliers for the problem (9). If we require, additionally, that Λ_x0(ŷ,ρ,x̂_0,x̂) = 1 - ρ Σ_{i∈I} (f_i(x̂) - x̂_0 + ŷ_i/ρ)_+ = 1 - Σ_{i∈A(x̂)} ŷ_i = 0, then the ŷ_i can also be interpreted as baricentric coordinates; however, we do not need to state in the lemma that x̂_0 minimizes Λ. Since we assume that x̂ does minimize Λ, we obtain after obvious transformations:

    [display (25a) illegible in the scan]

Suppose now that the thesis of the lemma is not true and there exists an x ∈ X such that f_i(x) < x̂_0 for all i ∈ I. However, this would imply that
    (f_i(x) - x̂_0 + ŷ_i/ρ)_+ < ŷ_i/ρ for all i ∈ I ,

which would contradict (25a). Thus the thesis is true.
Actually, Lemma 4 can be proven under much more general assumptions. Without changing the proof, X might be any set--say, in a linear topological space. The countability and finiteness of I is also not essential: I might be generalized to represent, for example, any subset of a Hilbert space, as in Wierzbicki and Kurcyusz (1977).
The way in which Lemma 4 is proven suggests the following equivalent statement of the first-order necessary condition (20) for the optimality of solutions of problems (1), (9):

Lemma 5. If x̂ is an optimal solution of problems (1), (9), then there exist ŷ_i ≥ 0, ŷ_i = 0 for i ∉ A(x̂) = {i ∈ I : f_i(x̂) = f(x̂) = max_{j∈I} f_j(x̂)}, such that ŷ and x̂ are stationary points of Λ(y,ρ,x̂_0,x) with x̂_0 = max_{i∈I} f_i(x̂) and an arbitrary ρ > 0. Moreover, ŷ is a global maximal point of Λ. If, additionally, Σ_{i∈A(x̂)} ŷ_i = 1, then x̂_0 is a global minimal point of Λ.
Proof. If x̂ is an optimal solution of (1), (9), then it satisfies (20) with ∂f(x̂) defined by (2). However, the scaling of the ŷ_i might be arbitrarily changed in (20), since ρ is an arbitrary positive number; hence we need not require that Σ_{i∈A(x̂)} ŷ_i = 1. It was shown in the proof of Lemma 4 that ŷ_i ≥ 0, f_i(x̂) = x̂_0 for i ∈ A(x̂) and ŷ_i = 0, f_i(x̂) < x̂_0 for i ∉ A(x̂) are equivalent to a stationary point of Λ in y; it is also a globally maximal point, since Λ is a concave function of y--see Wierzbicki and Kurcyusz (1977). The condition 0 ∈ ∂f(x̂), or Σ_{i∈A(x̂)} ŷ_i f_ix(x̂) = 0, is equivalent to the statement that x̂ is a stationary point of Λ, since Λ_x(ŷ,ρ,x̂_0,x̂) = ρ Σ_{i∈I} (f_i(x̂) - x̂_0 + ŷ_i/ρ)_+ f_ix(x̂) = Σ_{i∈A(x̂)} ŷ_i f_ix(x̂) = 0. If Σ_{i∈A(x̂)} ŷ_i = 1, then also Λ_x0(ŷ,ρ,x̂_0,x̂) = 0; since Λ is convex in x_0, this implies that x̂_0 is a global minimal point.
The conclusions of Lemmas 4 and 5 might be summarized as follows: although an optimal solution x̂ of problems (1), (9) is a stationary point of the augmented Lagrangian function (24), the corresponding multipliers ŷ_i maximize this function (without any constraints--the nonnegativeness of the ŷ_i results from this maximization) and, if the ŷ_i are interpreted as baricentric coordinates through appropriate scaling, x̂_0 also minimizes this function; we are not generally certain that x̂ minimizes this function. If it does, this is also a sufficient condition for optimality.
From Lemma 4, a sufficient condition for a local solution of a nonconvex problem of the type (1) can be derived:

Theorem 1. Suppose that at a given x̂ ∈ int X there exist ŷ_i ≥ 0, ŷ_i = 0 for i ∉ A(x̂), such that Σ_{i∈A(x̂)} ŷ_i f_ix(x̂) = 0, where A(x̂) = {i ∈ I : f_i(x̂) = f(x̂) = max_{j∈I} f_j(x̂)}. If, for some ρ > 0, the following matrix is positive definite:

    [display (26) illegible in the scan]

where S(x̂) = {i ∈ A(x̂) : ŷ_i > 0} and f_ix(x̂) denotes the gradient of f_i in column form, then x̂ is a local solution of problems (1), (9). With x̂_0 = max_{i∈I} f_i(x̂), if Σ_{i∈A(x̂)} ŷ_i = 1, then ŷ and (x̂_0,x̂) correspond to a saddle-point of the function (24).
Proof. Consider the function

    [display (27a) illegible in the scan]
It has been proven in Wierzbicki (1978) that this function is a local quadratic approximation from below of the function (24), that is, there exists a neighborhood U(ŷ,x̂_0,x̂) of (ŷ,x̂_0,x̂) such that

    Λ^S(y,ρ,x_0,x) ≤ Λ(y,ρ,x_0,x) for all (y,x_0,x) ∈ U(ŷ,x̂_0,x̂) ,   (27b)

where the inequality can be replaced by equality if A(x̂) = S(x̂). Here ŷ, x̂_0, x̂ denote a stationary point of the function (24). If we choose x̂_0 = max_{i∈I} f_i(x̂) then, under the assumptions of the theorem, ŷ and x̂ indeed correspond to a stationary point, while ŷ is a global maximum point--see the proof of Lemma 5. However, we do not necessarily require that the ŷ_i be scaled to Σ_{i∈A(x̂)} ŷ_i = 1, and x̂_0 is not necessarily a stationary point: due to the particular form of (24), (27a), the inequality (27b) also holds for an arbitrarily fixed x̂_0.

While evaluating the Hessian matrix of Λ^S in x at ŷ, ρ, x̂_0, x̂, we obtain Λ^S_xx as given in (26). If it is positive definite, then Λ^S has a unique local minimum in x at x̂, with ŷ, ρ, x̂_0 fixed (it is easy to check that x̂ is also a stationary point of Λ^S). With fixed ŷ, ρ, x̂_0 the inequality (27b) implies that the function Λ also has a unique local minimum in x at x̂. Hence, Lemma 4 can be applied with X replaced by a neighborhood of x̂, and we conclude that x̂ is a locally optimal solution of problems (1), (9). Observe that the uniqueness of the minimum of Λ in x with fixed y = ŷ does not necessarily imply the uniqueness of x̂ as a solution of problems (1), (9). If we scale the ŷ_i to Σ_{i∈A(x̂)} ŷ_i = 1, then Lemma 5 implies that ŷ and (x̂_0,x̂) represent a saddle-point of Λ. Even in this case, the ŷ_i might be nonunique if S(x̂) ≠ A(x̂), or if the vectors ĥ_i = (-1, f_ix(x̂)) for i ∈ A(x̂) are linearly dependent.
Theorem 1 is in fact only a slight modification and adaptation to the particular problem (1) of the results given by Rockafellar (1974, 1976); similarly, his results on the second-order necessary conditions of optimality can be adapted to obtain:

Theorem 2. If x̂ is an optimal solution for problems (1), (9) satisfying (20) with ∂f(x̂) given by (2), then, for any ρ > 0, the following matrix

    [display (28) illegible in the scan]

is positive semidefinite.

Proof. Following Rockafellar (1974, 1976) we conclude that for the optimality of x̂ it is necessary that the following Hessian matrix

    [display (29) illegible in the scan]

be positive semidefinite, where x̃ = (x̃_0,x̃) and ĥ_i = (-1, f_ix(x̂)). Now compute the quadratic form <(x̃_0,x̃), Λ̂_x̃x̃(x̃_0,x̃)> to obtain
    [display illegible in the scan]

This expression must be nonnegative for all (x̃_0,x̃), in particular if x̃_0 = 0; but if x̃_0 = 0, then

    [display illegible in the scan]

Thus, the positive semidefiniteness of Λ̂_xx is necessary.

Theorem 2 also has the following interpretation. As shown in Wierzbicki (1978), the function

    [display (30a) illegible in the scan]

is an upper bound for the function (24) in a neighborhood U(ŷ,x̂_0,x̂) of (ŷ,x̂_0,x̂):

    Λ^A(y,ρ,x_0,x) ≥ Λ(y,ρ,x_0,x) for all (y,x_0,x) ∈ U(ŷ,x̂_0,x̂) .   (30b)

Thus, the upper-bound function Λ^A must satisfy the second-order necessary condition for a minimum in x at x̂ if x̂ is optimal.
4. A QUASI-NEWTON METHOD FOR NONDIFFERENTIABLE OPTIMIZATION WITH EXPLICIT SUBDIFFERENTIALS
The possibility of constructing quadratic approximations
to Lagrangian functions for the problems (1), (9), in both the
convex and nonconvex cases, justifies the analysis of quasi-
Newton methods for nondifferentiable optimization. A specific
algorithm of this class, based on a corresponding algorithm for
differentiable optimization described in Wierzbicki (1978), is
presented in this section. For the sake of clarity, only the
convex case is described in detail, although subsection 4.7
implies the possibility of extending this algorithm to the non-
convex case.
4.1 Estimation of the Distance from the Optimal Solution
Most algorithms of nonlinear or nondifferentiable optimization
produce a sequence {x^k, y^k} of approximations of the primal
and dual solutions (x̂, ŷ). Some measure of the distance of (x^k, y^k)
from (x̂, ŷ) is explicitly or implicitly used. Here, we shall use
the variable w^k = ||(L^k_x, L^k_y)||, with L^k_x = Σ_{i∈A^k} y^k_i f_ix(x^k) and
L^k_y = {y^k_i (f_i(x^k) − f_0(x^k))}_{i∈A^k}, f_0(x^k) = max_{i∈I} f_i(x^k)--see equations (16a,
b)--for this purpose. This is justified by the following lemma.
Lemma 6 (Wierzbicki, 1978). Suppose x̂ is an optimal solution of
problem (9) with convex functions f_i. Let x̂ ∈ int X, and let ŷ
be the corresponding vector of Lagrange multipliers, with
Σ_{i∈I} ŷ_i = 1. Suppose that the vectors λ_i = (-1, f_ix(x̂)) are linear-
ly independent for i ∈ A(x̂)--hence ŷ is unique--and let the
matrix L̂_xx = Σ_{i∈S(ŷ)} ŷ_i f_ixx(x̂) be positive definite--hence x̂ is
unique. Then there exists a neighborhood U(ŷ, x̂) and a constant
ζ > 0 such that

||(y^k − ŷ, x^k − x̂)|| ≤ ζ w^k  for all (y^k, x^k) ∈ U(ŷ, x̂)

where w^k is defined as in the preceding paragraph, with an
arbitrarily chosen norm, and with A^k ⊇ A(x^k).
In particular, we can use the norm given in equation (19b)
to obtain w^k corresponding to the minimal function in (18a).
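The measure w^k described above can be sketched numerically (a minimal sketch with hypothetical data and names; the index-set handling is simplified to the nearly active pieces):

```python
import numpy as np

# Sketch of the measure w^k of Section 4.1:
# L_x = sum_{i in A} y_i * f_ix(x),  L_y = { y_i * (f_i(x) - f0(x)) }_{i in A}
def optimality_measure(f_vals, grads, y):
    f0 = f_vals.max()                      # f0(x) = max_i f_i(x)
    L_x = grads.T @ y                      # weighted combination of gradients
    L_y = y * (f_vals - f0)                # complementarity-type residuals
    return np.linalg.norm(np.concatenate([L_x, L_y]))

f_vals = np.array([1.0, 0.999])            # two nearly active pieces
grads = np.array([[1.0, 0.0], [-1.0, 0.0]])
y = np.array([0.5, 0.5])
w = optimality_measure(f_vals, grads, y)   # small near a minimizing point
print(w < 1e-3)  # True: gradients cancel and activity gaps are tiny
```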
4.2 Approximations of Activity Sets
Consider now the situation when (y^k, x^k) are given elements
of a sequence {(y^k, x^k)}_{k=0}^∞. Denote by the upper index k all values
of functions evaluated at (y^k, x^k), with f_0(x^k) = x_0^k, etc. Denote
by A^k the approximation of the set A(x̂) as evaluated at (y^k, x^k)
and by S^k the approximation of the set S(ŷ). If (y^k, x^k) converges
to (ŷ, x̂) and w^k converges to zero, then the following formulae
for A^k, S^k can be used:

where β > 0 is a chosen constant, depending on the scaling of
the problem (clearly y^k_i ∈ [0, 1], but f_0(x^k) = x_0^k can have arbi-
trary scaling), and where η^k_f > 0, η^k_y > 0 converge to
zero but more slowly than w^k. For example, formulae of the
following type may be used:

η^k_f = ζ_f (w^{k-1})^{1/2} ;  η^k_y = min(0.01, ζ_y (w^{k-1})^{1/2})   (32c)

where ζ_f, ζ_y are chosen constants; again, the best choice of
these constants depends on the scaling of the problem, that is,
on Lipschitz constants for the functions f_i, or on the norms of the
gradients f_ix. But the assumption that η^k_f, η^k_y converge to zero
more slowly than w^k implies the desired result S^k = S(ŷ), A^k =
A(x̂) for sufficiently small w^k even if the Lipschitz constants
are not known explicitly. This follows from the following lemma:
Lemma 7 (Wierzbicki, 1978). Suppose x̂ is a unique solution of
problem (9) and ŷ the corresponding unique Lagrange multiplier.
Let the sets A(x̂), S(ŷ) be defined by (10), (14) and A^k, S^k by
(32a, b) with lim_{k→∞} (w^k/η^k_f) = lim_{k→∞} (w^k/η^k_y) = 0, where w^k is
defined by (16a, b). Then there exists a number w̄ > 0 such that
A^k = A(x̂), S^k = S(ŷ) for all (y^k, x^k) ∈ U(ŷ, x̂) = {(y^k, x^k) : w^k < w̄}.
However, the above results are valid independently of the
norm used when defining w^k. If the norm (19b) is used and
approximates w^k from above, a more useful expression than (32c)
can be obtained. Suppose the range of f, denoted by R_f, can be
estimated. Then, after some heuristic reasoning--assuming that
the initial |x_0^1| ≈ R_f, η^0_f = 10^{-1} R_f and η^0_y = 10^{-1}, and expecting
the final accuracy to be related to |x_0^k| of the order 10^{-7} R_f--
the following expressions:

satisfy the assumptions and result in η^k_f = 10^{-4} R_f, η^k_y
= 10^{-4} if |x_0^k| = 10^{-7} R_f. This means that a function f_i such that
f_0(x^k) − f_i(x^k) ≤ 10^{-4} R_f might still be included in the probably
active set A^k and a Lagrange multiplier with y^k_i < 10^{-4} might be
excluded from the strongly active set S^k. However, this can be
considered as an acceptable risk--particularly since it will be
shown later that the exact estimation of activity (33) does not
influence the simple convergence of algorithms and is needed
only when establishing superlinear or quadratic convergence.
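The thresholded activity estimates described above can be sketched as follows (a minimal sketch of formulae of type (32a, b); all names and constants are hypothetical, and the thresholds decay like the square root of the previous w^k so that they go to zero more slowly than w^k):

```python
import numpy as np

# Sketch of activity estimates: A^k collects nearly active constraints,
# S^k collects constraints with non-negligible multipliers.
def activity_sets(f_vals, y, eta_f, eta_y):
    f0 = f_vals.max()
    A = np.where(f0 - f_vals <= eta_f)[0]   # probably active constraints
    S = np.where(y >= eta_y)[0]             # strongly active: large multipliers
    return A, S

w_prev = 1e-6                               # previous value of w^k
eta_f = np.sqrt(w_prev)                     # decays more slowly than w^k
eta_y = min(0.01, np.sqrt(w_prev))

f_vals = np.array([1.0, 1.0 - 5e-4, 0.5])
y = np.array([0.6, 0.4, 0.0])
A, S = activity_sets(f_vals, y, eta_f, eta_y)
print(A, S)  # [0 1] [0 1]
```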
4.3 A Quadratic Approximation Algorithm for Nondifferentiable Optimization With Explicit Subdifferentials
The algorithm minimizes a function f(x) = max_{i∈I} f_i(x) for
x ∈ R^n, where a minimal point x̂ is supposed to exist (a modifica-
tion for the case x ∈ X, where X is a compact convex set, is pos-
sible but not described here). The functions f_i are assumed to
be convex and twice differentiable. It is also assumed that the
values f(x^k) = f^k, f_i(x^k) = f^k_i, f_ix(x^k) = f^k_ix can be computed
for i in any subset of I. The algorithm is based on quadratic
approximations (18a, b) to the Lagrangian function (11). Sub-
routines for a variable metric approximation of the Hessian
matrix of this function (discussed in Section 4.5) and for a
directional search (described, for example, in Appendix 1) are
assumed to be available.
Step 0. Choose parameters: x^1 - initial guess of the solution,
supplied by the user; R_f - estimated range of the function
values, supplied by the user; ε_f - final accuracy of function
values, supplied by the user (suggested ε_f = 10^{-7} R_f); γ ∈ (0; 1)
- desired rate of convergence of gradient values (suggested
γ = 0.1); m_a ∈ (0; 0.5), m_b ∈ (0.5; 1) - linear search parameters
(suggested m_a = 0.3, m_b = 0.7); H^1 - initial approximation of the
Hessian (suggested H^1 = I). Set x_0^1 = R_f, y^1_i = 1/|I| for i ∈ I, k = 1.
Step 1. Compute η^k_f, η^k_y from (32). Compute f^k and f^k_i for
i ∈ I and determine the sets A^k and S^k (32a, b), saving only f^k_i
for i ∈ A^k. Compute f^k_ix and a^k_i = f^k − f^k_i for i ∈ A^k. Set y^k_i =
0 for i ∉ A^k, rescale proportionally the remaining y^k_i to obtain
Σ_{i∈A^k} y^k_i = 1. Compute w^k (19b). If (w^k)^2 ≤ |x̃_0^{k-1}| ≤ ε_f, stop.
If k > 1, update H^k.

Step 2. Solve the problem (18a) to obtain ỹ^k; compute
x̃^k, x̃_0^k from Lemma 1 and w̃^k from (19c).
Step 3. Set τ^k = 1, x̄^k = x^k + x̃^k. If the condition |x̃_0^k| ≤ γ|x̃_0^{k-1}| ≤ γR_f
and w^k ≤ γw^{k-1} is not satisfied, compute f̄^k = f(x̄^k). If either
|x̃_0^k| ≤ γ|x̃_0^{k-1}| ≤ γR_f and w^k ≤ γw^{k-1}, or f̄^k satisfies condition (34a)
with τ^k = 1, set x^{k+1} = x̄^k, y^{k+1} = ỹ^k,
k := k + 1, and go to Step 1.
Step 4. Perform a linear search for τ^k such that:

(or any other τ^k' resulting in f(x^k + τ^k' x̃^k) ≤ f(x^k + τ^k x̃^k), where
τ^k satisfies (34b); see Appendix 1). Set x^{k+1} = x^k + τ^k x̃^k,
y^{k+1} = y^k + τ^k (ỹ^k − y^k), k := k + 1, and go to Step 1.
Comments. Observe that all f^k_i for i ∈ I must be evaluated
when computing f^k. It is best to combine this with the deter-
mination of the sets A^k, S^k, saving only f^k_i for i ∈ A^k. But it is
not known whether τ^k = 1 will be accepted when checking condi-
tion (34a). Therefore, if |x̃_0^k| is already small enough and
decreases, and the desired convergence rate γ for w^k is attained,
τ^k = 1 is accepted without checking. In fact, w^k is computed
only for this purpose--and to double-check the stopping test.
Other redundant information, such as the sets S^k, the values a^k_i,
w̃^k, or even the rescaled values y^k_i, need not be computed if the
computation of w^k were deemed unnecessary. However, this in-
formation is valuable in analyzing the algorithm and in possible
debugging.
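To illustrate the overall structure of an iteration, the sketch below minimizes a max of two convex quadratics by repeatedly solving the dual of a subproblem of type (18a) by projected gradient over the simplex. This is only a sketch under strong simplifications (full steps τ^k = 1, H^k fixed to the exact weighted Hessian, no activity thresholds, no stopping test); all function names are hypothetical and this is not the author's implementation:

```python
import numpy as np

def project_simplex(v):
    # Euclidean projection onto the unit simplex {y : y >= 0, sum(y) = 1}
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, v.size + 1)
    rho = idx[u - css / idx > 0][-1]
    return np.maximum(v - css[rho - 1] / rho, 0.0)

def subproblem(G, a, Hinv, iters=200):
    # Dual of a subproblem of type (18a):
    #   minimize over the simplex  0.5 <G'y, Hinv G'y> + <a, y>
    # solved by projected gradient; the primal step is d = -Hinv G'y.
    Q = G @ Hinv @ G.T
    step = 1.0 / (np.linalg.norm(Q) + 1.0)   # safe step size
    y = np.full(len(a), 1.0 / len(a))
    for _ in range(iters):
        y = project_simplex(y - step * (Q @ y + a))
    return y, -Hinv @ (G.T @ y)

def minimize_max(fs, grads, hess, x, iters=20):
    for _ in range(iters):
        f_vals = np.array([f(x) for f in fs])
        a = f_vals.max() - f_vals            # activity gaps a_i
        G = np.array([g(x) for g in grads])  # rows: gradients of the pieces
        y, d = subproblem(G, a, np.linalg.inv(hess(x)))
        x = x + d                            # full step tau^k = 1
    return x

# Example: f(x) = max(|x|^2, |x - e1|^2); the minimizer is (0.5, 0)
fs = [lambda x: x[0]**2 + x[1]**2,
      lambda x: (x[0] - 1.0)**2 + x[1]**2]
grads = [lambda x: np.array([2.0*x[0], 2.0*x[1]]),
         lambda x: np.array([2.0*(x[0] - 1.0), 2.0*x[1]])]
hess = lambda x: 2.0 * np.eye(2)             # exact Hessian of each piece

x = minimize_max(fs, grads, hess, np.array([3.0, -2.0]))
print(np.allclose(x, [0.5, 0.0], atol=1e-2))  # True
```

Because both pieces here share the same Hessian, the quadratic subproblem is exact and the iteration reaches the minimizer in one full step; in general the line search of Appendix 1 would be needed.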
A full analysis of the simple convergence of the algorithm
is omitted here, since the proof of the following theorem can be
easily derived from results given in Lemaréchal (1978),
Szymanowski (1977), or Wierzbicki (1978). It is only necessary
to note that (w^k)^2 ≤ |x̃_0^{k-1}| will eventually be satisfied if |x̃_0^k|
converges to zero (see Section 4.4), and that w^k ≤ γw^{k-1} implies
convergence if |x̃_0^k| is small enough and decreases. The double-
check in Step 3 is also redundant, since the linear convergence
of |x̃_0^k| alone implies convergence of the algorithm in the con-
vex case; but the algorithm is also designed to be used in cases
of only local convexity.
Theorem 3. Suppose x̂ is the unique minimizing point of f(x) =
max_{i∈I} f_i(x), where the f_i are twice-differentiable functions, and
let the vectors λ_i = (-1, f_ix(x̂)) ∈ R^{n+1} be linearly independent
for i ∈ A(x̂) = {i ∈ I : f_i(x̂) = f(x̂)}. This implies that the
corresponding Lagrange multiplier vector ŷ, with ŷ_i ≥ 0 for i ∈ A(x̂),
ŷ_i = 0 for i ∉ A(x̂) and Σ_{i∈I} ŷ_i = 1, is also unique. Let
L̂_xx = Σ_{i∈S(ŷ)} ŷ_i f_ixx(x̂) be positive definite. Let U(x̂) be a
neighborhood of the point x̂ such that the (not necessarily convex)
function f has no generalized subdifferentials containing zero
other than at the point x = x̂; if f is convex, let U(x̂) = R^n. Let
the matrices H^k be uniformly positive definite. Then, for any
x^1 ∈ U(x̂), the sequence (y^k, x^k) generated by the above algorithm
with ε_f = 0 converges to the point (ŷ, x̂).
To prove the theorem, combine the results given, for example,
in Lemaréchal (1975) and Wierzbicki (1978).
4.4 Properties of Quadratic Approximations to Lagrange Functions

Two basic properties of quadratic approximation problems
(18a, b) are important for the superlinear or quadratic con-
vergence of the above algorithm:
Lemma 8. Let the assumptions of Theorem 3 and Lemma 3 hold.
Then there exists a neighborhood U(ŷ, x̂) of (ŷ, x̂) and a number
β > 0 such that, for any (y^k, x^k) ∈ U(ŷ, x̂), problems (18a), (18b)
have solutions x̃^k, ỹ^k = y^k + ŷ^k satisfying the following
inequality:

where w^k is defined as in (16a, b) with any norm, for example,
the norm (19b).

For a general proof of the lemma see, e.g., Wierzbicki
(1978); when using the norm (19b) for w^k the proof becomes quite
straightforward.
Lemma 9. Let the assumptions of Theorem 3 and Lemma 3 hold.
Suppose the solutions of problems (18a), (18b) define x^{k+1} =
x^k + x̃^k, y^{k+1} = y^k + ỹ^k. Then w^{k+1} defined as in (16a, b)
with any norm satisfies the following inequality at the point
(y^{k+1}, x^{k+1}):

where L^k_xx = Σ_{i∈A^k} y^k_i f_ixx(x^k), and o(z) denotes a function with the
property that lim_{||z||→0} o(z)/||z|| = 0.

For a general proof of the lemma see, e.g., Szymanowski
(1977) and Wierzbicki (1978); again, the proof can be simplified
by considering the particular norm (19b) for w^{k+1}.

Further conclusions can be drawn from a more detailed
analysis of Lemmas 1, 8, 9 using the specific norm (19b) for w^k.
For example, the general relation (36) can be transformed to:

which indicates that, for (y^k, x^k) in a neighborhood of (ŷ, x̂) and
for the norm of (H^k − L^k_xx)x̃^k small enough when compared to the
norm of H^k x̃^k, the inequality w^{k+1} ≤ γw^k holds. More generally,
Lemma 9 indicates that the norm of (H^k − L^k_xx)x̃^k is responsible
for the speed of convergence of quadratic approximation algorithms.
4.5 Properties of Variable Metric Approximations

A variable metric H^k should approximate the Hessian matrix
Since A(x) changes in every neighborhood of x̂, it is necessary
to define sets A^k with the property A^k = A(x̂) even if evaluated
at (y^k, x^k) in a neighborhood of (ŷ, x̂). If the following (matrix-
valued) function is defined:

then this function is continuous in (y^k, x^k) and L̃_xx(ŷ, x̂) = L̂_xx;
moreover, it can be shown that L̃^k_xx can be used in Lemma 9 in-
stead of L^k_xx. It is this matrix L̃^k_xx that can be approximated by
a variable metric technique.

A typical variable metric approximation of the (n × n)
matrix L̃^k_xx is based on a set of data {s^j, r^j}_{j=k-N+1}^{k-1} such that:

where o(·) is a function converging to zero faster than the norm
of its arguments. The number of data varies; clearly N ≥ n is
required for a sensible approximation. The data s^j, r^j related
to the function L̃_xx can be defined by

Observe that r^j differs from the difference L^j_x − L^{j-1}_x =
Σ_{i∈A^k} y^j_i f^j_ix − Σ_{i∈A^k} y^{j-1}_i f^{j-1}_ix; if this difference
were used instead of r^j, the requirements (38b) could not be
satisfied, since the difference between them converges to zero
only as fast as ỹ^j. The matrix H^k approximating L̃^k_xx is now con-
structed in a way that guarantees that:
under various additional assumptions. In the most widely used
rank-two variable metric procedures, an increasingly accurate
directional search resulting in almost conjugate subsequent
directions of search is needed to guarantee (39b). If a rank-
one variable metric procedure is used, relations (39a, b) are
independent of the step-size coefficients and of the choice of
directions; on the other hand, a rank-one variable metric approx-
imation H^k may not be positive definite even if the L̃^k_xx are positive
definite. However, there are special variants of the rank-one
variable metric that guarantee that H^k will be positive definite
(Kreglewski, 1977).

If N ≥ n and the data {s^j} span R^n, then it can be
shown (Kreglewski, 1977) that the relations (38b) and (39a, b)
together imply that

If s^j = x̃^j, then the estimate (40) together with (36) from
Lemma 9 results in the superlinear convergence of a quadratic
approximation method (see next section). Note, however, that
estimate (40) does not imply (although it is implied by)
lim_{k→∞} ||L̃^k_xx − H^k|| = 0; only rather special types of variable
metric procedures approximate L̃^k_xx in the norm. This is
why the quadratic convergence of a quasi-Newton method can be
obtained in practice only when H^k = L̃^k_xx is computed explicitly.
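The rank-one update mentioned above can be sketched with the classical SR1 formula, which enforces the secant requirement H s^j = r^j regardless of step sizes or search directions; with exact curvature data r^j = L s^j for directions spanning R^n, the update reproduces L. This is a sketch under those assumptions, not the positive-definite variant of Kreglewski (1977):

```python
import numpy as np

def sr1_update(H, s, r):
    # Rank-one (SR1) update enforcing the secant requirement H_new @ s = r
    v = r - H @ s
    denom = v @ s
    if abs(denom) <= 1e-12 * max(1.0, np.linalg.norm(v) * np.linalg.norm(s)):
        return H                     # skip: update undefined or unstable
    return H + np.outer(v, v) / denom

# With exact data r = L s and directions spanning R^2, H recovers L
L = np.array([[3.0, 1.0], [1.0, 2.0]])
H = np.eye(2)
for s in (np.array([1.0, 0.0]), np.array([0.0, 1.0])):
    H = sr1_update(H, s, L @ s)
print(np.allclose(H, L))  # True
```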
4.6 Superlinear and Quadratic Convergence of Quadratic Approximation Methods

Lemmas 8 and 9, together with the properties of the variable
metric H^k, result in the following theorem:
Theorem 4. Let the assumptions of Theorem 3 and Lemma 7 hold.
Then, for any desired convergence rate γ ∈ (0; 1), there exists a
number ζ = ζ(γ) > 0 and a neighborhood U(ŷ, x̂) of (ŷ, x̂) such that,
if (y^k, x^k) ∈ U(ŷ, x̂) and ||(L̃^k_xx − H^k)x̃^k|| ≤ ζ w^k, then w^{k+1} ≤ γw^k
and |x̃_0^{k+1}| ≤ γ|x̃_0^k|, and the algorithm from Section 4.3 converges
at the desired rate. If

then the algorithm converges superlinearly, lim_{k→∞} (w^{k+1}/w^k) = 0.
If L̃^k_xx = H^k and the second-order derivatives f_ixx(x), i ∈ A(x̂),
are Lipschitz-continuous, then the algorithm converges quadrati-
cally, lim_{k→∞} sup (w^{k+1}/(w^k)^2) = a < +∞.

The proof of the theorem is quite standard--see, for example,
the proof of Theorem 1 in Wierzbicki (1978)--and is omitted here.
It is worth noting that practical experience with quadratic
approximation methods shows that they are the most efficient
algorithms for constrained differentiable optimization
(Szymanowski, 1977). A similar performance might be expected
from the algorithm given in Section 4.3, since it is only an
adaptation of quadratic approximation methods to the special
class of nondifferentiable problems. Moreover, the author's
attention was recently drawn to a paper (Madsen and Schjaer-
Jacobsen, 1977) describing an algorithm similar in nature--though
different in many details and in the theoretical justification--
to that of Section 4.3, for the same class of problems; the
results of numerical tests given by Madsen and Schjaer-Jacobsen
confirm that the algorithm given in Section 4.3 should be very
efficient in practice.
4.7 Nonconvex Nondifferentiable Optimization With Explicitly Given Subdifferentials
Following the results given in Section 3.7, it is possible
to derive a quadratic approximation algorithm extending the al-
gorithm from Section 4.3 to even the locally nonconvex case.
The algorithm uses the sets S^k (defined redundantly in algorithm
4.3) in order to determine convexifying terms for the quadratic
approximation problem (18b), which now takes the form:

minimize over (x̃_0^k, x̃^k):  x̃_0^k + (1/2)<x̃^k, H^k x̃^k> + (ρ/2) Σ_{i∈S^k} (x̃_0^k − <f^k_ix, x̃^k> + a^k_i)^2   (41)

or, equivalently (up to a constant):

minimize over (x̃_0^k, x̃^k):  x̃_0^k (1 + ρ Σ_{i∈S^k} a^k_i) + (ρ|S^k|/2)(x̃_0^k)^2
  − ρ x̃_0^k <e, F^k x̃^k> + (1/2)<x̃^k, (H^k + ρ F^{kT} F^k) x̃^k> − ρ <a^k, F^k x̃^k>

where F^k is a matrix whose rows are the gradients f^k_ix for i ∈ S^k, a^k_i =
f^k − f^k_i, e is a vector of ones, |S^k| is the number of elements in S^k, and
It is interesting to note that, if S^k = A^k and all con-
straints are active for a solution of (41), the problem is fully
equivalent to a dual problem as in Lemma 1; in all other cases
the dual problem for (41) is more complicated, but might lead to
interesting results. A quadratic approximation algorithm re-
quires a variable metric approximation either of the matrix
H^k ≈ Σ_{i∈A^k} y^k_i f^k_ixx, or of the matrix H^k + ρ F^{kT} F^k; the
latter is positive definite if the second-order sufficient condition of
optimality is satisfied. Under this assumption, the superlinear
convergence of the algorithm can also be proved for the noncon-
vex case by a modification of results given in Wierzbicki (1978).
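The convexifying role of the penalty term can be seen directly on the matrix H^k + ρ F^{kT} F^k. A small sketch with hypothetical data, where the active-constraint gradient covers the direction of negative curvature:

```python
import numpy as np

H = np.array([[1.0, 0.0], [0.0, -0.5]])   # indefinite Hessian (nonconvex case)
F = np.array([[0.0, 1.0]])                 # rows: gradients f_ix for i in S^k
rho = 2.0

M = H + rho * F.T @ F                      # convexified curvature matrix
print(np.linalg.eigvalsh(M).min() > 0)     # True: penalty restores definiteness
```

This works here because the row of F spans the negative-curvature direction, which is essentially what the second-order sufficient condition of optimality requires.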
4.8 Nondifferentiable Optimization With Implicitly Given Subdifferentials
A large number of algorithms has been proposed for the more
general class of nondifferentiable optimization problems in which
∂f(x) is not given explicitly and it is possible to compute only
function values f(x) and subgradients g ∈ ∂f(x), without any more
specific knowledge of their barycentric coordinates (see, e.g.,
Mifflin, 1977). This is largely because such problems arise
quite often in large-scale optimization algorithms, as well as
in many other cases. However, in most such problems some addi-
tional knowledge of barycentric coordinates, etc., is implied by
the specific nature of the problem; ignoring this information is
a simplification resulting in more straightforward, but less
effective, algorithms.
The first quasi-Newton algorithm of this type, based in
fact on results closely related to Lemma 1, was given with con-
vergence proofs by Lemaréchal (1975). However, Lemaréchal did
not specify what the matrix H^k should approximate; it was re-
quired only that H^k should be uniformly positive definite, which
is sufficient for simple convergence. The results given in
previous sections of this paper make it clear that H^k should
approximate (in the sense described in Section 3.6) either the
Hessian Σ_{i∈A^k} y^k_i f^k_ixx or, in the nonconvex case, the augmented
Hessian of type (30).

But the results of previous sections also show that such an
approximation is actually impossible if no additional knowledge
of the barycentric coordinates is assumed. The use of consecu-
tive g^k ∈ ∂f(x^k) gives no second-order information, if g^k =
Σ_{i∈A(x^k)} ỹ^k_i f^k_ix, where the ỹ^k_i might be arbitrary, not even converging
to the optimal barycentric coordinates ŷ_i (if they are unique)
when x^k converges to x̂. The use of the elements g̃^k closest to zero,
given as a convex combination of previous g^j, j = 0, 1, ..., k, gives more
information, at least if g̃^k converges to zero, because then some
corresponding barycentric coordinates should converge to ŷ_i; but
the g̃^k yield averaged information related to many previous x^j,
j = 0, 1, ..., k, and it is difficult to extract from these the current
information related to x^k that is necessary for a variable
metric approximation.
The above remarks do not prove that it is impossible to
construct a superlinearly convergent algorithm for nondifferen-
tiable optimization with subdifferentials given only implicitly;
however, they do show that some stronger assumptions, either re-
lated to the choice of subgradients or to the basic nature of
the problem, are necessary. For example, if the Haar condition
is satisfied, then even a linear approximation algorithm could
be superlinearly convergent. However, it is clear that the prob-
lem of obtaining superlinearly convergent algorithms in cases of
nondifferentiable optimization with implicitly given subdifferen-
tials requires further study.
4.9 Other Extensions and Research Directions
Some of the results given in this paper, for example,
Lemmas 1 and 2, can be generalized for problems with an infinite or
even uncountable number of constraints. The continuous minimax problem

minimize_{x∈X} max_{z∈Z} f(x, z)
can be approached in this way, and, in the convex case, should
not present great difficulties; the nonconvex case is, however,
essentially more complex, since only a partial generalization of
the augmented Lagrangian theory to infinite-dimensional spaces
is now available--see Wierzbicki and Kurcyusz (1977).
APPENDIX 1
An Efficient Line-Search Method for Nonsmooth Optimization
It is assumed that, at a given point x^k, a search direction
x̃^k and a linear estimation x̃_0^k < 0 of the difference f(x^k + x̃^k) − f(x^k)
are given. Function values f_τi = f(x^k + τ_i x̃^k) are com-
puted in order to find f_f = min_{τi} f_τi and τ_f = arg min_{τi} f_τi,
where the τ_i are elements of a specially generated sequence. The
sequence {τ_i} starts with τ_0 = 1 (or, optionally, with the value
accepted for τ_f in a previous run of the line-search algorithm).
The sequence {τ_i} ends with a value τ = τ_i which satisfies two
conditions:
where 0 < m_a < m_b < 1; suggested values for m_a and m_b are m_a =
0.3, m_b = 0.7. To generate the sequence, an expansion or con-
traction ratio r is also used; suggested value r = 10.

The algorithm is as follows:

(i) Compute f_τi. If f_τi < f_f, set τ_f := τ_i, f_f := f_τi. If
f_τi satisfies (a) and (b), stop.

(ii) If f_τi does not satisfy (a), set τ_max := τ_i. If σ_i = 0
or σ_i = -1, set σ_{i+1} := -1. If σ_i = +1, set σ_{i+1} := 2.

(iii) If f_τi does not satisfy (b), set τ_min := τ_i. If σ_i = 0
or σ_i = +1, set σ_{i+1} := +1. If σ_i = -1, set σ_{i+1} := 2.

(iv) If σ_{i+1} = ±1, set τ_{i+1} := r^{σ_{i+1}} τ_i. If σ_{i+1} = 2, set
τ_{i+1} := (τ_max τ_min)^{1/2}. Set i := i + 1, go to (i).

Comment: σ_{i+1} = ±1 means that τ_{i+1} should be increased or de-
creased by a factor of r. σ_{i+1} = 2 means that both a lower
bound τ_min and an upper bound τ_max for τ_f have already been
found and they should be tightened by computing τ_{i+1} as their
geometric mean. The last value τ_g of τ_i, which satisfies (a)
and (b), often gives useful information. If some external
bounds limit the value of τ_i, the algorithm must be modified
accordingly.
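The steps (i)-(iv) above can be sketched as follows (a sketch with hypothetical names; conditions (a) and (b) are taken as sufficient-decrease and minimum-decrease tests against the linear estimate df0 < 0):

```python
import math

def line_search(phi, df0, ma=0.3, mb=0.7, r=10.0, tau0=1.0, max_iter=50):
    # phi(tau) = f(x^k + tau * d); df0 < 0 is the linear decrease estimate.
    # (a): phi(tau) - phi(0) <= ma * tau * df0   (enough decrease)
    # (b): phi(tau) - phi(0) >= mb * tau * df0   (step not too short)
    f0 = phi(0.0)
    tau, sigma = tau0, 0
    tau_min = tau_max = None
    tau_f, f_f = tau0, phi(tau0)              # best point found so far
    for _ in range(max_iter):
        f_tau = phi(tau)
        if f_tau < f_f:
            tau_f, f_f = tau, f_tau
        ok_a = f_tau - f0 <= ma * tau * df0
        ok_b = f_tau - f0 >= mb * tau * df0
        if ok_a and ok_b:
            return tau, f_tau
        if not ok_a:                          # step too long: bound from above
            tau_max = tau
            sigma = 2 if sigma == +1 else -1
        else:                                 # step too short: bound from below
            tau_min = tau
            sigma = 2 if sigma == -1 else +1
        if sigma == 2:                        # both bounds found: geometric mean
            tau = math.sqrt(tau_min * tau_max)
        else:
            tau = tau * (r if sigma == +1 else 1.0 / r)
    return tau_f, f_f

# Example: f(t) = (t - 1)^2 along d = +1 from t = 0, with df0 = f'(0) = -2
tau, val = line_search(lambda t: (t - 1.0)**2, df0=-2.0)
print(tau, val)  # 1.0 0.0
```

In this example the unit step already satisfies both tests, so the search stops immediately, which is the situation the algorithm of Section 4.3 tries to detect without calling the line search at all.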
REFERENCES
Balinski, M.L. and P. Wolfe, eds., Nondifferentiable Optimization, Mathematical Programming Study 3, North-Holland Publishing Co., Amsterdam, 1975.

Clarke, F.H., Generalized Gradients and Applications, Transactions of the American Mathematical Society, 205 (1975), 247-262.

Hohenbalken, B. von, Least Distance Methods for the Scheme of Polytopes, Mathematical Programming (1978), 1-11.

Kreglewski, T. and A.P. Wierzbicki, Further Properties and Modifications of the Rank-One Variable Metric Method, International Conference on Mathematical Programming, Zakopane, 1977.

Lemaréchal, C., Nondifferentiable Optimization: Subgradient and ε-Subgradient Methods, Lecture Notes: Numerical Methods in Optimization and Operations Research, Springer Verlag, 1975, 191-199.

Lemaréchal, C., Nonsmooth Optimization and Descent Methods, RR-78-4, International Institute for Applied Systems Analysis, Laxenburg, Austria, 1978.

Madsen, K. and H. Schjaer-Jacobsen, Linearly Constrained Minimax Optimization, Mathematical Programming (1977), 208-223.

Mifflin, R., An Algorithm for Constrained Optimization With Semismooth Functions, RR-77-3, International Institute for Applied Systems Analysis, Laxenburg, Austria, 1977.

Nurminski, E.A., The Quasigradient Method for Solving Nonlinear Programming Problems, Cybernetics, 9, 1 (1973), 145-150.

Rockafellar, R.T., Augmented Lagrange Multiplier Functions and Duality in Nonconvex Programming, SIAM Journal on Control and Optimization, 12 (1974), 497-513.

Rockafellar, R.T., Lagrange Multipliers in Optimization, SIAM-AMS Proceedings (1976), 145-168.

Szymanowski, J., et al., Computational Methods of Optimization: Basic Research and Numerical Tests (in Polish), Research Report, Institute of Automatic Control, Technical University of Warsaw, 1977.

Wierzbicki, A.P., Maximum Principle for Semiconvex Performance Functionals, SIAM Journal on Control and Optimization (1972), 444-459.

Wierzbicki, A.P. and St. Kurcyusz, Projection on a Cone, Penalty Functionals and Duality Theory for Problems with Inequality Constraints in Hilbert Space, SIAM Journal on Control and Optimization, 15 (1977), 25-56.
Wierzbicki, A.P., A Quadratic Approximation Method Based on Augmented Lagrangian Functions for Nonconvex Nonlinear Programming Problems, WP-78-61, International Institute for Applied Systems Analysis, Laxenburg, Austria, 1978.

Wolfe, P., Finding a Nearest Point in a Polytope, Mathematical Programming, 11 (1976), 128-149.
BIBLIOGKAI'HY ON NONIlIFFERENTIAT3IE: OPTIMIZATION
E. Nurminski
International Institute for Applied Systems Analysis, Laxenburg, Austria
This is a research bibliography, with all the advantages and shortcomings that this implies. The author has used it as a bibliographical data base when writing papers, and it is therefore largely a reflection of his own personal research interests. However, it is hoped that this bibliography will nevertheless be of use to others interested in nondifferentiable optimization.
1 MONOGRAPHS
This section contains monographs related to nondifferentiable optimization. It also contains bibliographies on nondifferentiable optimization which are not parts of books, articles in journals, reports, etc.
M.A. Aizerman, E.M. Braverman, and L.I. Rozonoer, Potential Functions Method in Machine Learning Theory (in Russian), Nauka, Moscow (1970).
J.-P. Aubin, Mathematical Methods of Game and Economic Theory, North-Holland, Amsterdam (1979).
J.-P. Aubin, Methodes Explicites de l'Optimisation, Dunod, Paris (1982).
J.-P. Aubin, Lecons d'Analyse Non-Lineaire et de ses Motivations Economiques, Presses Universitaires de France, Paris (1984).
A. Auslender, Optimisation: Methodes Numeriques, Masson, Paris (1976).
M. Avriel, Nonlinear Programming: Analysis and Methods, Prentice-Hall, Englewood Cliffs, New Jersey (1976).
J. Cea, Optimisation: Theorie et Algorithmes, Dunod, Paris (1971).
J.M. Danskin, The Theory of Max-Min, Springer, New York (1967).
G.B. Dantzig, Linear Programming and Extensions, Princeton University Press, Princeton (1963).
V.F. Demyanov and A.M. Rubinov, Approximation Methods in Optimization Problems, American Elsevier, New York (1970).
V.F. Demyanov and V.N. Malozemov, Introduction to Minimax, John Wiley, New York (1974).
V.F. Demyanov and L.V. Vasiliev, Nondifferentiable Optimization (in Russian), Nauka, Moscow (1981).
I. Ekeland and R. Temam, Analyse Convexe et Problemes Variationnels, Dunod, Paris (1974).
Yu.M. Ermoliev, Stochastic Programming Methods (in Russian), Nauka, Moscow (1976).
V.V. Fedorov, Numerical Methods for Maximin Problems (in Russian), Nauka, Moscow (1979).
E.G. Golstein, Theory of Convex Programming, American Mathematical Society, Translations of Mathematical Monographs, Vol. 36, Providence, R.I. (1972).
A.M. Gupal, Stochastic Methods for Solving Nonsmooth Extremum Problems (in Russian), Naukova Dumka, Kiev (1979).
J. Gwinner, "Bibliography on Nondifferentiable Optimization and Nonsmooth Analysis," Journal of Computational and Applied Mathematics 7(4), pp. 277-285 (1981).
V.Ja. Katkovnik, Linear Estimates and Stochastic Optimization Problems (in Russian), Nauka, Moscow (1976).
P.J. Kelly and M.L. Weiss, Geometry and Convexity, John Wiley, New York (1979).
L.S. Lasdon, Optimization Theory for Large Systems, Macmillan, New York (1970).
P.J. Laurent, Approximation et Optimisation, Hermann, Paris (1972).
C. Lemarechal and R. Mifflin, Nonsmooth Optimization: Proceedings of an IIASA Workshop, Pergamon Press, Oxford (1978).
O. Mangasarian, Nonlinear Programming, McGraw-Hill, New York (1969).
A.S. Nemirovski and D.B. Judin, Complexity and Effectiveness of Optimization Methods (in Russian), Nauka, Moscow (1979).
E.A. Nurminski, Numerical Methods for Solving Deterministic and Stochastic Minimax Problems (in Russian), Naukova Dumka, Kiev (1979).
E.A. Nurminski, "Bibliography on Nondifferentiable Optimization," WP-82-32, International Institute for Applied Systems Analysis, Laxenburg, Austria (1982).
J.M. Ortega and W.C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York (1970).
B.N. Pshenichniy, Necessary Conditions for Extremum Problems (in Russian), Nauka, Moscow (1969). English translation: Marcel Dekker, New York (1971).
B.N. Pshenichniy and Yu.M. Danilin, Numerical Methods for Extremum Problems (in Russian), Nauka, Moscow (1975).
B.N. Pshenichniy, Convex Analysis and Extremal Problems (in Russian), Nauka, Moscow (1980).
A.W. Roberts and D.E. Varberg, Convex Functions, Academic Press, New York (1973).
R.T. Rockafellar, Convex Analysis, Princeton University Press, Princeton (1972).
R.T. Rockafellar, "The Theory of Subgradients and its Applications to Problems of Optimization: Convex and Nonconvex Functions," in Research and Education in Mathematics, Vol. 1, ed. K.H. Hoffman and R. Wille, Heldermann, Berlin (1981).
D.L. Russell, Calculus of Variations and Control Theory, Academic Press, New York (1976).
J.F. Shapiro, Mathematical Programming: Structures and Algorithms, John Wiley, New York (1979).
N.Z. Shor, Methods for Minimization of Nondifferentiable Functions and their Applications (in Russian), Naukova Dumka, Kiev (1979).
Ja.Z. Tsypkin, Adaptation and Learning in Automatic Systems (in Russian), Nauka, Moscow (1968). English translation: Academic Press, New York (1971).
J. Warga, Optimal Control of Differential and Functional Equations, Academic Press, New York (1972).
This section deals with algorithms. It also contains reports on applications of nondifferentiable optimization and computational experiments in this field.
References
S. Agmon, "The Relaxation Method for Linear Inequalities," Canadian Journal of Mathematics 6, pp. 382-392 (1954).
A. Auslender, "Methodes Numeriques pour la Decomposition et la Minimisation de Fonctions Non-Differentiables," Numerische Mathematik 18, pp. 213-223 (1971).
A. Auslender, "Programmation Convexe avec Erreurs: Methodes de Epsilon-Sous-Gradients," Comptes Rendus de l'Academie des Sciences (Paris), Serie A 284(2), pp. 109-112 (1977).
A. Auslender, "Penalty Methods for Computing Points that Satisfy Second Order Necessary Conditions," Mathematical Programming 17(2), pp. 229-238 (1979).
A.B. Bakushinski and B.T. Poljak, "On the Solution of Variational Inequalities (in Russian)," Doklady Akademii Nauk SSSR 219, pp. 1038-1041 (1974). English translation: Soviet Mathematics Doklady pp. 1705-1710.
J.W. Bandler and C. Charalambous, "Nonlinear Programming Using Minimax Techniques," Journal of Optimization Theory and Applications 13, pp. 607-619 (1974).
M.S. Bazaraa and J.J. Goode, "The Traveling Salesman Problem: A Duality Approach," Mathematical Programming 13(2), pp. 221-237 (1977).
M.S. Bazaraa, J.J. Goode, and R.L. Rardin, "A Finite Steepest-Ascent Method for Maximizing Piecewise-Linear Concave Functions," Journal of Optimization Theory and Applications 25(3), pp. 437-442 (1978).
M.S. Bazaraa and J.J. Goode, "A Survey of Various Tactics for Generating Lagrangian Multipliers in the Context of Lagrangian Duality," European Journal of Operational Research 3(3), pp. 322-338 (1979).
L.G. Bazhenov, "On the Convergence Conditions of a Method for Minimizing Almost-Differentiable Functions (in Russian)," Kibernetika 8(4), pp. 71-72 (1972). English translation: Cybernetics Vol. 8(4), pp. 607-609.
D.P. Bertsekas, "Stochastic Optimization Problems with Nondifferentiable Cost Functionals," Journal of Optimization Theory and Applications 12, pp. 218-231 (1973).
D.P. Bertsekas and S.K. Mitter, "A Descent Numerical Method for Optimization Problems with Nondifferentiable Cost Functionals," SIAM Journal on Control 11, pp. 637-652 (1973).
D.P. Bertsekas, "Nondifferentiable Optimization Via Approximation," pp. 1-25 in Nondifferentiable Optimization, ed. M.L. Balinski and P. Wolfe, Mathematical Programming Study 3, North-Holland, Amsterdam (1975).
D.P. Bertsekas, "On the Method of Multipliers for Convex Programming," IEEE Transactions on Automatic Control 20, pp. 385-388 (1975).
D.P. Bertsekas, "Necessary and Sufficient Condition for a Penalty Method to be Exact," Mathematical Programming 9(1), pp. 87-99 (1975).
D.P. Bertsekas, "A New Algorithm for Solution of Nonlinear Resistive Networks Involving Diodes," IEEE Transactions on Circuit Theory 23, pp. 599-608 (1976).
D.P. Bertsekas, "Minimax Methods Based on Approximation," pp. 463-465 in Proceedings of the 1976 Johns Hopkins Conference on Information Sciences and Systems, Baltimore, Maryland (1976).
D.P. Bertsekas, "Approximation Procedures Based on the Method of Multipliers," Journal of Optimization Theory and Applications 23(4), pp. 487-510 (1977).
B. Birzak and B.N. Pshenichniy, "On Some Problems in the Minimization of Nonsmooth Functions (in Russian)," Kibernetika 2(6), pp. 53-57 (1966). English translation: Cybernetics Vol. 2(6), pp. 43-46.
C.E. Blair and R.G. Jeroslow, "An Exact Penalty Method for Mixed-Integer Programs," Mathematics of Operations Research 6(1), pp. 14-18 (1981).
J. Bracken and J.T. McGill, "A Method for Solving Mathematical Programs with Nonlinear Programs in the Constraints," Operations Research 22, pp. 1097-1101 (1974).
R. Brooks and A.M. Geoffrion, "Finding Everett's Lagrange Multipliers by Linear Programming," Operations Research 14(6), pp. 1149-1153 (1966).
A. Butz, "Iterative Saddle Point Techniques," SIAM Journal on Applied Mathematics 15, pp. 719-726 (1967).
P.M. Camerini, L. Fratta, and F. Maffioli, "A Heuristically Guided Algorithm for the Traveling Salesman Problem," Journal of the Institution of Computer Science 4, pp. 31-35 (1973).
P.M. Camerini, L. Fratta, and F. Maffioli, "On Improving Relaxation Methods by Modified Gradient Techniques," pp. 26-34 in Nondifferentiable Optimization, ed. M.L. Balinski and P. Wolfe, Mathematical Programming Study 3, North-Holland, Amsterdam (1975).
J. Cea and R. Glowinski, "Minimisation des Fonctionnelles Non-Differentiables," Report No. 7105, I.N.R.I.A., Le Chesnay, France (1971).
C. Charalambous, "Nonlinear Least p-th Optimization and Nonlinear Programming," Mathematical Programming 12(2), pp. 195-225 (1977).
C. Charalambous, "A Lower Bound for the Controlling Parameters of the Exact Penalty Functions," Mathematical Programming 15(3), pp. 278-290 (1978).
C. Charalambous, "On Conditions for Optimality of the Nonlinear L-1 Problem," Mathematical Programming 17(2), pp. 123-135 (1979).
J. Chatelon, D. Hearn, and T.J. Lowe, "A Subgradient Algorithm for Certain Minimax and Minisum Problems," Mathematical Programming 14(2), pp. 130-145 (1978).
E.W. Cheney and A.A. Goldstein, "Newton's Method for Convex Programming and Chebyshev Approximation," Numerische Mathematik 1(1), pp. 253-268 (1959).
A.R. Conn, "Constrained Optimization Using a Nondifferentiable Penalty Function," SIAM Journal on Numerical Analysis 10, pp. 760-784 (1973).
A.R. Conn, "Linear Programming Via a Nondifferentiable Penalty Function," SIAM Journal on Numerical Analysis 13(1), pp. 145-154 (1976).
G. Cornuejols, M.L. Fisher, and G.L. Nemhauser, "Location of Bank Accounts to Optimize Float: An Analytic Study of Exact and Approximate Algorithms," Management Science 23(8), pp. 789-810 (1977).
H. Crowder, "Computational Improvements for Subgradient Optimization," RC 4907, IBM T.J. Watson Research Center, New York (1974).
J. Cullum, W.E. Donath, and P. Wolfe, "The Minimization of Certain Nondifferentiable Sums of Eigenvalues of Symmetric Matrices," pp. 35-55 in Nondifferentiable Optimization, ed. M.L. Balinski and P. Wolfe, Mathematical Programming Study 3, North-Holland, Amsterdam (1975).
G.B. Dantzig, "General Convex Objective Forms," in Mathematical Methods in the Social Sciences, ed. S. Karlin, K.J. Arrow, and P. Suppes, Stanford University Press, Stanford (1960).
V.F. Demyanov, "On Minimax Problems (in Russian)," Kibernetika 2(6), pp. 58-68 (1966). English translation: Cybernetics Vol. 2(6), pp. 47-53.
V.F. Demyanov, "Algorithms for Some Minimax Problems," Journal of Computer and System Sciences 2, pp. 342-380 (1968).
V.F. Demyanov and A.M. Rubinov, "Minimization of Functionals in Normed Spaces," SIAM Journal on Control 6, pp. 73-89 (1968).
V.F. Demyanov, "Seeking a Minimax on a Bounded Set (in Russian)," Doklady Akademii Nauk SSSR 191, pp. 517-521 (1970).
V.F. Demyanov, "A Continuous Method for Solving Minimax Problems (in Russian)," Zurnal Vycislitel'noi Matematiki i Matematiceskoi Fiziki 15, pp. 592-598 (1975). English translation: USSR Computational Mathematics and Mathematical Physics Vol. 15, pp. 45-52.
V.F. Demyanov and L.V. Vasiliev, "The Relaxation Method of Generalized Gradient (in Russian)," Optimizacia 19(36), pp. 48-52 (1977).
V.F. Demyanov, L.V. Vasiliev, and S.A. Lisina, "Minimization of a Convex Function by Means of E-Subgradients (in Russian)," pp. 3-22 in Control of Dynamical Systems, Leningrad State University, Leningrad (1978).
V.F. Demyanov, "A Multistep Method for Generalized Gradient Descent (in Russian)," Zurnal Vycislitel'noi Matematiki i Matematiceskoi Fiziki 18(5), pp. 1112-1118 (1978).
V.F. Demyanov, "A Modified Generalized Gradient Method in the Presence of Constraints (in Russian)," Vestnik Leningradskogo Universiteta, Matematika, Mekhanika, Astronomiya 19(4), pp. 25-29 (1978). English translation: to appear in Vestnik Leningradskogo Universiteta.
V.F. Demyanov and L.V. Vasiliev, "The Method of (e,m,t)-Generalized Gradient Descent in the Presence of Constraints (in Russian)," Vestnik Leningradskogo Universiteta, Matematika, Mekhanika, Astronomiya 12(1980).
V.F. Demyanov, "Subgradient Method and Saddle Points (in Russian)," Vestnik Leningradskogo Universiteta, Matematika, Mekhanika, Astronomiya 13(7), pp. 17-23 (1981).
L.C.W. Dixon, "Reflections on Nondifferentiable Optimization, Part 1: The Ball-Gradient," Journal of Optimization Theory and Applications 32(2), pp. 123-134 (1980).
L.C.W. Dixon and M. Gaviano, "Reflections on Nondifferentiable Optimization, Part 2: Convergence," Journal of Optimization Theory and Applications 32(3), pp. 254-276 (1980).
J. Elzinga and T.G. Moore, "A Central Cutting Plane Algorithm for the Convex Programming Problem," Mathematical Programming 8, pp. 134-145 (1975).
I.I. Eremin, "A Generalization of the Motzkin-Agmon Relaxation Method (in Russian)," Uspekhi Matematicheskikh Nauk 20, pp. 183-187 (1965).
I.I. Eremin, "The Relaxation Method of Solving Systems of Inequalities with Convex Functions on the Left Side (in Russian)," Doklady Akademii Nauk SSSR 160, pp. 994-996 (1965). English translation: Soviet Mathematics Doklady Vol. 6, pp. 219-221 (1965).
I.I. Eremin, "On the Penalty Method in Convex Programming (in Russian)," Doklady Akademii Nauk SSSR 173(4), pp. 63-67 (1967).
I.I. Eremin, "Standard Iteration Nonsmooth Optimization Process for Nonstationary Convex Programming Problems, Part 1 (in Russian)," Zurnal Vycislitel'noi Matematiki i Matematiceskoi Fiziki 18(6), pp. 1430-1442 (1978).
I.I. Eremin, "Standard Iteration Nonsmooth Optimization Process for Nonstationary Convex Programming Problems, Part 2 (in Russian)," Zurnal Vycislitel'noi Matematiki i Matematiceskoi Fiziki 19(1), pp. 112-120 (1979).
I.I. Eremin, "On Nonstationary Processes of Mathematical Programming (in Russian)," pp. 13-19 in Methods of Optimization and Pattern Recognition in Planning Problems, Institute of Mathematics and Mechanics, Sverdlovsk (1980).
Yu.M. Ermoliev, "Methods of Solution of Nonlinear Extremal Problems (in Russian)," Kibernetika 2(4), pp. 1-17 (1966).
Yu.M. Ermoliev and Z.V. Nekrylova, "Some Methods of Stochastic Optimization (in Russian)," Kibernetika 2(6), pp. 96-98 (1966). English translation: Cybernetics Vol. 2(6), pp. 77-78.
Yu.M. Ermoliev and N.Z. Shor, "On the Minimization of Nondifferentiable Functions (in Russian)," Kibernetika 3(1), pp. 101-102 (1967). English translation: Cybernetics Vol. 3(1), pp. 72-80.
Yu.M. Ermoliev and N.Z. Shor, "Method of Random Walk for the Two-Stage Problems of Stochastic Programming and its Generalization (in Russian)," Kibernetika 4(1), pp. 90-92 (1968). English translation: Cybernetics Vol. 4(1), pp. 59-60.
Yu.M. Ermoliev, "On the Method of Generalized Stochastic Gradients and Stochastic Quasi-Fejer Sequences (in Russian)," Kibernetika 5(2) (1969). English translation: Cybernetics Vol. 5(2), pp. 208-220.
Yu.M. Ermoliev and L.G. Ermolieva, "The Method of Parametric Decomposition (in Russian)," Kibernetika 9(2), pp. 66-69 (1973). English translation: Cybernetics Vol. 9(2), pp. 262-266.
Yu.M. Ermoliev and E.A. Nurminski, "Limit Extremal Problems (in Russian)," Kibernetika 9(4), pp. 130-132 (1973). English translation: Cybernetics Vol. 9(4), pp. 691-693.
Yu.M. Ermoliev, "Stochastic Models and Methods of Optimization (in Russian)," Kibernetika 11(4), pp. 109-119 (1975). English translation: Cybernetics Vol. 11(4), pp. 630-641.
J.P. Evans and F.J. Gould, "Application of the Generalized Lagrange Multiplier Technique to a Production Planning Problem," Naval Research Logistics Quarterly 18(1), pp. 59-74 (1971).
J.P. Evans, F.J. Gould, and J.W. Tolle, "Exact Penalty Functions in Nonlinear Programming," Mathematical Programming 4, pp. 72-97 (1973).
H. Everett, "Generalized Lagrange Multiplier Method for Solving Problems of Optimum Allocation of Resources," Operations Research 11, pp. 399-417 (1963).
J.E. Falk, "A Linear Max-Min Problem," Mathematical Programming 5(2), pp. 169-188 (1973).
A. Feuer, "An Implementable Mathematical Programming Algorithm for Admissible Fundamental Functions," Ph.D. Dissertation, Columbia University, New York (1974).
M.L. Fisher, W.D. Northup, and J.F. Shapiro, "Using Duality to Solve Discrete Optimization Problems: Theory and Computational Experience," pp. 56-94 in Nondifferentiable Optimization, ed. M.L. Balinski and P. Wolfe, Mathematical Programming Study 3, North-Holland, Amsterdam (1975).
M.L. Fisher, "A Dual Algorithm for the One-Machine Scheduling Problem," Mathematical Programming 11(3), pp. 229-251 (1976).
M.L. Fisher, "The Lagrangian Relaxation Method for Solving Integer Programming Problems," Management Science 27(1) (1981).
R. Fletcher, "An Exact Penalty Function for Nonlinear Programming with Inequalities," Mathematical Programming 5(2), pp. 129-150 (1973).
R. Fletcher, "Methods Related to Lagrangian Functions," Chapter 8 in Numerical Methods for Constrained Optimization, ed. P.E. Gill and W. Murray, Academic Press, London (1974).
R. Fletcher, "Conjugate Gradient Methods for Indefinite Systems," pp. 73-89 in Numerical Analysis, Dundee, ed. G.A. Watson, Springer, Berlin (1975).
R. Fletcher, "Methods for Solving Nonlinear Constrained Optimization Problems," University of Dundee Report NA 16 (1976). Also to be published in the Proceedings of the York State-of-the-Art Conference.
A.A. Gaivoronski, "Nonstationary Stochastic Programming Problems (in Russian)," Kibernetika 14(4), pp. 84-92 (1978). English translation: Cybernetics Vol. 14(4), pp. 575-579.
A.M. Geoffrion, "Primal Resource-Directive Approaches for Optimizing Nonlinear Decomposable Systems," Operations Research 18(3), pp. 375-403 (1970).
A.M. Geoffrion, "Duality in Nonlinear Programming: A Simplified Applications-Oriented Development," SIAM Review 13, pp. 1-37 (1971).
A.M. Geoffrion and G. Graves, "Multicommodity Distribution System Design by Benders' Decomposition," Management Science 20(5), pp. 822-844 (1974).
O.V. Glushkova and A.M. Gupal, "Numerical Methods for the Minimization of Maximum Functions without Calculating Gradients (in Russian)," Kibernetika 16(5), pp. 141-143 (1980).
O.V. Glushkova and A.M. Gupal, "On Nonmonotonous Methods for Minimizing Nonsmooth Functions with Gradient Averaging (in Russian)," Kibernetika 16(6), p. 128 (1980).
J.L. Goffin, "On the Finite Convergence of the Relaxation Method for Solving Systems of Inequalities," ORC 71-36, Operations Research Center Report, University of California, Berkeley (1971).
J.L. Goffin, "On Convergence Rates of Subgradient Optimization Methods," Mathematical Programming 13(3), pp. 329-347 (1977).
J.L. Goffin, "The Relaxation Method for Solving Systems of Linear Inequalities," Mathematics of Operations Research 5(3), pp. 388-414 (1980).
A.A. Goldstein, "Optimization with Corners," pp. 215-230 in Non-Linear Programming, Vol. 2, Academic Press, New York (1975).
A.A. Goldstein, "Optimization of Lipschitz Continuous Functions," Mathematical Programming 13, pp. 14-22 (1977).
E.G. Golstein, "Generalized Gradient Method for Finding Saddle Points (in Russian)," Ekonomika i Matematicheskie Metody 8(4) (1970).
R.C. Grinold, "Steepest Ascent for Large-Scale Linear Programs," SIAM Review 14, pp. 447-464 (1972).
L. Grippo and A. La Bella, "Some Abstract Models of Convergent Algorithms in Nondifferentiable Optimization," R 78-19, Universita di Roma, Istituto di Automatica (1978).
M. Grotschel, L. Lovasz, and A. Schrijver, "The Ellipsoid Method and its Consequences in Combinatorial Optimization," Report No. 80151-OR, Institut fur Okonometrie und Operations Research, Rheinische Friedrich-Wilhelms-Universitat, Bonn (1980).
A.M. Gupal, "One Stochastic Programming Problem with Constraints of a Probabilistic Nature (in Russian)," Kibernetika 10(5), pp. 94-100 (1974). English translation: Cybernetics Vol. 10(5), pp. 1019-1026.
A.M. Gupal, "On a Minimization Method for Almost-Differentiable Functions (in Russian)," Kibernetika 13(1), pp. 114-118 (1977).
A.M. Gupal and V.I. Norkin, "A Minimization Algorithm for Discontinuous Functions (in Russian)," Kibernetika 13(2), pp. 73-75 (1977).
A.M. Gupal, "Method for the Minimization of Functions Satisfying the Lipschitz Condition (in Russian)," Kibernetika 16(5), pp. 91-94 (1980).
O.V. Guseva, "Convergence Rate of the Generalized Stochastic Gradient Method (in Russian)," Kibernetika 7(4), pp. 143-145 (1971). English translation: Cybernetics Vol. 7(4), pp. 734-742.
J. Hald and H. Schjaer-Jacobsen, "Linearly Constrained Minimax Optimization without Calculating Derivatives," Third Symposium on Operations Research (University of Mannheim, Mannheim), Section 1, pp. 289-301 (1978).
J. Hald and K. Madsen, "A 2-Stage Algorithm for Minimax Optimization," pp. 225-239 in International Symposium on Systems Optimization and Analysis, Rocquencourt, December 11-13, 1978, Lecture Notes in Control and Information Science, Vol. 14, Springer, Berlin (1979).
J. Hald and K. Madsen, "Combined LP and Quasi-Newton Methods for Minimax Optimization," Mathematical Programming 20(1), pp. 49-62 (1981). [Also Technical Report NI-79-05, Numerisk Institut, Danmarks Tekniske Hojskole, Lyngby, Denmark, 1979.]
S.P. Han, "Variable Metric Methods for Minimizing a Class of Nondifferentiable Functions," Mathematical Programming 20(1), pp. 1-13 (1981).
M. Held and R.M. Karp, "The Traveling Salesman Problem and Minimum Spanning Trees. Part 1," Operations Research 18, pp. 1138-1162 (1970).
M. Held and R.M. Karp, "The Traveling Salesman Problem and Minimum Spanning Trees. Part 2," Mathematical Programming 1, pp. 6-25 (1971).
B. von Hohenbalken, "Least Distance Methods for the Scheme of Polytopes," Mathematical Programming 15, pp. 1-11 (1978).
D.B. Judin and A.S. Nemirovski, "Evaluation of Information Complexity for Mathematical Programming Problems (in Russian)," Ekonomika i Matematicheskie Metody 12(1), pp. 128-142 (1976).
D.B. Judin and A.S. Nemirovski, "Information Complexity and Effective Methods for Solving Convex Extremum Problems (in Russian)," Ekonomika i Matematicheskie Metody 12(2), pp. 357-368 (1976).
C.Y. Kao and R.R. Meyer, "Secant Approximation Methods for Convex Optimization," pp. 143-162 in Mathematical Programming at Oberwolfach, ed. H. Konig, B. Korte, and K. Ritter, Mathematical Programming Study 14, North-Holland, Amsterdam (1981).
A.A. Kaplan, "A Convex Programming Method with Internal Regularization (in Russian)," Doklady Akademii Nauk SSSR 241(1), pp. 22-25 (1978). English translation: Soviet Mathematics Doklady Vol. 19(4), pp. 785-799.
S. Kaplan, "Solution of the Lorie-Savage and Similar Integer Programming Problems by the Generalized Lagrange Multiplier Method," Operations Research 14, pp. 1130-1136 (1966).
N.N. Karpinskaja, "Methods of Penalty Functions and the Foundations of Pyne's Method (in Russian)," Avtomatika i Telemekhanika 28, pp. 140-146 (1967). English translation: Automation and Remote Control Vol. 28.
J.E. Kelley, "The Cutting Plane Method for Solving Convex Programs," Journal of the Society for Industrial and Applied Mathematics 8(4), pp. 703-712 (1960).
L.G. Khachiyan, "A Polynomial Algorithm in Linear Programming (in Russian)," Doklady Akademii Nauk SSSR 244, pp. 1093-1096 (1979).
L.G. Khachiyan, "Polynomial Algorithm in Linear Programming (in Russian)," Zurnal Vycislitel'noi Matematiki i Matematiceskoi Fiziki 20(1), pp. 51-68 (1980). English translation: USSR Computational Mathematics and Mathematical Physics Vol. 20(1), pp. 53-72.
K.C. Kiwiel, "A Variable Metric Method of Centers for Nonsmooth Minimization," CP-81-23, International Institute for Applied Systems Analysis, Laxenburg, Austria (1981).
S.K. Korovin and V.I. Utkin, "Method of Piecewise Smooth Penalty Functions (in Russian)," Avtomatika i Telemekhanika 37, pp. 94-105 (1976). English translation: Automation and Remote Control Vol. 37, pp. 39-48.
B. Korte and R. Schrader, "A Note on Convergence Proofs for Shor-Khachiyan Methods," Report No. 80156-OR, Institut fur Okonometrie und Operations Research, Rheinische Friedrich-Wilhelms-Universitat, Bonn (1980).
M.K. Kozlov, S.P. Tarasov, and L.G. Khachiyan, "Polynomial Solvability of Convex Quadratic Programming (in Russian)," Doklady Akademii Nauk SSSR 248(5), pp. 1049-1051 (1980).
O.V. Kupatadze, "On the Gradient Method for Minimizing Nonsmooth Functions (in Russian)," in Optimalnye i Adaptivnye Sistemy, Trudy 4 Vsesojuzn. Sovesch. po Avt. Upr. (Tbilisi, 1968), Nauka, Moscow (1972).
A.I. Kuzovkin and V.M. Tihomirov, "On the Quantity of Observations Required to Find a Minimum of a Convex Function (in Russian)," Ekonomika i Matematicheskie Metody 3(1), pp. 95-103 (1967).
C. Lemarechal, "An Algorithm for Minimizing Convex Functions," pp. 552-556 in Information Processing '74, ed. J.L. Rosenfeld, North-Holland, Amsterdam (1974).
C. Lemarechal, "Note on an Extension of Davidon Methods to Nondifferentiable Functions," Mathematical Programming 7(3), pp. 384-387 (1974).
C. Lemarechal, "Nondifferentiable Optimization: Subgradient and E-Subgradient Methods," pp. 101-199 in Lecture Notes in Economics and Mathematical Systems, Vol. 117, ed. W. Oettli, Springer, Berlin (1975).
C. Lemarechal, Combining Kelley's and Conjugate Gradient Methods, Abstracts, 9th International Symposium on Mathematical Programming, Budapest (1976).
C. Lemarechal, "A View of Line-Searches," pp. 59-78 in Optimization and Optimal Control, ed. A. Auslender, W. Oettli, and J. Stoer, Proceedings of a Conference held at Oberwolfach, March 16-22, 1980, Lecture Notes in Control and Information Science, Vol. 30, Springer, Berlin (1981).
A.Yu. Levin, "On an Algorithm for the Minimization of Convex Functions (in Russian)," Doklady Akademii Nauk SSSR 160, pp. 1244-1247 (1965). English translation: Soviet Mathematics Doklady Vol. 6, pp. 286-290.
E.S. Levitin and B.T. Poljak, "Convergence of Minimizing Sequences in Conditional Extremum Problems (in Russian)," Doklady Akademii Nauk SSSR 168, pp. 993-996 (1966). English translation: Soviet Mathematics Doklady Vol. 7, pp. 764-767.
E.S. Levitin, "A General Minimization Method for Nonsmooth Extremal Problems (in Russian)," Zurnal Vycislitel'noi Matematiki i Matematiceskoi Fiziki 8 pp. 783-806 (1969). English translation: USSR Computational Mathematics and Mathematical Physics Vol. 9.
D.G. Luenberger, "Control Problems with Kinks," IEEE Transactions on Automatic Control 15 pp. 570-575 (1970).
K. Madsen, "An Algorithm for Minimax Solution of Overdetermined Systems of Nonlinear Equations," Journal of the Institute of Mathematics and its Applications 16 pp. 321-328 (1975).
K. Madsen, "Minimax Solution of Nonlinear Equations without Calculating Derivatives," pp. 110-126 in Nondifferentiable Optimization, ed. M.L. Balinski and P. Wolfe, Mathematical Programming Study 3, North-Holland, Amsterdam (1975).
K. Madsen and H. Schjaer-Jacobsen, "Linearly Constrained Minimax Optimization," Mathematical Programming 14(2) pp. 208-223 (1978).
O.L. Mangasarian, "Iterative Solution of Linear Programs," Computer Science Technical Report #327, Computer Science Department, University of Wisconsin, Madison (1979).
R.E. Marsten, "The Use of the Boxstep Method in Discrete Optimization," pp. 127-144 in Nondifferentiable Optimization, ed. M.L. Balinski and P. Wolfe, Mathematical Programming Study 3, North-Holland, Amsterdam (1975).
R.E. Marsten, W.W. Hogan, and J.W. Blankenship, "The Boxstep Method for Large-Scale Optimization," Operations Research 23(3) pp. 389-405 (1975).
G.D. Maystrovsky, "On a Gradient Method for Searching for Saddle Points (in Russian)," Ekonomika i Matematicheskie Metody 12(5) pp. 917-929 (1976).
R. Mifflin, "Semismooth and Semiconvex Functions in Constrained Optimization," RR-78-21, International Institute for Applied Systems Analysis, Laxenburg, Austria (1978). [Also in SIAM Journal on Control and Optimization Vol. 15(6) pp. 959-972 (1977).]
R. Mifflin, "An Algorithm for Constrained Optimization with Semismooth Functions," Mathematics of Operations Research 2 pp. 191-207 (1977).
R. Mifflin, "A Stable Method for Solving Certain Constrained Least-Squares Problems," Mathematical Programming 16(2) pp. 141-158 (1979).
V.S. Mikhalevich, Yu.M. Ermoliev, V.V. Skurba, and N.Z. Shor, "Complex Systems and Solution of Extremal Problems (in Russian)," Kibernetika 3(5) pp. 29-39 (1967). English translation: Cybernetics Vol. 3(5) pp. 25-34.
V.S. Mikhalevich, I.V. Sergienko, and N.Z. Shor, "A Study of Methods for Solving Optimization Problems and Their Applications (in Russian)," Kibernetika 17(4) pp. 89-113 (1981).
H. Mine and M. Fukushima, "A Minimization Method for the Sum of a Convex Function and a Continuously Differentiable Function," Journal of Optimization Theory and Applications 33(1) pp. 9-24 (1981).
T. Motzkin and I.J. Schoenberg, "The Relaxation Method for Linear Inequalities," Canadian Journal of Mathematics 6 pp. 393-404 (1954).
J.A. Muckstadt and S.A. Koenig, "An Application of Lagrangian Relaxation to Scheduling in Power-Generation Systems," Operations Research 25(3) pp. 387-403 (1977).
W. Murray and M.L. Overton, "Steplength Algorithms for a Class of Nondifferentiable Functions," CS-78-679, Stanford University, Stanford (1978).
A.S. Nemirovski, "Effective Iterative Methods of Solving Equations with Monotone Operators (in Russian)," Ekonomika i Matematicheskie Metody 17(2) pp. 344-359 (1981).
V.I. Norkin, "Two Random Search Algorithms for the Minimization of Nondifferentiable Functions (in Russian)," in Mathematical Methods in Operations Research and Reliability Theory, ed. Yu.M. Ermoliev and I.N. Kovalenko, Ukrainian Academy of Sciences, Institute of Cybernetics, Kiev (1978).
V.I. Norkin, "Minimization Method for Nondifferentiable Functions with Averaging of Generalized Gradients (in Russian)," Kibernetika 16(6) pp. 88-89 (1980).
E.A. Nurminski, "Quasigradient Method for Solving Nonlinear Programming Problems (in Russian)," Kibernetika 9(1) pp. 122-125 (1973).
E.A. Nurminski, "Minimization of Nondifferentiable Functions Under Noise (in Russian)," Kibernetika 10(4) pp. 59-61 (1974).
E.A. Nurminski and A.A. Zhelikhovski, "Investigation of One Regulating Step in a Quasi-Gradient Method for Minimizing Weakly Convex Functions (in Russian)," Kibernetika 10(6) pp. 101-105 (1974). English translation: Cybernetics Vol. 10(6) pp. 1027-1031.
E.A. Nurminski and A.A. Zhelikhovski, "An ε-Quasigradient Method for the Solution of Nonsmooth Extremal Problems (in Russian)," Kibernetika 13(1) pp. 109-113 (1977). English translation: Cybernetics Vol. 13.
E.A. Nurminski, "On ε-Subgradient Methods of Nondifferentiable Optimization," pp. 187-195 in International Symposium on Systems Optimization and Analysis, Rocquencourt, December 11-13, 1978, Lecture Notes in Control and Information Science, Vol. 14, Springer, Berlin (1979).
E.A. Nurminski, "An Application of Nondifferentiable Optimization in Optimal Control," pp. 137-158 in Numerical Optimization of Dynamic Systems, ed. L.C.W. Dixon and G.P. Szego, North-Holland, Amsterdam (1980).
M.W. Padberg and M.R. Rao, "The Russian Method for Linear Inequalities," Technical Report, Graduate School of Business Administration, New York University, New York (1979).
V.M. Panin, "Linearization Method for the Discrete Minimax Problem (in Russian)," Kibernetika 16(3) pp. 86-90 (1980).
C. Papavassilopoulos, "Algorithms for a Class of Nondifferentiable Problems," Journal of Optimization Theory and Applications 34(1) pp. 41-82 (1981).
S.C. Parikh, "Approximate Cutting Planes in Nonlinear Programming," Mathematical Programming 11(2) pp. 194-198 (1976).
T. Pietrzykowski, "An Exact Potential Method for Constrained Maxima," SIAM Journal on Numerical Analysis 6 pp. 217-238 (1969).
B.T. Poljak, "A General Method of Solving Extremal Problems (in Russian)," Doklady Akademii Nauk SSSR 174 pp. 33-36 (1967). English translation: Soviet Mathematics Doklady Vol. 8.
B.T. Poljak, "Minimization of Nonsmooth Functionals (in Russian)," Zurnal Vycislitel'noi Matematiki i Matematiceskoi Fiziki 9(3) pp. 509-521 (1969). English translation: USSR Computational Mathematics and Mathematical Physics Vol. 9(3) pp. 14-29 (1969).
B.T. Poljak and Ja.Z. Tsypkin, "Pseudogradient Adaptation and Training (in Russian)," Avtomatika i Telemekhanika 34 pp. 45-68 (1973). English translation: Automation and Remote Control Vol. 34 pp. 377-397.
B.T. Poljak, "Stochastic Regularized Algorithms," in Supplement to Preprints, Stochastic Control Symposium, IFAC, Budapest (1974).
B.T. Poljak, "Convergence and Convergence Rates of Iterative Stochastic Algorithms, I. General Case (in Russian)," Avtomatika i Telemekhanika 37(12) pp. 83-94 (1976). English translation: Automation and Remote Control Vol. 37.
R.A. Poljak, "On the Best Convex Chebyshev Approximation (in Russian)," Doklady Akademii Nauk SSSR 200 pp. 538-540 (1971). English translation: Soviet Mathematics Doklady Vol. 12 pp. 1441-1444.
R.A. Poljak, "Controlling Sequence Methods for Solution of Dual Problems in Convex Programming (in Russian)," pp. 95-111 in Mathematical Methods for Solution of Economic Problems, ed. N.P. Fedorenko and E.G. Golstein, Optimal Planning and Control Series, Vol. 8, Nauka, Moscow (1979).
M.J.D. Powell, "A View of Unconstrained Optimization," pp. 117-152 in Optimization in Action, ed. L.C.W. Dixon, Academic Press, London (1976).
B.N. Pshenichniy, "Dual Method in Extremum Problems (in Russian)," Kibernetika 1(3) pp. 89-95 (1965). English translation: Cybernetics Vol. 1(3) pp. 91-99.
B.N. Pshenichniy, "Convex Programming in a Normalized Space (in Russian)," Kibernetika 1(5) pp. 46-54 (1965). English translation: Cybernetics Vol. 1(5) pp. 46-57.
B.N. Pshenichniy, "On One Method for Solving a Convex Programming Problem (in Russian)," Kibernetika 16(4) pp. 48-55 (1980).
J.K. Reid, "On the Method of Conjugate Gradients for the Solution of Large Sparse Systems of Linear Equations," Chapter 18 in Large Sparse Sets of Linear Equations, ed. J.K. Reid, Academic Press, London (1971).
S.M. Robinson, "A Subgradient Algorithm for Solving K-Convex Inequalities," pp. 237-245 in Optimization and Operations Research, ed. M.A. Wolfe, Proceedings of the Conference held at Oberwolfach, 1975, Lecture Notes in Economics and Mathematical Systems, Vol. 117, Springer, Berlin (1976).
R.T. Rockafellar, "Augmented Lagrangians and Applications of the Proximal Point Algorithm in Convex Programming," Mathematics of Operations Research 1(2) pp. 97-116 (1976).
R.T. Rockafellar, "Monotone Operators and the Proximal Point Algorithm," SIAM Journal on Control and Optimization 14 pp. 877-898 (1976).
S.V. Rzhevsky, "Use of the Convex Function Subdifferential in One Method of Function Minimization (in Russian)," Kibernetika 16(1) pp. 109-111 (1980).
R. Saigal, "The Fixed Point Approach to Nonlinear Programming," pp. 142-157 in Point-to-Set Maps in Mathematical Programming, ed. P. Huard, Mathematical Programming Study 10, North-Holland, Amsterdam (1979).
C. Sandi, "Subgradient Optimization," Comb. Optimiz. Lect. Summer Sch. Comb. Optim., Urbino, 1977, pp. 73-91 (1979).
M.A. Sepilov, "The Generalized Gradient Method for Convex Programming Problems (in Russian)," Ekonomika i Matematicheskie Metody 11(4) pp. 743-747 (1975).
M.A. Sepilov, "On the Generalized Gradient Method for Extremal Problems (in Russian)," Zurnal Vycislitel'noi Matematiki i Matematiceskoi Fiziki 16(1) pp. 242-247 (1976).
J.F. Shapiro, "Generalized Lagrange Multipliers in Integer Programming," Operations Research 19(1) pp. 68-75 (1971).
J.F. Shapiro, "Nondifferentiable Optimization and Large-Scale Linear Programming," pp. 196-209 in International Symposium on Systems Optimization and Analysis, Rocquencourt, December 11-13, 1978, Lecture Notes in Control and Information Science, Vol. 14, Springer, Berlin (1979).
M.B. Shchepakin, "On the Modification of a Class of Algorithms for Mathematical Programming (in Russian)," Zurnal Vycislitel'noi Matematiki i Matematiceskoi Fiziki 19(6) pp. 1387-1395 (1979).
N.Z. Shor, "Application of the Gradient Method for the Solution of Network Transportation Problems (in Russian)," Notes, Scientific Seminar on Theory and Application of Cybernetics and Operations Research, Institute of Cybernetics of the Ukrainian Academy of Sciences, Kiev (1962).
N.Z. Shor, "On the Structure of Algorithms for the Numerical Solution of Problems of Optimal Planning and Design (in Russian)," Dissertation, Kiev (1964).
N.Z. Shor, "Application of Generalized Gradient Descent in Block Programming (in Russian)," Kibernetika 3(3) pp. 53-55 (1967). English translation: Cybernetics Vol. 3(3) pp. 43-45.
N.Z. Shor, "The Rate of Convergence of the Generalized Gradient Descent Method (in Russian)," Kibernetika 4(3) pp. 98-99 (1968). English translation: Cybernetics Vol. 4(3) pp. 79-80.
N.Z. Shor and M.B. Shchepakin, "Algorithms for Solving Two-Stage Stochastic Programming Problems (in Russian)," Kibernetika 4(3) (1968). English translation: Cybernetics Vol. 4(3) pp. 48-50.
N.Z. Shor, "Convergence Rate of the Gradient Descent Method with Dilation of the Space (in Russian)," Kibernetika 6(2) pp. 80-85 (1970). English translation: Cybernetics Vol. 6(2) pp. 102-108.
N.Z. Shor, "Utilization of the Operation of Space Dilation in the Minimization of Convex Functions (in Russian)," Kibernetika 6(1) pp. 6-12 (1970). English translation: Cybernetics Vol. 6(1) pp. 7-15.
N.Z. Shor and P.R. Gamburd, "Certain Questions Concerning the Convergence of the Generalized Gradient Method (in Russian)," Kibernetika 7(6) pp. 82-84 (1971). English translation: Cybernetics Vol. 7(6) pp. 1033-1036.
N.Z. Shor and N.G. Zhurbenko, "A Minimization Method Using Space Dilation in the Direction of the Difference of Two Successive Gradients (in Russian)," Kibernetika 7(3) pp. 51-59 (1971). English translation: Cybernetics Vol. 7(3) pp. 450-459.
N.Z. Shor, "A Class of Almost-Differentiable Functions and a Minimization Method for Functions of this Class (in Russian)," Kibernetika 8(4) pp. 65-70 (1972). English translation: Cybernetics Vol. 8(4) pp. 599-606.
N.Z. Shor and L.P. Shabashova, "Solution of Minimax Problems by the Generalized Gradient Method with Space Dilation (in Russian)," Kibernetika 8(1) pp. 82-94 (1972). English translation: Cybernetics Vol. 8(1) pp. 88-94.
N.Z. Shor, "Convergence of a Gradient Method with Space Dilation in the Direction of the Difference between Two Successive Gradients (in Russian)," Kibernetika 11(4) pp. 48-53 (1975). English translation: Cybernetics Vol. 11(4) pp. 564-570.
N.Z. Shor, "Generalized Gradient Methods for Nonsmooth Functions and Their Applications to Mathematical Programming Problems (in Russian)," Ekonomika i Matematicheskie Metody 12(2) pp. 332-356 (1976).
N.Z. Shor, "Cut-Off Method with Space Extension in Convex Programming Problems (in Russian)," Kibernetika 13(1) pp. 94-96 (1977). English translation: Cybernetics Vol. 13(1) pp. 94-96.
N.Z. Shor, L.A. Galustova, and A.I. Momot, "Application of Mathematical Methods in Optimum Designing of the Unified Gas Supply System with Regard for Dynamics of Its Development (in Russian)," Kibernetika 14(1) pp. 69-74 (1978).
N.Z. Shor and V.I. Gershovitch, "On One Family of Solution Algorithms for Problems of Convex Programming (in Russian)," Kibernetika 15(4) pp. 62-67 (1979).
V.A. Skokov, "Note on Minimization Methods Using Space Dilation (in Russian)," Kibernetika 10(4) pp. 115-117 (1974). English translation: Cybernetics Vol. 10(4) pp. 689-692.
J.J. Strodiot and V.H. Nguyen, "An Exponential Penalty Method for Nondifferentiable Minimax Problems with General Constraints," Journal of Optimization Theory and Applications 27(2) (1979).
Ja.Z. Tsypkin and B.T. Poljak, "Attainable Accuracy of Adaptation Algorithms (in Russian)," Doklady Akademii Nauk SSSR 218(3) pp. 532-535 (1974).
S.P. Uryasyev, "On One Regulation of a Step in Limiting Extreme-Value Problems (in Russian)," Kibernetika 17(1) pp. 105-108 (1981).
P. Wolfe, "Convergence Theory in Nonlinear Programming," Chapter 1 in Integer and Nonlinear Programming, ed. J. Abadie, North-Holland, Amsterdam (1970).
P. Wolfe, M. Held, and H. Crowder, "Validation of Subgradient Optimization," Mathematical Programming 6 pp. 62-88 (1974).
P. Wolfe, "Note on a Method of Conjugate Subgradients for Minimizing Nondifferentiable Functions," Mathematical Programming 7(3) pp. 380-383 (1974).
P. Wolfe, "Finding the Nearest Point in a Polytope," Mathematical Programming 11(2) pp. 128-149 (1976).
I. Zang, "Discontinuous Optimization by Smoothing," Mathematics of Operations Research 6(1) pp. 140-152 (1981).
N.G. Zhurbenko, E.G. Pinaev, N.Z. Shor, and C.N. Yun, "Choice of Fleet Composition and Allocation of Aircraft to Civil Airline Routes (in Russian)," Kibernetika 12(4) pp. 138-141 (1976). English translation: Cybernetics Vol. 12(4) pp. 636-641.
N.G. Zhurbenko, "Study of One Class of Algorithms for Minimization of Nonsmooth Functions and Their Application to Solution of Large-Scale Problems (in Russian)," Dissertation, Kiev (1977).
S. Zlobec, "A Note on Optimization Methods with Conditioned Gradients," Zeitschrift für Angewandte Mathematik und Mechanik 59 pp. 279-281 (1979).
S.I. Zukhovitski, R.A. Poljak, and M.E. Primak, "An Algorithm for the Solution of the Problem of Convex Chebyshev Approximation (in Russian)," Doklady Akademii Nauk SSSR 151 pp. 27-30 (1963). English translation: Soviet Mathematics Doklady Vol. 4 pp. 901-904.
This section contains papers on general notions of differentiability, optimality conditions in the nondifferentiable case, properties of perturbation functions in parametric programming, and the stability of optimal programs in connection with nondifferentiable optimization.
References
A.D. Alexandrov, "The Existence Almost Everywhere of the Second Differential of a Convex Function and Some Associated Properties of Convex Surfaces (in Russian)," Ucenye Zapiski Leningradskogo Gosudarstvennogo Universiteta, Seriya Matematika 37(6) pp. 3-35 (1939).
E. Asplund and R.T. Rockafellar, "Gradients of Convex Functions," Transactions of the American Mathematical Society 139 pp. 443-467 (1969).
N.N. Astafiev, "Stability and Marginal Values of Convex Programming Problems (in Russian)," Sibirskii Matematicheski Zhurnal 19(3) pp. 491-503 (1978).
H. Attouch and R.J.-B. Wets, "Approximation and Convergence in Nonlinear Optimization," WP-80-142, International Institute for Applied Systems Analysis, Laxenburg, Austria (1980).
J.-P. Aubin and F.H. Clarke, "Multiplicateurs de Lagrange en Optimisation Non-Convexe et Applications," Comptes Rendus de l'Academie des Sciences (Paris), Serie A 285 pp. 451-454 (1977).
J.-P. Aubin, "Gradients Generalises de Clarke," Annales des Sciences Mathematiques du Quebec 2 pp. 197-252 (1978).
J.-P. Aubin and F.H. Clarke, "Shadow Prices and Duality for a Class of Optimal Control Problems," SIAM Journal on Control and Optimization 17(5) pp. 567-586 (1979).
J.-P. Aubin, "Further Properties of Lagrange Multipliers in Nonsmooth Optimization," Journal of Applied Mathematics and Optimization 57 pp. 79-90 (1980).
J.-P. Aubin, "Contingent Derivatives of Set-Valued Maps and Existence of Solutions to Nonlinear Inclusions and Differential Inclusions," pp. 160-232 in Advances in Mathematics, Supplementary Studies, ed. L. Nachbin, Academic Press (1981).
J.-P. Aubin, "Lipschitz Behavior of Solutions to Convex Minimization Problems," WP-81-76, International Institute for Applied Systems Analysis, Laxenburg, Austria (1981).
A. Auslender, "Convex Programming with Errors: Methods of ε-Subgradients," in Survey of Mathematical Programming, ed. A. Prekopa, Proceedings of the 9th International Mathematical Programming Symposium, Budapest, August 23-27, 1976, North-Holland, Amsterdam - Akademiai Kiado, Budapest (1979).
A. Auslender, "Differentiable Stability in Nonconvex and Nondifferentiable Programming," pp. 29-41 in Point-to-Set Maps in Mathematical Programming, ed. P. Huard, Mathematical Programming Study 10, North-Holland, Amsterdam (1979).
A. Auslender, "Sur la Differentiabilite de la Fonction d'Appui du Sousdifferentiel a ε-Pres," Comptes Rendus de l'Academie des Sciences (Paris), Serie A 292(3) pp. 221-224 (1981).
M. Avriel and I. Zang, "Generalized Arcwise-Connected Functions and Characterization of Local-Global Minimum Properties," Journal of Optimization Theory and Applications 32(4) pp. 407-426 (1980).
M.S. Bazaraa, J.F. Goode, and C.M. Shetty, "Optimality Criteria in Nonlinear Programming Without Differentiability," Operations Research 19 pp. 77-86 (1971).
A. Ben-Tal and A. Ben-Israel, "Characterization of Optimality in Convex Programming: Nondifferentiable Case," Applicable Analysis 9(2) pp. 137-156 (1979).
V.I. Berdysev, "Continuity of Multivalued Mappings Connected with the Minimization of Convex Functionals (in Russian)," Doklady Akademii Nauk SSSR 243(3) pp. 561-564 (1978).
V.V. Beresnev, "On Necessary Optimality Conditions for the Systems of Discrete Inclusions (in Russian)," Kibernetika 13(2) pp. 58-64 (1977).
J.M. Borwein, "Fractional Programming without Differentiability," Mathematical Programming 11(3) pp. 283-290 (1976).
J.M. Borwein, "Tangent Cones, Starshape and Convexity," International Journal of Mathematics and Mathematical Sciences 1(4) pp. 497-590 (1978).
J.M. Borwein, "The Minimum of a Family of Programs," Third Symposium on Operations Research (University of Mannheim, Mannheim) 1 pp. 100-102 (1978).
J.M. Borwein, "A Multivalued Approach to the Farkas Lemma," in Point-to-Set Maps in Mathematical Programming, ed. P. Huard, Mathematical Programming Study 10, North-Holland, Amsterdam (1979).
J.M. Borwein, "A Note on Perfect Duality and Limiting Lagrangians," Mathematical Programming 18(3) pp. 330-337 (1980).
J. Bracken and J.T. McGill, "Mathematical Programs with Optimization Problems in the Constraints," Operations Research 21 pp. 37-44 (1973).
A. Brondsted and R.T. Rockafellar, "On the Subdifferentiability of Convex Functions," Proceedings of the American Mathematical Society 16 pp. 605-611 (1965).
A. Brondsted, "On the Subdifferential of the Supremum of Two Convex Functions," Mathematica Scandinavica 31 pp. 225-230 (1972).
S.L. Brumelle, "Convex Operators and Supports," Mathematics of Operations Research 3(2) pp. 171-175 (1978).
S. Chandra and M. Chandramohan, "Duality in Mixed Integer Nonconvex and Nondifferentiable Programming," Zeitschrift für Angewandte Mathematik und Mechanik 59(4) pp. 205-209 (1979).
F.H. Clarke, "Generalized Gradients and Applications," Transactions of the American Mathematical Society 205 pp. 247-262 (1975).
F.H. Clarke, "A New Approach to Lagrange Multipliers," Mathematics of Operations Research 1(2) pp. 165-174 (1976).
F.H. Clarke, "Optimal Control and the True Hamiltonian," SIAM Review 21(2) pp. 157-166 (1979).
F.H. Clarke, "Generalized Gradients of Lipschitz Functions," Advances in Mathematics 40(1) pp. 52-67 (1980).
F.H. Clarke, "Nonsmooth Analysis and Optimization," pp. 847-853 in Proceedings of the International Congress of Mathematicians (Helsinki, 1978), Finnish Academy of Sciences, Helsinki (1980).
J.P. Crouzeix, "Conditions for Convexity of Quasiconvex Functions," Mathematics of Operations Research 5(1) pp. 120-125 (1980).
J.P. Crouzeix, "Some Differentiability Properties of Quasiconvex Functions on R^n," pp. 9-20 in Optimization and Optimal Control, ed. A. Auslender, W. Oettli, and J. Stoer, Lecture Notes in Control and Information Science, Vol. 30, Springer (1981).
J.M. Danskin, "The Theory of Max-Min with Applications," SIAM Journal on Applied Mathematics 14 pp. 641-664 (1966).
V.F. Demyanov and V.N. Malozemov, "The Theory of Nonlinear Minimax Problems (in Russian)," Uspekhi Matematicheskikh Nauk 26 pp. 53-104 (1971).
V.F. Demyanov, "Second Order Directional Derivatives of a Function of the Maximum (in Russian)," Kibernetika 9(5) pp. 67-69 (1973). English translation: Cybernetics Vol. 9(5) pp. 797-800.
V.F. Demyanov and V.K. Somesova, "Conditional Subdifferential of Convex Functions (in Russian)," Doklady Akademii Nauk SSSR 242(4) pp. 753-756 (1978).
V.F. Demyanov and A.M. Rubinov, "On Quasidifferentiable Functionals (in Russian)," Doklady Akademii Nauk SSSR 250(1) pp. 21-25 (1980).
V.F. Demyanov, "On the Relation between the Clarke Subdifferential and the Quasidifferential (in Russian)," Vestnik Leningradskogo Universiteta 13 pp. 18-24 (1980). English translation: Vestnik Leningrad Universitet Vol. 13 pp. 183-189.
V.F. Demyanov and A.M. Rubinov, "Some Approaches to Nonsmooth Optimization Problems (in Russian)," Ekonomika i Matematicheskie Metody 17(6) pp. 1153-1174 (1981).
S. Dolecki and S. Rolewicz, "Exact Penalty for Local Minima," SIAM Journal on Control and Optimization 17 pp. 596-606 (1979).
A.Y. Dubovitskii and A.A. Milyutin, "Extremum Problems in the Presence of Constraints (in Russian)," Doklady Akademii Nauk SSSR 149 pp. 759-761 (1963). English translation: Soviet Mathematics Doklady Vol. 4 pp. 452-455 (1963).
A.Y. Dubovitskii and A.A. Milyutin, "Extremum Problems in the Presence of Restrictions (in Russian)," Zurnal Vycislitel'noi Matematiki i Matematiceskoi Fiziki 5 pp. 395-453 (1965). English translation: USSR Computational Mathematics and Mathematical Physics Vol. 5.
Pham Canh Duong and Hoang Tuy, "Stability, Surjectivity and Local Invertibility of Nondifferentiable Mappings," Acta Mathematica Vietnamica 3(1) pp. 89-105 (1978).
J.J.M. Evers and H. van Maaren, "Duality Principles in Mathematics and Their Relations to Conjugate Functions," Technical Report, Department of Applied Mathematics, Technische Hogeschool Twente, Enschede, The Netherlands (1981).
W. Fenchel, "On Conjugate Convex Functions," Canadian Journal of Mathematics 1 pp. 73-77 (1949).
R. Fletcher and G.A. Watson, "First and Second Order Conditions for a Class of Nondifferentiable Optimization Problems," Mathematical Programming 18(3) pp. 291-307 (1980).
J. Gauvin and J.W. Tolle, "Differential Stability in Nonlinear Programming," SIAM Journal on Control and Optimization 15(2) pp. 294-311 (1977).
J. Gauvin, "The Generalized Gradient of a Marginal Function in Mathematical Programming," Mathematics of Operations Research 4(4) pp. 458-463 (1979).
E.G. Golstein and N.V. Tretyakov, "Modified Lagrangians in Convex Programming and Their Generalizations," pp. 86-97 in Point-to-Set Maps in Mathematical Programming, ed. P. Huard, Mathematical Programming Study 10, North-Holland, Amsterdam (1979).
B.D. Craven, "Implicit Function Theorems and Lagrange Multipliers," Numerical Functional Analysis and Optimization 2 pp. 473-486 (1980).
R.C. Grinold, "Lagrangian Subgradients," Management Science 17 pp. 185-188 (1970).
A.M. Gupal, "On the Properties of Functions Satisfying the Local Lipschitz Condition (in Russian)," Kibernetika 14(4) pp. 140-142 (1978).
J. Gwinner, "Closed Images of Convex Multivalued Mappings in Linear Topological Spaces with Applications," Journal of Mathematical Analysis and Applications 60(1) pp. 75-86 (1977).
J . Gwmner, "Contribution a la Programmation non Differentiable dans des
Espaces Vectoriels Topologiques," C o m p t e s Rend- de 1'Academi.e d e s Sci-
e n c e s (Par is ) , Snrie A 289(10) p. 523 ( 1979).
J. Gwinner, "On Optimality Conditions for Infinite Programs," pp. 21-28 in
Optimization and Optimal Control, ed. A. Auslender, W. Oettli, and
J. Stoer, Proceedings of a Conference held at Oberwolfach, March 16-22,
1980, Lecture Notes in Control and Information Science, Vol. 30, Springer
(1981).
H. Halkin, "The Method of Dubovitskii-Milyutin in Mathematical Program-
ming," pp. 1-12 in Symposium on Optimization and Stability Problems in
Continuum Mechanics, 24 August 1971, Los Angeles, Springer (1973).
H. Halkin, "Implicit Functions and Optimization Problems without Continu-
ous Differentiability of the Data," SIAM Journal on Control 12(2) pp. 229-236
(1974).
H. Halkin, "Mathematical Programming without Differentiability," pp. 279-
288 in Calculus of Variations and Control Theory, ed. D.L. Russell, Academic
Press, New York (1976).
H. Halkin, "Necessary Conditions for Optimal Control Problems with
Differentiable and Nondifferentiable Data," pp. 77-118 in Mathematical Con-
trol Theory, ed. W.A. Coppel, Lecture Notes in Mathematics, Vol. 680,
Springer, Berlin (1978).
W. Heins and S.K. Mitter, "Conjugate Convex Functions, Duality, and Optimal
Control Problems. I. Systems Governed by Ordinary Differential Equations,"
Information Sciences 2(2) pp. 211-243 (1970).
J.-B. Hiriart-Urruty, "Contribution à la Programmation Mathématique: Cas
Déterministe et Stochastique," Thèse, Série E, No. 247, Université de
Clermont-Ferrand, Clermont-Ferrand, France (1977).
J.-B. Hiriart-Urruty, "Generalized Gradients of Marginal Value Functions,"
SIAM Journal on Control and Optimization 16 pp. 301-316 (1978).
J.-B. Hiriart-Urruty, "Refinements of Necessary Optimality Conditions in
Nondifferentiable Programming," Applied Mathematics and Optimization
5(1) pp. 63-82 (1979).
J.-B. Hiriart-Urruty, "Tangent Cones, Generalized Gradients and Mathemati-
cal Programming in Banach Spaces," Mathematics of Operations Research
4(1) pp. 79-97 (1979).
J.-B. Hiriart-Urruty, "New Concepts in Nondifferentiable Programming,"
Bulletin de la Société Mathématique de France 60 pp. 57-85 (1979). [Also in
Journées d'Analyse Non Convexe, May 1980]
J.-B. Hiriart-Urruty, "Lipschitz r-Continuity of the Approximate
Subdifferential of a Convex Function," Mathematica Scandinavica 47 pp.
123-134 (1980).
J.-B. Hiriart-Urruty, "Extension of Lipschitz Functions," Journal of
Mathematical Analysis and Applications 77(2) pp. 539-554 (1980).
J.-B. Hiriart-Urruty, "ε-Subdifferential Calculus," in Proceedings of the Col-
loquium "Convex Analysis and Optimization", 28-29 February 1980,
Imperial College, London (forthcoming).
W.W. Hogan, "Directional Derivatives of Extremal-Value Functions with Appli-
cations to the Completely Convex Case," Operations Research 21(1) pp.
188-209 (1973).
W.W. Hogan, "Point-to-Set Maps in Mathematical Programming," SIAM
Review 15 pp. 591-603 (1973).
A.D. Ioffe and V.L. Levin, "Subdifferentials of Convex Functions (in Rus-
sian)," Transactions of the Moscow Mathematical Society 26 pp. 1-72
(1972).
A.D. Ioffe, "Nonsmooth Analysis: Differential Calculus of Nondifferentiable
Mappings," Transactions of the American Mathematical Society 266(1) pp.
1-56 (1981).
V. Kankova, "Differentiability of the Optimal Function in a Two-Stage Sto-
chastic Nonlinear Programming Problem (in Czech)," Ekonomicko-
Matematický Obzor 14(3) pp. 322-330 (1978).
D. Klatte, "On the Lower Semicontinuity of Optimal Sets in Convex Program-
ming," pp. 104-109 in Point-to-Set Maps in Mathematical Programming, ed.
P. Huard, Mathematical Programming Study 10, North-Holland, Amsterdam
(1979).
K.O. Kortanek, "Perfect Duality in Semi-Infinite and Generalized Convex
Programming," Operations Research Verfahren 25 pp. 79-88 (1977).
C. Lemarechal and E.A. Nurminski, "Differentiability of a Support Function
of an ε-Subgradient," WP-80-101, International Institute for Applied Sys-
tems Analysis, Laxenburg, Austria (1980).
C. Lemarechal and E.A. Nurminski, "Sur la Différentiabilité de la Fonction
d'Appui du Sous-Différentiel Approché," Comptes Rendus de l'Académie des
Sciences (Paris), Série A 290 pp. 855-858 (1980).
A.V. Levitin, "A Generalization of the Property of Strong Convexity and Gen-
eral Theorems on the Gradient Method for Minimization of Functionals (in
Russian)," pp. 112-120 in Mathematical Methods for Solution of Economic
Problems, ed. N.P. Fedorenko and E.G. Golstein, Nauka, Moscow (1977).
E.S. Levitin and B.S. Darhovski, "Quadratic Optimality Conditions for a Class
of Nonsmooth Problems of Mathematical Programming (in Russian)," Dok-
lady Akademii Nauk SSSR 244(2) pp. 270-273 (1979).
E.S. Levitin, "Second-Order Conditions in Nonsmooth Mathematical Pro-
gramming Problems Applied to the Minimax with Bound Variables (in Rus-
sian)," Doklady Akademii Nauk SSSR 244(2) pp. 288-290 (1979).
P.O. Lindberg, "A Generalization of Fenchel Conjugation Giving Generalized
Lagrangians and Symmetric Nonconvex Duality," in Survey of Mathematical
Programming (Proceedings of the 9th International Mathematical Program-
ming Symposium), ed. A. Prekopa, North-Holland, Amsterdam - Akademiai
Kiado (1979).
P.O. Lindberg, "A Relaxational Framework for Duality," Technical Report
TRITA-MAT-1981-12, Department of Mathematics, Royal Institute of Technol-
ogy, Stockholm (1981).
C. Malivert, J.-P. Penot, and M. Thera, "Minimisation d'une Fonction Régu-
lière sur un Fermé non Régulier et Non-Convexe d'un Espace de Hilbert,"
Comptes Rendus de l'Académie des Sciences (Paris), Série A 286 pp. 165-
168 (1978).
O. Mangasarian, "Uniqueness of Solution in Linear Programming," Linear
Algebra and its Applications 25 pp. 152-161 (1979).
H. Massam and S. Zlobec, "Various Definitions of the Derivative in
Mathematical Programming," Mathematical Programming 7(2) pp. 144-161
(1974).
E.J. McShane, "Extension of Range of Functions," Bulletin of the American
Mathematical Society 40 pp. 837-842 (1934).
R.A. Minch, "Applications of Symmetric Derivatives in Mathematical Pro-
gramming," Mathematical Programming 1 pp. 307-321 (1971).
J.J. Moreau, "Fonctionnelles Sous-Différentiables," Comptes Rendus de
l'Académie des Sciences (Paris), Série A 257 pp. 4117-4119 (1963).
J.J. Moreau, "Semi-Continuité du Sous-Gradient d'une Fonctionnelle,"
Comptes Rendus de l'Académie des Sciences (Paris), Série A 260 pp. 1067-
1070 (1965).
E.A. Nurminski, "On the Continuity of ε-Subdifferential Mappings (in Rus-
sian)," Kibernetika 13(5) pp. 148-149 (1977). English translation: Cybernet-
ics 13(5) pp. 790-791 (1978).
W. Oettli, "Symmetric Duality, and a Convergent Subgradient Method for
Discrete, Linear, Constrained Approximation Problems with Arbitrary
Norms Appearing in the Objective Function and in the Constraints," Journal
of Approximation Theory 14(1) (1975).
J.-P. Penot, "Calcul Sous-Différentiel et Optimisation," Journal of Func-
tional Analysis 27(2) pp. 248-276 (1978).
J.-P. Penot, "The Use of Generalized Subdifferential Calculus in Optimization
Theory," Operations Research Verfahren 31 pp. 495-511 (1979).
J.-P. Penot, "A Characterization of Tangential Regularity," Nonlinear
Analysis, Theory, Methods and Applications 5(6) pp. 625-643 (1981).
J.-P. Penot, "On the Existence of Multipliers in Mathematical Programming
in Banach Spaces," pp. 89-104 in Optimization and Optimal Control, ed.
A. Auslender, W. Oettli, and J. Stoer, Proceedings of a Conference held at
Oberwolfach, March 16-22, 1980, Lecture Notes in Control and Information
Science, Vol. 30, Springer (1981).
B.N. Pshenichniy, "Leçons sur les Jeux Différentiels," in Contrôle Optimal et
Jeux Différentiels, Vol. 4, I.N.R.I.A., Le Chesnay, France (1971).
B.N. Pshenichniy, "Necessary Extremum Conditions for Differential Inclu-
sions (in Russian)," Kibernetika 12(6) pp. 60-78 (1976).
B.N. Pshenichniy, "On Necessary Extremum Conditions for Nonsmooth
Functions (in Russian)," Kibernetika 13(6) pp. 92-96 (1977).
S.M. Robinson and R.R. Meyer, "Lower Semicontinuity of Multivalued Lineari-
zation Mappings," SIAM Journal on Control 11(3) pp. 525-533 (1973).
S.M. Robinson and R.H. Day, "A Sufficient Condition for Continuity of
Optimal Sets in Mathematical Programming," Journal of Mathematical
Analysis and Applications 45(2) pp. 506-511 (1974).
S.M. Robinson, "Regularity and Stability of Convex Multivalued Functions,"
Mathematics of Operations Research 1(2) pp. 130-143 (1976).
S.M. Robinson, "A Characterization of Stability in Linear Programming,"
Operations Research 25(3) pp. 435-447 (1977).
S.M. Robinson, "Generalized Equations and Their Solutions, Part I: Basic
Theory," pp. 128-141 in Point-to-Set Maps in Mathematical Programming,
ed. P. Huard, Mathematical Programming Study 10, North-Holland, Amster-
dam (1979).
S.M. Robinson, "Some Continuity Properties of Polyhedral Multifunctions,"
pp. 206-214 in Mathematical Programming at Oberwolfach, ed. H. König, B.
Korte, and K. Ritter, Mathematical Programming Study 14, North-Holland,
Amsterdam (1981).
R.T. Rockafellar, "Conjugate Convex Functions in Optimal Control and the
Calculus of Variations," Journal of Mathematical Analysis and Applications
32 pp. 174-222 (1970).
R.T. Rockafellar, "Augmented Lagrange Multiplier Functions and Duality in
Nonconvex Programming," SIAM Journal on Control 12 pp. 268-285 (1974).
R.T. Rockafellar and R.J.-B. Wets, "Measures as Lagrange Multipliers in Mul-
tistage Stochastic Programming," Journal of Mathematical Analysis and
Applications 60(2) pp. 301-313 (1977).
R.T. Rockafellar, "Higher Derivatives of Conjugate Convex Functions," Inter-
national Journal of Applied Analysis 1(1) pp. 41-43 (1977).
R.T. Rockafellar, "Clarke's Tangent Cones and the Boundaries of Closed Sets
in R^n," Nonlinear Analysis 3(1) pp. 145-154 (1979).
R.T. Rockafellar, "Directionally Lipschitzian Functions and Subdifferential
Calculus," Proceedings of the London Mathematical Society 39(3) pp. 331-
355 (1979).
R.T. Rockafellar and J.E. Spingarn, "The Generic Nature of Optimality Condi-
tions in Nonlinear Programming," Mathematics of Operations Research
4(4) pp. 425-430 (1979).
R.T. Rockafellar, "Generalized Directional Derivatives and Subgradients of
Nonconvex Functions," Canadian Journal of Mathematics 32 pp. 257-280
(1980).
R.T. Rockafellar, "Favorable Classes of Lipschitz Continuous Functions in
Subgradient Optimization," WP-81-1, International Institute for Applied Sys-
tems Analysis, Laxenburg, Austria (1981).
R.T. Rockafellar, "Augmented Lagrangians and Marginal Values in
Parameterized Optimization Problems," WP-81-19, International Institute
for Applied Systems Analysis, Laxenburg, Austria (1981).
S. Rolewicz, "On Conditions Warranting Φ2-Subdifferentiability," pp. 215-
224 in Mathematical Programming at Oberwolfach, ed. H. König, B. Korte,
and K. Ritter, Mathematical Programming Study 14, North-Holland, Amster-
dam (1981).
G. Salinetti and R.J.-B. Wets, "On the Convergence of Sequences of Convex
Sets in Finite Dimensions," SIAM Review 21 pp. 18-33 (1979).
M. Schechter, "More on Subgradient Duality," Journal of Mathematical
Analysis and Applications 71(1) pp. 251-281 (1979).
C. Singh, "A System of Inequalities and Nondifferentiable Mathematical Pro-
gramming," Journal of Optimization Theory and Applications 27(2) pp.
291-299 (1979).
J.E. Spingarn, "Submonotone Subdifferentials of Lipschitz Functions," Tran-
sactions of the American Mathematical Society 264(1) pp. 77-89 (1981).
J.E. Spingarn, "On Optimality Conditions for Structured Families of Non-
linear Programming Problems," Mathematical Programming 22(1) pp. 82-
92 (1982).
S. Tanimoto, "Nondifferentiable Mathematical Programming and Convex-
Concave Functions," Journal of Optimization Theory and Applications
31(3) pp. 331-342 (1980).
L. Thibault, "Subdifferentiability of Compactly Lipschitzian Vector-Valued
Functions," Travaux du Séminaire d'Analyse Convexe 8(5) pp. 1-54 (1978).
L. Thibault, "Subdifferentiability of Compactly Lipschitzian Mappings,"
Operations Research Verfahren 31 pp. 637-645 (1979).
Hoang Tuy, "Stability Property of a System of Inequalities," Acta Mathema-
tica Vietnamica 2(1) pp. 3-18 (1977).
J.-P. Vial, "Strong Convexity of Sets and Functions," Discussion Paper 8018,
Center for Operations Research and Econometrics, Université Catholique de
Louvain (1980).
J. Warga, "Derivatives, Containers, Inverse Functions and Controllability,"
pp. 13-46 in Calculus of Variations and Control Theory, ed. D.L.
Russell, Academic Press, New York (1976).
J. Warga, "Controllability and a Multiplier Rule for Nondifferentiable Optimi-
zation Problems," SIAM Journal on Control and Optimization 16 pp. 803-
812 (1978).
R.J.-B. Wets, "Stochastic Programs with Fixed Recourse: The Equivalent
Deterministic Program," SIAM Review 16(3) pp. 309-339 (1974).
R.J.-B. Wets, "Convergence of Convex Functions, Variational Inequalities and
Convex Optimization Problems," pp. 378-403 in Variational Inequalities and
Complementarity Problems, ed. R.W. Cottle, F. Giannessi, and J.-L.
Lions, Wiley, Chichester (1980).
A.P. Wierzbicki, "Maximum Principle for Semismooth Performance Func-
tionals," SIAM Journal on Control 10(3) pp. 444-459 (1972).
A.P. Wierzbicki and S. Kurcyusz, "Projection on a Cone: Generalized Penalty
Functions and Duality Theory for Problems with Inequality Constraints in
Hilbert Space," SIAM Journal on Control and Optimization 15 pp. 25-56
(1977).
V.I. Zabotin and Yu.A. Polonsky, "Preconvex Sets, Mappings and Their Appli-
cations to Extreme-Value Problems (in Russian)," Kibernetika 17(1) pp. 71-
74 (1981).
S. Zlobec and B. Mond, "Duality for Nondifferentiable Programming without
Constraint Qualification," Utilitas Mathematica 15 pp. 291-302 (1979).
S. Zlobec and A. Ben-Israel, "Perturbed Convex Programs: Continuity of
Optimal Solutions and Optimal Values," Operations Research Verfahren
31 pp. 737-749 (1979).