NUMERICAL ALGORITHMS FOR
INVERSE EIGENVALUE PROBLEMS ARISING
IN CONTROL AND NONNEGATIVE MATRICES
By
Kaiyang Yang
SUBMITTED IN PARTIAL FULFILLMENT OF THE
REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
AT
THE AUSTRALIAN NATIONAL UNIVERSITY
CANBERRA, AUSTRALIA
OCTOBER 2006
© Copyright by Kaiyang Yang, October 2006
THE AUSTRALIAN NATIONAL UNIVERSITY
DEPARTMENT OF
INFORMATION ENGINEERING, RSISE
The undersigned hereby certify that they have read and recommend
to the Research School of Information Sciences and Engineering for
acceptance a thesis entitled “Numerical Algorithms for Inverse
Eigenvalue Problems Arising in Control and Nonnegative
Matrices” by Kaiyang Yang in partial fulfillment of the requirements
for the degree of Doctor of Philosophy.
Dated: October 2006
Research Supervisors: Prof. John B. Moore
Dr. Robert Orsi
Examining Committee: Prof. Iven Mareels
Prof. Andrew Lim
THE AUSTRALIAN NATIONAL UNIVERSITY
Date: October 2006
Author: Kaiyang Yang
Title: Numerical Algorithms for Inverse Eigenvalue
Problems Arising in Control and Nonnegative
Matrices
Department: Information Engineering, RSISE
Degree: Ph.D.
Permission is herewith granted to The Australian National University to circulate and to have copied for non-commercial purposes, at its discretion, the above title upon the request of individuals or institutions.
Signature of Author
THE AUTHOR RESERVES OTHER PUBLICATION RIGHTS, AND NEITHER THE THESIS NOR EXTENSIVE EXTRACTS FROM IT MAY BE PRINTED OR OTHERWISE REPRODUCED WITHOUT THE AUTHOR'S WRITTEN PERMISSION.
THE AUTHOR ATTESTS THAT PERMISSION HAS BEEN OBTAINED FOR THE USE OF ANY COPYRIGHTED MATERIAL APPEARING IN THIS THESIS (OTHER THAN BRIEF EXCERPTS REQUIRING ONLY PROPER ACKNOWLEDGEMENT IN SCHOLARLY WRITING) AND THAT ALL SUCH USE IS CLEARLY ACKNOWLEDGED.
To Bojiu Yang and Ping Sun, My Parents
and Hongsen Zhang, My Husband.
Table of Contents
Table of Contents v
List of Figures ix
List of Symbols xi
Statement of Originality xii
Acknowledgements xiv
Abstract xvi
I Introduction 1
1 Introduction 2
1.1 Inverse Eigenvalue Problems . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.2 Problems Arising in Control . . . . . . . . . . . . . . . . . . . 4
1.1.3 Problems Arising in Nonnegative Matrices . . . . . . . . . . . 4
1.2 Research Motivations and Contributions . . . . . . . . . . . . . . . . 5
1.2.1 Motivations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2.2 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3 Outline of the Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
II Background 11
2 Projections 13
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2 Projections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.3 Alternating Projections . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3 Computational Complexity 20
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.2 What is NP-Hard? . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.3 Computational Complexity in Control . . . . . . . . . . . . . . . . . 22
3.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
III Problems Arising in Control 23
4 A Projective Methodology for Generalized Pole Placement 27
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.2 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.2.1 The Symmetric Problem . . . . . . . . . . . . . . . . . . . . . 32
4.2.2 The General Nonsymmetric Problem . . . . . . . . . . . . . . 36
4.3 Computational Results . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.3.1 Classical Pole Placement: Random Problems . . . . . . . . . . 43
4.3.2 Classical Pole Placement: Particular Problem . . . . . . . . . 44
4.3.3 Continuous Time Stabilization: Random Problems . . . . . . 45
4.3.4 Continuous Time Stabilization: Particular Problem . . . . . . 47
4.3.5 Discrete Time Stabilization: Random Problems . . . . . . . . 49
4.3.6 Discrete Time Stabilization: Particular Problem . . . . . . . . 50
4.3.7 A Hybrid Problem . . . . . . . . . . . . . . . . . . . . . . . . 51
4.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
5 A Projective Methodology for Simultaneous Stabilization and De-
centralized Control 53
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
5.2 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.2.1 Simultaneous Stabilization . . . . . . . . . . . . . . . . . . . . 56
5.2.2 Decentralized Control . . . . . . . . . . . . . . . . . . . . . . . 59
5.3 Computational Results . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.3.1 Simultaneous Stabilization: Random Problems . . . . . . . . . 61
5.3.2 Simultaneous Stabilization: Particular Problems . . . . . . . . 63
5.3.3 Decentralized Control: Random Problems . . . . . . . . . . . 63
5.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
6 Trust Region Methods for Classical Pole Placement 66
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
6.2 Trust Region Methods . . . . . . . . . . . . . . . . . . . . . . . . . . 68
6.2.1 Basic Methodology . . . . . . . . . . . . . . . . . . . . . . . . 69
6.2.2 Convergence Results . . . . . . . . . . . . . . . . . . . . . . . 72
6.3 Derivative Calculations . . . . . . . . . . . . . . . . . . . . . . . . . . 74
6.4 Additional Comments . . . . . . . . . . . . . . . . . . . . . . . . . . 75
6.5 Computational Results . . . . . . . . . . . . . . . . . . . . . . . . . . 77
6.5.1 Random Problems . . . . . . . . . . . . . . . . . . . . . . . . 77
6.5.2 Particular Problems . . . . . . . . . . . . . . . . . . . . . . . 79
6.5.3 Repeated Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . 81
6.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
7 A Gauss-Newton Method for Classical Pole Placement 84
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
7.2 The Gauss-Newton Method . . . . . . . . . . . . . . . . . . . . . . . 87
7.3 Computational Results . . . . . . . . . . . . . . . . . . . . . . . . . . 89
7.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
IV Problems Arising in Nonnegative Matrices 92
8 A Projective Methodology for Nonnegative Inverse Eigenvalue Prob-
lems 94
8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
8.2 The Symmetric Problem . . . . . . . . . . . . . . . . . . . . . . . . . 98
8.3 The General Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
8.4 Computational Results . . . . . . . . . . . . . . . . . . . . . . . . . . 108
8.4.1 SNIEP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
8.4.2 NIEP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
8.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
9 Newton Type Methods for Nonnegative Inverse Eigenvalue Prob-
lems 116
9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
9.2 Newton Type Methods . . . . . . . . . . . . . . . . . . . . . . . . . . 118
9.3 Derivative Calculations . . . . . . . . . . . . . . . . . . . . . . . . . . 119
9.4 Computational Results . . . . . . . . . . . . . . . . . . . . . . . . . . 120
9.4.1 SNIEP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
9.4.2 NIEP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
9.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
V Conclusion and Future Work 126
10 Conclusion and Future Work 127
10.1 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
10.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
A Results for Classical Pole Placement 132
B More Computational Results for Stabilization 134
Bibliography 137
List of Figures
1.1 Flow chart of the thesis. . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.1 Projection point onto convex set is unique. . . . . . . . . . . . . . . . 14
2.2 Projection points onto nonconvex set may be multiple. . . . . . . . . 15
2.3 Alternating projections between two intersecting convex sets. . . . . . 17
2.4 Alternating projections between a convex set and a nonconvex set in-
tersecting each other. . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.1 Illustration of simultaneous stabilization. . . . . . . . . . . . . . . . . 25
3.2 Illustration of decentralized control. . . . . . . . . . . . . . . . . . . . 25
4.1 Examples of generalized static output feedback pole placement problem. 30
4.2 Alternating projections for generalized static output feedback pole
placement problems. . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
4.3 Performance for classical pole placement using up to 10 initial conditions. 44
4.4 Performance for discrete time stabilization using up to 10 initial con-
ditions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.5 The closed loop poles corresponding to a solution for the considered
hybrid problem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
6.1 Quadratic convergence near solution of the Levenberg-Marquardt al-
gorithm. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
7.1 Quadratic convergence near solution of the Gauss-Newton algorithm. 91
8.1 Illustration of the problem formulation for SNIEP. . . . . . . . . . . . 99
8.2 Illustration of the problem formulation for NIEP. . . . . . . . . . . . 104
8.3 Linear convergence of the SNIEP algorithm. . . . . . . . . . . . . . . 110
9.1 Quadratic convergence near solution of the SNIEP algorithm. . . . . 122
9.2 Quadratic convergence near solution of the NIEP algorithm. . . . . . 125
List of Symbols
R the set of real numbers.
C the set of complex numbers.
R^{m×p} the set of real m×p matrices.
C^{m×p} the set of complex m×p matrices.
O_n the set of orthogonal n×n matrices.
S_n the set of real symmetric n×n matrices.
S_n^+ the set of real symmetric positive semidefinite n×n matrices.
A^T the transpose of the matrix A.
A^* the complex conjugate transpose of the matrix A.
tr(A) the sum of the diagonal elements of a square matrix A.
λ(A) the set of eigenvalues of a matrix A ∈ R^{n×n}.
ρ(A) the maximum of the real parts of the eigenvalues of A ∈ R^{n×n}.
diag(v) the n×n diagonal matrix, for v ∈ C^n, whose i-th diagonal entry is v_i.
Re(a) the real part of a ∈ C.
Im(a) the imaginary part of a ∈ C.
vec(A) the vector in C^{mp} consisting of the columns of A ∈ C^{m×p} stacked below each other.
A ⊗ B the Kronecker product of A and B.
‖·‖_2 the vector 2-norm.
‖·‖_F the Frobenius norm of a matrix. (Where no confusion can arise, the subscript F is omitted.)
Statement of Originality
I hereby declare that this submission is my own work, in collaboration with others,
while enrolled as a PhD candidate at the Department of Information Engineering,
Research School of Information Sciences and Engineering, the Australian National
University. To the best of my knowledge and belief, it contains no material previously
published or written by another person nor material which to a substantial extent
has been accepted for the award of any other degree or diploma of the university or
other institute of higher learning, except where due acknowledgement has been made
in the text.
Most of the technical discussions in this thesis are based on the following publi-
cations:
• K. Yang, J. B. Moore, and Robert Orsi. Gauss-Newton Method for Solving
Static Output Feedback Pole Placement Problem. In preparation.
• K. Yang, R. Orsi, and J. B. Moore. Newton Type Methods for Solving Inverse
Eigenvalue Problems for Nonnegative Matrices. In preparation.
• K. Yang and R. Orsi. Simultaneous Stabilization and Decentralized Control: a
Projective Methodology. In preparation.
• K. Yang and R. Orsi. Static Output Feedback Pole Placement via a Trust
Region Approach. Submitted to IEEE Transactions on Automatic Control.
• K. Yang and R. Orsi. Generalized Pole Placement via Static Output Feedback:
a Methodology Based on Projections. Automatica, 42 (12), 2006.
• R. Orsi and K. Yang. Numerical Methods for Solving Inverse Eigenvalue Problems for Nonnegative Matrices. In Proceedings of the 17th International Symposium on Mathematical Theory of Networks and Systems (MTNS), pp. 2284–2290, Kyoto, Japan, 2006.
• K. Yang and R. Orsi. Pole Placement via Output Feedback: a Methodology
Based on Projections. In Proceedings of the 16th IFAC World Congress, 6 pages,
Prague, Czech Republic, 2005.
• K. Yang, R. Orsi, and J. B. Moore. A Projective Algorithm for Static Output
Feedback Stabilization. In Proceedings of the 2nd IFAC Symposium on System,
Structure and Control (SSSC), pp. 263–268, Oaxaca, Mexico, 2004.
Acknowledgements
I express my deepest gratitude to Professor John B. Moore and Dr. Robert Orsi for
being my supervisors and offering me so much guidance and help throughout this
research. John's invaluable help, guidance and insights greatly influenced me. His optimistic attitude towards both research and life impressed me deeply and will always encourage me in later life. Dr. Robert Orsi has been my supervisor and the person I worked with most closely. His meticulous attitude to scientific research, broad knowledge, invaluable day-to-day supervision and great patience are highly appreciated. His encouragement will support me in going further in future research.
My special thanks go to Dr. Robert Mahony for being my advisor. He advised
me on some key issues in my research and offered me the chance to do tutoring which
was an important and cherished component of my PhD training.
Professor Uwe Helmke of the University of Würzburg, Germany, brought the nonnegative inverse eigenvalue problems to our attention, which led to a successful topic in this research. I would like to thank him for his generous guidance.
Professor Iven Mareels of the University of Melbourne, Australia, advised us on various parts of the literature for our research. Only with that help could we gain a deep understanding of the prior research. I would like to thank him for his important help.
I am grateful to Dr. Mei Kobayashi, Mr. Hiroyuki Okano and Mr. Toshinari Itoko
for their collaboration and help during my visit to IBM Tokyo Research Laboratory,
which made my three months’ stay very fruitful and enjoyable.
The Department of Information Engineering of the Australian National University and
the SEACS group in National ICT Australia (NICTA) offered me a warm environment throughout my PhD. I especially thank Dr. Knut Huper, our program leader in NICTA, for his help and guidance. I also especially thank Dr. Jochen Trumpf and Dr. Alexander Lanzon for their generous help and many discussions. My special thanks go to all our departmental staff and friends for their invaluable friendship.
Finally, and most importantly, I thank my family for their endless love and support. My parents dedicated all their love and effort to my twenty years of education. They have always supported my decisions to pursue my dreams, even when that meant being 10,000 kilometers away. They keep my heart warm and my belief firm.
My husband, Hongsen Zhang, is the most amazing person in my life. He gave
me all his love, support, encouragement, understanding and patience during my hard
work. Only with his unfailing love could I possess the courage to go further in my
life and work.
Abstract
An Inverse Eigenvalue Problem (IEP) is to construct a matrix which possesses both prescribed eigenvalues and a desired structure. Inverse eigenvalue problems arise in broad application areas such as control design, system identification, principal component analysis and structural analysis. There are many different types of inverse eigenvalue problems and, despite a great deal of research effort being put into this topic, many of them are still open and hard to solve.
In this dissertation, we propose optimization algorithms for solving two types
of inverse eigenvalue problems, namely, the static output feedback problems and the
nonnegative inverse eigenvalue problems. Consequently, this dissertation is essentially
composed of two parts.
In the first part, three novel methodologies for solving various static output feed-
back pole placement problems are presented. The static output feedback pole place-
ment framework encompasses various pole placement problems. Some of them are
NP-hard, for example, classical pole placement [31]. That is, an efficient (i.e. poly-
nomial time) algorithm that is able to correctly solve all instances of the problem
cannot be expected. In this dissertation, a projective methodology, two trust region
methods and a Gauss-Newton method are proposed to solve various instances of the
pole placement problems.
In the second part, two novel methodologies for solving nonnegative/stochastic inverse eigenvalue problems are presented. Nonnegative matrices arise in many application areas and attract a great deal of research in the matrix analysis community. The stochastic
inverse eigenvalue problem has potential applications in Markov chains and probability theory. In the small dimensional cases, i.e., when the dimension of the resulting matrix is less than or equal to 5, there exist necessary and sufficient conditions that fully characterize the problem. However, as the dimension grows larger, the problem becomes much harder to solve. The existing necessary conditions are too general and the sufficient conditions are too specific, and in general the proofs of the sufficient conditions are nonconstructive. In this dissertation, a projective methodology and two Newton type methods are proposed which are widely applicable to various nonnegative inverse eigenvalue problems.
All of the problems considered are important and challenging in their areas. The optimization methodologies are clearly stated and the algorithms are intensively tested. Beyond the problems solved in this thesis, the algorithms appear to be useful for many more related problems, e.g., inverse eigenvalue problems subject to different structural constraints.
Part I
Introduction
Chapter 1
Introduction
1.1 Inverse Eigenvalue Problems
1.1.1 Background
The spectral properties of a physical system govern its dynamical performance. Hence the computation of eigenvalues enables a basic understanding of the underlying physical system. Conversely, an inverse eigenvalue problem is to reconstruct a physical system from its desired dynamical behavior, namely, its eigenvalues. In this research, we concentrate our attention on problems whose systems can be expressed in terms of matrices.
It is clear that an inverse eigenvalue problem is trivially solved if there is no restriction on its structure: we simply construct a diagonal matrix with the desired eigenvalues as the diagonal entries. However, in practice we usually require that the matrix resulting from a specific inverse eigenvalue problem be physically realizable, and thus additional structural constraints are imposed. For example, in static output feedback problems, the closed loop systems form an affine subspace of matrices of the form A + BKC. In general the solution to an inverse eigenvalue problem must satisfy two
constraints – the spectral constraint referring to the prescribed spectral data, and the
structural constraint referring to the desired structure.
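The triviality of the unconstrained problem is easy to see in code (a Python/NumPy sketch; the numbers are arbitrary and the snippet is not from the thesis):

```python
import numpy as np

# Desired spectrum; for a real matrix it must be closed under conjugation,
# which trivially holds for a real list such as this one.
target = np.array([3.0, -1.0, 0.5])

# With no structural constraint, a diagonal matrix with the desired
# eigenvalues on its diagonal solves the problem immediately.
A = np.diag(target)

# The spectral constraint holds by construction.
assert np.allclose(np.sort(np.linalg.eigvals(A)), np.sort(target))
```

All of the difficulty in the problems studied below comes from imposing a structural constraint on top of this spectral one.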
There are various types of inverse eigenvalue problems and they attract a great deal of research. For more details on different problems, existing theoretical results, numerical algorithms, applications and open problems, we refer the reader to the excellent book [20].
Associated with any inverse eigenvalue problem are two fundamental questions: solvability and computability. Solvability concerns determining necessary and/or sufficient conditions under which an inverse eigenvalue problem is solvable. Computability concerns developing efficient and reliable algorithms to construct matrices with the prescribed eigenvalues and desired structure whenever the problem is feasible. Both questions are difficult and challenging.
Inverse eigenvalue problems arise in a remarkable variety of applications, for example, control design, system identification, seismic tomography, principal component analysis, exploration and remote sensing, antenna array processing, geophysics, molecular spectroscopy, particle physics, structural analysis, circuit theory and mechanical system simulation [20].
In this dissertation, we propose optimization algorithms for the inverse eigenvalue problems arising in control and nonnegative matrices. We use both classical methods, i.e. Newton type methods, and novel methods, i.e. alternating projections and trust region methods. What follows describes the problems being considered in greater detail.
1.1.2 Problems Arising in Control
One of the most basic control tasks is the static output feedback pole placement.
That is, given system matrices A ∈ Rn×n, B ∈ Rn×m, C ∈ Rp×n and a list of desired
eigenvalues λD ∈ Cn, find a static output feedback controller K ∈ Rm×p such that
λ(A + BKC) = λD. Static output feedback control problems are a type of special
inverse eigenvalue problem as the structural constraint is the closed loop system has
the form A + BKC. These problems have simple expressions and wide applications.
They have been intensively researched in the past half a century. Despite this, many of
them are still open and some of them are proved to be NP-hard, for example, classical
pole placement and simultaneous stabilization. Development of novel, efficient and
reliable algorithms for these hard problems are certainly of great interest.
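Once a candidate gain K is at hand, checking the spectral constraint is straightforward; finding such a K is the hard part. A Python/NumPy sketch with randomly generated illustrative matrices (not code or data from the thesis):

```python
import numpy as np

# Illustrative dimensions and randomly generated system matrices.
n, m, p = 3, 2, 2
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))

# A candidate static output feedback gain. In general, finding a K that
# places the poles at prescribed locations is the NP-hard part.
K = rng.standard_normal((m, p))

# Closed loop matrix and its spectrum lambda(A + B K C).
closed_loop = A + B @ K @ C
poles = np.linalg.eigvals(closed_loop)
```

Comparing `poles` with the desired list λ_D (up to ordering and conjugate pairing) is then a simple numerical test.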
1.1.3 Problems Arising in Nonnegative Matrices
Nonnegative matrices are those whose entries are nonnegative. Stochastic matrices,
which are a type of special nonnegative matrices, are with each row sum to 1. Non-
negative/stochastic matrices are widely used in game theory, Markov chains, theory
of probability, probabilistic algorithms, discrete distributions, categorical data, group
theory, matrix scaling and economics [20]. Nonnegative inverse eigenvalue problem
is to construct a square nonnegative matrix with desired eigenvalues. When the di-
mension of the desired matrix is greater than 5, there is no necessary and sufficient
condition available. The existing necessary conditions are too general and sufficient
conditions are too specific. In general the proofs of sufficient conditions are noncon-
structive. To the best to our knowledge, there are only a few algorithms in literature
and they are not applicable to high dimensional problems. The development of novel algorithms which are efficient and can be used to solve large scale problems is of great interest.
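A minimal 2×2 instance gives the flavour of the problem: for a desired spectrum {1, λ} with |λ| ≤ 1, a symmetric stochastic solution can be written down in closed form (a Python/NumPy sketch, not taken from the thesis):

```python
import numpy as np

lam = 0.25  # second desired eigenvalue, with |lam| <= 1
a = (1.0 + lam) / 2.0

# Symmetric stochastic matrix with spectrum {1, lam}: its eigenvalues are
# a + (1 - a) = 1 and a - (1 - a) = lam.
M = np.array([[a, 1.0 - a],
              [1.0 - a, a]])

assert np.all(M >= 0)                    # nonnegativity
assert np.allclose(M.sum(axis=1), 1.0)   # each row sums to 1
assert np.allclose(np.linalg.eigvalsh(M), [lam, 1.0])
```

For larger dimensions no such closed form is available in general, which is what motivates the numerical methods of Part IV.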
1.2 Research Motivations and Contributions
1.2.1 Motivations
Having identified the open inverse eigenvalue problems arising in control and nonnegative matrices, and their hardness, our main interest here is to develop new optimization algorithms. In the following, we highlight several reasons why the problems considered are interesting.
1. Pole placement in the generality considered here (see Chapter 4), which allows flexibility in choosing the pole placement regions, has not previously been considered. Since the pole placement regions need not be convex or even connected, there is a broad choice of regions for various control tasks. Though many algorithms for different instances of generalized pole placement have been proposed, none of them unifies the solution methods for the various problems in one framework.
2. Simultaneous stabilization (see Chapter 5) is an important problem in robust control with broad applications. It has been proved to be NP-hard [6]. Decentralized control (see Chapter 5) arises naturally from the need to control large scale systems, where a centralized controller cannot be applied. With a bound on the norm of the controller, the decentralized control problem is NP-hard
[6]. Despite the great deal of work that has been done on them, these problems are not thoroughly solved and keep attracting a great deal of research effort.
3. Classical pole placement (see Chapters 6 and 7) is one of the most important open problems in control design. It has been shown to be NP-hard [31]. Though simply expressed, the problem is hard to solve.
4. Inverse eigenvalue problems for nonnegative/stochastic matrices have attracted much research over the past fifty years. However, there is only one algorithm for symmetric nonnegative inverse eigenvalue problems and another for general nonsymmetric problems. Due to the nature of the existing algorithms, they can hardly solve high dimensional problems. Nonnegative/stochastic inverse eigenvalue problems have broad potential applications.
1.2.2 Contributions
In regard to algorithm development, we mainly apply three methodologies: the projective methodology, Newton type methods and trust region methods. Since each of them has its own advantages, we highlight three key points on the contributions of this work.
• In the projective methodologies (see Chapters 4, 5 and 8), we formulate the problems as feasibility problems involving two closed sets. In the case where one of the sets is nonconvex, how to project onto the nonconvex set is a hard open problem. We propose a substitute map which gives a reasonable estimate of the true projection and provides good performance in computational experiments. The idea of handling nonconvexity in such problem settings can be applied to many other related problems, especially those involving nonsymmetric matrices.
• In the trust region methods and Newton type methods (see Chapters 6, 7 and 9), we formulate the problems as nonlinear least squares problems. Applying these methods to the specific problem settings joins optimization techniques with practical problems. The proposed algorithms have been broadly tested and found to be efficient and reliable. Such ideas should be useful in many more problems.
• Various control tasks via static output feedback (see Chapter 4) are unified into a single framework, namely, generalized static output feedback pole placement. The flexibility in choosing pole placement regions allows a large range of control tasks, especially less standard pole placement problems, to be encompassed. Hence the algorithms for solving these problems can be applied to any problem which can be put in this framework.
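As an illustration of the nonlinear least squares viewpoint mentioned above (not the thesis algorithms themselves, whose residuals involve closed loop eigenvalues), a generic Gauss-Newton iteration on a purely illustrative residual can be sketched in Python/NumPy:

```python
import numpy as np

def gauss_newton(r, J, x, iters=20):
    """Minimize 0.5*||r(x)||^2 by repeatedly solving the linearized problem."""
    for _ in range(iters):
        # Solve J(x) * step = -r(x) in the least squares sense.
        step, *_ = np.linalg.lstsq(J(x), -r(x), rcond=None)
        x = x + step
    return x

# Toy residual: find x with x[0]**2 + x[1]**2 = 4 and x[0]*x[1] = 1.
r = lambda x: np.array([x[0]**2 + x[1]**2 - 4.0, x[0] * x[1] - 1.0])
J = lambda x: np.array([[2 * x[0], 2 * x[1]], [x[1], x[0]]])

x = gauss_newton(r, J, np.array([2.0, 0.5]))
assert np.linalg.norm(r(x)) < 1e-8
```

When the residual can be driven to zero, the iteration converges quadratically near a solution, which is the behavior observed for the thesis algorithms in Chapters 6, 7 and 9.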
1.3 Outline of the Thesis
This thesis consists of 5 parts comprising 10 chapters. Part I is the introduction and Part V is the conclusion and future work. The main content is presented in 3 parts. Figure 1.1 is a flow chart of the thesis, which clearly indicates how the thesis is organized. A more detailed introduction to the main content is as follows:
Part II : Background
In this work, optimization algorithms are developed to tackle inverse eigenvalue
problems arising in control and nonnegative matrices. All of them are hard to solve. Different methodologies are employed in the algorithm development, namely, alternating projections, Newton type methods and trust region methods. Overviews of the trust region methods and Newton type methods are given in the chapters where they are employed. Projections and alternating projections are used in three chapters (see Chapters 4, 5 and 8) and are introduced in this background part. This part also includes a brief introduction to computational complexity and to computational complexity in control.

Figure 1.1: Flow chart of the thesis.
Part III : Problems Arising in Control
Control is one of the two main application areas considered in this work. Optimization algorithms for various static output feedback control problems are presented in Part III. First of all, generalized pole placement via static output feedback is considered. These problems are formulated as feasibility problems and solutions are found via a projective methodology. Following the same idea, simultaneous stabilization via static output feedback and stabilization via decentralized static output feedback are considered, and the alternating projection idea is applied. Then trust region methods are developed for solving classical pole placement via static output feedback; even though this problem is NP-hard, the trust region methods appear to perform extremely well. Lastly, a novel Gauss-Newton method is considered for the classical pole placement problem. The problem is formulated as a constrained nonlinear optimization problem and minimization is achieved via a Gauss-Newton method.
Part IV : Problems Arising in Nonnegative Matrices
Problems arising in nonnegative and stochastic matrices are the other main application area in this work. Two different methodologies are presented for solving nonnegative/stochastic inverse eigenvalue problems, namely, a projective methodology and Newton type methods. The algorithms appear to be very useful for various problems. They are also extendable to many other inverse eigenvalue problems, especially those involving nonsymmetric matrices.
1.4 Summary
This chapter forms the basis of this thesis. We first introduce the background of inverse eigenvalue problems. Having identified their broad applications and the existing difficulties in various inverse eigenvalue problems, we focus our attention on the problems arising in control and nonnegative/stochastic matrices. With this basis, we are ready to proceed to the following chapters and their research details.
Part II
Background
This part consists of two chapters. They cover the key background information on projections and computational complexity and are indispensable for understanding the main parts of this research.

Chapter 2 presents general properties of projections, introduces alternating projections and recalls how alternating projections can be used to find a point in the intersection of a finite number of closed (convex) sets. It plays a key part in the algorithms of Chapters 4, 5 and 8. Basic concepts are introduced and important properties are highlighted.

In Chapter 3, a brief introduction to computational complexity is presented. It is essential to understand how hard NP-hard problems are, since several important NP-hard problems are tackled in this work.
Chapter 2
Projections
2.1 Introduction
This chapter contains general properties of projections, alternating projections and
introduces the method of alternating projections. Since alternating projection idea,
especially its variation applied to the problems involving nonconvex set, is crucial
to the algorithms in Chapter 4, 5 and 8, we gather such materials in this separate
chapter. We first introduce the basic knowledge and then emphasize on the key
properties of the concepts introduced.
2.2 Projections
Projections play a key part in the algorithms. This section contains general properties
of projections.
Let x be an element in a Hilbert space H and let D be a closed (possibly noncon-
vex) subset of H. Any d0 ∈ D such that ‖x− d0‖ ≤ ‖x− d‖ for all d ∈ D will be
called a projection of x onto D. In the cases of interest here, namely where H is a
finite dimensional Hilbert space, there is always at least one such point for each x. A
function PD : H → H will be called a projection operator (for D) if for each x ∈ H,
PD(x) ∈ D and
‖x− PD(x)‖ ≤ ‖x− d‖ for all d ∈ D.
Where convenient, we will use y = PD(x) to denote that y is a projection of x
onto D. We emphasize that y = PD(x) only says y is a projection of x onto D and
does not make any statement regarding uniqueness.
If D is convex as well as closed then each x has exactly one projection point PD(x)
[51]. In Figure 2.1, D is a closed convex set and x is an arbitrary starting point. PD(x)
is a projection of x onto D and PD(x) is unique.
Figure 2.1: The projection onto a closed convex set is unique.
However, if D is nonconvex but closed, then an x may have multiple projection
points. In Figure 2.2, D is a closed nonconvex set and x is an arbitrary starting point.
Due to the nonconvexity of D, the projection of x onto D may not be unique:
PD(x)1 and PD(x)2 are both at the smallest distance to x among all the points
in D and hence are both projections.
Figure 2.2: Projections onto a nonconvex set need not be unique.
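These two situations can be made concrete with a small numerical sketch in Python/NumPy (illustrative only, not part of the thesis; the disk and circle are invented examples). The disk is closed and convex, so the projection is unique; the circle is closed but nonconvex, and at its center every circle point is a projection, so the operator must pick one:

```python
import numpy as np

def proj_disk(x, c, r):
    """Projection onto the closed disk D = {z : ||z - c|| <= r} (convex)."""
    d = x - c
    n = np.linalg.norm(d)
    return x.copy() if n <= r else c + r * d / n

def proj_circle(x, c, r):
    """Projection onto the circle D = {z : ||z - c|| = r} (nonconvex)."""
    d = x - c
    n = np.linalg.norm(d)
    if n == 0.0:                       # at the center every circle point is
        return c + np.array([r, 0.0])  # a projection; pick one arbitrarily
    return c + r * d / n

c = np.zeros(2)
print(proj_disk(np.array([3.0, 4.0]), c, 1.0))   # unique: [0.6, 0.8]
print(proj_circle(c, c, 1.0))                    # one of infinitely many
```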
2.3 Alternating Projections
In Chapters 4, 5 and 8, all problems of interest are feasibility problems of the following
abstract form.
Problem 2.3.1. Given closed sets D1, . . . , DN in a finite dimensional Hilbert space
H, find a point in the intersection

⋂_{i=1}^{N} Di

(assuming the intersection is nonempty).
(In fact, we will solely be interested in the case N = 2.)
If all the Di's in Problem 2.3.1 are convex, a classical method of solving Problem
2.3.1 is to alternately project onto the Di's. This method is often referred to as the
Method of Alternating Projections (MAP). If the Di's have nonempty intersection, the
successive projections are guaranteed to converge asymptotically to an intersection
point [9].
Theorem 2.3.2 (MAP). Let D1, . . . , DN be closed convex sets in a finite dimensional
Hilbert space H. Suppose ⋂_{i=1}^{N} Di is nonempty. Then starting from an arbitrary
initial value x0, the sequence

xi+1 = PDφ(i)(xi), where φ(i) = (i mod N) + 1,

converges to an element of ⋂_{i=1}^{N} Di.
We remark that the usefulness of MAP for finding a point in the intersection of
a number of sets is dependent on being able to compute projections onto each of the
Di’s.
The significance of Theorem 2.3.2 is that it defines a systematic numerical algo-
rithm for finding a point in the intersection of closed convex sets [67].
Theorem 2.3.3. Let D1, . . . , DN be closed convex sets in a finite dimensional Hilbert
space H. Given constants t1, . . . , tN in the interval (0, 2), for i = 1, . . . , N , define the
operators

Ri = (1 − ti)Id + tiPDi.

If ⋂_{i=1}^{N} Di is nonempty, then starting from an arbitrary initial value x0, the
sequence

xi+1 = R(i mod N)+1(xi)

converges to an element of ⋂_{i=1}^{N} Di.

Proof. See [36]. An alternative proof can be found in [78].
Here Id denotes the identity operator on H; Id(x) = x for all x ∈ H. Each Ri of
the theorem is a projection operator onto Di if and only if ti = 1. If ti 6= 1 then Ri
is referred to as a relaxed projection. Theorem 2.3.3 is a generalization of Theorem
2.3.2 which corresponds to the case t1 = · · · = tN = 1. We will see later how the
freedom to choose the ti’s not equal to 1 can be useful. Figure 2.3 illustrates the
alternating projections between two intersecting closed convex sets. The pink region
and the green region represent two closed convex sets. The blue region is the feasible
set, i.e. the intersection of these two sets. Starting from an arbitrary point x, a point
in the feasible set can be found via alternating projections.
Figure 2.3: Alternating projections between two intersecting convex sets.
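As an illustration of Theorem 2.3.2 (a sketch in Python/NumPy, not from the thesis; the half-plane and disk below are invented examples), the MAP iteration for N = 2 is just the composition of the two projection operators:

```python
import numpy as np

def P1(x):
    """Projection onto the half-plane D1 = {x : x[0] <= 1}."""
    y = x.copy()
    y[0] = min(y[0], 1.0)
    return y

def P2(x):
    """Projection onto the disk D2 = {x : ||x|| <= 2}."""
    n = np.linalg.norm(x)
    return x if n <= 2.0 else 2.0 * x / n

x = np.array([5.0, 3.0])          # arbitrary initial value x0
for _ in range(50):               # x_{i+1} = P_{phi(i)}(x_i), here N = 2
    x = P2(P1(x))
print(x)                          # a point in D1 ∩ D2
```

The relaxed iteration of Theorem 2.3.3 is obtained by replacing each projection P above with (1 − t)Id + tP for some t ∈ (0, 2).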
When ti ∈ (0, 1), Ri is an under-projection: we step only the fraction ti of the way
from the starting point toward the target set. Hence Ri does not actually map into the
target set, and we do not usually use this choice in our algorithms. Conversely, when
ti ∈ (1, 2), Ri is an over-projection: we step past the projection point, further 'into' the
target set. Over-projection is sometimes used to find a point in the intersection in a
finite number of steps [8].
When one or more of the Di's are nonconvex, Theorem 2.3.3 no longer applies, and
starting the algorithm of Theorem 2.3.3 from certain initial values may result in a
sequence of points that does not converge to a solution of the problem [23]. Figure 2.4
illustrates alternating projections between two intersecting sets, one of which is
nonconvex. The pink region represents a closed convex set, while the green region is a
closed nonconvex set. The blue region is the feasible set, i.e., the intersection of these
two sets. If we choose the starting point x, the alternating projection scheme does not
converge to a solution in the feasible set. If instead we choose the starting point y, the
alternating projection scheme works and a point in the feasible set is found.
Figure 2.4: Alternating projections between a convex set and a nonconvex set intersecting each other.
If however there are only two sets, then the following distance reduction property
always holds.
Theorem 2.3.4. Let D1 and D2 be closed (nonempty) sets in a finite dimensional
Hilbert space H. For any initial value y0 ∈ D2, if
x1 = PD1(y0),
y1 = PD2(x1),
x2 = PD1(y1),
then
‖x2 − y1‖ ≤ ‖x1 − y1‖ ≤ ‖x1 − y0‖ .
Proof. The second inequality holds as y1 is a projection of x1 onto D2 and hence its
distance to x1 is less than or equal to the distance of x1 to any other point in D2 such
as y0. The first inequality holds by similar reasoning.
Corollary 2.3.5. If for i = 0, 1, . . .,
xi+1 = PD1(yi), yi+1 = PD2(xi+1),
that is, the xi’s and yi’s are successive projections between two closed sets, then
‖xi − yi‖ is a nonincreasing function of i.
Suppose one is interested in solving Problem 2.3.1 in the case of two sets, D1 and
D2, when one or both sets are nonconvex. If projections onto these sets are com-
putable, a solution method is to alternately project onto D1 and D2. Corollary 2.3.5
ensures that the distance ‖xi − yi‖ is nonincreasing in i. While this is promising,
there is no guarantee that this distance goes to zero and hence that a solution to the
problem will be found.
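The distance reduction property can be observed numerically. In the following sketch (Python/NumPy, illustrative only; the line and the unit circle are invented examples, the circle being nonconvex), the successive distances between iterates are nonincreasing, exactly as Theorem 2.3.4 and Corollary 2.3.5 predict:

```python
import numpy as np

def P_line(x):
    """Projection onto D1 = {x : x[1] = 0}, a line."""
    return np.array([x[0], 0.0])

def P_circle(x):
    """Projection onto D2 = {x : ||x|| = 1}, the (nonconvex) unit circle."""
    n = np.linalg.norm(x)
    if n == 0.0:
        return np.array([1.0, 0.0])   # center: every circle point is a projection
    return x / n

y = P_circle(np.array([-0.9, 2.0]))   # initial value y0 in D2
dists = []
for _ in range(10):
    x = P_line(y)                     # x_{i+1} = P_D1(y_i)
    dists.append(np.linalg.norm(x - y))
    y = P_circle(x)                   # y_{i+1} = P_D2(x_{i+1})
print(dists[:3])                      # nonincreasing (Theorem 2.3.4)
```

Here the iteration happens to reach a point of the intersection, but as noted above this is not guaranteed in general for nonconvex sets.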
Most of the literature on alternating projection methods deals with the case of
convex subsets of a (possibly infinite dimensional) Hilbert space; a survey of these
results is contained in [2]. The text [27] is also recommended. There is much less
available for the case of one or more nonconvex sets; see in particular [23].
2.4 Summary
In this chapter, an overview of projections was given. We introduced the general
properties of projections and alternating projections, and recalled how alternating
projections can be used to find a common point in the intersection of a finite number of
closed (convex) sets. Moreover, alternating projections involving nonconvex sets were
also analyzed. The algorithms built on these projection ideas appear in Chapters 4, 5
and 8.
Keywords: projections, alternating projections.
Chapter 3
Computational Complexity
3.1 Introduction
In this research, we tackle several NP-hard problems that are long-standing open
problems in control, for example classical pole placement and simultaneous stabilization
via static output feedback. NP-hardness is directly related to computability and
complexity. In this chapter, we present a brief introduction to computational complexity
to give a basic idea of what NP-hard problems are. Since there is a large literature on
computational complexity in its own right, and in particular on the computational
complexity of control-related problems, we mention only a few references for further
details (see [6], [31] and the recent survey [5]).
3.2 What is NP-Hard?
A decision problem is one where the desired output is binary, and can be interpreted
as ‘yes’ or ‘no’.
A decidable problem is one for which there exists an algorithm that always halts
with the right answer. But there also exist undecidable problems, for which there is
no algorithm that always halts with the right answer.
Let T (s) denote the worst case running time of an algorithm over all instances of
size s. We say that the algorithm runs in polynomial time if there exists some integer k
such that

T (s) = O(sk),

and we define P as the class of all (decision) problems that admit polynomial time
algorithms. For both practical and theoretical reasons, P is generally viewed as the
class of problems that are efficiently solvable.
There are many decidable problems of practical interest for which no polynomial
time algorithm is known. Many of these problems belong to a class known as NP
(nondeterministic polynomial time), which includes all of P. A decision problem is
said to belong to NP if every 'yes' instance has a 'certificate' of being a 'yes' instance
whose validity can be verified with a polynomial amount of computation.
A problem in NP is said to be NP-complete if every problem in NP can be reduced
to it in polynomial time.
If a problem is at least as hard as some NP-complete problem, then it is said to be
NP-hard. NP-hardness of a problem thus means that it is at least as difficult as any
NP-complete problem.
3.3 Computational Complexity in Control
Computational complexity analysis of various application areas is an active research
topic. It helps us understand how valuable an algorithm is and how much a solution
can be optimized. In control, many problems have been proved to be NP-hard or
NP-complete; that is, a polynomial time algorithm that correctly solves all instances of
such problems cannot be expected. For example, classical pole placement via static
output feedback and simultaneous stabilization via static output feedback, both
considered in this work, are NP-hard. Beyond these, there are many NP-hard problems
in robust stability analysis, nonlinear control, optimal control, Markov decision theory,
etc. For more details on these hard open control problems, please refer to the recent
survey [5].
3.4 Summary
In this chapter, a brief introduction to computational complexity was given. The main
purpose was to introduce NP-hardness and thereby convey how hard NP-hard problems
are. Note that computational complexity analysis is a rich and independent research
area in its own right.
Keywords: computational complexity, NP-hard, undecidable.
Part III
Problems Arising in Control
This part consists of four chapters. They include optimization algorithms devel-
oped for solving inverse eigenvalue problems arising in control.
In Chapter 4, we present a projective methodology for solving static output feedback
pole placement problems of the following rather general form: given n subsets of the
complex plane, find a static output feedback that places one pole of the closed loop
system in each of these subsets. The algorithm presented is iterative in nature
and is based on alternating projection ideas. Each iteration of the algorithm involves
a Schur matrix decomposition, a standard least squares problem and a combinato-
rial least squares problem. While the algorithm is not guaranteed to always find a
solution, computational results are presented demonstrating the effectiveness of the
algorithm.
In Chapter 5, we extend the projective methodology presented in Chapter 4 to
tackle some different control problems – simultaneous stabilization via static output
feedback and stabilization via decentralized static output feedback.
Unlike static output feedback pole placement in the generality considered in Chapter 4,
which handles one system at a time, simultaneous stabilization seeks to stabilize
multiple systems simultaneously using a single static output feedback controller.
In Figure 3.1, P1, . . . , Pn are n independent systems and we must find a
controller K such that all the closed loop systems are stable.
Figure 3.1: Illustration of simultaneous stabilization.

In contrast, decentralized control splits the control task into multiple channels. In
each channel, a controller is employed to stabilize a subsystem. Decentralized control
generally arises in the control of large scale systems, where centralized control cannot
be applied. In Figure 3.2, P is a plant divided into n subsystems P1, . . . , Pn. For each
subsystem Pi (i = 1, . . . , n), a controller Ki is employed to stabilize it. Again, the
algorithms are not guaranteed to always find a solution; their effectiveness is
demonstrated by computational results.
Figure 3.2: Illustration of decentralized control.
In Chapter 6, we present two closely related algorithms for classical pole place-
ment via static output feedback. The pole placement problem is formulated as an un-
constrained nonlinear least squares optimization problem involving the desired poles
and the closed loop system poles. Minimization is achieved via two different trust
region approaches that utilize the derivatives of the closed loop poles. Extensive nu-
merical experiments show that the algorithms are very effective in practice though
convergence to a solution is not guaranteed for either algorithm. Near solutions,
both algorithms typically converge quadratically. While the algorithms require the
desired poles to be distinct, effective strategies for dealing with repeated poles are
also presented.
In Chapter 7, we present a Gauss-Newton method for the problem of classical pole
placement via static output feedback. The pole placement problem is formulated as a
constrained nonlinear least squares optimization problem involving the Schur form of
the closed loop system. Minimization is achieved via a Gauss-Newton method. Near
solutions, the algorithm typically converges quadratically. As the algorithm does not
require the desired poles to be distinct, there is no formal limitation on the proposed
algorithm. Further convergence analysis and more extensive experiments will be
carried out in future work.
Chapter 4
A Projective Methodology for Generalized Pole Placement
4.1 Introduction
There has been a great deal of research done on the problems of pole placement and
stabilization via static output feedback. An overview of theoretical results, existing
algorithms and historical developments can be found in [10], [26], [29], [60], [61]
and [69]. (For the convenience of the reader, Appendix A lists some of the main
theoretical pole placement results that have appeared in the literature to date.) Our
main interest here is in algorithms, and in this regard, for pole placement, the survey
paper [61] states that existing sufficiency conditions are mainly theoretical in nature
and that there are no good numerical algorithms available in many cases when a
problem is known to be solvable. Despite the great deal of work that has been done
in this area, new algorithms for these important problems are still of great interest.
In this chapter we will actually consider the following generalized static output
feedback pole placement problem.
Problem 4.1.1. Given A ∈ Rn×n, B ∈ Rn×m, C ∈ Rp×n and closed subsets C1, . . . , Cn
⊂ C, find K ∈ Rm×p such that
λi(A + BKC) ∈ Ci for i = 1, . . . , n.
Here λi(A + BKC) denotes the i’th eigenvalue of A + BKC.
Problem 4.1.1 encompasses many types of pole placement problems. Indeed by
varying the choice of Ci’s, Problem 4.1.1 can for example be specialized to the following
problems:
1. Classical pole placement:

Ci = {ci}, ci ∈ C.

Here each region Ci is an individual point in the complex plane; see Figure 4.1(a).
2. Relaxed classical pole placement:

Ci = {z ∈ C | |z − ci| ≤ ri}.

Here each region Ci is a disk centered at ci ∈ C with radius ri ≥ 0; see Figure 4.1(b).
3. Stabilization type problems for continuous time and discrete time systems:

C1 = . . . = Cn = {z ∈ C | Re z ≤ −α}, α > 0, (4.1.1)

and

C1 = . . . = Cn = {z ∈ C | |z| ≤ α}, 0 < α < 1,

respectively; see Figure 4.1(c) and (d).
4. Hybrid problems: for example, problems of the type shown in Figure 4.1(e).
Here c ∈ C and the aim is to place a pair of poles at c and c̄, and to place the
remaining poles in the truncated cone C:

C1 = {c}, C2 = {c̄}, C3 = . . . = Cn = C.
As far as we are aware, pole placement in the generality presented in Problem
4.1.1 has not previously been considered. The most closely related results from the
existing literature are as follows. Early work in [37] considers pole placement in a
single region specified by polynomials. While a Lyapunov type necessary and sufficient
condition is given for a matrix to have its eigenvalues in such a region, this condition
is polynomial in the matrix in question and hence not readily amenable to controller
design. In [16], LMI conditions are presented that are sufficient though not necessary
for pole placement in various convex regions. These results cover state feedback and
full-order dynamic output feedback but not static output feedback. Another LMI
based approach, again for a single, though this time, possibly disconnected region, is
considered in [7]. A method for placing poles in distinct convex regions (each region
is specified using linear programming constraints or second order cone constraints)
is given in [38]; however, the method is based on eigenvalue perturbation results and
hence appears largely limited to cases where the open loop poles are already quite
close to the desired poles. In [63], pole placement in distinct convex regions (each
region is a disk or a half plane) is achieved via a rank constrained LMI approach
though the results are only for state feedback.
This chapter presents an algorithm for Problem 4.1.1. The approach employed
here is quite different to each of the approaches mentioned above. Problem 4.1.1 is
shown to be equivalent to finding a point in the intersection of two particular sets,
(a) Classical pole placement. (b) Relaxed classical pole placement.
(c) Continuous time stabilization. (d) Discrete time stabilization.
(e) A hybrid problem.
Figure 4.1: Examples of generalized static output feedback pole placement problem.
one of which is a simple convex set, the other a rather complicated nonconvex set.
The algorithm is iterative in nature and is based on an alternating projection like
scheme between these two sets. Each iteration of the algorithm involves a Schur
matrix decomposition and a standard least squares problem. If the Ci’s are not all
equal, each iteration also requires a combinatorial least squares matching step.
Alternating projection type ideas have been employed previously for output feed-
back stabilization, see in particular [33], [35] and [57]1. A distinguishing feature of
our algorithm is that, unlike these methods, our algorithm does not involve LMIs. (A
further technical difference is that rather than solving feasibility problems that in-
volve symmetric matrices, the problem solved by the algorithm is a feasibility problem
involving nonsymmetric matrices.)
The algorithm can be applied to problems with rather general choices for the Ci
regions. In fact the only formal requirement is the following, which we state in the
form of an assumption.
Assumption 1. It is possible to calculate projections onto each of the Ci’s: given
z ∈ C it is possible to find zi ∈ Ci such that |z − zi| ≤ |z − c| for all c ∈ Ci.
In particular, the Ci’s must be closed sets though they need not be convex or even
connected.
For given A, B, C and Ci’s, Problem 4.1.1 may or may not have a solution. Indeed,
one would expect that determining whether a particular instance of Problem 4.1.1 is
solvable is in general difficult. For example, the problem of determining whether the
classical pole placement problem is solvable for particular A, B, C and desired poles
has recently been shown to be NP-hard [31]. Given the difficulty of Problem 4.1.1, an
1They were first used in control design to solve certain convex problems, see [34].
efficient (i.e., polynomial time) algorithm that is able to correctly solve all instances
of the problem cannot be expected, and while the algorithm presented here is often
quite effective in practice, it is not guaranteed to find a solution even if a solution
exists.
This chapter is structured as follows 2. Section 4.2 presents the solution method-
ology. To motivate the solution methodology we first restrict our attention to systems
that have a symmetric state space representation (A = AT , C = BT ) and present
an algorithm for this easier class of problems. The general problem is then consid-
ered. Section 4.3 contains computational results of applying the algorithm to various
instances of Problem 4.1.1.
4.2 Methodology
4.2.1 The Symmetric Problem
Systems with a symmetric state space realization, that is, systems with state space
matrices satisfying A = AT , C = BT , occur in various contexts, for example RC-
networks. In order to motivate our solution methodology, we first consider the fol-
lowing special case of Problem 4.1.1 for symmetric systems3.
Problem 4.2.1. Given A ∈ Sn, B ∈ Rn×m, and closed subsets C1, . . . , Cn ⊂ R, find
K ∈ Sm such that
λi(A + BKBT ) ∈ Ci for i = 1, . . . , n.
2Please refer to Chapter 2 for the background of projections.
3See [52] for more regarding the classical pole placement problem for symmetric systems and a rather surprising result regarding arbitrary pole placement for such systems.
Note that in Problem 4.2.1 the static output feedback matrix K is required to
be symmetric and that the Ci’s are assumed to be subsets of the real numbers. This
latter assumption is natural as given any symmetric K, A+BKBT is also symmetric
and hence has real eigenvalues.
We assume A, B and C1, . . . , Cn are given and fixed. Let

L = {Z ∈ Sn | Z = A + BKBT for some K ∈ Sm}

and let M denote the set of symmetric matrices with eigenvalues in the specified
regions C1, . . . , Cn,

M = {Z ∈ Sn | λi(Z) ∈ Ci, i = 1, . . . , n}.
The symmetric problem can be stated as follows:
Find X ∈ L ∩M.
We now show that projections onto both sets L and M can be calculated and
hence that alternating projections can be employed as a solution method.4
From now on, Sn will be regarded as a Hilbert space with inner product

〈Y, Z〉 = tr(Y Z) = ∑_{i,j} yij zij,

and associated norm ‖Z‖ = 〈Z, Z〉^{1/2} (the Frobenius norm). For i = 1, . . . , n, PCi
will denote a projection operator for Ci.
The set L is an affine subspace and hence is convex. Projection of X ∈ Sn onto
L involves solving a standard least squares problem.
4As the system is symmetric and we restrict K to be symmetric, the symmetric static output feedback continuous time stabilization problem is equivalent to an LMI problem. Hence, if the problem is solvable, a numerical solution can be readily found using existing LMI algorithms, see for example [71]. The sets L and M for the continuous time stabilization problem are both convex and projections onto these sets are readily calculated.
Lemma 4.2.2. The projection of X ∈ Sn onto L is given by PL(X) = A + BKBT
where K is a solution of the least squares problem

arg min_{K∈Sm} ‖(B ⊗ B) vec(K) − vec(X − A)‖2.
A proof of a more general version of this result is given in the next subsection.
While the set M is not convex, it is still possible to calculate projections onto M.
How to do this is shown in Theorem 4.2.4 below. The result is based on the following
result of Hoffman and Wielandt.
Lemma 4.2.3. Suppose Y, Z ∈ Sn have eigenvalue-eigenvector decompositions

Y = V DV T , D = diag(λY1, . . . , λYn),
Z = WEW T , E = diag(λZ1, . . . , λZn),

where V, W ∈ Rn×n are orthogonal, λY1 ≥ . . . ≥ λYn and λZ1 ≥ . . . ≥ λZn. Then

‖D − E‖ ≤ ‖Y − Z‖, (4.2.1)
where ‖ · ‖ denotes the Frobenius norm.
Proof. See for example [40, Corollary 6.3.8].
Theorem 4.2.4. Given Y ∈ Sn, let Y = V DV T be an eigenvalue-eigenvector
decomposition of Y with D = diag(λ1, . . . , λn). Let σ be a permutation of {1, . . . , n}
such that amongst all possible permutations, it minimizes

∑_{k=1}^{n} |λk − PCσ(k)(λk)|².

Define

PM(V, D) = V D̄V T

where

D̄ = diag(PCσ(1)(λ1), . . . , PCσ(n)(λn)).

Then PM(V, D) is a best approximant in M to Y in the Frobenius norm.
Proof. Let Y be as in the theorem statement. As PM(V, D) ∈ M, it remains to show

‖Y − PM(V, D)‖ ≤ ‖Y − Z‖ for all Z ∈ M. (4.2.2)

Without loss of generality, suppose the eigenvalues of Y are ordered, i.e., λ1 ≥ . . . ≥ λn.
Similarly, for Z ∈ M, let Z = WEW T be an eigenvalue-eigenvector decomposition
with E = diag(λZ1, . . . , λZn) and λZ1 ≥ . . . ≥ λZn.

As the Frobenius norm is orthogonally invariant,

‖Y − PM(V, D)‖ = ‖D − D̄‖. (4.2.3)

Let π be a permutation of {1, . . . , n} such that λZk ∈ Cπ(k), k = 1, . . . , n. (Such a
permutation exists as Z ∈ M and hence must have an eigenvalue in each of the Ci's.)
Define Ē = diag(PCπ(1)(λ1), . . . , PCπ(n)(λn)). It follows from the definition of D̄ that

‖D − D̄‖ ≤ ‖D − Ē‖. (4.2.4)

As for each k, |λk − PCπ(k)(λk)| ≤ |λk − λZk|, it also follows that

‖D − Ē‖ ≤ ‖D − E‖. (4.2.5)

Combining (4.2.3), (4.2.4), (4.2.5) and inequality (4.2.1) from Lemma 4.2.3 gives
the inequality in (4.2.2) and the proof is complete.
Note that to calculate PM(V, D) we keep the original orthogonal matrix V and
simply modify the diagonal matrix D to D̄. The fact that V remains unchanged
motivates our solution method for the general nonsymmetric case.
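As an illustration, Theorem 4.2.4 can be prototyped in a few lines of Python/NumPy (not the thesis implementation; here each Ci is represented as a closed real interval, an assumption made for the example, and the permutation σ is found by brute force, which is viable only for small n):

```python
import itertools
import numpy as np

def proj_interval(lam, lo, hi):
    """Projection of a real scalar onto the closed interval [lo, hi]."""
    return min(max(lam, lo), hi)

def P_M_sym(Y, regions):
    """Projection onto M = {Z in S^n : lambda_i(Z) in C_i} (Theorem 4.2.4):
    keep the eigenvector matrix V, project the eigenvalues onto the regions,
    using the permutation that minimizes the total squared displacement."""
    lam, V = np.linalg.eigh(Y)
    n = lam.size
    best_diag, best_cost = None, np.inf
    for sigma in itertools.permutations(range(n)):   # brute force; O(n!)
        proj = [proj_interval(lam[k], *regions[sigma[k]]) for k in range(n)]
        cost = sum((lam[k] - proj[k]) ** 2 for k in range(n))
        if cost < best_cost:
            best_cost, best_diag = cost, proj
    return V @ np.diag(best_diag) @ V.T

Y = np.diag([5.0, -3.0])
regions = [(-1.0, 0.0), (1.0, 2.0)]   # C_1 = [-1, 0], C_2 = [1, 2]
Z = P_M_sym(Y, regions)
print(np.linalg.eigvalsh(Z))          # eigenvalues now lie in the two intervals
```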
4.2.2 The General Nonsymmetric Problem
Consider again Problem 4.1.1. Throughout this subsection it is assumed A, B, C and
C1, . . . , Cn are given and fixed.
From now on, Cn×n will be regarded as a Hilbert space with inner product

〈Y, Z〉 = tr(Y Z∗) = ∑_{i,j} yij z̄ij.

The associated norm is the Frobenius norm ‖Z‖ = 〈Z, Z〉^{1/2}.
In this subsection we redefine L to be the set of all possible closed loop system
matrices,

L = {Z ∈ Rn×n | Z = A + BKC for some K ∈ Rm×p},

and redefine M to be the set of complex matrices with eigenvalues in the specified
regions C1, . . . , Cn,

M = {Z ∈ Cn×n | λi(Z) ∈ Ci, i = 1, . . . , n}.
Problem 4.1.1 can now be stated as:
Find X ∈ L ∩M.
A natural strategy for solving Problem 4.1.1 would be to employ an alternating
projection scheme, alternately projecting between L and M. A difficulty occurs in
trying to do this. While L is convex and M is nonconvex, just as for the symmetric
problem, M in this case is a substantially more complicated set, and how to calculate
projections onto it is a difficult unsolved problem. That is, given a point Z, it
is not clear how to find a point in M of minimal distance to Z.
As will be verified by the experiments, an alternating projection like scheme can
still be quite successful if for M a suitable substitute mapping is used in place of an
actual projection operator.
Figure 4.2 illustrates the problem formulation for Problem 4.1.1 and the alternating
projection scheme. The pink line represents the set L, which is an affine subspace.
The green region represents the set M, a rather complicated closed nonconvex set.
The dotted parts are the feasible set, i.e., the intersection of the two sets.
Starting from an arbitrary point x, we know exactly how to project onto L.
However, instead of the actual projection onto M (the pink point with the dotted
arrow), our proposed substitute mapping gives a reasonable estimate, the black
point in M.
Figure 4.2: Alternating projections for generalized static output feedback pole placement problems.
Before proceeding, recall Schur’s result [40, Th 2.3.1].
Theorem 4.2.5. Given Z ∈ Cn×n with eigenvalues λ1, . . . , λn in any prescribed order,
there is a unitary matrix V ∈ Cn×n and an upper triangular matrix T ∈ Cn×n such
that
Z = V TV ∗,
and Tkk = λk, k = 1, . . . , n.
The following map is proposed as a substitute for a projection map onto M.
Though it is not a true projection map, the notation PM will still be used. Our choice
for PM is motivated by the projection operator for M for the symmetric problem.
We stress that as PM is not a true projection map, unlike for the symmetric problem,
the distance reduction property of Theorem 2.3.4 may not hold. While PM has
various desirable properties (see below), the theoretical convergence properties of the
algorithm are currently unclear, providing interesting questions for future research.
In what follows, Assumption 1 is assumed to hold.
Definition 4.2.6. Suppose V ∈ Cn×n is unitary and T ∈ Cn×n is upper triangular.
Let σ be a permutation of {1, . . . , n} such that amongst all possible permutations, it
minimizes

∑_{k=1}^{n} |Tkk − PCσ(k)(Tkk)|². (4.2.6)

Define

PM(V, T ) = V T̄ V ∗

where T̄ is upper triangular and given by

T̄kl = PCσ(k)(Tkk) if k = l, and T̄kl = Tkl otherwise.
Note that PM maps into the set M. Note also that, just as for the symmetric
problem, finding σ involves solving a combinatorial least squares problem. (This will
be discussed further later in the section.)
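As an illustrative prototype of Definition 4.2.6 (Python with SciPy; not the thesis code), one can compute a complex Schur decomposition, brute-force σ for small n, and replace the diagonal of T. The two single-point target regions below are invented for the example:

```python
import itertools
import numpy as np
from scipy.linalg import schur

def P_M(Z, proj_fns):
    """Substitute map onto M (Definition 4.2.6): compute a complex Schur
    decomposition Z = V T V*, keep V, and replace each diagonal entry T_kk
    by its projection onto C_sigma(k); sigma found by brute force (small n)."""
    T, V = schur(np.asarray(Z, dtype=complex), output='complex')
    n = T.shape[0]
    lam = np.diag(T)
    best_sigma, best_cost = None, np.inf
    for sigma in itertools.permutations(range(n)):
        cost = sum(abs(lam[k] - proj_fns[sigma[k]](lam[k])) ** 2
                   for k in range(n))
        if cost < best_cost:
            best_cost, best_sigma = cost, sigma
    T_bar = T.copy()
    for k in range(n):
        T_bar[k, k] = proj_fns[best_sigma[k]](lam[k])
    return V @ T_bar @ V.conj().T          # a point in M

# Hypothetical example: place one pole at -1 and one at -2 exactly.
projs = [lambda z: -1.0 + 0.0j, lambda z: -2.0 + 0.0j]
Z = np.array([[1.0, 2.0], [0.0, 3.0]])     # eigenvalues 1 and 3
Y = P_M(Z, projs)
print(sorted(np.linalg.eigvals(Y).real))   # eigenvalues of Y are now -2 and -1
```

For larger n, the brute-force search over permutations would be replaced by the linear assignment solution discussed later in the section.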
In order to apply PM to Z ∈ Cn×n, a Schur decomposition of Z must first be found.
A given Z may have a nonunique Schur decomposition, and Z = V1T1V1∗ = V2T2V2∗
does not necessarily imply PM(V1, T1) = PM(V2, T2). Hence, PM may give different
points for different Schur decompositions of the same matrix. (Though it was not
discussed at the time, similar comments apply to the PM mapping for the symmetric
case.) This is not so important, as different Schur decompositions lead to points in
M of equal distance to the original matrix, as is now shown.
Theorem 4.2.7. Suppose Z = V1T1V1∗ = V2T2V2∗ where V1, V2 ∈ Cn×n are unitary
and T1, T2 ∈ Cn×n are upper triangular. Then

‖PM(V1, T1) − Z‖ = ‖PM(V2, T2) − Z‖.

Proof. Suppose Z = V T V ∗ where V is unitary and T is upper triangular. If T̄ is
the matrix given in Definition 4.2.6, then by the unitary invariance of the Frobenius
norm,

‖PM(V, T ) − Z‖ = ‖T̄ − T‖.

As ‖T̄ − T‖² equals the quantity in (4.2.6), ‖PM(V, T ) − Z‖ depends only on T11, . . . , Tnn
(and the sets C1, . . . , Cn). The Tkk's are the eigenvalues of Z and hence, aside from
ordering, are not decomposition dependent. The result now follows by noting that
(4.2.6) does not depend on the ordering of the Tkk's.
PM(V, T ) keeps V fixed and modifies T . Theorem 4.2.8 below shows that of all
the points in M of the form V T̃ V ∗ with T̃ ∈ M upper triangular, i.e., of all the points
in M that have a Schur decomposition with the same V matrix, PM(V, T ) is closest
(or at least equal closest) to the original point Z = V T V ∗.
Theorem 4.2.8. Suppose Z = V T V ∗ ∈ Cn×n with V unitary and T upper triangular.
Then PM(V, T ) satisfies

‖PM(V, T ) − Z‖ ≤ ‖V T̃ V ∗ − Z‖ for all upper triangular T̃ ∈ M.

Proof. Let T̃ be an upper triangular matrix in M. The unitary invariance of the
Frobenius norm implies the result will be established if it can be shown that

‖T̄ − T‖ ≤ ‖T̃ − T‖,

where T̄ is the matrix given in Definition 4.2.6. As both T̄ and T̃ are upper triangular
and T̃ ∈ M, it follows that

‖T̃ − T‖² = ∑_{k=1}^{n} |T̃kk − Tkk|² + ∑_{k<l} |T̃kl − Tkl|² (4.2.7)

and that T̃kk ∈ Cσ̃(k), k = 1, . . . , n, for some permutation σ̃.

The result now follows by noting that ‖T̄ − T‖² equals the quantity in (4.2.6) and
that this value must be less than or equal to the first summation on the right hand
side of the equality in (4.2.7).
As for the symmetric problem, projection of X ∈ Cn×n onto L involves solving a
standard least squares problem.
Lemma 4.2.9. The projection of X ∈ Cn×n onto L is given by PL(X) = A + BKC
where K is a solution of the least squares problem
arg minK∈Rm×p
∥∥(CT ⊗B) vec(K)− vec[Re(X)− A]∥∥
2.
Proof. We would like to find K ∈ Rm×p that minimizes

‖X − (A + BKC)‖².   (4.2.8)

As A, B and C are real matrices, it follows that (4.2.8) equals

‖Re(X) − (A + BKC)‖² + ‖Im(X)‖²,   (4.2.9)

and hence that the problem is equivalent to minimizing the first term in (4.2.9). The result now follows by noting that for any Z ∈ Cn×n, ‖Z‖ = ‖vec(Z)‖₂, and that for any (appropriately sized) matrices P, Q and R, vec(PQR) = (R^T ⊗ P) vec(Q) [41].
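As a concrete illustration, the projection of Lemma 4.2.9 can be computed directly from the vec/Kronecker identity. The following NumPy sketch is not from the thesis (whose experiments used Matlab 6.5); the function name is illustrative.

```python
import numpy as np

def proj_L(X, A, B, C):
    """Project X onto L = {A + B K C : K real} in the Frobenius norm.

    Only Re(X) enters, since A, B and C are real (proof of Lemma 4.2.9),
    and vec(B K C) = (C^T kron B) vec(K) turns the problem into an
    ordinary least squares problem in vec(K).
    """
    M = np.kron(C.T, B)                        # n^2 x (m p)
    rhs = (np.real(X) - A).flatten(order="F")  # vec stacks columns
    k, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    K = k.reshape(B.shape[1], C.shape[0], order="F")
    return A + B @ K @ C, K
```

Column-major (order="F") flattening is what makes the Kronecker identity hold: NumPy's default row-major flatten would correspond to vec of the transpose.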
Here is our algorithm for Problem 4.1.1.
Algorithm:
Problem Data. A ∈ Rn×n, B ∈ Rn×m, C ∈ Rp×n and C1, . . . , Cn ⊂ C.
Initialization. Choose a randomly generated matrix Y ∈ Rn×n.
repeat
1. X := PL(Y ).
2. Calculate a Schur decomposition of X: X = V TV ∗.
3. Y := PM(V, T ).
until ‖X − Y ‖ < ε.
Note that as Y = PM(V, T) = V T̂V ∗ (see Definition 4.2.6),

‖X − Y‖ = ‖V TV ∗ − V T̂V ∗‖ = ‖T − T̂‖ = (∑_k |Tkk − T̂kk|²)^{1/2}.

As the Tkk's are the eigenvalues of X ∈ L and the T̂kk's are the eigenvalues of Y ∈ M, the algorithm stops when X (which equals A + BKC for some K) has eigenvalues sufficiently close to those of a matrix that satisfies the pole placement constraints. In particular, each eigenvalue of such an X cannot violate the pole placement constraints by more than ε.
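For classical pole placement, where each Cl is a single point cl, the whole iteration can be sketched as follows (Python/NumPy rather than the thesis's Matlab; the names are illustrative, and the matching step uses scipy's optimal assignment solver).

```python
import numpy as np
from scipy.linalg import schur
from scipy.optimize import linear_sum_assignment

def pole_place(A, B, C, poles, eps=1e-3, max_iter=1000, seed=0):
    """Alternating-projection sketch for classical pole placement.

    poles: the desired closed loop eigenvalues c_1, ..., c_n (a multiset
    closed under conjugation).  Returns (K, converged).
    """
    n = A.shape[0]
    poles = np.asarray(poles, dtype=complex)
    Y = np.random.default_rng(seed).standard_normal((n, n))
    K = None
    for _ in range(max_iter):
        # Step 1: X := PL(Y), a least squares problem (Lemma 4.2.9).
        M = np.kron(C.T, B)
        k, *_ = np.linalg.lstsq(M, (np.real(Y) - A).flatten(order="F"),
                                rcond=None)
        K = k.reshape(B.shape[1], C.shape[0], order="F")
        X = A + B @ K @ C
        # Step 2: complex Schur decomposition X = V T V*.
        T, V = schur(X.astype(complex), output="complex")
        d = np.diag(T)
        # Step 3: optimally match the eigenvalues of X to the desired
        # poles, then replace the diagonal of T; the strict upper
        # triangle is left unchanged.
        D = np.abs(d[:, None] - poles[None, :]) ** 2
        rows, cols = linear_sum_assignment(D)
        if np.sqrt(D[rows, cols].sum()) < eps:  # equals ||X - PM(V,T)||
            return K, True
        That = T.copy()
        That[rows, rows] = poles[cols]
        Y = V @ That @ V.conj().T
    return K, False
```

The stopping test uses exactly the identity noted above: the mismatch between the matched diagonals equals ‖X − Y‖ for the next Y.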
As mentioned previously, PM involves finding a permutation σ that minimizes
(4.2.6). The first step in solving this combinatorial least squares problem is calculating
the squared distance of each Tkk to each subset Cl and placing these values in an n×n
cost matrix D:
Dkl := |Tkk − PCl(Tkk)|2. (4.2.10)
The problem is now equivalent to finding a permutation σ such that ∑_k Dkσ(k) is minimal.
The problem of finding a minimizing σ given a cost matrix D is a linear assignment
problem which can be solved in O(n³) time using the so-called Hungarian method; see
[53] for details.
Note that if the Ci's are not all distinct, as occurs for example in stabilization problems and in what have here been called hybrid problems, the complexity of the matching problem is reduced. In fact, for stabilization problems all the Ci's are the same and no matching step is required. For the hybrid problem shown in Figure 4.1(e), it is only necessary to check n(n − 1) possibilities corresponding to which two Tkk's are matched to c and c̄. Hence for this hybrid problem, the direct approach is faster than using the Hungarian method.
Also note that, given a cost matrix D, an alternative to the Hungarian method is the following faster suboptimal matching strategy. Find the (or a) smallest entry in D, say Dkl. Match Tkk with Cl and cross out row k and column l of D. Now consider only the uncrossed entries in D and repeat, until all n matches have been made. This method does not always find the optimal matching, though it can often be a quite effective substitute for the Hungarian method. It will be termed suboptimal matching. Surprisingly, as will be shown in the next section, by using suboptimal matching in the algorithm it was possible to solve a particular problem which was not solvable using the Hungarian method.
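The greedy strategy can be sketched as follows (illustrative code, not from the thesis):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def greedy_match(D):
    """Suboptimal matching: repeatedly pick the smallest remaining entry
    D[k, l], match k with l, then cross out row k and column l."""
    D = np.asarray(D, dtype=float).copy()
    n = D.shape[0]
    sigma = np.empty(n, dtype=int)
    for _ in range(n):
        k, l = np.unravel_index(np.argmin(D), D.shape)
        sigma[k] = l
        D[k, :] = np.inf   # cross out row k
        D[:, l] = np.inf   # cross out column l
    return sigma
```

scipy's `linear_sum_assignment` solves the same assignment optimally in O(n³) (a Hungarian-style method), so the greedy cost is never below the optimal cost; on many instances the two coincide.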
4.3 Computational Results
This section contains computational results of applying the algorithm to various in-
stances of Problem 4.1.1. We include results for classical pole placement, continuous
time and discrete time stabilization, and a hybrid problem.
The algorithms for each problem were implemented in Matlab 6.5 and all results
were obtained using a 3.06 GHz Pentium 4 machine.
Throughout this section a randomly generated matrix will be a matrix whose
entries are drawn from a normal distribution of zero mean and variance 1.
4.3.1 Classical Pole Placement: Random Problems
This subsection contains results for some randomly generated classical pole placement
problems. 1000 problems with n = 6, m = 4 and p = 3 were created. Each problem
was created as follows. A,B, C and K were generated randomly. A scalar multiple of
the identity was added to A to ensure the stability degree of A + BKC was equal to
α = 0.1. The desired poles were taken to be the poles of A+BKC. Initial conditions
were chosen randomly.
An attempt was made to solve each problem using up to 10 different initial conditions and a maximum of 1000 iterations per initial condition. With the termination
parameter ε set to ε = 10−3, the overall success rate was 91%. (The success rate
increases to 95% if up to 20 initial conditions are used.) For the problems that were
solved, the average number of iterations taken was 1.8 × 103 and the average time
taken was 1.5 CPU seconds.
Figure 4.3 shows the performance of the algorithm for classical pole placement problems when up to 10 initial conditions are used per problem.
[Figure: success rate (%) versus no. of initial conditions]
Figure 4.3: Performance for classical pole placement using up to 10 initial conditions.
Note that for classical pole placement problems, in (4.2.10), if Cl = {cl} then PCl(Tkk) = cl.
4.3.2 Classical Pole Placement: Particular Problem
The following problem is taken from [66] and is of interest as the set of desired poles
overlaps with the set of open loop poles. The system matrices are the following:
A =
[ 1  0  0  0
  0  2  0  0
  0  0 −3  0
  0  0  0 −4 ],

B =
[ 1  0
  0  1
  1  0
  1  1 ],

C =
[ 1  1  0  0
  0  0  1  1 ].
In this problem the set of open loop poles is {1, 2, −3, −4} and the aim is to place the closed-loop poles at {−1, −2, −3, −5}. While initial attempts to solve this
problem failed, the problem was solved by using suboptimal matching and replacing
Step 3 of the algorithm with the relaxed projection ‘Y := (1 − t)X + tPM(V, T )’,
t ∈ (0, 2) constant. Strictly speaking this is not a relaxed projection in the true sense
as PM(V, T ) may not be a projection of X = V TV ∗, however, the idea is clearly the
same. Solutions were successfully found by taking t close to 0; t = 0.1, 0.2 and 0.3 can
all be used to successfully find a solution. Likelihood of success increases and speed
of convergence decreases the closer t is to 0. With t = 0.3 and ε = 10−3, solutions can
typically be found in about 1.2 × 10^4 iterations and about 4.7 CPU seconds. (With ε reduced greatly to ε = 10^−14, a solution was found in about 10^6 iterations and 275 CPU seconds.)
Note: when employing relaxed projections, the loop termination criterion of the
algorithm is different. It should be replaced by ‘until ‖X − PM(V, T )‖ < ε’ as now
‖X − Y ‖ = ‖X − [(1− t)X + tPM(V, T )]‖ = t ‖X − PM(V, T )‖.
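In code, the relaxed step is a one-liner, and the identity ‖X − Y‖ = t ‖X − PM(V, T)‖ used to adjust the stopping test can be checked numerically (a hypothetical helper, not thesis code):

```python
import numpy as np

def relaxed_step(X, Y_proj, t):
    """Relaxed update Y := (1 - t) X + t PM(V, T), with t in (0, 2).

    Under this update ||X - Y|| = t ||X - Y_proj||, so the loop should
    terminate on ||X - Y_proj|| < eps instead of ||X - Y|| < eps.
    """
    return (1.0 - t) * X + t * Y_proj
```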
4.3.3 Continuous Time Stabilization: Random Problems
In the comparison paper [26], a number of methods for continuous time stabilization
via static output feedback are compared. In this subsection we repeat the same
numerical experiments of [26] using our algorithm, to see how our algorithm compares.
The experiments involve randomly generated problems. 1000 problems were
generated for each of a number of different choices of system dimensions n, m and
p. In each case, A, B, C and K were generated randomly. A scalar multiple of
the identity was added to A to ensure it had a stability degree of one. A was then
replaced by A−BKC. All stable A’s were discarded.
In applying our algorithm, the Ci’s were chosen the same as in (4.1.1) with α = 0.1.
As we only require the poles of the closed loop system to be stable, we terminated
the algorithm as soon as all poles had real part less than zero. An attempt was made
to solve each problem using up to 10 different initial conditions and a maximum of 100 iterations per initial condition.
Results are given in Table 4.1. As can be seen, all problems were solved. Most
problems were solved in 10 iterations or less and average solution times were very
small. These results are as good as the results for the best two algorithms tested in
[26].
Other values for α, such as 0.2, 0.5, 1, and 2, were also tried and produced just
as good results, indicating a certain robustness of the algorithm with respect to this
parameter.
The number of iterations required for the stabilization problems presented in this subsection is much smaller than the number required for the classical pole placement problems of the prior subsections. This is not unexpected: one would expect the stabilization problems, with their less restrictive constraints, to be easier to solve. The other possibility is that the algorithm is somehow better suited to solving stabilization problems than pole placement problems.
(n, m, p)        (3,1,1)  (6,1,1)  (9,1,1)  (3,2,1)  (6,4,3)  (9,5,4)
1 ≤ i ≤ 10         996      990      991      993      989      982
10 < i ≤ 100         3        7        8        3       11       17
100 < i ≤ 1000       1        3        1        4        0        1
NC                   0        0        0        0        0        0
T                0.0007   0.0007    0.001   0.0006    0.001    0.002
Table 4.1: A comparison of performance for different n, m and p. i denotes the number
of iterations and listed are the number of problems that converged within different
iteration ranges. ‘NC’ denotes the number of problems that did not converge within
1000 iterations. T denotes the average convergence time in CPU seconds for the
problems that converge within 1000 iterations.
Finally we note that being able to solve the continuous time stabilization problem
enables one to solve other related problems. For example the reduced order dynamic
output feedback stabilization problem can also be solved via a system augmentation
technique; see for example [69].
4.3.4 Continuous Time Stabilization: Particular Problem
The following problem taken from [44] appears frequently in the literature. The sys-
tem considered is the nominal linearized model of a helicopter:
A =
−0.0366 0.0271 0.0188 −0.4555
0.0482 −1.0100 0.0024 −4.0208
0.1002 0.3681 −0.7070 1.4200
0.0000 0.0000 1.0000 0.0000
,
B =
[  0.4422   0.1761
   3.5446  −7.5922
  −5.5200   4.4900
   0.0000   0.0000 ],

C = [ 0  1  0  0 ].
In this problem we wish to place the closed loop eigenvalues in the set {z ∈ C | Re(z) ≤ −α} with α = 0.1. To achieve this, we apply the algorithm with A replaced by A + αI. Before proceeding, we define P^γ_M(V, T) as follows. It is a modified version of Definition 4.2.6, specifically for continuous time stabilization problems.
Definition 4.3.1. Let γ ∈ R be nonpositive. For any V ∈ Cn×n unitary and any T ∈ Cn×n upper triangular, define

P^γ_M(V, T) = V T̂V ∗,

where

T̂kl = min{γ, Re(Tkk)} + i Im(Tkk),  if k = l and Re(Tkk) ≥ 0,
T̂kl = Tkl,  otherwise.
Numerical experiments show that the performance of the algorithm can actually be improved by using P^γ_M, which depends on the parameter γ ∈ R. P^γ_M shifts the real parts of the unstable eigenvalues to γ (≤ 0) rather than to 0. (If γ = 0 then P^0_M is just PM.) As we will see in this subsection, choosing certain values of γ increases the likelihood of finding solutions. An intuitive justification of why γ < 0 can improve convergence is the following. During the iteration process, applying PL will tend to shift the eigenvalues back towards the right side of the complex plane. If we replace the real parts of the unstable eigenvalues with γ < 0, even though PL may shift the eigenvalues a little to the right, they may still end up in the left half plane as desired.
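A sketch of P^γ_M in NumPy (names illustrative; note that for the entries it modifies, Re(Tkk) ≥ 0 ≥ γ, so min{γ, Re(Tkk)} is simply γ):

```python
import numpy as np

def proj_M_gamma(V, T, gamma):
    """P^gamma_M of Definition 4.3.1: move the real part of every
    unstable diagonal entry of T to min(gamma, Re Tkk); keep the rest."""
    assert gamma <= 0.0
    That = np.asarray(T, dtype=complex).copy()
    d = np.diag(That)
    shifted = np.minimum(gamma, d.real) + 1j * d.imag
    np.fill_diagonal(That, np.where(d.real >= 0.0, shifted, d))
    return V @ That @ V.conj().T
```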
Regarding the choice of the parameter γ, a number of different values were se-
lected for this problem. When γ ≥ −17, the algorithm was not always convergent.
However, when γ ≤ −18, for example −18,−19,−20, . . . , the algorithm appeared to
be always convergent. Typically the algorithm converged within 1000 iterations, with
computational time under 0.7 CPU seconds. A particular solution is
K = [ 0.0939  1.1127 ]^T

for which A + BKC has eigenvalues −0.1440, −9.3716, −0.1765 ± i0.7909.
4.3.5 Discrete Time Stabilization: Random Problems
This subsection contains results for some randomly generated discrete time sta-
bilization problems. For each problem, the aim is to place all poles in the set C = {z | |z| ≤ α}, α = 0.9. 1000 (A, B, C) triples with n = 6, m = 4 and p = 3 were randomly generated. Triples with A stable were discarded and replaced.
As in Section 4.3.1, an attempt was made to solve each problem using up to 10
different initial conditions and a maximum of 1000 iterations per initial condition.
With ε = 10−3, the success rate for 1 initial condition was 61% and the overall success
rate was 80%. A plot of success rate versus number of initial conditions is shown in
Figure 4.4. For the problems that were solved, the average number of iterations taken
was 3.3× 103 and the average time taken was 0.37 CPU seconds.
Note: in (4.2.10), PCl(Tkk) equals αTkk/|Tkk| if |Tkk| ≥ α and Tkk otherwise.
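This scalar projection onto the disc {z : |z| ≤ α} is radial scaling; a minimal sketch (illustrative name):

```python
import numpy as np

def proj_disc(z, alpha):
    """Nearest point of the closed disc {|z| <= alpha} to the scalar z."""
    z = complex(z)
    r = abs(z)
    return z if r <= alpha else alpha * z / r
```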
[Figure: success rate (%) versus no. of initial conditions]
Figure 4.4: Performance for discrete time stabilization using up to 10 initial conditions.
4.3.6 Discrete Time Stabilization: Particular Problem
The following example is taken from [32]. We wish to stabilize the following system:
A =
[ 0.5    0     0.2   1.0
  0     −0.3   0     0.1
  0.01   0.1  −0.5   0
  0.1    0    −0.1  −1.0 ],

B =
[  1  0
   0  1
   0  0
  −1  0 ],

C =
[ 1  0  0  1
  0  0  1  1 ].
Note A is not stable. We take the constraint set C as in Section 4.3.5 and ε = 10^−3. This problem was easily solved for all initial conditions tried; a stabilizing K could be found on average in 3 iterations or less.
4.3.7 A Hybrid Problem
So far we have presented results for three traditional classes of problems. In this subsection we demonstrate the generality of the algorithm by considering a less standard problem, namely a hybrid problem of the type shown in Figure 4.1(e).
The problem parameters are c = −0.5 + i3, C = {z ∈ C | Re z ≤ −2 and |Im z| ≤ |Re z|}, n = 13, m = 3 and p = 5. To ensure solvability, B, C and K were randomly generated and A set to A = V T̄V^T − BKC with V a random orthogonal matrix and T̄ a real block upper triangular matrix with a spectrum satisfying the constraints. (T̄ was assigned the spectrum {−0.5 ± i3, −2, −2 ± i, −2.3, −2.5, −3 ± i3, −3.5 ± i3.1, −4 ± i4} and was created by choosing appropriate 1×1 and 2×2 blocks for its diagonal. The remaining upper triangular entries of T̄ were chosen randomly.)
With ε = 10−3, 64% of initial conditions tested resulted in a solution of this
problem within 5000 iterations. For the initial conditions that led to convergence,
the average number of iterations taken was 8.6× 102 and the average time taken was
3.1 CPU seconds. The closed loop poles corresponding to a particular solution are
shown in Figure 4.5.
Note: the least squares matching steps were done directly rather than using the
Hungarian method.
4.4 Summary
In this chapter a new methodology for solving a broad class of output feedback pole
placement problems is presented. While the methodology is not guaranteed to find a
solution, numerical experiments presented demonstrate that it can be quite effective
[Figure: closed loop pole locations in the complex plane]
Figure 4.5: The closed loop poles corresponding to a solution for the considered hybrid
problem.
in practice. A particular strength of the algorithm is that, in addition to being able
to solve classical pole placement and stabilization problems, it can also be used to
solve less standard pole placement problems.
Keywords: static output feedback, pole placement, stabilization, alternating projec-
tions.
Chapter 5
A Projective Methodology for
Simultaneous Stabilization and
Decentralized Control
5.1 Introduction
In this chapter, a projective methodology is presented for solving static output feedback simultaneous stabilization problems and stabilization by decentralized static output feedback problems. This chapter contains an overview of the algorithms and some of the numerical results. An analysis of convergence properties and extensive computational experiments remain as future work.
In this chapter, we actually consider the following two problems.
Problem 5.1.1: Static Output Feedback Simultaneous Stabilization.
Given Ai ∈ Rni×ni, Bi ∈ Rni×m, and Ci ∈ Rp×ni for each i = 1, . . . , k, find K ∈ Rm×p such that all the closed loop systems Ai + BiKCi for each i = 1, . . . , k are stable.
The simultaneous stabilization problem was first introduced in [62] and [73]. It arises frequently in practice, due to plant uncertainty, plant variation, failure modes, plants with several modes of operation, or nonlinear plants linearized at several different equilibria [76].
The static output feedback simultaneous stabilization problem has been shown to be
NP-hard [6]. That is, an efficient (i.e., polynomial time) algorithm that is able to
correctly solve all instances of the problem cannot be expected.
Even for three single input single output systems, no tractable simultaneous sta-
bilization design procedure has been proposed [15]. An effective approach to tackle a
variety of simultaneous stabilization problems is through numerical means.
Note that if the Ci's are all the identity, this problem reduces to the state feedback simultaneous stabilization problem. If k = 1, it reduces to the standard static output feedback stabilization problem.
Various numerical algorithms have been proposed for solving Problem 5.1.1, for
example, [11] and [12]. Unlike these methods, our algorithm does not involve LMIs.
Note that Problem 5.1.1 is different from what is usually referred to as the 'simultaneous stabilization problem' in [4] and [72]. Problem 5.1.1 is to find a static output feedback controller simultaneously stabilizing k > 1 systems, while the other problem is to seek a dynamic compensator which stabilizes the given system and is itself stable; the latter is also referred to as the 'strong stabilization problem'.
The other problem considered in this chapter is stabilization by decentralized static output feedback.
Problem 5.1.2: Stabilization by Decentralized Static Output Feedback.
Given A ∈ Rn×n, Bi ∈ Rn×mi, and Ci ∈ Rpi×n for each i = 1, . . . , k, find Ki ∈ Rmi×pi's such that

A + ∑_{i=1}^k BiKiCi

is stable.
For many large systems, such as electric power systems, transportation systems and whole economic systems, it is desirable to decentralize the control task. On the one hand, using decentralized control greatly reduces the design complexity; on the other hand, decentralized control is especially preferable if the measurements are taken on local channels and the controls can be applied on local channels only. Hence decentralized control has attracted a great deal of research on large scale control problems.
Problem 5.1.2 has been extensively researched and many numerical algorithms have been proposed, for example, [13] and [47].
Stabilization by decentralized static output feedback is NP-hard if one imposes a
bound on the norm of the controller or if the blocks are constrained to be identical,
that is, all Ki’s are identical [6].
Note that Problem 5.1.2 can be regarded as a generalization of the centralized
problem to include a block diagonal structure constraint on K.
This chapter presents a projective methodology for both Problems 5.1.1 and 5.1.2. The problems are shown to be equivalent to finding a point in the intersection of two particular sets. Alternating projection ideas are employed to find a point in the intersection. One of the sets is an affine subspace and hence convex; we know exactly how to project onto it. However, the other set is closed but nonconvex, and moreover it is a rather complicated set. To the best of our knowledge, how to project onto this set is a difficult open problem. We use a substitute mapping that maps onto the set, which gives a reasonable approximation of the actual projection operator.
The effectiveness of the algorithms is demonstrated in computational results.
The structure of the chapter is as follows.¹ The methodology for both problems is presented in Section 5.2. Section 5.3 contains computational results demonstrating the effectiveness of the algorithms.
5.2 Methodology
This section presents the solution algorithms for Problems 5.1.1 and 5.1.2. Throughout this section Cn×n will be regarded as a Hilbert space with inner product ⟨Y, Z⟩ = tr(Y Z∗) = ∑_{i,j} yij z̄ij, and the associated norm is the Frobenius norm ‖Z‖ = ⟨Z, Z⟩^{1/2}.
5.2.1 Simultaneous Stabilization
Throughout this subsection it is assumed that (Ai, Bi, Ci)’s for i = 1, . . . , k are given
and fixed.
In this subsection, we define LS to be the set of all possible closed loop system matrices,

LS = {(Z1, . . . , Zk) ∈ Rn1×n1 × · · · × Rnk×nk | Zi = Ai + BiKCi, i = 1, . . . , k, for some K ∈ Rm×p},

and define MS to be the set of matrix tuples with eigenvalues in the left half complex plane,

MS = {(Z1, . . . , Zk) ∈ Cn1×n1 × · · · × Cnk×nk | ρ(Zi) ≤ 0, i = 1, . . . , k}.

¹Please refer to Chapter 2 for background on projections.
Problem 5.1.1 can now be stated as:
Find X1, . . . , Xk ∈ LS ∩MS.
A solution strategy for Problem 5.1.1 would be to employ an alternating projection scheme, alternately projecting between LS and MS. While LS is an affine subspace and hence convex, MS is in general a rather complicated nonconvex set. Alternating projections between LS and MS are not guaranteed to converge. More importantly, how to calculate projections onto MS is a hard open problem.

As will be shown in the experiments, an alternating projection like scheme can still be quite successful if, instead of using a true projection map for MS, a suitable substitute is employed.
Schur's result in Theorem 4.2.5 shows that a complex square matrix is unitarily equivalent to an upper triangular complex matrix.

The following mapping is used as a substitute for the true projection map onto MS. Though it is not a true projection map, the notation PMS will still be used. The choice of PMS is motivated by the solution of the symmetric static output feedback stabilization problem, where a true projection can be found.
Definition 5.2.1. For any Vi ∈ Cni×ni unitary and any Ti ∈ Cni×ni upper triangular, define

PMS(Vi, Ti) = Vi T̂i Vi∗,

where

(T̂i)kl = min{0, Re((Ti)kk)} + i Im((Ti)kk),  if k = l,
(T̂i)kl = (Ti)kl,  otherwise.
Given starting points Z1 ∈ Rn1×n1, . . . , Zk ∈ Rnk×nk, we apply PMS(Vi, Ti) to each Zi, i = 1, . . . , k. Note that PMS maps into MS. In order to apply PMS to Zi, a Schur decomposition of Zi must first be found. A given Zi may have nonunique Schur decompositions and they may give different projection points. This fact is not so important, as different Schur decompositions lead to points in MS of equal distance from the original matrix. The proof of this fact is similar to the proof of Theorem 4.2.7 and hence is omitted here. PMS(Vi, Ti) keeps Vi fixed and modifies Ti. Refer to the proof of Theorem 4.2.8 for a similar idea, which shows that of all the points in MS that have a Schur decomposition with the same Vi matrix, PMS(Vi, Ti) is the closest (or at least equal closest) to the original point Zi = ViTiVi∗.
The projection of (X1, . . . , Xk), Xi ∈ Rni×ni, onto LS involves solving a standard least squares problem.

Lemma 5.2.2. The projection of (X1, . . . , Xk) onto LS is given by PLS(Xi) = Ai + BiKCi, i = 1, . . . , k, where K is a solution of the least squares problem

arg min_{K∈Rm×p} ‖ M vec(K) − b ‖₂,

with M the vertical stacking of the blocks C1^T ⊗ B1, . . . , Ck^T ⊗ Bk and b the vertical stacking of vec[Re(X1) − A1], . . . , vec[Re(Xk) − Ak].

Proof. This proof is similar to the proof of Lemma 4.2.9 and hence is omitted here.
Here is our algorithm for Problem 5.1.1.
Algorithm:
Problem Data. Ai ∈ Rni×ni , Bi ∈ Rni×m, Ci ∈ Rp×ni for each i = 1, . . . , k.
Initialization. Choose randomly generated matrices Yi ∈ Rni×ni for each i = 1, . . . , k.
repeat
1. Xi := PLS(Yi).
2. Calculate a Schur decomposition of each Xi: Xi = ViTiVi∗.
3. Yi := PMS(Vi, Ti).
until ρ(Xi) ≤ 0 for all i = 1, . . . , k.
As we only require the eigenvalues of all the closed loop systems to be stable, we
terminate the algorithm as soon as all the eigenvalues have real parts less than or equal to zero.
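Because one gain K must serve all k systems, the projection of Lemma 5.2.2 stacks the k blocks vertically into a single least squares problem. A NumPy sketch (names illustrative, not thesis code):

```python
import numpy as np

def proj_LS(Xs, As, Bs, Cs):
    """Project (X_1, ..., X_k) onto LS: one K for all k systems.

    Stacks the k blocks (C_i^T kron B_i) vec(K) = vec(Re X_i - A_i)
    into a single least squares problem (Lemma 5.2.2).
    """
    M = np.vstack([np.kron(C.T, B) for B, C in zip(Bs, Cs)])
    rhs = np.concatenate([(np.real(X) - A).flatten(order="F")
                          for X, A in zip(Xs, As)])
    k, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    K = k.reshape(Bs[0].shape[1], Cs[0].shape[0], order="F")
    return [A + B @ K @ C for A, B, C in zip(As, Bs, Cs)], K
```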
5.2.2 Decentralized Control
Throughout this subsection it is assumed that A, Bi and Ci for each i = 1, . . . , k are
given and fixed.
In this subsection we define LD to be the set of all possible closed loop system matrices,

LD = {Z ∈ Rn×n | Z = A + ∑_{i=1}^k BiKiCi for some K1 ∈ Rm1×p1, . . . , Kk ∈ Rmk×pk},

and define MD to be the set of matrices with eigenvalues in the left half complex plane,

MD = {Z ∈ Cn×n | ρ(Z) ≤ 0}.
Problem 5.1.2 can now be stated as:
Find X ∈ LD ∩MD.
Again an alternating projection scheme is employed to solve Problem 5.1.2. As projecting onto MD is an unsolved problem, a substitute for the actual projection operator is employed. Computational experiments show that this scheme is quite successful in practice.
The following map PMD(V, T ) is used as a substitute for the projection map onto
MD.
Definition 5.2.3. For any V ∈ Cn×n unitary and any T ∈ Cn×n upper triangular, define

PMD(V, T) = V T̂V ∗,

where

T̂kl = min{0, Re(Tkk)} + i Im(Tkk),  if k = l,
T̂kl = Tkl,  otherwise.
The projection of X ∈ Rn×n onto LD involves solving a standard least squares problem.
Lemma 5.2.4. The projection of X ∈ Rn×n onto LD is given by PLD(X) = A + ∑_{i=1}^k BiKiCi, where vec(K), which stacks the vec(Ki)'s in order, is a solution of the least squares problem

arg min_{vec(K) ∈ R^{∑ mipi}} ‖ [ C1^T ⊗ B1  · · ·  Ck^T ⊗ Bk ] vec(K) − vec[Re(X) − A] ‖₂.

Proof. This proof is similar to the proof of Lemma 4.2.9 and hence is omitted here.
Here is our algorithm for Problem 5.1.2.
Algorithm:
Problem Data. A ∈ Rn×n, Bi ∈ Rn×mi , Ci ∈ Rpi×n for each i = 1, . . . , k.
Initialization. Choose a randomly generated matrix Y ∈ Rn×n.
repeat
1. X := PLD(Y ).
2. Calculate a Schur decomposition of X: X = V TV ∗.
3. Y := PMD(V, T ).
until ρ(X) ≤ 0.
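For the decentralized projection of Lemma 5.2.4, the blocks Ci^T ⊗ Bi are instead concatenated horizontally and the solution vector is split back into the individual Ki's. A NumPy sketch (names illustrative, not thesis code):

```python
import numpy as np

def proj_LD(X, A, Bs, Cs):
    """Project X onto LD = {A + sum_i B_i K_i C_i} (Lemma 5.2.4).

    Horizontally concatenates the blocks C_i^T kron B_i and splits the
    stacked least squares solution back into the K_i's.
    """
    M = np.hstack([np.kron(C.T, B) for B, C in zip(Bs, Cs)])
    rhs = (np.real(X) - A).flatten(order="F")
    sol, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    Ks, pos = [], 0
    for B, C in zip(Bs, Cs):
        m, p = B.shape[1], C.shape[0]
        Ks.append(sol[pos:pos + m * p].reshape(m, p, order="F"))
        pos += m * p
    Z = A + sum(B @ K @ C for B, K, C in zip(Bs, Ks, Cs))
    return Z, Ks
```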
5.3 Computational Results
This section contains computational results of applying the algorithms to both Problems 5.1.1 and 5.1.2. We present results for both randomly generated problems and particular problems from the literature. As mentioned in the introduction, more extensive numerical experiments remain as future work for this thesis.
The algorithms for each problem were implemented in Matlab 6.5 and all results
were obtained using a 3.19 GHz Pentium 4 machine.
Throughout this section a randomly generated matrix will be a matrix whose
entries are drawn from a normal distribution of zero mean and variance 1.
5.3.1 Simultaneous Stabilization: Random Problems
This subsection contains results for some randomly generated static output feedback
simultaneous stabilization problems. 1000 problems were created for each of a number of different choices for the system dimensions (n, m, p). Each problem was
created as follows. System matrices Ai, Bi, Ci for each i = 1, . . . , k and K were
generated randomly. A scalar multiple of the identity was added to Ai to ensure it
had a stability degree of 0.1. Ai was then replaced by Ai − BiKCi. All stable Ai’s
were discarded.
Numerical experiments show that the performance of the algorithm can actually be improved by using a slightly modified version of PMS which depends on a parameter γ ∈ R. A formal definition of the analogous projection map can be found in Definition 4.3.1; P^γ_MS is obtained using the same idea. P^γ_MS shifts the real parts of the unstable eigenvalues to γ (≤ 0) rather than to 0. In these experiments, we chose γ = −1. An attempt was made to solve each problem using up to 10 different initial conditions and a maximum of 1000 iterations per initial condition.
(n, m, p)     (3,1,1)        (6,4,3)        (9,5,4)
k             3      6       3      6       3      6
S.R.         97%    98%    100%   100%    100%    99%
i            138    251      38    161      49    222
T           0.11   0.41    0.05   0.42    0.11   0.87
Table 5.1: A comparison of performance for different n,m, p and the number of
systems k. S. R. denotes the success rate, T denotes the average convergence time
in CPU seconds, and i the average number of iterations. T and i are based only on
those problems that were successfully solved.
As can be seen from Table 5.1, the results are quite good. The algorithm solved most of the problems and the average solution times were very small. Note that the different choices of dimensions have their own difficulties. Kimura's stabilization condition is m + p > n (see Theorem A.0.3). Only (n, m, p) = (6, 4, 3) meets this condition. In the case of (n, m, p) = (9, 5, 4), m + p = n, while the (3, 1, 1) problems have m + p < n and are quite hard problems.
5.3.2 Simultaneous Stabilization: Particular Problems
This subsection contains results for some particular simultaneous stabilization problems from the literature. In order to present a greater number of results, rather than
presenting the details of each problem, only references are given. For each problem,
100 random initial conditions were tested and the maximum number of iterations
per initial condition was set to 1000. As soon as all the systems were stable, the
experiments were terminated.
No.  references                               (n, m, p)  k  S.R. (%)  T     i
1    [11, Ex. 1]                              (2, 1, 1)  3    98     0.13   28
2    [11, Ex. 2], [12, Ex. 2], [75, Ex. 1]    (3, 1, 2)  4    96     0.14   31
3    [11, Ex. 3]                              (3, 1, 3)  3    96     0.18   41
4    [11, Ex. 4], [12, Ex. 1]                 (2, 1, 1)  3    95     0.18   40
Table 5.2: Results for particular examples from literature. T and i are based only on
those problems that were successfully solved.
As can be seen from Table 5.2, performance was very good. Solutions to each
problem could be found. The success rates are all greater than or equal to 95%, and both the average iterations and average times taken for successfully solved problems are
very small.
5.3.3 Decentralized Control: Random Problems
This subsection contains results for some randomly generated problems of stabilization by decentralized static output feedback. 1000 problems were created for each of a number of different choices for the system dimension n and controller dimensions (mi, pi), i = 1, . . . , k. Each problem was created as follows. System matrices
A, Bi, Ci and Ki for i = 1, . . . , k were generated randomly. A scalar multiple of the identity was added to A to ensure it had a stability degree of 0.1. A was then replaced by A − ∑_{i=1}^k BiKiCi. All stable A's were discarded.
Numerical experiments show that the performance of the algorithm can actually be improved by using P^γ_MD, defined following the same idea as Definition 4.3.1. Here we choose the value of γ to be −1.
n           10      10      20      20      50      50
k            2       3       2       3       2       3
(m1, p1)  (2,2)   (2,2)   (3,4)   (3,4)   (5,7)   (5,7)
(m2, p2)  (2,2)   (1,3)   (4,3)   (4,3)   (7,5)   (7,5)
(m3, p3)    -     (3,1)     -     (4,4)     -     (6,6)
S.R.      100%    100%    100%    100%    100%    100%
i           39     9.7      17     3.8      22     6.5
T         0.04    0.01    0.10    0.05     3.3     2.2
Table 5.3: A comparison of performance for randomly generated problems with dif-
ferent n,mi, pi and the number of subsystems k. T and i are based only on those
problems that were successfully solved.
As can be seen from Table 5.3, the algorithm performs very well, solving every problem. Not surprisingly, for problems of the same dimension n, the more channels the control task is divided into, the faster a solution is found and the fewer iterations are needed.
5.4 Summary
In this chapter we present a novel methodology for solving static output feedback
simultaneous stabilization and stabilization by decentralized static output feedback
problems. Numerical experiments show that the algorithms are very effective in
practice, though the methodology is not guaranteed to find a solution.
Keywords: static output feedback, simultaneous stabilization, decentralized control,
alternating projections.
Chapter 6
Trust Region Methods for Classical Pole Placement
6.1 Introduction
Pole placement via static output feedback is a classical problem in systems and con-
trol theory, and there exists a great deal of research into this topic. Unfortunately, the
problem of determining solvability of static output feedback pole placement problems
has recently been shown to be NP-hard [31]. Though sufficient conditions for solv-
ability exist, the survey paper [61] states that these conditions are mainly theoretical
in nature and that there are no good numerical algorithms available in many cases
when a problem is known to be solvable. New algorithms for this important problem
are certainly of interest.
In this chapter we present two related numerical algorithms for solving static
output feedback pole placement problems.
Problem 6.1.1. Given system matrices A ∈ Rn×n, B ∈ Rn×m, C ∈ Rp×n and desired
eigenvalues λD ∈ Cn, find K ∈ Rm×p such that
λ(A + BKC) = λD.
Two trust region approaches are considered for solving the following unconstrained nonlinear least squares problem

min_{K∈Rm×p} f(K) := (1/2) ‖λ(A + BKC) − λD‖₂².   (6.1.1)
Here λ(A+BKC) denotes the vector of eigenvalues of A+BKC, with entries sorted
to give the minimum norm. Trust region methods, which are well known in the
optimization community, are a type of iterative method for minimizing nonconvex
functions. The specific trust region methods we use are the trust region Newton
method and the Levenberg-Marquardt method. In order to employ the trust region
Newton method, at each iteration, the first and second derivatives of the eigenvalues
of A+BKC must be calculated. The Levenberg-Marquardt based algorithm has the
advantage of only requiring the first derivatives of the eigenvalues. Both resulting
algorithms have the desirable property that, near solutions, they typically converge
quadratically.
A technicality that arises in the proposed approaches is that the eigenvalues of
A + BKC may not be differentiable everywhere. They will however be differentiable
at all points at which A + BKC has distinct eigenvalues. A consequence of this is
that the algorithms are only appropriate for problems for which the desired poles are
distinct. It turns out that the algorithms can still be used to solve problems whose
desired poles are distinct but whose separation is quite small and hence this does not
appear to be a serious limitation.
The idea of solving pole placement type problems by utilizing eigenvalue deriva-
tives is not completely new. Related ideas have been used to solve a noncontrol
related inverse eigenvalue problem involving symmetric matrices [30]. However, the
methods presented in [30] are all local in nature and hence require one to start suffi-
ciently close to a solution for them to converge. An important distinguishing feature
of our algorithms is that the use of a trust region methodology means that this is
not the case for our algorithms. First derivatives of eigenvalues (though not second
derivatives) have also been used to solve various control problems. For example, in
[38] they are used to try to achieve pole placement in certain convex regions. The
methodology that is used there is quite different to the one used here and is based on
convex programming techniques. As it requires that the open loop poles are already
quite close to the desired poles, this method also only works locally. Along the same
lines as [38], first derivatives of eigenvalues have also been used for robustness analysis
and stabilization, see [59].
This chapter is structured as follows. Section 6.2 contains an overview of trust
region methods. In order to use trust region methods to solve the pole placement
problem, the first and second derivatives of f are required. Details of these
calculations, including how to calculate derivatives of the eigenvalues, are given in Section
6.3. Trust region methods require the function to be minimized to be differentiable.
While f will typically only be differentiable on an open dense set, this turns out to
be sufficient. Such issues are addressed in Section 6.4. Section 6.5 contains compu-
tational results, including results for a number of problems from the literature.
6.2 Trust Region Methods
This section gives an overview of trust region methods. It is assumed that the function
f : RN → R to be minimized is (sufficiently) smooth. The actual f we wish to
minimize is given in (6.1.1) and it may not satisfy this assumption. Issues related to
this fact are addressed in Section 6.4. Additional information on trust region methods
can be found in [24] and [56].
6.2.1 Basic Methodology
Trust region methods are iterative in nature and can be used to minimize smooth
nonconvex functions. Given a current iterate xk, they construct a possibly nonconvex,
quadratic approximation of the objective function about xk. This model is only assumed to be a
good approximation in a certain ball centered about xk. This is the so-called ‘trust
region’. It turns out that, numerically, it is possible to readily minimize a quadratic
function over a ball. Doing so gives a candidate step pk. The step pk is only accepted
if the difference in the objective function, f(xk) − f(xk + pk), is sufficiently close to
the difference predicted by the model. If pk is not acceptable, the trust region radius
is decreased and the process repeated. On the other hand, if the model gives a good
prediction, the radius of the trust region may be increased to allow a larger step in
the next iteration.
What follows describes the trust region method in greater detail. At each iteration,
the quadratic approximation is assumed to be of the form
mk(p) = f(xk) + ∇f(xk)ᵀ p + (1/2) pᵀ Bk p.
Here Bk is typically either the Hessian of f at xk or some approximation of this
Hessian. If Bk is the Hessian, then mk is simply the 2nd order Taylor approximation
of f at xk. As will be discussed below, it may also be useful to consider other choices
for Bk.
Each constrained minimization problem is of the form
min_{p ∈ RN} mk(p)   s.t.   ‖p‖₂ ≤ ∆k,    (6.2.1)
where ∆k > 0 is the current trust region radius. The solution pk of (6.2.1) gives a
potential step. Whether or not it is a suitable step is assessed by considering the
ratio of actual reduction of the objective to the predicted reduction:
ρk = [ f(xk) − f(xk + pk) ] / [ mk(0) − mk(pk) ].    (6.2.2)
The overall trust region method is as follows.
Trust Region Method, Generic Algorithm ([56])
Given ∆̄ > 0, ∆0 ∈ (0, ∆̄), and η ∈ [0, 1/4):
for k = 0, 1, 2, . . .
    Obtain pk by (approximately) solving (6.2.1);
    Evaluate ρk from (6.2.2);
    if ρk < 1/4
        ∆k+1 = ∆k/4
    else
        if ρk > 3/4 and ‖pk‖₂ = ∆k
            ∆k+1 = min{2∆k, ∆̄}
        else
            ∆k+1 = ∆k;
    if ρk > η
        xk+1 = xk + pk
    else
        xk+1 = xk;
end (for).
Approximate solutions of the constrained quadratic minimization problem (6.2.1)
can be obtained in a number of ways. One way is the nearly exact solution method
described in [56, Section 4.2]: it can be shown that problem (6.2.1) is equivalent to
finding a p and a scalar γ ≥ 0 such that the following conditions hold,
‖p‖2 ≤ ∆k,
(Bk + γI)p = −∇f(xk),
γ(∆k − ‖p‖2) = 0,
(Bk + γI) is positive semidefinite.
Without going into the details we mention that finding a p and γ that satisfy these
conditions is equivalent to solving a one dimensional root finding problem in γ which
can be solved using a Newton method.
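As an illustration, the generic loop and a nearly exact subproblem solver can be sketched as follows. This is a minimal sketch, not the Matlab implementation used in this thesis: γ is located by a safeguarded bisection rather than the Newton root finder mentioned above, and the so-called hard case is not treated.

```python
import numpy as np

def solve_subproblem(B, g, delta):
    """Nearly exact solution of min_p g^T p + 0.5 p^T B p s.t. ||p||_2 <= delta.
    Seeks gamma >= 0 with (B + gamma I) positive semidefinite,
    p = -(B + gamma I)^{-1} g and gamma * (delta - ||p||) = 0."""
    n = len(g)
    lam_min = np.linalg.eigvalsh(B)[0]
    if lam_min > 0:                      # try the interior (gamma = 0) solution
        p = np.linalg.solve(B, -g)
        if np.linalg.norm(p) <= delta:
            return p
    # boundary solution: bisection on gamma so that ||p(gamma)|| = delta
    lo = max(0.0, -lam_min) + 1e-12
    hi = lo + 1.0
    while np.linalg.norm(np.linalg.solve(B + hi * np.eye(n), -g)) > delta:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if np.linalg.norm(np.linalg.solve(B + mid * np.eye(n), -g)) > delta:
            lo = mid
        else:
            hi = mid
    return np.linalg.solve(B + hi * np.eye(n), -g)

def trust_region(f, grad, hess, x0, delta_max=10.0, eta=0.1, tol=1e-8, max_iter=500):
    """Generic trust region method with the radius update rules given above."""
    x, delta = np.asarray(x0, dtype=float), 1.0
    for _ in range(max_iter):
        g, B = grad(x), hess(x)
        if np.linalg.norm(g) < tol:
            break
        p = solve_subproblem(B, g, delta)
        pred = -(g @ p + 0.5 * p @ B @ p)            # m(0) - m(p)
        rho = (f(x) - f(x + p)) / pred if pred > 0 else -1.0
        if rho < 0.25:
            delta *= 0.25
        elif rho > 0.75 and abs(np.linalg.norm(p) - delta) < 1e-9:
            delta = min(2.0 * delta, delta_max)
        if rho > eta:
            x = x + p
    return x
```

For instance, applied to the Rosenbrock function with exact gradient and Hessian (i.e. Bk = ∇²f(xk)), the iterates settle near the minimizer (1, 1).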
Regarding the choice of Bk’s, the Hessian of f at xk is a natural choice. In this case,
the method is called the trust region Newton method. When the objective function
f is a least squares cost, say f(x) = (1/2) ∑_{i=1}^{M} ri²(x) for some functions ri : RN → R,
there is another suitable choice. In this case, if we define r(x) = (r1(x), . . . , rM(x))ᵀ
and let J(x) denote the Jacobian of r(x), then

∇f(x) = J(x)ᵀ r(x)   and   ∇²f(x) = J(x)ᵀ J(x) + ∑_{i=1}^{M} ri(x) ∇²ri(x),
and a good choice for Bk is J(xk)T J(xk). The advantages of this choice for Bk include
the fact that it does not require the calculation of the second derivatives of the ri’s and
that it gives a good approximation of ∇2f(xk) when f(xk) is small, that is, when each
ri(xk) is small. For this choice of Bk’s, the method is called the Levenberg-Marquardt
method.
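The two choices of Bk can be compared on a small least squares cost. The sketch below uses an arbitrary two-residual example of our own (not the pole placement cost): it checks the identity ∇f = Jᵀr against a finite difference, then iterates with the Levenberg-Marquardt model Bk = JᵀJ, which behaves like Newton's method near a zero-residual solution since the neglected term ∑ ri ∇²ri vanishes there.

```python
import numpy as np

# A toy least squares cost f(x) = 0.5 * sum_i r_i(x)^2 (illustrative only).
def r(x):
    return np.array([x[0]**2 + x[1] - 1.0,
                     x[0] + x[1]**2 - 1.0])

def jac(x):                              # Jacobian of r
    return np.array([[2.0 * x[0], 1.0],
                     [1.0, 2.0 * x[1]]])

def f(x):
    return 0.5 * r(x) @ r(x)

# gradient identity: grad f(x) = J(x)^T r(x)
x = np.array([0.3, 0.4])
h, e = 1e-6, np.eye(2)
fd_grad = np.array([(f(x + h * e[i]) - f(x - h * e[i])) / (2.0 * h)
                    for i in range(2)])
assert np.allclose(jac(x).T @ r(x), fd_grad, atol=1e-5)

# steps with B_k = J^T J (trust region assumed inactive): fast convergence
# toward a zero-residual point, without any second derivatives of the r_i's
for _ in range(25):
    J, res = jac(x), r(x)
    x = x - np.linalg.solve(J.T @ J, J.T @ res)
print(x, np.linalg.norm(r(x)))
```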
6.2.2 Convergence Results
This section contains some general convergence results. These results have been
specialized for our purposes and the given references usually refer to more general
results. The results in this section all assume the nearly exact solution method is
used for the subproblems (6.2.1) and that the algorithm parameter η is nonzero, that
is, η ∈ (0, 1/4).
A simple but important property of trust region methods is that the cost is non-
increasing from one iteration to the next: for all k ≥ 0, f(xk) ≥ f(xk+1).
Here are some less simple properties. The following result concerns global conver-
gence to stationary points.
Theorem 6.2.1 ([56, Th 4.8]). Suppose that on the sublevel set
{ x | f(x) ≤ f(x0) },    (6.2.3)
f is twice continuously differentiable and bounded below, and that ‖Bk‖ ≤ β for some
constant β. Then
lim_{k→∞} ∇f(xk) = 0.
Theorem 6.2.1 holds for both the trust region Newton method and the Levenberg-
Marquardt method. For the former method, the following result also holds.
Theorem 6.2.2 ([56, Th 4.9]). Suppose the set (6.2.3) is compact, that f is twice
continuously differentiable on this set, and that Bk = ∇2f(xk). Then the xk’s have a
limit point x∗ that satisfies the following first and second order necessary conditions
for a local minimum,

∇f(x∗) = 0,   ∇²f(x∗) is positive semidefinite.
The following local convergence result for the trust region Newton method implies
that near a (strict) local minimum the method reduces to a pure Newton method.
Theorem 6.2.3 ([56, Th 6.4]). Suppose that Bk = ∇2f(xk). Further, suppose the
sequence of xk’s converges to a point x∗ that satisfies the following first and second
order sufficient conditions for a (strict) local minimum,

∇f(x∗) = 0,   ∇²f(x∗) is positive definite,

and that f is three times continuously differentiable in a neighborhood of x∗. Then the trust
region bound ∆k becomes inactive for all k sufficiently large.
A consequence of Theorem 6.2.3 is that local convergence for the trust region
Newton method is usually quadratic, just as for the pure Newton method.
The final result in this section shows that the Levenberg-Marquardt method is
often locally quadratically convergent to global minima. (Note that this may not be
the case for local minima that are not global minima.)
Theorem 6.2.4 ([56, Section 10.2]). Suppose the ri’s that determine f are three
times continuously differentiable in a neighborhood of a global minimum x∗. Suppose further
that J(x∗)ᵀJ(x∗) is positive definite. Then the Levenberg-Marquardt method is locally
quadratically convergent to x∗.
6.3 Derivative Calculations
In order to apply the trust region methods, we need to calculate the appropriate first
and second derivatives. As already mentioned, the eigenvalues of A + BKC may not
be differentiable everywhere. For example, the eigenvalues of
    [ 4  0 ]   [ k ]
    [ 0  0 ] + [ l ] [ 1  1 ]    (6.3.1)

are 2 ± √(4 + 4k) when l = −k and hence they are not differentiable at (k, l) = (−1, 1).
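This loss of differentiability is easy to observe numerically. The following sketch evaluates the eigenvalues of the matrix in (6.3.1) along the line l = −k and compares them with the closed form 2 ± √(4 + 4k), whose square root has a branch point at k = −1, where the eigenvalues coalesce at 2.

```python
import numpy as np

def eigs_on_line(k):
    """Eigenvalues of the example matrix (6.3.1) with l = -k."""
    M = np.array([[4.0, 0.0], [0.0, 0.0]]) + np.outer([k, -k], [1.0, 1.0])
    return np.sort(np.linalg.eigvals(M).real)

for k in (0.0, -0.5, -0.99):
    closed_form = np.array([2.0 - np.sqrt(4.0 + 4.0 * k),
                            2.0 + np.sqrt(4.0 + 4.0 * k)])
    print(k, eigs_on_line(k), closed_form)
```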
The next result, which follows from a result in [43, Section 2.5.7], shows that lack
of differentiability cannot occur at points at which the eigenvalues are distinct.
Theorem 6.3.1. Consider a matrix valued function A : RN → Rn×n. Suppose A(x)
is k-times continuously differentiable in x in an open neighborhood Ω. Furthermore
suppose that at each point in Ω, A(x) has distinct eigenvalues. Then the eigenvalues
of A(x) are k-times continuously differentiable in Ω.
Suppose the conditions of Theorem 6.3.1 are satisfied with k ≥ 2. Then we can
write down explicit expressions for the first and second derivatives of the eigenvalues.
(Part of the following results appear in [48]. The rest we have proved using similar
techniques to those appearing in that paper.) If λi denotes the ith eigenvalue of
A(x), suppose D = diag(λ1, . . . , λn) and let X ∈ Cn×n and Y ∈ Cn×n be such that
A(x)X = XD and Y ∗X = I. Then
∂λi/∂xk = ( Y∗ (∂A(x)/∂xk) X )_ii ,    (6.3.2)

and if we define

P = Y∗ (∂A(x)/∂xk) X   and   Q = Y∗ (∂A(x)/∂xl) X,    (6.3.3)

then

∂²λi/(∂xk ∂xl) = ( Y∗ (∂²A(x)/(∂xk ∂xl)) X )_ii + ∑_{j=1, j≠i}^{n} (Pij Qji + Pji Qij)/(λi − λj).    (6.3.4)
These results can be used to calculate the derivatives of our objective function
f(K) = (1/2) ∑_{i=1}^{n} (λi − λD_i)∗(λi − λD_i). Differentiating we have

∂f(K)/∂Kkl = Re ∑_{i=1}^{n} (λi − λD_i)∗ ∂λi/∂Kkl    (6.3.5)
and
∂²f(K)/(∂Kkl ∂Kpq) = Re [ ∑_{i=1}^{n} (∂λi/∂Kkl)∗ (∂λi/∂Kpq) + ∑_{i=1}^{n} (λi − λD_i)∗ ∂²λi/(∂Kkl ∂Kpq) ].    (6.3.6)
Note that ‘∂A(x)/∂xk’ is given by

∂(A + BKC)/∂Kkl = Bk Cl,    (6.3.7)
where Bk is the kth column of B and Cl is the lth row of C. Identity (6.3.7) implies
that the first term appearing in (6.3.4) is always zero. Combining (6.3.2)–(6.3.7) we
now have a complete characterization of the first and second derivatives of our cost
(at points where A + BKC has distinct eigenvalues).
Note that when applying the Levenberg-Marquardt method, the approximate sec-
ond derivatives are given by the first term in (6.3.6),
∂²f(K)/(∂Kkl ∂Kpq) ≈ Re ∑_{i=1}^{n} (∂λi/∂Kkl)∗ (∂λi/∂Kpq).    (6.3.8)
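Formulas (6.3.2) and (6.3.7) are straightforward to implement. The following sketch (randomly generated data, not the thesis code) computes ∂λi/∂Kkl from the right and left eigenvectors of A + BKC and checks it against a finite difference, matching each perturbed eigenvalue to the nearest unperturbed one, which is valid here since the eigenvalues are distinct.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, p = 5, 2, 3
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))
K = rng.standard_normal((m, p))

def eig_and_derivative(M, dM):
    """Eigenvalues of M and their derivatives in the direction dM, via (6.3.2):
    dlambda_i = (Y* dM X)_ii where M X = X D and Y* X = I."""
    w, X = np.linalg.eig(M)
    Y = np.linalg.inv(X).conj().T        # chosen so that Y* X = I
    return w, np.diag(Y.conj().T @ dM @ X)

k, l = 0, 1
dM = np.outer(B[:, k], C[l, :])          # (6.3.7): derivative of A + BKC w.r.t. K_kl
w, dw = eig_and_derivative(A + B @ K @ C, dM)

# finite difference check
h = 1e-7
Kh = K.copy()
Kh[k, l] += h
wh = np.linalg.eig(A + B @ Kh @ C)[0]
wh_matched = np.array([wh[np.argmin(np.abs(wh - wi))] for wi in w])
print(np.max(np.abs((wh_matched - w) / h - dw)))   # small: finite difference error
```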
6.4 Additional Comments
When evaluating the cost f(K), the eigenvalues of A + BKC must be matched
with the desired eigenvalues in a least squares sense. Suppose a given problem has
distinct desired eigenvalues and that it is solvable. Then, sufficiently near a solution
of the problem, the eigenvalues of A + BKC will be distinct, which eigenvalues of
A + BKC match to which desired eigenvalues will not change, and the eigenvalues
of A + BKC will depend smoothly on K. As a result, for problems that are solvable
and have distinct desired eigenvalues, our objective function f will be smooth in
a neighborhood of solutions. An important consequence of this is that the results
from Section 6.2.2 regarding local convergence to solutions still apply. In particular,
near solutions of problems with distinct eigenvalues, both our algorithms will often
converge quadratically.
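The matching itself, i.e. ordering λ(A + BKC) so that ‖λ(A + BKC) − λD‖₂ is minimized, is a small combinatorial least squares problem (see Section 4.2.2). A brute-force sketch over permutations illustrates it; for larger n an assignment method such as the Hungarian algorithm is preferable. The numbers below are made up for illustration.

```python
from itertools import permutations

def match_min_norm(lam, lam_d):
    """Return the ordering of lam that minimizes sum_i |lam_i - lam_d_i|^2.
    Brute force over all permutations; fine for small n only."""
    best = min(permutations(lam),
               key=lambda perm: sum(abs(a - b)**2 for a, b in zip(perm, lam_d)))
    return list(best)

lam_d = [-1.0, -2.0 + 1j, -2.0 - 1j, -3.0]           # desired eigenvalues
lam = [-2.1 - 1j, -0.9, -3.05, -1.9 + 1.1j]          # e.g. lambda(A + BKC)
print(match_min_norm(lam, lam_d))
# → [-0.9, (-1.9+1.1j), (-2.1-1j), -3.05]
```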
The comments above address the behavior of the algorithms in a neighborhood
of a solution. What about behavior far away from a solution? Are the algorithms
even defined in such regions? Considering the steps involved, all that is required for
the algorithms to be well defined is that, for each iterate, A + BKC have distinct
eigenvalues. If the desired eigenvalues are distinct and a generic initial condition is
used, it is unlikely that, for either algorithm, A + BKC has repeated eigenvalues at
any iterate. Hence, under these mild assumptions, the algorithms should be well
defined, and this is indeed what is observed in practice.
If the desired eigenvalues are not distinct, the cost may not be differentiable at
a solution. This indicates that the requirement of distinct desired eigenvalues is also
necessary. This does not limit the usefulness of the algorithms too much however as
desired eigenvalues can always be perturbed slightly so that they are distinct. While
having distinct but close eigenvalues does lead to a degree of ill-conditioning in our
algorithms, the algorithms can still be effectively utilized in such cases, as will be
shown in the numerical results section.
We also mention that, if the desired eigenvalues are distinct, it is our belief that,
modulo small changes, the global results of Section 6.2.2 should still hold, at least
generically.
6.5 Computational Results
This section contains computational results of applying the algorithms to various
problems. The algorithms were implemented in Matlab 6.5 and all results were ob-
tained using a 3.19 GHz Pentium 4 machine.
6.5.1 Random Problems
One thousand random problems were created for each of a number of different choices for the
system dimensions (n,m, p). Each problem was created as follows. System matrices
A, B, and C were generated randomly; their entries were drawn from a normal
distribution of zero mean and variance 1. λD was taken to be the spectrum of a
randomly generated matrix and a scalar was added to λD to ensure max_i Re λD_i = −0.1.
Each triple (n, m, p) was chosen to satisfy
mp > n. (6.5.1)
As the problems are randomly generated and satisfy condition (6.5.1), Wang’s suffi-
cient condition ensures each problem is solvable, [74].
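The random problem construction just described can be sketched as follows (an illustration only; the function and variable names are ours, not from the thesis).

```python
import numpy as np

def random_problem(n, m, p, rng):
    """Random pole placement instance as in Section 6.5.1: normal(0,1) entries,
    lam_d the spectrum of a random matrix shifted so max_i Re(lam_d_i) = -0.1."""
    assert m * p > n                     # the setting of condition (6.5.1)
    A = rng.standard_normal((n, n))
    B = rng.standard_normal((n, m))
    C = rng.standard_normal((p, n))
    lam_d = np.linalg.eigvals(rng.standard_normal((n, n)))
    lam_d = lam_d - (lam_d.real.max() + 0.1)      # shift the real parts
    return A, B, C, lam_d

A, B, C, lam_d = random_problem(6, 4, 3, np.random.default_rng(1))
print(lam_d.real.max())                  # ≈ -0.1
```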
An attempt was made to solve each problem using up to 5 different initial conditions
and a maximum of 2000 iterations per initial condition. Initial conditions were
chosen randomly. The convergence condition used was ‖λ(A + BKC) − λD‖₂ < ε
with ε = 10⁻³. Results for both algorithms are given in Table 6.1.
(n, m, p)                 (3,2,2)   (6,4,3)   (9,5,5)
Trust Region    S.R.      100%      100%      91%
Newton          T         0.10      4.2       125
                i         31        247       1037
Levenberg-      S.R.      100%      100%      99%
Marquardt       T         0.03      0.12      0.73
                i         15        104       293
Table 6.1: A comparison of performance for different n, m and p. S.R. denotes the
success rate, T denotes the average convergence time in CPU seconds, and i the
average number of iterations. T and i are based only on those problems that were
successfully solved. ε = 10⁻³.
As can be seen, the results for both algorithms are quite good. Not surpris-
ingly, given the reduced computation required for its implementation, the Levenberg-
Marquardt based algorithm is faster than the trust region Newton based algorithm;
notice in particular the large difference in T for the (9, 5, 5) problem. What is perhaps
surprising is that the Levenberg-Marquardt based algorithm is more likely to find a
solution (at least within the number of iterations that were allowed). This suggests
that the Levenberg-Marquardt based algorithm is superior to the trust region New-
ton based algorithm. Note however that we have observed instances where, using the
same initial condition, the former algorithm converges to a local minimum while the
latter algorithm converges to a solution.
The problems in Table 6.1 are ‘easy’ in the sense that they actually satisfy
Kimura’s condition (see Theorem A.0.3), m + p > n, and in most cases the num-
ber of variables mp is significantly larger than n. Results for some harder problems
are presented in Table 6.2. For these problems, mp−n = 1. An attempt was made to
solve each problem using up to 5 different initial conditions and a maximum of 5000
iterations per initial condition. Initial conditions were chosen randomly.
(n, m, p)                 (5,3,2)   (7,2,4)   (9,2,5)
Trust Region    S.R.      100%      89%       65%
Newton          T         1.9       21        100
                i         539       2319      4747
Levenberg-      S.R.      99%       96%       81%
Marquardt       T         0.29      2.0       6.4
                i         308       1502      3163
Table 6.2: Some harder random problems. T and i are based only on those problems
that were successfully solved. ε = 10⁻³.
As can be seen, for these problems, the success rates are lower, even though a
greater number of iterations per initial condition was allowed. Overall though, the
results for these more difficult problems are still quite good.

Figure 6.1 shows a typical plot of ‖λ(A + BKC) − λD‖₂ versus i. It has the desired
property that, near the solution, the convergence is quadratic.
6.5.2 Particular Problems
This subsection contains results for particular problems from the literature. In order
to present a greater number of results, rather than presenting the details of each
problem, only references are given. Results are presented only for the Levenberg-
Marquardt based algorithm.
For each problem, 100 random initial conditions were tested and the maximum
number of iterations per initial condition was set to 1000 (except for Problem 6, for
which the maximum number of iterations per initial condition was set to 1500). The
termination parameter ε was reduced to ε = 10⁻⁶. Results are given in Table 6.3.
[Figure: ‖λ(A + BKC) − λD‖₂ versus iteration number i, on a logarithmic scale.]
Figure 6.1: Quadratic convergence near solution of the Levenberg-Marquardt algo-
rithm.
As can be seen, performance was again very good. Solutions to each problem
could be found. Aside from Problem 6, for those initial conditions that led to a
solution, average convergence times were less than 0.5 CPU seconds and solutions
could be found from many different initial conditions.
Problem 6 was the most sensitive to initial conditions. In fact, the results for
Problem 6 in the table are based on choosing the entries of initial K’s from a normal
distribution of zero mean and variance 100. (Choosing initial conditions for this
problem in the same manner as for all the other problems led to a rather low success
rate of 5%.)
No.   References            (n, m, p)    T      i     S.R. (%)
1     [1, ex 1, case 1]     (4, 2, 2)    0.19   103   84
2     [1, ex 1, case 2]     (4, 2, 2)    0.16   77    92
3     [1, ex 2], [70]       (5, 2, 4)    0.21   88    96
4     [64]                  (4, 3, 2)    0.07   26    100
5     [49, ex 1]            (4, 2, 2)    0.10   40    98
6     [49, ex 2]            (6, 3, 2)    2.0    923   44
7     [49, ex 3]            (5, 3, 2)    0.29   132   100
8     [66, ex 1]            (4, 3, 2)    0.06   21    100
9     [66, ex 2]            (3, 1, 2)    0.01   4.1   100
10    [66, ex 3]            (4, 2, 2)    0.03   8.9   100
11    [46]                  (8, 4, 3)    0.46   151   92
12    [25]                  (3, 2, 2)    0.05   18    100
Table 6.3: Particular problems. T and i are based only on those problems that were
successfully solved. ε = 10⁻⁶.
6.5.3 Repeated Eigenvalues
Each of the problems considered in the prior subsection (as well as all the random
problems) had distinct desired eigenvalues. In this subsection we consider what can
be achieved if the desired eigenvalues are not distinct.
Consider the following problem from [14],
A =
[ -1     0.5   -0.2    0.85   0.45   0.9
   0    -0.5   -2      0.9    0.4    0.1
   0.15  0     -2     -0.2    0.1    0.8
   0     0.1   -0.25  -0.8    0      0.2
  -0.2   0.4    0     -0.5   -2      0.1
   0.6  -0.7    0      0.2    0     -2.5 ],

B =
[ 1    0
  1    0
  1    0
  1    1
  0.5  1
  0    1 ],

C =
[ 0 0 1
  0 0 1
  0 1 0
  0 1 0
  1 0 0
  1 0 0 ]ᵀ,

with desired eigenvalues λD = (−3, −3, −2, −2, −1, −1)ᵀ. Notice that this problem
has three pairs of repeated eigenvalues.
The algorithms do not provide a way to exactly solve this problem. However, a
fairly good approximate solution can be found by considering a slightly perturbed
desired spectrum with distinct eigenvalues. For example, suppose λD is replaced with
λDδ = (−3 − δ, −3, −2 − δ, −2, −1 − δ, −1)ᵀ with δ = 10⁻⁴. Then this perturbed
problem can often be solved.
An alternative strategy is to solve a series of perturbed problems with decreasing
δ’s. First solve a perturbed problem with δ = 10⁻¹. Then, setting δ = 10⁻² and using
the solution of the prior problem as an initial condition, solve this new perturbed
problem. Continue this process with δ = 10⁻³ and δ = 10⁻⁴.
Using the Levenberg-Marquardt based algorithm, the first strategy led to a
solution for 55% of the initial conditions tried, with an average convergence time of 0.72 CPU
seconds. The second strategy was successful in 59% of cases, with average convergence
time of 0.35 CPU seconds.
We note that the main problem we encountered in solving these problems was not
convergence to local minima, though this can occur, but rather that near solutions
the Hessian of the cost can have very large eigenvalues. This leads to numerical
issues when trying to solve the constrained quadratic subproblems (6.2.1). The code
we have implemented for these subproblems works very well in the vast majority of
cases though we expect it could still be improved further and hence that even better
results may be achievable.
6.6 Summary
In this chapter two related numerical methods for the static output feedback pole
placement problem have been presented. Both algorithms are well behaved globally
and have the property that local convergence to solutions often occurs quadratically.
Extensive computational results presented indicate that the algorithms can be highly
effective in practice. While it is required that the desired poles are distinct, the
algorithms can still be successfully utilized for problems with repeated poles if small
perturbations to the desired poles are allowed.
Keywords: pole placement, static output feedback, trust region method, Newton’s
method, Levenberg-Marquardt method, eigenvalue derivatives.
Chapter 7
A Gauss-Newton Method for
Classical Pole Placement
7.1 Introduction
In this chapter we present a Gauss-Newton algorithm for solving static output feed-
back pole placement problems; see Problem 6.1.1.
The problem is formulated as a constrained nonlinear least squares problem and
the minimization is achieved via a Gauss-Newton algorithm. This chapter contains
an overview of the algorithm and some of the numerical results. As part of the future
work, a convergence analysis and more numerical experiments will be carried out.
Consider Problem 6.1.1 again. Define
f(Q, G, K) = ‖Q(D + G)Qᵀ − (A + BKC)‖².    (7.1.1)
Here A ∈ Rn×n, B ∈ Rn×m, C ∈ Rp×n are given system matrices. D is a block
diagonal matrix with some 1× 1 and 2× 2 blocks placed properly on the diagonal. D
is constructed in the following way to possess the desired eigenvalues λD ∈ Cn. The
1 × 1 blocks contain the real eigenvalues in λD. Each 2 × 2 block contains the real
parts of a pair of complex conjugate eigenvalues on the diagonal and the values of the
imaginary parts on the subdiagonal. Note that the eigenvalues are sorted to give the
minimal norm, which is a combinatorial least squares problem (this problem and its
solution methods are explained in Section 4.2.2).
A Gauss-Newton method is considered for solving the following constrained non-
linear least squares problem
Problem 7.1.1.

min  f(Q, G, K)
s.t.  Q ∈ On;
      G block super triangular;
      K ∈ Rm×p.
For clarity on the form of G, consider the following example. Suppose λD =
(−1, −2 ± i, −3); then

D =
[ -1   0   0   0
   0  -2   1   0
   0  -1  -2   0
   0   0   0  -3 ]
and G =
[ 0  g1  g2  g3
  0   0   0  g4
  0   0   0  g5
  0   0   0   0 ].    (7.1.2)
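Constructing D from a given λD is mechanical. The sketch below (a helper of ours, assuming complex eigenvalues appear in consecutive conjugate pairs) builds the block diagonal matrix and can be checked by recomputing its spectrum.

```python
import numpy as np

def build_D(lam_d):
    """Block diagonal D of Section 7.1: a 1x1 block for each real eigenvalue and
    a 2x2 block [[a, b], [-b, a]] for each complex pair a +/- b i.
    Assumes conjugate pairs are adjacent in lam_d."""
    n = len(lam_d)
    D = np.zeros((n, n))
    i = 0
    while i < n:
        lam = complex(lam_d[i])
        if abs(lam.imag) < 1e-12:
            D[i, i] = lam.real
            i += 1
        else:
            a, b = lam.real, abs(lam.imag)
            D[i:i + 2, i:i + 2] = [[a, b], [-b, a]]
            i += 2
    return D

D = build_D([-1.0, -2.0 + 1j, -2.0 - 1j, -3.0])
print(D)        # matches the D of example (7.1.2)
```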
In Section 4.2.2, we recalled Schur’s result that any matrix Z ∈ Cn×n is unitarily
equivalent to an upper triangular matrix [40, Th 2.3.1]. Here we recall another
result of Schur: for any matrix Z ∈ Rn×n, there is a real orthogonal matrix V ∈ On such
that Z = V T Vᵀ, where T is a block upper triangular matrix [40, Th 2.3.4].
Theorem 7.1.2. Given Z ∈ Rn×n with eigenvalues λ1, . . . , λn in any prescribed order,
there is an orthogonal matrix V ∈ On and a block upper triangular matrix T ∈ Rn×n
such that
Z = V T Vᵀ    (7.1.3)
and Tii = Re(λi), i = 1, . . . , n. Complex conjugate eigenvalues form 2×2 blocks along
the diagonal.
Proof. See, for example, [40, Th 2.3.4].
Note that the formulation of Problem 7.1.1 is motivated by the real Schur form
of Theorem 7.1.2.

The Gauss-Newton method is a variation of the standard Newton method and is
iterative in nature. It has the advantage of requiring only the first derivative of
the cost function. In order to employ the Gauss-Newton method to solve Problem
7.1.1, we need to calculate this first derivative at each iteration. As the cost
f(Q, G, K) has three variables and these variables are interdependent, we first use
an elimination technique to simplify the cost function, and then take a first order
Taylor approximation to derive the first derivatives.
This chapter is organized as follows.¹ The main algorithm is presented in Section
7.2. Since the calculations involved are lengthy, Section 7.2 contains only an
overview of the algorithm. Section 7.3 presents results of applying the algorithm
to some randomly generated problems. Further analysis and experiments are left for
future work.
¹For more information on the Gauss-Newton method, please refer to Section 9.2.
7.2 The Gauss-Newton Method
This section presents the solution algorithm.
The cost function f(Q, G, K) has three variables. If K and Q are given and fixed,
the calculation of G reduces to a standard least squares problem, and the optimal G
is a formula in K and Q. We substitute this solution back into (7.1.1) and obtain a
new cost function in K and Q. If, in turn, Q is given and fixed, the calculation of
K in this new cost function reduces to a standard least squares problem, and the
optimal K is a formula in Q. We substitute this solution back into the cost
function. After these two steps, we obtain a cost function in the variable Q only,
to which we can apply a first order Taylor approximation and hence the Gauss-Newton
method. What follows describes this procedure in greater detail.
Throughout this section, we assume A,B and C are given and fixed. The capital
letters are used to represent matrices and small bold letters represent vectors.
Stage I: eliminate G

f(Q, G, K) = ‖Q(D + G)Qᵀ − (A + BKC)‖²
           = ‖vec(G) − vec[Qᵀ(A + BKC)Q − D]‖₂²
           = ‖Qg g − vec(V)‖₂²,

where V := Qᵀ(A + BKC)Q − D and vec(G) = Qg g. Setting the derivative with respect
to g to zero gives

gopt = (Qgᵀ Qg)⁻¹ Qgᵀ v,    (7.2.1)

where v := vec(V). So the new cost function is

f(Q, K) = ‖ [ Qg(Qgᵀ Qg)⁻¹ Qgᵀ − I ] vec( Qᵀ(A + BKC)Q − D ) ‖₂².
Consider example (7.1.2) again. Note that g = [ g1 g2 g3 g4 g5 ]ᵀ and Qg has the
form

Qg =
[ 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 ]ᵀ,

and, for a given problem, Qg is a constant matrix.
Stage II: eliminate K

Define Q̄ := Qg(Qgᵀ Qg)⁻¹ Qgᵀ − I; note that Q̄ is a constant matrix since Qg is
fixed once the problem is given. We have

f(Q, K) = ‖ Q̄ vec( Qᵀ(A + BKC)Q − D ) ‖₂²
        = ‖ Q̄ vec( Qᵀ A Q − D ) + Q̄ ( Qᵀ Cᵀ ⊗ Qᵀ B ) k ‖₂²,

where k := vec(K). Setting the derivative with respect to k to zero gives

kopt = −(Yᵀ Y)⁻¹ Yᵀ w,    (7.2.2)

where Y := Q̄ ( Qᵀ Cᵀ ⊗ Qᵀ B ) and w := Q̄ vec( Qᵀ A Q − D ).

Defining P := I − Y(Yᵀ Y)⁻¹ Yᵀ, we obtain a new cost function in Q only,

f(Q) = ‖ P w ‖₂².

The cost function on which we proceed with the Gauss-Newton method is thus

f(Q) = ‖ [ I − Q̄ ( Qᵀ Cᵀ ⊗ Qᵀ B ) [ ( C Q ⊗ Bᵀ Q ) Q̄ᵀ Q̄ ( Qᵀ Cᵀ ⊗ Qᵀ B ) ]⁻¹ ( C Q ⊗ Bᵀ Q ) Q̄ᵀ ] Q̄ vec( Qᵀ A Q − D ) ‖₂².
So far, the cost function f(Q, G, K) has been reduced to a cost function f(Q) in Q
alone. The first order Taylor approximation of f(Q) is based on the fact that an
orthogonal matrix near Q0 can be written as Q = Q0(I + Ω + O(Ω²)), where Ω is a
skew-symmetric matrix. The calculation of the Jacobian of f(Q) is quite lengthy and
hence is omitted here.
Here is our algorithm for Problem 7.1.1.

Algorithm:

Problem Data. A ∈ Rn×n, B ∈ Rn×m, C ∈ Rp×n and λD ∈ Cn.

Initialization. Choose a randomly generated orthogonal matrix Q ∈ On, a randomly
generated block super triangular matrix G (whose form is determined by λD) and
a randomly generated matrix K ∈ Rm×p.

repeat

1. Calculate gopt using (7.2.1).

2. Calculate kopt using (7.2.2).

3. Form the first order Taylor approximation of f(Q).

4. Derive ∆Q and update Q.

until f(Q, G, K) < ε.
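Steps 1 and 2 of the algorithm are plain linear least squares solves. The following sketch (randomly generated data with the G pattern of example (7.1.2); our variable names, not the thesis's Matlab code) computes gopt and kopt and checks the least squares stationarity conditions behind (7.2.1) and (7.2.2).

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, p = 4, 2, 3
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))
D = np.array([[-1.0, 0, 0, 0], [0, -2, 1, 0], [0, -1, -2, 0], [0, 0, 0, -3]])

# selection matrix Q_g with vec(G) = Q_g g, for the pattern of (7.1.2)
pattern = [(0, 1), (0, 2), (0, 3), (1, 3), (2, 3)]   # positions of g1..g5 in G
Qg = np.zeros((n * n, len(pattern)))
for col, (i, j) in enumerate(pattern):
    Qg[i + n * j, col] = 1.0             # column-major vec ordering

Q, _ = np.linalg.qr(rng.standard_normal((n, n)))     # a random orthogonal Q
K = rng.standard_normal((m, p))

# Step 1: with Q and K fixed, g_opt solves a linear least squares problem (7.2.1)
v = (Q.T @ (A + B @ K @ C) @ Q - D).flatten(order='F')
g_opt = np.linalg.solve(Qg.T @ Qg, Qg.T @ v)

# Step 2: with Q fixed, k_opt = vec(K_opt) solves another one (7.2.2);
# vec(Q^T B K C Q) = (Q^T C^T kron Q^T B) vec(K)
Qbar = Qg @ np.linalg.solve(Qg.T @ Qg, Qg.T) - np.eye(n * n)
Y = Qbar @ np.kron(Q.T @ C.T, Q.T @ B)
w = Qbar @ (Q.T @ A @ Q - D).flatten(order='F')
k_opt = -np.linalg.solve(Y.T @ Y, Y.T @ w)

# stationarity of both least squares problems (gradients vanish)
print(np.linalg.norm(Qg.T @ (Qg @ g_opt - v)), np.linalg.norm(Y.T @ (Y @ k_opt + w)))
```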
7.3 Computational Results
This section contains some numerical results of applying the algorithm to randomly
generated problems. The algorithm was implemented in Matlab 6.5 and all results
were obtained using a 3.19 GHz Pentium 4 machine.
One thousand random problems were created for each of a number of different choices
for the system dimensions (n, m, p). Each problem was created as follows. A, B and
C were randomly generated, with entries drawn from a normal distribution of zero
mean and variance 1. To ensure each problem is solvable, the desired eigenvalues λD
were taken to be the spectrum of a random matrix and a scalar was added to λD to
ensure max_i Re λD_i = −0.1. Initial conditions were chosen randomly. Each triple
(n, m, p) was chosen to satisfy m + p > n; as the problems are randomly generated
and satisfy this condition, Kimura’s condition (see Theorem A.0.3) ensures each
problem is solvable.
(n, m, p)               (6, 4, 3)   (9, 5, 5)
Gauss-Newton   S.R.     98%         99%
Method         T        0.57        0.92
               i        79          31
Table 7.1: A comparison of performance for different n, m and p. S.R. denotes the
success rate, T denotes the average convergence time in CPU seconds, and i the
average number of iterations. T and i are based only on those problems that were
successfully solved. ε = 10⁻³.
As can be seen from Table 7.1, the results for the algorithm are very good. It
solves most of the randomly generated problems. Figure 7.1 shows a typical plot of
f(Q, G, K) versus i. It has the desired property that, near the solution, the
convergence is quadratic. As mentioned in the introduction, a more comprehensive
analysis and further numerical experiments will be carried out in the near future.
[Figure: f(Q, G, K) versus iteration number i, on a logarithmic scale.]
Figure 7.1: Quadratic convergence near solution of the Gauss-Newton algorithm.
7.4 Summary
In this chapter, a Gauss-Newton algorithm was proposed for solving the static output
feedback pole placement problem. The problem is formulated as a constrained
nonlinear least squares problem and the minimization is achieved via a Gauss-Newton
method. This chapter gave only an overview of the algorithm and some of the
numerical results; a convergence analysis and further numerical experiments, which
we expect to confirm the algorithm’s good performance, will be carried out in the
near future.
Keywords: pole placement, static output feedback, Gauss-Newton method.
Part IV
Problems Arising in Nonnegative
Matrices
In this part, numerical algorithms are presented for solving two inverse eigenvalue
problems, namely, the inverse eigenvalue problem for nonnegative matrices (NIEP)
or stochastic matrices (StIEP) and the inverse eigenvalue problem for symmetric
nonnegative matrices (SNIEP).
In Chapter 8, we present two related numerical methods, one for NIEP/StIEP and
another for SNIEP. The methods are iterative in nature and utilize alternating pro-
jection ideas. For the algorithm for the symmetric problem, the main computational
component of each iteration is an eigenvalue-eigenvector decomposition, while for the
other problem, it is a Schur matrix decomposition. Numerical results are presented
demonstrating that the algorithms are very effective in solving various problems in-
cluding high dimensional problems.
In Chapter 9, two related numerical algorithms are presented for the NIEP/StIEP and the SNIEP. Both algorithms are iterative in nature; one is based on Newton's method and the other on the Gauss-Newton method. The main computational components of each iteration are the first and second order derivatives of the eigenvalues.
Extensive numerical experiments show that the algorithms are very effective in prac-
tice though convergence to a solution is not guaranteed for either algorithm.
Chapter 8
A Projective Methodology for
Nonnegative Inverse Eigenvalue
Problems
8.1 Introduction
A real n× n matrix is said to be nonnegative if each of its entries is nonnegative.
The Nonnegative Inverse Eigenvalue Problem (NIEP) is the following: given a list of n complex numbers λ = {λ1, . . . , λn}, find a nonnegative n × n matrix with eigenvalues λ (if such a matrix exists).
A related problem is the Symmetric Nonnegative Inverse Eigenvalue Problem (SNIEP): given a list of n real numbers λ = {λ1, . . . , λn}, find a symmetric nonnegative n × n matrix with eigenvalues λ (if such a matrix exists).
The NIEP and SNIEP are different problems even if λ is restricted to contain only
real entries; there exist lists of n real numbers λ for which the NIEP is solvable but
the SNIEP is not [42].
Finding necessary and sufficient conditions for a list λ to be realizable as the
eigenvalues of a nonnegative matrix has been a challenging area of research for over
fifty years and this problem is still unsolved; see the recent survey paper [28]. As
noted in [19, Section 6], while various necessary or sufficient conditions exist, the
necessary conditions are usually too general while the sufficient conditions are too
specific. Under a few special sufficient conditions, a nonnegative matrix with the
desired spectrum can be constructed; however, in general, proofs of sufficient con-
ditions are nonconstructive. Two sufficient conditions that are constructive and not
restricted to small n are, respectively, given in [65], for the SNIEP, and [68], for the
NIEP with real λ. (See also [58] for an extension of the results of the latter paper.)
A good overview of known results relating to necessary or sufficient conditions can be
found in the recent survey paper [28] and general background material on nonnegative
matrices, including inverse eigenvalue problems and applications, can be found in the
texts [3] and [54]. We also mention the recent paper [22], which can be used to help determine whether a given list λ may be realizable as the eigenvalues of a nonnegative matrix.
In this chapter we are interested in generally applicable numerical methods for
solving NIEPs and SNIEPs. To the best of our knowledge, the only algorithms that
have appeared up to now in the literature consist of [18] for the SNIEP and [21]
for the NIEP. In the case of [18], the following constrained optimization problem is
considered:
$$\min_{Q^T Q = I,\; R = R^T} \frac{1}{2} \left\| Q^T \Lambda Q - R \circ R \right\|^2. \qquad (8.1.1)$$
Here Λ is a constant diagonal matrix with the desired spectrum and ∘ stands for the Hadamard product, i.e., the componentwise product. Note that the symmetric matrices with the desired spectrum are exactly the elements of {Q^T Λ Q | Q ∈ O_n} and that the symmetric nonnegative matrices are exactly the elements of {R ∘ R | R ∈ S_n}.
In [18], a gradient flow based on (8.1.1) is constructed. A solution to the SNIEP is
found if the gradient flow converges to a Q and an R that zero the objective function.
The approach taken in [21] for the NIEP is similar but is complicated by the fact that
the set of all matrices, both symmetric and nonsymmetric, with a particular desired
spectrum is not nicely parameterizable. In particular, these matrices can no longer
be parameterized by the orthogonal matrices.
In this chapter we present a numerical algorithm for the NIEP and another for
the SNIEP. In both cases, the problems are posed as problems of finding a point in
the intersection of two particular sets. Unlike the approaches in [18] and [21] which
are based on gradient flows, our algorithms are iterative in nature. For the SNIEP,
the solution methodology is based on an alternating projection scheme between the
two sets in question. The solution methodology for the NIEP is also based on an
alternating projection like scheme but is more involved, as we will shortly explain.
While alternating projections can often be a very effective means of finding a
point in the intersection of two or more convex sets, for both the SNIEP and NIEP
formulations, one set is nonconvex. Nonconvexity of one of the sets means that
alternating projections may not converge to a solution. This is in contrast to the case
where all sets are convex and convergence to a solution is guaranteed.
As mentioned above, for each problem, one set in the problem formulation is non-
convex. For the NIEP, this set is particularly complicated; it consists of all matrices
with the desired spectrum. At least some of the members of this set will be non-
symmetric matrices and it is this that causes complications. In particular, though
the set is closed and hence projections are well defined theoretically, how to calculate
projections onto such sets is an unsolved difficult problem. An alternate method for
mapping onto this set is used, which is motivated by the control counterpart in Part
II. Though the resulting points are not necessarily projected points, they are members
of the set and share a number of other desirable properties. As will be shown, this
alternate ‘projection’ is very effective in our context. Furthermore, we believe that
it may also be quite effective for other inverse eigenvalue problems involving non-
symmetric matrices. For more on other inverse eigenvalue problems, see the survey
papers [17] and [19], and the recent text [20].
Before concluding this introductory section we would like to point out how the
NIEP is related to another problem involving stochastic matrices. An n × n matrix is said to be stochastic if it is nonnegative and the sum of the entries in each row equals one. Another variation of the NIEP is the STochastic Inverse Eigenvalue Problem (StIEP): given a list of n complex numbers λ = {λ1, . . . , λn}, find a stochastic n × n matrix with eigenvalues λ (if such a matrix exists). It turns out that the NIEP and
the StIEP are almost exactly the same problem, as we now show. (See also [21].)
The vector of all 1’s is always an eigenvector for a stochastic matrix, implying
each stochastic matrix must have 1 as an eigenvalue. Also, the maximum row sum
matrix norm of a stochastic matrix equals 1 and hence the spectral radius cannot be
greater than 1, and as a result, must actually equal 1. Suppose λ satisfies the above
mentioned necessary conditions to be the spectrum of a stochastic matrix and that
a nonnegative matrix A with this spectrum can be found. Then if an eigenvector x
of A corresponding to the eigenvalue 1 can be chosen to have positive entries (by the
Perron-Frobenius theorem this is certainly possible if A is irreducible), then, if we
define D = diag(x), it is straightforward to verify that
D−1AD
is a stochastic matrix with the desired spectrum. (In fact it can be shown that
if λ satisfies the above mentioned necessary conditions, then it is the spectrum of a
stochastic matrix if and only if it is the spectrum of a nonnegative matrix [77, Lemma
5.3.2].)
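This nonnegative-to-stochastic conversion is easy to carry out numerically. The following sketch (assuming NumPy; the example matrix and all names are illustrative, not from the thesis) builds a nonnegative matrix with spectral radius 1, extracts a positive eigenvector x for the eigenvalue 1, and checks that D^{-1} A D is stochastic with the same spectrum.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build (for illustration) a nonnegative matrix A with spectral radius 1
# whose row sums are not all 1: conjugate a stochastic matrix by a
# positive diagonal matrix.
P = rng.random((4, 4))
P /= P.sum(axis=1, keepdims=True)        # stochastic
E = np.diag(rng.random(4) + 0.5)         # positive diagonal
A = E @ P @ np.linalg.inv(E)             # nonnegative, spectral radius 1

# Positive eigenvector x of A for the eigenvalue 1, and D = diag(x).
w, V = np.linalg.eig(A)
x = np.real(V[:, np.argmin(np.abs(w - 1.0))])
if x.sum() < 0:                          # Perron vector: fix the overall sign
    x = -x
D = np.diag(x)

S = np.linalg.solve(D, A @ D)            # D^{-1} A D, a stochastic matrix
```

The row sums of S are (Ax)_i / x_i = 1 since Ax = x, which is exactly the verification suggested in the text.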
This chapter is structured as follows.¹ The SNIEP algorithm is presented first,
in Section 8.2, and then insights from this algorithm are used to address the more
difficult NIEP in Section 8.3. Numerical results for both algorithms are presented in
Section 8.4.
8.2 The Symmetric Problem
Our algorithm for solving the SNIEP consists of alternately projecting onto two par-
ticular sets. The details are given in this section.
Given a list of real eigenvalues λ = {λ1, . . . , λn}, renumbering if necessary, suppose λ1 ≥ . . . ≥ λn. Let
Λ = diag(λ1, . . . , λn), (8.2.1)
and let M denote the set of all real symmetric matrices with eigenvalues λ,
M = {A ∈ Sn | A = V ΛV^T for some orthogonal V }. (8.2.2)
¹Please refer to Chapter 2 for background information on projections.
Let N denote the set of symmetric nonnegative matrices,
N = {A ∈ Sn | Aij ≥ 0 for all i, j}. (8.2.3)
The SNIEP can now be stated as the following particular case of Problem 2.3.1:
Find X ∈M∩N . (8.2.4)
Our solution approach is to alternately project between M and N, and we next show that it is indeed possible to calculate projections onto these sets.
Figure 8.1 illustrates the problem formulation and the alternating projections for the SNIEP. For ease of visualization the figure is drawn in R3; the actual problem can of course be of any dimension. The pink ball represents the (nonconvex) set M and the box region represents N. The shaded pink region is the intersection of the two sets. We know exactly how to calculate projections onto both N and M, and hence both projections are well defined.
Figure 8.1: Illustration of the problem formulation for SNIEP.
First, in order for the term ‘projection’ to make sense, we need to define an appropriate Hilbert space and associated norm. From now on, Sn will be viewed as a Hilbert space with inner product 〈A,B〉 = tr(AB) = ∑_{i,j} a_{ij} b_{ij}. The associated norm is the Frobenius norm ‖A‖ = 〈A,A〉^{1/2}.
The projection of A ∈ Sn ontoM is given by Theorem 8.2.1 below. More precisely,
it gives a projection of A onto M. The reason for this is that the set M is nonconvex
and hence projections onto this set are not guaranteed to be unique.
M is nonconvex if its defining λ contains a pair of unequal eigenvalues. For example, if n = 2, consider
$$A = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} \lambda_2 & 0 \\ 0 & \lambda_1 \end{bmatrix}.$$
If λ1 ≠ λ2, then the convex combination (A + B)/2 does not have the same spectrum as A and B.
Theorem 8.2.1 is based on the Hoffman–Wielandt result of Lemma 4.2.3 (see, for example, [40, Corollary 6.3.8]).
Theorem 8.2.1. Given A ∈ Sn, let A = V diag(µ1, . . . , µn)V T with V a real or-
thogonal matrix and µ1 ≥ . . . ≥ µn. If Λ is given by (8.2.1), then V ΛV T is a best
approximant in M to A in the Frobenius norm.
Proof. A matrix B ∈ M is a projection of A onto M if ‖A−B‖ ≤ ‖A−M‖ for all
M ∈ M. M ∈ M if and only if M = WΛW T for some real orthogonal matrix W .
Hence Lemma 4.2.3 implies
‖diag(µ1, . . . , µn)− Λ‖ ≤ ‖A−M‖ for all M ∈M. (8.2.5)
If B = V ΛV T then, using the fact that the Frobenius norm is orthogonally invariant,
it follows that
$$\|A - B\| = \left\| V \left( \mathrm{diag}(\mu_1, \ldots, \mu_n) - \Lambda \right) V^T \right\| = \left\| \mathrm{diag}(\mu_1, \ldots, \mu_n) - \Lambda \right\|. \qquad (8.2.6)$$
The result follows by combining (8.2.5) and (8.2.6).
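Theorem 8.2.1 is straightforward to check numerically. The sketch below (assuming NumPy; all names are ours) forms the candidate best approximant V Λ V^T and confirms it is no farther from A than randomly sampled members of M.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
lam = np.sort(rng.standard_normal(n))[::-1]     # desired spectrum, descending

A = rng.standard_normal((n, n))
A = (A + A.T) / 2                               # a symmetric test matrix

mu, V = np.linalg.eigh(A)                       # eigenvalues ascending
V = V[:, ::-1]                                  # reorder so mu_1 >= ... >= mu_n
B = V @ np.diag(lam) @ V.T                      # claimed projection of A onto M

# B should be no farther from A than any other member Q diag(lam) Q^T of M.
dist_B = np.linalg.norm(A - B)
worst_gap = 0.0
for _ in range(200):
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    M = Q @ np.diag(lam) @ Q.T
    worst_gap = max(worst_gap, dist_B - np.linalg.norm(A - M))
```

By the theorem, `worst_gap` never exceeds zero (up to rounding), and B has exactly the desired spectrum.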
Projection onto N is straightforward and is given by Theorem 8.2.2 below.
Theorem 8.2.2. Given A ∈ Sn, define A+ ∈ Sn by
(A+)ij = max{Aij, 0} for all 1 ≤ i, j ≤ n. (8.2.7)
A+ is the best approximant in N to A in the Frobenius norm.
Proof. The projection of x ∈ R onto the nonnegative real numbers equals max{x, 0}. The general result follows by noting that if B ∈ Sn, and in particular if B ∈ N, then
$$\|A - B\| = \Big( \sum_{i,j} |A_{ij} - B_{ij}|^2 \Big)^{1/2}$$
and hence that the problem reduces to n² decoupled scalar problems.
Our proposed algorithm for solving the SNIEP is the following.
SNIEP algorithm:
Problem Data. List of desired real eigenvalues λ = {λ1, . . . , λn}, λ1 ≥ . . . ≥ λn.
Initialization. Choose a randomly generated symmetric nonnegative matrix Y ∈ Sn+.
repeat
1. Calculate an eigenvalue-eigenvector decomposition of Y :
Y = V diag(µ1, . . . , µn)V T , µ1 ≥ · · · ≥ µn.
2. X := V diag(λ1, . . . , λn)V T .
3. X := (X + XT )/2.
4. Y := X+.
until ‖X − Y ‖ < ε.
In the above algorithm, X+ is given by (8.2.7).
Note that at each iteration of the algorithm, X has the desired spectrum λ and Y
is nonnegative. If ε is small, say ε = 10−14, termination of the loop ensures X equals
Y (approximately) and hence that Y solves the SNIEP.
Due to small numerical inaccuracy, X from Step 2 of the algorithm may not be
perfectly symmetric. Step 3 makes it so.
Of course, while Corollary 2.3.5 ensures ‖X − Y ‖ is nonincreasing from one iter-
ation to the next, the set M is nonconvex and hence there is no guarantee that the
algorithm will terminate.
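For concreteness, here is a minimal NumPy sketch of the SNIEP algorithm above (function and variable names are ours, not from the thesis):

```python
import numpy as np

def sniep(lam, max_iter=5000, tol=1e-12, seed=0):
    """Alternating-projection sketch of the SNIEP algorithm.
    lam: desired real spectrum. Returns a symmetric nonnegative matrix
    with (approximately) that spectrum, or None on failure."""
    lam = np.sort(np.asarray(lam, float))[::-1]     # lambda_1 >= ... >= lambda_n
    n = lam.size
    rng = np.random.default_rng(seed)
    Y = rng.random((n, n))
    Y = (Y + Y.T) / 2                               # symmetric nonnegative start
    for _ in range(max_iter):
        mu, V = np.linalg.eigh(Y)                   # Step 1 (ascending order)
        V = V[:, ::-1]                              # match descending lam
        X = V @ np.diag(lam) @ V.T                  # Step 2: map onto M
        X = (X + X.T) / 2                           # Step 3: exact symmetry
        Y = np.maximum(X, 0.0)                      # Step 4: project onto N
        if np.linalg.norm(X - Y) < tol:             # termination test
            return Y
    return None

# Feasible example: target the spectrum of a random symmetric nonnegative matrix.
rng = np.random.default_rng(1)
B = rng.random((5, 5)); B = (B + B.T) / 2
target = np.linalg.eigvalsh(B)
A = sniep(target)
```

The tolerance is looser than the thesis' ε = 10⁻¹⁴ simply to be safe in floating point; taking the desired spectrum from a random symmetric nonnegative matrix guarantees feasibility, as in Section 8.4.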
8.3 The General Problem
In this section we consider the NIEP. As we will see, the main difference between the
SNIEP and the NIEP is that the set of real, not necessarily symmetric, matrices with
a given (possibly complex) spectrum no longer has as nice a characterization as the
set of real symmetric matrices with a given spectrum.
Throughout this section, Cn×n will be regarded as a Hilbert space with inner product 〈A,B〉 = tr(AB∗) = ∑_{i,j} a_{ij} b̄_{ij}, and the associated norm is the Frobenius norm ‖A‖ = 〈A,A〉^{1/2}.
Schur's result in Theorem 4.2.5 shows that every complex square matrix is unitarily equivalent to an upper triangular matrix.
We now redefine some terms from the prior section.
Let λ = {λ1, . . . , λn} be a given list of complex eigenvalues. Define
T = {T ∈ Cn×n | T is upper triangular with spectrum λ}. (8.3.1)
Theorem 4.2.5 implies that the set of all complex matrices with spectrum λ is given by the following set:
M = {A ∈ Cn×n | A = UTU∗ for some unitary U and some T ∈ T }. (8.3.2)
Let N denote the set of (not necessarily symmetric) nonnegative matrices,
N = {A ∈ Rn×n | Aij ≥ 0 for all i, j}. (8.3.3)
Having redefined M and N , the NIEP can now be stated as the following partic-
ular case of Problem 2.3.1:
Find X ∈M∩N . (8.3.4)
Figure 8.2 illustrates the problem formulation and the alternating projections for the NIEP. The pink region represents the set M, which is a rather complicated nonconvex set, and the box region represents the set N. Again we illustrate in R3 for visualization convenience. The shaded pink region is the intersection of the two sets. The projection onto N is straightforward to compute. By contrast, the calculation of the projection onto M (denoted by the pink dot with dotted arrow) is an unsolved problem, so a substitute mapping (denoted by the black dot) is applied instead.
We would like to use alternating projections to solve the NIEP, alternately projecting between M and N. However, to the best of our knowledge, how to calculate projections onto M is an unsolved problem. Suppose instead we could find
Figure 8.2: Illustration of the problem formulation for NIEP.
a mapping that was in some sense a reasonable substitute for a projection map for
M. Using this substitute mapping and the projection map for N in an alternating
projection like scheme may still produce a viable algorithm. The following function
PM is used as a substitute for a true projection map onto M.
Definition 8.3.1. Suppose U ∈ Cn×n is unitary and T ∈ Cn×n is upper triangular. Let λ̃i, i = 1, . . . , n, be a permutation of the list of eigenvalues λ such that, among all possible permutations, it minimizes
$$\sum_{i=1}^{n} |\tilde{\lambda}_i - T_{ii}|^2. \qquad (8.3.5)$$
Define
PM(U, T ) = U T̃ U∗, (8.3.6)
where T̃ ∈ T is given by
$$\tilde{T}_{ij} = \begin{cases} \tilde{\lambda}_i, & \text{if } i = j, \\ T_{ij}, & \text{otherwise.} \end{cases}$$
Note that PM maps into the set M.
A given matrix A ∈ Cn×n may have a nonunique Schur decomposition, and A = U1 T1 U1∗ = U2 T2 U2∗ does not imply PM(U1, T1) = PM(U2, T2). The nonuniqueness of the Schur decomposition is demonstrated by the following example [40]:
$$T_1 = \begin{bmatrix} 1 & 1 & 4 \\ 0 & 2 & 2 \\ 0 & 0 & 3 \end{bmatrix} \quad \text{and} \quad T_2 = \begin{bmatrix} 2 & -1 & 3\sqrt{2} \\ 0 & 1 & \sqrt{2} \\ 0 & 0 & 3 \end{bmatrix}$$
are unitarily equivalent via
$$U = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 & 0 \\ 1 & -1 & 0 \\ 0 & 0 & \sqrt{2} \end{bmatrix}$$
as U T1 U∗ = T2. If λ = {0, 0, 0}, it is readily verified that PM(U, T1) ≠ PM(I, T2).
Hence PM is decomposition dependent.
It turns out that the fact that PM may give different points for different Schur decompositions of the same matrix is not particularly important, because different Schur decompositions lead to points in M of equal distance from the original matrix. The proof of this fact is similar to the proof of Theorem 4.2.7.
The next theorem shows that given A = UTU∗, if we restrict attention to matrices of the form U T̃ U∗, T̃ ∈ T, then PM(U, T ) is a point in M closest to A.
Theorem 8.3.2. Suppose A = UTU∗ ∈ Cn×n with U a unitary matrix and T upper triangular. Then PM(U, T ) satisfies
$$\| P_M(U, T) - A \| \leq \| U \tilde{T} U^* - A \| \quad \text{for all } \tilde{T} \in \mathcal{T}.$$
Proof. This proof is similar to the proof of Theorem 4.2.8 and is omitted here.
For completeness, we note that, given A = UTU∗, PM(U, T ) may not satisfy
‖PM(U, T ) − A‖ ≤ ‖M − A‖ for all M ∈ M.
For example, if
$$U = \frac{1}{5} \begin{bmatrix} -3 & 4 \\ 4 & 3 \end{bmatrix} \quad \text{and} \quad T = \begin{bmatrix} 1 & -3 \\ 0 & 2 \end{bmatrix},$$
and
$$\tilde{U} = \frac{1}{5} \begin{bmatrix} -4 & 3 \\ 3 & 4 \end{bmatrix} \quad \text{and} \quad \tilde{T} = \begin{bmatrix} 0 & -3 \\ 0 & 0 \end{bmatrix},$$
then, if λ = {0, 0}, one can readily verify that
$$\| P_M(U, T) - UTU^* \| > \| \tilde{U} \tilde{T} \tilde{U}^* - UTU^* \|.$$
As for the symmetric case, projection onto N is straightforward.
Theorem 8.3.3. Given A ∈ Cn×n, define A+ ∈ Rn×n by
(A+)ij = max{Re(Aij), 0} for all 1 ≤ i, j ≤ n. (8.3.7)
A+ is the best approximant in N to A in the Frobenius norm.
Proof. The projection of z ∈ C onto the nonnegative real numbers is given by max{Re(z), 0}. The remainder of the proof follows by exactly the same reasoning
used in the proof of Theorem 8.2.2.
Our proposed algorithm for solving the NIEP is the following.
NIEP algorithm:
Problem Data. List of desired complex eigenvalues λ = {λ1, . . . , λn}.
Initialization. Choose a randomly generated nonnegative matrix Y ∈ Rn×n.
repeat
1. Calculate a Schur decomposition of Y : Y = UTU∗.
2. X := PM(U, T ).
3. Y := X+.
until ‖X − Y ‖ < ε.
In the above algorithm, PM(U, T ) is given by Definition 8.3.1 and X+ is given by
(8.3.7).
As for the SNIEP algorithm, at each iteration of the NIEP algorithm, X has the
desired spectrum λ and Y is nonnegative. If ε is small, say ε = 10−14, termination of
the loop ensures X equals Y (approximately) and hence that Y solves the NIEP.
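For concreteness, a NumPy/SciPy sketch of the NIEP algorithm (names are ours; the least squares matching in the substitute projection is computed here with the Hungarian algorithm via scipy.optimize.linear_sum_assignment, which is one way to minimize (8.3.5)):

```python
import numpy as np
from scipy.linalg import schur
from scipy.optimize import linear_sum_assignment

def p_M(U, T, lam):
    """Substitute 'projection' of Definition 8.3.1: overwrite the diagonal
    of T with the permutation of lam closest to it in least squares."""
    cost = np.abs(np.diag(T)[:, None] - lam[None, :]) ** 2  # cost[i, j] = |T_ii - lam_j|^2
    rows, cols = linear_sum_assignment(cost)
    T2 = T.copy()
    T2[rows, rows] = lam[cols]
    return U @ T2 @ U.conj().T

def niep(lam, max_iter=5000, tol=1e-12, seed=0):
    lam = np.asarray(lam, complex)
    n = lam.size
    Y = np.random.default_rng(seed).random((n, n))
    for _ in range(max_iter):
        T, U = schur(Y, output='complex')        # Step 1: Y = U T U*
        X = p_M(U, T, lam)                       # Step 2
        Y = np.maximum(X.real, 0.0)              # Step 3: project onto N
        if np.linalg.norm(X - Y) < tol:
            return Y
    return None

# Feasible example: target the spectrum of a random nonnegative matrix.
target = np.linalg.eigvals(np.random.default_rng(2).random((5, 5)))
A = niep(target)
```

Since convergence is not guaranteed, a practical driver would retry with several random starting points on failure.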
Remark 8.3.1. If each member of λ is real and we seek a symmetric nonnegative matrix with spectrum λ, then the NIEP algorithm reduces to the SNIEP algorithm. More precisely, this is true if the members of λ are real, if the initial condition Y is a symmetric nonnegative matrix, and, for Schur decompositions used
condition Y is a symmetric nonnegative matrix, and, for Schur decompositions used
in the NIEP algorithm, U is restricted to be real. This can be shown by comparing
the steps in each algorithm.
Indeed, suppose the current Y is symmetric and nonnegative. For any Schur decomposition of Y, T must be a real diagonal matrix. As we restrict the U matrix to be real, such a decomposition is nothing but a standard eigenvalue-eigenvector decomposition for a symmetric matrix (though the eigenvalues are not necessarily ordered along the diagonal of T ).
As both the elements of λ and the diagonal entries of T are real, the permutation
that minimizes (8.3.5) can be easily characterized. Indeed, in this case (8.3.5) is
minimized if and only if
$$\sum_{i=1}^{n} \tilde{\lambda}_i T_{ii} \qquad (8.3.8)$$
is maximized. From Lemma 4.2.3, (8.3.8) is maximized if the λ̃i's are ordered in the same way as the Tii's. This implies that if Y is symmetric, the step of producing an X from Y is the same in both algorithms.
Lastly, projection of a symmetric matrix onto (8.3.3) gives the same matrix as
projection onto (8.2.3), and hence this step in both algorithms is also the same. This establishes our claim.
We close this section by noting that unlike the SNIEP algorithm, for the NIEP
algorithm there is no guarantee that ‖X − Y ‖ is nonincreasing from one iteration to
the next.
8.4 Computational Results
This section contains some numerical results for both the SNIEP and NIEP algo-
rithms.
All computational results were obtained using a 3 GHz Pentium 4 machine. The
algorithms were coded using Matlab 7.0.
Throughout this section, when we say a matrix is ‘randomly generated’ we mean
each entry of that matrix is randomly drawn from the uniform distribution on the
interval [0, 1]. When dealing with the SNIEP algorithm, all randomly generated
matrices are chosen symmetric.
For both algorithms, the initial starting Y is always randomly generated. For
both algorithms, the convergence tolerance ε is set to 10−14.
8.4.1 SNIEP
This subsection starts with some results for randomly generated SNIEPs. To ensure
each problem is feasible, each desired spectrum is taken from a randomly generated
matrix.
Results for various problem sizes n are given in Table 8.1. For each value of n,
1000 problems were considered. The table contains the average number of iterations
required to find a solution, the average time required to find a solution, and the
success rate. As can be seen, the algorithm performed extremely well and was able
to solve every problem. In all cases, both the average number of iterations and the
average solution time was very small.
n      i     T        % solved
5      19    0.0016   100
10     18    0.0030   100
20     17    0.0075   100
100    12    0.15     100
Table 8.1: SNIEP: A comparison of performance for different problem sizes n. i
denotes the average number of iterations and T denotes the average convergence time
in CPU seconds.
Remark 8.4.1. It is interesting to note that T increases with n, as would be expected,
while i decreases. A reason for this could be the following. For any choice of desired
eigenvalues, M is a smooth manifold. In addition, if the eigenvalues defining M are
distinct, as they will be if they were taken from a randomly generated matrix, then
the dimension of M is n(n−1)/2; see [39, Chapter 2]. The dimension of Sn is n(n+1)/2.
Hence,
$$\frac{\dim M}{\dim S_n} = \frac{n-1}{n+1},$$
which is an increasing function of n. For larger n, M is ‘thicker’ relative to the
ambient space and hence, intuitively, the corresponding SNIEP is easier to solve.
[Figure: semilog plot of ‖Xi − X̄‖ versus iteration number i, decreasing from about 10^0 to 10^{-14} over roughly 14 iterations.]
Figure 8.3: Linear convergence of the SNIEP algorithm.
Suppose X1, X2, . . . is a sequence of X's produced by the SNIEP algorithm and that these points converge to a solution X̄. Figure 8.3 shows a typical plot of ‖Xi − X̄‖ versus i. Convergence is clearly linear. This is to be expected: suppose X̄ is a point on the boundary of N and that the (Xi)+'s lie in a particular face of N. As M is a manifold, near X̄ it looks locally like an affine subspace of Sn. As the face of N also looks locally like an affine subspace, we could expect local linear convergence, as alternating projections between two intersecting affine subspaces converge linearly [27].
Randomly generated problems have properties that are not shared by all SNIEPs.
For example, as already mentioned, randomly generated problems have distinct eigenvalues. We next consider a problem with repeated eigenvalues, namely λ = {3 − t, 1 + t, −1, −1, −1, −1} for 0 < t < 1. The t = 1/2 version of the problem is also considered in [18], where a numerical solution is sought via the gradient flow approach of that paper. An analytic solution to this problem is given in [65].
Notice that for any value of t the desired eigenvalues sum to zero and hence
there exist arbitrarily small perturbations of the spectrum which lead to an infeasible
SNIEP. In particular this problem cannot have any solutions in the interior of N .
We have tried the SNIEP algorithm on a number of other problems with repeated
eigenvalues with excellent results. This is the hardest problem we have encountered
so far.
The results of applying the algorithm to the problem for various values of t are given in Table 8.2. They are based on running the algorithm 100 times for each value of t.
t      i     T       % solved
0.25   480   0.061   100
0.5    470   0.061   97
0.75   340   0.050   65
0.95   310   0.046   59
Table 8.2: SNIEP: A problem with repeated eigenvalues, λ = {3 − t, 1 + t, −1, −1, −1, −1}. i and T do not include the attempts that had not converged after 5000 iterations.
First, the results indicate that the SNIEP algorithm is not always successful in
finding a solution. However, they also show that the algorithm can still be quite
successful if a number of initial conditions are tried. It is interesting to note that the
algorithm becomes more sensitive to the choice of the initial condition the larger t is.
Notice that as t → 1, the eigenvalues 3− t and 1+ t both converge to the same value,
and the dimension of the manifold M (which depends solely on the multiplicities of
the eigenvalues) goes from 9 when 0 < t < 1 to 8 when t = 1 [39, Chapter 2].
Aside: Regarding initial conditions, as noted before, both the SNIEP and NIEP algorithms use a nonnegative initial starting point. This is important: in fact, the performance of neither algorithm is as good if initial conditions that are not nonnegative are used.
Here is a solution that was found to the t = 1/2 problem:
$$\begin{bmatrix}
0 & 0 & 0 & \frac{\sqrt{3}}{2} & 0 & 1 \\
0 & 0 & 1 & \frac{1}{2} & 1 & 0 \\
0 & 1 & 0 & \frac{1}{2} & 1 & 0 \\
\frac{\sqrt{3}}{2} & \frac{1}{2} & \frac{1}{2} & 0 & \frac{1}{2} & \frac{\sqrt{3}}{2} \\
0 & 1 & 1 & \frac{1}{2} & 0 & 0 \\
1 & 0 & 0 & \frac{\sqrt{3}}{2} & 0 & 0
\end{bmatrix}.$$
This solution is different to both the solution in [18] and the solution in [65]. A
number of other solutions were also found.
Here is an X+ corresponding to an infeasible X (again for the t = 1/2 problem):
$$\begin{bmatrix}
\frac{1}{2} & 0 & 0 & 0 & 0 & \sqrt{\frac{3}{2}} \\
0 & 0 & \frac{7}{8} & \frac{7}{8} & \frac{7}{8} & 0 \\
0 & \frac{7}{8} & 0 & \frac{7}{8} & \frac{7}{8} & 0 \\
0 & \frac{7}{8} & \frac{7}{8} & 0 & \frac{7}{8} & 0 \\
0 & \frac{7}{8} & \frac{7}{8} & \frac{7}{8} & 0 & 0 \\
\sqrt{\frac{3}{2}} & 0 & 0 & 0 & 0 & 0
\end{bmatrix}.$$
The eigenvalues of this matrix are λ = {21/8, 3/2, −7/8, −7/8, −7/8, −1}.
8.4.2 NIEP
This subsection starts with some results for randomly generated NIEPs. Again, to
ensure each problem is feasible, each desired spectrum is taken from a randomly
generated matrix. Results are given in Table 8.3.
n      i     T       % solved
5      26    0.011   99.7
10     44    0.045   99.8
20     48    0.12    99.8
100    200   12      96.6
Table 8.3: NIEP: A comparison of performance for different problem sizes n. i and
T do not include the problems that had not converged after 5000 iterations.
As can be seen, the results are again very good with almost all problems solved.
The results indicate that NIEPs are harder to solve than SNIEPs: the number of iterations, the time, and the time per iteration are all greater. Part of the reason for the increase in time per iteration is the extra computation required to calculate the least squares matching component of each PM(U, T ) calculation; see (8.3.5). (For SNIEPs, the corresponding step is easy: the eigenvalues are real and just need to be sorted in decreasing order.)
For the NIEPs, both i and T increased with n.
The final problem we consider is taken from [21]. It is to find a stochastic matrix with (presumably randomly generated) spectrum λ = {1.0000, −0.2608, 0.5046, 0.6438, −0.4483}. Furthermore, the problem requires the matrix to have zeros in certain positions. In the context of Markov chains, we require the states to form a ring and that each state be linked to at most two immediate neighbors. The zero pattern is given by the zeros of the following matrix:
$$Z = \begin{bmatrix}
1 & 1 & 0 & 0 & 1 \\
1 & 1 & 1 & 0 & 0 \\
0 & 1 & 1 & 1 & 0 \\
0 & 0 & 1 & 1 & 1 \\
1 & 0 & 0 & 1 & 1
\end{bmatrix}. \qquad (8.4.1)$$
Our algorithm as it stands is not able to solve this problem, though it is able to do so if a simple modification is made. Using Z from (8.4.1), define
N̄ = {A ∈ Rn×n | Aij ≥ 0 and Aij = 0 if Zij = 0}.
N̄ is still a convex set. In the NIEP algorithm, replacing projection onto N by projection onto N̄ gives solutions (nonnegative matrices) with zeros in the desired places. Using the transformation discussed in the introduction of this chapter, solutions found by the algorithm can be converted into stochastic matrices with the same spectrum. Note that this transformation preserves zeros.
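The modified projection is simple to implement; a sketch (NumPy; names are ours) that clips to the nonnegative orthant and then zeroes the entries forbidden by the pattern:

```python
import numpy as np

Z = np.array([[1, 1, 0, 0, 1],
              [1, 1, 1, 0, 0],
              [0, 1, 1, 1, 0],
              [0, 0, 1, 1, 1],
              [1, 0, 0, 1, 1]])

def project_pattern(X, Z):
    """Project a (possibly complex) matrix onto the nonnegative matrices
    whose support is contained in the nonzero pattern of Z."""
    return np.maximum(X.real, 0.0) * (Z != 0)

Y = project_pattern(np.random.default_rng(0).standard_normal((5, 5)), Z)
```

Because the constraint set is a product of per-entry constraints, this entrywise rule is the exact Frobenius-norm projection, by the same decoupling argument as in Theorem 8.2.2.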
Using this methodology readily produced many solutions. An example is
X =
0.6931 0.2887 0 0 0.0182
0.1849 0.2422 0.5729 0 0
0 0.5476 0.3622 0.0902 0
0 0 0.5437 0.1233 0.3330
0.3712 0 0 0.6103 0.0185
.
Another solution is
X =
0.8634 0.0431 0 0 0.0936
0.6224 0 0.3776 0 0
0 0.4935 0.1564 0.3501 0
0 0 0.1107 0.0115 0.8778
0.3452 0 0 0.2467 0.4080
.
Notice that this latter solution has an extra zero. While this X still solves the problem, by further modifying the constraint set it is possible to ensure zeros appear only in the places specified by (8.4.1) and nowhere else. For example, replacing the set above by
{A ∈ Rn×n | Aij = 0 if Zij = 0 and Aij ≥ δ otherwise},
with δ > 0 a small constant, does the trick. Note that the stochastic matrix transformation leaves positive entries positive.
8.5 Summary
In this chapter we presented two related numerical methods, one for the NIEP, which
can also be used to solve the inverse eigenvalue problem for stochastic matrices, and
another for the SNIEP. While this chapter deals with two specific types of inverse
eigenvalue problems, the ideas presented should also be applicable to many other
inverse eigenvalue problems, including those involving nonsymmetric matrices.
Keywords: inverse eigenvalue problem, nonnegative matrices, stochastic matrices,
alternating projections, Schur’s decomposition.
Chapter 9
Newton Type Methods for
Nonnegative Inverse Eigenvalue
Problems
9.1 Introduction
In this chapter, we present two related Newton-type methods for solving the SNIEP and the NIEP/StIEP. Details of the considered problems can be found in Section 8.1.
For the NIEP, given a list of complex eigenvalues λD ∈ Cn, two approaches are considered for solving the following unconstrained nonlinear least squares problem
$$\min_{A \in \mathbb{R}^{n \times n}} f(A) := \frac{1}{2} \left\| \lambda(A \circ A) - \lambda_D \right\|_2^2. \qquad (9.1.1)$$
Here A ∘ A denotes the Hadamard (i.e., componentwise) product of A with itself. Note that the set of nonnegative matrices is exactly {A ∘ A | A ∈ Rn×n}. λ(A ∘ A) denotes the vector of eigenvalues of A ∘ A, with entries sorted to give the minimum
norm.
Along the same lines for the SNIEP, given a list of real eigenvalues λD ∈ Rn, two approaches are considered for solving the following problem
$$\min_{A \in S_n} f(A) := \frac{1}{2} \left\| \lambda(A \circ A) - \lambda_D \right\|_2^2. \qquad (9.1.2)$$
Note that the set of symmetric nonnegative matrices is exactly {A ∘ A | A ∈ Sn}.
Newton-type methods, which are well known in the optimization community, are iterative methods. The specific methods we use are the standard Newton's method and the Gauss-Newton method. In order to employ them, at each iteration the first and second derivatives of the eigenvalues of A ∘ A must be calculated. The Gauss-Newton method has the advantage of only requiring the first derivatives of the eigenvalues. Both algorithms presented have the desired property that, near solutions, they often converge quadratically.
The eigenvalues of A ∘ A may not be differentiable everywhere. However, they will be differentiable at all points at which A ∘ A has distinct eigenvalues. The algorithms can still be used to solve problems whose desired eigenvalues are distinct but whose separation is quite small, so this does not appear to be a serious limitation.
This chapter is structured as follows. Section 9.2 contains a brief overview of Newton-type methods; as Newton's method is a standard and classical technique, only the basic formulas are given. In order to use these methods to solve the SNIEP and NIEP/StIEP, the first and second derivatives of (9.1.1) and (9.1.2) are required. Details of these calculations are given in Section 9.3. Section 9.4 contains computational results.
9.2 Newton Type Methods
This section gives an overview of Newton type methods.
Newton's method plays a central role in the development of numerical techniques for optimization. It arises naturally from the Taylor approximation to a function and is a very important tool for solving many optimization problems. Most of the existing practical methods for optimization (e.g., the Gauss-Newton method and quasi-Newton methods) are variations of Newton's method [55]. Additional information on Newton-type methods can be found in [55] and [56].
It is assumed that the function f : RN → R to be minimized is (sufficiently) smooth; in particular, let f be twice continuously differentiable. Given a current iterate xk and an increment p, the quadratic approximation is assumed to be of the form
$$f(x_k + p) \approx f(x_k) + \nabla f(x_k)^T p + \frac{1}{2} p^T B_k p.$$
Here Bk is typically either the Hessian of f at xk or some approximation of this Hessian. If Bk is the Hessian, then the right-hand side is simply the 2nd order Taylor approximation of f at xk. In the variations of the standard Newton's method, other choices of Bk may be employed.
The Newton iterate takes the form
$$x_{k+1} = x_k - H_k^{-1} \nabla f(x_k), \quad k \geq 0, \qquad (9.2.1)$$
where Hk denotes the Hessian of f at xk; that is, in the Newton iterate the choice of Bk is this Hessian. Suppose the objective function f is a least squares cost, say f(x) = \frac{1}{2} \sum_{i=1}^{M} r_i^2(x) for some functions ri : Rn → R. If we define r(x) = (r1(x), . . . , rM(x))T and let J(x) denote the Jacobian of r(x), then
$$\nabla f(x) = J(x)^T r(x) \quad \text{and} \quad \nabla^2 f(x) = J(x)^T J(x) + \sum_{i=1}^{M} r_i(x) \nabla^2 r_i(x),$$
and a good choice for Bk is J(xk)^T J(xk). For this choice of Bk, the method is called the Gauss-Newton method.
A rather surprising property of the proposed algorithms is the following. Usually
the convergence of both Newton's method and the Gauss-Newton method can only be
analyzed from a local point of view: one has to start sufficiently close to a
solution for the algorithm to converge, and only in that case can local quadratic
convergence be expected. This does not appear to be the case for our algorithms.
The computational results show that the algorithms converge from randomly chosen
starting points, and typically converge quadratically near solutions.
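The Gauss-Newton iteration just described can be summarized in a few lines of code. The sketch below is an illustration only (the thesis experiments were coded in Matlab): it applies the update with B_k = J(x_k)^T J(x_k) to a small zero-residual least squares problem and exhibits the fast local convergence discussed above.

```python
import numpy as np

def gauss_newton(r, J, x0, tol=1e-12, max_iter=50):
    """Minimize f(x) = 0.5*||r(x)||^2 via Gauss-Newton.

    Each step solves (J^T J) p = -J^T r, i.e. B_k = J(x_k)^T J(x_k)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        rx, Jx = r(x), J(x)
        if np.linalg.norm(Jx.T @ rx) < tol:  # gradient of the cost
            break
        p = np.linalg.solve(Jx.T @ Jx, -Jx.T @ rx)
        x = x + p
    return x

# Toy zero-residual problem: intersect the circle x1^2 + x2^2 = 4 with x1 = x2.
r = lambda x: np.array([x[0]**2 + x[1]**2 - 4.0, x[0] - x[1]])
J = lambda x: np.array([[2*x[0], 2*x[1]], [1.0, -1.0]])

x = gauss_newton(r, J, x0=[1.5, 1.0])  # converges to (sqrt(2), sqrt(2))
```

Since the residual is zero at the solution, the Gauss-Newton Hessian approximation agrees with the true Hessian there and the local convergence is quadratic, mirroring the behaviour reported in Section 9.4.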
9.3 Derivative Calculations
In order to apply the Newton type methods of the previous section, we need to
calculate the appropriate first and second derivatives. Theorem 6.3.1 shows that if
A ◦ A in (9.1.1) and (9.1.2) has distinct eigenvalues, then the eigenvalues of
A ◦ A are differentiable. Suppose the conditions in Theorem 6.3.1 are satisfied with
k ≥ 2. Then we can write down explicit expressions for the first and second
derivatives of the eigenvalues: the first derivative is given by (6.3.2) and the
second derivative by (6.3.4).
These results can be used to calculate the derivatives of our objective function
f(A) = (1/2) ∑_{i=1}^n (λ_i − λ_i^D)^* (λ_i − λ_i^D). Differentiating, we have

∂f(A)/∂A_{kl} = Re ∑_{i=1}^n (λ_i − λ_i^D)^* ∂λ_i/∂A_{kl} (9.3.1)
and

∂²f(A)/(∂A_{kl} ∂A_{pq}) = Re [ ∑_{i=1}^n (∂λ_i/∂A_{kl})^* (∂λ_i/∂A_{pq})
    + ∑_{i=1}^n (λ_i − λ_i^D)^* ∂²λ_i/(∂A_{kl} ∂A_{pq}) ]. (9.3.2)
Note that ∂(A ◦ A)/∂A_{kl} is given by

∂(A ◦ A)/∂A_{kl} = 2 A_{kl} e_k e_l^T, (9.3.3)

where e_k denotes the k-th standard basis vector. Identity (9.3.3) implies that the
first term appearing in (6.3.4) is always zero. Combining (6.3.2), (6.3.4), (9.3.1),
(9.3.2) and (9.3.3), we now have a complete characterization of the first and second
derivatives of our cost (at points where A ◦ A has distinct eigenvalues).
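Formulas (6.3.2) and (6.3.4) are not reproduced here, but in the symmetric case the first derivative reduces to the standard first-order perturbation formula ∂λ_i/∂t = x_i^T (∂M/∂t) x_i, where x_i is the unit eigenvector of the symmetric matrix M for the simple eigenvalue λ_i. The sketch below (an illustration, not the thesis code) checks this formula for M = A ◦ A with symmetric A, using ∂(A ◦ A)/∂A_kk = 2 A_kk e_k e_k^T as in (9.3.3):

```python
import numpy as np

# A symmetric A whose Hadamard square M = A∘A has distinct eigenvalues.
A = np.array([[1.0, 0.5, 0.2],
              [0.5, 2.0, 0.3],
              [0.2, 0.3, 3.0]])

lam, X = np.linalg.eigh(A * A)  # eigenvalues ascending, orthonormal eigenvectors

# Analytic derivative of each eigenvalue w.r.t. the diagonal entry A_kk:
# dM/dA_kk = 2*A_kk * e_k e_k^T, hence dlam_i/dA_kk = 2*A_kk * X[k, i]**2.
k = 0
analytic = 2 * A[k, k] * X[k, :]**2

# Central finite difference for comparison.
h = 1e-5
Ap, Am = A.copy(), A.copy()
Ap[k, k] += h
Am[k, k] -= h
fd = (np.linalg.eigvalsh(Ap * Ap) - np.linalg.eigvalsh(Am * Am)) / (2 * h)
```

Because the eigenvalues of M are distinct and well separated, both `np.linalg.eigh` and the finite difference keep a consistent ascending ordering, so the two derivative vectors can be compared entrywise.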
Note that when applying the Gauss-Newton method, the approximate second derivatives
are given by the first term in (9.3.2),

∂²f(A)/(∂A_{kl} ∂A_{pq}) ≈ Re ∑_{i=1}^n (∂λ_i/∂A_{kl})^* (∂λ_i/∂A_{pq}). (9.3.4)
Remark 9.3.1. The calculations of the first and second derivatives for the SNIEP and
the NIEP/StIEP are roughly the same. The only difference is that the first derivative
requires less calculation for the SNIEP, as (9.3.1) is symmetric in that case.
9.4 Computational Results
This section contains some numerical results for both SNIEP and NIEP problems.
All computational results were obtained using a 3.19 GHz Pentium 4 machine.
The algorithms were coded using Matlab 6.5.
                        n      5      10     20
Newton's method   S.R.   100%   100%   100%
                  T      0.45   13     595
                  i      5.9    5.7    5.9
Gauss-Newton      S.R.   100%   100%   100%
method            T      0.03   0.16   4.8
                  i      5.9    5.7    5.8

Table 9.1: SNIEP: A comparison of performance for different n. S.R. denotes the
success rate, T denotes the average convergence time in CPU seconds, and i the
average number of iterations.
Throughout this section, when we say a matrix is ‘randomly generated’ we mean
each entry of that matrix is randomly drawn from the uniform distribution on the
interval [0, 1]. When dealing with the SNIEP algorithm, all randomly generated
matrices are chosen symmetric.
9.4.1 SNIEP
This subsection presents some results for SNIEPs.
First we present some results for randomly generated problems. To ensure each
problem was feasible, each desired spectrum was taken from a randomly generated
matrix. Results for various problem sizes n are given in Table 9.1. For each value
of n, 1000 problems were created (except for n = 20 with Newton's method, where 100
problems were created). Initial conditions were chosen randomly. The convergence
condition used was ‖λ(A ◦ A) − λD‖2 < ε with ε = 10^{-14}.
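The problem-generation and convergence-testing scheme can be sketched as follows (an illustration of the setup, not the thesis code). A random symmetric matrix B supplies a guaranteed-feasible target spectrum, since the entrywise square root A = √B satisfies A ◦ A = B:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Random symmetric matrix with entries in [0, 1]; its spectrum is the target.
B = rng.uniform(0.0, 1.0, size=(n, n))
B = (B + B.T) / 2
lam_D = np.linalg.eigvalsh(B)  # desired spectrum (feasible by construction)

def converged(A, lam_D, eps=1e-14):
    """Convergence test of Section 9.4: ||lambda(A∘A) - lambda_D||_2 < eps."""
    return np.linalg.norm(np.linalg.eigvalsh(A * A) - lam_D) < eps

# The entrywise square root of B is an exact solution of this SNIEP instance.
A_exact = np.sqrt(B)
```

In the actual experiments the iterate A is of course produced by the Newton or Gauss-Newton iteration rather than taken directly from B; the sketch only shows why every generated instance is solvable.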
As can be seen, the algorithms perform extremely well and solve every problem. As
expected, the average solution time for the Gauss-Newton method is much less than
that for Newton's method. The average numbers of iterations are roughly the same for
both algorithms. (The algorithms were able to solve problems up to n = 48. Due to
the computational constraints of the machine used, we did not obtain results for
problems larger than this size. As far as we tested, the success rate for n = 48
problems was also 100%; however the computation was extensive and each iteration
consumed a lot of time.)

Figure 9.1 shows a typical plot of ‖λ(A ◦ A) − λD‖2 versus i. As expected, near a
solution the convergence was quadratic.

Figure 9.1: Quadratic convergence near solution of the SNIEP algorithm. (Plot of
‖λ(A ◦ A) − λD‖2 versus iteration number i; the error drops from 10^0 to below
10^{-14} in six iterations.)
Randomly generated problems have properties that are not shared by all SNIEPs; for
example, they have distinct eigenvalues. We next show results for a problem with
repeated eigenvalues, namely λD = {3 − t, 1 + t, −1, −1, −1, −1} for 0 < t < 1.
Refer to Section 8.4 for details on the literature on this particular problem and
the existing solution methods.
Table 9.2 gives the results of applying the Gauss-Newton algorithm to the problem
for various values of t; they are based on running the algorithm 100 times for each
value of t.

t      i     T     % solved
0.25   67    0.35  96%
0.5    123   0.64  91%
0.75   141   0.76  83%
0.95   206   0.99  89%

Table 9.2: SNIEP: A particular problem with repeated eigenvalues. i and T are based
only on the problems that converged within 1000 iterations. ε = 10^{-3}.
Though the algorithm did not successfully find a solution in every trial, the
performance was again very good. As mentioned in the introduction, the eigenvalues
of A ◦ A may not be differentiable where A ◦ A has repeated eigenvalues. As can be
seen from Table 9.2, this is not a serious limitation when applying our algorithm to
problems with repeated eigenvalues.
9.4.2 NIEP
This subsection presents some results for NIEPs.
First we present results for some randomly generated problems. Again, to ensure each
problem was feasible, each desired spectrum was taken from a randomly generated
matrix.
Results for various problem sizes n are given in Table 9.3. For each value of n,
1000 problems were created (except for n = 20 with Newton's method, where 100
problems were created). The table contains the average number of iterations required
to find a solution, the average time required to find a solution, and the success
rate. As can be seen, the algorithms perform very well on random problems.
                        n      5       10      20
Newton's method   S.R.   98.7%   98.0%   91%
                  T      0.78    32      1913
                  i      8.0     9.3     12
Gauss-Newton      S.R.   99.1%   98.3%   98.2%
method            T      0.04    0.62    45
                  i      8.1     9.6     13

Table 9.3: NIEP: A comparison of performance for different n. T and i are based only
on those problems that were successfully solved within 1000 iterations. ε = 10^{-14}.
Figure 9.2 shows a typical plot of ‖λ(A ◦ A) − λD‖2 versus i. As expected, near a
solution the convergence was quadratic.
As discussed in Section 8.1, under rather mild assumptions on the desired spectrum,
a nonnegative matrix is readily transformed into a stochastic matrix. The final
problem we consider is the counterpart of the problem in Section 8.4: find a
stochastic matrix with spectrum λ = {1.0000, −0.2608, 0.5046, 0.6438, −0.4483}.
Furthermore, the problem requires the matrix to have zeros in certain positions, as
in the pattern given in (8.4.1). Using either Newton's method or the Gauss-Newton
method produced many solutions. A particular solution is
X =
    [ 0       0.9667  0       0       0.0333
      0.1866  0       0.8133  0       0
      0       0.0828  0.8581  0.059   0
      0       0       0.0004  0.5811  0.4185
      0.6083  0       0       0.3916  0      ] .
Notice that this solution has three extra zeros, which shows the flexibility of the
proposed algorithms.
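The printed solution can be verified directly: all entries are nonnegative, each row sums to 1 up to the four-digit rounding shown, the trace equals the sum of the target eigenvalues, and by Perron-Frobenius the spectral radius is squeezed between the minimum and maximum row sums and hence equals 1 to within the rounding. A quick numerical check (illustration only):

```python
import numpy as np

X = np.array([
    [0,      0.9667, 0,      0,      0.0333],
    [0.1866, 0,      0.8133, 0,      0     ],
    [0,      0.0828, 0.8581, 0.059,  0     ],
    [0,      0,      0.0004, 0.5811, 0.4185],
    [0.6083, 0,      0,      0.3916, 0     ],
])

target = np.array([1.0000, -0.2608, 0.5046, 0.6438, -0.4483])

row_sums = X.sum(axis=1)                     # each ≈ 1 (stochastic, up to rounding)
trace_gap = abs(np.trace(X) - target.sum())  # eigenvalue sum equals the trace
perron = np.max(np.abs(np.linalg.eigvals(X)))  # spectral radius ≈ 1
```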
Figure 9.2: Quadratic convergence near solution of the NIEP algorithm. (Plot of
‖λ(A ◦ A) − λD‖2 versus iteration number i; the error drops to below 10^{-14} within
twelve iterations.)
9.5 Summary
In this chapter, two Newton type algorithms were presented for solving the
NIEP/StIEP and the SNIEP. As demonstrated by the experiments, they perform very
well. While they deal with two specific types of inverse eigenvalue problems, the
ideas are applicable to many other inverse eigenvalue problems subject to different
structural constraints, especially those involving nonsymmetric matrices.
Keywords: inverse eigenvalue problem, nonnegative matrices, Newton's method,
Gauss-Newton method.
Part V
Conclusion and Future Work
Chapter 10
Conclusion and Future Work
10.1 Conclusion
The main goal of this research has been to develop optimization algorithms to tackle
some interesting, important and challenging inverse eigenvalue problems arising in
control and nonnegative matrices. A common feature shared by all the problems
considered is that they have simple formulations yet are very hard to solve. Hence
novel, efficient, reliable and generally applicable algorithms are of great interest.
On the control side, we concentrated on static output feedback related problems. The
control tasks range from basic stabilization to much more sophisticated hybrid pole
placement and simultaneous stabilization problems. The fact that the desired pole
placement regions need not be convex or even connected allows much flexibility in
choosing pole placement regions to meet various control design tasks. The three
methodologies each have their own features.
The projective methodology is generally applicable to various control problems.
The problem formulation offers a genuinely new perspective and is one of the main
contributions. The computation of the 'projection' onto a nonconvex set is a hard
open problem; we propose a reasonable substitute that makes the alternating
projection scheme work. Such ideas should be widely applicable to control and
non-control problems, especially those involving nonsymmetric matrices. The
algorithms are iterative in nature and each iteration is very fast to compute, which
makes it possible to solve high dimensional problems. While there is no guarantee
that the algorithms will always converge, the freedom to choose multiple starting
points greatly increases the likelihood that solutions can be found.
The trust region methods are proposed for solving the classical pole placement
problem and can be extended to pole placement problems in greater generality. The
problem formulation arises naturally from the control goals and has a clear
geometric interpretation. As the algorithms are iterative in nature, each iteration
is comparatively expensive to compute. However, in most cases the convergence near
solutions is quadratic, which results in a low overall computational cost.
The Gauss-Newton method is built on a novel problem formulation and sophisticated
computations. The problem formulation is motivated by the real Schur form of the
closed loop system matrices. The algorithm is generally applicable to various
problems, including those with repeated eigenvalues. Though its computation is
extensive and the choice of starting point is important, the algorithm performs very
well in experiments.
On the nonnegative matrices side, we tackled the nonnegative, symmetric nonnegative
and stochastic inverse eigenvalue problems respectively. To the best of our
knowledge, only a few algorithms exist in the literature for these problems. Given
the wide application of nonnegative matrices, these problems have broad potential
applications. The features of the methodologies are as follows.
The projective methodology is built on a simple problem formulation with a clear
geometric meaning. The projective algorithms are very fast and can be applied to
high dimensional problems. The way of handling nonconvexity in this setting should
be applicable to related problems, especially those involving nonsymmetric matrices.
The Newton type methods are proposed for constrained nonlinear optimization problem
formulations. They utilize the first and second derivatives of the eigenvalues and
are iterative in nature. The algorithms have the desirable property that near
solutions the convergence is typically quadratic.
Extensive experiments were carried out for all the proposed algorithms and the
results were compared with other existing methods. Based on the computational
results and some preliminary results in future directions, we believe the algorithms
in this thesis are new, interesting and promising.
10.2 Future Work
Inverse eigenvalue problems form a large family encompassing a wide scope of
problems. Much work remains to be done, both on the theoretical side of establishing
necessary and/or sufficient conditions under which a problem is solvable, and on the
practical side of developing efficient and reliable algorithms. In this section we
mention a few directions for future work. We believe these problems should be
solvable based on either the methodologies proposed in this thesis or variations of
them.
1. Chapter 5 presents an overview of a projective methodology and some numerical
results for static output feedback simultaneous stabilization and decentralized
control problems. Within the same framework more related control problems could be
solved, for example reduced order output feedback control, regional pole placement,
H2/H∞ control, etc. As simultaneous stabilization via static output feedback is
NP-hard, it would be interesting to see the performance of the proposed algorithm on
more examples.
2. In Chapter 6, trust region methods are introduced to solve classical pole
placement problems. With little modification, these methods can be used to solve all
other problems in the generalized pole placement framework. As the performance for
classical pole placement is extremely good, we expect the performance for other,
less standard problems to be equally good.
3. Chapter 7 introduces an interesting problem formulation for classical pole
placement via static output feedback and proposes a Gauss-Newton method for solving
it. A detailed analysis of convergence properties and related issues is needed. In
addition, we believe this method is able to handle problems with repeated
eigenvalues and thus has no formal constraints; however, this conjecture needs to be
verified in numerical experiments.
4. COMPleib is a problem database containing many milestone static output feedback
control problems. Having tested the projective methodology on static output feedback
stabilization problems (results are shown in Appendix B), it would be interesting to
see the performance of the trust region methods and the Gauss-Newton method,
especially on those problems where mp < n.
5. As mentioned previously, the projective methodology, Newton type methods, and
trust region methods can all potentially be applied to other inverse eigenvalue
problems subject to different structural and spectral constraints. For example
[20, Problem 4.15]: given a set L = {(i_t, j_t)}_{t=1}^{l} of index pairs, a set of
values {a_1, . . . , a_l} over a field F, and a set of values
λD = {λ_1, . . . , λ_n}, find a matrix X ∈ F^{n×n} such that

λ(X) = λD,
X_{i_t, j_t} = a_t, for t = 1, . . . , l.

This is the most general setting for inverse eigenvalue problems with prescribed
entries. The field F can be the reals or the complexes. The methodologies employed
in this thesis are readily applied to such problems.
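To illustrate how such a problem fits the least squares framework of Chapter 9, the sketch below (a hypothetical setup with F = R; the names are our own, not from [20]) fixes the prescribed entries via a boolean mask, treats the remaining entries as free variables, and measures the cost on sorted real spectra:

```python
import numpy as np

n = 3
# Prescribed entries: L = {(0, 1)} with value a_1 = 0.
mask = np.zeros((n, n), dtype=bool)
mask[0, 1] = True
values = {(0, 1): 0.0}

lam_D = np.array([1.0, 2.0, 3.0])  # desired spectrum

def assemble(free, mask, values):
    """Build X from its free entries, keeping prescribed entries fixed."""
    X = np.zeros(mask.shape)
    X[~mask] = free  # row-major order of the unmasked positions
    for (i, j), a in values.items():
        X[i, j] = a
    return X

def cost(free):
    """f(X) = 0.5 * ||sort(lambda(X)) - lambda_D||^2 (real spectra assumed)."""
    lam = np.sort(np.linalg.eigvals(assemble(free, mask, values)).real)
    return 0.5 * np.linalg.norm(lam - lam_D) ** 2

# X0 = diag(1, 2, 3) satisfies both the spectral and the entry constraints,
# so the cost at its free entries is (numerically) zero.
X0 = np.diag([1.0, 2.0, 3.0])
c0 = cost(X0[~mask])
```

Any of the solvers in this thesis could in principle be applied to this cost; the sketch only sets up the parameterization and verifies that a known solution has zero cost.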
Appendix A
Results for Classical Pole
Placement
This appendix lists some of the main theoretical pole placement results that have
appeared in the literature to date. References to the literature have already been
given in the introduction of Chapter 4.
When referring to ‘arbitrary pole placement’ it is with the understanding that
complex poles are placed in complex conjugate pairs.
Theorem A.0.1. Arbitrary pole placement implies the system is controllable and
observable, and that mp ≥ n.
The necessity of controllability and observability can be seen by considering the
controllability and observability canonical forms. The necessity of the dimension
constraint follows from the fact that if one wants to arbitrarily assign n values,
at least as many degrees of freedom are required in the input variable.
The following result was shown by Wang in [74].
Theorem A.0.2. Consider fixed system dimensions n, m and p, and suppose mp > n.
Arbitrary pole placement is possible for generic systems with these dimensions.
The sufficient condition just mentioned is for generic systems. For particular
systems we have the following sufficient condition by Kimura [45].
Theorem A.0.3. Consider a controllable and observable system satisfying m + p > n.
Given any choice of desired poles λ_1, . . . , λ_n (with complex poles in complex
conjugate pairs) and arbitrary ε > 0, there exists K such that
|λ_i(A + BKC) − λ_i| < ε, i = 1, . . . , n.
Appendix B
More Computational Results for
Stabilization
COMPleib is a control problem database containing many milestone control problems.
We ran the projective algorithm presented in Chapter 4 on the static output
feedback stabilization problems given in COMPleib that are not open loop stable.
Details of the problems can be found in [50].
For each problem, 100 random initial conditions were tested and the maximum number
of iterations per initial condition was set to 1000. Once a stabilizing solution was
found, the algorithm was terminated.
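The success test underlying Table B.1 can be sketched as follows (an illustration; the helper names are our own, not from the thesis): a gain K is accepted as soon as the spectral abscissa, i.e. the largest real part of the eigenvalues of A + BKC, which is the quantity reported in the last column of the table, becomes negative.

```python
import numpy as np

def spectral_abscissa(M):
    """Largest real part of the eigenvalues of M."""
    return np.max(np.linalg.eigvals(M).real)

def is_stabilizing(A, B, C, K):
    """A static output feedback gain K stabilizes (A, B, C) iff
    the spectral abscissa of A + B K C is negative."""
    return spectral_abscissa(A + B @ K @ C) < 0

# Toy example: an unstable A stabilized by output feedback with B = C = I.
A = np.array([[0.5, 1.0],
              [0.0, -1.0]])
B = np.eye(2)
C = np.eye(2)
K = np.array([[-2.0, 0.0],
              [0.0,  0.0]])
# A + BKC = [[-1.5, 1.0], [0.0, -1.0]] has eigenvalues -1.5 and -1.0.
```

The multi-start protocol then simply repeats the projective iteration from random initial gains until `is_stabilizing` returns true or the budget of 1000 iterations is exhausted.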
Problem  (n, m, p)     S.R.   i     T        ρ(A + BKC)
AC1      (5, 3, 3)     97%    15    0.005    -0.47
AC2      (5, 3, 3)     100%   12    0.003    -0.52
AC4      (4, 1, 2)     100%   18    0.003    -0.05
AC5      (4, 2, 2)     100%   517   0.11     -0.0013
AC7      (9, 1, 2)     55%    148   0.065    -6.9e-4
AC8      (9, 1, 5)     43%    929   0.32     -0.10
AC9      (10, 4, 5)    98%    129   0.088    -0.099
AC10     (55, 2, 2)    -      -     -        -
AC11     (5, 2, 4)     3%     199   0.00048  -5.1e-4
AC12     (4, 3, 4)     100%   2.7   0.0011   -0.72
AC13     (28, 3, 4)    -      -     -        -
AC14     (40, 3, 4)    -      -     -        -
AC18     (10, 2, 2)    100%   612   0.31     -2.9e-4
HE1      (4, 2, 1)     -      -     -        -
HE3      (8, 4, 6)     100%   5.6   0.0034   -0.20
HE4      (8, 4, 6)     100%   7.1   0.0055   -0.16
HE5      (8, 4, 2)     3%     527   0.24     -6.8e-5
HE6      (20, 4, 6)    100%   2.2   0.005    -0.005
HE7      (20, 4, 6)    100%   1.9   0.0032   -0.005
JE2      (21, 3, 3)    54%    131   0.25     -1.28
JE3      (24, 3, 6)    58%    155   1.3      -1.1581
REA1     (4, 2, 3)     51%    62    0.018    -0.54
REA2     (4, 2, 2)     71%    23    0.0057   -0.071
REA3     (12, 1, 3)    100%   22    0.018    -0.02
REA4     (8, 1, 1)     -      -     -        -
DIS2     (3, 2, 2)     100%   2.6   0.0005   -0.94
DIS4     (6, 4, 6)     100%   2.7   0.016    -0.72
DIS5     (4, 2, 2)     100%   523   0.13     -0.002
WEC1     (10, 3, 4)    100%   508   0.26     -0.0036
BDT2     (82, 4, 4)    100%   13    0.43     -0.075
IH       (21, 11, 10)  100%   10    0.095    -0.5
PAS      (5, 1, 3)     -      -     -        -
TF1      (7, 2, 4)     100%   80    0.033    -0.08
TF2      (7, 2, 3)     100%   4.8   0.002    -1e-5
TF3      (7, 2, 3)     -      -     -        -
NN1      (3, 1, 2)     -      -     -        -
NN2      (2, 1, 1)     100%   1.5   0        -0.98
NN3      (4, 1, 1)     -      -     -        -
NN5      (7, 1, 2)     -      -     -        -
NN6      (9, 1, 4)     -      -     -        -
NN7      (9, 1, 4)     -      -     -        -
NN9      (5, 3, 2)     -      -     -        -
NN10     (8, 3, 3)     -      -     -        -
NN12     (6, 2, 2)     -      -     -        -
NN13     (6, 2, 2)     100%   15    0.0045   -0.086
NN14     (6, 2, 2)     100%   15    0.005    -0.092
NN15     (3, 2, 2)     100%   2.4   0.0006   -0.31
NN16     (8, 4, 4)     100%   2.2   0.0018   -0.30
NN17     (3, 2, 1)     1%     1     0        -0.55

Table B.1: S.R. denotes the success rate, T denotes the average convergence time in
CPU seconds, and i the average number of iterations. T and i are based only on those
problems that were successfully solved. '-' indicates a stabilizing solution was not
found.
Bibliography
[1] A. T. Alexandridis and P. N. Paraskevopoulos, A new approach to eigenstructure
assignment by output feedback, IEEE Trans. on Automatic Control 41 (1996),
no. 7, 1046–1050.
[2] Heinz H. Bauschke and Jonathan M. Borwein, On projection algorithms for solv-
ing convex feasibility problems, SIAM Review 38 (1996), no. 3, 367–426.
[3] A. Berman and R. J. Plemmons, Nonnegative matrices in the mathematical sci-
ences, Academic Press, New York, 1979, Also published as Classics in Applied
Mathematics 9, SIAM, Philadelphia, 1994.
[4] V. Blondel, Simultaneous stabilization of linear systems, Lecture Notes in Control
and Information Sciences 191 (1994).
[5] V. D. Blondel and J. N. Tsitsiklis, A survey of computational complexity results
in systems and control, Automatica 36 (2000), 1249–1274.
[6] Vincent Blondel and John H. Tsitsiklis, NP-hardness of some linear control de-
sign problems, SIAM J. Control Optim. 35 (1997), no. 6, 2118–2127.
[7] J. Bosche, O. Bachelier, and D. Mehdi, Robust pole placement by static output
feedback, Proceedings of the 43rd IEEE Conference on Decision and Control
(Paradise Island, Bahamas), 2004, pp. 869–874.
[8] S. Boyd and J. Dattorro, Alternating projections, Lecture Note, 2003.
[9] L. M. Bregman, The method of successive projection for finding a common point
of convex sets, Soviet Mathematics 6 (1965), no. 3, 688–692.
[10] C. I. Byrnes, Pole assignment by output feedback, Three decades of mathematical
system theory (H. Nijmeijer and J. M. Schumacher, eds.), Lecture notes in control
and inform. sci. 135, Springer Verlag, 1989, pp. 31–78.
[11] Y. Y. Cao and Y. X. Sun, Static output feedback simultaneous stabilization:
ILMI approach, Int. J. Control 70 (1998), no. 5, 803–814.
[12] Y. Y. Cao, Y. X. Sun, and J. Lam, Simultaneous stabilization via static output
feedback and state feedback, IEEE Trans. Automatic Control 44 (1999), no. 6,
1277–1282.
[13] Y. Y. Cao, Y. X. Sun, and W. J. Mao, Output feedback decentralized stabilization:
ILMI approach, Systems & Control Letters 35 (1998), 183–194.
[14] L. Carotenuto, C. Franze, and P. Muraca, New computational procedure to the
pole placement problem by static output feedback, IEE Proceedings Control The-
ory & Applications 148 (2001), no. 6, 466–471.
[15] H. B. Chen, J. H. Chow, M. A. Kale, and K. D. Minto, Simultaneous stabilization
using stable system inversion, Automatica 31 (1995), no. 4, 531–542.
[16] M. Chilali and P. Gahinet, H∞ design with pole placement constraints: An LMI
approach, IEEE Trans. on Automatic Control 41 (1996), no. 3, 358–367.
[17] M. T. Chu, Inverse eigenvalue problem, SIAM Rev 40 (1998), 1–39.
[18] M. T. Chu and K. R. Driessel, Constructing symmetric nonnegative matrices
with prescribed eigenvalues by differential equations, SIAM J. Math. Anal. 22
(1991), 1372–1387.
[19] M. T. Chu and G. H. Golub, Structured inverse eigenvalue problems, Acta Nu-
merica 11 (2002), 1–71.
[20] , Inverse eigenvalue problems, Oxford Science Publications, 2005.
[21] M. T. Chu and Q. Guo, A numerical method for the inverse stochastic spectrum
problem, SIAM J. Matrix Anal. Appl. 19 (1998), no. 4, 1027–1039.
[22] M. T. Chu and S. F. Xu, On computing minimal realizable spectral radii of
nonnegative matrices, Numer. Linear Algebra 12 (2005), 77–86.
[23] P. L. Combettes and H. J. Trussell, Method of successive projections for finding
a common point of sets in metric spaces, J. Optim. Theory Applics. 67 (1990),
487–507.
[24] A. R. Conn, N. I. M. Gould, and P. L. Toint, Trust-region methods, MPS-SIAM
Series on Optimization, SIAM, Philadelphia, 2000.
[25] E. J. Davison and S. H. Wang, On pole assignment in linear multivariable systems
using output feedback, IEEE Trans. on Automatic Control 20 (1975), no. 4, 516–
518.
[26] M. C. de Oliveira and J. C. Geromel, Numerical comparison of output feedback
design methods, Proceedings of American Control Conference (New Mexico),
1997, pp. 72–76.
[27] F. Deutsch, Best approximation in inner product spaces, Springer-Verlag, New
York, 2001.
[28] P. D. Egleston, T. D. Lenker, and S. K. Narayan, The nonnegative inverse eigen-
value problem, Linear Algebra and its Applications 379 (2004), 475–490.
[29] A. Eremenko and A. Gabrielov, Pole placement by static output feedback for
generic linear systems, SIAM J. Control Optimization 41 (2002), no. 1, 303–
312.
[30] S. Friedland, J. Nocedal, and M. L. Overton, The formulation and analysis of
numerical methods for inverse eigenvalue problems, SIAM J. Numer. Anal. 24
(1987), no. 3, 634–667.
[31] M. Fu, Pole placement via static output feedback is NP-hard, IEEE Trans. on
Automatic Control 49 (2004), no. 5, 855–857.
[32] G. Garcia, B. Pradin, and F. Zeng, Stabilization of discrete time linear systems
by static output feedback, IEEE Trans. on Automatic Control 46 (2001), no. 12,
1954–1958.
[33] K. M. Grigoriadis and E. B. Beran, Alternating projection algorithms for linear
matrix inequalities problems with rank constraints, Advances on Linear Matrix
Inequality in Control, (1999), 251–267, L. El Ghaoui and S,-I. Niculescu, Eds.
SIAM.
[34] K. M. Grigoriadis and R. E. Skelton, Alternating convex projection methods for
covariance control design, Int. J. Control 60 (1994), no. 6, 1083–1106.
[35] , Low-order control design for LMI problems using alternating projections,
Automatica 32 (1996), no. 8, 1117–1125.
[36] L. G. Gubin, B. T. Polyak, and E. V. Raik, The method of projections for finding
the common point of convex sets, USSR Comput. Math. Phys. 7 (1967), no. 6,
1–24.
[37] S. Gutman and E. Jury, A general theory for matrix root-clustering in subregions
of the complex plane, IEEE Trans. on Automatic Control 26 (1981), no. 4, 853–
863.
[38] A. Hassibi, J. How, and S. Boyd, Low-authority controller design via convex
optimization, AIAA J. of Guidance, Control, and Dynamics 22 (1999), no. 6,
862–872.
[39] U. Helmke and J. B. Moore, Optimization and dynamical systems, Springer-
Verlag, London, 1994.
[40] R. A. Horn and C. A. Johnson, Matrix analysis, Cambridge University Press,
1985.
[41] , Topics in matrix analysis, Cambridge University Press, 1991.
[42] C. R. Johnson, T. J. Laffey, and R. Loewy, The real and the symmetric non-
negative inverse eigenvalue problems are different, Proceedings of the American
Mathematical Society 124 (1996), no. 12, 3647–3651.
[43] T. Kato, A short introduction to perturbation theory for linear operators,
Springer-Verlag, New York, 1982.
[44] L. H. Keel, S. P. Bhattacharyya, and J. W. Howze, Robust control with structured
perturbation, IEEE Trans. on Automatic Control 33 (1988), no. 1, 68–78.
[45] H. Kimura, Pole assignment by gain output feedback, IEEE Trans. Automat.
Control (1975), 509–516.
[46] , A further result on the problem of pole assignment by output feedback,
IEEE Trans. on Automatic Control 22 (1977), no. 3, 458–463.
[47] B. Labibi, B. Lohmann, A. K. Sedigh, and P. J. Maralani, Output feedback decen-
tralized control of large-scale systems using weighted sensitivity functions mini-
mization, Systems & Control Letters 47 (2002), 191–198.
[48] P. Lancaster, On eigenvalues of matrices dependent on a parameter, Numer.
Math. 6 (1964), 377–387.
[49] T. H. Lee, Q. G. Wan, and E. K. Koh, An iterative algorithm for pole placement
by output feedback, IEEE Trans. on Automatic Control 39 (1994), no. 3, 565–568.
[50] F. Leibfritz, COMPleib: Constrained matrix-optimization problem library - a collec-
tion of test examples for nonlinear semidefinite programs, control system design
and related problems, Tech. rep., University of Trier, 2003.
[51] D. Luenberger, Optimization by vector space methods, Wiley, New York, 1969.
[52] R. E. Mahony and U. Helmke, System assignment and pole placement for sym-
metric realisations, Journal of Math. Systems, Estimation and Control 5 (1995),
no. 2, 1–32.
[53] S. Martello and P. Toth, Linear assignment problems, Annals of Discrete Math-
ematics 31 (1987), 259–282.
[54] H. Minc, Nonnegative matrices, Wiley, New York, 1988.
[55] J. J. More and D. C. Sorensen, Newton’s method, Studies in Numerical Analysis
24 (1985), 29–82.
[56] J. Nocedal and S. J. Wright, Numerical optimization, Springer, 1999.
[57] R. Orsi, U. Helmke, and J. B. Moore, A Newton-like method for solving rank
constrained linear matrix inequalities, Proceedings of the 43rd IEEE Conference
on Decision and Control (Paradise Island, Bahamas), 2004, pp. 3138–3144.
[58] H. Perfect, Methods of constructing certain stochastic matrices, Duke Math. J.
20 (1953), 395–404.
[59] B. T. Polyak and P. S. Shcherbakov, Hard problems in linear control theory: pos-
sible approaches to solution, Automation and Remote Control 66 (2005), no. 5,
681–718, Translated from Avtomatika i Telemekhanika, no. 5, 2005, pp. 7–46.
[60] J. Rosenthal and F. Sottile, Some remarks on real and complex output feedback,
Systems and Control Letters 33 (1998), 73–80.
[61] J. Rosenthal and J. C. Willems, Open problems in the area of pole placement,
Open problems in mathematical systems and control theory (V. D. Blondel,
E. D. Sontag, M. Vidyasagar, and J. C. Willems, eds.), Springer Verlag, 1998,
pp. 181–191.
[62] R. Saeks and J. Murray, Fractional representation, algebraic geometry and the
simultaneous stabilization problem, IEEE Trans. Automatic Control 27 (1982),
no. 4, 895–903.
[63] A. Satoh, J. Okubo, and K. Sugimoto, Transient response shaping in H∞ con-
trol by eigenstructure assignment to convex regions, Proc. 42nd IEEE Conf. on
Decision and Control (Maui, USA), 2003, pp. 2288–2293.
[64] H. Seraji, A new method for pole placement using output feedback, Int. J. Control
28 (1978), 147–155.
[65] G. W. Soules, Constructing symmetric nonnegative matrices, Linear and Multi-
linear Algebra 13 (1983), 241–251.
[66] B. Sridhar and D. P. Lindorff, Pole placement with constraint gain output feed-
back, Int. J. Control 18 (1973), no. 5, 993–1003.
[67] H. Stark and Yongyi Yang, Vector space projections, Wiley, 1998.
[68] K. R. Suleimanova, Stochastic matrices with real characteristic numbers, Soviet
Math. Dokl. 66 (1949), 343–345, In Russian.
[69] V. L. Syrmos, C. T. Abdallah, P. Dorato, and K. Grigoriadis, Static output
feedback — a survey, Automatica 33 (1997), no. 2, 125–137.
[70] V. L. Syrmos and F. L. Lewis, Output feedback eigenstructure assignment using
two sylvester equations, IEEE Trans. on Automatic Control 38 (1993), no. 3,
495–499.
[71] L. Vandenberghe and S. Boyd, Semidefinite programming, SIAM 38 (1996), no. 1,
49–95.
[72] M. Vidyasagar, Control system synthesis, MIT Press, Cambridge, MA, 1986.
[73] M. Vidyasagar and N. Viswanadham, Algebraic design techniques for reliable sta-
bilization, IEEE Trans. Automatic Control 27 (1982), no. 5, 1085–1095.
[74] X. A. Wang, Pole placement by static output feedback, J. of Math. Systems,
Estimation, and Control 2 (1992), 205–218.
[75] D. N. Wu, W. B. Gao, and M. Chen, Algorithm for simultaneous stabilization
of single-input systems via dynamic feedback, INT. J. Control 51 (1990), no. 3,
631–642.
[76] J. L. Wu and T. T. Lee, Optimal static output feedback simultaneous regional
pole placement, IEEE Trans. Systems, Man, and Cybernetics 35 (2005), no. 5,
881–893.
[77] S. F. Xu, An introduction to inverse algebraic eigenvalue problems, Peking Uni-
versity Press, Beijing and Vieweg & Sohn, Braunschweig, Germany, 1998.
[78] D. C. Youla and H. Webb, Image restoration by the method of convex projections:
Part 1 – theory, IEEE Trans. on Medical Imaging MI-1 (1982), no. 2, 81–94.