Download - Introduction to Convex Constrained Optimizationdspace.mit.edu › ... › 0 › lecture10.pdf · 2019-09-12 · Nonlinear Optimization Proof of Proposition Proof of the Proposition:

Introduction to

Convex Constrained Optimization

March 12, 2002

Overview�Nonlinear Optimization

�Portfolio Optimization

�An Inventory Reliability Problem

�Further Concepts for Nonlinear Optimization

�Convex Sets and Convex Functions

�Convex Optimization

c�2002 Massachusetts Institute of Technology. All rights reserved. 15.094 1

Overview�Pattern Classification

�Some Geometry Problems

�On the Geometry of Nonlinear Optimization

�Classification of Nonlinear Optimization Problems

�Solving Separable Convex Optimization via LinearOptimization

�Optimality Conditions for Nonlinear Optimization

�A Few Concluding Remarks


Nonlinear versusLinear Optimization

Linear Optimization

��

��

��

�

� �

...

��

� � ��c�2002 Massachusetts Institute of Technology. All rights reserved. 15.094 3

Nonlinear versusLinear Optimization

Nonlinear Optimization

��

��

�

� �

...

��

� � ��

In this model, we have ��

and �� .


PortfolioOptimization

Investor Preferences

Investors prefer higher annual rates of return oninvesting to lower annual rates of return.

Furthermore, investors prefer lower risk to higher risk.

Portfolio optimization seeks to optimally trade off riskand return in investing.



Portfolio Fractions

We consider � assets, whose annual rates of return �� arerandom variables, � � �� .

The expected annual return of asset � is ��, � � �� .

If we invest a fraction �� of our investment dollar in asset �,

the expected return of the portfolio is:

��

��

(� � � � � � � �� )



Portfolio Covariances & Variance

The covariance of the rates of return of assets � and � is given as

��

We can think of the �� values as forming a matrix

The variance of portfolio is then:

��

��

��

��

��

will always by SPD (symmetric positve-definite).

The “risk” in the portfolio is the standard deviation of the portfolio:

��

��c�2002 Massachusetts Institute of Technology. All rights reserved. 15.094 7


Maximal Return Model

We would like to determine the fractional investment values

�� in order to maximize the return of the portfolio, subject

to meeting some pre-specified target risk level, such as �� .

MAXIMIZE: RETURN ��

s.t.FSUM: �� ST. DEV.:

��

NONNEGATIVITY: � � � .c�2002 Massachusetts Institute of Technology. All rights reserved. 15.094 8


Minimal Risk Model

Alternatively, we might want to determine the fractional

investment values �� in order to minimize the risk of the

portfolio, subject to meeting a pre-specified target expected return

level, such as �� .

MINIMIZE: STDEV �

��

s.t.FSUM: �� EXP. RETURN: �� NONNEGATIVITY: � � � �



Discussion

Variations of this type of model are used pervasivelyby asset management companies worldwide.

This model can be extended to yield the CAPM(Capital Asset Pricing Model).

The Nobel Prize in economics was awarded in 1990 toMerton Miller, William Sharpe, and Harry Markowitz fortheir work on portfolio theory and portfolio models (andthe implications for asset pricing).


An InventoryReliability Problem

Given positive coefficients �� , and

Æ � �, solve for �� :

��

��

��

�� Æ�

��


Further Concepts forNonlinear Optimization

Feasible Region, Ball

Recall the basic nonlinear optimization model:

��

��

� �

� �

...

��

� � ��

where �� and �� .



Feasible Region, Ball

The feasible region of �� is the set:

��

The ball centered at �� with radius � is the set:

��



Local & Global Minima

Definition 1 � � is a local minimum of �� ifthere exists � � � such that �� for all

� � �� .

Definition 2 � � is a global minimum of �� if

�� for all � � .



Strict Local & Global Minima

Definition 3 � � is a strict local minimum of ��

if there exists � � � such that �� for all

� � �� , � � �.

Definition 4 � � is a strict global minimum of

�� if �� for all � � , � � �.



Local & Global Maxima

Definition 5 � � is a local maximum of �� ifthere exists � � � such that �� for all

� � �� .

Definition 6 � � is a global maximum of �� if

�� for all � � .



Strict Local & Globla Maxima

Definition 7 � � is a strict local maximum of

�� if there exists � � � such that �� forall � � �� , � � �.

Definition 8 � � is a strict global maximum of

�� if �� for all � � , � � �.



Local versus Global Minima

F

Illustration of local versus global optima.



Convex Sets and Functions

Definition 9 A subset � � �� is a convex set if

��

for any � � �� .

Illustration of convex and non-convex sets.



Intersections of Convex Sets

Proposition 1 If �� are convex sets, then � � � isa convex set.

Proposition 2 The intersection of any collection ofconvex sets is a convex set.

Illustration of the intersection of convex sets.



Convex Functions

Definition 10 A function �� is a convex function if��

for all � and � and for all � � �� .

Definition 11 A function �� is a strictly convexfunction if

��

for all � and � and for all � � �� , � � �.



Convex Functions

Illustration of convex and strictly convex functions.



Convex Optimization

� � ��

��

� �

Proposition 3 Suppose that is a convex set,

� � � is a convex function, and �� is a localminimum of � . Then �� is a global minimum of over

.



Proof of Proposition

Proof of the Proposition: Suppose �� is not a global minimum,i.e., there exists � � � for which �� . Let

�� , which is a convex combination of �� and �

for � � � �� (and therefore, �� for � � � ��). Note that

�� as �� .

From the convexity of ��,

��

for all � � � ��. Therefore, �� for all � � � ��,

and so �� is not a local minimum, resulting in a contradiction.

q.e.d.



Examples of Convex Functions

Functions of One Variable

Examples of convex functions of one variable:

��

��

��

�� for � � �

��

for � � �

��



Concave Functions & Maximization

The “opposite” of a convex function is a concave function:

Definition 12 A function �� is a concave function if

��

for all � and � and for all � � �� .

Definition 13 A function �� is a strictly concavefunction if

��

for all � and � and for all � � �� , � � �.



Concave Functions & Maximization

Illustration of concave and strictly concave functions.



Concave Maximization

Now consider the maximization problem � :

� � ��

��

� �

Proposition 4 Suppose that is a convex set,

� � � is a concave function, and �� is a localmaximum of � . Then �� is a global maximum of over

.



Linear Functions

Proposition 5 A linear function �� isboth convex and concave.

Proposition 6 If �� is both convex and concave,then �� is a linear function.

A linear function is convex and concave.



Convex Optimization

Level Sets of Convex Functions...

Suppose that �� is a convex function. The set

��

is the level set of �� at level �.

Proposition 7 If �� is a convex function, then ��

is a convex set.



Convex Optimization

...Level Sets of Convex Functions

The level sets of a convex function are convex sets.



Convex Optimization

Convex Optimization Model...

� � ��

��

...

��

!� ��

� � ��

�� is called a convex optimization problem

if �� are convex functions.



Convex Optimization

...Convex Optimization Model...

Proposition 8 The feasible region of � is aconvex set.

Corollary 1 Any local minimum of � will be aglobal minimum of � .

This is a most important aspect of a convexoptimization problem.



Convex Optimization

...Convex Optimization Model

Remark 1 If we replace “min” by “max” in � and if

�� is a concave function while �� areconvex functions, then any local maximum of � willbe a global maximum of � .



Further Properties of Convex Functions

Additions & Affine Transformations

Proposition 9 If �� and �� are convexfunctions, and �� , then

��

is a convex function.

Proposition 10 If �� is a convex function and

� !�� , then

�� !��

is a convex function.c�2002 Massachusetts Institute of Technology. All rights reserved. 15.094 35



Gradient and Hessian

A function �� is twice differentiable at � �� if thereexists a vector �� (called the gradient of ��) anda symmetric matrix"�� (called the Hessian of ��)for which:

��

�"�� #��

where #�� as �� .

�� $��

$�� $��

$��

� "�� $��

$��$��




Gradient & Hessian of Convex Functions

Theorem 1 Suppose that �� is twice differentiableon the open convex set �. Then �� is a convexfunction on the domain � if and only if "�� is SPSDfor all � � �.

(Recall that SPSD means symmetric and positivesemi-definite.)




Examples in � Dimensions��

�� where� is SPSD

�� for any norm �

��

��

� �� for � satisfying !� � �.

Corollary 2 If �� where � is a

symmetric matrix, then �� is a convex function if andonly if � is SPSD. Furthermore, �� is a strictlyconvex function if and only if � is SPD.




Norm Function

Illustration of a norm function.




The Gradient Inequality

The gradient inequality for convex functions:

Proposition 11 If �� is a differentiable convexfunction, then for any �� ,

��




The Subgradient Inequality

Even when a convex function is not differentiable, itmust satisfy the following “subgradient inequality.”

Proposition 12 If �� is a convex function, then forevery �, there must exist some vector � for which

�� for any � �

The vector � in this proposition is called a subgradientof �� at the point �.


More Examples of ConvexOptimization Problems

A Pattern Classification Training Problem

We are given:

� points �� that have property “P”

� points �� that do not have property “P”

Illustration of the pattern classification problem.




We seek a vector % and a scalar � for which:

�%�� for all � � � � � � &

�%�� for all � � � � � �

We will then use %� � to predict whether or not anyother point � has property P or not.

� If %�� , then we declare that � has property P.

� If %�� , then we declare that � does not haveproperty P.



A Pattern Classification Training Problem%� � define the hyperplane

"� � � ��%��

for which:

�%�� for all � � � � � � &

�%�� for all � � � � � � We would like the hyperplane"� � to be as far awayfrom the points �� as possible.




The distance of the hyperplane �� to any point �� is

��

The distance of the hyperplane �� to any point �� is

� ��

�If we normalize the vector � so that � � �, then the minimimdistance of the hyperplane �� to the points ��

�� is:

��




We would also like � and � to satisfy:

� � �, and

�� ismaximized.

�� Æ Æ

��

�� Æ� � � ��

� �� Æ� � � ��

� � ��

� � �� c�2002 Massachusetts Institute of Technology. All rights reserved. 15.094 46


A Pattern Classification Training Problem� � � �� Æ Æ

��

%�� Æ� � � � � � � &

� � %�� Æ� � � � � � �

% �

% � ��

� � is not a convex optimization problem,due to the presence of the constraint “ % .”




�� Æ Æ

�� Æ � � � � � � �

� � �� Æ � � � � � � �

��

� � ��

Transformation of variables:� %

Æ

� � �

Æ

�� Æ �� %

� ��

� ��




�� Æ Æ

s.t. �� Æ � � � � � � �

� � �� Æ � � � � � � �

��

� � ��

� ��

Æ

� ��

Æ

��

��

��

� ��

� � �� c�2002 Massachusetts Institute of Technology. All rights reserved. 15.094 49



��

��

��

��

� � ��

� � is a convex problem.

Solve � � for ��, and substitute

% �

� �

� �

� �

Æ

�



The Minimum Norm Problem

Given a vector �, we would like to find the closest pointto � that also satisfies the linear inequalities !� � �.

��

��

!� � �

� � ��



The Minimum Norm Problem

Illustration of the minimum norm problem.



The Fermat-Weber Problem

Given points �� , determine thelocation of a distribution center at the point � � �� thatminimizes the sum of the distances from � to each ofthe points �� .

'(� � ��

��

��

� � ��



The Fermat-Weber Problem

Illustration of the Fermat-Weber problem.

'(� is a convex unconstrained problem.c�2002 Massachusetts Institute of Technology. All rights reserved. 15.094 54


The Ball Circumscription Problem

Given points �� , determine thelocation of a distribution center at the point � � �� thatminimizes the maximum distance from � to any of thepoints �� .

� � � �� Æ Æ

��

�� Æ� � � � � � � �

� � ��



The Ball Circumscription Problem

Illustration of the ball circumscription problem.

This is a convex (constrained) optimization problem.



The Analytic Center Problem

Given a system of linear inequalities !� � �, wewould like to determine a “nicely” interior point �� thatsatisfies !�� .

! � � ��

��!��

��

!� � ��

� � ��c�2002 Massachusetts Institute of Technology. All rights reserved. 15.094 57



Illustration of the analytic center problem.




��

��

��

��

� � ��

Proposition 13 Suppose that �� solves the analyticcenter problem ! � . Then for each �, we have:

��!��

where

��

��!�� c�2002 Massachusetts Institute of Technology. All rights reserved. 15.094 59



��

��

��

��

� � ��

�� is not a convex problem, but it is equivalent to:

��

��

��

��

� � ��

Now notice that �� is a convex problem.



The Circumscribed Ellipsoid Problem

Given � points �� , determine an ellipsoid of

minimum volume that contains each of the points

�� .

Illustration of the circumscribed ellipsoid problem.




Recall that an SPD matrix � and a given point � can be used todefine an ellipsoid in ��:

��

The volume of �� is proportional to ��

�.

Illustration of an ellipsoid.




Our problem is:� �� #��

�

#�)�� *� �� &�

# is SPD.

� �� #��

#�)�� )��#�� )� � � � � � � � � &

# is SPD.




��

� �

��

� is SPD.

Factor # �� where � is SPD

� ��

��)�� )�� )� � � � � � � � � &�

� is SPD.




��

� �

��

� is SPD.

� ��

��)

�� )� � � � � � � � � &�

� is SPD.




��

� �

��

� is SPD.

Next substitute � �� to obtain:

��

��

��

� is SPD.– This is convex problem in the variables ��.

– We can recover � and � after solving �� by substituting

� �� and � ��.


Classification of Nonlinear Optimization Problems

General Nonlinear Problem�� or ��

��

Unconstrained Optimization

�� or ��

�� c�2002 Massachusetts Institute of Technology. All rights reserved. 15.094 67


Linearly Constrained Problems

�� or ��

�� !� � �

Linear Optimization��

�� !� � �



Quadratic Problem

��

�� !� � �

The objective function here is

��

��

��

��.

Quadratically Constrained Quadratic Problem

��

��

��



Convex Problem

��

��

��

where �� is a convex function and �� areconvex functions, or:

��

��

��

where �� is a concave function and �� are

convex functions.c�2002 Massachusetts Institute of Technology. All rights reserved. 15.094 71

Solving Separable Convex Optimization via Linear Optimization

Consider the problem

�� #��

�� %�� &

&��%�� $

��

This problem has:

1. linear constraints

2. a separable objective function � � �� #��

3. a convex objective function (since �� and �� are convexfunctions)



��

��

��

��

We can approximate �� and �� bypiecewise-linear (PL) functions.

Suppose we know that � � �� and � � �� .



PL approximation of ��.

�� ' ��%��

��

� ��



PL approximation of ��.

�� $�� '#��

��

� ��



The linear optimization approximation then is:

��

��

��

��

��

��

��

��

� � ��

� � ��


On the Geometry ofNonlinear Optimization

Extreme Point Solution

First example of the geometry of the solutionof a nonlinear optimization problem.



Non-Extreme Point Boundary Solution

Second example of the geometry of the solutionof a nonlinear optimization problem.



Interior Solution

Third example of the geometry of the solutionof a nonlinear optimization problem.



Discussion

Unlike linear optimization, the optimal solution of anonlinear optimization problem need not be a “cornerpoint” of .

The optimal solution may be on the boundary or evenin the interior of .

The optimal solution will be characterized by how thecontours of �� are “aligned” with the feasible region

.


Optimality Conditions forNonlinear Optimization

Convex Problem

Consider the convex problem:

� � � ��

��

�� are convex functions.



The KKT Theorem

Convex Optimization

Theorem 2 (Karush-Kuhn-Tucker Theorem)Suppose that �� are all convexfunctions. Then under very mild conditions, �� solves(CP) if and only if there exists �� ,such that

(�) ��

�� (gradients line up)

(��) �� (feasibility)

(��) �� (multipliers must be nonnegative)

(��) �� (multipliers for nonbinding constraints are zero)



The KKT Theorem

Linear Optimization & Strong Duality...

Consider the following linear optimization problem andits dual:

! � � �� !"� � ��

��

��

��

� � �



The KKT Theorem

...Linear Optimization & Strong Duality...

Let us look at the KKT Theorem applied here. If ��

solves LP, then there exists �� for which:

(�) ��

��


(��) �� ( primal feasibility)

(��) ��

(��) ��

� �� (complementary slackness)



The KKT Theorem

...Linear Optimization & Strong Duality

(�) ��

��


(��) �� ( primal feasibility)

(��) ��

(��) �� (complementary slackness)

– (��) is primal feasbility of ��– (�) and (��) together are dual feasibility of ��

– (��) is complementary slackness

The KKT Theorem states that �� and �� together must be

primal feasible and dual feasible, and must satisfy complementary

slackness.



The KKT Theorem

Geometry

g1(x) = 0

g2(x) = 0

g3(x) = 0

∇g1(x)

∇g2(x)

−∇f(x)

∇f(x) x–

Illustration of the KKT Theorem.



The KKT Theorem

An example...

Consider the problem:

�� '�� %��

��

�� %�� %

��

��

�

��

� �&

�� '�� %��

��

� �� %�� %

��

��

��

�&



The KKT Theorem

...An example...

��

��

��

��

��

� ��

��

� ��

� ��

��

$�� %��

�

��

� ��

�� %��

� ��

� ��

��

� ��

� ��

��

��



The KKT Theorem

...An example...

��

��

��

��

Is �� an optimal solution?



The KKT Theorem

...An example...

��

��

��

��

��

� ��

��

� ��

We first check for feasibility of �� :��

��

��



The KKT Theorem

...An example...

��

��

��

��

��

� ��

��

� ��

To check for optimality, we compute all gradients at ��:

��

��

��

��

�

��

� ��

��

��

�



The KKT Theorem

...An example

��

��

��

��

��

� ��

��

� ��

We next try to solve for �� in the system:

��

% �

��

��'

�

��

��'

��

��

�

� �

��

�

�

�

�� '� solves this system in nonnegativevalues, and �� .

Therefore �� is an optimal solution to this problem.



KKT Theorem with Different Formats of Constraints

Convex Optimization with Linear Equations

��

��

��

Theorem 3 (Karush-Kuhn-Tucker Theorem) Suppose that �� are allconvex functions for � � �. Then under very mild conditions, �� solves (CPE) ifand only if there exists �� and �� such that

(�) ��

��

��




(��) �� (nonnegative multipliers on inequalities)




KKT Theorem with Different Formats of Constraints

Non-convex Optimization

��

��

��

Theorem 4 (Karush-Kuhn-Tucker Theorem) Under some very mild conditions,if �� solves (NCE), then there exists �� and �� such that

(�) ��

��

��




(�� ) �� (nonnegative multipliers on inequalities)



A Few Concluding Remarks

Nonconvex optimization problems can be difficult tosolve.

This is because local optima may not be global optima.

Most algorithms are based on calculus, and so canonly find a local optimum.



There are a large number of solution methods forsolving nonlinear constrained problems.

Today, the most powerful methods fall into two classes:

�methods that try to generalize the simplex algorithmto the nonlinear case.

�methods that generalize the barrier method to thenonlinear case.



Quadratic Problems. A quadratic problem is “almostlinear”, and can be solved by a special implementationof the simplex algorithm, called the complementarypivoting algorithm.

Quadratic problems are roughly eight times harder tosolve than linear optimization problems of comparablesize.


Download - Introduction to Convex Constrained Optimizationdspace.mit.edu › ... › 0 › lecture10.pdf · 2019-09-12 · Nonlinear Optimization Proof of Proposition Proof of the Proposition:

Top Related