
Page 1:

Constrained optimization: direct methods

Jussi Hakanen
Post-doctoral researcher
[email protected]

spring 2014 TIES483 Nonlinear optimization

Page 2:

Classification of the methods

Indirect methods: the constrained problem is converted into a sequence of unconstrained problems whose solutions approach the solution of the constrained problem; the intermediate solutions need not be feasible.

Direct methods: the constraints are taken into account explicitly, and the intermediate solutions are feasible.


Page 3:

Direct methods

Also known as methods of feasible directions

Idea: at a point $x^h$, generate a feasible search direction in which the objective function value can be improved, and use a line search to get $x^{h+1}$.

Reminder: a direction $d \in \mathbb{R}^n$ is a feasible descent direction at $x^* \in S$ if there exists $\alpha^* > 0$ such that $f(x^* + \alpha d) < f(x^*)$ and $x^* + \alpha d \in S$ for all $\alpha \in (0, \alpha^*]$.

The methods differ in
– how the feasible direction is chosen, and
– what is assumed about the constraints (linear/nonlinear, equality/inequality).


Page 4:

Algorithm

1) Choose a starting point $x^1$ and set $h = 1$.

2) Determine a feasible direction $d^h$ such that for some (small enough) $\alpha > 0$ it holds that $x^h + \alpha d^h \in S$ and $f(x^h + \alpha d^h) < f(x^h)$.

3) Use a line search to find an optimal step length $\alpha^h > 0$ such that $x^h + \alpha^h d^h \in S$.

4) Test convergence. If not converged, set $x^{h+1} = x^h + \alpha^h d^h$, $h = h + 1$ and go to 2).


Page 5:

Examples of direct methods

Projected gradient method

Active set method

Sequential Quadratic Programming (SQP) method


Page 6:

Projected gradient method

Consider a problem

$$\min f(x) \quad \text{s.t.} \quad Ax = b,$$

where $A$ is an $l \times n$ matrix ($l \le n$) and $b \in \mathbb{R}^l$
– $x^* \in S$ if and only if $Ax^* = b$

The direction of steepest descent is $-\nabla f(x)$
– it may not be feasible

Idea: project the steepest descent direction onto the feasible region $S$
– a projection matrix is needed


Page 7:

Projection

Definition: If an $n \times n$ matrix $P$ satisfies $P^T = P$ and $PP = P$, then $P$ is a projection matrix.

The vector $P \nabla f(x^*)$ is the projected gradient at $x^*$.

The matrix $P H(x^*) P$ is the projected Hessian at $x^*$.
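As a quick numerical sanity check (a sketch, not part of the slides), the matrix $P = Z (Z^T Z)^{-1} Z^T$ used later in these slides satisfies both defining properties:

```python
import numpy as np

rng = np.random.default_rng(0)
Z = rng.standard_normal((5, 2))          # any matrix with full column rank

P = Z @ np.linalg.inv(Z.T @ Z) @ Z.T     # projector onto the column space of Z

print(np.allclose(P, P.T))               # P^T = P  -> True
print(np.allclose(P @ P, P))             # P P = P  -> True
```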


Page 8:

Idea

If π‘₯βˆ— ∈ 𝑆, then 𝑑 is a feasible direction in π‘₯βˆ— if

and only if 𝐴𝑑 = 0

π‘₯βˆ— is a local minimizer if and only if for all

feasible directions 𝑑, 𝑑 = 1, there exists

𝛼 > 0 such that 𝑓 π‘₯βˆ— + 𝛼𝑑 β‰₯ 𝑓(π‘₯βˆ—), when

0 < 𝛼 < 𝛼

Find a feasible 𝑑 where 𝑓 improves the most

⟹ minπ‘‘βˆˆπ‘…π‘›

𝛻𝑓 π‘₯βˆ— 𝑇𝑑 𝑠. 𝑑. 𝐴𝑑 = 0, 𝑑 = 1


Page 9:

$Ad = 0$

Let's consider the constraints $Ad = 0$: they form a subspace of $\mathbb{R}^n$.

Denote by $Z$ a matrix whose columns span $\{d \mid Ad = 0\} \subset \mathbb{R}^n$
– each feasible direction can be represented as a linear combination of the columns of $Z$, that is, $d = Zq$ for some $q \in \mathbb{R}^{n-l}$

$Z$ can be formed e.g. by 1) the elimination method or 2) an LQ decomposition (see a book on numerical methods).
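A null-space basis $Z$ can also be computed numerically, e.g. from the SVD; a small sketch using SciPy (my addition, with made-up data):

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 1.0, 1.0],
              [1.0, 2.0, 3.0]])        # l = 2 constraints, n = 3 variables

Z = null_space(A)                      # columns span {d | A d = 0}, shape (3, 1)
print(np.allclose(A @ Z, 0.0))         # True: every column of Z is a feasible direction
```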


Page 10:

Elimination method

$\min f(x)$ s.t. $Ax = b$ has $l$ linear equality constraints
– it can be transformed into an unconstrained problem by eliminating $l$ variables using the constraints

Assume that the rows of $A$ are linearly independent (otherwise some constraint is redundant or there are no feasible solutions) → $A$ has $l$ linearly independent columns
– $A = (B\ N)$, where $B \in \mathbb{R}^{l \times l}$ has linearly independent columns
– $x = (x_B, x_N)^T$, $x_B \in \mathbb{R}^l$, $x_N \in \mathbb{R}^{n-l}$

Then $Ax = (B\ N)\begin{pmatrix} x_B \\ x_N \end{pmatrix} = B x_B + N x_N = b$, and further $x_B = B^{-1}(b - N x_N)$ since $B$ is nonsingular.

Therefore, for any choice of $x_N$ the resulting $x \in S$.

Now we can set $AZ = (B\ N) Z = 0$, so that $Z = \begin{pmatrix} -B^{-1} N \\ I \end{pmatrix}$.
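A small NumPy sketch of this construction (the matrix $A$ below is made up for illustration, and its first $l$ columns are assumed to form a nonsingular $B$):

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0],
              [1.0, 2.0, 3.0]])            # l = 2, n = 3
l, n = A.shape

B, N = A[:, :l], A[:, l:]                  # split A = (B N), B nonsingular

# Z = [ -B^{-1} N ]
#     [     I     ]
Z = np.vstack([-np.linalg.solve(B, N), np.eye(n - l)])

print(np.allclose(A @ Z, 0.0))             # True: columns of Z span {d | A d = 0}
```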


Page 11:

Optimality conditions

Necessary: If $x^*$ ($Ax^* = b$) is a local minimizer, then
1) $Z^T \nabla f(x^*) = 0$ and
2) the matrix $Z^T H(x^*) Z$ is positive semidefinite.

Sufficient: If at $x^*$ ($Ax^* = b$)
1) $Z^T \nabla f(x^*) = 0$ and
2) the matrix $Z^T H(x^*) Z$ is positive definite,
then $x^*$ is a local minimizer.


Page 12:

Using projected gradient

Result: $Z^T \nabla f(x^*) = 0$ if and only if there exists a Lagrange multiplier vector $\nu^* \in \mathbb{R}^l$ such that $\nabla f(x^*) + A^T \nu^* = 0$.

So, if $Z^T \nabla f(x^*) = 0$, then $x^*$ satisfies the first-order necessary condition and is a candidate for a local minimizer. Otherwise, $-Z^T \nabla f(x^*)$ gives a descent direction.
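As a sketch (reusing the toy example above), the multiplier $\nu^*$ can be recovered from $\nabla f(x^*) + A^T \nu^* = 0$ with a least-squares solve:

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0]])
x_star = np.array([1/3, 1/3, 1/3])
grad = 2 * x_star                                   # gradient of f(x) = ||x||^2

# Solve A^T nu = -grad in the least-squares sense
nu, residual, *_ = np.linalg.lstsq(A.T, -grad, rcond=None)

print(nu)                                           # [-2/3]
print(np.allclose(grad + A.T @ nu, 0.0))            # True: a multiplier exists -> stationary
```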


Page 13:

Rosen’s projected gradient method

Feasible direction: $d^h = Zq^h$ for some $q^h \in \mathbb{R}^{n-l}$

Maximum decrease:
$$\min_d d^T \nabla f(x^h) = \min_q q^T Z^T \nabla f(x^h)$$

If it is required that $\|d\| = \|Zq\| = 1$, the subproblem becomes
$$\min_q \frac{q^T Z^T \nabla f(x^h)}{\|Zq\|}$$

The solution is $q^h = -(Z^T Z)^{-1} Z^T \nabla f(x^h)$, and we have
$$d^h = Zq^h = -Z (Z^T Z)^{-1} Z^T \nabla f(x^h) = -P \nabla f(x^h), \quad \text{where } P = Z (Z^T Z)^{-1} Z^T$$


Page 14:

Algorithm

1) Choose a starting point $x^1$. Determine $Z$ and set $h = 1$. Compute $P = Z (Z^T Z)^{-1} Z^T$.

2) Compute the direction $d^h = -P \nabla f(x^h)$.

3) If $d^h = 0$, stop. Otherwise, find $\alpha_{\max} = \max\{\alpha \mid x^h + \alpha d^h \in S\}$ and solve
$$\min f(x^h + \alpha d^h) \quad \text{s.t.} \quad 0 \le \alpha \le \alpha_{\max}.$$
Let the solution be $\alpha^h$. Set $x^{h+1} = x^h + \alpha^h d^h$, $h = h + 1$ and go to 2).
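Below is a minimal Python sketch of this algorithm for the linear-equality-constrained case. The details are my own assumptions: the exact line search is replaced by SciPy's bounded scalar minimizer, and $\alpha_{\max}$ is a fixed upper bound, which is harmless here because $Ad^h = 0$ keeps every step feasible.

```python
import numpy as np
from scipy.linalg import null_space
from scipy.optimize import minimize_scalar

def rosen_projected_gradient(f, grad_f, A, x1, alpha_max=10.0,
                             tol=1e-8, max_iter=200):
    """Rosen's projected gradient method for  min f(x)  s.t.  A x = b.

    x1 must be feasible (A x1 = b); every iterate then stays feasible,
    since the search direction satisfies A d = 0.
    """
    Z = null_space(A)                          # columns span {d | A d = 0}
    P = Z @ np.linalg.inv(Z.T @ Z) @ Z.T       # projection matrix
    x = np.asarray(x1, dtype=float)
    for _ in range(max_iter):
        d = -P @ grad_f(x)                     # projected steepest-descent direction
        if np.linalg.norm(d) < tol:            # d^h = 0 -> stop
            break
        # line search over alpha in [0, alpha_max]
        res = minimize_scalar(lambda a: f(x + a * d),
                              bounds=(0.0, alpha_max), method="bounded")
        x = x + res.x * d
    return x

# Example: min x1^2 + 2*x2^2 + 3*x3^2  s.t.  x1 + x2 + x3 = 1
A = np.array([[1.0, 1.0, 1.0]])
f = lambda x: x[0]**2 + 2*x[1]**2 + 3*x[2]**2
grad_f = lambda x: np.array([2*x[0], 4*x[1], 6*x[2]])
x0 = np.array([1.0, 0.0, 0.0])                 # feasible starting point
print(rosen_projected_gradient(f, grad_f, A, x0))   # approx. [0.545, 0.273, 0.182]
```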


Page 15:

Note

If there are both linear equality and inequality constraints, the projection matrix does not remain the same
– at each iteration, it includes only the equality constraints and the active inequality constraints

$P = Z (Z^T Z)^{-1} Z^T = I - A^T (A A^T)^{-1} A$
– $Z$ is not necessarily needed
– in bigger problems, $Z$ is more convenient since it is smaller than $A$

If instead the direction subproblem is $\min_q \frac{q^T Z^T \nabla f(x^h)}{\|q\|}$ (normalizing $\|q\|$ rather than $\|Zq\|$), then $d^h = -Z Z^T \nabla f(x^h)$, i.e. $P = Z Z^T$

Convergence is quite slow since the second derivatives are not taken into account
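A short numerical check of the identity $Z (Z^T Z)^{-1} Z^T = I - A^T (A A^T)^{-1} A$ (a sketch with made-up data; $Z$ is taken as a null-space basis of $A$):

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 5))                    # full row rank (l = 2, n = 5)
Z = null_space(A)                                  # columns span {d | A d = 0}

P_from_Z = Z @ np.linalg.inv(Z.T @ Z) @ Z.T
P_from_A = np.eye(5) - A.T @ np.linalg.inv(A @ A.T) @ A

print(np.allclose(P_from_Z, P_from_A))             # True: both give the same projector
```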
