Math for CS, Lecture 7

Constrained Optimization

Lagrange Multipliers

____________________________________________

Ordinary Differential Equations


Constrained Optimization

A constrained optimization problem can be defined as follows: minimize the function

$$f(x)$$

while searching only among the points $x$ that satisfy the constraints

$$g_1(x) = 0, \quad \ldots, \quad g_k(x) = 0.$$


Example 1

Consider the problem of minimizing $f(x) = x_1 + x_2$ subject to the constraint $g(x) = x_1^2 + x_2^2 - 1 = 0$.

The figure shows the circle defined by $g(x) = 0$ and the lines of constant value of $f(x)$.

One can see that at the solution $x^*$ the gradient of $f(x)$ is orthogonal to the circle. Otherwise, the gradient has a non-zero projection on the tangent to the circle, and sliding along the circle against this projection decreases $f(x)$ without violating the constraint $g(x) = 0$.
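The orthogonality claim of Example 1 can be checked numerically. The following is a minimal sketch, not part of the original lecture: it scans the unit circle by angle, picks the point with the smallest $f$, and measures the projection of the gradient of $f$ on the tangent to the circle at that point. The grid size `n` and all helper names are illustrative assumptions.

```python
import math

def f(x1, x2):
    """Objective of Example 1."""
    return x1 + x2

def grad_f(x1, x2):
    """Gradient of f; constant because f is linear."""
    return (1.0, 1.0)

# Parameterize the circle g(x) = 0 as (cos t, sin t) and scan a grid of angles.
n = 100_000  # illustrative grid size (assumption)
angles = [2.0 * math.pi * i / n for i in range(n)]
t_star = min(angles, key=lambda t: f(math.cos(t), math.sin(t)))
x_star = (math.cos(t_star), math.sin(t_star))

# Unit tangent to the circle at x_star (the radius rotated by 90 degrees).
tangent = (-x_star[1], x_star[0])
gf = grad_f(*x_star)
projection = gf[0] * tangent[0] + gf[1] * tangent[1]

print("x* ~", x_star)
print("f(x*) ~", f(*x_star))
print("projection of grad f on the tangent ~", projection)
```

Under these assumptions the scan should land near $x^* = (-1/\sqrt{2}, -1/\sqrt{2})$, where the projection vanishes up to the grid resolution.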


Example 2

(Figure: the constraint curve $g(x,y) = 0$, the points M and C, and ellipses of constant path length.)

Minimize the length of the path from M to C, subject to the path touching the constraint curve $g(x) = 0$. Each ellipse describes the points lying on paths of the same length. Again, at the solution the gradient of $f(x)$ is orthogonal to the constraint curve.


Dimensionality Reduction

The straightforward method to solve a constrained optimization problem is to reduce the number of free variables. If $x = (x_1, \ldots, x_n)$ and there are $k$ constraints $g_1(x) = 0, \ldots, g_k(x) = 0$, then the $k$ constraint equations can (sometimes) be solved to reduce the dimensionality of $x$ from $n$ to $n-k$:

$$g_1(x_1, \ldots, x_n) = 0 \;\Rightarrow\; x_n = \tilde{g}_1(x_1, \ldots, x_{n-1})$$

$$g_i(x_1, \ldots, x_n) = 0 \;\Rightarrow\; x_{n-i+1} = \tilde{g}_i(x_1, \ldots, x_{n-i})$$

$$\ldots$$
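As an illustration of dimensionality reduction, here is a minimal sketch on a hypothetical problem that is not taken from the lecture: minimize $f(x_1, x_2) = x_1^2 + x_2^2$ subject to $g_1(x) = x_1 + x_2 - 1 = 0$. The constraint is solved for $x_2$, and the remaining one-variable function is minimized without constraints. The ternary search, its bracket, and the function names are assumptions chosen for brevity; it relies on the reduced function being unimodal.

```python
def f(x1, x2):
    """Hypothetical objective (assumption, not from the lecture)."""
    return x1**2 + x2**2

def g1_tilde(x1):
    """Solve g1(x1, x2) = x1 + x2 - 1 = 0 for x2."""
    return 1.0 - x1

def f_reduced(x1):
    """Objective after eliminating x2: one free variable instead of two."""
    return f(x1, g1_tilde(x1))

def ternary_search(func, lo, hi, iterations=200):
    """Minimize a unimodal function on [lo, hi] by shrinking the bracket."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

x1_star = ternary_search(f_reduced, -10.0, 10.0)
x2_star = g1_tilde(x1_star)  # recover the eliminated variable

print("x* ~", (x1_star, x2_star), " f(x*) ~", f(x1_star, x2_star))
```

The constraint holds by construction, because $x_2$ is always computed from $x_1$ through `g1_tilde`.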


Surfaces defined by constraint

$$x^* = \arg\min_x f(x); \quad g_i(x) = 0, \quad i = 1, \ldots, k$$

Now we consider the hard case, when dimensionality reduction is impossible.

If there are no constraints ($k = 0$), the gradient of $f(x)$ vanishes at the solution $x^*$:

$$\nabla f(x^*) = \frac{\partial f(x^*)}{\partial x} = 0$$

In the constrained case, the gradient must be orthogonal to the subspace defined by the constraints (otherwise sliding along this subspace would decrease the value of $f(x)$ without violating the constraints).


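Explanation

The constraints limit the subspace of the solution. Here the solution lies on the intersection of the surfaces defined by $g_1(x) = 0$ and $g_2(x) = 0$. The gradient $\nabla f(x)$ must be orthogonal to this subspace (otherwise $\nabla f(x)$ has a non-zero projection along the constraint and the function value can be decreased further). The orthogonal subspace is spanned by the constraint gradients, so it consists of the vectors $\lambda_1 \nabla g_1(x) + \lambda_2 \nabla g_2(x)$.

Thus $\nabla f(x^*) = \lambda_1 \nabla g_1(x^*) + \lambda_2 \nabla g_2(x^*)$.

(Figure: the surfaces $g_1(x) = 0$ and $g_2(x) = 0$ with the normal vectors $\lambda_1 \nabla g_1(x)$ and $\lambda_2 \nabla g_2(x)$.)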
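The two-constraint picture can be illustrated with a small check. This is a sketch on a hypothetical problem that does not appear in the lecture: minimize $f = x^2 + y^2 + z^2$ subject to $g_1 = x + y + z - 1 = 0$ and $g_2 = x = 0$. The minimizer $(0, 1/2, 1/2)$ and the multipliers $\lambda_1 = 1$, $\lambda_2 = -1$ were worked out by hand for this example and are assumptions of the sketch.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

# Hypothetical problem (assumption): f = x^2 + y^2 + z^2,
# g1 = x + y + z - 1 = 0, g2 = x = 0.
x_star = (0.0, 0.5, 0.5)                 # minimizer on the line x = 0, y + z = 1
grad_f = tuple(2.0 * c for c in x_star)  # gradient of f at x_star
grad_g1 = (1.0, 1.0, 1.0)
grad_g2 = (1.0, 0.0, 0.0)

# The intersection of the two planes is a line; its direction is
# orthogonal to both constraint gradients.
tangent = cross(grad_g1, grad_g2)
along_line = dot(grad_f, tangent)        # projection of grad f along the line

# grad f should be a combination lam1 * grad g1 + lam2 * grad g2.
lam1, lam2 = 1.0, -1.0                   # found by hand for this example
residual = tuple(gf - lam1 * a - lam2 * b
                 for gf, a, b in zip(grad_f, grad_g1, grad_g2))

print("projection along the constraint line:", along_line)
print("grad f - (lam1*grad g1 + lam2*grad g2):", residual)
```

Both printed quantities vanish for this example: the gradient has no component along the intersection line and lies in the span of the two constraint gradients.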


Constraint on the coordinates and relaxation of the gradient

We observe that the more constraints are applied, the more restricted is the location of the optimum (anywhere in $\mathbb{R}^3$ for $k = 0$, on a surface for $k = 1$, on a line for $k = 2$), but the less restricted is the gradient of $f(x)$ (zero for $k = 0$, along the normal to the surface for $k = 1$, anywhere within the plane orthogonal to the line for $k = 2$).

This requirement, that the gradient lie in the subspace orthogonal to the intersection of the hypersurfaces defined by the constraints, can be summarized as:

$$\nabla f(x^*) = \sum_{i=1}^{k} \lambda_i \nabla g_i(x^*) \qquad (1)$$


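The second requirement is to satisfy the constraints:

$$g_i(x) = 0, \quad i = 1, \ldots, k \qquad (2)$$

The requirements (1) and (2) can be written together in an elegant form. Define the function:

$$F(x_1, \ldots, x_n, \lambda_1, \ldots, \lambda_k) = f(x_1, \ldots, x_n) - \lambda_1 g_1(x_1, \ldots, x_n) - \ldots - \lambda_k g_k(x_1, \ldots, x_n)$$

Then we can write $\nabla_x F(x, \lambda) = 0$ to satisfy (1) and $\nabla_\lambda F(x, \lambda) = 0$ to satisfy (2).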


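In other words, we have constructed a function

$$F(\tilde{x}) = F(x_1, \ldots, x_n, \lambda_1, \ldots, \lambda_k) = f(x_1, \ldots, x_n) - \lambda_1 g_1(x_1, \ldots, x_n) - \ldots - \lambda_k g_k(x_1, \ldots, x_n)$$

that depends on a variable $\tilde{x} = (x, \lambda)$, and we require

$$\nabla_{\tilde{x}} F(\tilde{x}) = 0$$

for that function.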
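The stationarity condition on $F$ can be tried out on Example 1. The sketch below is an illustration, not the lecture's own procedure: it assumes the sign convention $F = f - \lambda g$, takes the two candidate points $(x_1, x_2, \lambda) = \pm(1/\sqrt{2}, 1/\sqrt{2}, 1/\sqrt{2})$ obtained by hand from $1 - 2\lambda x_1 = 0$, $1 - 2\lambda x_2 = 0$, $x_1^2 + x_2^2 = 1$, and checks with central differences that the full gradient of $F$ vanishes at both. The helper `num_grad` and the step `h` are assumptions.

```python
import math

def f(x1, x2):
    return x1 + x2

def g(x1, x2):
    return x1**2 + x2**2 - 1.0

def F(v):
    """Lagrange function of Example 1 in the variable (x1, x2, lambda)."""
    x1, x2, lam = v
    return f(x1, x2) - lam * g(x1, x2)

def num_grad(func, v, h=1e-6):
    """Central-difference gradient of func at the point v."""
    grad = []
    for i in range(len(v)):
        up = list(v)
        dn = list(v)
        up[i] += h
        dn[i] -= h
        grad.append((func(up) - func(dn)) / (2.0 * h))
    return grad

s = 1.0 / math.sqrt(2.0)
candidates = [(-s, -s, -s), (s, s, s)]  # stationary points found by hand

for v in candidates:
    print(v, "grad F ~", num_grad(F, v))

# Both points are stationary; the minimum is the one with the smaller f.
best = min(candidates, key=lambda v: f(v[0], v[1]))
print("constrained minimizer ~", best[:2], " lambda ~", best[2])
```

The check also shows that a vanishing gradient of $F$ only identifies candidates: the second stationary point is the constrained maximum, so the values of $f$ still have to be compared.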


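Summary

The constants $\lambda_i$ are called Lagrange multipliers. We obtained that the solution $(\lambda^*, x^*)$ of the constrained optimization problem

$$x^* = \arg\min_x f(x); \quad g_i(x) = 0, \quad i = 1, \ldots, k$$

satisfies the equations

$$\nabla_x F(x^*, \lambda^*) = 0, \qquad \nabla_\lambda F(x^*, \lambda^*) = 0,$$

where

$$F(x_1, \ldots, x_n, \lambda_1, \ldots, \lambda_k) = f(x_1, \ldots, x_n) - \lambda_1 g_1(x_1, \ldots, x_n) - \ldots - \lambda_k g_k(x_1, \ldots, x_n)$$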


Ordinary Differential Equations

Linear Equations

Separable equations


Differential Equations

1. A differential equation is an equation involving an unknown function and its derivatives.

2. The order of the differential equation is the order of the highest derivative of the unknown function involved in the equation.

3. A linear differential equation of order n is a differential equation written in the following form:

$$a_n(x) \frac{d^n y}{dx^n} + a_{n-1}(x) \frac{d^{n-1} y}{dx^{n-1}} + \ldots + a_1(x) \frac{dy}{dx} + a_0(x) y = f(x)$$

where $a_n(x)$ is not the zero function.


4. Existence: Does a differential equation have a solution?

5. Uniqueness: Does a differential equation have more than one

solution? If yes, how can we find a solution which satisfies

particular conditions?

6. A differential equation together with known values of the unknown function y(x) and its derivatives at some point is called an initial value problem (in short, IVP).

7. If no initial conditions are given, we call the description of all

solutions to the differential equation the general solution.



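First order differential equation

A first order linear differential equation has the following form:

$$\frac{dy}{dx} + p(x) y = q(x)$$

To solve this equation, let us multiply both sides by $u(x) = e^{\int p(x)\,dx}$:

$$e^{\int p(x)\,dx} \left( y' + p(x) y \right) = e^{\int p(x)\,dx} q(x)$$

The left-hand side is the derivative of a product:

$$\left( y\, e^{\int p(x)\,dx} \right)' = e^{\int p(x)\,dx} q(x)$$

Integrating both sides gives

$$y\, e^{\int p(x)\,dx} = \int e^{\int p(x)\,dx} q(x)\,dx + C$$

and therefore

$$y = \frac{\int e^{\int p(x)\,dx} q(x)\,dx + C}{e^{\int p(x)\,dx}}$$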
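To see the integrating-factor formula at work, here is a minimal sketch on a hypothetical equation that is not from the lecture: $y' + y = x$ with $y(0) = 1$. Here $p(x) = 1$, $q(x) = x$, $u(x) = e^x$, and $\int e^x x\,dx = (x-1)e^x$, so $y = x - 1 + C e^{-x}$ with $C = 2$ from the initial condition. The code evaluates the formula and checks by central differences that the result satisfies the equation; the step `h` and the sample points are assumptions.

```python
import math

def p(x):
    return 1.0

def q(x):
    return x

C = 2.0  # fixed by the assumed initial condition y(0) = 1

def y(x):
    """Solution from the formula y = (integral of u*q + C) / u."""
    u = math.exp(x)                     # integrating factor e^{int p dx}
    integral = (x - 1.0) * math.exp(x)  # antiderivative of u(x) * q(x), by parts
    return (integral + C) / u

def residual(x, h=1e-5):
    """y' + p*y - q at x, with y' approximated by a central difference."""
    dy = (y(x + h) - y(x - h)) / (2.0 * h)
    return dy + p(x) * y(x) - q(x)

max_residual = max(abs(residual(0.1 * i)) for i in range(31))
print("y(0) =", y(0.0))
print("largest residual on [0, 3] ~", max_residual)
```

The residual stays at the level of the finite-difference error, which is consistent with the formula solving the equation for this choice of $p$ and $q$.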


Separable Equations

The differential equation of the form

$$y' = f(x, y) \qquad (a)$$

is called separable if $f(x,y) = h(x) \cdot g(y)$; that is,

$$\frac{dy}{dx} = h(x)\, g(y)$$

In order to solve it, perform the following steps:

(1) Solve the equation g(y) = 0, which gives the constant solutions of (a);

(2) Rewrite the equation (a) as

$$\frac{dy}{g(y)} = h(x)\,dx$$


(3) Now we can integrate

$$\int \frac{dy}{g(y)} = \int h(x)\,dx$$

to obtain

$$G(y) = H(x) + C$$

where $G$ is an antiderivative of $1/g$ and $H$ is an antiderivative of $h$.

(4) Now we can write down all the solutions, obtained in (1) and (3).

If this is an IVP, we must use the initial condition to find a particular solution.
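The steps for separable equations can be sketched on a hypothetical IVP that is not from the lecture: $y' = x y$ with $y(0) = 2$. Here $h(x) = x$ and $g(y) = y$, the constant solution is $y = 0$, and integration gives $G(y) = \ln|y|$ and $H(x) = x^2/2$. The code fixes $C$ from the initial condition, solves $G(y) = H(x) + C$ for $y$ on the positive branch, and checks the equation by central differences. Function names and the step size are assumptions.

```python
import math

def h(x):
    return x

def g(y):
    return y

def G(y):
    """Antiderivative of 1/g(y) = 1/y."""
    return math.log(abs(y))

def H(x):
    """Antiderivative of h(x) = x."""
    return 0.5 * x * x

x0, y0 = 0.0, 2.0    # assumed initial condition y(0) = 2
C = G(y0) - H(x0)    # constant from G(y0) = H(x0) + C

def y(x):
    """Solve ln(y) = H(x) + C for y (positive branch, since y0 > 0)."""
    return math.exp(H(x) + C)

def residual(x, step=1e-5):
    """y' - h(x)*g(y) at x, with y' approximated by a central difference."""
    dy = (y(x + step) - y(x - step)) / (2.0 * step)
    return dy - h(x) * g(y(x))

max_residual = max(abs(residual(0.1 * i)) for i in range(21))
print("y(0) =", y(0.0))
print("largest residual on [0, 2] ~", max_residual)
```

The constant solution $y = 0$ from step (1) is not reachable from this formula for any finite $C$, which is why it has to be listed separately.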
