Optimization: Multi-Dimensional Unconstrained Optimization, Part I: Non-gradient Methods


Page 1:

Optimization

Multi-Dimensional Unconstrained Optimization

Part I: Non-gradient Methods

Page 2:

Optimization Methods

• One-Dimensional Unconstrained Optimization
  – Golden-Section Search
  – Quadratic Interpolation
  – Newton's Method
• Multi-Dimensional Unconstrained Optimization
  – Non-gradient or direct methods
  – Gradient methods
• Linear Programming (Constrained)
  – Graphical Solution
  – Simplex Method

Page 3:

Multidimensional Unconstrained Optimization

Techniques to find the minimum and maximum of

f(x1, x2, x3, …, xn)

Two classes of techniques:
• Do not require derivative evaluation
  – Non-gradient or direct methods
• Require derivative evaluation
  – Gradient or descent (or ascent) methods

Page 4:

2-D Contour View of f(x, y)

Page 5:

DIRECT METHODS – Random Search

max = –∞
for i = 1 to N
    for each variable xj (j = 1 to n)
        xj = a value randomly selected from a given interval
    if max < f(x1, x2, x3, …, xn)
        max = f(x1, x2, x3, …, xn)

• N has to be sufficiently large.
• Random numbers have to be evenly distributed.
  – Equivalent to selecting evenly distributed points systematically.
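As a concrete illustration, here is a minimal Python sketch of the random search above; the sample count N, the search box, and the seed are illustrative choices, not part of the slides.

```python
import random

def random_search(f, bounds, N=100_000, seed=0):
    """Sample N random points inside `bounds` and keep the best one.

    bounds: list of (low, high) intervals, one per variable.
    Returns (best_value, best_point).
    """
    rng = random.Random(seed)
    best_val, best_x = float("-inf"), None
    for _ in range(N):
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        val = f(*x)
        if val > best_val:          # "if max < f(...)" in the slide's pseudocode
            best_val, best_x = val, x
    return best_val, best_x

# Example: the objective used later in these slides.
f = lambda x, y: y - x - 2*x**2 - 2*x*y - y**2
print(random_search(f, bounds=[(-2, 2), (-2, 2)]))  # true maximum: f(-1, 1.5) = 1.25
```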

Page 6:

Random Search

Advantages
• Works even for discontinuous and nondifferentiable functions.
• More likely to find the global optimum rather than a local optimum.

Disadvantages
• As the number of independent variables grows, the task can become onerous.
• Not efficient; it does not account for the behavior of the underlying function.

Page 7:

Finding the Optima Systematically

Question
– If we start from an arbitrary point, how should we "move" so that we can locate the peak in the "shortest amount of time"?
  • Good guess of the direction toward the peak
  • Minimize computation

Basic Idea (like climbing a mountain)
– If we keep moving upward, we will eventually reach the peak.

[Figure: You are here. The peak is covered by the cloud. Which path should you take?]

Page 8:

General Optimization Algorithm

All the methods discussed subsequently are iterative methods that can be generalized as:

Select an initial point, x0 = ( x1, x2, …, xn )
for i = 0 to Max_Iteration
    Select a direction Si
    xi+1 = optimal point reached by traveling from xi in the direction of Si
    Stop loop if
        |( f(xi+1) – f(xi) ) / f(xi+1) | ≤ es   or   || xi+1 – xi || / || xi+1 || ≤ es
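In code, the two relative-error stopping tests can be written as a small helper. This is a minimal sketch: the tolerance name es comes from the slides, while the Euclidean norm for the vector test is an assumption.

```python
import math

def converged(f_new, f_old, x_new, x_old, es=1e-6):
    """Relative-error stopping tests: change in f(x), or change in x
    (Euclidean norm), each measured relative to the newest value."""
    rel_f = abs((f_new - f_old) / f_new) if f_new != 0 else abs(f_new - f_old)
    dx = math.sqrt(sum((a - b) ** 2 for a, b in zip(x_new, x_old)))
    nx = math.sqrt(sum(a * a for a in x_new))
    rel_x = dx / nx if nx != 0 else dx
    return rel_f <= es or rel_x <= es
```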

Page 9:

Univariate Search

Idea: Travel in alternating directions that are parallel to the coordinate axes. In each direction, we travel until we reach the peak along that direction and then select a new direction.

Page 10:

Univariate Search

• More efficient than random search, and still doesn't require derivative evaluation.
• The basic strategy is:
  – Change one variable at a time while the other variables are held constant.
  – The problem is thus reduced to a sequence of one-dimensional searches.
  – The search becomes less efficient as you approach the maximum. (Why?)

Page 11:

Univariate Search – Example

• f(x, y) = y – x – 2x² – 2xy – y²
• Start from (0, 0)

Iteration #1
• Current point: (0, 0)
• Direction: along the y-axis (i.e., x stays unchanged)
• Objective: Find y that maximizes f(0, y) = y – y²
• Let g(y) = y – y². Solving g'(y) = 0 => 1 – 2y = 0 => ymax = 0.5
• Next point: (0, 0.5)

Page 12:

Univariate Search – Example

• f(x, y) = y – x – 2x² – 2xy – y²

Iteration #2
• Current point: (0, 0.5)
• Direction: along the x-axis (i.e., y stays unchanged)
• Objective: Find x that maximizes f(x, 0.5) = 0.5 – x – 2x² – x – 0.25
• Let g(x) = 0.5 – x – 2x² – x – 0.25. Solving g'(x) = 0 => –1 – 4x – 1 = 0 => xmax = –0.5
• Next point: (–0.5, 0.5)

Page 13:

Univariate Search – Example

• f(x, y) = y – x – 2x² – 2xy – y²

Iteration #3
• Current point: (–0.5, 0.5)
• Direction: along the y-axis (i.e., x stays unchanged)
• Objective: Find y that maximizes
  f(–0.5, y) = y – (–0.5) – 2(0.25) – 2(–0.5)y – y² = 2y – y²
• Let g(y) = 2y – y². Solving g'(y) = 0 => 2 – 2y = 0 => ymax = 1
• Next point: (–0.5, 1)

… Repeat until xi+1 = xi or yi+1 = yi or ea < es.
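The three hand iterations above can be checked mechanically. The following sketch redoes each one-variable maximization symbolically; using sympy here is an assumption, not something the slides call for.

```python
import sympy as sp

x, y = sp.symbols("x y")
f = y - x - 2*x**2 - 2*x*y - y**2

pt = {x: 0, y: 0}                       # start from (0, 0)
for i, var in enumerate([y, x, y], 1):  # alternate directions: y, x, y
    g = f.subs({v: pt[v] for v in (x, y) if v != var})  # 1-D slice of f
    pt[var] = sp.solve(sp.diff(g, var), var)[0]         # solve g'(var) = 0
    print(f"Iteration {i}: move along {var}-axis to ({pt[x]}, {pt[y]})")
# Prints (0, 1/2), (-1/2, 1/2), (-1/2, 1), matching the slides.
```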

Page 14:

General Optimization Algorithm (Revised)

Select an initial point, x0 = ( x1, x2, …, xn )
for i = 0 to Max_Iteration
    Select a direction Si
    Find h such that f(xi + hSi) is maximized
    xi+1 = xi + hSi
    Stop loop if
        |( f(xi+1) – f(xi) ) / f(xi+1) | ≤ es   or   || xi+1 – xi || / || xi+1 || ≤ es
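A minimal Python skeleton of this revised loop might look as follows. choose_direction and line_search are hypothetical plug-in callables (the following pages supply concrete choices for each), and converged is the helper sketched after page 8.

```python
def optimize(f, x0, choose_direction, line_search, es=1e-6, max_iter=100):
    """Generic iterative maximizer: pick a direction Si, find the best
    step h along it, move to xi+1 = xi + h*Si, and stop on small error."""
    x = list(x0)
    for i in range(max_iter):
        S = choose_direction(i, x)
        h = line_search(f, x, S)                      # maximize f(x + h*S) over h
        x_new = [xi + h * si for xi, si in zip(x, S)]
        if converged(f(*x_new), f(*x), x_new, x, es):
            return x_new
        x = x_new
    return x
```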

Page 15:

Direction represented as a vector (Review)

S1 = [ 1 0.5 ]T and S2 = [ 2 1 ]T represent the "same direction".

For collinear points (x1, y1), (x2, y2), (x3, y3), for example:

[ x2 y2 ]T = [ x1 y1 ]T + 2 S1   or   [ x1 y1 ]T + 1 S2   (since S2 = 2 S1)

Points along this line can be represented as

[ x y ]T = [ x1 y1 ]T + h S1

Page 16:

Finding the optimal point in direction S

• Current point: x = ( x1, x2, …, xn )
• Direction: S = [ s1 s2 … sn ]T
• Objective: Find h that optimizes
  f(x + hS) = f( x1 + hs1, x2 + hs2, …, xn + hsn )

Note: f(x + hS) is a function of one variable only: h.

Page 17:

Finding the optimal point in direction S (Example)

• f(x, y) = y – x – 2x² – 2xy – y²
• Current point: (x, y) = (–0.5, 0.5)
• Direction: S = [ 0 1 ]T
• Objective: Find h that optimizes
  f(–0.5, 0.5 + h) = (0.5 + h) – (–0.5) – 2(–0.5)² – 2(–0.5)(0.5 + h) – (0.5 + h)²
                   = 0.5 + h + 0.5 – 0.5 + 0.5 + h – 0.25 – h – h²
                   = 0.75 + h – h²
• Let g(h) = 0.75 + h – h²
• Solving g'(h) = 0 => 1 – 2h = 0 => h = 0.5
• Thus the optimum in the direction S from (–0.5, 0.5) is (–0.5, 1)
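The same h can be found numerically with the one-dimensional golden-section search from earlier in the course. Below is a minimal sketch; the bracketing interval [-2, 2] and the tolerance are arbitrary assumptions.

```python
GR = (5 ** 0.5 - 1) / 2  # golden ratio conjugate, ~0.618

def golden_max(g, lo, hi, tol=1e-8):
    """Golden-section search for the maximizer of a unimodal g on [lo, hi]."""
    a, b = lo + (1 - GR) * (hi - lo), lo + GR * (hi - lo)
    while hi - lo > tol:
        if g(a) > g(b):              # maximum lies in [lo, b]
            hi, b = b, a
            a = lo + (1 - GR) * (hi - lo)
        else:                        # maximum lies in [a, hi]
            lo, a = a, b
            b = lo + GR * (hi - lo)
    return (lo + hi) / 2

f = lambda x, y: y - x - 2*x**2 - 2*x*y - y**2
g = lambda h: f(-0.5, 0.5 + h)       # slice of f along S = [ 0 1 ]T
print(golden_max(g, -2.0, 2.0))      # ~0.5, matching the hand calculation
```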

Page 18:

Univariate Search Algorithm

Let Dk = [ d1 d2 … dn ]T where dk = 1 and dj = 0 for j ≠ k (j, k ≤ n).
e.g.: n = 4, D2 = [ 0 1 0 0 ]T, D4 = [ 0 0 0 1 ]T

Select an initial point, x0 = ( x1, x2, …, xn )
for i = 0 to Max_Iteration
    Si = Dj where j = i mod n + 1
    Find h such that f(xi + hSi) is maximized
    xi+1 = xi + hSi
    Stop loop if x converges or if the error is small enough
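A direct Python rendering, reusing the optimize and golden_max sketches above; the fixed line-search bracket [-10, 10] is an assumption that happens to be wide enough for this example.

```python
def univariate_search(f, x0, n, es=1e-6, max_iter=100):
    """Cycle through the coordinate directions D1..Dn, maximizing f
    along one axis at a time (one one-dimensional search per step)."""
    def axis_direction(i, x):
        S = [0.0] * n
        S[i % n] = 1.0               # Dj with j = (i mod n) + 1 in the slide
        return S

    def line_search(f, x, S):
        g = lambda h: f(*[xi + h * si for xi, si in zip(x, S)])
        return golden_max(g, -10.0, 10.0)

    return optimize(f, x0, axis_direction, line_search, es, max_iter)

f = lambda x, y: y - x - 2*x**2 - 2*x*y - y**2
print(univariate_search(f, [0.0, 0.0], n=2))  # approaches the maximum at (-1, 1.5)
```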

Page 19:

Pattern Search Methods

Observation: Lines connecting alternating points (1:3, 2:4, 3:5, etc.) give a better indication of where the peak is than the lines parallel to the coordinate axes.

The general directions that point toward the optimum are known as pattern directions.

Optimization methods that use pattern directions to improve the convergence rate are known as pattern search methods.

Page 20:

Powell's Method

Powell's method (a well-known pattern search method) is based on the observation that if points 1 and 2 are obtained by one-dimensional searches in the same direction but from different starting points, then the line formed by points 1 and 2 will be directed toward the maximum. The directions represented by such lines are called conjugate directions.

Page 21:

How Powell's method selects directions **

• Start with an initial set of n distinct directions: S[1], S[2], …, S[n].
• Let counter[k] be the number of times S[k] has been used; initially counter[k] = 0 for all k = 1, 2, …, n.

Si = S[j] where j = i mod n + 1
xi+1 = optimum point reached by traveling from xi in the direction Si
counter[j] = counter[j] + 1
if (counter[j] == 2)
    S[j] = direction defined by xi+1 and xi+1–n
    counter[j] = 0

i.e., each direction in the set, after being used twice, is immediately replaced by a new conjugate direction.
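One possible rendering of this bookkeeping in Python. The slides give only the fragment above, so the point history xs, the line_search argument, and the surrounding driver loop are assumptions.

```python
def powell_direction_step(f, xs, S, counter, i, n, line_search):
    """One iteration of Powell-style direction selection.

    xs:      points visited so far; xs[-1] is the current point xi
             (assumes len(xs) > n so that xi+1-n exists)
    S:       current set of n search directions (lists of floats)
    counter: usage count per direction
    """
    j = i % n                                     # j = i mod n + 1 in 1-based slides
    h = line_search(f, xs[-1], S[j])
    x_next = [xi + h * si for xi, si in zip(xs[-1], S[j])]
    xs.append(x_next)                             # xs[-1] is now xi+1
    counter[j] += 1
    if counter[j] == 2:
        # Replace S[j] with the conjugate direction through xi+1 and xi+1-n.
        old = xs[-1 - n]
        S[j] = [a - b for a, b in zip(x_next, old)]
        counter[j] = 0
    return x_next
```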

Page 22:

Quadratically Convergent

• Definition: If an optimization method, using exact arithmetic, can find the optimum point of a quadratic function of n variables in n steps, the method is called quadratically convergent.

• If f(x) is a quadratic function, sequential search along conjugate directions converges quadratically; that is, it reaches the optimum in a finite number of steps regardless of the starting point.

Page 23:

Conjugate-based Methods

• Since general nonlinear functions can often be reasonably approximated by a quadratic function, methods based on conjugate directions are usually quite efficient and, in fact, become quadratically convergent as they approach the optimum.

Page 24:

Summary

• Random Search
• General algorithm for locating the optimum point
  – Guess a direction
  – Find the optimum point in the guessed direction
• How to find h such that f(xi + hSi) is maximized
• Univariate Search Method
• Pattern Search Method