Lecture 15: Summary of Single-Variable Methods
TRANSCRIPT
-
Optimization Techniques
Topic: All about One-Dimensional Unconstrained Optimization
Dr. Nasir M Mirza
Email: [email protected]
-
Classification of Optimization Problems

• If f(x) and the constraints are linear, we have linear programming.
  e.g.: Maximize x + y subject to
  3x + 4y ≤ 2
  y ≤ 5
• If f(x) is quadratic and the constraints are linear, we have quadratic programming.
• If f(x) is not linear or quadratic and/or the constraints are nonlinear, we have nonlinear programming.
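As an aside, the toy LP above can be solved mechanically. Here is a minimal Python sketch using scipy.optimize.linprog; note that linprog minimizes (so we negate the objective) and applies x, y ≥ 0 bounds by default. That nonnegativity assumption is added here, not stated on the slide; without it this particular LP is unbounded.

from scipy.optimize import linprog

# Maximize x + y  <=>  minimize -(x + y),
# subject to 3x + 4y <= 2 and y <= 5 (plus linprog's default x, y >= 0).
res = linprog(c=[-1, -1], A_ub=[[3, 4], [0, 1]], b_ub=[2, 5])
print(res.x)   # optimal point, here (2/3, 0) with objective value 2/3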
-
Classification of Optimization Problems

When constraints (equations marked with * below) are included, we have a constrained optimization problem.
Otherwise, we have an unconstrained optimization problem.
-
Optimization Methods

One-Dimensional Unconstrained Optimization
  Golden-Section Search
  Quadratic Interpolation
  Newton's Method
Multi-Dimensional Unconstrained Optimization
  Non-gradient or direct methods
  Gradient methods
Linear Programming (Constrained)
  Graphical Solution
  Simplex Method
-
Global and Local Optima

A function is said to be multimodal on a given interval if there is more than one minimum/maximum point in the interval.
-
Characteristics of Optima

To find the optima, we can find the zeroes of f'(x).
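For example, if f(x) = x^2 - 4x, then f'(x) = 2x - 4 = 0 gives x = 2, and since f''(x) = 2 > 0 that point is a minimum.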
-
Mathematical Background

Objective: Maximize or Minimize f(x)

subject to the constraints (*):
  di(x) ≤ ai,  i = 1, 2, ..., m
  ei(x) = bi,  i = 1, 2, ..., p

where
  x = {x1, x2, ..., xn}
  f(x): objective function
  di(x): inequality constraints
  ei(x): equality constraints
  ai and bi are constants

Note that Maximize f(x) is equivalent to Minimize -f(x).
-
Line search methods

Line search techniques are in essence optimization algorithms for one-dimensional minimization problems.
They are often regarded as the backbone of nonlinear optimization algorithms.
Typically, these techniques search a bracketed interval. Often, unimodality is assumed.
Exhaustive search requires N = (b-a)/ε + 1 calculations to search the interval [a, b], where ε is the resolution.

[Figure: the interval [a, b] with the minimizer x* inside]
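To make that cost concrete, here is a minimal Python sketch of exhaustive search; the test function, interval, and resolution are illustrative, not from the slides.

# Exhaustive search: evaluate f at evenly spaced points and keep the best.
# Uses N = (b - a)/eps + 1 evaluations for resolution eps (minimization).
def exhaustive_search(f, a, b, eps):
    best_x, best_f = a, f(a)
    n = int((b - a) / eps) + 1
    for i in range(1, n):
        x = a + i * eps
        fx = f(x)
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f

# Example: minimize f(x) = (x - 1)^2 on [0, 3] with resolution 0.001
xmin, fmin = exhaustive_search(lambda x: (x - 1)**2, 0.0, 3.0, 1e-3)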
-
Bracketing Method

[Figure: a unimodal f(x) on the interval [xl, xu]]

Suppose f(x) is unimodal on the interval [xl, xu]. That is, there is only one local maximum in [xl, xu].
Objective: gradually narrow down the interval by eliminating the sub-interval that does not contain the maximum.
-
Bracketing Method

[Figure: points xl, xa, xb, xu marked on the x-axis]

Let xa and xb be two points in (xl, xu) with xa < xb. If f(xa) > f(xb), then the maximum point will not reside in the interval [xb, xu], and as a result we can eliminate the portion toward the right of xb.
In other words, in the next iteration we can make xb the new xu.
-
Basic bracketing algorithm

Two-point search (dichotomous search) for finding the solution to minimizing f(x):

0) Assume an interval [a, b].
1) Find x1 = a + (b-a)/2 - ε/2 and x2 = a + (b-a)/2 + ε/2, where ε is the resolution.
2) Compare f(x1) and f(x2).
3) If f(x1) < f(x2), then eliminate x > x2 and set b = x2.
   If f(x1) > f(x2), then eliminate x < x1 and set a = x1.
   If f(x1) = f(x2), then pick another pair of points.
4) Continue placing point pairs until the interval < 2ε.

[Figure: points x1, x2 inside the interval [a, b]]
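A minimal Python sketch of this dichotomous search, assuming minimization; for simplicity it folds the tie case f(x1) = f(x2) into the else branch (safe for a unimodal function) rather than picking a new pair, and the tolerance values are illustrative.

# Dichotomous (two-point) search for minimizing f on [a, b].
def dichotomous_search(f, a, b, eps, tol):
    while (b - a) > 2 * tol:
        mid = a + (b - a) / 2
        x1 = mid - eps / 2
        x2 = mid + eps / 2
        if f(x1) < f(x2):
            b = x2              # minimum cannot lie to the right of x2
        else:
            a = x1              # minimum cannot lie to the left of x1
    return (a + b) / 2

# Example: minimize f(x) = (x - 2)^2 on [0, 5]
x_star = dichotomous_search(lambda x: (x - 2)**2, 0.0, 5.0, 1e-4, 1e-3)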
-
Generic Bracketing Method (Pseudocode)

// xl, xu: Lower and upper bounds of the interval
// es: Acceptable relative error
function BracketingMax(xl, xu, es) {
  do { prev_optimal = optimal
    Select xa and xb s.t. xl < xa < xb < xu
    if f(xa) > f(xb) then xu = xb else xl = xa   // keep the half containing the maximum
    optimal = max(f(xa), f(xb))
  } while (relative change from prev_optimal > es)
}
-
Bracketing Method

How would you suggest we select xa and xb (with the objective to minimize computation)?

• Eliminate as much of the interval as possible in each iteration.
  • Set xa and xb close to the center so that we can halve the interval in each iteration.
  • Drawbacks: function evaluation is usually a costly operation.
• Minimize the number of function evaluations.
  • Select xa and xb such that one of them can be reused in the next iteration (so that we only need to evaluate f(x) once in each iteration).
  • How should we select such points?
-
[Figure: current iteration with points xl, xa, xb, xu and segment lengths l0, l1; next iteration with points x'l, x'a, x'b, x'u and lengths l'0, l'1]

If we can calculate xa and xb based on the ratio R w.r.t. the current interval length in each iteration, then we can reuse one of xa and xb in the next iteration.

Objective:
  l1/l0 = l'1/l'0 = R,  with xa = x'b or xb = x'a

In this example, xa is reused as x'b in the next iteration, so in the next iteration we only need to evaluate f(x'a).
-
Golden Ratio

[Figure: current iteration with points xl, xa, xb, xu and lengths l0, l1; next iteration with points x'l, x'a, x'b, x'u and lengths l'0, l'1]

Since l0 = l1 + l'1 and l'0 = l1:

  R = l1/l0 = l'1/l'0
  => l'1 = R l'0 = R l1 = R (R l0) = R^2 l0
  => l0 = R l0 + R^2 l0
  => R^2 + R - 1 = 0
  => R = (-1 + sqrt(1 + 4)) / 2 = (sqrt(5) - 1) / 2 ≈ 0.61803
-
Golden-Section Search

• Starts with two initial guesses, xl and xu.
• Two interior points xa and xb are calculated based on the golden ratio as

  xa = xu - d  and  xb = xl + d,  where d = ((sqrt(5) - 1)/2)(xu - xl)

• In the first iteration, both xa and xb need to be calculated.
• In subsequent iterations, xl and xu are updated accordingly and only one of the two interior points needs to be calculated. (The other one is inherited from the previous iteration.)
-
Golden-Section Search

• In each iteration the interval is reduced to about 61.8% (the golden ratio) of its previous length.
• After 10 iterations, the interval is shrunk to about (0.618)^10, or 0.8%, of its initial length.
• After 20 iterations, the interval is shrunk to about (0.618)^20, or 0.0066%.
-
Bracketing a Minimum using Golden Section

[Figure: points x1, x2 inside the interval [a, b]]

Initialize:
  x1 = a + (b-a)*0.382
  x2 = a + (b-a)*0.618
  f1 = f(x1)
  f2 = f(x2)
Loop:
  if f1 > f2 then
    a = x1; x1 = x2; f1 = f2
    x2 = a + (b-a)*0.618
    f2 = f(x2)
  else
    b = x2; x2 = x1; f2 = f1
    x1 = a + (b-a)*0.382
    f1 = f(x1)
  endif
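The pseudocode above translates almost line for line into Python. This is a sketch under the slide's assumptions (unimodal f, minimization); the tolerance and test function are illustrative.

import math

# Golden-section search for a minimum of a unimodal f on [a, b].
def golden_section(f, a, b, tol=1e-6):
    r = (math.sqrt(5) - 1) / 2            # golden ratio, ~0.618
    x1 = a + (b - a) * (1 - r)            # a + 0.382*(b - a)
    x2 = a + (b - a) * r                  # a + 0.618*(b - a)
    f1, f2 = f(x1), f(x2)
    while (b - a) > tol:
        if f1 > f2:                        # minimum lies in [x1, b]
            a, x1, f1 = x1, x2, f2         # reuse x2 as the new x1
            x2 = a + (b - a) * r
            f2 = f(x2)
        else:                              # minimum lies in [a, x2]
            b, x2, f2 = x2, x1, f1         # reuse x1 as the new x2
            x1 = a + (b - a) * (1 - r)
            f1 = f(x1)
    return (a + b) / 2

# Example: minimize f(x) = x^2 - 4x on [0, 5]; the minimizer is x = 2
x_star = golden_section(lambda x: x*x - 4*x, 0.0, 5.0)

Note how each pass through the loop evaluates f only once, which is the whole point of the golden-ratio placement.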
-
Fibonacci Search

[Figure: interval [a, b] with interior points x1, x2; lengths L1, L2, L3 with L1 = L2 + L3, and resolution ε]

Fibonacci numbers are 1, 1, 2, 3, 5, 8, 13, 21, 34, ...; that is, each number is the sum of the previous two:
  Fn = Fn-1 + Fn-2

It can be derived that
  Ln = (L1 + Fn-2 ε) / Fn
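A sketch of Fibonacci search in the same style, assuming a unimodal f to be minimized and a number of iterations n fixed in advance; the loop bounds and the test call are my choices for illustration, not from the slides.

# Fibonacci search for a minimum of a unimodal f on [a, b]; assumes n >= 4.
# Interior points sit at Fibonacci fractions of the interval, so one
# point is reused per iteration, as in golden-section search.
def fibonacci_search(f, a, b, n):
    F = [0, 1, 1]                          # F[i] is the i-th Fibonacci number
    while len(F) <= n:
        F.append(F[-1] + F[-2])
    x1 = a + (b - a) * F[n - 2] / F[n]
    x2 = a + (b - a) * F[n - 1] / F[n]
    f1, f2 = f(x1), f(x2)
    for m in range(n, 3, -1):
        if f1 > f2:                        # minimum in [x1, b]
            a = x1
            x1, f1 = x2, f2                # reuse x2 as the new x1
            x2 = a + (b - a) * F[m - 2] / F[m - 1]
            f2 = f(x2)
        else:                              # minimum in [a, x2]
            b = x2
            x2, f2 = x1, f1                # reuse x1 as the new x2
            x1 = a + (b - a) * F[m - 3] / F[m - 1]
            f1 = f(x1)
    return (a + b) / 2

# Example: minimize f(x) = (x - 2)^2 on [0, 5] with n = 15 iterations
x_star = fibonacci_search(lambda x: (x - 2)**2, 0.0, 5.0, 15)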
-
Quadratic Interpolation

[Figure: f(x) with sample points x0, x1, x3, x2 on the x-axis; the optimum of f(x) and the optimum of g(x) nearly coincide]

Idea:
(i) Approximate f(x) using a quadratic function g(x) = ax^2 + bx + c.
(ii) Optimum of f(x) ≈ optimum of g(x).
-
• The shape near an optimum typically appears like a parabola. We can approximate the original function f(x) using a quadratic function:
  g(x) = ax^2 + bx + c.
• At the optimum point of g(x), g'(x) = 2ax + b = 0. Let x3 be the optimum point; then x3 = -b/2a.
• How to compute b and a?
  • 2 points => unique straight line (1st-order polynomial)
  • 3 points => unique parabola (2nd-order polynomial)
• So, we need to pick three points that surround the optimum. Let these points be x0, x1, x2 such that x0 < x1 < x2.
-
Quadratic Interpolation

• a and b can be obtained by solving the system of linear equations:

  a x0^2 + b x0 + c = f(x0)
  a x1^2 + b x1 + c = f(x1)
  a x2^2 + b x2 + c = f(x2)

• Substituting a and b into x3 = -b/2a yields

  x3 = [ f(x0)(x1^2 - x2^2) + f(x1)(x2^2 - x0^2) + f(x2)(x0^2 - x1^2) ] /
       [ 2 f(x0)(x1 - x2) + 2 f(x1)(x2 - x0) + 2 f(x2)(x0 - x1) ]
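A direct transcription of this formula into Python; the test function and starting points are illustrative only.

# One quadratic-interpolation step: vertex of the parabola through
# (x0, f(x0)), (x1, f(x1)), (x2, f(x2)).
def quad_interp_step(f, x0, x1, x2):
    f0, f1, f2 = f(x0), f(x1), f(x2)
    num = f0*(x1**2 - x2**2) + f1*(x2**2 - x0**2) + f2*(x0**2 - x1**2)
    den = 2*f0*(x1 - x2) + 2*f1*(x2 - x0) + 2*f2*(x0 - x1)
    return num / den

# Example: f(x) = (x - 2)^2 with x0 = 0, x1 = 1, x2 = 4; since f is itself
# a parabola, a single step lands exactly on the optimum x3 = 2.
x3 = quad_interp_step(lambda x: (x - 2)**2, 0.0, 1.0, 4.0)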
-
Quadratic Interpolation

• The process can be repeated to improve the approximation.
• Next step: decide which sub-interval to discard. Since f(x3) > f(x1):
  if x3 > x1, discard the interval toward the left of x1, i.e., set x0 = x1 and x1 = x3;
  if x3 < x1, discard the interval toward the right of x1, i.e., set x2 = x1 and x1 = x3.
• Calculate x3 based on the new x0, x1, x2.
-
Gradient method: Newton's Method

If your function is differentiable, then you do not need to evaluate two points to determine the region to be discarded: get the slope, and the sign indicates which region to discard.

Basic premise of the Newton-Raphson method: root finding on the first derivative is equivalent to finding the optimum (if the function is differentiable).

The method is sometimes referred to as a line search by curve fit because it approximates the real (unknown) objective function to be minimized.
-
Newton's Method

Let g(x) = f'(x).
Thus the zeroes of g(x) are the optima of f(x).
Substituting g(x) into the updating formula of the Newton-Raphson method, we have

  x_{i+1} = x_i - g(x_i)/g'(x_i) = x_i - f'(x_i)/f''(x_i)
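A minimal sketch of this update in Python, assuming f'(x) and f''(x) are supplied analytically; the tolerance, iteration cap, and test function are illustrative.

# Newton's method for optimization: iterate x <- x - f'(x)/f''(x).
def newton_opt(df, d2f, x0, tol=1e-8, max_iter=50):
    x = x0
    for _ in range(max_iter):
        step = df(x) / d2f(x)
        x = x - step
        if abs(step) < tol:
            break
    return x

# Example: f(x) = x^2 - 4x, f'(x) = 2x - 4, f''(x) = 2; optimum at x = 2
x_star = newton_opt(lambda x: 2*x - 4, lambda x: 2.0, x0=10.0)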
-
Newton's Method

• Shortcomings
  • Need to derive f'(x) and f''(x).
  • May diverge.
  • May "jump" to another solution far away.
• Advantages
  • Fast convergence rate near the solution.
• Hybrid approach: use a bracketing method to find an approximation near the solution, then switch to Newton's method.
-
False Position Method or Secant Method

Second-order information is expensive to calculate (for multi-variable problems). Thus, try to approximate the second-order derivative.

Replace y''(xk) in Newton-Raphson with

  y''(x_k) ≈ (y'(x_k) - y'(x_{k-1})) / (x_k - x_{k-1})

Hence, Newton-Raphson becomes

  x_{k+1} = x_k - y'(x_k) (x_k - x_{k-1}) / (y'(x_k) - y'(x_{k-1}))

Question: Why is this an advantage? The main advantage is that no second derivative is required.
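A sketch of the resulting secant update in Python; it needs only first derivatives and two starting points, and all numeric values in the example are illustrative.

# Secant method for optimization: the second derivative in Newton's
# update is replaced by a finite difference of first derivatives.
def secant_opt(dy, x_prev, x_curr, tol=1e-8, max_iter=100):
    g_prev = dy(x_prev)
    for _ in range(max_iter):
        g_curr = dy(x_curr)
        if g_curr == g_prev:               # flat difference; cannot proceed
            break
        x_next = x_curr - g_curr * (x_curr - x_prev) / (g_curr - g_prev)
        if abs(x_next - x_curr) < tol:
            return x_next
        x_prev, g_prev = x_curr, g_curr
        x_curr = x_next
    return x_curr

# Example: f(x) = x^2 - 4x, f'(x) = 2x - 4; optimum at x = 2
x_star = secant_opt(lambda x: 2*x - 4, 0.0, 1.0)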