TRANSCRIPT
Optimization
CS3220 - Summer 2008
Jonathan Kaldor
Problem Setup
• Suppose we have a function f(x) in one variable (for the moment)
• We want to find x’ such that f(x’) is a minimum of the function f(x)
• Can have local minimum and global minimum - one is a lot easier to find than the other, though, without special knowledge about the problem
Problem Setup
• Can find global minimum for certain types of problems
• Our focus: general (possibly nonlinear) functions, and local minima
• Finding x’ such that f(x’) is minimal over some local region around x’.
Constrained Versus Unconstrained
• Most general problem statement: find x’ such that f(x’) is minimized, subject to constraints g(x’) ≤ 0
• Function g(x) represents constraints on the solution (can be equality or inequality constraints)
• Our focus: unconstrained optimization (so no g() function)
Applications
• Many applications in sciences
• Energy Minimization
• Function representing the energy in a system
• Find the minimum energy - stable state of system
Applications
• Minimum Surface Area
• Given some shape, find the minimum surface that matches at the boundaries
• Real life example: take some shape and dip it into bubble solution. Bubble solution takes the shape of the minimum surface
Applications
• Planning and Scheduling
• Schedules for sports teams
• Finals schedules
• Typically solving constrained optimization problems (we’ll look at unconstrained problems only)
Applications
• Protein Folding
• Image Recognition
• ... and many, many more
Bisection Method
• Suppose we have an interval [a,b] and we would like to find a local minimum in that interval.
• Evaluate at c = (a+b)/2. Do we gain anything?
• Answer: no (we still have no idea which half of the interval contains the minimum)
Bisection Method
[Figure: interval [a, b] with midpoint c]
Trisection Method
• If we divide up our interval into three pieces, though, we can tell which interval has a minimum
• Suppose we have a < u < v < b. Then if f(u) < f(v), we know that there is a local minimum in the interval [a,v]
• Similar case for f(v) < f(u), local minimum in [u,b]
Trisection Method
• So we evaluate at u = a + (b-a)/3 and v = a + 2*(b-a)/3. Suppose f(u) < f(v), and our new interval is [a,v]
• The u we computed before is the midpoint between a and v, so it is not one of the new trisection points. We can’t reuse its function value on the next step, so we need to evaluate our function twice at every iteration
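The trisection loop described above can be sketched as follows (a minimal Python sketch; the function and variable names are mine, not from the lecture, and f is assumed unimodal on [a, b]):

```python
def trisection_min(f, a, b, tol=1e-8):
    """Locate a local minimum of a unimodal f on [a, b] by trisection.

    Costs two function evaluations per iteration; the bracket
    shrinks by a factor of 2/3 each time.
    """
    while b - a > tol:
        u = a + (b - a) / 3
        v = a + 2 * (b - a) / 3
        if f(u) < f(v):
            b = v   # a local minimum lies in [a, v]
        else:
            a = u   # a local minimum lies in [u, b]
    return (a + b) / 2

# Example: the minimum of (x - 2)^2 on [0, 5] is at x = 2
print(trisection_min(lambda x: (x - 2) ** 2, 0.0, 5.0))
```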
Trisection Method
• Note: we aren’t restricted to choosing u and v evenly spaced between a and b
• Can we choose u and v such that we can reuse the computation of whichever one doesn’t become an endpoint at the next step?
Trisection Method
• More accurately, suppose we are at [a0, b0], and we compute u0, v0. Our next interval happens to be [a0, v0], and we now need to compute u1, v1. Can we arrange our u’s and v’s such that u0 = u1 or u0 = v1?
• Note: this will allow us to save a function evaluation at each iteration
Golden Section Search
• Define h = ρ(b - a). This is the distance of our u and v from the endpoints a and b. Thus, u = a + h, v = b - h.
• We want to find ρ so that u0 is in the correct position for the next step.
Golden Section Search
[Figure: unit interval [0, 1] with u0 at ρ and v0 at 1 - ρ; after the interval shrinks to [0, 1 - ρ], the new points u1, v1 are placed with the same ratio ρ]
• Requiring that u0 land exactly on v1 (same ratio at every scale) gives the relation ρ = (1 - ρ)²
Golden Section Search
• Given the relation between the ratios, we can then compute the desired value of ρ:
ρ = (1 - ρ)² = 1 - 2ρ + ρ²
ρ² - 3ρ + 1 = 0
ρ = (3 - √5)/2 = 2 - ϕ ≈ 0.382
where ϕ is the golden ratio (again)
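Putting the pieces together, the search described above might look like this (a sketch, with my own names; the key point is that each branch reuses one of the two previous function values, so only one new evaluation is needed per iteration):

```python
import math

RHO = (3 - math.sqrt(5)) / 2  # ~0.382, i.e. 2 - phi

def golden_section_min(f, a, b, tol=1e-8):
    """Golden section search for a local minimum of f on [a, b].

    Because rho = (1 - rho)^2, the surviving interior point of the
    old bracket is exactly an interior point of the new bracket,
    so only one new function evaluation is needed per iteration.
    """
    u = a + RHO * (b - a)
    v = b - RHO * (b - a)
    fu, fv = f(u), f(v)
    while b - a > tol:
        if fu < fv:
            # minimum lies in [a, v]; the old u becomes the new v
            b, v, fv = v, u, fu
            u = a + RHO * (b - a)
            fu = f(u)
        else:
            # minimum lies in [u, b]; the old v becomes the new u
            a, u, fu = u, v, fv
            v = b - RHO * (b - a)
            fv = f(v)
    return (a + b) / 2

# Example: the minimum of (x - 2)^2 on [0, 5] is at x = 2
print(golden_section_min(lambda x: (x - 2) ** 2, 0.0, 5.0))
```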
Convergence
• At every iteration, we compute a new interval 1 - ρ ≈ 61.8% the size of the old one
• Larger than the 50% interval we compute during bisection, but smaller than the 66.7% interval we compute using trisection - and it costs only one function evaluation per iteration
• End up needing around 75 iterations to reduce the error in our bounds to machine epsilon
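The iteration count above follows from requiring (1 - ρ)^n ≤ ε; a quick check (assuming ε ≈ 10⁻¹⁶, roughly double-precision machine epsilon):

```python
import math

rho = (3 - math.sqrt(5)) / 2       # ~0.382
shrink = 1 - rho                    # each iteration keeps ~61.8% of the bracket
eps = 1e-16                         # roughly double-precision machine epsilon

# smallest n with shrink**n <= eps, i.e. n >= log(eps) / log(shrink)
n = math.ceil(math.log(eps) / math.log(shrink))
print(n)  # 77, consistent with "around 75 iterations"
```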
Existence / Uniqueness
• If our function is unimodal on the original interval [a,b] (has only one minimum) then we will converge to it.
• If the function has multiple local minima in the interval, we will converge to one of them (but it may not be the global minimum)
Sidestep: Calculus (again)
• Suppose we have some function f(x), and we want to find the critical points of f
• Minima, maxima, and points of inflection
• Critical points are where f ’(x) = 0
• Minima: f ’(x) = 0, f ’’(x) > 0
• Maxima: f ’(x) = 0, f ’’(x) < 0
• Point of inflection: f ’(x) = f ’’(x) = 0
Sidestep: Calculus (again)
• So, we want to find roots of f ’(x)
• We can use the root finding strategies we already talked about!
• In particular, we can use Newton’s Method applied to f ’(x). Our linear approximation is then f ’(x0 + h) ≈ f ’(x0) + f ’’(x0) h
• So h = -f ’(x0) / f ’’(x0)
Newton’s Method
• Another way of deriving it: suppose we are at a point x0. When we wanted to find the root, we took a linear approximation to the function at that point, since it was easy to find the root
• What function is easy to find a minimum of? A quadratic function! So take the quadratic approximation of f at x0
Newton’s Method
• f(x0 + h) ≈ f(x0) + f ’(x0) h + f ’’(x0) h²/2
• We can compute the minimum of this function by taking the derivative with respect to h and solving.
• End up with h = - f ’(x0) / f ’’(x0)
• Note: we get the same answer from either derivation
Newton’s Method
• We gain all of the benefits (and drawbacks) of Newton’s Method
• In particular, we have no guarantee of convergence if we don’t start reasonably close to the minimum
• We have the added wrinkle that we may not even converge to a minimum (we could converge to a maximum or an inflection point)
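The iteration x_{k+1} = x_k - f ’(x_k)/f ’’(x_k) can be sketched as below (my own names; the f ’’ > 0 check at the end is one way to detect the "converged to a maximum or inflection point" wrinkle):

```python
def newton_min(fprime, fsecond, x0, tol=1e-10, max_iter=50):
    """Newton's method applied to f': x <- x - f'(x)/f''(x).

    Converges to a critical point of f -- which may be a minimum,
    maximum, or inflection point -- and only if started close enough.
    """
    x = x0
    for _ in range(max_iter):
        h = -fprime(x) / fsecond(x)
        x += h
        if abs(h) < tol:
            break
    return x

# Example: f(x) = x^4 - 3x^2 + x, so f'(x) = 4x^3 - 6x + 1
fp = lambda x: 4 * x**3 - 6 * x + 1
fpp = lambda x: 12 * x**2 - 6

x_star = newton_min(fp, fpp, 1.0)   # converges to the minimum near x ~ 1.13
print(x_star, fpp(x_star) > 0)      # f'' > 0 confirms it is a minimum
```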