TRANSCRIPT
Optimization
CS3220 - Summer 2008
Jonathan Kaldor
Problem Setup
• Suppose we have a function f(x) in one variable (for the moment)
• We want to find x’ such that f(x’) is a minimum of the function f(x)
• Can have local minimum and global minimum - one is a lot easier to find than the other, though, without special knowledge about the problem
Problem Setup
• Can find global minimum for certain types of problems
• Our focus: general (possibly nonlinear) functions, and local minima
• Finding x’ such that f(x’) is minimal over some local region around x’.
Constrained Versus Unconstrained
• Most general problem statement: find x’ such that f(x’) is minimized, subject to constraints g(x’) ≤ 0
• Function g(x) represents constraints on the solution (can be equality or inequality constraints)
• Our focus: unconstrained optimization (so no g() function)
Applications
• Many applications in sciences
• Energy Minimization
• Function representing the energy in a system
• Find the minimum energy - stable state of system
Applications
• Minimum Surface Area
• Given some shape, find the minimum surface that matches at the boundaries
• Real life example: take some shape and dip it into bubble solution. Bubble solution takes the shape of the minimum surface
Applications
• Planning and Scheduling
• Schedules for sports teams
• Finals schedules
• Typically solving constrained optimization problems (we’ll look at unconstrained problems only)
Applications
• Protein Folding
• Image Recognition
• ... and many, many more
Bisection Method
• Suppose we have an interval [a,b] and we would like to find a local minimum in that interval.
• Evaluate at c = (a+b)/2. Do we gain anything?
• Answer: no (we still have no idea which half of the interval contains the minimum)
Bisection Method
[Figure: interval [a, b] with midpoint c]
Trisection Method
• If we divide up our interval into three pieces, though, we can tell which interval has a minimum
• Suppose we have a < u < v < b. Then if f(u) < f(v), we know that there is a local minimum in the interval [a,v]
• Similar case for f(v) < f(u), local minimum in [u,b]
Trisection Method
• So we evaluate at u = a + (b-a)/3 and v = a + 2*(b-a)/3. Suppose f(u) < f(v), and our new interval is [a,v]
• The u we computed before is the midpoint between a and v, so it is not one of the new trisection points. We can’t reuse its function value on the next step, so we need to evaluate our function twice at every iteration
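The trisection loop described above can be sketched as follows (a minimal Python sketch; the function and variable names are mine, not from the lecture, and f is assumed unimodal on [a, b]):

```python
def trisection_min(f, a, b, tol=1e-8):
    """Locate a local minimum of a unimodal f on [a, b] by trisection.

    Costs two function evaluations per iteration; the bracket
    shrinks by a factor of 2/3 each time.
    """
    while b - a > tol:
        u = a + (b - a) / 3
        v = a + 2 * (b - a) / 3
        if f(u) < f(v):
            b = v   # a local minimum lies in [a, v]
        else:
            a = u   # a local minimum lies in [u, b]
    return (a + b) / 2

# Example: the minimum of (x - 2)^2 on [0, 5] is at x = 2
print(trisection_min(lambda x: (x - 2) ** 2, 0.0, 5.0))
```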
Trisection Method
• Note: we aren’t restricted to choosing u and v evenly spaced between a and b
• Can we choose u and v such that we can reuse the computation of whichever one doesn’t become an endpoint at the next step?
Trisection Method
• More accurately, suppose we are at [a0, b0], and we compute u0, v0. Our next interval happens to be [a0, v0], and we now need to compute u1, v1. Can we arrange our u’s and v’s such that u0 = u1 or u0 = v1?
• Note: this will allow us to save a function evaluation at each iteration
Golden Section Search
• Define h = ρ(b - a). This is the distance of our u and v from the endpoints a and b. Thus, u = a + h, v = b - h.
• We want to find ρ so that u0 is in the correct position for the next step.
Golden Section Search
[Figure: unit interval [0, 1] with u0 at ρ and v0 at 1 - ρ; after the interval shrinks to [0, 1 - ρ], the new points u1, v1 are placed with the same ratio ρ]
• Requiring that u0 land exactly on v1 (same ratio at every scale) gives the relation ρ = (1 - ρ)²
Golden Section Search
• Given the relation between the ratios, we can then compute the desired value of ρ:
ρ = (1 - ρ)² = 1 - 2ρ + ρ²
ρ² - 3ρ + 1 = 0
ρ = (3 - √5)/2 = 2 - ϕ ≈ 0.382
where ϕ is the golden ratio (again)
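Putting the pieces together, the search described above might look like this (a sketch, with my own names; the key point is that each branch reuses one of the two previous function values, so only one new evaluation is needed per iteration):

```python
import math

RHO = (3 - math.sqrt(5)) / 2  # ~0.382, i.e. 2 - phi

def golden_section_min(f, a, b, tol=1e-8):
    """Golden section search for a local minimum of f on [a, b].

    Because rho = (1 - rho)^2, the surviving interior point of the
    old bracket is exactly an interior point of the new bracket,
    so only one new function evaluation is needed per iteration.
    """
    u = a + RHO * (b - a)
    v = b - RHO * (b - a)
    fu, fv = f(u), f(v)
    while b - a > tol:
        if fu < fv:
            # minimum lies in [a, v]; the old u becomes the new v
            b, v, fv = v, u, fu
            u = a + RHO * (b - a)
            fu = f(u)
        else:
            # minimum lies in [u, b]; the old v becomes the new u
            a, u, fu = u, v, fv
            v = b - RHO * (b - a)
            fv = f(v)
    return (a + b) / 2

# Example: the minimum of (x - 2)^2 on [0, 5] is at x = 2
print(golden_section_min(lambda x: (x - 2) ** 2, 0.0, 5.0))
```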
Convergence
• At every iteration, we compute a new interval 1 - ρ ≈ 61.8% the size of the old one
• Larger than the 50% interval we compute during bisection, but smaller than the 66.7% interval we compute using trisection - and it costs only one function evaluation per iteration
• End up needing around 75 iterations to reduce the error in our bounds to machine epsilon
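The iteration count above follows from requiring (1 - ρ)^n ≤ ε; a quick check (assuming ε ≈ 10⁻¹⁶, roughly double-precision machine epsilon):

```python
import math

rho = (3 - math.sqrt(5)) / 2       # ~0.382
shrink = 1 - rho                    # each iteration keeps ~61.8% of the bracket
eps = 1e-16                         # roughly double-precision machine epsilon

# smallest n with shrink**n <= eps, i.e. n >= log(eps) / log(shrink)
n = math.ceil(math.log(eps) / math.log(shrink))
print(n)  # 77, consistent with "around 75 iterations"
```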
Existence / Uniqueness
• If our function is unimodal on the original interval [a,b] (has only one minimum) then we will converge to it.
• If the function has multiple local minima in the interval, we will converge to one of them (but it may not be the global minimum)
Sidestep: Calculus (again)
• Suppose we have some function f(x), and we want to find the critical points of f
• Minima, maxima, and points of inflection
• Critical points are where f ’(x) = 0
• Minima: f ’(x) = 0, f ’’(x) > 0
• Maxima: f ’(x) = 0, f ’’(x) < 0
• Point of inflection: f ’(x) = f ’’(x) = 0
Sidestep: Calculus (again)
• So, we want to find roots of f ’(x)
• We can use the root finding strategies we already talked about!
• In particular, we can use Newton’s Method applied to f ’(x). Our linear approximation is then f ’(x0 + h) ≈ f ’(x0) + f ’’(x0) h
• So h = -f ’(x0) / f ’’(x0)
Newton’s Method
• Another way of deriving it: suppose we are at a point x0. When we wanted to find the root, we took a linear approximation to the function at that point, since it was easy to find the root
• What function is easy to find a minimum of? A quadratic function! So take the quadratic approximation of f at x0
Newton’s Method
• f(x0 + h) ≈ f(x0) + f ’(x0) h + f ’’(x0) h²/2
• We can compute the minimum of this function by taking the derivative with respect to h and solving.
• End up with h = - f ’(x0) / f ’’(x0)
• Note: we get the same answer from either derivation
Newton’s Method
• We gain all of the benefits (and drawbacks) of Newton’s Method
• In particular, we have no guarantee of convergence if we don’t start reasonably close to the minimum
• We have the added wrinkle that we may not even converge to a minimum (we could converge to a maximum or an inflection point)
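The iteration x_{k+1} = x_k - f ’(x_k)/f ’’(x_k) can be sketched as below (my own names; the f ’’ > 0 check at the end is one way to detect the "converged to a maximum or inflection point" wrinkle):

```python
def newton_min(fprime, fsecond, x0, tol=1e-10, max_iter=50):
    """Newton's method applied to f': x <- x - f'(x)/f''(x).

    Converges to a critical point of f -- which may be a minimum,
    maximum, or inflection point -- and only if started close enough.
    """
    x = x0
    for _ in range(max_iter):
        h = -fprime(x) / fsecond(x)
        x += h
        if abs(h) < tol:
            break
    return x

# Example: f(x) = x^4 - 3x^2 + x, so f'(x) = 4x^3 - 6x + 1
fp = lambda x: 4 * x**3 - 6 * x + 1
fpp = lambda x: 12 * x**2 - 6

x_star = newton_min(fp, fpp, 1.0)   # converges to the minimum near x ~ 1.13
print(x_star, fpp(x_star) > 0)      # f'' > 0 confirms it is a minimum
```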