9.1 Line Search Algorithms

Katta G. Murty, IOE 611 Lecture slides

Algorithms for one-dimensional opt. problems of the form: min $f(\lambda)$ over $\lambda \ge 0$ (or $a \le \lambda \le b$ for some $a < b$).

Bracket: An interval in the feasible region which contains the min.

A 2-point bracket is $\lambda_1 < \lambda_2$ where $f'(\lambda_1) < 0$ and $f'(\lambda_2) > 0$.

A 3-point bracket is $\lambda_1 < \lambda_2 < \lambda_3$ s. th. $f(\lambda_2) \le \min\{f(\lambda_1), f(\lambda_3)\}$.

    How to Select An Initial 3-pt. Bracket?

First consider min $f(\lambda)$ over $\lambda \ge 0$, where at $\lambda = 0$ the function decreases as $\lambda$ increases from 0 (otherwise $\lambda = 0$ itself is a local min). Select a step length $\delta$ s. th. $f(\delta) < f(0)$ (possible because of the above facts). Define

$$\lambda_0 = 0, \qquad \lambda_r = \lambda_{r-1} + 2^{r-1}\delta, \quad \text{for } r = 1, 2, \ldots$$

as long as $f(\lambda_r)$ keeps on decreasing, until a value $k$ for $r$ is found for the first time s. th. $f(\lambda_{k+1}) > f(\lambda_k)$.

So we have $f(\lambda_k) < f(\lambda_{k-1})$ also. Among the 4 equally spaced points $\lambda_{k-1}$, $\lambda_k$, $\frac{1}{2}(\lambda_k + \lambda_{k+1})$, $\lambda_{k+1}$, drop either $\lambda_{k-1}$ or $\lambda_{k+1}$, whichever is farther from the point in the pair $\{\lambda_k, \frac{1}{2}(\lambda_k + \lambda_{k+1})\}$ that yields the smallest value for $f(\lambda)$; the remaining 3 pts. form an equi-distant 3-pt. bracket.

If the problem is min $f(\lambda)$ over $a \le \lambda \le b$, it is reasonable to assume that $f(\lambda)$ decreases as $\lambda$ increases through $a$ (otherwise $a$ is a local min). Apply the above procedure choosing $\delta$ sufficiently small, and keep defining $\lambda_r$ as above until either a $k$ defined as above is found, or the upper bound $b$ for $\lambda$ is reached.
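A minimal Python sketch of this doubling scheme (the function name `initial_bracket` and its arguments are illustrative, not from the slides; `b=None` covers the unbounded case $\lambda \ge 0$):

```python
def initial_bracket(f, delta, b=None):
    """Equidistant 3-pt. bracket for min f(lam) over lam >= 0 via step
    doubling lam_r = lam_{r-1} + 2**(r-1)*delta; assumes f(delta) < f(0).
    Returns None if the upper bound b is reached first."""
    lams = [0.0, delta]                  # lam_0 = 0, lam_1 = delta
    vals = [f(0.0), f(delta)]
    step = delta
    while True:
        nxt = lams[-1] + 2 * step        # increments double each iteration
        step *= 2
        if b is not None and nxt > b:
            return None                  # upper bound reached, no bracket
        lams.append(nxt)
        vals.append(f(nxt))
        if vals[-1] > vals[-2]:          # first increase: previous index is k
            break
    lk_prev, lk, lk_next = lams[-3], lams[-2], lams[-1]
    mid = 0.5 * (lk + lk_next)           # these four points are equally spaced
    f_mid = f(mid)
    if vals[-2] <= f_mid:                # lam_k is the better of the pair:
        return (lk_prev, lk, mid)        #   drop lam_{k+1}, farther from lam_k
    return (lk, mid, lk_next)            # else drop lam_{k-1}, farther from mid
```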

Methods Using 3-pt. Brackets

Golden Section Search: Works well when the function is unimodal in the initial bracket (i.e., it has a unique local min there). Quite a reasonable assumption in many applications.

Let $\lambda^*$ be the unknown min pt. Unimodality $\Rightarrow$ if $\lambda_1 < \lambda_2 < \lambda^*$ then $f(\lambda_1) > f(\lambda_2)$, and if $\lambda^* < \lambda_1 < \lambda_2$ then $f(\lambda_2) > f(\lambda_1)$.

General Step: Let $[\alpha, \beta]$ be the interval of the current bracket. $f(\alpha)$, $f(\beta)$ would already have been computed earlier.

$\frac{-1 + \sqrt{5}}{2} \approx 0.618$ is the Golden Ratio. Let $\lambda_1 = \alpha + (1 - 0.618)(\beta - \alpha)$, $\lambda_2 = \alpha + 0.618(\beta - \alpha)$. If $f(\lambda_1)$, $f(\lambda_2)$ are not available, compute them. If

$f(\lambda_1) < f(\lambda_2)$: new bracket is $[\alpha, \lambda_1, \lambda_2]$
$f(\lambda_1) > f(\lambda_2)$: new bracket is $[\lambda_1, \lambda_2, \beta]$
$f(\lambda_1) = f(\lambda_2)$: new bracket interval is $[\lambda_1, \lambda_2]$

Repeat with the new bracket; continue until the bracket length becomes small.
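A Python sketch of repeated application of this step, assuming unimodality on $[a, b]$; the tolerance `tol` and returning the interval midpoint are added conventions. Note that in the first two cases one of the two interior points (and its function value) is reused, so each iteration costs only one new evaluation:

```python
GOLDEN = 0.618  # (sqrt(5) - 1) / 2, the golden ratio

def golden_section(f, a, b, tol=1e-6):
    """Shrink a bracket [a, b] of a unimodal f by golden section search."""
    m1 = a + (1 - GOLDEN) * (b - a)
    m2 = a + GOLDEN * (b - a)
    f1, f2 = f(m1), f(m2)
    while b - a > tol:
        if f1 < f2:                       # new bracket [a, m1, m2]
            b, m2, f2 = m2, m1, f1
            m1 = a + (1 - GOLDEN) * (b - a)
            f1 = f(m1)
        elif f1 > f2:                     # new bracket [m1, m2, b]
            a, m1, f1 = m1, m2, f2
            m2 = a + GOLDEN * (b - a)
            f2 = f(m2)
        else:                             # f1 == f2: bracket shrinks to [m1, m2]
            a, b = m1, m2
            m1 = a + (1 - GOLDEN) * (b - a)
            m2 = a + GOLDEN * (b - a)
            f1, f2 = f(m1), f(m2)
    return 0.5 * (a + b)
```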

Quadratic Fit Line Search Method

General Step: Let $(\lambda_1, \lambda_2, \lambda_3)$ be the current 3-pt. bracket. Fit a quad. approx. $Q(\lambda) = a\lambda^2 + b\lambda + c$ to $f(\lambda)$ using the function values at $\lambda_1, \lambda_2, \lambda_3$. Because of the bracket property, $Q(\lambda)$ will be convex. Its unique min is:

$$\bar{\lambda} = \frac{(\lambda_2^2 - \lambda_3^2)f(\lambda_1) + (\lambda_3^2 - \lambda_1^2)f(\lambda_2) + (\lambda_1^2 - \lambda_2^2)f(\lambda_3)}{2\left[(\lambda_2 - \lambda_3)f(\lambda_1) + (\lambda_3 - \lambda_1)f(\lambda_2) + (\lambda_1 - \lambda_2)f(\lambda_3)\right]}$$

If

$\bar{\lambda} > \lambda_2$ and $f(\bar{\lambda}) > f(\lambda_2)$: new bracket is $[\lambda_1, \lambda_2, \bar{\lambda}]$
$\bar{\lambda} > \lambda_2$ and $f(\bar{\lambda}) < f(\lambda_2)$: new bracket is $[\lambda_2, \bar{\lambda}, \lambda_3]$
$\bar{\lambda} > \lambda_2$ and $f(\bar{\lambda}) = f(\lambda_2)$: new bracket is either of the above.

If $\bar{\lambda} < \lambda_2$, similar to the above.

If $\bar{\lambda} = \lambda_2$, the quad. fit failed to produce a new pt. In this case: if $\lambda_3 - \lambda_1 \le \epsilon$ = positive tolerance, stop with $\lambda_2$ as the best pt.

If $\lambda_3 - \lambda_1 > \epsilon$, define

$$\bar{\lambda} = \begin{cases} \lambda_2 + \epsilon/2 & \text{if } \lambda_2 - \lambda_1 < \lambda_3 - \lambda_2 \\ \lambda_2 - \epsilon/2 & \text{if } \lambda_2 - \lambda_1 > \lambda_3 - \lambda_2 \end{cases}$$

Compute $f(\bar{\lambda})$ and go to the above step again.

Terminate the method when any of $\max\{f(\lambda_1), f(\lambda_3)\} - f(\lambda_2)$, $\lambda_3 - \lambda_1$, or $|f'(\lambda_2)|$ becomes smaller than a tolerance.
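The minimizer formula translates directly into a small helper (a sketch; the argument names are mine, not from the slides):

```python
def quad_fit_min(l1, l2, l3, f1, f2, f3):
    """Minimizer of the quadratic through (l1,f1), (l2,f2), (l3,f3).
    The 3-point bracket property makes the fitted quadratic convex."""
    num = (l2**2 - l3**2) * f1 + (l3**2 - l1**2) * f2 + (l1**2 - l2**2) * f3
    den = 2.0 * ((l2 - l3) * f1 + (l3 - l1) * f2 + (l1 - l2) * f3)
    return num / den
```

For example, for the bracket $(0, 1, 2)$ with values $(5, 1, 3)$, `quad_fit_min(0, 1, 2, 5, 1, 3)` returns $7/6$.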

Methods Using 2-pt. Brackets

The Method of Bisection: Let $[a, b]$ be the current bracket. Compute $f'((a+b)/2)$. If

$f'((a+b)/2) = 0$: $(a+b)/2$ is a stationary pt., terminate
$f'((a+b)/2) > 0$: continue with $[a, (a+b)/2]$
$f'((a+b)/2) < 0$: continue with $[(a+b)/2, b]$

Not preferred, because the method does not use function values.
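A sketch of the bisection loop on the derivative (`df`, `tol` are assumed names; the 2-pt. bracket condition $f'(a) < 0 < f'(b)$ is assumed to hold on entry):

```python
def bisection(df, a, b, tol=1e-8):
    """Bisection on f' over a 2-point bracket [a, b] with df(a) < 0 < df(b)."""
    while b - a > tol:
        m = 0.5 * (a + b)
        d = df(m)
        if d == 0.0:
            return m          # stationary point found exactly
        elif d > 0.0:
            b = m             # continue with [a, m]
        else:
            a = m             # continue with [m, b]
    return 0.5 * (a + b)
```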

Cubic Interpolation Method: Let $[\lambda_1, \lambda_2]$ be the current bracket. Fit a cubic approx. to $f(\lambda)$ using the values of $f(\lambda_1)$, $f(\lambda_2)$, $f'(\lambda_1)$, $f'(\lambda_2)$. Because of the bracket conds., the min of this cubic func. is inside the bracket. It is:

$$\bar{\lambda} = \lambda_1 + (\lambda_2 - \lambda_1)\left[1 - \frac{f'(\lambda_2) + w - z}{f'(\lambda_2) - f'(\lambda_1) + 2w}\right]$$

where $z = \frac{3(f(\lambda_1) - f(\lambda_2))}{\lambda_2 - \lambda_1} + f'(\lambda_1) + f'(\lambda_2)$ and $w = \left(z^2 - f'(\lambda_1)f'(\lambda_2)\right)^{1/2}$.

If $|f'(\bar{\lambda})|$ is small, accept $\bar{\lambda}$ as an approx. to the min. Otherwise, if $f'(\bar{\lambda}) > 0$ ($< 0$), continue with $[\lambda_1, \bar{\lambda}]$ ($[\bar{\lambda}, \lambda_2]$).
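As a sketch, one interpolation step as a Python helper; `g1`, `g2` stand for $f'(\lambda_1)$, $f'(\lambda_2)$ (names assumed). Note $z^2 - g_1 g_2 > 0$ under the bracket condition $g_1 < 0 < g_2$, so the square root is real:

```python
from math import sqrt

def cubic_step(l1, l2, f1, f2, g1, g2):
    """New trial point on the bracket [l1, l2] from the cubic fitted to
    f(l1), f(l2), f'(l1) = g1, f'(l2) = g2, per the formula above."""
    z = 3.0 * (f1 - f2) / (l2 - l1) + g1 + g2
    w = sqrt(z * z - g1 * g2)
    return l1 + (l2 - l1) * (1.0 - (g2 + w - z) / (g2 - g1 + 2.0 * w))
```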

Line Search Method Based on Piecewise Linear Approximation

Let $[a, b]$ be the initial 2-pt. bracket for $f(\lambda)$.

A piecewise linear (or polyhedral) approximation for $f(\lambda)$ in the bracket $[a, b]$ is the pointwise supremum function of the linearizations of $f(\lambda)$ at $a$ & $b$: $P(\lambda) = \max\{f(a) + f'(a)(\lambda - a),\; f(b) + f'(b)(\lambda - b)\}$.

From the bracket conds., the min of $P(\lambda)$ occurs in $[a, b]$ at the point where the two linearizations are equal; it is $a + d_P$ where

$$d_P = \frac{f(a) - f(b) - f'(b)(a - b)}{f'(b) - f'(a)}$$

The quadratic Taylor series approximation of $f(\lambda)$ at $a$ is $Q(\lambda) = f(a) + f'(a)(\lambda - a) + \frac{1}{2}(\lambda - a)^2 f''(a)$. The min of $Q(\lambda)$ occurs at $a + d_Q$ where

$$d_Q = \begin{cases} -f'(a)/f''(a) & \text{if } f''(a) > 0 \\ \infty & \text{if } f''(a) \le 0 \end{cases}$$

One simple strategy is to take the new point to be $\lambda_1 = a + \min\{d_P, d_Q\}$ if $d_Q > 0$, or $a + d_P$ otherwise; and take the next bracket to be the 2-pt. bracket among $[a, \lambda_1]$, $[\lambda_1, b]$, and continue.
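A sketch of computing the new point under this simple strategy; the argument `f2a` for $f''(a)$ and the other names are assumptions:

```python
import math

def pl_step(a, b, fa, fb, ga, gb, f2a=None):
    """New trial point a + min{d_P, d_Q}: d_P from the two linearizations
    of f at a and b (ga = f'(a), gb = f'(b)), d_Q from the quadratic
    Taylor series at a when f''(a) is available."""
    d_p = (fa - fb - gb * (a - b)) / (gb - ga)
    if f2a is not None and f2a > 0:
        d_q = -ga / f2a                 # minimizer step of Q(lam)
    else:
        d_q = math.inf                  # f''(a) <= 0: d_Q = infinity
    if d_q > 0:
        return a + min(d_p, d_q)
    return a + d_p
```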

But C. Lemaréchal & R. Mifflin make several modifications to guarantee convergence to a stationary pt. even when $f(\lambda)$ is nonconvex. Also see Murty LCLNP.

Newton's Method for Line Search

A 2nd order method for: min $f(\lambda)$ over the entire real line. It is an application of the Newton-Raphson method to solve the eq. $f'(\lambda) = 0$.

Starting with an initial point $\lambda_0$, it generates iterates by

$$\lambda_{r+1} = \lambda_r - \frac{f'(\lambda_r)}{f''(\lambda_r)}$$

assuming all $f''(\lambda_r) > 0$. Not suitable if the 2nd derivative is $\le 0$ at a point encountered.

Suitable if a near-opt. sol. is known; then the function is locally convex in the nbhd. of the opt.

Secant Method: A modified Newton's method. The method is initiated with a 2-pt. bracket, and replaces $f''(\lambda_r)$ by the finite difference approx. $\frac{f'(\lambda_r) - f'(\lambda_{r-1})}{\lambda_r - \lambda_{r-1}}$ in the Newton formula.
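A sketch of the secant iteration on $f'(\lambda) = 0$, starting from the two endpoints of a 2-pt. bracket; `tol` and `max_iter` are added safeguards, not from the slides:

```python
def secant(df, l0, l1, tol=1e-8, max_iter=50):
    """Newton's method for f'(lam) = 0 with f'' replaced by the finite
    difference (df(l1) - df(l0)) / (l1 - l0)."""
    for _ in range(max_iter):
        d0, d1 = df(l0), df(l1)
        if abs(d1) < tol or d1 == d0:   # converged, or zero denominator
            return l1
        l0, l1 = l1, l1 - d1 * (l1 - l0) / (d1 - d0)
    return l1
```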

Inexact Line Search Procedures

Exact line searches are expensive in subroutines for solving higher-dimensional min. problems. Inexact line searches with a sufficient degree of descent do guarantee convergence of the overall algo.

Consider: min $f(\lambda) = \theta(x + \lambda y)$ over $\lambda \ge 0$. $x$ is the current point, $y$ is a descent direction at $x$.

Conditions for Global Convergence

1. Sufficient Rate of Descent Condition: If $\lambda$ is the step length chosen, the new pt. is $x + \lambda y$. The cond. is that the average rate of descent from $x$ to $x + \lambda y$ be $\ge$ a specified fraction of the initial rate of decrease in $\theta(x)$ at $x$ in the direction $y$. Select $\alpha \in (0, 1)$ and require $\lambda$ to satisfy:

$$f(\lambda) \le f(0) + \alpha \lambda f'(0)$$

2. Step Length Not Too Small: Cond. 1 will always be satisfied for any $\alpha < 1$ by sufficiently small steps. This condition requires that step lengths not be too small. There are several equivalent ways to ensure this. One is to require that $\lambda$ satisfy:

$$(2.1)\qquad f'(\lambda) \ge \beta f'(0)$$

for some selected $\beta \in (\alpha, 1)$. Says the step must be long enough that the rate of change in $f$ at $\lambda$ is $\ge$ a specified fraction of its magnitude at 0.
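Conds. 1 and (2.1) are known in the literature as the Wolfe conditions. A sketch of a checker, where `f(lam)` evaluates $f(\lambda) = \theta(x + \lambda y)$ and `df` its derivative; the default values of `alpha` and `beta` are common choices, not values from the slides:

```python
def wolfe_ok(f, df, lam, alpha=1e-4, beta=0.9):
    """True iff step length lam satisfies cond. 1 and (2.1).
    Assumes df(0) < 0 (y is a descent direction) and 0 < alpha < beta < 1."""
    suff_descent = f(lam) <= f(0.0) + alpha * lam * df(0.0)   # cond. 1
    not_too_small = df(lam) >= beta * df(0.0)                 # cond. (2.1)
    return suff_descent and not_too_small
```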

Theorem: If $\theta(x)$ is bounded below on $R^n$, $\nabla\theta(x)y < 0$, and $0 < \alpha < \beta < 1$, there exist $0 < \lambda_1 < \lambda_2$ s. th. for any $\lambda \in [\lambda_1, \lambda_2]$, $x + \lambda y$ satisfies both conds. 1 & (2.1).

Theorem: Suppose $\theta(x)$ is cont. diff., and there exists $L \ge 0$ s. th. $\|\nabla\theta(z) - \nabla\theta(x)\| \le L\|z - x\|$ for all $x, z \in R^n$. Starting with $x^0$, suppose the sequence $\{x^r\}$ is generated by choosing $y^r$ satisfying $\nabla\theta(x^r)y^r < 0$ and the iteration $x^{r+1} = x^r + \lambda_r y^r$, where $\lambda_r$ satisfies conds. 1 & (2.1). Then one of the following holds:

(a) $\nabla\theta(x^k) = 0$ for some $k$, or
(b) $\lim_{r\to\infty} \theta(x^r) = -\infty$, or
(c) $\lim_{r\to\infty} \nabla\theta(x^r)y^r / \|y^r\| = 0$.

From (c) we know that unless the angle between $\nabla\theta(x^r)$ & $y^r$ converges to $90°$ as $r \to \infty$, either $\nabla\theta(x^r) \to 0$, or $\theta(x^r) \to -\infty$, or both.

If we can guarantee that the angle between $\nabla\theta(x^r)$ & $y^r$ is bounded away from $90°$ (this can be guaranteed by proper choice of $y^r$; for example, in Quasi-Newton methods $y^r = -H_r \nabla\theta(x^r)^T$ where $H_r$ is PD, and this property will hold if the cond. numbers of $\{H_r\}$ are uniformly bounded above), then either $\nabla\theta(x^r) \to 0$, or $\theta(x^r) \to -\infty$, or both.

Other equivalent conds. to ensure step lengths are not too small include:

$$(2.2)\qquad f(\lambda) \ge f(0) + \beta \lambda f'(0)$$

for some selected $\beta \in (\alpha, 1)$.

A simple procedure to satisfy these conds. uses $0 < \mu_1 < 1 < \mu_2$ ($\mu_1 = 0.2$, $\mu_2 = 2$ commonly used). $\mu_1$ plays the role of $\alpha$ in the above.

Cond. 1 is equivalent to requiring that the step length $\lambda$ satisfy $f(\lambda) \le g(\lambda) = f(0) + \mu_1 \lambda f'(0)$.

Cond. 2 is enforced by requiring $f(\mu_2 \lambda) > g(\mu_2 \lambda) = f(0) + \mu_2 \mu_1 \lambda f'(0)$.

Step 1: See if a step length of 1 satisfies both conds. 1 and 2. If so, select $\lambda = 1$ as the step length and terminate the procedure. If a step length of 1 violates cond. 1 [cond. 2], go to Step 2 [Step 3].

Step 2: Take $\lambda = 1/\mu_2^t$, where $t$ is the smallest integer $\ge 1$ for which $f(1/\mu_2^t) \le g(1/\mu_2^t)$, and terminate the procedure.

Step 3: Take $\lambda = \mu_2^t$, where $t$ is the largest integer $\ge 0$ for which $f(\mu_2^t) \le g(\mu_2^t)$, and terminate the procedure.

Other ways of implementing the procedure use quadratic fits to determine new step lengths when the current step length (step length = 1 initially) violates one of the conds.
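A sketch of this procedure in Python, with $\mu_1 = 0.2$, $\mu_2 = 2$ as suggested above; `f` and `df` are assumed to evaluate $f(\lambda) = \theta(x + \lambda y)$ and $f'(\lambda)$, and the iteration caps are added safeguards:

```python
def simple_step_length(f, df, mu1=0.2, mu2=2.0, max_iter=60):
    """Steps 1-3 above: test lam = 1, then shrink by 1/mu2 (Step 2)
    or grow by mu2 (Step 3) against the test line g."""
    f0, g0 = f(0.0), df(0.0)
    g = lambda lam: f0 + mu1 * lam * g0     # the line f(0) + mu1*lam*f'(0)
    lam = 1.0
    if f(lam) <= g(lam):                    # cond. 1 holds at lam = 1
        for _ in range(max_iter):           # Step 3 (or Step 1 if it exits
            if f(mu2 * lam) > g(mu2 * lam): #   immediately): grow while the
                return lam                  #   doubled step still passes
            lam *= mu2
        return lam
    for _ in range(max_iter):               # Step 2: shrink until cond. 1 holds
        lam /= mu2
        if f(lam) <= g(lam):
            return lam
    return lam
```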
