L2 Static Optimization Unconstrained Numerical



Static optimization: unconstrained problems
Graduate course on Optimal and Robust Control (Spring 12)

    Zdenek Hurak

Department of Control Engineering
Faculty of Electrical Engineering

    Czech Technical University in Prague

    February 19, 2013


    Lecture outline

Derivative-free optimization
  Nelder-Mead simplex method

Derivative-based optimization
  Line search methods
    Methods for line search (step length)
    Methods for descent direction search
  Trust region methods

    Numerical algorithms for unconstrained optimization

    The key classification

    Methods based on derivatives

    Derivative-free methods (Nelder-Mead)


Derivative-free methods: Nelder-Mead simplex method

    Not to be confused with simplex method in linear programming!


    fminsearch() in Matlab
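A minimal usage sketch; the Rosenbrock test function and starting point are illustrative choices, not from the slides:

```matlab
% Nelder-Mead simplex search via fminsearch, a minimal usage sketch.
rosen = @(x) 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;   % Rosenbrock test function
x0 = [-1.2, 1];                                      % starting point
[xmin, fval] = fminsearch(rosen, x0)                 % derivative-free minimization
```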



    Derivative-based methods

    Line search methods

    Trust region methods


    Line search methods

1. descent direction search ... $d_k$

2. line search (step length determination) ... $\alpha_k$

$x_{k+1} = x_k + \alpha_k d_k$
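A minimal sketch of this two-phase structure (not from the slides); the direction and step-length rules used here, steepest descent with crude step halving, are placeholder choices for the methods discussed on the following slides:

```matlab
% Generic line-search descent loop, a minimal sketch.
% The objective, its gradient, and the step rule are illustrative choices.
f = @(x) 0.5*x'*[3 1; 1 2]*x;        % example cost
g = @(x) [3 1; 1 2]*x;               % its gradient

x = [10; -5];
for k = 1:100
    d = -g(x);                       % 1. descent direction (here: steepest descent)
    if norm(d) < 1e-8, break; end    % stationary point reached
    alpha = 1;                       % 2. step length (here: halve until decrease)
    while f(x + alpha*d) >= f(x)
        alpha = alpha/2;
    end
    x = x + alpha*d;                 % x_{k+1} = x_k + alpha_k * d_k
end
x                                    % should approach [0; 0]
```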


    Methods for line search

    1. Fibonacci, golden section

    2. Bisection

    3. Newton

    4. Inexact line search


Fibonacci search

Fibonacci sequence: 1, 1, 2, 3, 5, 8, 13, ...

Fix the number of intervals at the beginning. Say, 13:

[Figure: the interval subdivided at the Fibonacci points 1, 2, 3, 5, 8, 13 along the x axis, with f(x) sketched above.]

Start by evaluating f(x) at x = 5 and x = 8. In this example 4 evaluations are needed (13 is the chosen Fibonacci number $F_n$). In general, $n - 2$ steps are needed and the final uncertainty is $(b - a)/F_n$; the per-step improvement in the uncertainty is $F_{n-1}/F_n$, with

$\lim_{n\to\infty} F_{n-1}/F_n = \dfrac{1}{(1+\sqrt{5})/2} \approx 0.618.$


    Golden section search

$d_{k+1}/d_k \approx 0.618$

[Figure: the interval [a, b] with the two interior points x1 and x2 placed in golden-section proportions, and f(x) sketched above.]
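A minimal sketch of the golden section search on [a, b]; the unimodal test function and the tolerance are illustrative choices, not from the slides:

```matlab
% Golden section search for a unimodal f on [a, b], a minimal sketch.
f = @(x) (x - 0.3).^2 + 0.1;      % illustrative unimodal test function
a = 0; b = 1; tol = 1e-6;
rho = (sqrt(5) - 1)/2;            % 0.618..., the golden-section ratio

x1 = b - rho*(b - a); f1 = f(x1);
x2 = a + rho*(b - a); f2 = f(x2);
while (b - a) > tol
    if f1 < f2                    % minimum lies in [a, x2]
        b = x2;  x2 = x1;  f2 = f1;
        x1 = b - rho*(b - a);  f1 = f(x1);
    else                          % minimum lies in [x1, b]
        a = x1;  x1 = x2;  f1 = f2;
        x2 = a + rho*(b - a);  f2 = f(x2);
    end
end
xmin = (a + b)/2                  % approximately 0.3
```

Only one new function evaluation is needed per iteration, because one of the two interior points is reused.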


Speed of convergence: order of convergence

The sequence $\{r_k\}$ converges to $r^*$ with order $p$ if

$0 \le \limsup_{k\to\infty} \dfrac{|r_{k+1} - r^*|}{|r_k - r^*|^p} < \infty.$

Examples:

$r_k = a^k$, $0 < a < 1$ (order 1)

$r_k = a^{(2^k)}$, $0 < a < 1$ (order 2)

Linear convergence

$\lim_{k\to\infty} \dfrac{|r_{k+1} - r^*|}{|r_k - r^*|} = \beta < 1$

The error then behaves like the geometric sequence $r_k = c\beta^k$. Two linearly converging algorithms can be compared on the basis of their convergence ratios $\beta$.

For $\beta = 0$: superlinear convergence. For $\beta = 1$: sublinear convergence. Example: $r_k = 1/k$.

    Bisection method

[Figure: the interval [a, b] with the midpoint x1, and f(x) sketched above.]


Newton's method for line search

Approximate the function by a parabola (use $f(x_k)$, $f'(x_k)$ and $f''(x_k)$):

$q(x) = f(x_k) + f'(x_k)(x - x_k) + \frac{1}{2} f''(x_k)(x - x_k)^2$

Finding the minimum of the approximating function can be done analytically:

$0 = q'(x) = f'(x_k) + f''(x_k)(x - x_k)$

$x_{k+1} = x_k - \dfrac{f'(x_k)}{f''(x_k)}$
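A minimal sketch of the iteration; the scalar test function and its derivatives are illustrative choices, not from the slides:

```matlab
% Newton's method for 1-D minimization, a minimal sketch.
f   = @(x) x.^4/4 - x;        % illustrative objective, minimizer at x = 1
fp  = @(x) x.^3 - 1;          % f'
fpp = @(x) 3*x.^2;            % f''

x = 2;                        % initial guess
for k = 1:20
    step = fp(x)/fpp(x);
    x = x - step;             % x_{k+1} = x_k - f'(x_k)/f''(x_k)
    if abs(step) < 1e-10, break; end
end
fprintf('minimizer ~ %.6f (exact: 1)\n', x);
```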


Newton's method for line search

[Figure: f(x) on the interval [0, 1], together with the local quadratic model $f(x_0) + f'(x_0)(x - x_0) + \frac{1}{2} f''(x_0)(x - x_0)^2$ built around the point $x_0$.]

Another look at Newton's method: equation solving

Solving $g(x) = 0$.

[Figure: graph of g(x); the tangent at $x_k$ with slope $g'(x_k)$ intersects the x axis at $x_{k+1}$, closer to the root $x^*$.]

$x_{k+1} = x_k - \dfrac{g(x_k)}{g'(x_k)}$

Quadratic convergence of Newton's method

Let's stay with the equation-solving formulation. Using $g(x^*) = 0$ and a second-order Taylor expansion of $g$ around $x_k$ (with $\xi$ between $x_k$ and $x^*$):

$x_{k+1} - x^* = x_k - x^* - \dfrac{g(x_k) - g(x^*)}{g'(x_k)} = -\dfrac{g(x^*) - g(x_k) - g'(x_k)(x^* - x_k)}{g'(x_k)} = \dfrac{1}{2}\,\dfrac{g''(\xi)}{g'(x_k)}\,(x_k - x^*)^2$

Hence, with $|g''| \le k_1$ and $|g'| \ge k_2$ in a neighbourhood of $x^*$,

$|x_{k+1} - x^*| \le \dfrac{k_1}{2 k_2}\, |x_k - x^*|^2.$


    Methods for descent direction search

    1. steepest descent

    2. Newton

3. Quasi-Newton

4. Conjugate direction / conjugate gradient methods (CG)


    Steepest descent

Condition for a descent direction:

$d_k^T \nabla f(x_k) < 0$

Recall the geometric interpretation of an inner product:

$d_k^T \nabla f(x_k) = \|d_k\|\,\|\nabla f(x_k)\| \cos\theta$

The steepest descent method:

$x_{k+1} = x_k - \alpha_k \nabla f(x_k)$

Steepest descent applied to a quadratic cost

$f(x) = \frac{1}{2} x^T Q x - b^T x$

Find the $\alpha_k$ that minimizes $f(x_k - \alpha_k \nabla f_k)$:

$f(x_k - \alpha_k \nabla f_k) = \frac{1}{2}(x_k - \alpha_k \nabla f_k)^T Q (x_k - \alpha_k \nabla f_k) - b^T (x_k - \alpha_k \nabla f_k)$

Using the gradient $\nabla f(x) = Q x - b$, differentiation with respect to $\alpha_k$ gives

$\alpha_k = \dfrac{\nabla f_k^T \nabla f_k}{\nabla f_k^T Q \nabla f_k}$

Hence the steepest descent method is

$x_{k+1} = x_k - \dfrac{\nabla f_k^T \nabla f_k}{\nabla f_k^T Q \nabla f_k}\, \nabla f_k$
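A minimal sketch with this exact step length on an ill-conditioned quadratic (the matrix Q, the vector b and the starting point are illustrative choices); on such problems the iterates zigzag, as the next slide illustrates:

```matlab
% Steepest descent with the exact step length for f(x) = 0.5*x'*Q*x - b'*x.
% Q and b are illustrative; the eigenvalue spread of Q causes zigzagging.
Q = diag([1, 25]);  b = [1; 1];
x = [10; 10];
for k = 1:50
    g = Q*x - b;                   % gradient
    if norm(g) < 1e-10, break; end
    alpha = (g'*g)/(g'*Q*g);       % exact minimizing step length
    x = x - alpha*g;               % steepest descent step
end
x_star = Q\b;                      % exact minimizer for comparison
fprintf('error after %d steps: %.2e\n', k, norm(x - x_star));
```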


    Zigzagging of the steepest descent method

[Figure: contour plot of the quadratic cost (level curves from 2500 to 22500) over $x_1, x_2 \in [-100, 100]$; the steepest descent iterates zigzag between the contour lines on their way to the minimum.]

    Poor convergence rate depending on the scaling.



Newton's search (also Newton-Raphson)

Idea: the function to be minimized is approximated locally by a quadratic function, and this approximating function is minimized exactly.

$x_{k+1} = x_k - [\underbrace{\nabla^2 f(x_k)}_{\text{Hessian}}]^{-1} \underbrace{\nabla f(x_k)}_{\text{gradient}}$

    Local convergence guaranteed but not global!


    Solving symmetric positive definite linear equations

Solve

$A x = b$

with $A$ symmetric positive definite, using the Cholesky factorization

$A = R R^T$, with $R$ triangular.
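A minimal sketch in Matlab (the matrix and right-hand side are illustrative); note that Matlab's chol() returns an upper triangular R with A = R'*R, i.e., the transposed convention:

```matlab
% Solving a symmetric positive definite system via Cholesky, a minimal sketch.
A = [4 1 0; 1 3 1; 0 1 2];   % illustrative SPD matrix
b = [1; 2; 3];

R = chol(A);                 % upper triangular factor, A = R'*R
x = R \ (R' \ b);            % two triangular solves

norm(A*x - b)                % residual check, close to machine precision
```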


Modifications of Newton's search: damping

A search parameter $\alpha_k$ is introduced:

$x_{k+1} = x_k - \alpha_k [\nabla^2 f(x_k)]^{-1} \nabla f(x_k)$

Modifications of Newton's search: positive definiteness

A positive definite matrix $M_k$ is used instead of the Hessian:

$x_{k+1} = x_k - \alpha_k M_k^{-1} \nabla f(x_k)$

Interpretation in the scalar nonlinear-equation case:

[Figure: graph of g(x) versus x illustrating the scalar case.]

Another approach:

$B_k = \nabla^2 f(x_k) + E_k$,

where $E_k = 0$ if $\nabla^2 f(x_k)$ is sufficiently positive definite, and otherwise $E_k$ is chosen so that $B_k > 0$.


Quasi-Newton

From the definition of the Hessian,

$\nabla^2 f_k \underbrace{(x_{k+1} - x_k)}_{s_k} \approx \underbrace{\nabla f_{k+1} - \nabla f_k}_{y_k}$

Find a matrix $B_{k+1}$ that mimics this behaviour of the Hessian:

$B_{k+1} s_k = y_k$

Typically two further requirements:

symmetry (as for the Hessian)

a low-rank update between the steps

Popular updates in quasi-Newton methods

Symmetric rank-one (SR1):

$B_{k+1} = B_k + \dfrac{(y_k - B_k s_k)(y_k - B_k s_k)^T}{(y_k - B_k s_k)^T s_k}$

BFGS:

$B_{k+1} = B_k - \dfrac{B_k s_k s_k^T B_k}{s_k^T B_k s_k} + \dfrac{y_k y_k^T}{y_k^T s_k}$

As the inverse of $B_k$ is needed, the update can be applied to the inverse directly: DFP (Davidon, Fletcher and Powell).

Other updates keep the Hessian in factored form:

$H = R R^T$ ... Cholesky factorization

Matlab chol() function
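A minimal sketch of the BFGS update formula above; the data for $B_k$, $s_k$ and $y_k$ are illustrative placeholders, not from the slides:

```matlab
% BFGS update of a Hessian approximation, a minimal sketch.
n  = 4;
Bk = eye(n);                       % current Hessian approximation (SPD)
sk = randn(n,1);                   % step s_k = x_{k+1} - x_k
yk = Bk*sk + 0.1*randn(n,1);       % gradient difference y_k (made roughly consistent)

Bk1 = Bk - (Bk*(sk*sk')*Bk)/(sk'*Bk*sk) + (yk*yk')/(yk'*sk);

norm(Bk1*sk - yk)                  % secant condition B_{k+1} s_k = y_k, ~ 0
```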


    Conjugate gradient directions


    Inexact line search

    Armijo

    Goldstein

    Wolfe



Intuitive approach: step size reduction

Start with an initial step size $s$. If the corresponding vector $x_k + s d$ does not yield an improved (smaller) value of $f(\cdot)$, that is, if

$f(x_k + s d) \ge f(x_k),$

reduce the step size, possibly by a fixed factor. Repeat.

However, convergence to a minimum is not guaranteed:

$f(x) = \begin{cases} \dfrac{(1-x)^2}{4} - 2(1-x) & \text{if } x > 1,\\ \dfrac{(1+x)^2}{4} - 2(1+x) & \text{if } x < -1,\\ x^2 - 1 & \text{if } -1 \le x \le 1. \end{cases}$

[Figure: plot of f(x) for x in [-2, 2], with f ranging roughly from -1 to 2.5.]

$x_{k+1} = x_k - 1 \cdot \nabla f(x_k)$

Armijo's condition
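A minimal sketch of a backtracking step-length routine enforcing the standard Armijo sufficient-decrease condition $f(x_k + \alpha d) \le f(x_k) + c\,\alpha\,\nabla f(x_k)^T d$ with $0 < c < 1$; the constants and test data are illustrative choices, not from the slides:

```matlab
% Backtracking line search with the Armijo sufficient-decrease condition.
% f, g, x, d and the constants c, beta are illustrative choices.
f = @(x) 0.5*x'*[3 1; 1 2]*x;        % example cost
g = @(x) [3 1; 1 2]*x;               % its gradient

x = [10; -5];
d = -g(x);                           % a descent direction
c = 1e-4;  beta = 0.5;               % Armijo constant and shrinking factor

alpha = 1;                           % start from the full step
while f(x + alpha*d) > f(x) + c*alpha*(g(x)'*d)
    alpha = beta*alpha;              % shrink until sufficient decrease holds
end
alpha                                % accepted step length
```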


Goldstein's condition



Wolfe's condition


Termination conditions


Trust region methods

Recall

$f(x_k + p) \approx f(x_k) + \nabla f^T(x_k)\, p + \frac{1}{2} p^T \nabla^2 f(x_k)\, p$

We seek the minimum of the quadratic model function

$m_k(p) = f(x_k) + \nabla f^T(x_k)\, p + \frac{1}{2} p^T B_k p$

subject to

$\|p\| \le \Delta_k.$

For $B_k = \nabla^2 f(x_k)$: the trust-region Newton method.
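The constrained model minimization is usually solved only approximately. As one minimal illustration (an assumption of this sketch, not a method stated on the slides), the Cauchy point minimizes $m_k$ along the steepest-descent direction inside the trust region:

```matlab
% Cauchy point for the trust-region subproblem, a minimal illustrative sketch.
% The gradient g, model Hessian B and radius Delta are hypothetical data.
g     = [3; -1];              % gradient at x_k
B     = [2 0; 0 10];          % model Hessian B_k
Delta = 0.5;                  % trust-region radius

gBg = g'*B*g;
if gBg <= 0
    tau = 1;                  % model decreases all the way to the boundary
else
    tau = min(norm(g)^3/(Delta*gBg), 1);
end
pC = -tau*(Delta/norm(g))*g;  % Cauchy step, with norm(pC) <= Delta
```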


    Trust region methods

[Figure: contours of f(x) and of the model $m_k$, the trust region around $x_k$, the line search direction, and the trust-region step.]


    Software

1. Optimization Toolbox for Matlab: fminunc() (trust-region Newton), fminsearch() (Nelder-Mead simplex); a usage sketch follows below.

2. UnconstrainedProblems package for Mathematica: FindMinimumPlot, FindMinimum
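A minimal usage sketch for fminunc; the Rosenbrock test problem is an illustrative choice, and the option names correspond to recent releases of the Optimization Toolbox (an assumption of this sketch):

```matlab
% Minimal fminunc usage sketch: trust-region algorithm with a user-supplied gradient.
x0   = [-1.2; 1];
opts = optimoptions('fminunc', ...
    'Algorithm', 'trust-region', ...              % requires the gradient
    'SpecifyObjectiveGradient', true);
[xmin, fmin] = fminunc(@rosenbrockWithGrad, x0, opts)

function [f, g] = rosenbrockWithGrad(x)
% Rosenbrock test function and its gradient.
f = 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;
g = [-400*(x(2) - x(1)^2)*x(1) - 2*(1 - x(1));
      200*(x(2) - x(1)^2)];
end
```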


    Summary

line search methods: direction search, step length determination, Newton methods, quasi-Newton.

    trust region methods.
