
    Static optimization: unconstrained problems

    Graduate course on Optimal and Robust Control (Spring 2012)

    Zdenek Hurak

    Department of Control Engineering
    Faculty of Electrical Engineering
    Czech Technical University in Prague

    February 17, 2013

    Lecture outline

    General optimization problem

    Classes of optimization problems

    Optimization without constraints


    General optimization problem

    minimize   f(x)

    subject to x ∈ ℝⁿ,
               h_eq(x) = 0,
               h_ineq(x) ≤ 0.

    Note

    max f(x) = −min(−f(x))

    Classes of optimization problems

    Linear programming

    Quadratic programming

    Semidefinite programming . . .

    (General) nonlinear programming



    Nonlinear optimization without constraints

    min f(x),   x ∈ ℝⁿ.

    [Figure: surface plot of a cost function of two variables x, y with value z = f(x, y).]

    Optimization without constraints: the scalar case

    min f(x),   x ∈ ℝ.

    [Figure: plot of a scalar function y(x) with several local minima and maxima on the interval [0, 10].]

    Local minimum at x* if f(x) ≥ f(x*) in an ε-neighbourhood of x*.

    Local maximum at x* if f(x) ≤ f(x*) in an ε-neighbourhood of x*.

    Assumptions

    real variables (no integers)

    smoothness (at least first and second derivatives)

    convexity


    Taylor approximation of the function around the minimum

    f(x* + ε) = f(x*) + f'(x*)ε + (1/2)f''(x*)ε² + O(ε³)

    or

    f(x* + ε) = f(x*) + f'(x*)ε + (1/2)f''(x*)ε² + o(ε²)

    Big-O and little-o concepts:

    lim(ε→0) O(ε³)/ε³ ≤ M < ∞,   lim(ε→0) o(ε²)/ε² = 0
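    As a quick numerical illustration of the little-o property of the remainder (a minimal sketch, not from the slides; the choice f(x) = cos(x) with its local minimum at x* = π is an assumed example):

    % Sketch: the second-order Taylor remainder at a minimum is o(eps^2).
    % Assumed example: f(x) = cos(x), local minimum at x* = pi.
    f   = @(x) cos(x);
    fp  = @(x) -sin(x);                 % f'
    fpp = @(x) -cos(x);                 % f''; equals 1 > 0 at x* = pi
    xs  = pi;
    e   = logspace(-1, -6, 6);          % shrinking perturbations eps
    r   = f(xs + e) - (f(xs) + fp(xs)*e + 0.5*fpp(xs)*e.^2);
    disp(r ./ e.^2)                     % tends to 0, i.e. r = o(eps^2)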


    First-order necessary conditions

    Taylor approximation of the increment in the cost function

    Δf = f(x* + ε) − f(x*) = f'(x*)ε + o(ε)

    The classical necessary condition on the first derivative at the critical point:

    f'(x*) = 0

    Recall this is just necessary, not sufficient!
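    A standard illustration of "necessary, not sufficient" (an assumed example, not on the slide) is f(x) = x³: the first derivative vanishes at x = 0, yet there is no extremum there. A minimal symbolic check in MATLAB:

    % Sketch: a critical point that is not an extremum (assumed f = x^3).
    syms x
    f = x^3;
    solve(diff(f, x) == 0, x)     % the only critical point is x = 0
    subs(diff(f, x, 2), x, 0)     % f''(0) = 0, so the second-order test is silent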

    Second-order necessary conditions for minimum

    Higher-order Taylor approximation of the increment in the cost function

    Δf = f(x* + ε) − f(x*) = f'(x*)ε + (1/2)f''(x*)ε² + o(ε²)

    The classical necessary condition on the second derivative at the critical point (knowing that the first derivative vanishes):

    f''(x*) ≥ 0

    Proof sketch: if f''(x*) < 0, there is some δ such that for |ε| < δ

    (1/2)|f''(x*)|ε² > |o(ε²)|,

    so Δf < 0, contradicting the minimality of x*.

    Second-order sufficient conditions for minimum

    f'(x*) = 0,   f''(x*) > 0

    Note this is just sufficient, not necessary! If f''(x*) = 0, higher-order terms need to be investigated.

    [Figure: plot of y(x) = x⁴ on x ∈ [−1, 1]; the minimum at x = 0 has f''(0) = 0.]
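    For the y(x) = x⁴ case above, one possible symbolic check (a sketch of how higher-order terms settle the question):

    % Sketch: f''(0) = 0 is inconclusive; the first nonvanishing derivative
    % at x = 0 is of even order and positive, so x = 0 is a minimum.
    syms x
    f = x^4;
    subs(diff(f, x, 2), x, 0)     % 0  -> second-order test fails
    subs(diff(f, x, 3), x, 0)     % 0
    subs(diff(f, x, 4), x, 0)     % 24 > 0 -> local minimum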

    What if the cost function is a function of several variables?

    Pick an arbitrary vector d and define

    g(α) := f(x* + αd)

    Taylor expansion:

    g(α) = g(0) + g'(0)α + o(α)

    First-order necessary condition:

    g'(0) = 0


    Back to f(·) using the chain rule:

    g'(α) = (∇f(x* + αd))ᵀ d,   where ∇f is the gradient.

    Setting α = 0, the first-order necessary condition becomes

    ∇f(x*) = 0

    (Some regard the gradient as a column vector, some as a row vector, some do not care.)

    Second-order necessary conditions of minimum

    Again, back from g(α) to f(·):

    g'(α) = Σ_{i=1..n} ∂f(x* + αd)/∂x_i · d_i

    g''(α) = Σ_{i,j=1..n} ∂²f(x* + αd)/∂x_i ∂x_j · d_i d_j

    g''(0) = Σ_{i,j=1..n} ∂²f(x*)/∂x_i ∂x_j · d_i d_j = dᵀ ∇²f(x*) d,   where ∇²f is the Hessian.

    ∇²f(x*) ⪰ 0   (positive semidefinite)
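    To make the gradient/Hessian conditions concrete, a small symbolic sketch (the cost f = x1² + x1·x2 + 2·x2² is an assumed example; gradient and hessian are Symbolic Math Toolbox functions):

    % Sketch: first- and second-order conditions for an assumed sample cost.
    syms x1 x2
    f = x1^2 + x1*x2 + 2*x2^2;
    g = gradient(f, [x1 x2])            % vanishes at the stationary point
    H = hessian(f, [x1 x2])             % constant here: [2 1; 1 4]
    s = solve(g(1) == 0, g(2) == 0)     % stationary point x* = (0, 0)
    eig(double(H))                      % both eigenvalues > 0 -> minimum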

    Second-order sufficient conditions of minimum

    f(x* + αd) = f(x*) + α(∇f(x*))ᵀ d + (1/2)α² dᵀ∇²f(x*) d + o(α²)

    ∇f(x*) = 0,   ∇²f(x*) > 0   (positive definite)

    Extending the argument to arbitrary direction d

    For every direction d (say, of unit length) there is a corresponding δ(d) such that for |α| < δ(d)

    (1/2)α² dᵀ∇²f(x*) d > |o(α²)|.

    If we can prove that a minimum of δ(d) over all d exists, then x* is a local minimum.

    Theorem (Weierstrass)

    A continuous function achieves a minimum on a compact set.

    Compact set = closed and bounded for finite-dimensional spaces.



    What the previous step does not say

    "If at a given stationary point x* and for an arbitrary direction d the one-variable function g(α) achieves a local minimum, it can be concluded that the original function f(x) achieves a local minimum."

    NO! See

    f(x, y) = (x − y²)(x − 3y²)

    [Figure: surface plot of f(x, y) over x, y ∈ [−0.5, 0.5].]
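    A quick symbolic check of this counterexample (a sketch; the line parametrization and the curve x = 2y² are added for illustration): along every line through the origin the restriction g has a local minimum, yet f is negative on the curve x = 2y² arbitrarily close to the origin.

    % Sketch: f has a local min along every line through 0, yet no local min at 0.
    syms t d1 d2
    f = @(x, y) (x - y^2)*(x - 3*y^2);
    g = expand(f(t*d1, t*d2))       % restriction of f to the line t*(d1, d2)
    subs(diff(g, t), t, 0)          % g'(0)  = 0
    subs(diff(g, t, 2), t, 0)       % g''(0) = 2*d1^2 >= 0  (for d1 = 0, g = 3*t^4*d2^4)
    simplify(f(2*t^2, t))           % = -t^4 < 0 along the curve x = 2*y^2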

    Alternative development of necessary and sufficient conditions

    f(x* + d) = f(x*) + (∇f(x*))ᵀ d + (1/2) dᵀ∇²f(x*) d + o(‖d‖²)

    Fréchet (above) vs. Gâteaux (before) derivative.

    Classification of stationary (critical) points

    ∇f(x*) = 0

    ∇²f(x*) > 0: minimum

    ∇²f(x*) indefinite: saddle point

    ∇²f(x*) singular: singular point (higher-order terms must decide)

    Quadratic surfaces

    f(x) = (1/2) xᵀ Q x + bᵀ x,   Q = [q11 q12; q21 q22],   b = [b1; b2]

    ∇f(x) = Qx + b

    First-order necessary condition for the stationary point:

    x* = −Q⁻¹ b

    Hessian: ∇²f(x) = Q
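    A sketch of this recipe in MATLAB (Q and b below are assumed example data, not from the slides):

    % Sketch: stationary point of a quadratic cost f = (1/2)x'Qx + b'x.
    Q = [3 1; 1 2];                 % assumed positive definite example
    b = [1; -1];
    xstar = -(Q\b)                  % x* = -inv(Q)*b
    eig(Q)                          % all eigenvalues > 0 -> x* is the minimum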


    Example - minimum of a quadratic function

    f(x) = (1/2) xᵀ [1 1; 1 2] x + [0 1] x

    [Figure: contour plot of f over x1, x2 ∈ [−10, 10]; elliptic level curves (levels 0 to 200) centered at the minimum.]
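    The numbers on this slide can be checked directly (a minimal sketch, reading the data as Q = [1 1; 1 2] and b = [0; 1]):

    % Sketch: verify the minimum example.
    Q = [1 1; 1 2];  b = [0; 1];
    eig(Q)                          % approx. 0.38 and 2.62, both > 0 -> minimum
    xstar = -(Q\b)                  % stationary point x* = [1; -1]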

    Example - saddle point of a quadratic function

    f(x) = (1/2) xᵀ [1 1; 1 −2] x + [0 1] x

    [Figure: contour plot of f over x1, x2 ∈ [−10, 10]; hyperbolic level curves around the saddle point.]
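    Similarly for the saddle case (a sketch, assuming the reconstructed Q = [1 1; 1 -2] and b = [0; 1]):

    % Sketch: verify the saddle example.
    Q = [1 1; 1 -2];  b = [0; 1];
    eig(Q)                          % approx. -2.30 and 1.30 -> indefinite, saddle
    xstar = -(Q\b)                  % stationary point exists since Q is nonsingular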

    Example - singular point

    f(x) = (x1 − x2²)(x1 − 3x2²)

    [Figure: contour plot over u1, u2 ∈ [−0.5, 0.5] and the corresponding surface plot L(u).]

    syms x1 x2
    f = (x1 - x2^2)*(x1 - 3*x2^2);
    fx = simplify([diff(f, x1); diff(f, x2)])   % gradient of f
    fxx = [diff(fx, x1), diff(fx, x2)]          % Hessian of f
    fxxx(:,:,1) = diff(fxx, x1);                % third-order partial derivatives
    fxxx(:,:,2) = diff(fxx, x2)

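    Continuing the script above (a one-line addition, not on the slide), evaluating the Hessian at the origin shows why this stationary point is singular:

    subs(fxx, [x1 x2], [0 0])       % = [2 0; 0 0], singular -> the test is inconclusive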

    Summary

    necessary and sufficient conditions of optimality (gradient, Hessian)

    classification of stationary points: minimum/maximum, saddle point, singular point
