Static optimization: unconstrained problems
Graduate course on Optimal and Robust Control (spring12)
Zdenek Hurak
Department of Control Engineering
Faculty of Electrical Engineering
Czech Technical University in Prague
February 19, 2013
Lecture outline
Derivative-free optimization
  - Nelder-Mead simplex method
Derivative-based optimization
  - Line search methods
    - Methods for line search (step length)
    - Methods for descent direction search
  - Trust region methods
Numerical algorithms for unconstrained optimization
The key classification:
- methods based on derivatives
- derivative-free methods (Nelder-Mead)
Derivative-free methods: Nelder-Mead simplex method
Not to be confused with the simplex method in linear programming!
[Figure: evolution of the simplex through steps 1, 2, 3]
fminsearch() in Matlab
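A minimal usage sketch (the Rosenbrock test function and starting point below are illustrative choices, not from the slides):

    % Minimize the Rosenbrock function with the Nelder-Mead simplex method.
    f = @(x) 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;  % classic test function
    x0 = [-1.2; 1];                                 % common starting point
    [xmin, fval] = fminsearch(f, x0);               % derivative-free minimization
    % xmin is close to [1; 1], the global minimizer.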
Derivative-based methods
- Line search methods
- Trust region methods
Line search methods
1. descent direction search ... $d_k$
2. line search (step length determination) ... $\alpha_k$
$$x_{k+1} = x_k + \alpha_k d_k$$
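The two steps combine into the generic loop sketched below (descent_direction and step_length are hypothetical placeholders for any of the methods that follow):

    % Generic line-search loop: pick a direction, then a step length.
    x = x0;
    for k = 1:maxit
        d = descent_direction(f, x);        % e.g. -gradient, Newton, quasi-Newton
        alpha = step_length(f, x, d);       % e.g. golden section, Armijo
        x = x + alpha*d;
        if norm(alpha*d) < tol, break; end  % a simple terminal condition
    end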
Methods for line search
1. Fibonacci, golden section
2. Bisection
3. Newton
4. Inexact line search
Fibonacci search
Fibonacci sequence: 1, 1, 2, 3, 5, 8, 13, ...
Fix the number of intervals at the beginning. Say, 13:
[Figure: interval subdivided at the Fibonacci points 1, 2, 3, 5, 8, 13 on the x axis, with f(x) above]
Start by evaluating f(x) at x = 5 and x = 8. Need 4 evaluations (13 is the n = 4th Fibonacci number in this counting). In general: n − 2 steps, with the uncertainty reduced to $(b - a)/F_n$; the improvement in the uncertainty per step is $F_{n-1}/F_n$.
$$\lim_{n\to\infty} F_{n-1}/F_n = \frac{1}{(1+\sqrt{5})/2} \approx 0.618$$
Golden section search
$$d_{k+1}/d_k \approx 0.618$$
[Figure: bracket [a, b] on the x axis with interior evaluation points x_1, x_2 of f(x)]
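A minimal implementation sketch (the bracket [a, b] and the tolerance tol are assumed inputs):

    % Golden section search for a minimum of f on [a, b].
    function xmin = golden_section(f, a, b, tol)
        r = (sqrt(5) - 1)/2;                    % 0.618..., so that r^2 = 1 - r
        x1 = b - r*(b - a);  x2 = a + r*(b - a);
        f1 = f(x1);          f2 = f(x2);
        while (b - a) > tol
            if f1 < f2                          % minimizer is in [a, x2]
                b = x2;  x2 = x1;  f2 = f1;
                x1 = b - r*(b - a);  f1 = f(x1);
            else                                % minimizer is in [x1, b]
                a = x1;  x1 = x2;  f1 = f2;
                x2 = a + r*(b - a);  f2 = f(x2);
            end
        end
        xmin = (a + b)/2;
    end

Only one new function evaluation is needed per iteration, because one interior point is reused; this reuse is exactly what the ratio 0.618 buys.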
Speed of convergence: order of convergence
Order p of convergence of the sequence $\{r_k\}$ to $r^*$:
$$0 \le \limsup_{k\to\infty} \frac{|r_{k+1} - r^*|}{|r_k - r^*|^p} < \infty$$
Examples:
- $r_k = a^k$, $0 < a < 1$ (order 1)
- $r_k = a^{(2^k)}$, $0 < a < 1$ (order 2)
Linear convergence
$$\lim_{k\to\infty} \frac{|r_{k+1} - r^*|}{|r_k - r^*|} = \beta < 1$$
Geometric series: $r_k = c\,\beta^k$.
Two linearly converging algorithms can be compared through their convergence ratios $\beta$.
For $\beta = 0$: superlinear convergence. For $\beta = 1$: sublinear convergence. Example: $r_k = 1/k$.
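A quick numerical check of the two example sequences from the previous slide (the choice a = 0.5 is arbitrary):

    % Empirical convergence ratios: a^k is linear, a^(2^k) is quadratic.
    a = 0.5;  k = (1:6)';
    lin  = a.^k;                              % linearly converging sequence
    quad = a.^(2.^k);                         % quadratically converging sequence
    disp(lin(2:end)./lin(1:end-1))            % tends to a (= beta, order p = 1)
    disp(quad(2:end)./quad(1:end-1).^2)       % stays at 1 (order p = 2)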
Bisection method
[Figure: bracket [a, b] of f(x) halved at the midpoint x_1]
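For minimization, the bisection is run on the sign of the derivative; a sketch (the derivative handle df is an assumed input, with df(a) < 0 < df(b)):

    % Bisection on f' to locate a stationary point inside [a, b].
    function xmin = bisection(df, a, b, tol)
        while (b - a) > tol
            m = (a + b)/2;
            if df(m) > 0
                b = m;          % minimizer lies to the left of m
            else
                a = m;          % minimizer lies to the right of m
            end
        end
        xmin = (a + b)/2;
    end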
Newton's method for line search
Approximate the function by a parabola (use $f(x_k)$, $f'(x_k)$ and $f''(x_k)$):
$$q(x) = f(x_k) + f'(x_k)(x - x_k) + \tfrac{1}{2} f''(x_k)(x - x_k)^2$$
Finding the minimum of the approximating function can be done analytically:
$$0 = q'(x) = f'(x_k) + f''(x_k)(x - x_k)$$
$$x_{k+1} = x_k - \frac{f'(x_k)}{f''(x_k)}$$
Newtons method for line search
[Figure: f(x) on [0, 1] together with its quadratic approximation $q(x) = f(x_0) + f'(x_0)(x - x_0) + \tfrac{1}{2} f''(x_0)(x - x_0)^2$]
Another look at Newton's method: equation solving
Solving g(x) = 0:
[Figure: tangent to g(x) at x_k with slope g'(x_k), intersecting the x axis at x_{k+1}]
$$x_{k+1} = x_k - \frac{g(x_k)}{g'(x_k)}$$
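As a sketch (g and its derivative dg are assumed function handles):

    % Newton's iteration for solving g(x) = 0.
    function x = newton_root(g, dg, x0, tol, maxit)
        x = x0;
        for k = 1:maxit
            step = g(x)/dg(x);
            x = x - step;
            if abs(step) < tol, break; end  % stop when the update is negligible
        end
    end

For line search, apply it with g = f' and dg = f'', which recovers the minimization formula on the previous slide.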
Quadratic convergence of Newton's method
Let's stay with the equation-solving formulation. Since $g(x^*) = 0$,
$$x_{k+1} - x^* = x_k - x^* - \frac{g(x_k) - g(x^*)}{g'(x_k)} = \frac{g(x^*) - g(x_k) - g'(x_k)(x^* - x_k)}{g'(x_k)} = \frac{1}{2}\,\frac{g''(\xi)}{g'(x_k)}\,(x_k - x^*)^2,$$
where the last step is Taylor's theorem with some $\xi$ between $x_k$ and $x^*$. With $k_1 \ge |g''|$ and $k_2 \le |g'|$ near $x^*$,
$$|x_{k+1} - x^*| \le \frac{k_1}{2 k_2}\,|x_k - x^*|^2.$$
Methods for descent direction search
1. steepest descent
2. Newton
3. quasi-Newton
4. conjugate direction / conjugate gradient method (CG)
Steepest descent
Condition for a descent direction:
$$d_k^{\mathsf T} \nabla f(x_k) < 0$$
Recall the geometric interpretation of an inner product:
$$|d_k^{\mathsf T} \nabla f(x_k)| = \|d_k\|\,\|\nabla f(x_k)\| \cos\theta$$
The steepest descent:
$$x_{k+1} = x_k - \alpha_k \nabla f(x_k)$$
Steepest descent applied to quadratic cost
$$f(x) = \tfrac{1}{2}\, x^{\mathsf T} Q x - b^{\mathsf T} x$$
Find the $\alpha$ that minimizes $f(x_k - \alpha \nabla f_k)$:
$$f(x_k - \alpha \nabla f_k) = \tfrac{1}{2}(x_k - \alpha \nabla f_k)^{\mathsf T} Q (x_k - \alpha \nabla f_k) - b^{\mathsf T}(x_k - \alpha \nabla f_k)$$
Using that the gradient is $\nabla f(x) = Qx - b$, we get upon differentiation with respect to $\alpha$:
$$\alpha_k = \frac{\nabla f_k^{\mathsf T} \nabla f_k}{\nabla f_k^{\mathsf T} Q \nabla f_k}$$
Hence the steepest descent method is
$$x_{k+1} = x_k - \frac{\nabla f_k^{\mathsf T} \nabla f_k}{\nabla f_k^{\mathsf T} Q \nabla f_k}\, \nabla f_k$$
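A minimal sketch for the quadratic case (the particular Q, b and starting point are illustrative):

    % Steepest descent with the exact step for f(x) = 0.5*x'*Q*x - b'*x.
    Q = [20 5; 5 2];  b = [14; 6];  x = [40; -100];  % ill-conditioned example
    for k = 1:500
        g = Q*x - b;                    % gradient of the quadratic cost
        if norm(g) < 1e-8, break; end
        alpha = (g'*g)/(g'*Q*g);        % exact minimizer along -g
        x = x - alpha*g;
    end
    % x approaches Q\b; plotting the iterates reproduces the zigzag
    % pattern shown on the next slide.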
Zigzagging of the steepest descent method
[Figure: contour plot of a badly scaled quadratic cost (levels 2500 to 22500) over x_1, x_2 in [-100, 100], with the steepest descent iterates zigzagging toward the minimum]
The convergence rate can be poor, depending on the scaling.
Newton's search (also Newton-Raphson)
Idea: the function to be minimized is approximated locally by a quadratic function, and this approximating function is minimized exactly.
$$x_{k+1} = x_k - [\underbrace{\nabla^2 f(x_k)}_{\text{Hessian}}]^{-1}\, \underbrace{\nabla f(x_k)}_{\text{gradient}}$$
Local convergence is guaranteed, but not global!
Solving symmetric positive definite linear equations
Solve
$$Ax = b$$
by the Cholesky factorization
$$A = LL^{\mathsf T}.$$
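This is how the Newton step is computed in practice, without forming the inverse Hessian; a sketch (H and g stand for the Hessian and gradient at the current iterate):

    % Newton step: solve H*d = -g via Cholesky instead of inverting H.
    R = chol(H);           % H = R'*R with R upper triangular (errors if H not PD)
    d = -(R \ (R' \ g));   % forward solve with R', then backward solve with R
    x = x + d;             % undamped Newton update

The failure of chol() doubles as a cheap positive-definiteness test, which connects to the modifications below.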
Modifications of Newton's search: damping
A search parameter $\alpha_k$ is introduced:
$$x_{k+1} = x_k - \alpha_k [\nabla^2 f(x_k)]^{-1} \nabla f(x_k)$$
Modifications of Newton's search: positive definiteness
A positive definite matrix $M_k$ is used instead of the Hessian:
$$x_{k+1} = x_k - \alpha_k M_k^{-1} \nabla f(x_k)$$
Interpretation in the scalar nonlinear equation case:
[Figure: g(x) with the Newton tangent replaced by a line of modified slope]
Another approach:
$$B_k = \nabla^2 f(x_k) + E_k,$$
where $E_k = 0$ if $\nabla^2 f(x_k)$ is sufficiently positive definite; otherwise $E_k$ is chosen so that $B_k \succ 0$.
Quasi-Newton
From the definition of the Hessian,
$$\nabla^2 f_k\, \underbrace{(x_{k+1} - x_k)}_{s_k} \approx \underbrace{\nabla f_{k+1} - \nabla f_k}_{y_k}$$
Find a matrix $B_{k+1}$ that mimics the Hessian behaviour above (the secant condition):
$$B_{k+1} s_k = y_k$$
Typically two requirements:
- symmetry (as the Hessian)
- a low-rank update between the steps
Popular updates in quasi-Newton methods
Symmetric rank-one (SR1):
$$B_{k+1} = B_k + \frac{(y_k - B_k s_k)(y_k - B_k s_k)^{\mathsf T}}{(y_k - B_k s_k)^{\mathsf T} s_k}$$
BFGS:
$$B_{k+1} = B_k - \frac{B_k s_k s_k^{\mathsf T} B_k}{s_k^{\mathsf T} B_k s_k} + \frac{y_k y_k^{\mathsf T}}{y_k^{\mathsf T} s_k}$$
As the inverse of $B_k$ is what is actually needed, the update can be applied to the inverse directly:
- DFP (Davidon, Fletcher and Powell)
Other updates keep the Hessian in factored form:
$$H = R R^{\mathsf T} \quad \text{(Cholesky factorization)}$$
Matlab chol() function
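A sketch of the BFGS iteration (illustrative only: the line search is reduced to a unit step, whereas a practical implementation would pair BFGS with a Wolfe line search; grad is an assumed gradient handle):

    % Quasi-Newton iteration with the BFGS update of B.
    B = eye(numel(x));                   % initial Hessian approximation
    g = grad(x);
    for k = 1:maxit
        s = -(B \ g);                    % quasi-Newton step with alpha = 1
        x = x + s;
        gnew = grad(x);  y = gnew - g;
        if y'*s > 1e-12                  % curvature condition keeps B > 0
            B = B - (B*(s*s')*B)/(s'*B*s) + (y*y')/(y'*s);
        end
        g = gnew;
        if norm(g) < tol, break; end
    end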
Conjugate gradient directions
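The slide leaves the details to the lecture; a minimal sketch of the linear conjugate gradient method for the quadratic cost $f(x) = \tfrac{1}{2}x^{\mathsf T}Qx - b^{\mathsf T}x$ used earlier:

    % Conjugate gradient method for 0.5*x'*Q*x - b'*x, i.e. for Q*x = b.
    x = zeros(size(b));
    r = b - Q*x;                         % residual, equals -gradient
    d = r;                               % first direction: steepest descent
    for k = 1:numel(b)                   % at most n steps in exact arithmetic
        alpha = (r'*r)/(d'*Q*d);         % exact step length along d
        x = x + alpha*d;
        rnew = r - alpha*(Q*d);
        beta = (rnew'*rnew)/(r'*r);      % makes the new d Q-conjugate to the old
        d = rnew + beta*d;
        r = rnew;
    end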
Inexact line search
- Armijo
- Goldstein
- Wolfe
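The slide gives only the names; for orientation, the standard forms of these conditions (with $\phi(\alpha) = f(x_k + \alpha d_k)$ and constants $0 < c_1 < c_2 < 1$; the statements follow the common textbook convention, not the slides):
$$\text{Armijo (sufficient decrease):}\quad \phi(\alpha) \le \phi(0) + c_1 \alpha\, \phi'(0)$$
$$\text{Goldstein:}\quad \phi(0) + (1 - c_1)\,\alpha\, \phi'(0) \le \phi(\alpha) \le \phi(0) + c_1 \alpha\, \phi'(0)$$
$$\text{Wolfe (adds curvature):}\quad \phi'(\alpha) \ge c_2\, \phi'(0)$$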
Intuitive approach: step size reduction
Start with an initial step size s. If the corresponding vector $x_k + s d$ does not yield an improved (smaller) value of $f(\cdot)$, that is, if
$$f(x_k + s d) \ge f(x_k),$$
reduce the step size, possibly by a fixed factor. Repeat.
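As a sketch (the initial step and the reduction factor rho are illustrative choices):

    % Plain step-size reduction (no sufficient-decrease test yet).
    s = 1;  rho = 0.5;
    while f(x + s*d) >= f(x)
        s = rho*s;                      % shrink until some improvement appears
        if s < 1e-12, break; end        % guard against an endless loop
    end
    x = x + s*d;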
However, convergence to a minimum is not guaranteed:
$$f(x) = \begin{cases} \dfrac{3(1-x)^2}{4} - 2(1-x) & \text{if } x > 1,\\[4pt] \dfrac{3(1+x)^2}{4} - 2(1+x) & \text{if } x < -1,\\[4pt] x^2 - 1 & \text{if } -1 \le x \le 1. \end{cases}$$
[Figure: graph of f on [-2, 2]]
$$x_{k+1} = x_k - 1 \cdot \nabla f(x_k)$$
On $[-1, 1]$ we have $\nabla f(x) = 2x$, so this unit-step iteration maps $x_k$ to $-x_k$: the iterates merely oscillate and never approach the minimizer.
Armijo's condition
Goldstein's condition
Wolfe's condition
Terminal conditions
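Typical terminal conditions (standard practice; the slide leaves them to discussion) test the gradient and the progress: $\|\nabla f(x_k)\| \le \varepsilon$, $\|x_{k+1} - x_k\| \le \varepsilon$, or $|f(x_{k+1}) - f(x_k)| \le \varepsilon$.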
Trust region methods
Recall
$$f(x_k + p) \approx f(x_k) + \nabla f^{\mathsf T}(x_k)\, p + \tfrac{1}{2}\, p^{\mathsf T} \nabla^2 f(x_k)\, p$$
We seek the minimum of the quadratic model function
$$m_k(p) = f(x_k) + \nabla f^{\mathsf T}(x_k)\, p + \tfrac{1}{2}\, p^{\mathsf T} B_k\, p$$
subject to
$$\|p\| \le \Delta_k.$$
For $B_k = \nabla^2 f(x_k)$: the trust-region Newton method.
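One classical approximate solution of this constrained subproblem is the Cauchy point, the model minimizer along the steepest descent direction within the region; a sketch (g, B and Delta stand for the gradient, the model Hessian and the radius):

    % Cauchy point for the trust-region subproblem (Nocedal & Wright, Ch. 4).
    function p = cauchy_point(g, B, Delta)
        gBg = g'*B*g;
        tau = 1;                                   % step to the boundary
        if gBg > 0
            tau = min(1, norm(g)^3/(Delta*gBg));   % interior minimizer if closer
        end
        p = -tau*(Delta/norm(g))*g;                % scaled steepest-descent step
    end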
Trust region methods
[Figure: contours of f(x) and of the model m_k(x), the trust region, the line search direction, and the trust region step]
Software
1. Optimization Toolbox for Matlab: fminunc() (trust-region Newton), fminsearch() (Nelder-Mead simplex)
2. UnconstrainedProblems package for Mathematica: FindMinimumPlot, FindMinimum
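A usage sketch for fminunc with a user-supplied gradient (option names follow recent Optimization Toolbox releases and may differ in older ones):

    % Minimize Rosenbrock with fminunc, supplying the gradient.
    fun = @(x) deal(100*(x(2)-x(1)^2)^2 + (1-x(1))^2, ...      % function value
                   [-400*(x(2)-x(1)^2)*x(1) - 2*(1-x(1));      % gradient
                     200*(x(2)-x(1)^2)]);
    opts = optimoptions('fminunc', 'Algorithm', 'trust-region', ...
                        'SpecifyObjectiveGradient', true);
    [xmin, fval] = fminunc(fun, [-1.2; 1], opts);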
Summary
- line search methods: direction search, step length determination, Newton methods, quasi-Newton
- trust region methods