-
Example 1: Optimization Problem
[Figure: contour plot of the objective over a_1, a_2 in [-5, 5]]
J. McNames Portland State University ECE 4/557 Multivariate Optimization Ver. 1.14 3
Overview of Multivariate Optimization Topics
- Problem definition
- Algorithms
  - Cyclic coordinate method
  - Steepest descent
  - Conjugate gradient algorithms
  - PARTAN
  - Newton's method
  - Levenberg-Marquardt
- Concise, subjective summary
Example 1: Optimization Problem
[Figure: contour plot of the objective over a_1, a_2 in [-5, 5]]
Multivariate Optimization Overview
The unconstrained optimization problem is a generalization of the line search problem
Find a vector a* such that
    a* = argmin_a f(a)
Note that there are no constraints on a
Example: find the vector of coefficients (w ∈ R^(p×1)) that minimizes the average absolute error of a linear model
Akin to a blind person trying to find their way to the bottom of a valley in a multidimensional landscape
We want to reach the bottom with the minimum number of cane taps
Also vaguely similar to taking core samples for oil prospecting
-
Example 1: Optimization Problem
[Figure: surface plot of the objective]
Example 1: Optimization Problem
[Figure: surface plot of the objective]
Example 1: MATLAB Code
function [] = OptimizationProblem();
%==============================================================================
% User-Specified Parameters
%==============================================================================
x = -5:0.05:5;
y = -5:0.05:5;
%==============================================================================
% Evaluate the Function
%==============================================================================
[X,Y] = meshgrid(x,y);
[Z,G] = OptFn(X,Y);
functionName = 'OptimizationProblem';
fileIdentifier = fopen([functionName '.tex'],'w');
%==============================================================================
% Contour Map
%==============================================================================
figure;
FigureSet(2,'Slides');
contour(x,y,Z,50);
xlabel('a_1');
ylabel('a_2');
zoom on;
AxisSet(8);
fileName = sprintf('%s-%s',functionName,'Contour');
Example 1: Optimization Problem
[Figure: surface plot of the objective]
-
case 1, view(45,10);
case 2, view(-55,22);
case 3, view(-131,10);
otherwise, error('Not implemented.');
end
fileName = sprintf('%s-%s%d',functionName,'Surface',c1);
print(fileName,'-depsc');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\newslide\n');
fprintf(fileIdentifier,'\\slideheading{Example \\arabic{exc}: Optimization Problem}\n');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\includegraphics[scale=1]{Matlab/%s}\n',fileName);
fprintf(fileIdentifier,'\n');
end
%==============================================================================
% List the MATLAB Code
%==============================================================================
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\newslide\n');
fprintf(fileIdentifier,'\\slideheading{Example \\arabic{exc}: MATLAB Code}\n');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\t\\matlabcode{Matlab/%s.m}\n',functionName);
fclose(fileIdentifier);
print(fileName,'-depsc');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\newslide\n');
fprintf(fileIdentifier,'\\stepcounter{exc}\n');
fprintf(fileIdentifier,'\\slideheading{Example \\arabic{exc}: Optimization Problem}\n');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\includegraphics[scale=1]{Matlab/%s}\n',fileName);
fprintf(fileIdentifier,'\n');
%==============================================================================
% Quiver Map
%==============================================================================
figure;
FigureSet(1,'Slides');
axis([-5 5 -5 5]);
contour(x,y,Z,50);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
hold on;
xCoarse = -5:0.5:5;
yCoarse = -5:0.5:5;
[X,Y] = meshgrid(xCoarse,yCoarse);
[ZCoarse,GCoarse] = OptFn(X,Y);
nr = size(X,1);
dzx = GCoarse(1:nr,1:nr);
dzy = GCoarse(nr + (1:nr),1:nr);
quiver(xCoarse,yCoarse,dzx,dzy);
hold off;
xlabel('a_1');
ylabel('a_2');
zoom on;
Global Optimization?
In general, all optimization algorithms find a local minimum in as few steps as possible
There are also global optimization algorithms based on ideas such as
- Evolutionary computing
- Genetic algorithms
- Simulated annealing
None of these guarantee convergence in a finite number of iterations
All require a lot of computation
AxisSet(8);
fileName = sprintf('%s-%s',functionName,'Quiver');
print(fileName,'-depsc');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\newslide\n');
fprintf(fileIdentifier,'\\slideheading{Example \\arabic{exc}: Optimization Problem}\n');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\includegraphics[scale=1]{Matlab/%s}\n',fileName);
fprintf(fileIdentifier,'\n');
%==============================================================================
% 3D Maps
%==============================================================================
figure;
set(gcf,'Renderer','zbuffer');
FigureSet(1,'Slides');
h = surf(x,y,Z);
set(h,'LineStyle','None');
xlabel('a_1');
ylabel('a_2');
shading interp;
grid on;
AxisSet(8);
hl = light('Position',[0,0,30]);
set(hl,'Style','Local');
set(h,'BackFaceLighting','unlit');
material dull;
for c1=1:3
switch c1
-
Cyclic Coordinate Method
1. For i = 1 to p,
   a_i := argmin_λ f([a_1, a_2, ..., a_{i-1}, λ, a_{i+1}, ..., a_p])
2. Loop to 1 until convergence
+ Simple to implement
+ Each line search can be performed semi-globally to avoid shallow local minima
+ Can be used with nominal variables
+ f(a) can be discontinuous
+ No gradient required
- Very slow compared to gradient-based optimization algorithms
- Usually only practical when the number of parameters, p, is small
There are modified versions with faster convergence
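The loop above can be sketched in a few lines. This is a minimal Python sketch (the course examples use MATLAB): the coupled quadratic objective and the sampled semi-global 1-D search below are made-up stand-ins for the OptFn and LineSearch routines used in the examples.

```python
def f(a):
    # hypothetical coupled quadratic test objective, minimum near (1.55, -2.19)
    return (a[0] - 1.0)**2 + 2.0*(a[1] + 2.0)**2 + 0.5*a[0]*a[1]

def argmin_along(f, a, i, lo=-5.0, hi=5.0, n=1000):
    """Sampled semi-global 1-D minimization over coordinate i, others held fixed."""
    best_x, best_f = a[i], f(a)
    for k in range(n + 1):
        x = lo + (hi - lo)*k/n
        trial = list(a)
        trial[i] = x
        fx = f(trial)
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x

def cyclic_coordinate(f, a, sweeps=20):
    a = list(a)
    for _ in range(sweeps):          # 2. loop until convergence (fixed count here)
        for i in range(len(a)):      # 1. for i = 1 to p
            a[i] = argmin_along(f, a, i)
    return a

a = cyclic_coordinate(f, [-3.0, 1.0])
```

Note that no gradient is ever evaluated, which is the method's main appeal; the price is that each sweep needs p full 1-D searches.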
Optimization Comments
Ideally, when we construct models we should favor those which can be optimized with few shallow local minima and reasonable computation
Graphically, you can think of the function to be minimized as the elevation in a complicated high-dimensional landscape
The problem is to find the lowest point
The most common approach is to go downhill
The gradient points in the most uphill direction
The steepest downhill direction is the opposite of the gradient
Most optimization algorithms use a line search algorithm
The methods mostly differ only in the way that the direction of descent is generated
Example 2: Cyclic Coordinate Method
[Figure: contour plot with the search path, X and Y in [-5, 5]]
Optimization Algorithm Outline
The basic steps of these algorithms are as follows:
1. Pick a starting vector a
2. Find the direction of descent, d
3. Move in that direction until a minimum is found:
   λ := argmin_λ f(a + λd)
   a := a + λd
4. Loop to 2 until convergence
Most of the theory of these algorithms is based on quadratic surfaces
Near local minima, this is a good approximation
Note that the functions should (must) have continuous gradients (almost) everywhere
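The four steps above can be sketched directly. This is an illustrative Python sketch, not code from the course: the quadratic objective, its gradient, and the crude sampled line search are assumed stand-ins.

```python
def f(a):
    # hypothetical quadratic test objective, minimum at (1, -2)
    return (a[0] - 1.0)**2 + 2.0*(a[1] + 2.0)**2

def grad(a):
    return [2.0*(a[0] - 1.0), 4.0*(a[1] + 2.0)]

def line_search(f, a, d, lam_max=2.0, n=400):
    """Step 3: lambda := argmin_lambda f(a + lambda*d), by coarse sampling."""
    best_lam, best_f = 0.0, f(a)
    for k in range(1, n + 1):
        lam = lam_max*k/n
        fl = f([ai + lam*di for ai, di in zip(a, d)])
        if fl < best_f:
            best_lam, best_f = lam, fl
    return best_lam

a = [-3.0, 1.0]                          # 1. pick a starting vector a
for _ in range(50):                      # 4. loop until convergence (fixed count)
    d = [-gi for gi in grad(a)]          # 2. direction of descent (steepest here)
    lam = line_search(f, a, d)
    a = [ai + lam*di for ai, di in zip(a, d)]   # 3. a := a + lambda*d
```

The methods in the rest of these slides keep this skeleton and change only how d is generated in step 2.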
-
Example 2: Cyclic Coordinate Method
[Figure: Euclidean position error vs. iteration]
Example 2: Cyclic Coordinate Method
[Figure: zoomed contour plot of the search path, X and Y axes]
Example 2: Relevant MATLAB Code
function [] = CyclicCoordinate();
%clear all;
close all;
ns = 26;
x = -3;
y = 1;
b0 = -1;
ls = 30;
a = zeros(ns,2);
f = zeros(ns,1);
[z,dzx,dzy] = OptFn(x,y);
a(1,:) = [x y];
f(1) = z;
for cnt = 2:ns,
    if rem(cnt,2)==1,
        d = [1 0]; % Along x direction
    else
        d = [0 1]; % Along y direction
    end;
    [b,fmin] = LineSearch([x y],d,b0,ls);
    x = x + b*d(1);
    y = y + b*d(2);
Example 2: Cyclic Coordinate Method
[Figure: function value vs. iteration]
-
print -depsc CyclicCoordinateContourB;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a-ones(ns,1)*[xopt2 yopt2])').^2)).^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Euclidean Position Error');
xlim([0 ns-1]);
ylim([0 xerr(1)]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc CyclicCoordinatePositionError;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Function Value');
ylim([0 f(1)]);
xlim([0 ns-1]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
    a(cnt,:) = [x y];
    f(cnt) = fmin;
end;
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z);
[zopt,id2] = min(zopt);
id1 = id1(id2);
xopt = x(id1,id2);
yopt = y(id1,id2);
[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z);
[zopt2,id2] = min(zopt2);
id1 = id1(id2);
xopt2 = x(id1,id2);
yopt2 = y(id1,id2);
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
print -depsc CyclicCoordinateErrorLinear;
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5);
set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5);
set(h(2),'MarkerSize',4);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc CyclicCoordinateContourA;
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-1.5 + (-2:0.05:2),-1.5 + (-2:0.05:2));
[z,dzx,dzy] = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
-
Example 3: Steepest Descent
[Figure: contour plot with the search path, X and Y in [-5, 5]]
Steepest Descent
The gradient of the function f(a) is defined as the vector of partial derivatives:
    ∇_a f(a) ≜ [ ∂f(a)/∂a_1   ∂f(a)/∂a_2   ...   ∂f(a)/∂a_p ]^T
It can be shown that the gradient, ∇_a f(a), points in the direction of maximum ascent
The negative of the gradient, -∇_a f(a), points in the direction of maximum descent
A vector d is a direction of descent if there exists a δ such that f(a + λd) < f(a) for all 0 < λ < δ
It can also be shown that d is a direction of descent if (∇_a f(a))^T d < 0
The algorithm of steepest descent uses d = -∇_a f(a)
The most fundamental of all algorithms for minimizing a continuously differentiable function
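The descent-direction condition is easy to check numerically. A minimal sketch with an assumed quadratic objective (not from the slides): the negative gradient always satisfies the condition, while a direction orthogonal to the gradient does not.

```python
def f(a):
    # assumed test objective, minimum at (1, -2)
    return (a[0] - 1.0)**2 + 2.0*(a[1] + 2.0)**2

def grad(a):
    return [2.0*(a[0] - 1.0), 4.0*(a[1] + 2.0)]

def is_descent_direction(a, d):
    # d is a descent direction if (grad f(a))^T d < 0
    g = grad(a)
    return sum(gi*di for gi, di in zip(g, d)) < 0.0

a = [-3.0, 1.0]
neg_grad = [-gi for gi in grad(a)]       # steepest descent direction
sideways = [grad(a)[1], -grad(a)[0]]     # orthogonal to the gradient
```

A small step along any direction with (∇f)^T d < 0 lowers f; for the orthogonal direction the inner product is exactly zero, so it is not a descent direction.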
Example 3: Steepest Descent
[Figure: zoomed contour plot of the search path, X and Y axes]
Steepest Descent
+ Very stable algorithm
- Can converge very slowly once near the local minima where the surface is approximately quadratic
-
Example 3: Relevant MATLAB Code
function [] = SteepestDescent();
%clear all;
close all;
ns = 26;
x = -3;
y = 1;
b0 = 0.01;
ls = 30;
a = zeros(ns,2);
f = zeros(ns,1);
[z,g] = OptFn(x,y);
a(1,:) = [x y];
f(1) = z;
d = -g/norm(g);
for cnt = 2:ns,
    [b,fmin] = LineSearch([x y],d,b0,ls);
    x = x + b*d(1);
    y = y + b*d(2);
    [z,g] = OptFn(x,y);
    d = -g;
    d = d/norm(d);
Example 3: Steepest Descent
[Figure: function value vs. iteration]
    a(cnt,:) = [x y];
    f(cnt) = z;
end;
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z);
[zopt,id2] = min(zopt);
id1 = id1(id2);
xopt = x(id1,id2);
yopt = y(id1,id2);
[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z);
[zopt2,id2] = min(zopt2);
id1 = id1(id2);
xopt2 = x(id1,id2);
yopt2 = y(id1,id2);
[zopt zopt2]
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
Example 3: Steepest Descent Method
[Figure: Euclidean position error vs. iteration]
-
AxisSet(8);
print -depsc SteepestDescentErrorLinear;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5);
set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5);
set(h(2),'MarkerSize',4);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc SteepestDescentContourA;
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-1.6 + (-0.5:0.01:0.5),-1.7 + (-0.5:0.01:0.5));
z = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
Conjugate Gradient Algorithms
1. Take a steepest descent step
2. For i = 2 to p
   λ := argmin_λ f(a + λd)
   a := a + λd
   g_i := ∇f(a)
   β := (g_i^T g_i)/(g_{i-1}^T g_{i-1})
   d := -g_i + βd
3. Loop to 1 until convergence
Based on quadratic approximations of f
Called the Fletcher-Reeves method
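A minimal Python sketch of the iteration above on an assumed quadratic objective (the function, gradient, and finely sampled "exact" line search are illustrative stand-ins for the course's OptFn and LineSearch):

```python
def f(a):
    # assumed quadratic objective, minimum at (1, -2)
    return (a[0] - 1.0)**2 + 2.0*(a[1] + 2.0)**2

def grad(a):
    return [2.0*(a[0] - 1.0), 4.0*(a[1] + 2.0)]

def dot(u, v):
    return sum(x*y for x, y in zip(u, v))

def line_search(f, a, d, lam_max=2.0, n=4000):
    # fine sampling stands in for a (near-)perfect 1-D minimization
    best_lam, best_f = 0.0, f(a)
    for k in range(1, n + 1):
        lam = lam_max*k/n
        fl = f([ai + lam*di for ai, di in zip(a, d)])
        if fl < best_f:
            best_lam, best_f = lam, fl
    return best_lam

a, p = [-3.0, 1.0], 2
for _ in range(3):                       # 3. loop until convergence
    g = grad(a)
    d = [-x for x in g]                  # 1. steepest descent step
    lam = line_search(f, a, d)
    a = [ai + lam*di for ai, di in zip(a, d)]
    for i in range(2, p + 1):            # 2. for i = 2 to p
        g_old, g = g, grad(a)
        beta = dot(g, g)/dot(g_old, g_old)      # Fletcher-Reeves beta
        d = [-g[j] + beta*d[j] for j in range(p)]
        lam = line_search(f, a, d)
        a = [ai + lam*di for ai, di in zip(a, d)]
```

On a p-dimensional quadratic with perfect line searches, one pass of steps 1-2 already reaches the minimum; the sampled search here only gets close, so a few outer loops polish the result.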
AxisSet(8);
print -depsc SteepestDescentContourB;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a-ones(ns,1)*[xopt2 yopt2])').^2)).^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Euclidean Position Error');
xlim([0 ns-1]);
ylim([0 xerr(1)]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc SteepestDescentPositionError;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Function Value');
ylim([0 f(1)]);
xlim([0 ns-1]);
grid on;
set(gca,'Box','Off');
-
Example 4: Fletcher-Reeves Conjugate Gradient
[Figure: function value vs. iteration]
Example 4: Fletcher-Reeves Conjugate Gradient
[Figure: contour plot with the search path, X and Y in [-5, 5]]
Example 4: Fletcher-Reeves Conjugate Gradient
[Figure: Euclidean position error vs. iteration]
Example 4: Fletcher-Reeves Conjugate Gradient
[Figure: zoomed contour plot of the search path, X and Y axes]
-
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5);
set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5);
set(h(2),'MarkerSize',4);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc FletcherReevesContourA;
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(1.5:0.01:2.5,-3.5:0.01:-2.5);
z = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
Example 4: Relevant MATLAB Code
function [] = FletcherReeves();
%clear all;
close all;
ns = 26;
x = -3;
y = 1;
b0 = 0.01;
ls = 30;
a = zeros(ns,2);
f = zeros(ns,1);
[z,g] = OptFn(x,y);
a(1,:) = [x y];
f(1) = z;
d = -g/norm(g); % First direction
for cnt = 2:ns,
    [b,fmin] = LineSearch([x y],d,b0,ls);
    x = x + b*d(1);
    y = y + b*d(2);
    go = g; % Old gradient
    [z,g] = OptFn(x,y);
    beta = (g'*g)/(go'*go);
AxisSet(8);
print -depsc FletcherReevesContourB;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a-ones(ns,1)*[xopt2 yopt2])').^2)).^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Euclidean Position Error');
xlim([0 ns-1]);
ylim([0 xerr(1)]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc FletcherReevesPositionError;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Function Value');
ylim([0 f(1)]);
xlim([0 ns-1]);
grid on;
set(gca,'Box','Off');
    d = -g + beta*d;
    a(cnt,:) = [x y];
    f(cnt) = z;
end;
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z);
[zopt,id2] = min(zopt);
id1 = id1(id2);
xopt = x(id1,id2);
yopt = y(id1,id2);
[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z);
[zopt2,id2] = min(zopt2);
id1 = id1(id2);
xopt2 = x(id1,id2);
yopt2 = y(id1,id2);
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
-
Example 5: Polak-Ribiere Conjugate Gradient
[Figure: contour plot with the search path, X and Y in [-5, 5]]
AxisSet(8);
print -depsc FletcherReevesErrorLinear;
Example 5: Polak-Ribiere Conjugate Gradient
[Figure: zoomed contour plot of the search path, X and Y axes]
Conjugate Gradient Algorithms Continued
There is also a variant called Polak-Ribiere where
   β := ((g_i - g_{i-1})^T g_i)/(g_{i-1}^T g_{i-1})
+ Only requires the gradient
+ Converges in a finite number of steps when f(a) is quadratic and perfect line searches are used
- Less stable numerically than steepest descent
- Sensitive to inexact line searches
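The only change from Fletcher-Reeves is the β formula. A small Python sketch contrasting the two (the gradient values are made up): when the gradient stops changing between iterations, the Polak-Ribiere β collapses to zero, which restarts the iteration along the steepest descent direction, while the Fletcher-Reeves β does not.

```python
def dot(u, v):
    return sum(x*y for x, y in zip(u, v))

def beta_fletcher_reeves(g, g_old):
    # beta = g_i^T g_i / g_{i-1}^T g_{i-1}
    return dot(g, g)/dot(g_old, g_old)

def beta_polak_ribiere(g, g_old):
    # beta = (g_i - g_{i-1})^T g_i / g_{i-1}^T g_{i-1}
    diff = [gi - go for gi, go in zip(g, g_old)]
    return dot(diff, g)/dot(g_old, g_old)

g_old = [3.0, -4.0]
g_stalled = [3.0, -4.0]     # no progress: gradient unchanged from last iteration
```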
-
Example 5: MATLAB Code
function [] = PolakRibiere();
%clear all;
close all;
ns = 26;
x = -3;
y = 1;
b0 = 0.01;
ls = 30;
a = zeros(ns,2);
f = zeros(ns,1);
[z,g] = OptFn(x,y);
a(1,:) = [x y];
f(1) = z;
d = -g/norm(g); % First direction
for cnt = 2:ns,
    [b,fmin] = LineSearch([x y],d,b0,ls);
    x = x + b*d(1);
    y = y + b*d(2);
    go = g; % Old gradient
    [z,g] = OptFn(x,y);
    beta = ((g-go)'*g)/(go'*go);
Example 5: Polak-Ribiere Conjugate Gradient
[Figure: function value vs. iteration]
    d = -g + beta*d;
    a(cnt,:) = [x y];
    f(cnt) = z;
end;
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z);
[zopt,id2] = min(zopt);
id1 = id1(id2);
xopt = x(id1,id2);
yopt = y(id1,id2);
[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z);
[zopt2,id2] = min(zopt2);
id1 = id1(id2);
xopt2 = x(id1,id2);
yopt2 = y(id1,id2);
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
Example 5: Polak-Ribiere Conjugate Gradient
[Figure: Euclidean position error vs. iteration]
-
AxisSet(8);
print -depsc PolakRibiereErrorLinear;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5);
set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5);
set(h(2),'MarkerSize',4);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc PolakRibiereContourA;
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(1.5:0.01:2.5,-3.5:0.01:-2.5);
z = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
Parallel Tangents (PARTAN)
1. First gradient step
   d := -∇f(a);  λ := argmin_λ f(a + λd);  s_p := λd;  a := a + s_p
2. Gradient step
   d_g := -∇f(a);  λ := argmin_λ f(a + λd_g);  s_g := λd_g;  a := a + s_g
3. Conjugate step
   d_p := s_p + s_g;  λ := argmin_λ f(a + λd_p);  s_p := λd_p;  a := a + s_p
4. Loop to 2 until convergence
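A minimal Python sketch of the steps above on an assumed quadratic objective (the objective, gradient, and sampled line search are illustrative stand-ins for the course's OptFn and LineSearch):

```python
def f(a):
    # assumed quadratic objective, minimum at (1, -2)
    return (a[0] - 1.0)**2 + 2.0*(a[1] + 2.0)**2

def grad(a):
    return [2.0*(a[0] - 1.0), 4.0*(a[1] + 2.0)]

def line_search(f, a, d, lam_max=2.0, n=2000):
    best_lam, best_f = 0.0, f(a)
    for k in range(1, n + 1):
        lam = lam_max*k/n
        fl = f([ai + lam*di for ai, di in zip(a, d)])
        if fl < best_f:
            best_lam, best_f = lam, fl
    return best_lam

def search(f, a, d):
    """Line search along d; returns the new point and the step s = lambda*d."""
    lam = line_search(f, a, d)
    s = [lam*di for di in d]
    return [ai + si for ai, si in zip(a, s)], s

a = [-3.0, 1.0]
a, sp = search(f, a, [-g for g in grad(a)])                # 1. first gradient step
for _ in range(5):                                         # 4. loop until convergence
    a, sg = search(f, a, [-g for g in grad(a)])            # 2. gradient step
    a, sp = search(f, a, [x + y for x, y in zip(sp, sg)])  # 3. conjugate step
```

Note that only gradients and line searches are needed; the conjugate direction s_p + s_g comes free from the two previous steps.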
AxisSet(8);
print -depsc PolakRibiereContourB;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a-ones(ns,1)*[xopt2 yopt2])').^2)).^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Euclidean Position Error');
xlim([0 ns-1]);
ylim([0 xerr(1)]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc PolakRibierePositionError;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Function Value');
ylim([0 f(1)]);
xlim([0 ns-1]);
grid on;
set(gca,'Box','Off');
-
Example 6: PARTAN
[Figure: zoomed contour plot of the search path, X and Y axes]
PARTAN Concept
[Figure: PARTAN search path through points a0, a1, ..., a7]
First two steps are steepest descent
Thereafter, each iteration consists of two steps
1. Search along the direction d_i = a_i - a_{i-2}, where a_i is the current point and a_{i-2} is the point from two steps ago
2. Search in the direction of the negative gradient, d_i = -∇f(a_i)
Example 6: PARTAN
[Figure: function value vs. iteration]
Example 6: PARTAN
[Figure: contour plot with the search path, X and Y in [-5, 5]]
-
cnt = 2;
while cnt < ns,
    if bp > 0, % Line search in conjugate direction was successful
        fprintf('P :');
        x = xg + bp*d(1);
        y = yg + bp*d(2);
Example 6: PARTAN
[Figure: Euclidean position error vs. iteration]
    else % Could not move - do another gradient update
        cnt = cnt + 1;
        a(cnt,:) = a(cnt-1,:);
        f(cnt) = f(cnt-1);
        if cnt==ns,
            break;
        end;
        fprintf('G2:');
        [z,g] = OptFn(xg,yg);
        d = -g/norm(g); % Direction
        [bp,fmin] = LineSearch([xg yg],d,b0,ls);
        x = xg + bp*d(1);
        y = yg + bp*d(2);
    end;
    % Update anchor point
    xa = xg;
    ya = yg;
    cnt = cnt + 1;
    a(cnt,:) = [x y];
    f(cnt) = OptFn(x,y);
    fprintf(' %d %5.3f\n',cnt,f(cnt));
end;
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z);
[zopt,id2] = min(zopt);
id1 = id1(id2);
xopt = x(id1,id2);
Example 6: MATLAB Code
function [] = Partan();
%clear all;
close all;
ns = 26;
x = -3;
y = 1;
b0 = 0.01;
ls = 30;
a = zeros(ns,2);
f = zeros(ns,1);
[z,g] = OptFn(x,y);
a(1,:) = [x y];
f(1) = z;
xa = x;
ya = y;
% First step - substitute for a conjugate step
d = -g/norm(g); % First direction
[bp,fmin] = LineSearch([x y],d,b0,100);
x = x + bp*d(1); % Stand-in for a conjugate step
y = y + bp*d(2);
a(2,:) = [x y];
f(2) = fmin;
-
xlim([0 ns-1]);
ylim([0 xerr(1)]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc PartanPositionError;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Function Value');
ylim([0 f(1)]);
xlim([0 ns-1]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc PartanErrorLinear;
yopt = y(id1,id2);
[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z);
[zopt2,id2] = min(zopt2);
id1 = id1(id2);
xopt2 = x(id1,id2);
yopt2 = y(id1,id2);
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5);
set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5);
set(h(2),'MarkerSize',4);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
PARTAN Pros and Cons
[Figure: PARTAN search path through points a0, a1, ..., a7]
+ For quadratic functions, converges in a finite number of steps
+ Easier to implement than 2nd order methods
+ Can be used with a large number of parameters
+ Each (composite) step is at least as good as steepest descent
+ Tolerant of inexact line searches
- Each (composite) step requires two line searches
AxisSet(8);
print -depsc PartanContourA;
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(1.5:0.01:2.5,-3.5:0.01:-2.5);
z = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc PartanContourB;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a-ones(ns,1)*[xopt2 yopt2])').^2)).^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Euclidean Position Error');
-
Example 7: Newton's with Steepest Descent Safeguard
[Figure: zoomed contour plot of the search path, X and Y axes]
Newton's Method
   a_{k+1} = a_k - H(a_k)^{-1} ∇f(a_k)
where ∇f(a_k) is the gradient and H(a_k) is the Hessian of f(a),
   H(a_k) ≜
   [ ∂²f(a)/∂a_1²       ∂²f(a)/∂a_1∂a_2   ...   ∂²f(a)/∂a_1∂a_p ]
   [ ∂²f(a)/∂a_2∂a_1    ∂²f(a)/∂a_2²      ...   ∂²f(a)/∂a_2∂a_p ]
   [       ...                 ...         ...         ...       ]
   [ ∂²f(a)/∂a_p∂a_1    ∂²f(a)/∂a_p∂a_2   ...   ∂²f(a)/∂a_p²    ]
Based on a quadratic approximation of the function f(a)
If f(a) is quadratic, converges in one step
If H(a) is positive-definite, the problem is well defined near local minima where f(a) is nearly quadratic
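Since the point of the update above is that a quadratic is solved in one step, here is a minimal Python sketch with an assumed coupled quadratic (the Hessian and gradient are written by hand, and a 2x2 solve stands in for the general matrix inverse):

```python
def grad(a):
    # gradient of the assumed f(a) = (a1-1)^2 + 2(a2+2)^2 + 0.5*a1*a2
    return [2.0*(a[0] - 1.0) + 0.5*a[1], 4.0*(a[1] + 2.0) + 0.5*a[0]]

H = [[2.0, 0.5],
     [0.5, 4.0]]        # constant Hessian of the quadratic

def solve2(H, b):
    """Solve the 2x2 system H x = b by Cramer's rule (stands in for inv(H))."""
    det = H[0][0]*H[1][1] - H[0][1]*H[1][0]
    return [(b[0]*H[1][1] - b[1]*H[0][1])/det,
            (H[0][0]*b[1] - H[1][0]*b[0])/det]

a = [-3.0, 1.0]
step = solve2(H, grad(a))                  # H(a_k)^{-1} grad f(a_k)
a = [a[i] - step[i] for i in range(2)]     # a_{k+1} = a_k - H^{-1} grad f(a_k)
```

Because this f is exactly quadratic, the gradient at the new point is zero to machine precision: one step suffices.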
Example 7: Newton's with Steepest Descent Safeguard
[Figure: function value vs. iteration]
Example 7: Newton's with Steepest Descent Safeguard
[Figure: contour plot with the search path, X and Y in [-5, 5]]
-
    y = y + b*d(2);
    [z,g,H] = OptFn(x,y);
    a(cnt,:) = [x y];
    f(cnt) = z;
end;
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z);
[zopt,id2] = min(zopt);
id1 = id1(id2);
xopt = x(id1,id2);
yopt = y(id1,id2);
[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z);
[zopt2,id2] = min(zopt2);
id1 = id1(id2);
xopt2 = x(id1,id2);
yopt2 = y(id1,id2);
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
Example 7: Newton's with Steepest Descent Safeguard
[Figure: Euclidean position error vs. iteration]
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5);
set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5);
set(h(2),'MarkerSize',4);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc NewtonsContourA;

figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(1.0+(-1:0.02:1),-2.4+(-1:0.02:1));
z = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
hold off;
xlabel('X');
Example 7: Relevant MATLAB Code
function [] = Newtons();
%clear all;
close all;

ns = 100;
x  = -3; % Starting x
y  =  1; % Starting y
b0 =  1;

a = zeros(ns,2);
f = zeros(ns,1);
[z,g,H] = OptFn(x,y);
a(1,:) = [x y];
f(1) = z;
for cnt = 2:ns,
   d = -inv(H)*g;
   if d'*g>0, % Revert to steepest descent if d is not a descent direction
      %fprintf('(%2d of %2d) Min. Eig:%5.3f Reverting...\n',cnt,ns,min(eig(H)));
      d = -g;
   end;
   d = d/norm(d);
   [b,fmin] = LineSearch([x y],d,b0,100);
   %a(cnt,:) = (a(cnt-1,:)' - inv(H)*g)'; % Pure Newton's method
   x = x + b*d(1);
Newton's Method Pros and Cons

a_{k+1} = a_k - H(a_k)^{-1} ∇f(a_k)

+ Very fast convergence near local minima
- Not guaranteed to converge (may actually diverge)
- Requires the p × p Hessian
- Requires a p × p matrix inverse that takes O(p^3) operations
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc NewtonsContourB;

figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a-ones(ns,1)*[xopt2 yopt2])').^2)).^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Euclidean Position Error');
xlim([0 ns-1]);
ylim([0 xerr(1)]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc NewtonsPositionError;

figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Function Value');
ylim([0 f(1)]);
xlim([0 ns-1]);
Levenberg-Marquardt

1. Determine if μ_k I + H(a_k) is positive definite. If not, μ_k := 4μ_k and repeat.

2. Solve the following equation for a_{k+1}:

   [μ_k I + H(a_k)] (a_{k+1} - a_k) = -∇f(a_k)

3. Compute the ratio of the actual to the predicted reduction:

   r_k = (f(a_k) - f(a_{k+1})) / (q(a_k) - q(a_{k+1}))

   where q(a) is the quadratic approximation of f(a) based on f(a_k), ∇f(a_k), and H(a_k)

4. If r_k < 0.25, then μ_{k+1} := 4μ_k
   If r_k > 0.75, then μ_{k+1} := ½μ_k
   If r_k ≤ 0, then a_{k+1} := a_k

5. If not converged, k := k + 1 and loop to 1.
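The five steps above can be sketched in plain Python (stdlib only). This is a hedged illustration, not the lecture's MATLAB: the quadratic test function, starting point, and initial damping value are assumptions, and the 2×2 solves are done elementwise because the example Hessian is diagonal.

```python
# Levenberg-Marquardt sketch for the assumed test function
# f(a) = (a1 - 1)^2 + 10*(a2 + 2)^2, whose minimizer is [1, -2].

def f(a):
    return (a[0] - 1.0)**2 + 10.0*(a[1] + 2.0)**2

def grad(a):
    return [2.0*(a[0] - 1.0), 20.0*(a[1] + 2.0)]

H = [[2.0, 0.0], [0.0, 20.0]]  # constant (diagonal) Hessian of this f

def levenberg_marquardt(a, mu=1.0, iters=20):
    for _ in range(iters):
        g = grad(a)
        if g[0] == 0.0 and g[1] == 0.0:
            break                      # stationary point: converged
        # Step 1: grow mu until mu*I + H is positive definite
        # (H is diagonal, so its eigenvalues are the diagonal entries)
        while min(mu + H[0][0], mu + H[1][1]) <= 0.0:
            mu *= 4.0
        # Step 2: solve [mu*I + H] d = -grad f(a); diagonal, so elementwise
        d = [-g[0]/(mu + H[0][0]), -g[1]/(mu + H[1][1])]
        a_new = [a[0] + d[0], a[1] + d[1]]
        # Step 3: actual vs. predicted reduction, with the quadratic model
        # q(a + d) = f(a) + g'd + 0.5 d'Hd
        predicted = -(g[0]*d[0] + g[1]*d[1]
                      + 0.5*(H[0][0]*d[0]**2 + H[1][1]*d[1]**2))
        r = (f(a) - f(a_new)) / predicted
        # Step 4: adapt mu; keep the old point if the step gave no decrease
        if r < 0.25:
            mu *= 4.0
        elif r > 0.75:
            mu /= 2.0
        if r > 0.0:
            a = a_new
    return a

a = levenberg_marquardt([-3.0, 1.0])
print(a)  # converges toward the minimizer near [1, -2]
```

Because this f is quadratic, the model agreement ratio r stays near 1, so mu shrinks and the iterates approach the pure Newton step.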
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc NewtonsErrorLinear;
Example 8: Levenberg-Marquardt Conjugate Gradient
[Figure: zoomed contour plot of the trajectory near the minimum, X from 1.5 to 2.5, Y from -3.5 to -2.5]
Levenberg-Marquardt Comments

Similar to Newton's method
Has safety provisions for regions where the quadratic approximation is inappropriate
Compare:

   Newton's: a_{k+1} = a_k - H(a_k)^{-1} ∇f(a_k)
   LM:       [μ_k I + H(a_k)] (a_{k+1} - a_k) = -∇f(a_k)

If μ_k = 0, these are equivalent
As μ_k → ∞, the step a_{k+1} - a_k approaches the steepest descent direction
μ_k is chosen to ensure that the smallest eigenvalue of μ_k I + H(a_k) is positive and sufficiently large
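The two limiting cases can be checked numerically. In this sketch the 2×2 Hessian and gradient are arbitrary illustrative values: with the damping term at zero the LM step equals the Newton step -H^{-1} ∇f, and for very large damping it approaches the scaled steepest descent step -∇f / mu.

```python
def lm_step(mu, H, g):
    """Solve [mu*I + H] d = -g for a 2x2 H via the closed-form inverse."""
    a, b = H[0][0] + mu, H[0][1]
    c, e = H[1][0], H[1][1] + mu
    det = a*e - b*c
    return [(-e*g[0] + b*g[1]) / det, (c*g[0] - a*g[1]) / det]

H = [[2.0, -1.0], [-1.0, 3.0]]   # assumed positive-definite Hessian
g = [4.0, -6.0]                  # assumed gradient

newton = lm_step(0.0, H, g)      # pure Newton step
damped = lm_step(1e6, H, g)      # approximately [-g1/mu, -g2/mu]
print(newton)  # [-1.2, 1.6]
print(damped)
```

For intermediate damping the step interpolates between the two directions, which is why the method degrades gracefully where the quadratic model is poor.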
Example 8: Levenberg-Marquardt Conjugate Gradient
[Figure: function value vs. iteration, over 25 iterations]
Example 8: Levenberg-Marquardt Conjugate Gradient
[Figure: contour plot of the search trajectory, X and Y from -5 to 5]
   y = a(cnt,2);
   zo = zn; % Old function value
   zn = OptFn(x,y);
   xd = (a(cnt,:)-ap);
   qo = zo;
   qn = zn + g'*xd' + 0.5*xd*H*xd';
   if qo==qn, % Test for convergence
      x = a(cnt,1);
      y = a(cnt,2);
      a(cnt:ns,:) = ones(ns-cnt+1,1)*[x y];
      f(cnt:ns,:) = OptFn(x,y);
      break;
   end;
   r = (zo-zn)/(qo-qn);
   if r<0.25, % Poor model agreement: increase the damping
      eta = eta*4;
   end;
   if r>0.50, % 0.75 is recommended, but much slower
      eta = eta/2;
   end;
   if zn>zo, % Back up
      a(cnt,:) = a(cnt-1,:);
   else
      ap = a(cnt,:);
   end;
   x = a(cnt,1);
Example 8: Levenberg-Marquardt Conjugate Gradient
[Figure: Euclidean position error vs. iteration, over 25 iterations]
   y = a(cnt,2);
   a(cnt,:) = [x y];
   f(cnt) = OptFn(x,y);
   %disp([cnt a(cnt,:) f(cnt) r eta])
end;

[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z);
[zopt,id2] = min(zopt);
id1 = id1(id2);
xopt = x(id1,id2);
yopt = y(id1,id2);

[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z);
[zopt2,id2] = min(zopt2);
id1 = id1(id2);
xopt2 = x(id1,id2);
yopt2 = y(id1,id2);

figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
Example 8: Relevant MATLAB Code
function [] = LevenbergMarquardt();
%clear all;
close all;

ns  = 26;
x   = -3; % Starting x
y   =  1; % Starting y
eta = 0.0001;

a = zeros(ns,2);
f = zeros(ns,1);
[zn,g,H] = OptFn(x,y);
a(1,:) = [x y];
f(1) = zn;
ap = [x y]; % Previous point
for cnt = 2:ns,
   [zn,g,H] = OptFn(x,y);
   while min(eig(eta*eye(2)+H))<=0, % Ensure eta*I + H is positive definite
      eta = eta*4;
   end;
   a(cnt,:) = (a(cnt-1,:)' - inv(eta*eye(2)+H)*g)'; % Reconstructed: solve [eta*I+H]*d = -g
   x = a(cnt,1);
set(gca,'Box','Off');
AxisSet(8);
print -depsc LevenbergMarquardtErrorLinear;
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5);
set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5);
set(h(2),'MarkerSize',4);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc LevenbergMarquardtContourA;

figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(1.5:0.01:2.5,-3.5:0.01:-2.5);
z = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
hold off;
xlabel('X');
ylabel('Y');
Levenberg-Marquardt Pros and Cons

[μ_k I + H(a_k)] (a_{k+1} - a_k) = -∇f(a_k)

  Many equivalent formulations
+ No line search required
+ Can be used with approximations to the Hessian
+ Extremely fast convergence (2nd order)
- Requires the gradient and Hessian (or approximate Hessian)
- Requires O(p^3) operations for each solution of the key equation
zoom on;
AxisSet(8);
print -depsc LevenbergMarquardtContourB;

figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a-ones(ns,1)*[xopt2 yopt2])').^2)).^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Euclidean Position Error');
xlim([0 ns-1]);
ylim([0 xerr(1)]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc LevenbergMarquardtPositionError;

figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Function Value');
ylim([0 f(1)]);
xlim([0 ns-1]);
grid on;
Optimization Algorithm Summary

Algorithm             Convergence  Stable  ∇f(a)  H(a)  LS
Cyclic Coordinate     Slow         Y       N      N     Y
Steepest Descent      Slow         Y       Y      N     Y
Conjugate Gradient    Fast         N       Y      N     Y
PARTAN                Fast         Y       Y      N     Y
Newton's Method       Very Fast    N       Y      Y     N
Levenberg-Marquardt   Very Fast    Y       Y      Y     N

LS = requires a line search