
Multidimensional Optimization: The Simplex Method

Lecture 16
Biostatistics 615/815

csg.sph.umich.edu/abecasis/class/2008/615.16.pdf · 2008-11-06

Homework 5: Hashing

Practical illustration of different hashing strategies
• For a given hash table size, double hashing is much more efficient
• Even greater performance savings are available by switching to a larger hash table

Major differences:
• For double hashing, use two different primes …
• Different random number generators can affect performance a bit …

Previous Topic: One-Dimensional Optimization

Bracketing

Golden Search

Quadratic Approximation

Bracketing

Find 3 points such that
• a < b < c
• f(b) < f(a) and f(b) < f(c)

Locate minimum by gradually trimming bracketing interval

Bracketing provides additional confidence in result

The Golden Ratio

[Figure: a bracketing triplet A, B, C, and the same triplet with a new point X added; two segments are labeled 0.38196]
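The golden search idea can be sketched as follows. This is a minimal illustration, assuming a valid bracketing triplet a < b < c is already available; the names golden_search, GOLD and the example objective parabola are hypothetical and do not come from the lecture.

```c
#include <assert.h>
#include <math.h>

#define GOLD 0.38196601125   /* golden fraction, (3 - sqrt(5)) / 2 */

/* Illustrative objective (an assumption for this sketch): minimum at x = 2 */
static double parabola(double x)
{
    return (x - 2.0) * (x - 2.0);
}

/* Shrink the bracket (a, b, c) until it is narrower than tol.
   Requires a < b < c with f(b) below f(a) and f(c). */
double golden_search(double (*f)(double), double a, double b, double c,
                     double tol)
{
    double fb;

    assert(a < b && b < c);
    fb = (*f)(b);

    while (c - a > tol)
    {
        /* New point: a golden fraction into the larger segment */
        double x = (c - b > b - a) ? b + GOLD * (c - b)
                                   : b - GOLD * (b - a);
        double fx = (*f)(x);

        if (fx < fb)
        {
            /* x is the new middle point; old b becomes an end point */
            if (x > b) a = b; else c = b;
            b = x;
            fb = fx;
        }
        else
        {
            /* b stays in the middle; x becomes an end point */
            if (x > b) c = x; else a = x;
        }
    }
    return b;
}
```

For example, golden_search(parabola, 0.0, 1.0, 5.0, 1e-6) starts from a valid triplet, since f(1) = 1 is below both f(0) = 4 and f(5) = 9, and converges towards x = 2.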


Parabolic Interpolation

For well behaved functions, faster than Golden Search

Today: Multidimensional Optimization

Illustrate the method of Nelder and Mead
• Simplex Method
• Nicknamed "Amoeba"

Simple and, in practice, quite robust
• Counter examples are known

Discuss other standard methods

C Utility Functions: Allocating Vectors

Ease allocation of vectors. Peppered through today's examples.

    #include <stdlib.h>

    /* Allocate an uninitialized vector of cols doubles */
    double * alloc_vector(int cols)
    {
        return (double *) malloc(sizeof(double) * cols);
    }

    /* Release a vector; cols is not used */
    void free_vector(double * vector, int cols)
    {
        free(vector);
    }

C Utility Functions: Allocating Matrices

    double ** alloc_matrix(int rows, int cols)
    {
        int i;
        double ** matrix = (double **) malloc(sizeof(double *) * rows);

        for (i = 0; i < rows; i++)
            matrix[i] = alloc_vector(cols);

        return matrix;
    }

    void free_matrix(double ** matrix, int rows, int cols)
    {
        int i;
        for (i = 0; i < rows; i++)
            free_vector(matrix[i], cols);
        free(matrix);
    }

The Simplex Method

Calculate likelihoods at simplex vertices
• Geometric shape with k+1 corners
• E.g. a triangle in k = 2 dimensions

Simplex crawls
• Towards minimum
• Away from maximum

Probably the most widely used optimization method

A Simplex in Two Dimensions

[Figure: triangle with vertices x0, x1, x2]

Evaluate function at vertices

Note:
• The highest (worst) point
• The next highest point
• The lowest (best) point

Intuition:
• Move away from high point, towards low point

C Code: Creating A Simplex

    /* Build dim + 1 vertices: the starting point shifted by 1.0 along
       each axis in turn, plus the starting point itself (last row) */
    double ** make_simplex(double * point, int dim)
    {
        int i, j;
        double ** simplex = alloc_matrix(dim + 1, dim);

        for (i = 0; i < dim + 1; i++)
            for (j = 0; j < dim; j++)
                simplex[i][j] = point[j];

        for (i = 0; i < dim; i++)
            simplex[i][i] += 1.0;

        return simplex;
    }

C Code: Evaluating Function at Vertices

This function is very simple
• This is a good thing!
• Making each function almost trivial makes debugging easy

    void evaluate_simplex(double ** simplex, int dim,
                          double * fx, double (* func)(double *, int))
    {
        for (int i = 0; i < dim + 1; i++)
            fx[i] = (*func)(simplex[i], dim);
    }

C Code: Finding Extremes

    /* Find the indices of the highest, lowest and next highest values.
       Note: the int & reference parameters require a C++ compiler. */
    void simplex_extremes(double *fx, int dim, int & ihi, int & ilo,
                          int & inhi)
    {
        int i;

        if (fx[0] > fx[1])
            { ihi = 0; ilo = inhi = 1; }
        else
            { ihi = 1; ilo = inhi = 0; }

        for (i = 2; i < dim + 1; i++)
            if (fx[i] <= fx[ilo])
                ilo = i;
            else if (fx[i] > fx[ihi])
                { inhi = ihi; ihi = i; }
            else if (fx[i] > fx[inhi])
                inhi = i;
    }

Direction for Optimization

[Figure: simplex with vertices x0, x1, x2 and the point mid]

• Line through worst point and average of other points
• mid: average of all points, excluding worst point

C Code: Direction for Optimization

    void simplex_bearings(double ** simplex, int dim,
                          double * midpoint, double * line, int ihi)
    {
        int i, j;

        for (j = 0; j < dim; j++)
            midpoint[j] = 0.0;

        for (i = 0; i < dim + 1; i++)
            if (i != ihi)
                for (j = 0; j < dim; j++)
                    midpoint[j] += simplex[i][j];

        for (j = 0; j < dim; j++)
        {
            midpoint[j] /= dim;
            line[j] = simplex[ihi][j] - midpoint[j];
        }
    }

Reflection

[Figure: simplex with vertices x0, x1, x2, the point mid and the reflected point x']

This is the default new trial point

Reflection and Expansion

[Figure: simplex with vertices x0, x1, x2, the point mid, the reflected point x' and the expanded point x'']

If reflection results in new minimum…

Move further along minimization direction

Contraction (One Dimension)

[Figure: simplex with vertices x0, x1, x2, the point mid and the trial points x' and x'']

If x' is still the worst point…

Try a smaller step
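The three moves can be written as one formula, following the conventions of the simplex_bearings(), update_simplex() and amoeba() code in this lecture, where the line vector points from the midpoint to the worst vertex and the scale factors are -1.0, -2.0 and 0.5. The symbol λ for the scale factor and the subscript hi for the worst vertex are notation introduced here for illustration.

```latex
\begin{aligned}
\mathrm{mid} &= \frac{1}{k} \sum_{i \neq \mathrm{hi}} x_i \\
x(\lambda)   &= \mathrm{mid} + \lambda \, (x_{\mathrm{hi}} - \mathrm{mid}) \\
x'           &= x(-1)  \quad \text{(reflection)} \\
x''          &= x(-2)  \quad \text{(expansion, tried when } x' \text{ is a new minimum)} \\
x''          &= x(0.5) \quad \text{(contraction, tried when } x' \text{ is still the worst point)}
\end{aligned}
```

In each case the trial point replaces the worst vertex only if it has a lower function value.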

C Code: Updating The Simplex

    /* Try the point midpoint + scale * line; keep it if it improves on
       fmax. Returns 1 if the vertex was replaced, 0 otherwise.
       Note: the double & reference parameter requires a C++ compiler. */
    int update_simplex(double * point, int dim, double & fmax,
                       double * midpoint, double * line, double scale,
                       double (* func)(double *, int))
    {
        int i, update = 0;
        double * next = alloc_vector(dim), fx;

        for (i = 0; i < dim; i++)
            next[i] = midpoint[i] + scale * line[i];

        fx = (*func)(next, dim);

        if (fx < fmax)
        {
            for (i = 0; i < dim; i++)
                point[i] = next[i];
            fmax = fx;
            update = 1;
        }

        free_vector(next, dim);
        return update;
    }

Contraction …

[Figure: simplex with vertices x0, x1, x2 and the point mid]

"passing through the eye of a needle"

If a simple contraction doesn't improve things, then try moving all points towards the current minimum

C Code: Contracting The Simplex

    /* Move every vertex halfway towards the best vertex and re-evaluate */
    void contract_simplex(double ** simplex, int dim,
                          double * fx, int ilo,
                          double (*func)(double *, int))
    {
        int i, j;

        for (i = 0; i < dim + 1; i++)
            if (i != ilo)
            {
                for (j = 0; j < dim; j++)
                    simplex[i][j] = (simplex[ilo][j] + simplex[i][j]) * 0.5;
                fx[i] = (*func)(simplex[i], dim);
            }
    }

Summary: The Simplex Method

[Figure: the original simplex with its high and low vertices marked, and the four possible moves: reflection, reflection and expansion, contraction, multiple contraction]

C Code: Minimization Routine (Part I)

Declares local variables and allocates memory

    double amoeba(double *point, int dim,
                  double (*func)(double *, int),
                  double tol)
    {
        int ihi, ilo, inhi, j;
        double fmin;
        double * fx = alloc_vector(dim + 1);
        double * midpoint = alloc_vector(dim);
        double * line = alloc_vector(dim);
        double ** simplex = make_simplex(point, dim);

        evaluate_simplex(simplex, dim, fx, func);

C Code: Minimization Routine (Part II)

        while (true)
        {
            simplex_extremes(fx, dim, ihi, ilo, inhi);
            simplex_bearings(simplex, dim, midpoint, line, ihi);

            if (check_tol(fx[ihi], fx[ilo], tol)) break;

            /* Reflection: the default trial point */
            update_simplex(simplex[ihi], dim, fx[ihi],
                           midpoint, line, -1.0, func);

            if (fx[ihi] < fx[ilo])
                /* New minimum: try expanding further */
                update_simplex(simplex[ihi], dim, fx[ihi],
                               midpoint, line, -2.0, func);
            else if (fx[ihi] >= fx[inhi])
                /* Still the worst point: contract along the line, then
                   contract the whole simplex if that fails */
                if (!update_simplex(simplex[ihi], dim, fx[ihi],
                                    midpoint, line, 0.5, func))
                    contract_simplex(simplex, dim, fx, ilo, func);
        }

C Code: Minimization Routine (Part III)

Store the result and free memory

        for (j = 0; j < dim; j++)
            point[j] = simplex[ilo][j];

        fmin = fx[ilo];

        free_vector(fx, dim + 1);
        free_vector(midpoint, dim);
        free_vector(line, dim);
        free_matrix(simplex, dim + 1, dim);

        return fmin;
    }

C Code: Checking Convergence

    #include <math.h>

    #define ZEPS 1e-10

    /* Converged when the spread between the highest and lowest values
       is small relative to their magnitude (ZEPS guards against zero) */
    int check_tol(double fmax, double fmin, double ftol)
    {
        double delta = fabs(fmax - fmin);
        double accuracy = (fabs(fmax) + fabs(fmin)) * ftol;

        return (delta < (accuracy + ZEPS));
    }

amoeba()

A general purpose minimization routine
• Works in multiple dimensions
• Uses only function evaluations
• Does not require derivatives

Typical usage:
• double my_func(double * x, int n) { … }
• amoeba(point, dim, my_func, 1e-7);

Example Application: Old Faithful Eruptions (n = 272)

[Figure: histogram "Old Faithful Eruptions"; x-axis: Duration (mins); y-axis: Frequency]

Fitting a Normal Distribution

Fit two parameters
• Mean
• Variance

Requires ~165 likelihood evaluations
• Mean = 3.4878
• Variance = 1.2979
• Maximum log-likelihood = -421.42

Nice fit, eh?

[Figure: two panels, "Old Faithful Eruptions" (y-axis: Frequency) and "Fitted Distribution" (y-axis: Density); x-axis in both: Duration (mins)]

A Mixture of Two Normals

Fit 5 parameters
• Proportion in the first component
• Two means
• Two variances

Required ~700 evaluations
• First component contributes 0.34841 of mixture
• Means are 2.0186 and 4.2734
• Variances are 0.055517 and 0.19102
• Maximum log-likelihood = -276.36

Two Components

[Figure: two panels, "Old Faithful Eruptions" (y-axis: Frequency) and "Fitted Distribution" (y-axis: Density); x-axis in both: Duration (mins)]

A Mixture of Three Normals

Fit 8 parameters
• Proportion in the first two components
• Three means
• Three variances

Required ~1400 evaluations
• Did not always converge!

One of the best solutions …
• Components contributing 0.339, 0.512 and 0.149
• Component means are 2.002, 4.401 and 3.727
• Variances are 0.0455, 0.106 and 0.2959
• Maximum log-likelihood = -267.89

Three Components

[Figure: two panels, "Old Faithful Eruptions" (y-axis: Frequency) and "Fitted Distribution" (y-axis: Density); x-axis in both: Duration (mins)]


Tricky Minimization Questions

Fitting variables that are constrained• Proportions vary between 0 and 1• Variances must be positive

Selecting the number of components

Checking convergence

Improvements to amoeba()

Different scaling along each dimension
• If parameters have different impact on the likelihood

Track total function evaluations
• Avoid getting stuck if function does not cooperate

Rotate simplex
• If the current simplex is leading to slow improvement

optim() Function in R

optim(point, function, method)

• Point – starting point for minimization
• Function that accepts point as argument
• Method can be
  • "Nelder-Mead" for simplex method (default)
  • "BFGS", "CG" and other options use gradient

One parameter at a time

Simple but inefficient approach

Consider
• Parameters θ = (θ1, θ2, …, θk)
• Function f(θ)

Maximize f(θ) with respect to each θi in turn
• Cycle through parameters
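The cycling scheme can be sketched as below. To match the minimization convention of the lecture's code, the sketch minimizes, so a likelihood would be passed in negated. It assumes a simple golden section search over a fixed window around each coordinate. All names, including the example objective tilted_bowl, are hypothetical.

```c
#include <assert.h>
#include <math.h>

#define GOLD 0.38196601125   /* golden fraction, (3 - sqrt(5)) / 2 */

/* Evaluate func with coordinate i of theta temporarily set to value */
static double eval_at(double (*func)(double *, int), double * theta,
                      int dim, int i, double value)
{
    double saved = theta[i], fx;
    theta[i] = value;
    fx = (*func)(theta, dim);
    theta[i] = saved;
    return fx;
}

/* Minimize along coordinate i only, searching theta[i] +/- width.
   Assumes func has a single minimum along that segment. */
static void minimize_coordinate(double (*func)(double *, int),
                                double * theta, int dim, int i,
                                double width, double tol)
{
    double a = theta[i] - width, c = theta[i] + width;

    while (c - a > tol)
    {
        double x1 = a + GOLD * (c - a), x2 = c - GOLD * (c - a);

        if (eval_at(func, theta, dim, i, x1) < eval_at(func, theta, dim, i, x2))
            c = x2;   /* minimum lies in [a, x2] */
        else
            a = x1;   /* minimum lies in [x1, c] */
    }
    theta[i] = 0.5 * (a + c);
}

/* Cycle through the parameters, optimizing one at a time */
void one_at_a_time(double (*func)(double *, int), double * theta, int dim,
                   double width, double tol, int cycles)
{
    int cycle, i;

    assert(width > 0.0 && tol > 0.0);
    for (cycle = 0; cycle < cycles; cycle++)
        for (i = 0; i < dim; i++)
            minimize_coordinate(func, theta, dim, i, width, tol);
}

/* Illustrative objective with correlated parameters; minimum at (1, -2) */
static double tilted_bowl(double * t, int dim)
{
    double u = t[0] - 1.0, v = t[1] + 2.0;
    (void) dim;
    return u * u + v * v + 0.5 * u * v;
}
```

Because the two parameters of tilted_bowl are correlated, a single cycle does not reach the minimum; several cycles are needed, which is the inefficiency this approach is known for.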

The Inefficiency…

[Figure: plot with axes θ1 and θ2]

Steepest Descent

Consider
• Parameters θ = (θ1, θ2, …, θk)
• Function f(θ; x)

Score vector

$$S = \frac{d \ln f}{d\theta} = \left( \frac{d \ln f}{d\theta_1}, \ldots, \frac{d \ln f}{d\theta_k} \right)$$

• Find maximum along θ + δS


Still inefficient…

Consecutive steps are still perpendicular!

Multidimensional Minimization

Typically, sophisticated methods will…

Use derivatives
• May be calculated numerically. How?

Select a direction for minimization, using:
• Weighted average of previous directions
• Current gradient
• Avoid right angle turns
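One possible answer to the "How?" above is a central difference approximation, sketched below for a function with the same signature as the func argument of amoeba(). The names numerical_gradient and example_objective, and the way the step is scaled by the size of each coordinate, are assumptions made for illustration.

```c
#include <assert.h>
#include <math.h>

/* Approximate the gradient of func at x by central differences:
   grad[i] ~ (f(x + step * e_i) - f(x - step * e_i)) / (2 * step) */
void numerical_gradient(double (*func)(double *, int), double * x, int dim,
                        double * grad, double h)
{
    int i;

    assert(h > 0.0);
    for (i = 0; i < dim; i++)
    {
        double saved = x[i];
        double step = h * (fabs(saved) + 1.0);   /* scale step to x[i] */
        double fplus, fminus;

        x[i] = saved + step;
        fplus = (*func)(x, dim);
        x[i] = saved - step;
        fminus = (*func)(x, dim);
        x[i] = saved;                            /* restore coordinate */

        grad[i] = (fplus - fminus) / (2.0 * step);
    }
}

/* Illustrative objective: f(x) = x0^2 + 3 * x1, gradient (2 * x0, 3) */
static double example_objective(double * x, int dim)
{
    (void) dim;
    return x[0] * x[0] + 3.0 * x[1];
}
```

Each gradient costs two function evaluations per dimension, so derivative-based methods built on this approximation are not free of extra evaluations.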

Recommended Reading

Numerical Recipes in C (or C++, or Fortran)
• Press, Teukolsky, Vetterling, Flannery
• Chapter 10.4

Clear description of Simplex Method
• Other sub-chapters illustrate more sophisticated methods

Online at
• http://www.numerical-recipes.com/