Application of Order Statistics to Termination of Stochastic Algorithms. Vaida Bartkutė, Leonidas Sakalauskas


Slide 1
APPLICATION OF ORDER STATISTICS TO TERMINATION OF STOCHASTIC ALGORITHMS
Vaida Bartkutė, Leonidas Sakalauskas

Slides 2-3: Outline
- Introduction
- Application of order statistics to optimality testing and termination of the algorithm:
  - Stochastic Approximation algorithms
  - Simulated Annealing algorithm
- Experimental results
- Conclusions

Slide 4: Introduction
Termination of the algorithm is a topical problem in stochastic and heuristic optimization. We consider the application of order statistics to establish optimality in Markov-type optimization algorithms. We build a method for estimating confidence intervals of the minimum using order statistics, which is implemented for optimality testing and termination.

Slide 5: Statement of the problem
The optimization (minimization) problem is
$$\min_{x \in \mathbb{R}^n} f(x),$$
where $f$ is a locally Lipschitz function bounded from below. Denote the generalized gradient of this function by $\partial f(x)$. Let $\{x^t\}$ be the sequence constructed by a stochastic search algorithm, where $\eta_t = f(x^t)$, $t = 0, 1, \ldots$

Slide 6: The Markovian algorithms for optimization
A Markovian random search algorithm represents a Markov chain in which the probability distribution of the point $x^{t+1}$ depends on the location of the previous point $x^t$ and the function value $\eta_t = f(x^t)$ at it. Examples: Stochastic Approximation; Simulated Annealing; Random Search (Rastrigin method); etc.

Slide 7: Order statistics and target values for optimality testing and termination
- Beginning of the problem: Mockus (1968)
- Theoretical background: Žilinskas, Zhigljavsky (1991)
- Application to maximum location: Chen (1996)
- Time-to-target-solution value: Aiex, Resende & Ribeiro (2002), Pardalos (2005)

Slide 8: Method for optimality testing by order statistics
We build a method for estimating the minimum $M$ of the objective function using the values of the function provided in optimization. From the sample $H = \{\eta_1, \eta_2, \ldots, \eta_N\}$ only the $k+1$ smallest order statistics $\eta_{(1)} \le \eta_{(2)} \le \ldots \le \eta_{(k+1)}$ are chosen, where $k \ll N$.

Slide 9
We apply linear estimators for the estimation of the minimum:
$$\hat{M} = \sum_{i=1}^{k+1} a_i \eta_{(i)}, \qquad \sum_{i=1}^{k+1} a_i = 1.$$
We examine a simple set of weights (Hall (1982)).

Slide 10
Here $\alpha = n/\beta$, where $\alpha$ is the parameter of the extreme-value distribution, $n$ is the dimension, and $\beta$ is the parameter of homogeneity of the function $f(x)$ (Žilinskas & Zhigljavsky (1991)). The one-sided confidence interval for the minimum value of the objective function, with confidence level $\gamma$, is built from $\eta_{(1)}$ and $\eta_{(k+1)}$ (a sketch of the construction follows Slide 12).

Slide 11: Stochastic Approximation
Smoothing is the standard approach to nondifferentiable optimization. We consider a function smoothed by the Lipschitz perturbation operator
$$f_\sigma(x) = \mathbb{E}\, f(x + \sigma \xi),$$
where $\sigma$ is the value of the perturbation parameter and $\xi$ is a random vector distributed with density $p(\cdot)$. If the density $p(\cdot)$ is locally Lipschitz, then functions smoothed by this operator are twice continuously differentiable (Rubinstein & Shapiro (1993), Bartkutė & Sakalauskas (2004)).

Slide 12
The optimizing sequence is
$$x^{t+1} = x^t - \rho_t g^t,$$
where $g^t$ is the stochastic gradient and $\rho_t$ is a scalar multiplier. This scheme is the same for the different Stochastic Approximation algorithms, which differ only in the approach to stochastic gradient estimation. The minimizing sequence converges a.s. to the solution of the optimization problem under conditions typical for SA algorithms (Ermoliev (1976), Mikhalevitch et al. (1987), Spall (1992), Bartkutė & Sakalauskas (2004)).
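A minimal numerical sketch of the interval from Slides 8-10, assuming one standard extreme-value construction based on the spacing $\eta_{(k+1)} - \eta_{(1)}$ with tail index $\alpha = n/\beta$; the talk's exact Hall (1982) weights and coefficients may differ, and all function and variable names below are ours:

```python
import numpy as np

def min_confidence_interval(values, alpha, k=5, gamma=0.95):
    """One-sided confidence interval [lower, eta_(1)] for the minimum M
    of the objective function, from a sample of observed function values.

    Assumes the smallest values follow a Weibull-type extreme-value law
    with tail index alpha (alpha = n / beta: n = dimension, beta =
    homogeneity degree of f near the minimum).

    Uses a classical construction based on the spacing
    eta_(k+1) - eta_(1): with confidence gamma,
        M >= eta_(1) - r * (eta_(k+1) - eta_(1)),
        r = u / (1 - u),  u = (1 - (1 - gamma)**(1/k))**(1/alpha).
    This is one standard variant; the talk's coefficients may differ.
    """
    eta = np.sort(np.asarray(values, dtype=float))
    e1, ek1 = eta[0], eta[k]          # eta_(1) and eta_(k+1)
    u = (1.0 - (1.0 - gamma) ** (1.0 / k)) ** (1.0 / alpha)
    r = u / (1.0 - u)
    return e1 - r * (ek1 - e1), e1

# Usage: function values recorded over a 2-D problem with a sharp
# (beta = 1) minimum, so alpha = n / beta = 2.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(5000, 2))
f = np.abs(x).sum(axis=1)             # sharp minimum 0 at the origin
lo, hi = min_confidence_interval(f, alpha=2.0, k=5)
print(f"M in [{lo:.4f}, {hi:.4f}] with ~95% confidence (true M = 0)")
```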
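Slides 11-12 define the smoothed objective and the iteration $x^{t+1} = x^t - \rho_t g^t$; the table on Slide 13 below lists the gradient estimators. Here is a minimal sketch of one such iteration, assuming a symmetric-difference estimator with the perturbation uniform on the unit sphere (a standard estimator of the gradient of the ball-smoothed function); the talk's SPSAL variant uses a Lipschitz smoothing density, and our step schedules are illustrative only:

```python
import numpy as np

def spsa_ball(f, x0, iters=5000, seed=1):
    """Minimal SPSA-type iteration x_{t+1} = x_t - rho_t * g_t.

    g_t is a symmetric-difference estimator with perturbation xi uniform
    on the unit sphere; its expectation is the gradient of f smoothed
    over the ball of radius sigma_t.  rho_t and sigma_t follow the usual
    conditions rho_t -> 0, sum(rho_t) = inf, sigma_t -> 0.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    n = x.size
    history = []                              # f values for the order-statistics test
    for t in range(1, iters + 1):
        rho, sigma = 1.0 / t, t ** -0.25      # illustrative schedules
        xi = rng.normal(size=n)
        xi /= np.linalg.norm(xi)              # uniform on the unit sphere
        g = n * (f(x + sigma * xi) - f(x - sigma * xi)) / (2 * sigma) * xi
        x -= rho * g
        history.append(f(x))
    return x, history

# Usage on a sharp-minimum test function (like the generated ones on Slide 15):
f = lambda x: np.abs(x).sum()
xmin, hist = spsa_ball(f, x0=np.ones(2))
print(xmin)                                   # should approach the minimizer (0, 0)
```

The recorded history feeds directly into min_confidence_interval above, which is how optimality testing and termination attach to the iteration.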
Slide 13
ALGORITHM | ESTIMATE OF STOCHASTIC GRADIENT
SPSAL | Lipschitz smoothing density (Bartkutė & Sakalauskas (2007)); $\xi$ uniformly distributed in the unit ball
SPSAU | uniform smoothing density in the hypercube (Mikhalevitch et al. (1976), (1987)); $\xi$ uniformly distributed in the hypercube $[-1, 1]^n$
FDSA | standard finite differences (Ermoliev (1988), Mikhalevitch et al. (1987)); perturbation vectors with zero components except the $i$-th, equal to 1
Here $\sigma$ is the smoothing parameter.

Slide 14: Rate of convergence
Consider a function $f(x)$ with a sharp minimum at the point $x^*$, to which the algorithm converges. Then the rate of convergence is bounded in terms of certain constants $A > 0$, $H > 0$, $K > 0$, where $\tilde{x}_\sigma$ is the minimum point of the smoothed function (Sakalauskas, Bartkutė (2007)).

Slide 15: Experimental results
- Unimodal test functions (SPSAL, SPSAU, FDSA): generated functions with sharp minimum; CB3; Rosen-Suzuki.
- Multiextremal test functions (Simulated Annealing (SA)): Branin; Beale; Rastrigin.

Slide 16
The coefficients of the optimizing sequence were chosen according to the convergence conditions (Bartkutė & Sakalauskas (2006)). Samples of T = 500 test functions were generated and minimized by SPSA with Lipschitz perturbation.

Slide 17: Testing the hypothesis about the Pareto distribution
If the order statistics follow a Weibull distribution, then the suitably normalized statistics are distributed according to a Pareto distribution (Žilinskas, Zhigljavsky (1991)). Thus the statistical hypothesis H0 of a Pareto distribution is tested.

Slide 18: Testing the hypothesis about the Pareto distribution
The hypothesis was tested by the $\omega^2$ criterion for various stochastic algorithms (critical value 0.46); a sketch of such a check follows Slide 25.

Slide 19: One-sided confidence interval
$[\hat{M}_\gamma, \eta_{(1)}]$, $\gamma = 0.95$.
(V. Bartkutė, L. Sakalauskas. Application of Order Statistics to Termination of Stochastic Algorithms. CWU Workshop, December 10-12, 2007.)

Slide 20: Confidence bounds of the minimum (figure)

Slide 21: Confidence bounds of the hitting probability (figure)

Slide 22: Termination criterion of the algorithms
Stop the algorithm when the confidence interval of the minimum becomes smaller than an admissible value $\varepsilon$:
$$\eta_{(1)} - \hat{M}_\gamma \le \varepsilon.$$

Slide 23: Number of iterations after the termination of the algorithm (figure)

Slide 24: Simulated Annealing algorithm
I. Choose the temperature updating function, the neighborhood size function, the solution generation density function, and the initial solution $x^0$ (Yang (2000)).
II. Construct the optimizing sequence, accepting a worse candidate point with a probability controlled by the current temperature (see the sketch after Slide 25).

Slide 25: Experimental results
Consider the results of optimality testing with the Beale test function
$$F(x, y) = (1.5 - x + xy)^2 + (2.25 - x + xy^2)^2 + (2.625 - x + xy^3)^2,$$
with search domain $-4.5 \le x, y \le 4.5$. It is known that this function has a few local minima and that the global minimum is 0 at the point $(3, 0.5)$.
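Slides 22, 24, and 25 combine naturally into one experiment: run Simulated Annealing on the Beale function and stop once the one-sided confidence interval of the minimum is shorter than an admissible $\varepsilon$. A minimal sketch, assuming geometric cooling and Gaussian solution generation (the talk follows Yang (2000); our schedules, names, and $\varepsilon$ are illustrative) and reusing min_confidence_interval from the sketch after Slide 12. Beale's minimum is quadratic, so $\beta = 2$ and $\alpha = n/\beta = 1$:

```python
import numpy as np

def beale(p):
    x, y = p
    return ((1.5 - x + x*y)**2 + (2.25 - x + x*y**2)**2
            + (2.625 - x + x*y**3)**2)

def sa_with_termination(f, x0, alpha, eps=1e-2, t_max=20000,
                        temp0=1.0, cool=0.999, step=0.5, seed=2):
    """Simulated Annealing with the order-statistics termination rule:
    stop when the one-sided CI of the minimum is shorter than eps."""
    rng = np.random.default_rng(seed)
    x, fx = np.asarray(x0, dtype=float), f(x0)
    temp, history = temp0, [fx]
    for t in range(1, t_max + 1):
        y = x + rng.normal(scale=step, size=x.size)   # solution generation
        y = np.clip(y, -4.5, 4.5)                     # Beale search domain
        fy = f(y)
        if fy < fx or rng.random() < np.exp((fx - fy) / temp):
            x, fx = y, fy                             # accept the candidate
        history.append(fy)
        temp *= cool                                  # temperature update
        if t % 100 == 0:                              # termination check
            lo, hi = min_confidence_interval(history, alpha=alpha, k=5)
            if hi - lo < eps:
                return x, fx, (lo, hi), t
    return x, fx, min_confidence_interval(history, alpha=alpha, k=5), t_max

xbest, fbest, ci, iters = sa_with_termination(beale, np.zeros(2), alpha=1.0)
print(f"stopped after {iters} iterations: f = {fbest:.4f}, CI = {ci}")
```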
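Slides 17-18 justify the construction: under the Weibull hypothesis for the record values, suitably normalized statistics are Pareto-distributed, and the fit is checked with the $\omega^2$ (Cramér-von Mises) criterion against the critical value 0.46. Below is a minimal sketch of such a check on synthetic record values with known minimum M = 0; our normalization via reciprocals is one standard choice, not necessarily the talk's, and SciPy is assumed to be available:

```python
import numpy as np
from scipy import stats

# Synthetic record values: best values of T independent runs on a problem
# with true minimum 0 and tail index alpha.  Under the Weibull hypothesis
# P(W <= t) = t**alpha near 0, the reciprocals 1/W are exactly
# Pareto(alpha)-distributed, which is what H0 asserts.
alpha, T = 2.0, 500
rng = np.random.default_rng(3)
w = rng.uniform(size=T) ** (1.0 / alpha)          # record values under H0
res = stats.cramervonmises(1.0 / w, "pareto", args=(alpha,))
print(f"omega^2 statistic = {res.statistic:.3f}, p = {res.pvalue:.3f}")
```

If the statistic stays below the critical value 0.46 quoted on Slide 18, H0 (the Pareto distribution) is not rejected, which is what licenses the confidence intervals used by the termination rule.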
Slide 26: Confidence bounds of the minimum (figure)

Slide 27: Confidence bounds of the hitting probability (figure)

Slide 28: Number of iterations after the termination of the algorithm (figure)

Slide 29: Conclusions
- A linear estimator for the minimum has been proposed using the theory of order statistics and studied experimentally.
- The developed procedures are simple and depend only on the parameter $\alpha$ of the extreme-value distribution.
- The parameter $\alpha$ is easily estimated using the homogeneity of the objective function or in a statistical way.
- Theoretical considerations and computer examples have shown that the confidence interval of a function extremum can be estimated with admissible accuracy as the number of iterations increases.
- A termination rule using the minimum confidence interval was proposed and implemented for Stochastic Approximation and Simulated Annealing.