metaheuristic optimization: algorithm analysis and open problems
DESCRIPTION
This is a keynote talk at the 10th Symposium on Experimental Algorithms (SEA 2011) in Greece, 2011.
TRANSCRIPT
Intro Metaheuristic Algorithms Applications Markov Chains Analysis All NFL Open Problems Thanks
Metaheuristic Optimization: Algorithm Analysis and Open Problems
Xin-She Yang
National Physical Laboratory, UK
@ SEA 2011
Xin-She Yang 2011
Metaheuristics and Optimization
Intro
Computational science is now the third paradigm of science, complementing theory and experiment.
- Ken Wilson (Cornell University), Nobel Laureate.
All models are inaccurate, but some are useful.
- George Box, Statistician
All algorithms perform equally well on average over all possible functions. Not quite! (more later)
- No-free-lunch theorems (Wolpert & Macready)
Overview
Introduction
Metaheuristic Algorithms
Applications
Markov Chains and Convergence Analysis
Exploration and Exploitation
Free Lunch or No Free Lunch?
Open Problems
Metaheuristic Algorithms
Essence of an Optimization Algorithm
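To move to a new, better point $x_{i+1}$ from an existing known location $x_i$.
[Figure: a search path through points $x_1$, $x_2$, ..., $x_i$, with the next point $x_{i+1}$ still to be chosen.]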
Population-based algorithms use multiple, interacting paths.
Different algorithms
Different strategies/approaches in generating these moves!
Optimization Algorithms
Deterministic
Newton’s method (1669, published in 1711), Newton-Raphson (1690), hill-climbing/steepest descent (Cauchy 1847), least-squares (Gauss 1795),
linear programming (Dantzig 1947), conjugate gradient (Lanczos et al. 1952), interior-point method (Karmarkar 1984), etc.
Stochastic/Metaheuristic
Genetic algorithms (1960s/1970s), evolutionary strategy (Rechenberg & Schwefel 1960s), evolutionary programming (Fogel et al. 1960s).
Simulated annealing (Kirkpatrick et al. 1983), Tabu search (Glover 1980s), ant colony optimization (Dorigo 1992), genetic programming (Koza 1992), particle swarm optimization (Kennedy & Eberhart 1995), differential evolution (Storn & Price 1996/1997),
harmony search (Geem et al. 2001), honeybee algorithm (Nakrani & Tovey 2004), ..., firefly algorithm (Yang 2008), cuckoo search (Yang & Deb 2009), ...
Steepest Descent/Hill Climbing
Gradient-Based Methods
Use gradient/derivative information – very efficient for local search.
Newton’s Method
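$$x_{n+1} = x_n - H^{-1} \nabla f, \qquad H = \begin{pmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial^2 f}{\partial x_n \partial x_1} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2} \end{pmatrix}.$$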
Quasi-Newton
If $H^{-1}$ is replaced by $\alpha I$, we have
$$x_{n+1} = x_n - \alpha I \nabla f(x_n),$$
which is the steepest-descent update. Here $\alpha$ controls the step length.
Generation of new moves by gradient.
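As a rough illustration of the two update rules, the sketch below applies one Newton step and a run of fixed-step gradient updates to a small quadratic. The test function, the starting point and the step length `alpha=0.04` are assumptions chosen for this example, not values from the talk.

```python
def grad(x, y):
    """Gradient of the assumed test function f(x, y) = (x - 1)^2 + 10 (y + 2)^2."""
    return 2.0 * (x - 1.0), 20.0 * (y + 2.0)


def newton_step(x, y):
    """One Newton update x - H^{-1} grad f; here H = diag(2, 20) is constant."""
    gx, gy = grad(x, y)
    return x - gx / 2.0, y - gy / 20.0


def gradient_step(x, y, alpha):
    """One update with H^{-1} replaced by alpha * I."""
    gx, gy = grad(x, y)
    return x - alpha * gx, y - alpha * gy


def run_gradient(x, y, alpha=0.04, steps=300):
    """Repeat the fixed-step gradient update a given number of times."""
    for _ in range(steps):
        x, y = gradient_step(x, y, alpha)
    return x, y


print(newton_step(5.0, 5.0))   # a quadratic is minimized in a single Newton step
print(run_gradient(5.0, 5.0))  # the gradient update approaches the same point gradually
```

On a quadratic the Newton step lands on the minimizer at once, while the gradient update needs many iterations and a step length small enough for the steepest direction.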
Simulated Annealing
Metal annealing to increase strength $\Longrightarrow$ simulated annealing.
Probabilistic move: $p \propto \exp[-E/(k_B T)]$.
$k_B$ = Boltzmann constant (e.g., $k_B = 1$), $T$ = temperature, $E$ = energy.
$E \propto f(x)$, $T = T_0 \alpha^t$ (cooling schedule), $0 < \alpha < 1$.
$T \to 0 \Longrightarrow p \to 0 \Longrightarrow$ hill climbing.
This is essentially a Markov chain. Generation of new moves by Markov chain.
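A minimal sketch of the scheme above, assuming $k_B = 1$, taking $E$ as the change in $f$ between the current and the proposed point, and using a Gaussian proposal. The function name, the proposal and all parameter values are illustrative assumptions.

```python
import math
import random


def simulated_annealing(f, x0, t0=1.0, alpha=0.95, steps=2000, step_size=0.5, seed=0):
    """Minimize a 1-D function f; all default values are assumed for illustration."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best_x, best_f = x, fx
    for t in range(steps):
        temp = t0 * alpha ** t                 # cooling schedule T = T0 * alpha^t
        cand = x + rng.gauss(0.0, step_size)   # proposed move (assumed Gaussian)
        fc = f(cand)
        delta = fc - fx                        # energy change, with E taken as f
        # Improvements are always accepted; worse moves with probability exp(-delta / T).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            x, fx = cand, fc
            if fx < best_f:
                best_x, best_f = x, fx
    return best_x, best_f


best_x, best_f = simulated_annealing(lambda x: (x - 2.0) ** 2, x0=10.0)
print(best_x, best_f)
```

As the temperature shrinks, the acceptance probability for worse moves goes to zero and the loop behaves like hill climbing, which matches the limit stated on the slide.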
Genetic Algorithms
[Figure: crossover and mutation operators.]
Generation of new solutions by crossover, mutation and elitism.
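The three operators can be sketched on bit strings as follows. The OneMax fitness, tournament selection, one-point crossover and every parameter value here are assumptions made for the example; the talk does not specify them.

```python
import random


def one_max(bits):
    """Assumed toy fitness: the number of 1-bits in the string."""
    return sum(bits)


def evolve(n=20, length=30, mu=0.02, generations=60, elite=2, seed=1):
    """Return the best individual and the best fitness seen in each generation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(n)]
    history = []
    for _ in range(generations):
        pop.sort(key=one_max, reverse=True)
        history.append(one_max(pop[0]))
        nxt = [ind[:] for ind in pop[:elite]]            # elitism: copy the best unchanged
        while len(nxt) < n:
            # Tournament selection of two parents (an assumed selection rule).
            p1 = max(rng.sample(pop, 2), key=one_max)
            p2 = max(rng.sample(pop, 2), key=one_max)
            cut = rng.randint(1, length - 1)             # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Mutation: flip each bit independently with probability mu.
            child = [b ^ 1 if rng.random() < mu else b for b in child]
            nxt.append(child)
        pop = nxt
    pop.sort(key=one_max, reverse=True)
    history.append(one_max(pop[0]))
    return pop[0], history


best, history = evolve()
print(one_max(best), history[0], history[-1])
```

Because the elite individuals are copied without modification, the best fitness in `history` can never decrease from one generation to the next.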
Swarm Intelligence
Ants, bees, birds, fish ...
Simple rules lead to complex behaviour.
Swarming Starlings
PSO
[Figure: particles $x_i$ and $x_j$ and the current global best $g^*$.]
Particle swarm optimization (Kennedy and Eberhart 1995)
$$v_i^{t+1} = v_i^t + \alpha \epsilon_1 (g^* - x_i^t) + \beta \epsilon_2 (x_i^* - x_i^t),$$
$$x_i^{t+1} = x_i^t + v_i^{t+1}.$$
$\alpha, \beta$ = learning parameters, $\epsilon_1, \epsilon_2$ = random numbers.
Without randomness, generation of new moves by weighted average or pattern search.
Adding randomization to increase the diversity of new solutions.
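The update equations can be sketched as below. The velocity clamp `v_max`, the uniform random numbers, the search box and the parameter values are assumptions added so the example stays bounded; they are not part of the equations on the slide.

```python
import random


def sphere(x):
    """Assumed test objective: sum of squares."""
    return sum(c * c for c in x)


def pso(f, dim=2, n=15, alpha=1.5, beta=1.5, v_max=1.0, steps=200, seed=2):
    """Minimize f with the basic PSO update; defaults are illustrative only."""
    rng = random.Random(seed)
    xs = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n)]
    vs = [[0.0] * dim for _ in range(n)]
    pbest = [x[:] for x in xs]                       # x_i^*: best point of each particle
    gbest = min(pbest, key=f)[:]                     # g^*: best point of the swarm
    for _ in range(steps):
        for i in range(n):
            for d in range(dim):
                e1, e2 = rng.random(), rng.random()  # epsilon_1, epsilon_2
                v = (vs[i][d]
                     + alpha * e1 * (gbest[d] - xs[i][d])
                     + beta * e2 * (pbest[i][d] - xs[i][d]))
                vs[i][d] = max(-v_max, min(v_max, v))  # assumed velocity clamp
                xs[i][d] += vs[i][d]
            if f(xs[i]) < f(pbest[i]):
                pbest[i] = xs[i][:]
                if f(pbest[i]) < f(gbest):
                    gbest = pbest[i][:]
    return gbest


g = pso(sphere)
print(g, sphere(g))
```

The clamp is one common way to keep velocities from growing without bound when no damping term is present; other choices are possible.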
PSO Convergence
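Consider a 1D system without randomness (Clerc & Kennedy 2002):
$$v_i^{t+1} = v_i^t + \alpha (x_i^* - x_i^t) + \beta (g - x_i^t), \qquad x_i^{t+1} = x_i^t + v_i^{t+1}.$$
Considering only one particle and defining $p = \frac{\alpha x_i^* + \beta g}{\alpha + \beta}$, $\phi = \alpha + \beta$, and setting $y^t = p - x_i^t$, we have
$$\begin{cases} v^{t+1} = v^t + \phi y^t, \\ y^{t+1} = -v^t + (1 - \phi) y^t. \end{cases}$$
This can be written as
$$U_t = \begin{pmatrix} v^t \\ y^t \end{pmatrix}, \quad A = \begin{pmatrix} 1 & \phi \\ -1 & 1 - \phi \end{pmatrix}, \quad \Longrightarrow \quad U_{t+1} = A U_t,$$
a simple dynamical system whose eigenvalues are
$$\lambda_{\pm} = 1 - \frac{\phi}{2} \pm \frac{\sqrt{\phi^2 - 4\phi}}{2}.$$
Periodic, quasi-periodic depending on $\phi$. Convergence for $\phi \approx 4$.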
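To see how the eigenvalues behave, the sketch below evaluates the closed-form expression for a few values of $\phi$. The sample values of `phi` are arbitrary choices for illustration.

```python
import cmath


def pso_eigenvalues(phi):
    """Eigenvalues of A = [[1, phi], [-1, 1 - phi]] from the closed-form expression."""
    root = cmath.sqrt(phi * phi - 4.0 * phi)
    return 1.0 - phi / 2.0 + root / 2.0, 1.0 - phi / 2.0 - root / 2.0


for phi in (0.5, 2.0, 3.9, 4.0, 4.5):
    lam_plus, lam_minus = pso_eigenvalues(phi)
    print(phi, abs(lam_plus), abs(lam_minus))
```

Since $\det A = 1$, the two eigenvalues always multiply to one. For $0 < \phi < 4$ they form a complex pair on the unit circle, which gives oscillation without decay; for $\phi > 4$ they are real and one of them has modulus greater than one.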
Ant and Bee Algorithms
Ant Colony Optimization (Dorigo 1992)
Bee algorithms & many variants (Nakrani & Tovey 2004, Karaboga 2005, Yang 2005, Afshar et al. 2007, ..., others).
Advantages
Very promising for combinatorial optimization, but for continuous problems, they may not be the best choice.
Ant & Bee Algorithms
Pheromone based
Each agent follows paths with higher pheromone concentration (quasi-randomly)
Pheromone evaporates (exponentially) with time
Firefly Algorithm
Firefly Algorithm by Xin-She Yang (2008)
(Xin-She Yang, Nature-Inspired Metaheuristic Algorithms, Luniver Press (2008).)
Firefly Behaviour and Idealization
Fireflies are unisex and brightness varies with distance.
Less bright ones will be attracted to bright ones.
If no brighter firefly can be seen, a firefly will move randomly.
$$x_i^{t+1} = x_i^t + \beta_0 e^{-\gamma r_{ij}^2} (x_j^t - x_i^t) + \alpha \epsilon_i^t.$$
Generation of new solutions by random walk and attraction.
FA Convergence
For the firefly motion without the randomness term, we focus on a single agent and replace $x_j^t$ by $g$:
$$x_i^{t+1} = x_i^t + \beta_0 e^{-\gamma r_i^2} (g - x_i^t),$$
where the distance $r_i = \|g - x_i^t\|_2$.
In the 1-D case, we set $y_t = g - x_i^t$ and $u_t = \sqrt{\gamma}\, y_t$, so that
$$u_{t+1} = u_t \left[1 - \beta_0 e^{-u_t^2}\right].$$
Analyzing this using the same methodology as for the logistic map $u_{t+1} = \lambda u_t (1 - u_t)$, we have a corresponding chaotic map, focusing on the transition from periodic multiple states to chaotic behaviour.
Convergence can be achieved for $\beta_0 < 2$. There is a transition from periodic to chaos at $\beta_0 \approx 4$.
Chaotic characteristics can often be used as an efficient mixing technique for generating diverse solutions.
Too much attraction may cause chaos :)
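The one-dimensional map $u_{t+1} = u_t[1 - \beta_0 e^{-u_t^2}]$ is easy to iterate directly. The sketch below does so for two values of $\beta_0$; the starting point `0.5` and the iteration counts are assumptions for illustration.

```python
import math


def fa_map(u, beta0):
    """One iteration of u_{t+1} = u_t * (1 - beta0 * exp(-u_t^2))."""
    return u * (1.0 - beta0 * math.exp(-u * u))


def iterate(u0, beta0, steps):
    """Apply the map `steps` times starting from u0."""
    u = u0
    for _ in range(steps):
        u = fa_map(u, beta0)
    return u


print(iterate(0.5, 1.5, 200))   # beta0 < 2: the iterate shrinks towards 0
print(iterate(0.5, 3.0, 500))   # beta0 > 2: the iterate no longer settles at 0
```

Near $u = 0$ the map multiplies $u$ by roughly $1 - \beta_0$, so the fixed point at zero is attracting only when $|1 - \beta_0| < 1$, in line with the $\beta_0 < 2$ condition. With $\beta_0 = 3$ the iterate instead alternates in sign with magnitude close to $\sqrt{\ln(\beta_0/2)}$, a period-2 state.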
Cuckoo Breeding Behaviour
Evolutionary Advantages
Cuckoos dump their eggs in the nests of host birds and let these host birds raise their chicks.
Cuckoo Video (BBC)
Cuckoo Search
Cuckoo Search by Xin-She Yang and Suash Deb (2009)
(Xin-She Yang and Suash Deb, Cuckoo search via Lévy flights, in: Proceedings of World Congress on Nature & Biologically Inspired Computing (NaBIC 2009, India), IEEE Publications, USA, pp. 210-214 (2009). Also, Xin-She Yang and Suash Deb, Engineering Optimization by Cuckoo Search, Int. J. Mathematical Modelling and Numerical Optimisation, Vol. 1, No. 4, 330-343 (2010).)
Cuckoo Behaviour and Idealization
Each cuckoo lays one egg (solution) at a time, and dumps its egg in a randomly chosen nest.
The best nests with high-quality eggs (solutions) will carry over to the next generation.
The egg laid by a cuckoo can be discovered by the host bird with a probability $p_a$, and a new nest will then be built.
Cuckoo Search
Local random walk:
$$x_i^{t+1} = x_i^t + s \otimes H(p_a - \epsilon) \otimes (x_j^t - x_k^t).$$
[$x_i$, $x_j$, $x_k$ are 3 different solutions, $H(u)$ is a Heaviside function, $\epsilon$ is a random number drawn from a uniform distribution, and $s$ is the step size.]
Global random walk via Lévy flights:
$$x_i^{t+1} = x_i^t + \alpha L(s, \lambda), \qquad L(s, \lambda) = \frac{\lambda \Gamma(\lambda) \sin(\pi \lambda / 2)}{\pi} \frac{1}{s^{1+\lambda}}, \quad (s \gg s_0).$$
Generation of new moves by Lévy flights, random walk and elitism.
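The two moves can be sketched as follows. The tail density is taken directly from the formula above. The step sampler is a plain Pareto draw used here as a stand-in with the same $1/s^{1+\lambda}$ tail; it is not the sampling scheme of the cuckoo search papers. Drawing $\epsilon$ independently per component, and all names and default values, are likewise assumptions.

```python
import math
import random


def levy_tail_density(s, lam):
    """L(s, lambda) for large s, as given in the formula above."""
    return lam * math.gamma(lam) * math.sin(math.pi * lam / 2.0) / (math.pi * s ** (1.0 + lam))


def power_law_step(rng, lam, s0=0.01):
    """Stand-in sampler: Pareto draw with s >= s0 and density proportional to 1 / s^(1 + lam)."""
    u = 1.0 - rng.random()               # lies in (0, 1], so the power below is finite
    return s0 * u ** (-1.0 / lam)


def local_walk(rng, xi, xj, xk, s, pa):
    """x_i + s * H(pa - eps) * (x_j - x_k), applied component by component."""
    out = []
    for a, b, c in zip(xi, xj, xk):
        eps = rng.random()                   # uniform random number
        h = 1.0 if pa - eps > 0 else 0.0     # Heaviside function
        out.append(a + s * h * (b - c))
    return out


rng = random.Random(4)
print(levy_tail_density(2.0, 1.5))
print([power_law_step(rng, 1.5) for _ in range(5)])
print(local_walk(rng, [0.0, 0.0], [1.0, 2.0], [0.5, 0.5], s=0.1, pa=0.25))
```

The heavy tail means most steps are small while an occasional step is very large, which is what lets the global walk escape a local region.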
Applications
Design optimization: structural engineering, product design ...
Scheduling, routing and planning: often discrete, combinatorial problems ...
Applications in almost all areas (e.g., finance, economics, engineering, industry, ...)
Pressure Vessel Design Optimization
[Figure: pressure vessel with labelled dimensions $r$, $L$, $d_1$ and $d_2$.]
Optimization
This is a well-known test problem for optimization (e.g., see Cagnina et al. 2008) and it can be written as
$$\text{minimize } f(x) = 0.6224\, d_1 r L + 1.7781\, d_2 r^2 + 3.1661\, d_1^2 L + 19.84\, d_1^2 r,$$
subject to
$$g_1(x) = -d_1 + 0.0193\, r \le 0,$$
$$g_2(x) = -d_2 + 0.00954\, r \le 0,$$
$$g_3(x) = -\pi r^2 L - \frac{4\pi}{3} r^3 + 1296000 \le 0,$$
$$g_4(x) = L - 240 \le 0.$$
The simple bounds are
$$0.0625 \le d_1, d_2 \le 99 \times 0.0625, \qquad 10.0 \le r, L \le 200.0.$$
The best solution found so far:
$$f_* = 6059.714, \qquad x_* = (0.8125, 0.4375, 42.0984, 176.6366).$$
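The objective and constraints are simple enough to evaluate directly. The sketch below plugs in the quoted best solution, assuming the components of $x_*$ are ordered as $(d_1, d_2, r, L)$; the function names are chosen for this example.

```python
import math


def cost(d1, d2, r, L):
    """Objective f(x) of the pressure vessel problem."""
    return (0.6224 * d1 * r * L + 1.7781 * d2 * r ** 2
            + 3.1661 * d1 ** 2 * L + 19.84 * d1 ** 2 * r)


def constraints(d1, d2, r, L):
    """Values of g1..g4; each should be <= 0 at a feasible point."""
    return (
        -d1 + 0.0193 * r,
        -d2 + 0.00954 * r,
        -math.pi * r ** 2 * L - 4.0 * math.pi / 3.0 * r ** 3 + 1296000.0,
        L - 240.0,
    )


x_star = (0.8125, 0.4375, 42.0984, 176.6366)   # assumed order: (d1, d2, r, L)
print(cost(*x_star))
print(constraints(*x_star))
```

The computed cost should agree with the quoted $f_*$ to within the rounding of the four-decimal inputs. The values of $g_1$ and $g_3$ come out very close to zero, which suggests both constraints are active at this solution.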
Dome Design
120-bar dome: Divided into 7 groups, 120 design elements, about 200
constraints (Kaveh and Talatahari 2010; Gandomi and Yang 2011).
Tower Design
26-storey tower: 942 design elements, 244 nodal links, 59 groups/types,
> 4000 nonlinear constraints (Kaveh & Talatahari 2010; Gandomi & Yang 2011).
Monte Carlo Methods
Random walk – A drunkard’s walk:
$$u_{t+1} = \mu + u_t + w_t,$$
where $w_t$ is a random variable, and $\mu$ is the drift.
For example, $w_t \sim N(0, \sigma^2)$ (Gaussian).
[Figure: a 1-D random walk over 500 steps, and a random-walk path in 2-D.]
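A random walk with drift takes only a few lines to simulate. In this sketch the Gaussian increments follow the example above, while the function name, the seed and the default values are assumptions.

```python
import random


def random_walk(steps, mu=0.0, sigma=1.0, u0=0.0, seed=3):
    """Return the path u_0, u_1, ..., u_steps of u_{t+1} = mu + u_t + w_t."""
    rng = random.Random(seed)
    path = [u0]
    for _ in range(steps):
        path.append(mu + path[-1] + rng.gauss(0.0, sigma))  # w_t ~ N(0, sigma^2)
    return path


path = random_walk(500)
print(len(path), min(path), max(path))
```

Setting `sigma=0.0` removes the noise and leaves only the drift, so the path becomes the straight line $u_t = u_0 + \mu t$.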
Markov Chains
Markov chain: the next state only depends on the current state and the transition probability.
$$P(i, j) \equiv P(V_{t+1} = S_j \mid V_0 = S_p, \ldots, V_t = S_i) = P(V_{t+1} = S_j \mid V_t = S_i),$$
$$\Longrightarrow \quad P_{ij} \pi_i^* = P_{ji} \pi_j^*, \qquad \pi^* = \text{stationary probability distribution}.$$
Examples: Brownian motion
$$u_{i+1} = \mu + u_i + \epsilon_i, \qquad \epsilon_i \sim N(0, \sigma^2).$$
Markov Chains
Monopoly (board games)
Monopoly Animation
Markov Chain Monte Carlo
Landmarks: Monte Carlo method (1930s, 1945, from 1950s), e.g., Metropolis algorithm (1953), Metropolis-Hastings (1970).
Markov Chain Monte Carlo (MCMC) methods – a class of methods.
Really took off in the 1990s, now applied to a wide range of areas: physics, Bayesian statistics, climate change, machine learning, finance, economy, medicine, biology, materials and engineering ...
Convergence Behaviour
As the MCMC runs, convergence may be reached
When does a chain converge? When to stop the chain ... ?
Are multiple chains better than a single chain?
Convergence Behaviour
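[Figure: three chains (1, 2, 3) traced over time from $t = -n$ (with $t \to -\infty$) up to $t = 2$, with the point where they have converged marked.]
Multiple, interacting chains
Multiple agents trace multiple, interacting Markov chains during the Monte Carlo process.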
Analysis
Classifications of Algorithms
Trajectory-based: hill-climbing, simulated annealing, pattern search ...
Population-based: genetic algorithms, ant & bee algorithms, artificial immune systems, differential evolution, PSO, HS, FA, CS, ...
Ways of Generating New Moves/Solutions
Markov chains with different transition probability.
Trajectory-based $\Longrightarrow$ a single Markov chain; population-based $\Longrightarrow$ multiple, interacting chains.
Tabu search (with memory) $\Longrightarrow$ self-avoiding Markov chains.
Ergodicity
Markov Chains & Markov Processes
Most theoretical studies use Markov chains/processes as a framework for convergence analysis.
A Markov chain is said to be regular if some positive power $k$ of the transition matrix $P$ has only positive elements.
A chain is called time-homogeneous if its transition matrix $P$ is the same at each step, so that the transition probabilities after $k$ steps are given by $P^k$.
A chain is irreducible if it is possible to reach every state from any state; it is ergodic if, in addition, it is aperiodic and positive recurrent.
Convergence Behaviour
As $k \to \infty$, we have the stationary probability distribution $\pi$:
$$\pi = \pi P,$$
thus the first eigenvalue of $P$ is always 1.
Asymptotic convergence to optimality:
$$\lim_{k \to \infty} \theta_k = \theta_* \quad \text{(with probability one)}.$$
The rate of convergence is usually determined by the second eigenvalue $0 < \lambda_2 < 1$.
An algorithm can converge, but may not necessarily be efficient, as the rate of convergence is typically low.
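A two-state chain makes the role of the second eigenvalue concrete. In the sketch below the transition probabilities `a` and `b` are assumed values; for this chain the stationary distribution and the second eigenvalue $\lambda_2 = 1 - a - b$ are known in closed form.

```python
def step(dist, P):
    """One step pi_{k+1} = pi_k P for a row vector dist."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


a, b = 0.3, 0.1                        # assumed transition probabilities
P = [[1 - a, a], [b, 1 - b]]
pi_star = [b / (a + b), a / (a + b)]   # solves pi = pi P
lam2 = 1 - a - b                       # second eigenvalue of P

dist = [1.0, 0.0]                      # start with all mass in the first state
errors = []
for k in range(1, 11):
    dist = step(dist, P)
    errors.append(abs(dist[0] - pi_star[0]))

print(pi_star, lam2)
print([errors[k + 1] / errors[k] for k in range(5)])   # successive error ratios
```

The distance to the stationary distribution shrinks by the factor $\lambda_2$ at every step, so a second eigenvalue close to 1 means slow convergence even though convergence is guaranteed.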
Convergence of GA
Important studies by Aytug et al. (1996) [1], Aytug and Koehler (2000) [2], Greenhalgh and Marshall (2000) [3], Gutjahr (2010) [4], etc.
The number of iterations $t(\zeta)$ in GA with a convergence probability of $\zeta$ can be estimated by
$$t(\zeta) \le \left\lceil \frac{\ln(1 - \zeta)}{\ln\left\{1 - \min\left[(1 - \mu)^{Ln}, \mu^{Ln}\right]\right\}} \right\rceil,$$
where $\mu$ = mutation rate, $L$ = string length, and $n$ = population size.
[1] H. Aytug, S. Bhattacharyya and G. J. Koehler, A Markov chain analysis of genetic algorithms with power of 2 cardinality alphabets, Euro. J. Operational Research, 96, 195-201 (1996).
[2] H. Aytug and G. J. Koehler, New stopping criterion for genetic algorithms, Euro. J. Operational Research, 126, 662-674 (2000).
[3] D. Greenhalgh & S. Marshall, Convergence criteria for genetic algorithms, SIAM J. Computing, 30, 269-282 (2000).
[4] W. J. Gutjahr, Convergence Analysis of Metaheuristics, Annals of Information Systems, 10, 159-187 (2010).
Multiobjective Metaheuristics
Asymptotic convergence of metaheuristics for multiobjective optimization (Villalobos-Arias et al. 2005) [6]
The transition matrix $P$ of a metaheuristic algorithm has a stationary distribution $\pi$ such that
$$|P^k_{ij} - \pi_j| \le (1-\zeta)^{k-1}, \quad \forall i, j, \quad (k = 1, 2, \ldots),$$
where $\zeta$ is a function of the mutation probability $\mu$, string length $L$ and population size $n$. For example, $\zeta = 2^{nL}\mu^{nL}$, so $\mu < 0.5$.
Note: An algorithm satisfying this condition may not converge (for multiobjective optimization). However, an algorithm with elitism that obeys the above condition does converge.
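The geometric bound can be inverted to ask how many steps are needed before the right-hand side drops below a tolerance. The sketch below assumes the example value of zeta read as (2 mu)^(nL); the function names and the tolerance `eps` are hypothetical.

```python
import math


def zeta_example(mu: float, L: int, n: int) -> float:
    """Example value zeta = 2^(nL) * mu^(nL) = (2 mu)^(nL); needs mu < 0.5."""
    if not 0.0 < mu < 0.5:
        raise ValueError("the example requires 0 < mu < 0.5")
    return (2.0 * mu) ** (n * L)


def steps_for_tolerance(zeta: float, eps: float) -> int:
    """Smallest k >= 1 such that (1 - zeta)^(k - 1) <= eps."""
    if not (0.0 < zeta < 1.0 and 0.0 < eps < 1.0):
        raise ValueError("zeta and eps must lie strictly between 0 and 1")
    # (k - 1) ln(1 - zeta) <= ln(eps), and ln(1 - zeta) is negative.
    return 1 + math.ceil(math.log(eps) / math.log1p(-zeta))


zeta = zeta_example(mu=0.25, L=2, n=1)
print(zeta, steps_for_tolerance(zeta, eps=1e-6))
```

Because zeta shrinks exponentially in nL, the number of steps implied by the bound grows very quickly with string length and population size.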
[6] M. Villalobos-Arias, C. A. Coello Coello and O. Hernandez-Lerma, Asymptotic convergence of metaheuristics for multiobjective optimization problems, Soft Computing, 10, 1001-1005 (2005).
Other results
Limited results on convergence analysis exist, mostly for finite states/domains, concerning
ant colony optimization,
generalized hill-climbers and simulated annealing,
best-so-far convergence of cross-entropy optimization,
nested partition method, Tabu search, and
of course, combinatorial optimization.
However, infinite states/domains and continuous problems pose more challenging tasks.
Many, many open problems need satisfactory answers.
Converged?
Convergence is often only ‘best-so-far’ convergence, not necessarily convergence to the global optimum.
In theory, a Markov chain can converge, but the number of iterations tends to be large.
In practice, only a finite (hopefully small) number of generations is run; even if the algorithm converges, it may not reach the global optimum.
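The idea of best-so-far convergence can be illustrated with pure random search. In the sketch below, the objective `square`, the search interval and the function name `best_so_far_trace` are assumptions chosen for illustration. The recorded best value never gets worse, even though the individual samples do not converge to anything.

```python
import random
from typing import Callable, List


def best_so_far_trace(f: Callable[[float], float], samples: int,
                      rng: random.Random) -> List[float]:
    """Sample uniformly on [-5, 5] and record the best objective value seen so far."""
    best = float("inf")
    trace = []
    for _ in range(samples):
        x = rng.uniform(-5.0, 5.0)
        best = min(best, f(x))
        trace.append(best)
    return trace


def square(x: float) -> float:
    """Hypothetical objective with global minimum 0 at x = 0."""
    return x * x


trace = best_so_far_trace(square, samples=200, rng=random.Random(1))
print(trace[0], trace[-1])
```

The trace is monotone by construction, so a flat tail in such a plot shows only that no better point has been found, not that the global optimum has been reached.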
How to avoid premature convergence
Equip an algorithm with the ability to escape a local optimum
Increase diversity of the solutions
Enough randomization at the right stage
....(unknown, new) ....
All
So many algorithms – what are the common characteristics?
What are the key components?
How to use and balance different components?
What controls the overall behaviour of an algorithm?
Exploration and Exploitation
Characteristics of Metaheuristics
Exploration and Exploitation, or Diversification and Intensification.
Exploitation/Intensification
Intensive local search, exploiting local information. E.g., hill-climbing.
Exploration/Diversification
Exploratory global search, using randomization/stochastic components. E.g., hill-climbing with random restart.
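The two examples named above can be sketched on a toy landscape. Everything here is an assumption made for illustration: the list `F` of objective values over integer states, the neighbourhood (left and right neighbours) and the function names. `hill_climb` is the exploitation step and `random_restart` adds exploration through random starting points.

```python
import random
from typing import List


def hill_climb(f: List[float], start: int) -> int:
    """Exploitation: move to the better neighbour until no neighbour improves."""
    x = start
    while True:
        neighbours = [i for i in (x - 1, x + 1) if 0 <= i < len(f)]
        best = min(neighbours, key=lambda i: f[i])
        if f[best] >= f[x]:
            return x  # x is a local minimum
        x = best


def random_restart(f: List[float], restarts: int, rng: random.Random) -> int:
    """Exploration: run hill_climb from random starts and keep the best result."""
    best = hill_climb(f, rng.randrange(len(f)))
    for _ in range(restarts - 1):
        candidate = hill_climb(f, rng.randrange(len(f)))
        if f[candidate] < f[best]:
            best = candidate
    return best


# Hypothetical landscape: local minimum at index 1, global minimum at index 6.
F = [4, 2, 3, 5, 6, 4, 1, 3, 7, 8, 9]

print(hill_climb(F, 0))                           # stuck in the local minimum
print(random_restart(F, 20, random.Random(0)))    # best local minimum found
```

A single climb from index 0 stops at the local minimum, while restarts give the search repeated chances to land in the basin of the global minimum.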
Summary
[Figure: algorithms positioned on a diagram with Exploitation and Exploration as its two axes. Labels shown: uniform search, steepest descent, Tabu, Nelder-Mead, CS, PSO/FA, EP/ES, SA, Ant/Bee, Genetic algorithms, Newton-Raphson.]
Best?
Free lunch?
No-Free-Lunch (NFL) Theorems
Algorithm Performance
Any algorithm is as good/bad as random search, when averaged over all possible problems/functions.
Finite domains
No universally efficient algorithm!
Any free taster or dessert?
Yes and no. (more later)
NFL Theorems (Wolpert and Macready 1997)
Search space is finite (though quite large), thus the space of possible “cost” values is also finite. Objective function $f : \mathcal{X} \mapsto \mathcal{Y}$, with $\mathcal{F} = \mathcal{Y}^{\mathcal{X}}$ (space of all possible problems).
Assumptions: finite domain, closed under permutation (c.u.p).
For $m$ iterations, $m$ distinct visited points form a time-ordered set
$$d_m = \left\{ \left(d_m^x(1), d_m^y(1)\right), \ldots, \left(d_m^x(m), d_m^y(m)\right) \right\}.$$
The performance of an algorithm $a$ iterated $m$ times on a cost function $f$ is denoted by $P(d_m^y \mid f, m, a)$.
For any pair of algorithms $a$ and $b$, the NFL theorem states
$$\sum_f P(d_m^y \mid f, m, a) = \sum_f P(d_m^y \mid f, m, b).$$
Any algorithm is as good (bad) as a random search!
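On a tiny finite domain the NFL identity can be checked by brute force. The sketch below is an illustration under stated assumptions: the sets `X` and `Y`, and the two deterministic, non-revisiting algorithms `fixed_order` and `adaptive`, are made up for this example. For deterministic algorithms the probability P(d_m^y | f, m, a) is either 0 or 1, so summing over all f amounts to counting how many functions produce each sequence of cost values.

```python
from collections import Counter
from itertools import product
from typing import Callable, Dict, List, Tuple

X = [0, 1, 2]   # hypothetical finite search space
Y = [0, 1]      # hypothetical finite set of cost values

History = List[Tuple[int, int]]
Algorithm = Callable[[History], int]


def fixed_order(history: History) -> int:
    """Visit the points in increasing order, ignoring observed costs."""
    visited = {x for x, _ in history}
    return min(x for x in X if x not in visited)


def adaptive(history: History) -> int:
    """Start at 2, then pick the next point depending on the last observed cost."""
    if not history:
        return 2
    visited = {x for x, _ in history}
    unvisited = [x for x in X if x not in visited]
    return min(unvisited) if history[-1][1] == 0 else max(unvisited)


def cost_sequence(algorithm: Algorithm, f: Dict[int, int], m: int) -> Tuple[int, ...]:
    """Run the algorithm for m steps on f and return the observed cost values."""
    history: History = []
    for _ in range(m):
        x = algorithm(history)
        history.append((x, f[x]))
    return tuple(y for _, y in history)


def histogram(algorithm: Algorithm, m: int) -> Counter:
    """Count, over all functions f: X -> Y, how often each cost sequence occurs."""
    counts: Counter = Counter()
    for values in product(Y, repeat=len(X)):
        f = dict(zip(X, values))
        counts[cost_sequence(algorithm, f, m)] += 1
    return counts


for m in (1, 2, 3):
    print(m, histogram(fixed_order, m) == histogram(adaptive, m))
```

The two histograms agree for every m even though the algorithms visit points in different orders, which is the content of the theorem on this small example.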
Proof Sketch
Wolpert and Macready’s original proof is by induction.
For $m = 1$, $d_1 = \{d_1^x, d_1^y\}$, so the only possible value of $d_1^y$ is $f(d_1^x)$, and thus $P(d_1^y \mid f, m = 1, a) = \delta(d_1^y, f(d_1^x))$. This means
$$\sum_f P(d_1^y \mid f, m = 1, a) = \sum_f \delta(d_1^y, f(d_1^x)) = |\mathcal{Y}|^{|\mathcal{X}| - 1},$$
which is independent of algorithm $a$. [$|\mathcal{Y}|$ is the size of $\mathcal{Y}$.]
If it is true for $m$, i.e. $\sum_f P(d_m^y \mid f, m, a)$ is independent of $a$, then for $m + 1$ we have $d_{m+1} = d_m \cup \{x, f(x)\}$ with $d_{m+1}^x(m+1) = x$ and $d_{m+1}^y(m+1) = f(x)$.
Thus, we get (Bayesian approach)
$$P(d_{m+1}^y \mid f, m+1, a) = P(d_{m+1}^y(m+1) \mid d_m, f, m+1, a)\, P(d_m^y \mid f, m+1, a).$$
So
$$\sum_f P(d_{m+1}^y \mid f, m+1, a) = \sum_{f, x} \delta(d_{m+1}^y(m+1), f(x))\, P(x \mid d_m^y, f, m+1, a)\, P(d_m^y \mid f, m+1, a).$$
Using $P(x \mid d_m, a) = \delta(x, a(d_m))$ and $P(d_m \mid f, m+1, a) = P(d_m \mid f, m, a)$, this leads to
$$\sum_f P(d_{m+1}^y \mid f, m+1, a) = \frac{1}{|\mathcal{Y}|} \sum_f P(d_m^y \mid f, m, a),$$
which is also independent of $a$.
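As a consistency check, the recursion can be unrolled from the base case. This closed form is not stated on the slide; it follows from the two formulas above by applying the factor $1/|\mathcal{Y}|$ a total of $m - 1$ times.

```latex
\sum_f P(d_m^y \mid f, m, a)
  = \frac{1}{|\mathcal{Y}|^{\,m-1}} \sum_f P(d_1^y \mid f, m = 1, a)
  = \frac{|\mathcal{Y}|^{|\mathcal{X}| - 1}}{|\mathcal{Y}|^{\,m-1}}
  = |\mathcal{Y}|^{|\mathcal{X}| - m}.
```

This matches a direct count: once the values of $f$ at the $m$ visited points are fixed, the remaining $|\mathcal{X}| - m$ points can each take any of $|\mathcal{Y}|$ values.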
Free Lunches
NFL does not hold for continuous domains (Auger and Teytaud 2010).
Continuous free lunches $\Rightarrow$ some algorithms are better than others!
For example, for a 2D sphere function, an efficient algorithm only needs 4 iterations/steps to reach the optimum (global minimum). [7]
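One way to see why so few evaluations can suffice is sketched below. This is not claimed to be the construction used by Auger and Teytaud; it assumes the objective is known to be exactly f(x) = |x - c|^2 with an unknown centre c, and the names `CountingObjective`, `locate_sphere_centre` and `HIDDEN_CENTRE` are hypothetical. Three probe evaluations determine c, and a fourth evaluates the optimum itself.

```python
from typing import Callable, Tuple

Point = Tuple[float, float]


class CountingObjective:
    """Wrap an objective function and count how often it is evaluated."""

    def __init__(self, f: Callable[[Point], float]) -> None:
        self.f = f
        self.calls = 0

    def __call__(self, x: Point) -> float:
        self.calls += 1
        return self.f(x)


def locate_sphere_centre(f: Callable[[Point], float]) -> Point:
    """Recover c from three evaluations of f(x) = |x - c|^2 in 2D."""
    f0 = f((0.0, 0.0))   # c1^2 + c2^2
    f1 = f((1.0, 0.0))   # f0 - 2*c1 + 1
    f2 = f((0.0, 1.0))   # f0 - 2*c2 + 1
    return ((f0 - f1 + 1.0) / 2.0, (f0 - f2 + 1.0) / 2.0)


HIDDEN_CENTRE = (1.5, -0.25)   # unknown to the search routine


def sphere(x: Point) -> float:
    return (x[0] - HIDDEN_CENTRE[0]) ** 2 + (x[1] - HIDDEN_CENTRE[1]) ** 2


objective = CountingObjective(sphere)
estimate = locate_sphere_centre(objective)
value_at_estimate = objective(estimate)   # fourth evaluation, at the optimum
print(estimate, value_at_estimate, objective.calls)
```

The shortcut works only because the algorithm exploits knowledge of the function class, which is the kind of structure the averaged NFL setting excludes.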
Revisiting algorithms
NFL assumes that the time-ordered set has $m$ distinct points (non-revisiting). Revisiting points breaks closure under permutation, so NFL does not hold (Marshall and Hinton 2010). [8]
[7] A. Auger and O. Teytaud, Continuous lunches are free plus the design of optimal optimization algorithms, Algorithmica, 57, 121-146 (2010).
[8] J. A. Marshall and T. G. Hinton, Beyond no free lunch: realistic algorithms for arbitrary problem classes, WCCI 2010 IEEE World Congress on Computational Intelligence, July 18-23, Barcelona, Spain, pp. 1319-1324.
More Free Lunches
Coevolutionary algorithms
A set of players (agents?) in self-play problems work together to produce a champion, like training a chess champion: free lunches exist (Wolpert and Macready 2005). [9]
[A single player tries to pursue the best next move, or for two players, the fitness function depends on the moves of both players.]
Multiobjective
“Some multiobjective optimizers are better than others” (Corne and Knowles 2003). [10] [results for finite domains only]
Free lunches due to archiver and generator.
[9] D. H. Wolpert and W. G. Macready, Coevolutionary free lunches, IEEE Trans. Evolutionary Computation, 9, 721-735 (2005).
[10] D. Corne and J. Knowles, Some multiobjective optimizers are better than others, Evolutionary Computation, CEC’03, 4, 2506-2512 (2003).
Open Problems
Framework: Need to develop a unified framework for algorithmic analysis (e.g., convergence).
Exploration and exploitation: What is the optimal balance between these two components? (50-50 or what?)
Performance measure: What are the best performance measures? Statistically? Why?
Convergence: Convergence analysis of algorithms for infinite, continuous domains requires systematic approaches.
More Open Problems
Free lunches: Unproved for infinite or continuous domains for multiobjective optimization (possible free lunches!). What are the implications of NFL theorems in practice? If free lunches exist, how to find the best algorithm(s)?
Knowledge: Does problem-specific knowledge always help to find appropriate solutions? How to quantify such knowledge?
Intelligent algorithms: Any practical way to design truly intelligent, self-evolving algorithms?
Thanks
Yang X. S., Engineering Optimization: An Introduction with Metaheuristic Applications, Wiley, (2010).
Yang X. S., Introduction to Computational Mathematics, World Scientific, (2008).
Yang X. S., Nature-Inspired Metaheuristic Algorithms, Luniver Press, (2008).
Yang X. S., Introduction to Mathematical Optimization: From Linear Programming to Metaheuristics, Cambridge Int. Science Publishing, (2008).
Yang X. S., Applied Engineering Optimization, Cambridge Int. Science Publishing, (2007).
IJMMNO
International Journal of Mathematical Modelling and Numerical Optimization (IJMMNO)
http://www.inderscience.com/ijmmno
Thank you!
Questions?