Natural Computing
Lecture 15
Michael Herrmann, [email protected], phone: 0131 6 517177, Informatics Forum 1.42
12/10/2010
12/11/2010 M. Herrmann
Overview
Discrete PSO
Bees, frogs, fireflies, bats, cuckoos, eagles
Comparison of metaheuristic algorithms
Discrete Particle Swarm Optimization
A particle in a swarm
has a position and a velocity
knows its position and the objective function value for this position
knows its neighbours, best previous position and objective function value (or: current position and objective function value)
remembers its best previous position
Its behaviour is determined by a compromise between 3 possible choices:
To follow its own way (self-confidence)
To go towards its best previous position (experience)
To go towards the best neighbour's best previous position, or towards the best neighbour ("peer pressure")
(see Maurice Clerc ([email protected]) http://www.mauriceclerc.net)
Canonical PSO
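$x_i, v_i \in \mathbb{R}^d$, $1 \le i \le n$; $r_1, r_2 \in \mathbb{R}^d$; $\omega, \alpha_1, \alpha_2 \in \mathbb{R}^+$
$f : \mathbb{R}^d \to \mathbb{R}^+$ to be minimized
For all members of the swarm
$v_i := \omega v_i + \alpha_1 r_1 \circ (\hat{x}_i - x_i) + \alpha_2 r_2 \circ (g - x_i)$ ($\circ$: component-wise multiplication; $\hat{x}_i$: best previous position of particle $i$; $g$: best position found by the swarm)
$x_i := x_i + v_i$
$\hat{x}_i := x_i$ if $f(x_i) < f(\hat{x}_i)$
$g := x_i$ if $f(x_i) < f(g)$
until termination criterion is met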
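The update rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the parameter values, the search box `[lo, hi]`, the zero initial velocities, the choice of uniform random numbers in [0, 1] for the components of r1 and r2, and the test function `sphere` are all assumptions that do not come from the slide.

```python
import random

def pso(f, dim, n=20, iters=200, omega=0.7, alpha1=1.5, alpha2=1.5,
        lo=-5.0, hi=5.0, seed=0):
    """Minimise f over R^dim with the canonical PSO update."""
    rng = random.Random(seed)
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n)]
    v = [[0.0] * dim for _ in range(n)]        # assumed: start at rest
    xhat = [xi[:] for xi in x]                 # personal best positions
    fhat = [f(xi) for xi in x]
    best = min(range(n), key=lambda i: fhat[i])
    g, fg = xhat[best][:], fhat[best]          # global best
    for _ in range(iters):
        for i in range(n):
            for k in range(dim):
                # assumed: components of r1, r2 drawn from U[0, 1]
                r1, r2 = rng.random(), rng.random()
                v[i][k] = (omega * v[i][k]
                           + alpha1 * r1 * (xhat[i][k] - x[i][k])
                           + alpha2 * r2 * (g[k] - x[i][k]))
                x[i][k] += v[i][k]
            fx = f(x[i])
            if fx < fhat[i]:                   # update personal best
                xhat[i], fhat[i] = x[i][:], fx
            if fx < fg:                        # update global best
                g, fg = x[i][:], fx
    return g, fg

def sphere(x):
    """Hypothetical test objective with minimum 0 at the origin."""
    return sum(c * c for c in x)

g, fg = pso(sphere, dim=3)
print(g, fg)
```

The global best is updated inside the particle loop, so later particles in the same iteration already see the improved g; updating it once per iteration would be an equally valid reading of the slide.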
Discrete PSO
States x are implied by the optimisation problem, e.g. states s ∈ Z^d
Option 1: Run the algorithm for continuous states x and discretize [s = (int)x] after a solution has been found
Option 2: If the objective function does not accept continuous values, then discretize before the evaluation of the swarm members
Option 3: Use discrete states s = x. The velocities are still continuous but are incremented by discrete steps. When updating s with a small velocity there is no effect; only from a certain threshold is s actually changed. This could be advisable if continuous values of the state have no meaning
Option 4: Use discrete states s = x and continuous velocities, but smooth the effect of the states on the velocities. This could be advisable for binary states.
Option 5: Use a more systematic approach (cf. below)
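Options 1 and 2 differ only in where the discretization happens. As a small sketch: the wrapper below discretizes a continuous position before every evaluation (Option 2), while calling `discretize` once on the final result would correspond to Option 1. Rounding to the nearest integer is an assumption (the slide's `(int)x` would truncate instead), and `f_int` is a hypothetical integer-only objective.

```python
def discretize(x):
    """Map a continuous position to integer states (assumed: round to nearest)."""
    return [int(round(c)) for c in x]

def discretized_objective(f):
    """Option 2: wrap f so that positions are discretized before evaluation."""
    return lambda x: f(discretize(x))

def f_int(s):
    """Hypothetical objective that only accepts integers; minimum at (3, -2)."""
    assert all(isinstance(c, int) for c in s)
    return (s[0] - 3) ** 2 + (s[1] + 2) ** 2

f_cont = discretized_objective(f_int)   # can now be handed to a continuous PSO
print(discretize([2.6, -1.7]))          # [3, -2]
print(f_cont([2.6, -1.7]))              # 0
```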
Discrete PSO
For all options, adaptive discretisation schemes might be useful.
The parameters ω, α1, α2 may have optimal values different from the standard values for the continuous case.
Theoretical predictions about the behaviour of the algorithm can hardly be made
Practically, dPSO competes well with genuinely discrete algorithms (ACO, GA)
Example: Sequence alignment
Time-warped sequences $q_m(t)$, $m = 1, \dots, M$, $t \in [0, T]$
If we had the correct warping functions $w_m(t)$ for each sequence, then for all $t$
$q_1(t + w_1(t)) = \dots = q_M(t + w_M(t))$
More generally, we cannot assume exact equality, so we minimise
$f[w] = \sum_{i,j=1}^{M} \int \left( q_i(t + w_i(t)) - q_j(t + w_j(t)) \right)^2 dt$
by choosing appropriate $w_m(t)$, subject to a simultaneous minimization of $\|w_m\|^2$!
This is an infinite-dimensional problem.
Example: Sequence alignment
Choose a discretization $t = 1, \dots, T$ (or use the natural discretization of the data)
$f[w] = \sum_{i,j=1}^{M} \sum_{t=1}^{T} \left( q_i(t + w_i(t)) - q_j(t + w_j(t)) \right)^2$
$w$ is an $M \times T$ dimensional vector that can be used as state $x$ in PSO.
However, having discretized $t$, only discrete values of $w$ are meaningful. Nevertheless, the above options 1 - 4 are applicable.
If the fitness function is evaluated w.r.t. given data, then options 1 - 3 are applicable.
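As a rough sketch, the discretized objective can be evaluated directly for integer-valued warps. The code uses 0-based time indices instead of the slide's t = 1, ..., T, and it clamps warped indices that fall outside the data; the slide does not say how boundaries are handled, so the clamping is an assumption, as is the toy data.

```python
def alignment_cost(q, w):
    """Sum over all pairs (i, j) and times t of
    (q_i(t + w_i(t)) - q_j(t + w_j(t)))^2 for integer warps w."""
    M, T = len(q), len(q[0])

    def warped(m, t):
        # assumed boundary rule: clamp the warped index into [0, T - 1]
        idx = min(max(t + w[m][t], 0), T - 1)
        return q[m][idx]

    total = 0.0
    for i in range(M):
        for j in range(M):
            for t in range(T):
                d = warped(i, t) - warped(j, t)
                total += d * d
    return total

# Toy data: the second sequence is the first one shifted one step earlier.
q = [[0, 0, 1, 2, 1, 0],
     [0, 1, 2, 1, 0, 0]]
no_warp = [[0] * 6, [0] * 6]
shift = [[0] * 6, [-1] * 6]      # read the second sequence one step earlier

print(alignment_cost(q, no_warp))   # 8.0
print(alignment_cost(q, shift))     # 0.0
```

The regularisation term on the size of the warps is left out here; in practice it would be added to the returned value with some weight.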
An algorithm for binary states
Initialize the $v$ and the discrete particles $x$; choose $\omega, \alpha_1, \alpha_2, a, b$
For discrete particles $x$, calculate the fitness $f(x)$
Calculate $v_{pb}$ and $v_{gb}$ from the personal best $x_{pb}$ and the global best $x_{gb}$ ($\times$ is standard multiplication)
$v_{pb} = a \times x_{pb} + b \times (1 - x_{pb})$
$v_{gb} = a \times x_{gb} + b \times (1 - x_{gb})$
Update $v$ (usually with relatively small $\alpha_i$)
$v = \omega \times v + \alpha_1 v_{pb} + \alpha_2 v_{gb}$
If rand $> v_{ki}$ then $x_{ki} = 1$ else $x_{ki} = 0$ ($i$: particle, $k$: dimension) [e.g. rand $\sim U[0, 1]$ for $a = 0.3$, $b = 0.7$]
Repeat until termination criterion is satisfied.
Yang, S.Y., Wang, M., Jiao, L.C., 2004. A Quantum Particle Swarm Optimization. Proc. 2004 IEEE Congress on Evolutionary Computation, 1:320-324.
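The steps of the binary algorithm can be sketched as follows. Only a = 0.3 and b = 0.7 are taken from the slide; the values ω = 0.8 and α1 = α2 = 0.1 (chosen so that the three weights sum to 1 and v stays in [0, 1]), the initial v = 0.5, the minimisation convention and the Hamming-distance objective are assumptions made for illustration.

```python
import random

def binary_pso(f, dim, n=20, iters=100, omega=0.8, alpha1=0.1, alpha2=0.1,
               a=0.3, b=0.7, seed=0):
    """Minimise f over bit vectors; v[i][k] acts as the probability of a 0."""
    rng = random.Random(seed)
    x = [[rng.randint(0, 1) for _ in range(dim)] for _ in range(n)]
    v = [[0.5] * dim for _ in range(n)]        # assumed neutral start
    xpb = [xi[:] for xi in x]                  # personal bests
    fpb = [f(xi) for xi in x]
    best = min(range(n), key=lambda i: fpb[i])
    xgb, fgb = xpb[best][:], fpb[best]         # global best
    for _ in range(iters):
        for i in range(n):
            for k in range(dim):
                vpb = a * xpb[i][k] + b * (1 - xpb[i][k])
                vgb = a * xgb[k] + b * (1 - xgb[k])
                v[i][k] = omega * v[i][k] + alpha1 * vpb + alpha2 * vgb
                x[i][k] = 1 if rng.random() > v[i][k] else 0
            fx = f(x[i])
            if fx < fpb[i]:
                xpb[i], fpb[i] = x[i][:], fx
            if fx < fgb:
                xgb, fgb = x[i][:], fx
    return xgb, fgb

target = [1, 0, 1, 1, 0, 0, 1, 0]              # hypothetical target pattern

def hamming(x):
    """Number of bits in which x differs from the target."""
    return sum(xi != ti for xi, ti in zip(x, target))

xgb, fgb = binary_pso(hamming, dim=len(target))
print(xgb, fgb)
```

With a < b, a best bit equal to 1 pulls v towards a, which makes rand > v (and hence a 1) more likely, so the velocity indeed encodes a sampling probability for each bit.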
An algorithm for binary states
Note: The velocities represent a tendency to move either to 0 or to 1, i.e. an estimate of a probability.
Analogous to "compact GA" (see GA as MBS, lect. 12)
It appears to be less "dynamic": Induce diversity (exploration) by combination with GA operators
Exploitation can be improved by local search, e.g. simulated annealing
Yang, S.Y., Wang, M., Jiao, L.C., 2004. A Quantum Particle Swarm Optimization. Proc. 2004 IEEE Congress on Evolutionary Computation, 1:320-324.
General case: Operator formalism
What if velocities also need to be discrete? 'Overload' the required operations.
Subtraction (position − position) operator: two positions $x_1$ and $x_2$: $x_2 - x_1 = v$ (velocity)
Addition (position + velocity) operator: position $x$ and velocity $v$: $x + v = x_1$ (position)
Addition (velocity + velocity) operator: two velocities $v_1$ and $v_2$: $v_1 + v_2$ (velocity)
Multiplication (coefficient × velocity) operator: learning coefficient $\alpha$, velocity $v$: $\alpha \times v$ (velocity)
M. Clerc: Discrete Particle Swarm Optimization, illustrated by the Traveling Salesman Problem. In: Godfrey C. Onwubolu, B. V. Babu (eds.) New optimization techniques in engineering, p. 219-239.
Discrete PSO for TSP
Search space of positions/states $S = \{s_i\}$ → graph:
Hamilton cycles in a weighted graph $G = \{E_G, V_G\}$
Cost/objective function $f$ on $S$ maps into a set of values: $S \to C = \{c_i\}$
For TSP: $f(s) = \sum_{i=1}^{N} w_{n_i, n_{i+1}}$ with $n_{N+1} \equiv n_1$, $w$ denoting distances.
Order on $C$, or, more generally, a semi-order: either $c_i < c_j$ or $c_i \ge c_j$ (if comparable)
Enumerate $E_G$ and search for sequences of $N + 1$ nodes with first and last identical, otherwise different.
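For concreteness, the TSP objective is the sum of the edge weights along a closed tour. The sketch below assumes nodes are numbered from 0 and that the tour is stored as N + 1 nodes with the first one repeated at the end; the 4-node distance matrix is made up for illustration.

```python
def tour_cost(tour, w):
    """Sum of the weights along a closed tour of N + 1 nodes."""
    assert tour[0] == tour[-1], "first and last node must be identical"
    assert len(set(tour[:-1])) == len(tour) - 1, "other nodes must differ"
    return sum(w[tour[i]][tour[i + 1]] for i in range(len(tour) - 1))

# Hypothetical symmetric distance matrix for 4 nodes.
w = [[0, 2, 9, 10],
     [2, 0, 6, 4],
     [9, 6, 0, 3],
     [10, 4, 3, 0]]

print(tour_cost([0, 1, 3, 2, 0], w))   # 2 + 4 + 3 + 9 = 18
print(tour_cost([0, 1, 2, 3, 0], w))   # 2 + 6 + 3 + 10 = 21
```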
TSP: Discrete velocities
What is a state?
A vector containing N nodes
What is a velocity?
Define it as a permutation, but only using pairs:
Simplest case: the exchange of two nodes: $(\dots, i, \dots, j, \dots) \to (\dots, j, \dots, i, \dots)$, i.e. the cycle $(ij)$.
More generally $\{(i_k, j_k)\}_{k=1,\dots,|v|}$: a sequence of pairwise exchanges.
TSP: Discrete velocities
A negative velocity?
Define $-v = \{(i_{|v|-k+1}, j_{|v|-k+1})\}_{k=1,\dots,|v|}$, i.e. the same exchanges in reverse order
Adding a velocity to a state
applying a permutation ($v$) to a set of objects ($x$)
Difference between states?
The permutation that transforms $x_1$ into $x_2$
Sum of velocities?
$v_1 \oplus v_2$: perform first the pair exchanges of $v_1$, then those of $v_2$ (not commutative; may be contracted into fewer pairs)
Multiplication by a scalar?
$\omega = 0$: $\omega v = \mathrm{Id}$
$\omega \in (0, 1]$: remove all pairs from $v$ above position $\lfloor \omega |v| \rfloor$
$\omega > 1$: concatenate $v$ $\lfloor \omega \rfloor$ times and add the first $\lfloor (\omega - \lfloor \omega \rfloor) |v| \rfloor$ pairs from the beginning of $v$
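A minimal sketch of these operators, under some assumptions: a state is a Python list of distinct node labels (the repeated closing node is dropped), a velocity is a list of label pairs to be exchanged, `difference` builds one possible exchange sequence (not necessarily the shortest), and for ω > 1 the fractional part of ω decides how many extra pairs are appended. The function names are illustrative only.

```python
def apply(x, v):
    """position + velocity: perform the exchanges of v on x, in order."""
    x = list(x)
    for i, j in v:
        a, b = x.index(i), x.index(j)
        x[a], x[b] = x[b], x[a]
    return x

def negate(v):
    """-v: the same exchanges in reverse order."""
    return list(reversed(v))

def difference(x2, x1):
    """x2 - x1: a sequence of exchanges v such that apply(x1, v) == x2."""
    cur, v = list(x1), []
    for p in range(len(cur)):
        if cur[p] != x2[p]:
            v.append((cur[p], x2[p]))
            q = cur.index(x2[p])
            cur[p], cur[q] = cur[q], cur[p]
    return v

def add(v1, v2):
    """v1 (+) v2: first the exchanges of v1, then those of v2 (uncontracted)."""
    return list(v1) + list(v2)

def scale(omega, v):
    """omega * v for omega >= 0: truncate v or repeat it."""
    k = int(omega)                       # whole repetitions of v
    extra = int((omega - k) * len(v))    # pairs kept from the start of v
    return list(v) * k + list(v[:extra])

x1, x2 = [1, 2, 3, 4], [3, 1, 4, 2]
v = difference(x2, x1)
print(v)                       # [(1, 3), (2, 1), (2, 4)]
print(apply(x1, v))            # [3, 1, 4, 2]
print(apply(x2, negate(v)))    # [1, 2, 3, 4]
print(scale(0.5, v))           # [(1, 3)]
```

Because every exchange is its own inverse, reversing the order is enough to undo a velocity, which is why `negate` needs no further bookkeeping.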
Algorithm for discrete velocities
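$v_{t+1} = \omega v_t \oplus \alpha_1 (\hat{x} - x_t) \oplus \alpha_2 (g - x_t)$, where $\hat{x}$ is the personal best and $g$ the global best
$x_{t+1} = x_t + v_{t+1}$
Fitness evaluation and the update of the personal best and global best are standard.
Performance is not great unless parameters are adapted in order to revive the swarm when diversity is too small.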
A GA in the disguise of a PS?
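The update can be tried on a tiny TSP instance. The sketch below restates compact helper versions of the swap-sequence operators so that it runs on its own. It multiplies α1 and α2 by uniform random numbers, borrowed from the continuous algorithm; this is an assumption, since with fixed coefficients of 1 every particle would jump straight onto the global best. Parameter values and the distance matrix are likewise made up.

```python
import random

def apply(x, v):
    """position + velocity: perform the exchanges of v on x, in order."""
    x = list(x)
    for i, j in v:
        a, b = x.index(i), x.index(j)
        x[a], x[b] = x[b], x[a]
    return x

def difference(x2, x1):
    """x2 - x1: exchanges that transform x1 into x2."""
    cur, v = list(x1), []
    for p in range(len(cur)):
        if cur[p] != x2[p]:
            v.append((cur[p], x2[p]))
            q = cur.index(x2[p])
            cur[p], cur[q] = cur[q], cur[p]
    return v

def scale(c, v):
    """c * v for c >= 0: truncate v or repeat it."""
    k = int(c)
    return list(v) * k + list(v[:int((c - k) * len(v))])

def cost(x, w):
    """Length of the closed tour that visits the nodes in the order x."""
    return sum(w[x[i]][x[(i + 1) % len(x)]] for i in range(len(x)))

def discrete_pso_tsp(w, n=10, iters=50, omega=0.5, alpha1=1.0, alpha2=1.0,
                     seed=0):
    rng = random.Random(seed)
    N = len(w)
    xs = [rng.sample(range(N), N) for _ in range(n)]   # random tours
    vs = [[] for _ in range(n)]                        # empty velocities
    pb = [x[:] for x in xs]                            # personal bests
    g = min(pb, key=lambda x: cost(x, w))[:]           # global best
    for _ in range(iters):
        for i in range(n):
            # list concatenation plays the role of the velocity sum
            vs[i] = (scale(omega, vs[i])
                     + scale(alpha1 * rng.random(), difference(pb[i], xs[i]))
                     + scale(alpha2 * rng.random(), difference(g, xs[i])))
            xs[i] = apply(xs[i], vs[i])
            if cost(xs[i], w) < cost(pb[i], w):
                pb[i] = xs[i][:]
            if cost(xs[i], w) < cost(g, w):
                g = xs[i][:]
    return g, cost(g, w)

# Hypothetical 4-node instance; its three distinct tours cost 18, 21 and 29.
w = [[0, 2, 9, 10],
     [2, 0, 6, 4],
     [9, 6, 0, 3],
     [10, 4, 3, 0]]
g, c = discrete_pso_tsp(w)
print(g, c)
```

Nothing in this sketch revives the swarm once all particles coincide, so on larger instances it would be expected to stall early, in line with the remark on performance above.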
Bees, frogs, fireflies, bats, cuckoos, eagles
Honey bee algorithm: A bee directs others to nectar sources in dependence on its previous success (cf. ACO)
Firefly algorithm: Fireflies attract others by an inverse square law of the "light intensity" (i.e. fitness) (cf. ACO)
Bat algorithm: Bats fly with a velocity that depends on their "wave length" (i.e. fitness), but can also change loudness and duration of the pulse etc. (cf. PSO)
Frog leaping algorithm: Out of several subgroups of frogs the best ones are allowed to "jump", i.e. to exchange difference vectors (cf. DE)
see X-S Yang: Nature-inspired metaheuristic algorithms. Luniver Press 2010.
Comparison among metaheuristic algorithms
Global bests in a standard set of benchmark problems, based on a standard solution quality metric (neither is agreed upon)
Comparisons are not always meaningful
Standard data sets are simple
Data sets are pragmatically selected
Even with best intentions one's own algorithm will be better tuned than the algorithm of a competitor
Open competitions are an option
Preparation: Parameter adaptation on a given data set
Competition: Test on a similar but unknown data set with manual readjustment of parameters
Asymptotic space and time complexity (e.g. runtime growth rate)
Dimension and sensitivity of the parameter space
J. Silberholz and B. Golden: Comparison of Metaheuristics. In: Handbook of Metaheuristics 2010, Vol. 146, 625-640.
Principles for comparisons
First experimental principle: The problems used for assessing the performance of an algorithm cannot be used in the development of the algorithm itself.
Second experimental principle: The designer can take into account any available domain-specific knowledge as well as make use of pilot studies on similar problems.
Third experimental principle: When comparing several algorithms, all the algorithms should make use of the available domain-specific knowledge, and equal computational effort should be invested in all the pilot studies. Similarly, in the test phase, all the algorithms should be compared on an equal computing time basis.
Mauro Birattari, Mark Zlochin and Marco Dorigo: Toward a theory of practice in metaheuristic design: A machine learning perspective. RAIRO-Inf. Theor. Appl. 40 (2006) 353-369.
Advanced Comparison
Emad Elbeltagi, Tarek Hegazy, Donald Grierson (2005) Comparison among five evolutionary-based optimization algorithms. Advanced Engineering Informatics 19, 43-53.
Result of the comparison
. . . and the winner is
(check back for the next competition)
Example: Power Generation Expansion Planning
Long-term behaviour of electricity markets
Minimize the total investment and the operating cost of the generating units
Meet the demand criteria, fuel mix ratio, and the reliability criteria
Highly constrained, nonlinear, discrete optimization problem
Solution through complete enumeration in the entire planning horizon
System dynamics models for system behaviour: Detailed relationships between the main variables of the system with explicit recognition of feedbacks and delays.
S. Kannan, S. Mary Raja Slochanal, and Narayana Prasad Padhy: Application and Comparison of Metaheuristic Techniques to Generation Expansion Planning Problem. IEEE Transactions on Power Systems 20:1, 2005.
Another competition
Algorithms:
Genetic algorithm
Differential evolution
Evolutionary programming
Evolution strategies
Ant colony optimization
Particle swarms
Taboo search
Simulated annealing
Hybrid approach (GA + direct search in linear span)
Medium term results
(Cost, # fitness evaluations, # generations, error range, success rate, execution time)
Conclusions
Tuning of all algorithms by generic methods
virtual mapping procedure (e.g. n1 × type A power station, n2 × type B: use a variable n that Cantor-enumerates the array formed by the pairs (n1, n2))
intelligent initial population generation (does not observe constraints but meets the demand plus a reserve margin)
penalty factor approach (constraint violations are penalised, but not deselected)
Dynamic programming (DP) is optimal when computable
Hybrid approach wins!
Among the others: DE is best (perhaps because 4 vectors are "crossed over" instead of 2 in GA etc.)
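The virtual mapping idea replaces a pair of unit counts (n1, n2) by a single integer n. One standard way to enumerate pairs is the Cantor pairing function, sketched below; whether the cited study uses exactly this enumeration is not stated on the slide, so the formula and the function names should be read as an assumption.

```python
import math

def cantor_pair(n1, n2):
    """Index of the pair (n1, n2) when walking the diagonals n1 + n2 = const."""
    s = n1 + n2
    return s * (s + 1) // 2 + n2

def cantor_unpair(n):
    """Inverse mapping: recover (n1, n2) from the single variable n."""
    s = (math.isqrt(8 * n + 1) - 1) // 2   # index of the diagonal containing n
    n2 = n - s * (s + 1) // 2
    return s - n2, n2

print(cantor_pair(1, 1))    # 4
print(cantor_unpair(4))     # (1, 1)
print([cantor_unpair(n) for n in range(6)])
# [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]
```

An optimiser can then search over the single integer n, and the objective decodes it with `cantor_unpair` before evaluating the plan.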