Academic Report
De-Shuang Huang
Intelligent Computing Lab, Hefei Institute of Intelligent Machines, Chinese Academy of Sciences
Department of Automation, University of Science and Technology of China
http://www.intelengine.cn/
29 March 2006
Part I: A Multi-Sub-Swarm PSO Algorithm for Multimodal Function Optimization
Part II: A Brief Introduction to ICL
Outline
1. Particle Swarm Optimization (PSO)
2. Niche Techniques and Development
3. Novel Adaptive Sequential Niche Technique
4. Multi-Sub-Swarm PSO Algorithm
5. Conclusions
1. Particle Swarm Optimization
The Particle Swarm Optimization (PSO) algorithm was developed in 1995 by James Kennedy and Russ Eberhart
It was inspired by the social behavior of bird flocking and fish schooling
PSO applies the concept of social interaction to problem solving
1.1 PSO Algorithm
$$v(k+1) = W \cdot v(k) + C_1 \cdot \mathrm{rand}_1() \cdot (pbest - x(k)) + C_2 \cdot \mathrm{rand}_2() \cdot (gbest - x(k)) \qquad (1)$$
$$x(k+1) = x(k) + v(k+1) \qquad (2)$$
W is called the inertia weight; C1 and C2 are positive constants, referred to as the cognitive and social parameters; rand1() and rand2() are random numbers uniformly distributed in [0, 1]
[Figure: a particle's velocity update from v(k) to v(k+1), drawn toward pbest and gbest]
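The update rules (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration for minimization, not the authors' implementation: the function name `pso_minimize`, the default values of `w`, `c1`, `c2`, the swarm size and the iteration count are all assumptions chosen for the example.

```python
import random

def pso_minimize(f, bounds, n_particles=20, iters=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize f over a box using the global-best PSO of Eqs. (1)-(2).

    All default parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in x]
    pbest_val = [f(p) for p in x]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                # Eq. (1): momentum + cognitive part + social part
                v[i][d] = (w * v[i][d]
                           + c1 * rng.random() * (pbest[i][d] - x[i][d])
                           + c2 * rng.random() * (gbest[d] - x[i][d]))
                # Eq. (2): move the particle
                x[i][d] += v[i][d]
            val = f(x[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = x[i][:], val
    return gbest, gbest_val
```

For example, `pso_minimize(lambda p: sum(t * t for t in p), [(-5, 5), (-5, 5)])` searches for the minimum of a two-dimensional sphere function.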
1.2 Current Research in PSO
Research on PSO can generally be categorized into five parts:
  Algorithms (binary PSO algorithms)
  Topology (designing different types of neighborhood structures for PSO)
  Parameters (the inertia weight W, the constriction coefficient, and the impact of these parameters on the performance of the PSO algorithm)
  Hybrid PSO algorithms (combining PSO with other techniques)
  Applications (constrained optimization, multiobjective optimization, neural network training, etc.)
Swarm Topology
Two general types of neighborhood structures were investigated, gbest and lbest (Eberhart, Simpson, and Dobbins, 1996)
[Figure: the gbest and lbest neighborhood structures, each drawn for five particles I0 to I4]
The global model converges fast but may converge to a local minimum, while the local model has more chances to find better solutions, though more slowly (Kennedy 1999, Kennedy, Eberhart and Shi 2001)
Many researchers have worked on improving PSO performance by designing or implementing different types of neighborhood structures
  Kennedy and Mendes tested PSO with different neighborhoods
  Mendes and Kennedy proposed a fully informed particle swarm optimization algorithm
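The difference between the gbest and lbest models comes down to which particles each particle may learn from. The sketch below, assuming minimization and a ring-shaped lbest neighborhood (the helper name `ring_neighborhood_best` and the parameter `k` are hypothetical), returns for each particle the index of the best personal best among its ring neighbors. Choosing `k` large enough to cover the whole swarm reproduces the gbest model.

```python
def ring_neighborhood_best(pbest_val, k=1):
    """Return, for each particle i, the index of the lowest personal-best
    value among particles i-k .. i+k on a ring (lbest model)."""
    n = len(pbest_val)
    best = []
    for i in range(n):
        neighbors = [(i + off) % n for off in range(-k, k + 1)]
        best.append(min(neighbors, key=lambda j: pbest_val[j]))
    return best
```

In a velocity update, the position of the returned neighbor would take the place of gbest.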
Parameters
Velocity changes of a PSO consist of three parts, the “social” part, the “cognitive” part, and the momentum part
A PSO with a well-selected parameter set can perform well
Shi and Eberhart (1998, 1999) introduced a linearly decreasing inertia weight to PSO, and later designed fuzzy systems to change the inertia weight nonlinearly (Shi and Eberhart 2001)
The constriction coefficient was developed by Clerc in the hope that it can ensure that a PSO converges (Clerc 1999, Clerc and Kennedy 2002)
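$$V_{id} = K \left[ V_{id} + C_1 \cdot \mathrm{rand}_1() \cdot (P_{id} - X_{id}) + C_2 \cdot \mathrm{rand}_2() \cdot (P_{gd} - X_{id}) \right] \qquad (3)$$
$$K = \frac{2}{\left| 2 - \varphi - \sqrt{\varphi^2 - 4\varphi} \right|} \qquad (4)$$
where $\varphi = C_1 + C_2$, $\varphi > 4$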
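As a quick numerical check of Eq. (4), the constriction coefficient can be computed directly from C1 and C2. The second helper sketches a linearly decreasing inertia weight of the kind Shi and Eberhart introduced; the start and end values 0.9 and 0.4 are assumptions for illustration, as are both function names.

```python
import math

def constriction_coefficient(c1, c2):
    """Clerc's constriction coefficient K of Eq. (4); requires c1 + c2 > 4."""
    phi = c1 + c2
    if phi <= 4:
        raise ValueError("Eq. (4) requires C1 + C2 > 4")
    return 2.0 / abs(2.0 - phi - math.sqrt(phi * phi - 4.0 * phi))

def linear_inertia(k, k_max, w_start=0.9, w_end=0.4):
    """Inertia weight decreasing linearly from w_start to w_end
    over k_max iterations (endpoint values are assumed)."""
    return w_start - (w_start - w_end) * k / k_max
```

With C1 = C2 = 2.05, `constriction_coefficient` gives a value of about 0.73.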
Hybrid PSO algorithms
Some evolutionary computation techniques have been merged into the PSO algorithm
  Applying a selection operation to PSO (Angeline, 1998)
  Applying a crossover operation to PSO (Løvbjerg, Rasmussen and Krink, 2001)
  Applying a mutation operation to PSO (Miranda and Fonseca, 2002; Løvbjerg and Krink, 2002)
Other evolutionary operations have been incorporated into the PSO algorithm
  Either the PSO algorithm, a GA, or a hill-climbing search algorithm can be applied to a different sub-population of individuals (Krink and Løvbjerg, 2002)
  Differential evolution (DE) was combined with PSO (Hendtlass, 2001)
Non-evolutionary techniques have been incorporated into PSO
  A Cooperative Particle Swarm Optimizer (CPSO) was developed by Van Den Bergh and Engelbrecht (2004)
  The population of particles is divided into subpopulations that breed within their own sub-population so that the diversity of the population can be increased (Løvbjerg, Rasmussen and Krink, 2001)
  Deflation and stretching techniques (Parsopoulos and Vrahatis, 2004)
Applications
Constrained optimization problems
  A straightforward approach converts the constrained optimization problem into an unconstrained one (Parsopoulos and Vrahatis, 2002)
  Preserve feasible solutions and repair infeasible solutions (Hu and Eberhart, 2002)
  Hybrid algorithms that usually employ some information decoding strategy (Ray and Liew, 2001)
Multiobjective optimization problems (MOP)
  Convert a MOP to a single-objective optimization problem using weights (Parsopoulos and Vrahatis, 2002)
  Record a set of better-performing particles and then move towards particles randomly selected from that set, instead of the neighborhood best of the original PSO, to maintain population diversity and therefore a good distribution along the Pareto front (Ray and Liew, 2002; Coello Coello and Lechuga, 2002)
  Optimize one objective at a time (Hu and Eberhart, 2002)
Evolve weights and structures of neural networks
  Evolve neural networks (Eberhart and Shi, 1998; Kennedy, Eberhart and Shi, 2001)
  Analyze human tremor (Eberhart and Hu, 1999)
  Leaf shape matching (J.X. Du and D.S. Huang, 2005)
  A hybrid PSO-backpropagation algorithm for feedforward neural network training (J.R. Zhang and D.S. Huang, et al., 2005)
2. Niche Techniques
The definition of niche
  The term niche is borrowed from ecology
  Horn's definition: a form of cooperation around finite, limited resources, resulting in the lack of competition between such areas, and causing the formation of species for each niche
The target of a niche technique is to find multiple solutions to an optimization problem
Each resource can be considered a niche in the optimization problem, and each subpopulation exploiting a niche can be considered a species
Aim: find all optima (global and/or local) of the objective function
Motivation:
  Provide the decision maker with not just a single optimal solution but a set of good solutions
  Find all locally optimal solutions
Applications:
  Systems design
  DNA sequence analysis
  Multimodal function optimization
Ordinary optimization techniques
Aim: find a global optimum
Evolutionary approach: the population concentrates on the global optimum (a single powerful species)
Premature convergence: bad
Niche optimization techniques
Aim: find all (global/ local) optima
Evolutionary approach: different species are formed, each one of which identifies an optimum
Premature convergence: not so bad
2.1 The Origin of Niche Techniques
Preselection (Cavicchio, 1970)
  A modification to the replacement step of a classical GA
  In preselection, not all of the generated offspring are chosen for the new population; only offspring with higher fitness than their parents replace their parents in the next generation
  Like other traditional GAs, this technique does not keep stable species or subpopulations for many generations, and it converges to only one optimum
2.2 Crowding (De Jong, 1975)
The offspring replaces similar individuals from the population. An offspring is inserted into the population as follows:
  First, a group of CF individuals is selected at random from the population, where the crowding factor CF is the size of the group
  Second, the bit string of the offspring chromosome is compared with those of the CF individuals in the group using the Hamming distance or another measure
  The group member that is most similar to the offspring is replaced by the offspring
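The crowding replacement step can be sketched as follows, assuming chromosomes are equal-length bit strings and similarity is measured by Hamming distance. The names `hamming` and `crowding_insert` are hypothetical helpers for this example.

```python
import random

def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

def crowding_insert(population, offspring, cf, rng=random):
    """Replace the most similar of cf randomly chosen individuals
    with the offspring; return the index that was replaced."""
    group = rng.sample(range(len(population)), cf)
    victim = min(group, key=lambda i: hamming(population[i], offspring))
    population[victim] = offspring
    return victim
```

When CF is small relative to the population, the sampled group may not contain any member of the offspring's own niche, which is the source of the replacement errors associated with this technique.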
2.3 Sharing (Goldberg and Richardson 1987)
In sharing, the fitness value of an individual is adjusted according to the number of individuals in its neighborhood or niche
A sharing function assumes a value between 0 and 1 for any distance $d_{ij}$ between two individuals i and j in the population:
$$Sh(d_{ij}) = \begin{cases} 1 - \left( \dfrac{d_{ij}}{\sigma_{share}} \right)^{\alpha} & \text{if } d_{ij} < \sigma_{share} \\ 0 & \text{otherwise} \end{cases} \qquad (5)$$
The sharing function counts the individuals in a given neighborhood, and the new fitness is calculated by
$$F'_i = \frac{F_i}{\sum_{j=1}^{n} Sh(d_{ij})} \qquad (6)$$
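A small sketch of fitness sharing may make the niche count concrete. It assumes one-dimensional individuals with the absolute difference as distance, and a sharing function of the usual form 1 - (d / sigma_share)^alpha inside the niche radius; the function names and the default `alpha=1.0` are assumptions.

```python
def sharing(d, sigma_share, alpha=1.0):
    """Sharing function: 1 at distance 0, falling to 0 at the niche radius."""
    return 1.0 - (d / sigma_share) ** alpha if d < sigma_share else 0.0

def shared_fitness(points, fitness, sigma_share, alpha=1.0):
    """Divide each raw fitness by its niche count (sum of sharing values).

    The niche count includes the individual itself (distance 0),
    so it is never smaller than 1.
    """
    result = []
    for i, xi in enumerate(points):
        niche_count = sum(sharing(abs(xi - xj), sigma_share, alpha)
                          for xj in points)
        result.append(fitness[i] / niche_count)
    return result
```

Two individuals at the same point halve each other's fitness, while an isolated individual keeps its raw fitness. The double loop over the population is where the O(n^2) cost comes from.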
2.4 The Flaws of Crowding and Sharing
Crowding technique
  The algorithm is likely to lose some of the optima because of replacement errors
Sharing technique
  The computational complexity is of order O(n^2)
  It depends on prior knowledge of the multimodal problem (a single niche radius must be specified in advance)
2.5 Improvements to Crowding Technique
Deterministic Crowding (Mahfoud, 1992)
  The individuals are mated at random
  Each of the two offspring is first paired with one of the parents; the pairing is not random but is chosen so that each offspring is paired with its most similar parent
  Each offspring is then compared with its paired parent; the individual with the higher fitness stays in the population and the other is eliminated
Restricted Tournament Selection (Harik, 1995)
  Both parents are selected at random from the population
  An offspring is inserted into the population by choosing a group of individuals from the population at random with replacement; the individual in the group that is most similar to the offspring is then selected
  The offspring replaces the chosen individual in the population if its fitness value is higher; otherwise, the offspring is eliminated
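The replacement step of deterministic crowding can be sketched as below. Pairing by the smaller total parent-offspring distance is the usual way of realizing "most similar parent" in Mahfoud's scheme; the function name and the callable arguments `fitness` and `distance` are assumptions for this illustration, and higher fitness is taken to be better.

```python
def deterministic_crowding_step(p1, p2, c1, c2, fitness, distance):
    """Pair each offspring with its most similar parent, then keep
    the fitter individual of each pair. Returns the two survivors."""
    # Choose the pairing with the smaller total parent-offspring distance.
    if distance(p1, c1) + distance(p2, c2) <= distance(p1, c2) + distance(p2, c1):
        pairs = [(p1, c1), (p2, c2)]
    else:
        pairs = [(p1, c2), (p2, c1)]
    # An offspring survives only if it is strictly fitter than its parent.
    return [child if fitness(child) > fitness(parent) else parent
            for parent, child in pairs]
```

Because an offspring can only displace the parent it resembles, individuals in different niches do not compete directly.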
2.6 Improvements to Sharing Technique
Cluster analysis (Yin and Germay, 1993)
  Reduces the complexity of sharing to O(n)
  After each generation, an adaptive clustering algorithm dynamically groups the individuals in the population into a number of clusters
  These clusters are then used to compute the sharing value and update the fitness value of each individual in the population
2.7 Niche Techniques Based on Other Algorithms
Multipopulation Differential Evolution Algorithm (Zaharie, 2004)
  Multiple subpopulations are carefully initialized
  A multi-resolution approach is then used to avoid specifying a niche radius
Niching Particle Swarm Optimization (Brits, 2002)
  The algorithm uses only the cognitive model to train a main swarm
  When a particle's fitness shows very little change over a small number of iterations, the algorithm creates a sub-swarm in a small area around the particle so that the sub-swarms can be trained to locate multiple solutions
2.8 Sequential and Parallel Niche Techniques
Sequential niche technique (Beasley, 1993)
  Iterative application of a GA
  At each iteration an optimum is identified
  The fitness function is modified based on the optima already found
Parallel niche techniques
  Divide the population into communicating subpopulations that evolve in parallel
  Each subpopulation corresponds to a species whose aim is to populate a niche in the fitness landscape and to identify an optimum
2.9 Disadvantages to Sequential Niche (SN)
Some disadvantages of the sequential niche technique (Mahfoud, 1995)
  Parallel niche techniques are faster than the SN technique
  As the SN technique modifies the fitness function around the optima already found, the remaining optima become increasingly difficult to locate
  The SN technique is likely to locate the same solutions repeatedly
  Parallel niche techniques can easily be implemented on parallel machines, but SN techniques cannot
2.10 Defects to Current Niche Techniques
The drawbacks of these niche methods (including their improved versions) cannot be completely avoided
  Replacement errors (crowding techniques)
  Dependence on prior knowledge (sharing techniques)
The reason:
  Lack of an effective niche identification technique (NIT)
What would happen if there were an effective niche identification technique?
  Some of the problems could be easily solved
2.11 Niche Identification Technique (NIT)
Analyze the topological structure of a multimodal function to identify a niche
Hill valley function (Ursem, 1999)
  This function takes several samples between any two points of the search space. If the fitness of any interior sample is smaller than the minimal fitness of the two points, the function determines that the two points belong to different niches
Niche identification techniques (Lin et al., 2002)
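The hill valley test described above can be sketched as a boolean check for a maximization problem. The sample positions along the segment (`samples`) and the function name are assumptions; each interior sample costs one extra fitness evaluation.

```python
def hill_valley(f, p, q, samples=(0.25, 0.5, 0.75)):
    """Return True if a valley is detected between points p and q,
    i.e. they are judged to lie in different niches (maximization).

    p and q are sequences of coordinates; samples are fractions of
    the way from p to q (assumed values).
    """
    floor = min(f(p), f(q))
    for t in samples:
        interior = [a + t * (b - a) for a, b in zip(p, q)]
        if f(interior) < floor:
            return True   # an interior point is lower than both ends
    return False
```

With few samples a narrow valley between the two points can be missed, so a false judgment is possible; more samples reduce that risk at the price of more function evaluations.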
2.12 Defects of NIT
Key defects:
  These NITs usually need plenty of extra function evaluations
  False judgments may occur
These NITs cannot be directly applied in the two popular niche techniques mentioned above
  These niche techniques use a single large population to explore the search space
  If an NIT is employed, the total number of function evaluations increases astronomically
  In general, an excessive number of function evaluations is not acceptable
Open Problems
How to find a more effective and efficient NIT (the next research objective)
How to decrease the extra function evaluations when using existing NITs?
  Divide a larger population into multiple sub-populations
  Some new methods must be employed that can effectively reduce the function evaluations
3. Novel Adaptive SN Technique
A Dilemma of Current Niche Techniques
On one hand, a niche technique must sacrifice some global search ability by confining individuals to explore only their own niches
  The sharing technique does so by setting a niche radius
  The crowding technique only replaces the most similar individual
On the other hand, a niche technique also needs some global exploring information to guide every individual to explore the whole search space
Exploring Information Exchange
If the population is sufficiently large, the exchange of exploring information is not necessary
Good exchange of exploring information across the whole population is very important for difficult problems
The Unique Advantages of SN
The traditional parallel niche techniques seem to have inadequate capability for exchanging exploring information among the population
In contrast, the SN technique has a good mechanism for exchanging exploring information: a sub-population cannot explore a region already searched by another sub-population
From this point, it can be seen that the SN technique has its unique advantages.
3.1 Novel Adaptive SN Technique
The basic idea of the adaptive SN technique
  The technique uses multiple sub-swarms to detect optimal solutions sequentially
  To encourage a new sub-swarm to fly to a new place in the search space, the algorithm modifies the raw fitness function
  The hill valley function is used to determine how to change the fitness of a particle in the sub-swarm currently running
  A sequential dynamic niche radius update algorithm is used to decrease the extra function evaluations
3.2 Sequential Dynamic Niche Radius Algorithm
1. Create and initialize a sub-swarm PSO algorithm with a larger niche radius
2. Train the sub-swarm until convergence
3. Repeat
4.   Create and initialize a new sub-swarm with a larger niche radius
5.   For every particle in this new sub-swarm
6.     If the distance between the particle and the best particle of a previously launched sub-swarm is smaller than that sub-swarm's niche radius
7.       Use the hill valley function to judge whether or not they belong to the same niche
8.       If the two particles do not belong to the same niche, update the niche radius; otherwise modify the fitness of the particle
9.   Train the sub-swarm
10. Until all sub-swarms converge
3.3 Experimental results
[Figure: the final niche radius of Shekel's Foxholes function, where 25 sub-swarms are used]
The adaptive SN PSO algorithm can find the optima of some multimodal test functions without any prior knowledge
As Mahfoud pointed out, the running speed is slow
We therefore implemented a parallel niche PSO algorithm
Neurocomputing (accepted)
4. Multi-Sub-Swarm PSO Algorithm
Multiple sub-swarms run simultaneously
The different sub-swarms compete with each other; the winner of a competition continues to explore its original region, while the loser is obliged to explore another region
To prevent two sub-swarms from detecting the same optimum, a particle intruding into another niche is punished
4.1 MSSPSO Algorithm
1. Create and initialize N PSO sub-swarms with a larger niche radius
2. If the best particles of two different sub-swarms are located in the same niche, compare their fitness; the sub-swarm with the smaller fitness is re-initialized
3. For every particle that intrudes into the niche around the memorized position X of another sub-swarm, decrease the fitness of the particle
4. Train every sub-swarm and update the best particle of each sub-swarm
5. Update and compensate the new niche radius
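Step 2, the competition between sub-swarms, might look like the following sketch. It only decides which sub-swarms lose and should be re-initialized; the function name `resolve_competition` and the `same_niche` predicate (for example a hill valley test combined with a niche radius check) are assumptions, and higher fitness is taken to win.

```python
def resolve_competition(bests, fitness, same_niche):
    """Return the sorted indices of sub-swarms to re-initialize.

    bests[i]   : best position found by sub-swarm i
    fitness[i] : fitness of that best position (higher wins)
    same_niche : callable(a, b) -> bool, an assumed niche test
    """
    losers = set()
    for i in range(len(bests)):
        for j in range(i + 1, len(bests)):
            if i in losers or j in losers:
                continue  # a losing sub-swarm no longer competes
            if same_niche(bests[i], bests[j]):
                losers.add(i if fitness[i] < fitness[j] else j)
    return sorted(losers)
```

The caller would then re-initialize each returned sub-swarm elsewhere in the search space before the next training step.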
4.2 Experimental Functions
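$$F_1(x) = \sin^6\left( 5\pi \left( x^{3/4} - 0.05 \right) \right) \qquad (7)$$
$$F_2(x) = e^{-2 \log(2) \left( \frac{x - 0.1}{0.8} \right)^2} \sin^6\left( 5\pi \left( x^{3/4} - 0.05 \right) \right) \qquad (8)$$
$$F_3(x, y) = 500 - \frac{1}{0.002 + \sum_{i=0}^{24} \dfrac{1}{1 + i + (x - a(i))^6 + (y - b(i))^6}} \qquad (9)$$
where $a(i) = 16[(i \bmod 5) - 2]$ and $b(i) = 16(\lfloor i/5 \rfloor - 2)$
$$F_4(x, y) = \sum_{i=1}^{5} i \cos[(i+1)x + i] \cdot \sum_{j=1}^{5} j \cos[(j+1)y + j] \qquad (10)$$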
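For experimentation, the four test functions can be written in Python as below. This follows one reading of the slide formulas: F1 and F2 use the unevenly spaced sine term sin^6(5π(x^(3/4) - 0.05)), F3 is Shekel's Foxholes with a(i) = 16((i mod 5) - 2) and b(i) = 16(floor(i/5) - 2), and F4 is a Shubert-type product. The function names are assumptions, and F1 and F2 are only defined here for x >= 0.

```python
import math

def f1(x):
    """Unevenly spaced peaks of equal height (x >= 0)."""
    return math.sin(5 * math.pi * (x ** 0.75 - 0.05)) ** 6

def f2(x):
    """Same peak positions as f1 with a decaying envelope (x >= 0)."""
    envelope = math.exp(-2 * math.log(2) * ((x - 0.1) / 0.8) ** 2)
    return envelope * f1(x)

def a(i):
    return 16 * ((i % 5) - 2)

def b(i):
    return 16 * ((i // 5) - 2)

def f3(x, y):
    """Shekel's Foxholes with 25 peaks on a 5 x 5 grid."""
    s = sum(1.0 / (1 + i + (x - a(i)) ** 6 + (y - b(i)) ** 6)
            for i in range(25))
    return 500 - 1.0 / (0.002 + s)

def f4(x, y):
    """Shubert-type function: product of two cosine sums."""
    sx = sum(i * math.cos((i + 1) * x + i) for i in range(1, 6))
    sy = sum(j * math.cos((j + 1) * y + j) for j in range(1, 6))
    return sx * sy
```

Under this reading, the highest peak of F3 sits at (-32, -32), the grid point for i = 0.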
4.3 Performance Criteria
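Maximum Peak Ratio: the sum of the fitness values of the local optima identified by the niche technique divided by the sum of the fitness values of the actual optima of the multimodal problem (Miller and Shaw, 1996)
$$Peak_{ratio} = \frac{\sum_{i=1}^{p} f_i}{\sum_{i=1}^{N} F_i}$$
where $f_i$ are the fitness values of the p identified optima and $F_i$ are the fitness values of the N actual optima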
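Computed directly from its definition, the maximum peak ratio is a one-line function. The name `maximum_peak_ratio` and the list-based interface are assumptions for this sketch.

```python
def maximum_peak_ratio(found, actual):
    """Sum of the fitness values of the identified optima divided by
    the sum of the fitness values of the actual optima."""
    return sum(found) / sum(actual)
```

A value of 1 means every actual optimum was located exactly; missing a peak or locating it imprecisely lowers the ratio.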
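Chi-Square-Like Performance Criterion: the chi-square-like criterion measures the deviation between the population distribution and an ideal proportionally populated distribution (Deb and Goldberg, 1989)
$$\text{chi-square-like deviation} = \sqrt{\sum_{i=1}^{q+1} \left( \frac{X_i - \mu_i}{\sigma_i} \right)^2}$$
$$\mu_i = N \frac{f_i}{\sum_{k=1}^{q} f_k}, \qquad \sigma_i^2 = \mu_i \left( 1 - \frac{\mu_i}{N} \right), \qquad \mu_{q+1} = 0, \qquad \sigma_{q+1}^2 = \sum_{i=1}^{q} \sigma_i^2$$
where $X_i$ is the number of individuals in niche i, q is the number of peaks, N is the population size, and index q+1 collects the individuals that belong to no peak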
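The chi-square-like deviation can be computed as sketched below, following the usual Deb and Goldberg formulation. The function name and argument layout are assumptions: `counts` holds q + 1 entries, the last being the number of individuals on no peak, and at least two peaks are needed so that no variance is zero.

```python
import math

def chi_square_like(counts, peak_fitness):
    """Deviation of the observed niche counts from the ideal,
    fitness-proportional distribution.

    counts       : X_1 .. X_q, then X_{q+1} (individuals on no peak)
    peak_fitness : f_1 .. f_q, fitness of each actual peak (q >= 2)
    """
    n = sum(counts)                       # population size N
    total = sum(peak_fitness)
    mu = [n * f / total for f in peak_fitness]
    var = [m * (1 - m / n) for m in mu]
    mu.append(0.0)                        # ideally nobody is off-peak
    var.append(sum(var))                  # variance of the off-peak class
    return math.sqrt(sum((x - m) ** 2 / v
                         for x, m, v in zip(counts, mu, var)))
```

A population distributed exactly in proportion to peak fitness gives a deviation of 0; larger values indicate a poorer distribution.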
Number of Fitness Function Evaluations
  In most real-world applications, evaluating the fitness function is computationally very expensive
  A niche technique must keep the number of function evaluations within a certain range
Adaptive ability
  Most niche techniques are very sensitive to some parameters; an inappropriate parameter may degrade the performance of a niche technique
  So adaptive ability is another important measure for a niche technique
Maximum Peak Ratio
Test function    Maximum Peak Ratio
F1               1
F2               1
F3               0.99916
F4               0.9689
Chi-square-like deviation
IEEE TEC (to be submitted)
Number of Function Evaluations (NFE)
Test function    Population size    No. of sub-swarms    Ordinary NFE    NFE
F1               10                 5                    11120           10000
F2               10                 5                    11958           10000
F3               20                 25                   116362          100000
F4               40                 18                   150495          144000
5. Conclusions
The proposed method imitates a natural ecosystem well: the different sub-populations compete with each other
The proposed method constructs a dynamic niche radius algorithm (DNRA), which can greatly reduce the extra function evaluations
The proposed method integrates the sequential technique with the parallel one
The proposed method performs well
The proposed method has strong adaptive ability
Future Work:
  How to choose a more efficient and effective niche identification technique
  A constructive method must be developed to cover niches of any shape
  How to apply the proposed method to hard real-world multimodal problems
ALL OVER THE END