academic report de-shuang huang intelligent computing lab, hefei institute of intelligent machines,...

Academic Report

De-Shuang Huang

Intelligent Computing Lab, Hefei Institute of Intelligent Machines, Chinese Academy of Sciences

Department of Automation, University of Science and Technology of China

http://www.intelengine.cn/

29 March 2006

Part I: A Multi-Sub-Swarm PSO Algorithm for Multimodal Function Optimization

Part II: A Brief Introduction to ICL

Outline

1. Particle Swarm Optimization (PSO)

2. Niche Techniques and Development

3. Novel Adaptive Sequential Niche Technique

4. Multi-Sub-Swarm PSO Algorithm

5. Conclusions

Part I: A Multi-Sub-Swarm PSO Algorithm for Multimodal Function Optimization

1. Particle Swarm Optimization

The Particle Swarm Optimization (PSO) algorithm was developed in 1995 by James Kennedy and Russ Eberhart

It was inspired by the social behavior of bird flocking and fish schooling

PSO applies the concept of social interaction to problem solving

1.1 PSO Algorithm

$$v(k+1) = W \cdot v(k) + C_1 \cdot rand_1() \cdot (pbest - x(k)) + C_2 \cdot rand_2() \cdot (gbest - x(k)) \tag{1}$$

$$x(k+1) = x(k) + v(k+1) \tag{2}$$

W is called the inertia weight; C1 and C2 are positive constants, referred to as the cognitive and social parameters; rand1() and rand2() are random numbers uniformly distributed in [0, 1]

[Figure: a particle's velocity update from v(k) to v(k+1), drawn toward pbest and gbest]
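The velocity and position updates (1) and (2) can be sketched for a single particle as follows. This is a minimal illustration, not the authors' implementation: the function name `pso_step`, the default values of `w`, `c1`, and `c2`, and the choice to draw a fresh random number per dimension are all assumptions.

```python
import random

def pso_step(x, v, pbest, gbest, w=0.7, c1=2.0, c2=2.0, rng=random):
    """Apply Eq. (1) then Eq. (2) to one particle (lists of floats)."""
    new_v = [
        w * vi                               # momentum part
        + c1 * rng.random() * (pb - xi)      # cognitive part, pulls toward pbest
        + c2 * rng.random() * (gb - xi)      # social part, pulls toward gbest
        for xi, vi, pb, gb in zip(x, v, pbest, gbest)
    ]
    new_x = [xi + vi for xi, vi in zip(x, new_v)]
    return new_x, new_v
```

When a particle already sits on both pbest and gbest, the two attraction terms vanish and only the inertia term `w * v` moves it.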

1.2 Current Research in PSO

Research on PSO can generally be categorized into five parts:

Algorithms (binary PSO algorithms)

Topology (designing different types of neighborhood structures for PSO)

Parameters (the inertia weight W, the constriction coefficient, and the impact of these parameters on the performance of the PSO algorithm)

Hybrid PSO algorithms (combining PSO with other techniques)

Applications (constrained optimization, multiobjective optimization, neural network training, etc.)

Swarm Topology

Two general types of neighborhood structures were investigated, gbest and lbest (Eberhart, Simpson, and Dobbins, 1996)

[Figure: gbest and lbest neighborhood structures over particles I0 to I4]

The global model converges fast but may converge to a local minimum, while the local model converges more slowly but may have a better chance of finding better solutions (Kennedy 1999; Kennedy, Eberhart and Shi 2001)

Many researchers have worked on improving PSO performance by designing or implementing different types of neighborhood structures:

Kennedy and Mendes tested PSO with different neighborhoods

Mendes and Kennedy proposed a fully informed particle swarm optimization algorithm
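The difference between the gbest and lbest models comes down to which particles each particle may learn from. The sketch below assumes a ring topology for lbest and a maximization problem; the helper names and the neighborhood half-width `k` are illustrative, not taken from the cited papers.

```python
def ring_neighbors(i, n, k=1):
    """Indices of particle i and its k neighbors on each side of a ring of n particles."""
    return [(i + d) % n for d in range(-k, k + 1)]

def local_best(i, fitness, k=1):
    """lbest: index of the fittest particle inside particle i's ring neighborhood."""
    return max(ring_neighbors(i, len(fitness), k), key=lambda j: fitness[j])

def global_best(fitness):
    """gbest: index of the fittest particle in the whole swarm."""
    return max(range(len(fitness)), key=lambda j: fitness[j])
```

With lbest, information about a good position spreads only one neighborhood per iteration, which is one way to see why the local model converges more slowly.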

Parameters

Velocity changes of a PSO consist of three parts, the “social” part, the “cognitive” part, and the momentum part

A PSO with a well-selected parameter set can perform well

Shi and Eberhart (1998, 1999) introduced a linearly decreasing inertia weight to PSO, and later designed fuzzy systems to change the inertia weight nonlinearly (Shi and Eberhart 2001)

The constriction coefficient was developed by Clerc in the hope that it can ensure that a PSO converges (Clerc 1999, Clerc and Kennedy 2002)

$$V_{id} = K \cdot [V_{id} + C_1 \cdot rand_1() \cdot (P_{id} - X_{id}) + C_2 \cdot rand_2() \cdot (P_{gd} - X_{id})] \tag{3}$$

$$K = \frac{2}{\left| 2 - \varphi - \sqrt{\varphi^2 - 4\varphi} \right|} \tag{4}$$

where $\varphi = C_1 + C_2$, $\varphi > 4$
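Both parameter schemes above reduce to small formulas. The sketch below is illustrative only: the start and end weights 0.9 and 0.4 are commonly used values assumed here rather than taken from the slides, and the function names are hypothetical.

```python
import math

def inertia_weight(k, max_iter, w_start=0.9, w_end=0.4):
    """Linearly decreasing inertia weight at iteration k (assumed endpoints)."""
    frac = min(max(k / max_iter, 0.0), 1.0)
    return w_start - (w_start - w_end) * frac

def constriction(c1, c2):
    """Constriction coefficient K of Eq. (4); requires phi = c1 + c2 > 4."""
    phi = c1 + c2
    if phi <= 4:
        raise ValueError("phi = c1 + c2 must be greater than 4")
    return 2.0 / abs(2.0 - phi - math.sqrt(phi * phi - 4.0 * phi))
```

For example, with C1 = C2 = 2.05 the coefficient evaluates to roughly 0.73, so every velocity is damped by about a quarter at each step.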

Hybrid PSO algorithms

Some evolutionary computation techniques were merged into the PSO algorithm:

  Applying a selection operation to PSO (Angeline, 1998)

  Applying a crossover operation to PSO (Løvbjerg, Rasmussen and Krink 2001)

  Applying a mutation operation to PSO (Miranda and Fonseca 2002, Løvbjerg and Krink, 2002)

Other evolutionary operations were incorporated into the PSO algorithm:

  Either the PSO algorithm, a GA, or a hill-climbing search algorithm can be applied to a different sub-population of individuals (Krink and Løvbjerg, 2002)

  Differential evolution (DE) was combined with PSO (Hendtlass, 2001).

Non-evolutionary techniques have been incorporated into PSO:

  A Cooperative Particle Swarm Optimizer (CPSO) was developed by Van Den Bergh and Engelbrecht (2004)

  The population of particles is divided into subpopulations which breed within their own sub-population so that the diversity of the population can be increased (Løvbjerg, Rasmussen and Krink 2001)

  Deflation and stretching techniques (Parsopoulos and Vrahatis, 2004).

Applications

Constrained optimization problems:

  A straightforward approach that converts the constrained optimization problem into an unconstrained one (Parsopoulos and Vrahatis, 2002)

  Preserve feasible solutions and repair infeasible solutions (Hu and Eberhart, 2002)

  Hybrid algorithms that usually employ some information decoding strategy (Ray and Liew, 2001).

Multiobjective optimization problems (MOP):

  Convert a MOP to a single-objective optimization problem using weights (Parsopoulos and Vrahatis, 2002)

  Record a set of better-performing particles and then move towards particles randomly selected from that set, instead of the neighborhood best of the original PSO, to maintain population diversity and therefore a good distribution along the Pareto front (Ray and Liew, 2002, Coello Coello and Lechuga, 2002)

  Optimize one objective at a time (Hu and Eberhart, 2002)

Evolve weights and structures of neural networks:

  Evolve neural networks (Eberhart and Shi, 1998, Kennedy, Eberhart and Shi, 2001)

  Analyze human tumor (Eberhart and Hu, 1999)

  Leaf shape matching (J.X. Du and D. S. Huang, 2005)

  A Hybrid PSO-backpropagation Algorithm for Feedforward Neural Network Training (J.R. Zhang and D. S. Huang, et al., 2005).

2. Niche Techniques

The definition of niche:

  The term "niche" is borrowed from ecology

  Horn’s definition: a form of cooperation around finite, limited resources, resulting in the lack of competition between such areas, and causing the formation of species for each niche

The target of niche techniques is to find multiple solutions to optimization problems

Each resource can be considered as a niche in an optimization problem, and each subpopulation exploiting a niche can be considered as a species.

Aim: find all optima (global and/or local) of the objective function

Motivation:

  Provide the decision maker with not only a single optimal solution but also a set of good solutions

  Find all locally optimal solutions

Applications:

  Systems design

  DNA sequence analysis

  Multimodal function optimization

Ordinary optimization techniques

Aim: find a global optimum

Evolutionary approach: the population concentrates on the global optimum (a single powerful species)

Premature convergence: bad

Niche optimization techniques

Aim: find all (global/ local) optima

Evolutionary approach: different species are formed, each one of which identifies an optimum

Premature convergence: not so bad

2.1 The Origin of Niche Techniques

Preselection (Cavicchio, 1970)

A modification to the replacement step of a classical GA

In preselection, not all generated offspring are chosen for the new population; only offspring with higher fitness than their parents replace their parents in the next generation

Like other traditional GAs, this technique does not keep stable species or subpopulations for many generations, and it converges to only one optimum.

2.2 Crowding (De Jong, 1975)

The offspring replaces similar individuals from the population. An offspring is inserted into the population as follows:

First, a group of CF individuals (CF is the crowding factor, which indicates the size of the group) is selected at random from the population.

Second, the bit string of the offspring chromosome is compared with those of the CF individuals in the group using the Hamming distance or another measure.

Finally, the group member that is most similar to the offspring is replaced by the offspring.
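The crowding replacement step can be sketched as below. This is an illustration under stated assumptions: chromosomes are equal-length bit strings, the group is sampled without replacement, and the function names are hypothetical.

```python
import random

def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

def crowding_insert(population, offspring, cf, rng=random):
    """Replace the most similar of cf randomly chosen individuals with the offspring.

    Returns the index of the replaced individual; population is modified in place.
    """
    group = rng.sample(range(len(population)), cf)
    victim = min(group, key=lambda idx: hamming(population[idx], offspring))
    population[victim] = offspring
    return victim
```

When CF is small relative to the population, the sampled group may not contain any individual from the offspring's own niche, which is the source of the replacement errors discussed later.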

2.3 Sharing (Goldberg and Richardson 1987)

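In sharing, the fitness value of an individual is adjusted according to the number of individuals in its neighborhood or niche

A sharing function takes a value between 0 and 1 for any distance $d_{ij}$ between two individuals i and j in the population:

$$Sh(d_{ij}) = \begin{cases} 1 - \left( \dfrac{d_{ij}}{\sigma_{share}} \right)^{\alpha} & \text{if } d_{ij} < \sigma_{share} \\ 0 & \text{otherwise} \end{cases} \tag{5}$$

where $\sigma_{share}$ is the niche radius and $\alpha$ is a constant that shapes the sharing function

The sharing function counts the individuals in a given neighborhood, and the new fitness is calculated by

$$F_i' = \frac{F_i}{\sum_{j=1}^{n} Sh(d_{ij})} \tag{6}$$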
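Equations (5) and (6) can be sketched as follows. The example assumes one-dimensional individuals with absolute difference as the distance, and a shape exponent `alpha` defaulting to 1; these choices and the function names are assumptions for illustration.

```python
def sharing(d, sigma_share, alpha=1.0):
    """Sharing function of Eq. (5): 1 at distance 0, falling to 0 at the niche radius."""
    return 1.0 - (d / sigma_share) ** alpha if d < sigma_share else 0.0

def shared_fitness(points, fitness, sigma_share, alpha=1.0):
    """Shared fitness of Eq. (6): raw fitness divided by the niche count."""
    result = []
    for xi, fi in zip(points, fitness):
        # Each individual counts itself (distance 0), so the niche count is at least 1.
        niche_count = sum(sharing(abs(xi - xj), sigma_share, alpha) for xj in points)
        result.append(fi / niche_count)
    return result
```

The nested loop over all pairs is what gives sharing its O(n^2) cost per generation.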

2.4 The Flaws of Crowding and Sharing

Crowding technique:

  The algorithm is likely to lose some of the optima because of replacement errors

Sharing technique:

  The computational complexity is of order O(n^2)

  It depends on some prior knowledge of the multimodal problem (a single niche radius must be specified in advance)

2.5 Improvements to Crowding Technique

Deterministic Crowding (Mahfoud, 1992):

  Individuals are selected to mate at random

  Each of the two offspring is first paired with one of the parents; this pairing is not random, but is done so that each offspring is paired with its most similar parent

  Then each offspring is compared with its paired parent; the individual with the higher fitness stays in the population and the other is eliminated.

Restricted Tournament Selection (Harik, 1995):

  Both parents are selected at random from the population

  Offspring are inserted into the population by choosing a group of individuals from the population at random with replacement; the individual in the group that is most similar to the offspring is then selected

  The offspring replaces the chosen individual in the population if its fitness value is higher; otherwise, the offspring is eliminated.
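The pairing and tournament of deterministic crowding, described above, can be sketched as follows. The pairing rule used here (choose the assignment with the smaller total parent-offspring distance) is one common way to realize "most similar parent" and is an assumption, as are the function and parameter names.

```python
def deterministic_crowding(p1, p2, c1, c2, fitness, distance):
    """Return the two survivors of one parent/offspring tournament (maximization)."""
    # Pair each offspring with its most similar parent.
    if distance(p1, c1) + distance(p2, c2) <= distance(p1, c2) + distance(p2, c1):
        pairs = [(p1, c1), (p2, c2)]
    else:
        pairs = [(p1, c2), (p2, c1)]
    # Within each pair, the offspring survives only if it is strictly fitter.
    return [child if fitness(child) > fitness(parent) else parent
            for parent, child in pairs]
```

Because an offspring only ever competes with its own nearest parent, no crowding factor parameter is needed.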

2.6 Improvements to Sharing Technique

Cluster analysis (Yin and Germay, 1993):

  Reduces the complexity of sharing to O(n)

  After each generation, an adaptive clustering algorithm dynamically groups the individuals in the population into a number of clusters

  These clusters are then used to compute the sharing value and update the fitness value for each individual in the population.

2.7 Niche Techniques Based on Other Algorithms

Multipopulation Differential Evolution Algorithm (Zaharie, 2004):

  Multiple subpopulations are carefully initialized

  Then a multi-resolution approach is used to avoid specifying a niche radius.

Niching Particle Swarm Optimization (Brits, 2002):

  The algorithm uses only the cognitive model to train a main swarm

  When a particle's fitness shows very little change over a small number of iterations, the algorithm creates a sub-swarm around the particle in a small area so that the sub-swarm can be trained to locate multiple solutions.

2.8 Sequential and Parallel Niche Techniques

Sequential niche technique (Beasley, 1993):

  Iterative application of a GA

  At each iteration an optimum is identified

  The fitness function is modified based on the optima already found

Parallel niche techniques:

  Divide the population into communicating subpopulations that evolve in parallel

  Each subpopulation corresponds to a species whose aim is to populate a niche in the fitness landscape and to identify an optimum.

2.9 Disadvantages to Sequential Niche (SN)

Some disadvantages of the sequential niche techniques (Mahfoud, 1995):

  Parallel niche techniques are faster than the SN technique

  As the SN technique modifies the fitness around optima already found, the remaining optima become increasingly difficult to locate

  The SN technique is likely to locate the same solutions repeatedly

  Parallel niche techniques can easily be implemented on parallel machines, but SN techniques cannot.

2.10 Defects to Current Niche Techniques

The drawbacks of these niche methods (including improved versions) cannot be completely avoided:

  Replacement errors (crowding techniques)

  Dependence on some prior knowledge (sharing techniques)

The reason:

  Lack of an effective niche identification technique (NIT)

What will happen if there is an effective niche identification technique?

  Some of the problems can be easily solved.

2.11 Niche Identification Technique (NIT)

Analyze the topological structure of a multimodal function to identify niches:

  Hill valley function (Ursem, 1999): this function takes multiple samples between any two points of the search space. If the fitness of any interior sample is smaller than the minimal fitness of the two points, then the function determines that the two points belong to different niches.

  Niche identification techniques (Lin, et al, 2002)
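The hill valley test described above can be sketched as follows for a maximization problem. The sample positions (one quarter, one half, and three quarters of the way along the segment) and the function name are assumptions; the number and placement of samples is a tunable choice.

```python
def hill_valley(p, q, f, samples=(0.25, 0.5, 0.75)):
    """Return True if a valley separates points p and q, i.e. they lie in different niches.

    p and q are lists of coordinates; f maps a point to its fitness (maximization).
    """
    floor = min(f(p), f(q))
    for t in samples:
        interior = [pi + t * (qi - pi) for pi, qi in zip(p, q)]
        if f(interior) < floor:
            return True   # found a sample below both end points
    return False
```

Each call costs up to `len(samples)` extra fitness evaluations. A `False` result is not a proof that the points share a niche, because a narrow valley can fall between the samples.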

2.12 Defects of NIT

Key defects:

  These NITs usually need plenty of extra function evaluations

  False judgments may occur

These NITs cannot be directly applied in the two popular niche techniques mentioned above:

  These niche techniques use a whole large population to explore the search space

  If the NIT is employed, the total number of function evaluations will increase astronomically

  In general, an excessive number of function evaluations is not acceptable.

Open Problems

How to find a more effective and efficient NIT (the next research objective)

How to decrease extra function evaluations using existing NITs?

  Divide a larger population into multiple sub-populations

  Some new methods must be employed that can effectively reduce the function evaluations.

3. Novel Adaptive SN Technique

A Dilemma for Current Niche Techniques

On one hand, a niche technique must sacrifice some global search ability by confining individuals to explore only their own niches:

  The sharing technique relies on a preset niche radius.

  The crowding technique only replaces the most similar individual.

On the other hand, a niche technique also needs some global exploring information to guide every individual to explore the whole search space.

Exploring Information Exchange

If the population size is adequately large, then the exploring information exchange is not necessary.

Good communication of exploring information across the whole population is very important for difficult problems.

The Unique Advantages of SN

The traditional parallel niche techniques seem to exchange exploring information inadequately among the population

In contrast, the SN technique has a good mechanism for exchanging exploring information: one sub-population cannot explore a space already searched by another sub-population

From this point, it can be seen that the SN technique has its unique advantages.

3.1 Novel Adaptive SN Technique

The basic idea of the adaptive SN technique:

  The technique uses multiple sub-swarms to detect optimal solutions sequentially

  To encourage a new sub-swarm to fly to a new place in the search space, the algorithm modifies the raw fitness function

  The hill valley function is used to determine how to change the fitness of a particle in the sub-swarm currently running

  A sequential dynamic niche radius update algorithm is used to decrease extra function evaluations.

3.2 Sequential Dynamic Niche Radius Algorithm

1. Create and initialize a sub-swarm PSO algorithm with a larger niche radius

2. Train the sub-swarm until convergence

3. Repeat

4.   Create and initialize a new sub-swarm with a larger niche radius

5.   For every particle in this new sub-swarm

6.     If the distance between the particle and the best particle of a sub-swarm launched before is smaller than that niche radius

7.       Use the hill valley function to judge whether or not they belong to one niche

8.       If the two particles do not belong to one niche, update that niche radius; otherwise modify the fitness of the particle

9.   Train the sub-swarm

10. Until all sub-swarms converge
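The inner check of the algorithm above (distance test, hill valley test, then either a radius update or a fitness modification) can be sketched as below. The slides do not state the exact update or penalty rules, so two assumptions are made and labeled in the code: the radius is shrunk to the current distance, and the modified fitness is a fixed penalty value. All names are hypothetical.

```python
import math

def valley_between(p, q, f, samples=(0.25, 0.5, 0.75)):
    """Hill valley test: True if some interior sample is lower than both end points."""
    floor = min(f(p), f(q))
    return any(f([a + t * (b - a) for a, b in zip(p, q)]) < floor for t in samples)

def adjusted_fitness(x, f, found, penalty=float("-inf")):
    """Fitness of particle x given earlier sub-swarms (maximization).

    found is a list of [best_position, niche_radius] entries, updated in place.
    """
    for entry in found:
        best, radius = entry
        d = math.dist(x, best)
        if d < radius:                 # the hill valley test runs only inside the radius
            if valley_between(x, best, f):
                entry[1] = d           # ASSUMED rule: shrink the radius to this distance
            else:
                return penalty         # ASSUMED rule: same niche, so discourage the particle
    return f(x)
```

Because the hill valley function is only called for particles inside a current niche radius, and each radius shrinks as soon as it is found to be too large, the number of extra evaluations falls as the radii settle.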

3.3 Experimental results

The final niche radius for Shekel’s Foxholes function, where 25 sub-swarms are used

The adaptive SN PSO algorithm can find the optima of some multimodal test functions without any prior knowledge

As Mahfoud pointed out, the running speed is slow

We therefore implement a parallel niche PSO algorithm

Neurocomputing (accepted)

4. Multi-Sub-Swarm PSO Algorithm

Multiple sub-swarms run simultaneously

The different sub-swarms compete with each other; the winner of a competition continues to explore its original region, while the loser is obliged to explore another region.

To prevent two sub-swarms from detecting the same optimum, a particle intruding into another niche is punished.

4.1 MSSPSO Algorithm

1. Create and initialize N sub-swarms of the PSO algorithm with a larger niche radius

2. If the best particles of two different sub-swarms are located in the same niche, compare their fitness; the sub-swarm with the smaller fitness is re-initialized

3. For every particle that falls in the same niche as the memorized position X of another sub-swarm, decrease the fitness of the particle

4. Train every sub-swarm and update the best particle of each sub-swarm

5. Update and compensate the new niche radius
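Step 2 above, the competition between sub-swarms, can be sketched as follows. The niche test is passed in as a function (for example a hill valley test), and the names `find_losers` and `same_niche` are hypothetical; how a losing sub-swarm is re-initialized is left out.

```python
def find_losers(bests, f, same_niche):
    """Return the indices of sub-swarms that lose a competition (maximization).

    bests[i] is the best position of sub-swarm i; same_niche(p, q) returns True
    when two positions are judged to lie in one niche.
    """
    losers = set()
    for i in range(len(bests)):
        for j in range(i + 1, len(bests)):
            if i in losers or j in losers:
                continue              # a sub-swarm that already lost does not compete again
            if same_niche(bests[i], bests[j]):
                # The sub-swarm whose best particle has the smaller fitness loses.
                losers.add(i if f(bests[i]) < f(bests[j]) else j)
    return losers
```

Only the best particle of each sub-swarm takes part in this comparison, so the niche test is called at most once per pair of sub-swarms rather than once per pair of particles.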

4.2 Experimental Functions

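$$F_1(x) = \sin^6\left(5\pi\left(x^{3/4} - 0.05\right)\right) \tag{7}$$

$$F_2(x) = e^{-2\log(2)\left(\frac{x - 0.1}{0.8}\right)^2} \sin^6\left(5\pi\left(x^{3/4} - 0.05\right)\right) \tag{8}$$

$$F_3(x, y) = 500 - \frac{1}{0.002 + \sum_{i=0}^{24} \dfrac{1}{1 + i + (x - a(i))^6 + (y - b(i))^6}} \tag{9}$$

where $a(i) = 16[(i \bmod 5) - 2]$ and $b(i) = 16(\lfloor i/5 \rfloor - 2)$

$$F_4(x, y) = \sum_{i=1}^{5} i\cos[(i+1)x + i] \cdot \sum_{j=1}^{5} j\cos[(j+1)y + j] \tag{10}$$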
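The four test functions (7) to (10) can be written down directly. The sketch below follows the formulas as reconstructed from the slides and assumes x >= 0 for F1 and F2 (so that the fractional power is real); the function names are illustrative.

```python
import math

def f1(x):
    """Eq. (7): unevenly spaced peaks of equal height."""
    return math.sin(5 * math.pi * (x ** 0.75 - 0.05)) ** 6

def f2(x):
    """Eq. (8): the peaks of F1 under a Gaussian-shaped envelope."""
    return math.exp(-2 * math.log(2) * ((x - 0.1) / 0.8) ** 2) * f1(x)

def f3(x, y):
    """Eq. (9): Shekel's Foxholes, 25 peaks on a 5 x 5 grid with spacing 16."""
    total = sum(
        1.0 / (1 + i + (x - 16 * (i % 5 - 2)) ** 6 + (y - 16 * (i // 5 - 2)) ** 6)
        for i in range(25)
    )
    return 500 - 1.0 / (0.002 + total)

def f4(x, y):
    """Eq. (10): product of two identical one-dimensional cosine sums."""
    sx = sum(i * math.cos((i + 1) * x + i) for i in range(1, 6))
    sy = sum(j * math.cos((j + 1) * y + j) for j in range(1, 6))
    return sx * sy
```

For F3, the peak centered at (-32, -32) corresponds to i = 0 and is the highest of the 25, with a value close to 499.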

4.3 Performance Criteria

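Maximum Peak Ratio: the sum of the fitness values of the local optima identified by the niche technique divided by the sum of the fitness values of the actual optima of a multimodal problem (Miller and Shaw, 1996)

$$Peak_{ratio} = \frac{\sum_{i=1}^{p} f_i}{\sum_{i=1}^{N} f_i}$$

where the numerator runs over the p optima identified and the denominator over the N actual optima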

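Chi-Square-Like Performance Criteria: the chi-square-like criterion measures the deviation between the population distribution and an ideal proportionally populated distribution (Deb and Goldberg, 1989)

$$\text{chi-square-like deviation} = \sqrt{\sum_{i=1}^{q+1} \left( \frac{X_i - \mu_i}{\sigma_i} \right)^2}$$

$$\mu_i = N \frac{f_i}{\sum_{k=1}^{q} f_k}, \qquad \sigma_i^2 = \mu_i \left( 1 - \frac{\mu_i}{N} \right)$$

$$\mu_{q+1} = 0, \qquad \sigma_{q+1}^2 = \sum_{i=1}^{q} \sigma_i^2$$

Here q is the number of peaks, N is the population size, and X_i is the number of individuals in niche i (X_{q+1} counts the individuals that belong to no niche)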
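The maximum peak ratio and the chi-square-like deviation defined above can both be computed in a few lines. The sketch below is illustrative: the function names and argument layout are assumptions, and the chi-square-like function needs at least two peaks, since a single peak gives zero variance.

```python
import math

def maximum_peak_ratio(found_fitness, actual_fitness):
    """Sum of fitness at the identified optima over sum of fitness at the actual optima."""
    return sum(found_fitness) / sum(actual_fitness)

def chi_square_like(counts, peak_fitness, n_outside=0):
    """Chi-square-like deviation of a population from the ideal distribution.

    counts[i] is the number of individuals in niche i, peak_fitness[i] the height
    of peak i, and n_outside the number of individuals that belong to no niche.
    """
    n = sum(counts) + n_outside
    total_f = sum(peak_fitness)
    mus = [n * f / total_f for f in peak_fitness]        # ideal niche sizes
    variances = [mu * (1 - mu / n) for mu in mus]
    dev = sum((x - mu) ** 2 / var for x, mu, var in zip(counts, mus, variances))
    # Class q+1 (outside every niche) has mean 0 and the summed variance.
    dev += n_outside ** 2 / sum(variances)
    return math.sqrt(dev)
```

A peak ratio of 1 means every actual optimum was located at its full height, and a deviation of 0 means the individuals are spread over the peaks exactly in proportion to peak height.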

Number of Fitness Function Evaluations:

  In most real-world applications, evaluating the fitness function is computationally very expensive

  The niche technique must keep the number of function evaluations within a certain range

The adaptive ability:

  Most niche techniques are very sensitive to some parameters; an inappropriate parameter may degrade the performance of a niche technique

  So, adaptive ability is another important measure for a niche technique.

Maximum Peak Ratio

Test function    Maximum Peak Ratio
F1               1
F2               1
F3               0.99916
F4               0.9689

Chi-square-like deviation

IEEE TEC (will submit)

Number of Function Evaluations (NFE)

Test function    Population size    No. of sub-swarms    Ordinary NFE    NFE
F1               10                 5                    11120           10000
F2               10                 5                    11958           10000
F3               20                 25                   116362          100000
F4               40                 18                   150495          144000

5. Conclusions

The proposed method imitates a natural ecosystem well, and the different sub-populations can compete with each other

The proposed method constructs a dynamic niche radius algorithm (DNRA), which can greatly reduce the extra function evaluations

The proposed method integrates the sequential technique with the parallel one

The proposed method has good performance

The proposed method has strong adaptive ability.

Future Work:

How to choose a more efficient and effective niche identification technique

A constructive method must be built to cover niches of any shape

How to apply the proposed method to hard real-world multimodal problems.

The End
