optimizing with genetic algorithms

62
Optimizing with Genetic Algorithms by Benjamin J. Lynch Feb 23, 2006 T C A G T T G C G A C T G A C T

Upload: buikiet

Post on 31-Dec-2016

219 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: Optimizing with Genetic Algorithms

Optimizing with GeneticAlgorithms

byBenjamin J. Lynch

Feb 23, 2006

TCAGTTGC

GACTGACT

Page 2: Optimizing with Genetic Algorithms

2

Outline• What are genetic algorithms?

– Biological origins– Shortcomings of Newton-type optimizers

• How do we apply genetic algorithms?– Options to include

• Encoding• Selection• Recombination• Mutation

– Strategies• What programs can we use?

– How do we parallelize?• MPI, fork/wait

Page 3: Optimizing with Genetic Algorithms

3

What are genetic algorithms?(GAs)

• A class of stochastic search strategiesmodeled after evolutionary mechanisms

• a popular strategy to optimize non-linearsystems with a large number of variables

Page 4: Optimizing with Genetic Algorithms

4

What are genetic algorithms?(GAs)

• A major difference between natural GAsand our GAs is that we do not need tofollow the same laws observed in nature.– Although modeled after natural processes, we

can design our own encoding of information,our own mutations, and our own selectioncriteria.

Page 5: Optimizing with Genetic Algorithms

5

Definitions for today

• Parameter – a variable in the system of interest• Gene – encoded form of a parameter being

optimized• Chromosome – the complete set of genes

(parameters) which uniquely describe anindividual

• Locus – the position of a piece of data within achromosome

• Fitness – A value we are trying to maximize

Page 6: Optimizing with Genetic Algorithms

6

Why would we use genetic algorithms?Isn’t there a simple solution we learned in

Calculus?

• Newton-Raphson and it’s many relativesand variants are based on the use of localinformation.

• The function value and the derivatives withrespect to the parameters optimized areused to take a step in an appropriatedirection towards a local maximum orminimum.

Page 7: Optimizing with Genetic Algorithms

7

Newton-Raphson

• Perfect for aparabola!– H is the Hessian (2nd

derivative with respectto all parametersoptimized)

– g is the gradient– x is the step to

minimize the gradient

gxH !=

x

Page 8: Optimizing with Genetic Algorithms

8

Where Newton-Raphson Fails

• A local method will only find localextrema.

gxH !=

If we start our search hereWe’ll end up here

Page 9: Optimizing with Genetic Algorithms

9

How do we use GAs to optimize theparameters we’re interested in?

• Choose parameters to optimize• Determine chromosomal representation of

parameters• Generate initial population of individuals

(chromosomes)• Evaluate fitness of each individual to

reproduce• Allow selection rules and random behavior

to select next population

Page 10: Optimizing with Genetic Algorithms

10

Genetic Algorithm

Create new population

Select the parents

based on fitness

Evaluate the fitness

of each individual

Create Initial Population

Evaluation

Selection

Recombination

Enter

Page 11: Optimizing with Genetic Algorithms

11

Setting up the problem

• This is the most difficult step!

• Choose the parameters you want to optimize

Setup

Page 12: Optimizing with Genetic Algorithms

12

Airfoil Example:• Constant wing area• Variable camber• Variable chord at root• Variable chord at tip• Span (function of chords

and wing area)

Setup

Determine the Parameters

Page 13: Optimizing with Genetic Algorithms

13

Density FunctionalExample:

• Choose parameters tobe all the variables inthe gradient-correctedexchange terms.

drr

A

dxxax

dxebxaxE

x

f

fcxGGA

X

3/4

1

223/1

)(

sinh614

3

2

32

!!

"#

$%&&&&&

'

(

)))))

*

+

,+

,,+&

'

()*

+,=

,

,

Setup

Page 14: Optimizing with Genetic Algorithms

14

• For every problem, there is something youwant to maximize or minimize.

• The standard convention is to maximize afunction with a GA.

Setup

Evaluate Your Fitness

Page 15: Optimizing with Genetic Algorithms

15

• We need to evaluate fitness of eachindividual.

• The fitness will be used to bias the nextgeneration towards better genes.

Setup

Fitness

Page 16: Optimizing with Genetic Algorithms

16

• We must first define what fitness is! Youmust come up with a single metric that willbe used to compare 2 possible solutionsand decide which is better.

Setup

Fitness

Page 17: Optimizing with Genetic Algorithms

17

• For an airfoil, this might be a function of drag and lift

)()( 21 PDragwPLiftwFitness !=

Setup

• It may depend on a set of simulations at differentspeeds, different angles of attack, etc.

!! "=ssimulation

i

i

ssimulation

i

i PDragwPLiftwFitness )()(

Page 18: Optimizing with Genetic Algorithms

18

• For an empirical density functional, the fitnessmight be a weighted RMS deviation fromexperimental values.

Setup

! ""=dataset

i

Exp

i

Calc

i EPEFitness 2))((

Note: we toss in a “–” because we alwaysWant to maximize the fitness.

Page 19: Optimizing with Genetic Algorithms

19

• Check that your problem is well-suited foroptimization with a GA.

• Your fitness function will need to be evaluatedthousands of times. Make sure you have theresources.

• If a GA is too expensive, you still might be ableto simplify your problem and use a GA to findregions in the parameter space of interest.

Setup

Page 20: Optimizing with Genetic Algorithms

20

• Determine chromosomal representation ofparameters

• Parameters can be encoded in binary, base-4 base-10, etc.

encoding

Page 21: Optimizing with Genetic Algorithms

21

• After you decide how to encode the parameters,you must decide on the domain of yourparameters. This is entirely dependent on yourproblem. You will want to allow your parametersto be anything physically reasonable (if you’resolving a physical problem)

• Create an initial population with randomizedchromosomes

encoding

Page 22: Optimizing with Genetic Algorithms

22

Create Initial Population• Population size is chosen (1-10 individuals/parameter

optimized for most applications)• Parameters to be optimized are encoded.

– Binary, Base 10– Let’s say we have 2 parameters with initial values of 32

and 13. In binary and base 10 they would look like:

100000001101

3213

Chromosome of theindividual

encoding

Page 23: Optimizing with Genetic Algorithms

23

How binary genes translate intoparameters

101011101001010111010000100110

698 349 38

minminmax10)(

12

38PPPP +!

!=

You need to understand the system you areoptimizing inorder to determinethe proper parameter range

encoding

Page 24: Optimizing with Genetic Algorithms

24

Create Initial Population

• After we choose a population size andencoding method, we must choose a maximumrange for each parameter. Ranges forparameters should be determined based onwhat would be physically reasonable (if you’reinterested in solving a physical problem).

encoding

Page 25: Optimizing with Genetic Algorithms

25

• Generate initial population of individuals(chromosomes)

• The initial population can be generated byrandomizing the genes for each chromosome ofthe initial population

• You can set the parameters for a few individualsif you want. This might speed up the process.

encoding

Page 26: Optimizing with Genetic Algorithms

26

1.Initialization of population2. Evaluation of fitness3. Selection4. Recombination5. Repeat 2-4

Review Steps in a GA

Page 27: Optimizing with Genetic Algorithms

27

Evaluation

• Assign a fitness value to each individualbased on the parameters derived from itschromosome

Evaluate Fitness

Page 28: Optimizing with Genetic Algorithms

28

• The fitness function is somehow based onthe genes of the individual and shouldreflect how good a given set of parametersis.– Lift-drag , low drag airfoil– Ability of a density functional to better predict

chemical phenomena– Swimming speed of a robotic fish– Power output of a chemical laser

Evaluate Fitness

Page 29: Optimizing with Genetic Algorithms

29

• Evaluation of the fitness is the computationally-intensive portion of a GA optimization

• Each chromosome holds the information thatuniquely describes an individual.

• Each chromosome/(parameters set)/individual canbe evaluated separate from the other individuals.– GA optimizations are typically described as

embarrassingly parallelizable• The evaluation of the chromosomes reduces down

to a fitness value for each individual which willbe used in the next step

Evaluate Fitness

Page 30: Optimizing with Genetic Algorithms

30

Selection

• Allow selection rules and random behaviorto select next population

Selection

Page 31: Optimizing with Genetic Algorithms

31

• The parents must be selected based on theirfitness.

• The individuals with a higher fitness must have ahigher probability of having offspring.

• There are several methods for selection.

Selection

Page 32: Optimizing with Genetic Algorithms

32

Roulette WheelSelection

• Probability of parenthoodis proportional to fitness.

• The wheel is spun untiltwo parents are selected.

• The two parents createone offspring.

• The process is repeated tocreate a new populationfor the next generation.

Fit(#1)

Fit(#2)

Fit(#3)

Fit(#4)

Fit(#5)

Selection

Page 33: Optimizing with Genetic Algorithms

33

• Roulette wheel selection has problems if thefitness changes by orders of magnitude.

• If two individuals have a much higherfitness, they could be the parents forevery child in the next generation.

Fit(#1)

Fit(#2)

Fit(#3)

Fit(#4)

Fit(#5)

Selection

Page 34: Optimizing with Genetic Algorithms

34

Another Reason Not to Use theRoulette Wheel

• If the fitness value for allindividuals is very close,the parents will be chosenwith equal probability,and the function willcease to optimize.

• Roulette selection is verysensitive to the form ofthe fitness function andgenerally requiresmodifications to work atall.

Fit(#1)

Fit(#2)

Fit(#3)

Fit(#4)

Fit(#5)

Selection

Page 35: Optimizing with Genetic Algorithms

35

Rank Selection

• All individuals inthe population areranked accordingto fitness

• Each individual isassigned a weightinverselyproportional to therank (or othersimilar scheme).

rank

1

( ) 7.11

rank

Fit(#1)

Fit(#2)

Fit(#3)

Fit(#4)

Fit(#5)

Fit(#1)

Fit(#2)

Fit(#3)

Fit(#4)

Fit(#5)

Selection

Page 36: Optimizing with Genetic Algorithms

36

Tournament Selection• 4 individuals (A,B,C,D) are randomly selected from

the population. Two are eliminated and twobecome the parents of a child in the next generation

A

B

C

D

A

DFitness(D) > Fitness(C)

Fitness(A) > Fitness(B)

Selection

Page 37: Optimizing with Genetic Algorithms

37

Tournament Selection

A

B

C

D

A

DFitness(D) > Fitness(C)

Fitness(A) > Fitness(B)

Selection

• Selection of parents continues until a newpopulation is completed.

• Individuals might be the parent to several children,or no children.

Page 38: Optimizing with Genetic Algorithms

38

Similarities Between Tournamentand Rank Selection

• Tournamentselection is verysimilar to rankselection in thelimit of a largepopulation whenwe assign aweight of 1/rank.

16

9

16

1

16

6

Both parents wereabove the median

One parent wasabove the median

Neither parent wasabove the median

Fraction of population

Selection

Fitness

Page 39: Optimizing with Genetic Algorithms

39

Recombination

• Using the chromosomes of the parents, wecreate the chromosome of the child

Recombination

Page 40: Optimizing with Genetic Algorithms

40

Recombination with crossover points• We can choose a

number ofcrossover points

ABCDEFGH

abcdefgh

ABCDefgh

Param. 1(eyes)

Param. 2(nose)

Parent 1Parent 2child

In this case, the parameters remained intact, and thechild inherited the same eyesas parent1 and the same nose as parent2.

Recombination

Page 41: Optimizing with Genetic Algorithms

41

crossover point occurs within a parameter

ABCDEFGH

abcdefgh

ABCDEFgh

Param. 1(eyes)

Param. 2(nose)

parent1parent2child

In this case the child will have a new nose that is not the same as parent1 or parent2.

Recombination

Page 42: Optimizing with Genetic Algorithms

42

representation of parameters becomes important.

11110000

10101111

11110011

Param. 1

Param. 2

parent1parent2child

15

00

10

15

03

15

Not possible if we used base 10encoding

Recombination

Page 43: Optimizing with Genetic Algorithms

43

Recombination – Uniform Crossover

• Uniform crossover– No limit to crossover points

ABCDEFGH

abcdefgh

aBcDefgH

Allows more variation in offspringand decreases need for randommutations

Recombination

Page 44: Optimizing with Genetic Algorithms

44

• Mutations canbe applied afterrecombination

ABCDEFGH

abcdefgh

ABCDefgh

Param. 1

Param. 2

parent1parent2child

A random mutationhas been applied tothe child

ABCDefgh

Randommutations

Page 45: Optimizing with Genetic Algorithms

45

• Creep mutations are a specialtype of random mutation.

• Creep mutations cause aparameter to change by a smallamount, rather than randomizingany one element.

11100000

11111000

11101000

Param. 1

Param. 2

11011000

11111000

OR

Possible creepmutations for param. 1

Randommutations

Page 46: Optimizing with Genetic Algorithms

46

• The desirable frequency of mutations dependsgreatly on the other GA options chosen.

ABCDEFGH

abcdefgh

ABCDefgh

Param. 1

Param. 2

ABCDefgH

Randommutations

Page 47: Optimizing with Genetic Algorithms

47

Other Operators for Recombination• Other rearrangements of

information are possible

• Swap locus 04285903

24085903

• Swap entire genes

04285903

59030428

Randommutations

Page 48: Optimizing with Genetic Algorithms

48

Elitism• Elitism refers to the safeguarding of the

chromosome of the most fit individual in a givengeneration.

• If elitism is used, only N-1 individuals are producedby recombining the information from parents. Thelast individual is a copy of the most fit individualfrom the previous generation.

• This ensures that the best chromosome is never lostin the optimization process due to random events.

Page 49: Optimizing with Genetic Algorithms

49

More GA Options• Separate “islands” with populations that interact

infrequently.• Use “male” & “female” populations• “Alpha male” selection• Use 3 parents for each offspring• Use 3 sexes• Recessive genes• …. and many more. Most of which are only useful

for very specific types of fitness functions.

Page 50: Optimizing with Genetic Algorithms

50

Micro-GA• Small population size of 1 individual per

parameter optimized.• No random mutations.• When the genetic variance is below a

certain threshold (~5%), the most fitindividual goes on, while the chromosomesof all other individuals are randomized

• Cycle this process

Page 51: Optimizing with Genetic Algorithms

51

Overview of a micro-GA

Is the variance in genes

< than 5%

Randomize the chromosomes

for all but the most fit individual

Recombine parents to

create new population

Select the parents

based on fitness

Evaluate the fitness

of each individual

Create initial population

No

Yes

Page 52: Optimizing with Genetic Algorithms

52

Micro-GA Pros and Cons• Tends to be very efficient in terms of total

CPU time.• Robust algorithm with no need to tinker

with random mutation parameters

• It uses a smaller population, andtherefore has less potential forparallelization.

Page 53: Optimizing with Genetic Algorithms

53Create new population

Select the parents

based on fitness

Evaluate the fitness

of each individual

Create Initial Population

Evaluation

Selection

Recombination

ReviewEvaluation of the fitness isthe time-consuming portion

Page 54: Optimizing with Genetic Algorithms

54

How do you know if you’vefound the absolute maximum?

• You don’t• Even GAs are not a “black-box” optimizer

for any function• You can gain confidence by running

several optimizations with different startingparameters, different algorithm options,and different parameter ranges.

Page 55: Optimizing with Genetic Algorithms

55

Which GA options are good picks formy system?

• Start with robust algorithms– Micro-GA– Binary encoding– Tournament selection– Uniform crossover at gene boundaries

• If you are unsatisfied with the progress you canchange a couple options– Allow crossover everywhere to introduce more variety

between each generation.• Change to base-10 encoding as well, so it isn’t too random

– Use rank selection with a weight of 1/rank3 to heavilyfavor the best individuals.

Page 56: Optimizing with Genetic Algorithms

56

How might one parallelize a GA?• The GA calculations are minimal. An

optimization might require 1000 generations,and each generation is dominated by the costto evaluate the fitness.

• Standard serial GA programs can handle theGA routines with ~1 ms of cpu time, whileyour fitness routine can parallelize the fitnessevaluations.

Page 57: Optimizing with Genetic Algorithms

57

How might one parallelize a GA?

• Evaluating the fitness of an individuals isindependent of the rest of the population.– You can easily run your GA on N processors

where N is your population size.

• Each individual can often be run in parallelas well– This depends on the program you are using to

evaluate the fitness.

Page 58: Optimizing with Genetic Algorithms

58

Method to Parallelize a GA

• MPI is always a good choice if you’realready familiar the language. This optionwould also enable GA algorithms with“islands” on heterogeneous clusters.

• Fork/wait would be very easy– Can be done in several languages

Page 59: Optimizing with Genetic Algorithms

59

#!/usr/bin/perluse Parallel::ForkManager;$max_process=4;

$pm = new Parallel::ForkManager($max_process);@all_chromosomes=(@ARGV)

# enter main loop, no more than $max_process# children will run at a timeforeach $chromosome (@all_chromosomes) { my $pid = $pm->start and next;# prepare an appropriate input file with# this set of parameters &writeinput($chromosome);# run the program to evaluate the fitness &evalfitness($chromosome); $pm->finish;}

# make sure that all the processes are done$pm->wait_all_children;

#parse output and return to our GA&parse_output;

Fork method toparallelize with Perlon a shared-memorymachine

Page 60: Optimizing with Genetic Algorithms

60

GA Programs Available

• David Carrol’s GA Driver– http://cuaerospace.com/carroll/ga.html

• GAUL– http://gaul.sourceforge.net/

• GALOPPS (already parallelized)– http://garage.cps.msu.edu/software/galopps/index.html

• Simple Generalized GA– http://www.skamphausen.de/software/AI/ga.html

Page 61: Optimizing with Genetic Algorithms

61

Other GA Resources Available

• Me– [email protected]– 612-624-4122

Page 62: Optimizing with Genetic Algorithms

62

Questions?

[email protected] [email protected] 6-0802