parallel genetic algorithms and the science of asteroseismology

Post on 25-Feb-2016


PARALLEL GENETIC ALGORITHMS AND THE SCIENCE OF ASTEROSEISMOLOGY

A Review of the Doctoral Dissertation Research of Dr. Travis Metcalfe

Outline

Introduction
The Science of Asteroseismology
The Genetic Algorithm
Parallel Computing
Conclusion

Introduction

Astronomers observe the universe and gather information about it. They then fit this information into mathematical models. The process of “fitting” involves adjusting the many parameters of the model. When they have a good fit, they use the parameter settings to tell them something about the object or phenomenon they are studying. The author uses a parallel genetic algorithm to solve this problem of optimization.

The Goal of the Research

To further the understanding of the composition and characteristics of white dwarfs

More generally, since white dwarfs are the endpoint for all but the most massive stars, this research can lead to a better understanding of stellar evolution


Traditional Technique

Make an initial “guess” for parameter values

Use some iterative technique to improve upon the initial guesses.

Adjustable Input Parameters

Mass
Temperature
H and He layer masses
Convective efficiency
Core composition

Problem with this technique

Results often depend on the initial guess

The initial guess is inherently subjective, often the result of intuition or past experience
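The dependence on the starting guess can be seen in a small sketch. The one-parameter misfit function below is hypothetical (it is not the white dwarf model) and plain gradient descent stands in for "some iterative technique"; the point is only that two different guesses settle into two different minima.

```python
def misfit(x):
    # Hypothetical misfit surface with two minima (not the stellar model).
    return x**4 - 3 * x**2 + x


def gradient(x):
    # Analytic derivative of misfit().
    return 4 * x**3 - 6 * x + 1


def descend(x, step=0.01, iterations=5000):
    # Plain gradient descent: repeatedly move downhill from the starting guess.
    for _ in range(iterations):
        x -= step * gradient(x)
    return x


left = descend(-2.0)   # settles in the deeper minimum, at negative x
right = descend(2.0)   # settles in the shallower minimum, at positive x
print(left, misfit(left))
print(right, misfit(right))
```

Starting from x = 2.0 the search never finds the better solution at negative x, which is the weakness a population-based search tries to avoid.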

The Genetic Algorithm

A genetic algorithm provides a more systematic approach to optimizing the results

The genetic algorithm used was PIKAIA

PIKAIA is a general purpose “function optimization” genetic algorithm

Public domain software

Fortran-77

The Science of Asteroseismology

White dwarfs that show a regular variation in light intensity are known as pulsating white dwarfs

Using photometric techniques, this variation in intensity can be measured very accurately with instruments such as the Whole Earth Telescope (WET)

The pulsation is the result of seismic activity within the white dwarf

Just as seismological information can be used to study the internal nature of the Earth, seismological data, as expressed in varying stellar luminosity, can be used to determine the characteristics of these pulsating white dwarfs.

Observed Light Curve for the White Dwarf GD 358.

The Genetic Algorithm

Initial Conditions

Population size: 1000 (in later work this was reduced to 128).

No rationale was given for how the initial population size was chosen, or why it was changed.

For each member of the initial population, parameter values are set randomly

Duration

The algorithm ran until the difference between the average fitness and the best fitness in the population was less than 1%.

In later work, the author used a fixed 200 generations.
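The stopping rule can be sketched as follows. The slides do not say how the 1% difference is measured; this sketch assumes it is relative to the best fitness and that fitness values are positive. The function name `has_converged` is made up for illustration.

```python
def has_converged(fitnesses, tolerance=0.01):
    # Assumption: the "1%" is the gap between best and mean fitness,
    # measured relative to the best fitness (fitness values must be positive).
    best = max(fitnesses)
    mean = sum(fitnesses) / len(fitnesses)
    return (best - mean) / best < tolerance


print(has_converged([1.0, 0.995, 0.999]))  # population has nearly collapsed onto the best member
print(has_converged([1.0, 0.5, 0.2]))      # population is still spread out
```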

Fitness Measurement

The model is then run using these initial values

Fitness is based on the root-mean-square differences between the observed and calculated pulsation periods
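A minimal sketch of an rms-based fitness follows. The period values are invented for illustration, and the mapping from rms residual to fitness (`1 / (1 + rms)`) is an assumption; the slides say only that fitness is based on the rms differences.

```python
import math


def rms_residual(observed, calculated):
    # Root-mean-square difference between matching observed and model periods.
    squared = [(o - c) ** 2 for o, c in zip(observed, calculated)]
    return math.sqrt(sum(squared) / len(squared))


def fitness(observed, calculated):
    # Assumption: smaller residuals should give larger fitness, so invert the rms.
    return 1.0 / (1.0 + rms_residual(observed, calculated))


observed_periods = [423.3, 464.2, 501.6]    # hypothetical values, in seconds
calculated_periods = [424.0, 463.5, 502.1]  # hypothetical model output
print(rms_residual(observed_periods, calculated_periods))
print(fitness(observed_periods, calculated_periods))
```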

Fitness Measurement

The fitness value is converted to a survival probability by normalizing with respect to the most fit member

The next generation is chosen randomly. This random selection is weighted by each member’s survival probability
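The normalization and weighted draw described above might look like this. It is a sketch of generic fitness-proportionate selection, not PIKAIA's own routine; the function names are made up for illustration.

```python
import random


def survival_probabilities(fitnesses):
    # Normalize by the most fit member, so the best member has probability 1.0.
    best = max(fitnesses)
    return [f / best for f in fitnesses]


def select_next_generation(population, fitnesses, rng):
    # Weighted random draw with replacement; fitter members are picked more often.
    weights = survival_probabilities(fitnesses)
    return rng.choices(population, weights=weights, k=len(population))


rng = random.Random(42)  # fixed seed so the draw is repeatable
print(survival_probabilities([2.0, 4.0, 1.0]))
print(select_next_generation(["a", "b", "c"], [2.0, 4.0, 1.0], rng))
```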

Crossover

Numerical encoding

The parameter values of each member are concatenated into one long string of digits

A single-point crossover technique is used. The position along the string is picked randomly

Mutation

Mutation is achieved by randomly selecting a digit in the string and changing it to a new, randomly chosen value

Illustration

Consider two members, each with two parameters.

M1 has X=2.573 and Y=4.457.

M2 has parameter values X=3.547 and Y=2.332.

After encoding, M1=25734457 and M2=35472332

Illustration

The crossover point is randomly chosen, and the string segments are swapped:

M1: 25734|457 -> 25734332
M2: 35472|332 -> 35472457

Illustration

Mutating M1 involves picking a random spot along the string and changing that value:

M1: 257|3|4332 -> 25784332

Illustration*

The strings would then be parsed back into parameter values. For M1, this would be:

M1: X=2.578, Y=4.332

* Modified from [1]
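The worked illustration can be reproduced in a few lines. This is a sketch with hypothetical helper names, assuming each parameter is stored as four digits (x.xxx). The crossover point and the mutated digit are fixed here to match the slides; in the genetic algorithm both are chosen at random.

```python
def encode(params, digits=4):
    # 2.573 -> "2573"; the parameters are concatenated into one string.
    return "".join(f"{round(p * 1000):0{digits}d}" for p in params)


def decode(string, digits=4):
    # Split the string back into four-digit groups and restore the decimal point.
    return [int(string[i:i + digits]) / 1000 for i in range(0, len(string), digits)]


def crossover(a, b, point):
    # Single-point crossover: swap everything after the crossover point.
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(string, position, new_digit):
    # Replace the digit at one position with a new digit.
    return string[:position] + new_digit + string[position + 1:]


m1 = encode([2.573, 4.457])    # "25734457"
m2 = encode([3.547, 2.332])    # "35472332"
m1, m2 = crossover(m1, m2, 5)  # "25734332", "35472457"
m1 = mutate(m1, 3, "8")        # "25784332"
print(decode(m1))              # [2.578, 4.332]
```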

Crossover and Mutation Rate

The crossover rate: 65%

The mutation rate: 0.3%.

In later work, the author increased the crossover rate to 85% and varied the mutation rate from 0.1% to 16.6%, depending on the variation between the mean fitness value and the best fitness value

Elitism

The most fit solution was passed unaltered to the next generation
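Elitism can be sketched as a final step after selection, crossover and mutation. The slides do not say which member of the new generation makes room for the elite one; overwriting the first slot is an assumption, and `apply_elitism` is a made-up name.

```python
def apply_elitism(old_population, old_fitnesses, new_population):
    # Find the most fit member of the previous generation.
    best_index = max(range(len(old_population)), key=lambda i: old_fitnesses[i])
    result = list(new_population)
    # Assumption: the elite member overwrites the first slot of the new generation.
    result[0] = old_population[best_index]
    return result


print(apply_elitism(["a", "b", "c"], [0.1, 0.9, 0.3], ["x", "y", "z"]))
```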

Rationale

The idea behind the relatively low crossover and mutation rate is to prevent removing promising solutions from each generation too rapidly

Repetition

The paper states: “Repeating this procedure many times with different random number seeds helps to ensure that the minimum found is truly global”

It does not say how many repetitions “many times” means, though

Repetition

In a later paper, the author uses 5 repetitions

This result was obtained in the following way…

Known parameter values were put into the model, and pulsation periods were generated.

The genetic algorithm attempted to recover the original parameters from the output of the model

This was done 20 times, and the results were as follows…

Results (second paper)

First Order Solution…

Run   Teff (K)   M/Msun   log(MHe/M*)   rms    Generation found
 1    26,800     0.560    -5.70         0.67   245
 2    25,000     0.600    -5.96         0.00   159
 3    24,800     0.605    -5.96         0.52   145
 4    25,000     0.600    -5.96         0.00    68
 5    22,500     0.660    -6.33         1.11    97
 6    25,000     0.600    -5.96         0.00   142
 7    25,000     0.600    -5.96         0.00    97
 8    25,000     0.600    -5.96         0.00   194
 9    25,200     0.595    -5.91         0.42   116
10    26,100     0.575    -5.80         0.54    87
11    23,900     0.625    -6.12         0.79    79
12    25,000     0.600    -5.96         0.00   165
13    26,100     0.575    -5.80         0.54    92
14    25,000     0.600    -5.96         0.00    95
15    24,800     0.605    -5.96         0.52    42
16    26,600     0.565    -5.70         0.72   246
17    24,800     0.605    -5.96         0.52   180
18    25,000     0.600    -5.96         0.00    62
19    24,100     0.620    -6.07         0.76   228
20    25,000     0.600    -5.96         0.00   167

The genetic algorithm found the exact result in 9 of 20 runs, and was close enough in four other runs for the correct result to be determined by the addition of some other iterative technique, for a total success rate of 13/20 = 65%.

If the GA was rerun and the best result selected, the success rate increased to 88%

After 5 runs, the success rate was over 99%
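These percentages are what one would expect if each run succeeds independently with probability 0.65. The independence is an assumption, though it matches the quoted figures: the chance that at least one of n runs succeeds is 1 - (1 - 0.65)^n.

```python
def success_after(runs, single_run_rate=0.65):
    # Probability that at least one of `runs` independent runs succeeds.
    return 1.0 - (1.0 - single_run_rate) ** runs


for n in (1, 2, 5):
    print(n, round(success_after(n), 4))  # 0.65, 0.8775, 0.9947
```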

Because no exact answer was first found after generation 200 (the latest exact match in the table appeared at generation 194), the number of generations was limited to 200

Output Curve

Parallel Computing

Problem Division

Part one: running the numerical model using a large number of different initial parameters.

Part two: determining fitness, selecting the next generation, and performing crossover/mutation

Master-Slave Paradigm

Part one: running the model with a given set of parameters was performed by the slave nodes

Part two: fitness evaluation and selection/crossover/mutation were performed by the master node

PVM

PVM (Parallel Virtual Machine) was used as the message passing library

Execution

The master machine generates a job pool of parameter sets that it passes to the slave machines.

The slave machines in turn run the model and return the results to the master.

If there are more parameter sets available, the node is given another job.

Execution

The master calculates the variance and determines fitness.

After the models have been run for a given generation, the master determines the members of the next generation and runs the crossover/mutation methods on the appropriate portion of the new population.

As the new parameters are created, they are sent to the workstations.
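The job-pool pattern can be sketched on a single machine. This is not PVM: Python's standard `concurrent.futures` thread pool stands in for the message passing layer, and `run_model` is a hypothetical placeholder for the pulsation model. Each worker takes a parameter set from the pool, runs the model, and picks up the next job when it finishes.

```python
from concurrent.futures import ThreadPoolExecutor


def run_model(params):
    # Hypothetical stand-in for the numerical model run on each worker node.
    x, y = params
    return (x - 1.0) ** 2 + (y + 2.0) ** 2


def evaluate_generation(population, workers=4):
    # The "master" submits every parameter set; idle workers take the next job.
    # map() returns the results in the same order as the population.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_model, population))


population = [(1.0, -2.0), (2.0, -2.0), (0.0, 0.0)]
print(evaluate_generation(population))
```

Only the model evaluations are farmed out; selection, crossover and mutation would still run on the master after the results come back, as described above.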

The Network

The cluster is composed of one master computer and 64 slave nodes

The cluster of computers is divided into three subnets

Each subnet is connected to the master serially, using coaxial cable and a 10base-2 (thin Ethernet) system

Darwin

Pentium-II 333 MHz system with 128 MB RAM

Two 8.4 GB hard disks.

Three NE-2000 compatible network cards, one for each of the segments

Darwin

Nodes

Motherboard
Processor
Single 32 MB RAM chip
NE-2000 compatible network card
No hard drive!

Nodes

Half of the nodes contain Pentium-II 300 MHz processors, while the other half are AMD K6-II 450 MHz chips

The Cluster

Conclusion

Based on initial results, the use of genetic algorithms appears to be a promising method for minimizing the residual difference between observational data and the Wilson-Devinney model

Conclusion

It is also a wonderful example of how parallel computing, open source software and clusters of workstations can have a profound impact on the course of research.

PIKAIA Namesake

“Pikaia Gracilens, a little worm-like beast that crawled in the mud of a long gone seafloor of the Cambrian era, 530 million years ago. While not particularly impressive in the tooth and claw department, Pikaia is believed to be the founder of the phylum Chordata, whose subsequent evolution had consequences still very much felt today by the rest of the ecosystem”

References

1. Metcalfe, T. S. (1999), Genetic-Algorithm Based Light-Curve Optimization Applied to Observations of the W Ursae Majoris Star BH Cassiopeiae, The Astronomical Journal, Vol. 117, No. 5, pp. 2503-2510

2. Metcalfe, T. S., R. E. Nather, and D. E. Winget (2000), Genetic-Algorithm-Based Asteroseismological Analysis of the DBV White Dwarf GD 358, The Astrophysical Journal, Vol. 545, No. 2, pp. 974-981

3. Metcalfe, T. S. (2000), The Asteroseismology Metacomputer, Baltic Astronomy, Vol. 9, pp. 479-483

References

Author’s Web page: http://www.whitedwarf.org

Wilson-Devinney: http://cdsads.u-strasbg.fr/cgi-bin/nph-bib_query?1971ApJ...166..605W

PIKAIA Web Page: http://www.hao.ucar.edu/public/research/si/pikaia/pikaia.html

References

Image Sources

All images were taken from: http://www.whitedwarf.org

Except…

H-R Diagram: http://www.astunit.com/tutorials/stellar.htm

Pikaia Gracilens: PIKAIA Website
