parallel memetic algorithm quick design guide quick tips...
TRANSCRIPT
RESEARCH POSTER PRESENTATION DESIGN © 2011
www.PosterPresentations.com
QUICK DESIGN GUIDE (--THIS SECTION DOES NOT PRINT--)
This PowerPoint 2007 template produces a 91cm x 122cm
professional poster. It will save you valuable time placing
titles, subtitles, text, and graphics.
Use it to create your presentation. Then send it to
PosterPresentations.com for premium quality, same day
affordable printing.
We provide a series of online tutorials that will guide you
through the poster design process and answer your poster
production questions.
View our online tutorials at:
http://bit.ly/Poster_creation_help
(copy and paste the link into your web browser).
For assistance and to order your printed poster call
PosterPresentations.com at 1.866.649.3004
Object Placeholders
Use the placeholders provided below to add new elements
to your poster: Drag a placeholder onto the poster area,
size it, and click it to edit.
Section Header placeholder
Move this preformatted section header placeholder to the
poster area to add another section header. Use section
headers to separate topics or concepts within your
presentation.
Text placeholder
Move this preformatted text placeholder to the poster to
add a new body of text.
Picture placeholder
Move this graphic placeholder onto your poster, size it
first, and then click it to add a picture to the poster.
Student discounts are available on our Facebook page.
Go to PosterPresentations.com and click on the FB icon.
QUICK TIPS
(--THIS SECTION DOES NOT PRINT--)
This PowerPoint template requires basic PowerPoint
(version 2007 or newer) skills. Below is a list of commonly
asked questions specific to this template.
If you are using an older version of PowerPoint some
template features may not work properly.
Using the template
Verifying the quality of your graphics
Go to the VIEW menu and click on ZOOM to set your
preferred magnification. This template is at 100% the size
of the final poster. All text and graphics will be printed at
100% their size. To see what your poster will look like
when printed, set the zoom to 100% and evaluate the
quality of all your graphics before you submit your poster
for printing.
Using the placeholders
To add text to this template click inside a placeholder and
type in or paste your text. To move a placeholder, click on
it once (to select it), place your cursor on its frame and
your cursor will change to this symbol: Then, click
once and drag it to its new location where you can resize
it as needed. Additional placeholders can be found on the
left side of this template.
Modifying the layout
This template has four different
column layouts. Right-click your
mouse on the background and
click on “Layout” to see the
layout options. The columns in
the provided layouts are fixed and cannot be moved but
advanced users can modify any layout by going to VIEW
and then SLIDE MASTER.
Importing text and graphics from external sources
TEXT: Paste or type your text into a pre-existing
placeholder or drag in a new placeholder from the left
side of the template. Move it anywhere as needed.
PHOTOS: Drag in a picture placeholder, size it first, click
in it and insert a photo from the menu.
TABLES: You can copy and paste a table from an external
document onto this poster template. To adjust the way
the text fits within the cells of a table that has been
pasted, right-click on the table, click FORMAT SHAPE then
click on TEXT BOX and change the INTERNAL MARGIN
values to 0.25
Modifying the color scheme
To change the color scheme of this template go to the
“Design” menu and click on “Colors”. You can choose from
the provide color combinations or you can create your
own.
© 2011 PosterPresentations.com 2117 Fourth Street , Unit C Berkeley CA 94710 [email protected]
• Genome size is a parameter that determines the length of the genetic information carried by individuals.
• Increasing the genome size causes the algorithm to execute longer for both CPU and GPU algorithms since genome size is directly proportional to the size of data processed.
In this paper, a parallel memetic algorithm implementation for CUDA platform is described. The conventional genetic operators are adapted to the GPU considering the GPU architecture. In this population based optimization technique, there are one more islands and each island consists of constant number of individuals. Each CUDA thread is responsible for evolution of one individual, and islands are mapped as CUDA blocks to benefit from the shared memory. The results show up to 38x speedup compared to the CPU implementation.
ABSTRACT
Populations reside in thread blocks, so the shared memory is effectively used for genetic operators including crossover, mutation and local search. In each thread block, each thread is responsible for an individual.
At the beginning of the kernel code, each thread reads its genome data from global memory and puts them into shared memory. After that point all of the calculations done by threads are made on shared memory except the migration operator. Since migration operator requires communication between distinct islands (thread blocks), global memory has to be employed for this operation.
GENERAL DESIGN
KERNEL PSEUDO CODE
EXPERIMENTS
TESTING CONFIGURATION
The performance of our implementation has been evaluated on a PC having Intel Core i7 870 @ 2.93 GHz processor and NVDIA GeForce GTX 460 (336 cores) GPU. All of the experiments have been repeated 20 times using the same configuration..
The parallel memetic algorithm configuration:
• population size: 16 to 512
• number of islands: 16 to 1024
• genome size: 4
• tournament selection with winning probability of 0.8
• mutation probability: 0.05
• local search iteration number: 10
• local search step size: 0.001
• migration interval: 10
CONCLUSION
• While it is dependent on the fitness functions, GPU implementation shows better performance for all types of fitness functions.
• The best speedup is achieved with Griewangk function where 38x speedup is observed.
• The future work will be focusing on more complex optimization problems providing solutions to real world problems.
• Moreover, different genetic operators (local search, mutation and crossover) are planned to be implemented on CUDA to achieve better optimization results.
Parallel Memetic Algorithm
Implementation on CUDA Umut Cinar, Alptekin Temizel
Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
Function Formula Global Minimum Properties
Rosenbrock 𝑓𝑅𝑜𝑠 𝑥 = [100 𝑥𝑖+1 − 𝑥𝑖2 2 + (𝑥𝑖 − 1)
2]
𝑝−1
𝑖=1
𝑥∗ = 1,1,… , 1 ; 𝑓𝑅𝑜𝑠 𝑥∗ = 0
Not
seperable
Rastrigin 𝑓𝑅𝑎𝑠 𝑥 = 10𝑝 + [𝑥𝑖2 − 10cos(2𝜋𝑥𝑖)]
𝑝
𝑖=1
𝑥∗ = 0,0,… , 0 ; 𝑓𝑅𝑎𝑠 𝑥∗ = 0
Seperable
and
Multimodal
Griewangk 𝑓𝐺𝑟𝑖 𝑥 = 1 + 𝑥𝑖2
4000
𝑝
𝑖=1
− cos(𝑥𝑖
𝑖)
𝑝
𝑖=1 𝑥∗ = 0,0,… , 0 ; 𝑓𝐺𝑟𝑖 𝑥
∗ = 0
Not
seperable
and
Multimodal
Schwefel 𝑓𝑆𝑐ℎ 𝑥 = ( 𝑥𝑗
𝑖
𝑗=1
)2
𝑝
𝑖=1
𝑥∗ = 0,0,… , 0 ; 𝑓𝑆𝑐ℎ 𝑥∗ = 0
Not
seperable
Initialize shared memory buffers
WHILE termination criterion is not met; DO
evaluate individual
begin local search
tournament selection
IF thread_id < warp_size
offsprings = crossover(parents)
offsprings = mutation(offsprings)
replace offsprings with worst_individuals
ELSE IF thread_id < warp_size*2 AND
(MIGRATION_INTERVAL is true)
migrate(best_individuals)
END IF END WHILE
• Threads within a block continues in parallel until the variation and migration operators take place.
• At this point, two warp of threads executes crossover/mutation and migration operations. First warp is assigned to crossover/mutation while the second warp is assigned to migration (when it is migration interval).
• Hence, this part of the implementation is completely unrolled by using a warp for each operation. This approach decreases the cost of warp divergence significantly.
• The experiments show that the results are heavily dependent on the fitness function type. In a memetic algorithm, the most computationally complex routine is the fitness function, and this is reflected in the dependency of performance on fitness function type.
• The maximum speedups achieved with Griewangk function as x38.
16
64
2560
2
4
6
8
10
12
14
Spe
ed
up
Population Size
Number of Islands Rosenbrock function on GPU (max speedup is x13)
12-14
10-12
8-10
6-8
4-6
2-4
0-2
16
64
2560
5
10
15
20
25
Spe
ed
up
Population Size
Number of Islands
Rastrigin function on GPU (max speedup is x25)
20-25
15-20
10-15
5-10
0-5
16
64
256
0
2
4
6
8
10
Spe
ed
up
Population Size
Number of Islands
Schwefel function on GPU (max speedup is x9)
8-10
6-8
4-6
2-4
0-2
16
64
2560
10
20
30
40
Spe
ed
up
Population Size
Number of Islands
Grienwangk function on GPU (max speedup is x38)
30-40
20-30
10-20
0-10
Speedup has been investigated using several benchmarking functions including Rosenbrock, Rastrigin, Schwefel and Griewangk and various parameters.
INFLUENCE OF GENOME SIZE ON EXECUTION TIME
53 97
141 185 229
1989 3408
5027 6626 8536
1
10
100
1000
10000
2 4 6 8 10
Exe
cuti
on
Tim
e (
ms)
Genome Size
Genome Size vs Execution Time
CPU
GPU
REFERENCES
[1] Pablo Moscato, Carlos Cotta, “A Gentle Introduction to Memetic Algorithms”, Handbook of Metaheuristics, pp. 105-144, 2003. [2] Wojciech Bozejko, Mieczyslaw Wodecki, “The Methodology of Parallel Memetic Algorithms Designing”, in Proc. ICAART 2011.