Study of a Paper about a Genetic Algorithm for CS8995 Parallel Programming, Yanhua Li
Post on 19-Dec-2015
The paper to be studied:
VLSI Circuit Synthesis using a Parallel Genetic Algorithm
Mike Davis, Luoping Liu, and John G. Elias
Department of Electrical Engineering
University of Delaware
Newark, DE. 19716
Introduction
The paper describes a parallel implementation of a genetic algorithm (GA) used to evolve simple analog VLSI circuits.
The Linda coordination language is used to implement the parallel GA.
Linda coordination language
Extends the syntax of a conventional programming language to support parallel processes.
Its shared-memory programming model compares well to other parallel environments.
The shared-memory programming model consists of a Tuple-Space (TS).
What is Tuple-Space?
Tuple-Space (TS) is a virtual shared-memory data storage area that processes can read and write.
Physically, the TS is distributed over the processors in the system; logically, it appears as one simple memory.
C-language Linda consists of four programming language extensions:
out() : move data into the TS
in() : move data out of the TS (the data are removed)
rd() : get a copy of data from the TS (the data stay in place)
eval() : add a new C procedure to the TS, which then joins the ranks of executing processes
Using Linda…
Data are read from the TS through an associative matching scheme.
Matching criteria are sent to the TS via the in() or rd() functions.
The first tuple to match is returned to the requesting process.
The matching scheme gives good load balancing, since processes continuously probe the TS for jobs to do; faster processes therefore get more jobs.
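The associative matching described above can be illustrated with a small in-process stand-in for a tuple space. This is only a sketch under stated assumptions, not the commercial Linda package used in the paper: the class name TupleSpace, the ANY wildcard and the method name in_ (in is a reserved word in Python) are hypothetical, and real C-Linda matches on typed formal parameters rather than a wildcard object. eval() is not modeled.

```python
import threading

ANY = object()  # hypothetical wildcard: matches any value in that field


class TupleSpace:
    """Toy, single-machine stand-in for a Linda tuple space."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def out(self, *fields):
        """Move a tuple into the space and wake any blocked readers."""
        with self._cond:
            self._tuples.append(tuple(fields))
            self._cond.notify_all()

    @staticmethod
    def _matches(template, candidate):
        # Same length, and every non-wildcard field must be equal.
        return len(template) == len(candidate) and all(
            t is ANY or t == c for t, c in zip(template, candidate)
        )

    def _find(self, template):
        # The first tuple to match is the one returned.
        for candidate in self._tuples:
            if self._matches(template, candidate):
                return candidate
        return None

    def rd(self, *template):
        """Block until a tuple matches; return it and leave it in the space."""
        with self._cond:
            while (found := self._find(template)) is None:
                self._cond.wait()
            return found

    def in_(self, *template):
        """Block until a tuple matches; remove it from the space and return it."""
        with self._cond:
            while (found := self._find(template)) is None:
                self._cond.wait()
            self._tuples.remove(found)
            return found
```

A worker that repeatedly calls in_("job", ANY) takes whichever job tuple matches first, which is how faster workers end up doing more jobs.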
The parallel GA application
Parallel Environment
Consists of a network of 20 Sun IPC workstations.
16 of the machines are connected to both the FDDI network and the Ethernet.
Worker processes run on FDDI-connected machines.
Master processes run on non-FDDI-connected machines.
A commercial Linda package was used.
The parallel GA application…
The parallel GA is mapped onto the TS using two schemes:
The centralized scheme
consists of a single breeding population
The distributed scheme
consists of multiple breeding populations which may interact with each other from time to time
The TS consists of…
four types of tuples:
Members of the breeding population, Ci,p
Each describes a particular circuit (e.g. transistor parameters: type, number, dimension, placement, connections).
Offspring of breeding pairs, Oj,p
Average fitness of the population, AFp
Best individual in the population, BIp
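One way to picture the four tuple types is as tagged records keyed by the population index p. The sketch below is illustrative only: the field names, the genome encoding (a tuple of floats) and the idea of storing the fitness inside each member tuple are assumptions, since the slides do not give the actual tuple layout. The helper summarize is also hypothetical.

```python
from typing import NamedTuple, Sequence, Tuple


class Member(NamedTuple):          # C_{i,p}: member i of population p
    population: int
    index: int
    genome: Tuple[float, ...]      # assumed encoding of the circuit description
    fitness: float


class Offspring(NamedTuple):       # O_{j,p}: offspring j written by a worker
    population: int
    index: int
    genome: Tuple[float, ...]
    fitness: float


class AverageFitness(NamedTuple):  # AF_p
    population: int
    value: float


class BestIndividual(NamedTuple):  # BI_p
    population: int
    genome: Tuple[float, ...]
    fitness: float


def summarize(members: Sequence[Member]):
    """Derive the AF_p and BI_p tuples from the members of one population."""
    p = members[0].population
    avg = sum(m.fitness for m in members) / len(members)
    best = max(members, key=lambda m: m.fitness)
    return AverageFitness(p, avg), BestIndividual(p, best.genome, best.fitness)
```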
Master Processes
Create worker processes
Initialize and manage the population
Compute the average fitness AFp
Keep track of best-individual in the population BIp, and average-fitness of the population AFp
Add new offspring to the population:
Whenever a worker writes an offspring Oj,p to the TS, the master randomly selects members of the population and compares each one's fitness with the average fitness AFp until it finds an individual whose fitness is less than AFp. It then replaces that individual with the new offspring, so the population size remains constant.
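The replacement policy can be sketched as follows. This is a minimal illustration, assuming the population is held as a list of (genome, fitness) pairs; the function name and the guard for the case where no member is below average are additions for the sketch, not something the slides describe.

```python
import random


def replace_below_average(population, offspring, avg_fitness, rng=random):
    """Replace one randomly chosen below-average member with the offspring.

    population: list of (genome, fitness) pairs, modified in place.
    Returns the index that was replaced, or None if no member is below average.
    """
    # Guard (an assumption of this sketch): without it the loop below
    # would never end when every member is at or above the average.
    if not any(fitness < avg_fitness for _, fitness in population):
        return None
    while True:
        i = rng.randrange(len(population))   # random pick, as in the slide
        if population[i][1] < avg_fitness:   # keep drawing until below AF_p
            population[i] = offspring        # size of the population is unchanged
            return i
```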
Worker Processes
Repeat the following:
Randomly select two breeding individuals Ci,p from the population using the rd() function.
Combine the two individuals using crossover and mutation to produce two new offspring, and evaluate their performance.
Compare the fitness of the better-performing offspring with AFp. If it is larger than AFp, write the offspring to the TS using the out() function.
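One iteration of the worker loop might look like the sketch below. The slides do not say which crossover or mutation operators the paper uses, so one-point crossover and Gaussian per-gene mutation are stand-ins; the genome encoding (a list of floats with at least two genes), the fitness function and all names are hypothetical. Returning a value stands in for writing the offspring with out().

```python
import random


def one_point_crossover(a, b, rng):
    """Swap the tails of two equal-length genomes at a random cut point."""
    cut = rng.randrange(1, len(a))           # needs at least two genes
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def mutate(genome, rate, rng):
    """Perturb each gene with probability `rate` (assumed Gaussian noise)."""
    return [g + rng.gauss(0.0, 0.1) if rng.random() < rate else g
            for g in genome]


def worker_step(population, avg_fitness, fitness_fn, rng, mutation_rate=0.05):
    """Breed two random parents; return (genome, fitness) of the better
    child if it beats the average fitness, otherwise None."""
    parent_a, parent_b = rng.sample(population, 2)
    children = [mutate(child, mutation_rate, rng)
                for child in one_point_crossover(parent_a, parent_b, rng)]
    scored = [(fitness_fn(child), child) for child in children]
    best_score, best_child = max(scored, key=lambda pair: pair[0])
    if best_score > avg_fitness:
        return best_child, best_score        # would be written with out()
    return None
```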
Centralized and Distributed scheme
Centralized Scheme
One master, multiple workers (in this report, 16 workers)
Distributed Scheme
Divide the TS into several equivalent regions, each with the same tuple arrangement as in the centralized scheme.
Divide the population into subpopulations, each with one master and several worker processes (in this report, 4 subpopulations, each with a master and 4 workers).
Analysis for results
For the criterion avg_fit >= 29.96:
The centralized algorithm shows nearly linear speedup (16.5/17).
The distributed GA with isolated populations fails to reach the fitness criterion.
The distributed GA with communicating masters has a speedup much less than the theoretical maximum (17/20).
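The two ratios can be compared as parallel efficiencies. This assumes each pair is read as measured speedup over theoretical maximum speedup (for example, 16 workers plus 1 master giving 17, and 16 workers plus 4 masters giving 20); the slides do not state this reading explicitly, and the function name is hypothetical.

```python
def parallel_efficiency(measured_speedup, theoretical_max):
    """Fraction of the theoretical maximum speedup actually achieved."""
    return measured_speedup / theoretical_max


# Figures quoted on the slide, under the reading described above.
centralized = parallel_efficiency(16.5, 17)   # roughly 0.97
distributed = parallel_efficiency(17, 20)     # 0.85
```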
Analysis for results
For the criterion gen# = 40:
The time to reach a specific generation depends largely on the worker processes, so the theoretical maximum speedup is 16.
All 3 methods have similar speedup and take about the same time to reach the criterion.
Development rules
Why use development rules?
The GA creates many possible connection patterns, but most of them result in non-viable circuits that cannot produce offspring.
Development rules eliminate connection patterns that are not sensible or cannot produce offspring.