study of a paper about genetic algorithm for cs8995 parallel programming yanhua li

Study of a Paper about Genetic Algorithm

For CS8995

Parallel Programming

Yanhua Li

The paper to be studied:

VLSI Circuit Synthesis using a Parallel Genetic Algorithm

Mike Davis, Luoping Liu, and John G. Elias

Department of Electrical Engineering

University of Delaware

Newark, DE. 19716

Introduction

This paper describes a parallel implementation of a genetic algorithm (GA) used to evolve simple analog VLSI circuits.

The Linda coordination language is used to implement the parallel GA.

Linda coordination language

Extends the syntax of a conventional programming language to support processes that operate in parallel.

Its shared-memory programming model compares well to other parallel environments.

The shared-memory programming model consists of a Tuple-Space (TS).

What is Tuple-Space?

Tuple-Space (TS) is a virtual shared-memory data storage area that processes can read and write.

Physically, the TS is distributed over the processors in the system; logically, it behaves as one simple memory.

C-language Linda consists of

four programming language extensions:

out(): move data into the TS

in(): move data out of the TS

rd(): get a copy of data from the TS

eval(): add a new C procedure to the TS; the procedure joins the ranks of executing processes
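The four operations can be illustrated with a toy, single-machine stand-in written in Python. This is only a sketch under stated assumptions, not the commercial C-Linda API: the `TupleSpace` class, the use of `None` as a wildcard field, and the names `in_` and `eval_` (renamed because `in` and `eval` are reserved or built in for Python) are all hypothetical choices made for this example.

```python
import threading


class TupleSpace:
    """Toy in-memory stand-in for a Linda tuple space (not the real Linda API)."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def out(self, *tup):
        """Move a tuple into the space and wake any waiting readers."""
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def _find(self, template):
        # None in the template acts as a wildcard field (an assumption of this sketch).
        for tup in self._tuples:
            if len(tup) == len(template) and all(
                t is None or t == v for t, v in zip(template, tup)
            ):
                return tup
        return None

    def rd(self, *template):
        """Block until a tuple matches, then return it; the tuple stays in the space."""
        with self._cond:
            while (tup := self._find(template)) is None:
                self._cond.wait()
            return tup

    def in_(self, *template):
        """Block until a tuple matches, then remove it from the space and return it."""
        with self._cond:
            while (tup := self._find(template)) is None:
                self._cond.wait()
            self._tuples.remove(tup)
            return tup

    def eval_(self, name, func, *args):
        """Run func(*args) in a new thread; its result is written as (name, result)."""
        thread = threading.Thread(target=lambda: self.out(name, func(*args)))
        thread.start()
        return thread


ts = TupleSpace()
ts.out("fitness", 7, 29.5)
copy = ts.rd("fitness", 7, None)     # the tuple remains in the space
taken = ts.in_("fitness", 7, None)   # the tuple is removed
ts.eval_("square", lambda x: x * x, 12).join()
result = ts.in_("square", None)
```

Here `rd` leaves the tuple in place while `in_` consumes it, which is the distinction the worker and master processes rely on later in the slides.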

Using Linda…

Data are read from the TS by an associative matching scheme.

Matching criteria are sent to the TS via the in() or rd() functions.

The first tuple to match is returned to the requesting process.

The matching scheme gives good load balancing because processes continuously probe the TS for jobs to do, so faster processes get more jobs.
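The load-balancing claim can be illustrated with a small discrete-event simulation. This is a sketch, not a measurement from the paper: the `simulate` function, the worker names, and the per-job times are hypothetical. Each worker takes the next job from a shared bag as soon as it becomes idle, which mimics processes probing the TS.

```python
import heapq


def simulate(job_count, seconds_per_job):
    """Count jobs per worker when each worker grabs a new job as soon as it is idle."""
    done = {name: 0 for name in seconds_per_job}
    # Heap of (time at which the worker becomes idle, worker name).
    idle_at = [(0.0, name) for name in seconds_per_job]
    heapq.heapify(idle_at)
    for _ in range(job_count):
        now, name = heapq.heappop(idle_at)  # the earliest idle worker takes the job
        done[name] += 1
        heapq.heappush(idle_at, (now + seconds_per_job[name], name))
    return done


# Hypothetical speeds: "fast" needs 1 s per job, "slow" needs 3 s per job.
counts = simulate(12, {"fast": 1.0, "slow": 3.0})
print(counts)  # {'fast': 9, 'slow': 3}
```

No scheduler assigns work in this model; the 3:1 split falls out of the workers pulling jobs at their own pace.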

The parallel GA application

Parallel Environment

Consists of a network of 20 Sun IPC workstations.

16 of the machines are connected to both the FDDI network and the Ethernet.

Worker processes run on FDDI-connected machines.

Master processes run on non-FDDI-connected machines.

A commercial Linda package was used.

The parallel GA application…

The parallel GA is mapped onto the TS using two schemes:

The centralized scheme

consists of a single breeding population

The distributed scheme

consists of multiple breeding populations which may interact with each other from time to time

The TS consists of…

four types of tuples:

Members of the breeding population, Ci,p

Each describes a particular circuit (e.g. transistor parameters: type, number, dimension, placement, connections).

Offspring of breeding pairs, Oj,p

Average fitness of the population, AFp

Best individual in the population, BIp

Master Processes

Create worker processes

Initialize and manage the population

Compute the average fitness AFp

Keep track of the best individual in the population, BIp, and the average fitness of the population, AFp

Add new offspring to the population

Master Processes

Whenever a worker writes an offspring Oj,p to the TS, the master randomly selects members of the population and compares each one's fitness value with the average fitness AFp until it finds an individual whose fitness value is less than AFp. It then replaces that individual with the new offspring. In this way, the population size remains constant.
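The replacement rule described above can be sketched as follows. This is a minimal illustration, assuming the population is a plain Python list and fitness is a callable; the function name `insert_offspring` and the guard for the case where no member is below average are additions made for this example, not details from the paper.

```python
import random


def insert_offspring(population, fitness, offspring, rng=random):
    """Replace one randomly chosen below-average member with the offspring.

    Returns the replaced index, or None if no member is below the average.
    The population size never changes.
    """
    scores = [fitness(member) for member in population]
    average = sum(scores) / len(scores)
    if all(score >= average for score in scores):
        return None  # no member is below average (e.g. all fitness values equal)
    while True:
        index = rng.randrange(len(population))  # random pick, as on the slide
        if scores[index] < average:
            population[index] = offspring
            return index


# Toy usage: each individual is just its own fitness value (hypothetical data).
population = [3.0, 10.0, 8.0, 1.0]
replaced = insert_offspring(population, lambda x: x, 9.5, random.Random(0))
```

With these toy values the average is 5.5, so only the members at index 0 or 3 can be replaced.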

Worker Processes

Repeat the following:

Randomly select two breeding individuals Ci,p from the population using the rd() function.

Combine the two individuals using crossover and mutation to produce two new offspring, and evaluate their performance.

Compare the fitness value of the better-performing offspring with AFp. If it is larger than AFp, write the offspring to the TS using the out() function.
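One iteration of the worker loop might look like the sketch below. Several things here are assumptions made for illustration: individuals are flat lists of floats rather than circuit descriptions, crossover is one-point, mutation is Gaussian, the fitness function is a toy, and a plain list stands in for reading the population from the TS with rd().

```python
import random


def crossover(a, b, rng):
    """One-point crossover: swap the tails of two equal-length gene lists."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def mutate(genes, rng, rate=0.1, scale=0.5):
    """Perturb each gene with probability `rate` by Gaussian noise."""
    return [g + rng.gauss(0.0, scale) if rng.random() < rate else g for g in genes]


def worker_step(population, average_fitness, fitness, rng):
    """Return the better offspring if it beats the average fitness, else None."""
    parent_a, parent_b = rng.sample(population, 2)  # stands in for two rd() calls
    children = [mutate(child, rng) for child in crossover(parent_a, parent_b, rng)]
    best = max(children, key=fitness)
    # Only an above-average offspring would be written to the TS with out().
    return best if fitness(best) > average_fitness else None


def toy_fitness(genes):
    """Hypothetical fitness: higher when every gene is close to 0.5."""
    return -sum((g - 0.5) ** 2 for g in genes)


rng = random.Random(1)
population = [[rng.uniform(0.0, 1.0) for _ in range(4)] for _ in range(6)]
average = sum(toy_fitness(ind) for ind in population) / len(population)
offspring = worker_step(population, average, toy_fitness, rng)
```

Because the worker filters against the average before writing, the master only ever has to handle offspring that can improve the population.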

Centralized and Distributed scheme

Centralized Scheme

One master, multiple workers (in this report, 16 workers)

Distributed Scheme

Divide the TS into several equivalent regions, each with the same tuple arrangement as in the centralized scheme.

Divide the population into subpopulations, each with one master and several worker processes (in this report, 4 subpopulations, each with a master and 4 workers).
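One way to picture the regions is to tag every population tuple with its subpopulation index p, so that each master and its workers only match tuples carrying their own p. The sketch below is an assumption for illustration only: the `("C", i, p, genes)` tuple layout, the round-robin assignment, and the idea that a single population of 1000 is split four ways are not stated in the slides.

```python
from collections import defaultdict


def build_regions(individuals, subpopulations):
    """Deal individuals round-robin into regions keyed by population index p."""
    regions = defaultdict(list)
    for i, genes in enumerate(individuals):
        p = i % subpopulations
        regions[p].append(("C", i, p, genes))  # hypothetical tuple layout
    return regions


# Hypothetical data: 1000 one-gene individuals split across 4 subpopulations.
regions = build_regions([[g] for g in range(1000)], subpopulations=4)
sizes = {p: len(members) for p, members in regions.items()}
```

With p = 0 for every tuple this collapses to the centralized scheme, which is why both schemes can share the same tuple arrangement.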

Centralized and Distributed scheme

Results for population size 1000

Analysis of results

For criterion avg_fit >= 29.96

The centralized algorithm shows nearly linear speedup (16.5/17).

The distributed GA with isolated populations fails to reach the fitness criterion.

The distributed GA with communicating masters has a speedup much less than the theoretical maximum (17/20).
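The two ratios can be turned into parallel efficiencies with a line of arithmetic. This reading is an assumption: it takes each pair as measured speedup over theoretical maximum, with the maxima being the process counts from the scheme descriptions (1 master + 16 workers = 17, and 4 masters + 16 workers = 20).

```python
def efficiency(speedup, max_speedup):
    """Fraction of the theoretical maximum speedup that was actually achieved."""
    return speedup / max_speedup


centralized = efficiency(16.5, 17)  # about 0.97
distributed = efficiency(17, 20)    # 0.85
```

Under this reading the centralized scheme achieves about 97% of its maximum, while the distributed scheme with communicating masters achieves 85%.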

Analysis of results

For criterion gen# = 40

The time to reach a specific generation depends largely on the worker processes, so the theoretical maximum speedup = 16

All 3 methods have similar speedup and take about the same time to reach the criterion

The four types of inverter amplifiers discovered by GA

The relationship between # of different types and some parameters

Development rules

Why use development rules?

The GA method creates possible connection patterns, but most of them result in non-viable circuits that cannot produce offspring.

Use development rules to eliminate connection patterns which are not sensible or cannot produce offspring.

Questions?

Thanks!
