
Darmstadt University of Technology

Institut für Datentechnik

Fachgebiet Mikroelektronische Systeme

Prof. Dr. Dr. h.c. mult. Manfred Glesner

Fachbereich Informatik

Fachgebiet Algorithmik

Prof. Dr. Karsten Weihe

Diploma Thesis

Coevolution and Performance Evaluation of Graph Based Cellular Automata Rules for the Density Classification Task

Author : Andre Schumacher

Advisor : Dr.-Ing. Peter Zipf, Prof. Dr. Karsten Weihe

Start : May 2005

End : October 2005

Declaration of Honour

I hereby declare that I have written this diploma thesis without the help of third parties and only with the sources and aids indicated. All passages taken from the sources have been marked as such. This thesis has not been submitted to any examination authority in the same or a similar form.

Herewith I declare that I have made the presented paper myself and solely with the aid of the means permitted by the examination regulations of the Darmstadt University of Technology. The literature used is indicated in the bibliography. I have indicated literally or correspondingly assumed contents as such.

Darmstadt, October 2005

Andre Schumacher


Contents

1 Overview
  1.1 Motivation
  1.2 Outline

2 Introduction
  2.1 Cellular Automata and the Density Classification Task
    2.1.1 A Short Review of Cellular Automata
    2.1.2 The Density Classification Task
  2.2 Coevolution of CA rules for the Density Classification Task
    2.2.1 A Short Review of Coevolutionary Algorithms
    2.2.2 Coevolution and the Density Classification Task
    2.2.3 Juille and Pollack’s Approach to the Density Classification Task
  2.3 Small-World Graphs and the Density Classification Task
    2.3.1 A Short Review of Small-World Graphs
    2.3.2 Applications to the Density Classification Task

3 The Coevolution Framework
  3.1 Models
    3.1.1 Graph Based Cellular Automaton Model
    3.1.2 Coevolutionary Model
  3.2 Implementation
    3.2.1 Distributed Architecture
    3.2.2 The Coevolutionary System
    3.2.3 The Logging System
    3.2.4 Composed System
    3.2.5 Example System Run

4 Coevolution Run and Evaluation of Rules and Graphs
  4.1 Analysis of System Runs
    4.1.1 Fitness and Performance
    4.1.2 Evolutionary Changes in Rule and Configuration Populations
    4.1.3 Evolutionary Changes in Graph Population
  4.2 Evaluation of Evolved Rules and Graphs
    4.2.1 Evaluation Performance and Graph Properties
    4.2.2 Ring-Based Small-World Graph Model
    4.2.3 Evaluation Performance Depending on Graph Models
    4.2.4 Evaluation Performance Depending on φ
    4.2.5 Significance Value S
    4.2.6 Evaluation Performance Depending on S
    4.2.7 Performance Development in Noisy Environment
  4.3 Summary

5 Conclusion

Bibliography

A Implementation
  A.1 Class Diagrams
  A.2 Population Data File Format
  A.3 Random Generators

B Results
  B.1 Coevolution
  B.2 Analysis
  B.3 Gbca rule
  B.4 Gbca graph
  B.5 Density-Performance Diagram

C Tools and Utilities
  C.1 Configuration Files
    C.1.1 client.xml
    C.1.2 genetic.xml
    C.1.3 server.xml
  C.2 Main Tools
    C.2.1 CAClient
    C.2.2 CAServer
  C.3 Evaluation Utilities
    C.3.1 GbcaEvaluator
    C.3.2 GbcaGraphComparator
    C.3.3 GbcaGraphTestBench
    C.3.4 GbcaNoiseTestBench
    C.3.5 GbcaTestBench
    C.3.6 GenericCaTestBench
  C.4 Visualisation Tools
    C.4.1 GbcaViewer
    C.4.2 GenericCaViewer
  C.5 Miscellaneous
    C.5.1 ConfigDumper
    C.5.2 IndividualScanner


List of Tables

2.1 Parameters of Mitchell’s genetic algorithm
2.2 Overview of rules evolved for the density classification task
2.3 Parameters used by Juille and Pollack

3.1 Adaptations for different evolutionary strategies

4.1 Coevolution run parameters
4.2 Performance comparison between Gbca rule and Gbca graph, majority rule, Coevolution2 and GKL
4.3 Performance comparison between Gbca rule and majority rule for different graph classes


List of Figures

2.1 CA for the density classification task
2.2 Space-time diagrams of the GKL rule
2.3 Evolutionary versus coevolutionary optimisation
2.4 Informational value E() depending on p
2.5 Space-time diagrams of the Coevolution2 rule by Juille and Pollack
2.6 Small-world graph: Interpolating between lattice and random structure
2.7 Comparison of L and γ for the β and φ-graph model
2.8 Incoming vertex degree distribution in directed scale-free graph model
2.9 Influence of the coupling structure on the behaviour of the majority rule: Space-time diagrams
2.10 Consecutive block of cells tending to keep minority consensus

3.1 Coevolutionary optimisation of graph based cellular automata
3.2 Execution modes for distributed architecture
3.3 GenerationManager
3.4 Evolution and GenerationManager
3.5 GBCA computation and coevolutionary system integration
3.6 Exemplary coevolution run

4.1 Development of fitness and performance within each population
4.2 Changes in generation performance depending on density for GBCA rules and initial configurations
4.3 Development of the distribution of clustering coefficients γ_v and the length of shortest paths
4.5 Construction of RingSWGraph
4.6 Comparison of graph models: Dependency of rule performance on φ
4.7 Comparison of graph models: Dependency of S on φ
4.8 Comparison of graph models: Dependency of rule performance on S
4.9 Space-time diagrams of Gbca rule for increasing φ values
4.10 Space-time diagrams of Gbca rule for the RingSWGraph-model
4.11 Rule performance comparison under the influence of noise

A.1 Class Diagrams
A.1 File Format

B.1 Performance versus ρ


1 Overview

1.1 Motivation

Graph theory and its application to communication networks have become steadily more important since the beginning of the telecommunication age. New communication systems have emerged, starting from the telegraph, followed by the telephone, systems of satellite and relay radio stations and, more recently, wireless local area communication systems. All of them have in common that their behaviour and efficiency strongly depend on their basic underlying principles, such as the protocols that drive them.

The study of network systems that have evolved over a period of time unaffected by a central planning authority, such as the Internet, and the question of how their nature determines their structure are still quite new and have only been addressed recently. Researchers like Watts, Strogatz and Barabasi, to name just a few, made the surprising discovery that networks that have developed in some sense “naturally” do not fit into either of the traditional models used for such networks so far, regular or completely random structures. Instead, these networks usually lie somewhere in the middle, within the small-world region.

The next step is to find out in which way the structure and the dynamic behaviour of small-world shaped systems relate to one another. For this purpose, simple problem cases like the density classification task, which originates from the field of cellular automata, can be useful. This task requires an efficient communication structure to compute a global automaton property based on highly local information only. It requires long-range coordination and synchronisation of distributed computing elements, which have to process information efficiently because the amount of local memory is strictly limited. For these reasons, one can directly observe the impact of local changes within the connection structure on the complete system, making it a suitable test environment to study the influence of network structure on dynamic system behaviour.

A problem one has to overcome while optimising a graph based cellular automaton for such a test case is that the coupling structure and the automaton function have to be considered as a combined system. Optimisation has to work on both of them simultaneously, although they are strongly interdependent. This is the kind of optimisation problem where the coevolution paradigm can be applied successfully, as will be shown later.


1.2 Outline

This work addresses the following issues:

1. Coevolution of a high performance pair of cellular automaton rule and coupling structure for the density classification task.

2. Analysis of whether evolution tends to develop graphs with small-world properties.

3. Performance evaluation of the best evolved rule and coupling structure, characterising the dependence of the rule on changing graph structures and determining the influence of noise on the rule’s performance.

4. Classification of the rule’s behaviour, comparing it to other well-known rules, including the majority rule, GKL and Coevolution2.

The second chapter gives an overview of the three fields that are directly linked to the topic: cellular automata, the coevolution paradigm and small-world networks. At least one application to the density classification task has originated from each of these fields, and each is briefly outlined in its context.

The third chapter discusses the modifications that have been made to the coevolutionary framework of cellular automata rules for the density classification task. These modifications enable the coevolution of graph based coupling structures. The distributed architecture of the system that was designed to perform the coevolution is also described.

In the fourth chapter, the data obtained from a run of the coevolutionary system is analysed. This analysis is divided into two parts: the direct examination of the population data and the re-evaluation of selected rule and graph individuals on new input data. The second part contains a performance comparison of the best of the evolved rules with previously discovered automaton rules, using different graph models. A special focus lies on the dependency of rule performance on various structural properties of the particular graph model. In these comparisons, the best evolved rule and graph pair significantly outperforms both the best standard cellular automaton rule found for the density classification task so far and the best deterministic rule using graph based coupling structures. At the end of the fourth chapter, the behaviour of the best evolved rule and graph within a noisy environment is compared to the behaviour of two other rules, GKL and the majority rule.

The last chapter summarises the results and describes an application of the approach followed in this work to a more general set of problem cases.

Additional information about the implementation, including class diagrams and a description of the population data file format, is given in the appendix. Furthermore, all command line tools used to start the system and evaluate the candidate solutions are briefly described, along with the list of parameters that were used to generate the data presented in the fourth chapter. The best evolved rule and graph pair is also depicted in the appendix.


2 Introduction

This chapter gives an overview of the three fields the topic is mostly concerned with: cellular automata, coevolutionary optimisation algorithms and small-world networks.

At least one application to the density classification task has originated from each of these fields, and each is briefly summarised in its context. The focus lies on the aspects that are directly connected to the implementation presented in the following chapter. Although this chapter gives an introduction to the topic, it already contains important concepts and definitions, which are used in later chapters.

2.1 Cellular Automata and the Density Classification Task

Starting from a short review of cellular automata, this section introduces the automaton model that is used later to develop the model of a graph based cellular automaton. Further on in this section, the density classification task is described and the first approaches to this optimisation problem are outlined.

2.1.1 A Short Review of Cellular Automata

Cellular automata have become popular models with a wide range of applications in recent years. Processes from biology, sociology and physics, to name just a few areas of application, have been transferred to the concept of cellular automata. Fundamental research has also made significant progress, although the particular focus of the work has shifted over time. Cellular automata embody a computation paradigm that is quite different from the centralised paradigm widely used nowadays, which relies on one or a small number of processing units. They are also able to show complex and sometimes fascinating behaviour although they consist only of simple, interconnected units.

The concept of a massively parallel automaton, composed of relatively simple cells, came up now and then under different names. According to the literature, the first to study this concept was John von Neumann around 1947, who was interested in constructing the simplest system capable of reproduction in a sense similar to biology (see [25], pp. 876 to 878, for the following).

Influenced by Stanislaw Ulam, von Neumann later described a two-dimensional system containing cells that only exchange information with their direct neighbours. Each of the cells could be in one of 29 possible states, and their behaviour was based on a rather complicated set of rules, motivated by the process of electronic system assembly (see [8] for details).

In the following years, work concentrated on the question of whether these kinds of automata were capable of self-reproduction and universal computation. The former question was answered by von Neumann, who gave several configurations of cell states that were able to reproduce themselves according to the cell state update rules. The latter question was answered in the affirmative much later, even for a particular one-dimensional automaton, in the sense that it is able to emulate a universal Turing machine (see [25], p. 1115, and [5]).

To date, the behaviour of various complex systems originating from biology, sociology, ecology, physics and other disciplines has been studied using cellular automata, even if the terminology changed from case to case.

To avoid confusion, the definition of a cellular automaton used in this work is fixed below, keeping in mind that the term is also used in a much broader sense in the literature. The definition is based on [8].

Definition 2.1. A cellular automaton (CA) consists of a d-dimensional grid of identical units corresponding to finite-state machines (FSM), which are called cells. Each cell resides in one of a set S of possible cell states, which it updates synchronously in discrete time steps according to a uniform update rule Φ. The state of a cell is visible to the surrounding cells within a radius r, which form the neighbourhood of the particular cell. Φ is also called “transition rule” or “CA rule”, and the next cell state depends only on the current state of the neighbouring cells and on the cell’s own state. Therefore, in the terminology of FSM, the tuples of states of the cell’s neighbours form the input alphabet of the FSM. Its output function, depending only on the cell’s own state, is the identity function on S.

For d = 1, the neighbourhood is composed of the r closest cells on either side of the cell. The d-dimensional vector of cell states at time step t is also called a configuration at time step t, which represents the state of the entire cellular automaton.

Remark: Usually, the number of cells is considered to be infinite, but it is also possible to fix the number of cells and formulate periodic boundary conditions. The start state or initial configuration of the CA represents the input of the automaton, and the configuration reached after a specified number of steps t_max is regarded as its output.
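To make Definition 2.1 concrete, the following minimal sketch applies a uniform update rule to a one-dimensional binary CA with a fixed number of cells and periodic boundary conditions. The function names, the list-based rule table and the convention of reading the neighbourhood as a binary number are assumptions made for illustration only; they are not taken from the implementation described in this work.

```python
from itertools import product


def ca_step(config, rule_table, r):
    """Apply the update rule once to every cell of a ring of binary cells.

    config     -- list of cell states (0 or 1)
    rule_table -- list of 2**(2*r + 1) output states; the neighbourhood
                  (r left neighbours, the cell itself, r right neighbours)
                  is read as a binary number to index the table
                  (an assumed convention for this sketch)
    r          -- neighbourhood radius
    """
    n = len(config)
    new_config = []
    for i in range(n):
        index = 0
        for offset in range(-r, r + 1):
            # Periodic boundary: indices wrap around the ring.
            index = (index << 1) | config[(i + offset) % n]
        new_config.append(rule_table[index])
    return new_config


def run_ca(config, rule_table, r, t_max):
    """Return the configuration reached after t_max synchronous steps."""
    for _ in range(t_max):
        config = ca_step(config, rule_table, r)
    return config


# Example rule for r = 1: each cell adopts the local majority state.
majority_r1 = [1 if sum(bits) >= 2 else 0 for bits in product((0, 1), repeat=3)]
```

All cells are updated from the old configuration before any new state becomes visible, which corresponds to the synchronous update required by the definition.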

Most research so far has focused on one- or two-dimensional automata, probably because CA behaviour is easy to visualise in this class. The most popular form of visualisation is the space-time diagram, which displays the state of each cell as it changes over time, one step below the other. See Fig. 2.2 for an example. Unless noted otherwise, all cellular automata discussed in this work are one-dimensional.

Wolfram [25] developed a classification scheme by systematically examining the behaviour of a special class of CAs, which he calls the elementary CAs. They can be regarded as the special case of one-dimensional CAs with r = 1 and S = {0, 1}. According to this scheme, each of the 256 possible elementary CA update rules falls into one of four classes, depending on the patterns visible in the space-time diagrams of the CA for arbitrary initial configurations.
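The count of 256 follows from the 2^3 = 8 possible neighbourhoods of an elementary CA, each of which is mapped to one of two states, giving 2^8 rules. The sketch below expands a rule number into its rule table, assuming Wolfram's usual numbering convention, in which bit k of the rule number is the output for the neighbourhood whose three cells read as the binary number k. This convention is not described in the text above, and the function name is illustrative.

```python
def elementary_rule_table(rule_number):
    """Return the 8-entry rule table of an elementary CA.

    Entry k is the next state for the neighbourhood (left, centre, right)
    read as the binary number k, assuming Wolfram's numbering convention.
    """
    if not 0 <= rule_number < 256:
        raise ValueError("elementary CA rules are numbered 0 to 255")
    return [(rule_number >> k) & 1 for k in range(8)]


# Under this convention, rule 232 is the local majority rule for r = 1.
majority_table = elementary_rule_table(232)
```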

The first class within Wolfram’s classification scheme consists of those rules that almost always reach the same end state for all cells, so that they can be separated into “almost-always-all-cells-zero” and “almost-always-all-cells-one” rules. Rules of the second class possess a set of different possible end states, sometimes also called attracting states. The space-time diagrams of rules from this class can all be characterised by a few domains of cells that are either fixed to the same state or alternate through a sequence of states periodically. The rules from class three exhibit more complex behaviour, but their space-time diagrams show repeating structures that can be identified easily, for example triangular forms. Finally, the fourth class shows the most complex behaviour. Looking at the space-time diagrams, the observer might discover simple localised structures, which interact with each other in a rather complicated way. These rules also seem to be located somewhere between class two and three in terms of cell activity, that is, the number of state changes over the run. Rules from class one, as Wolfram remarks, tend to reach their end state quickly for all initial configurations, whereas rules from class three continue to spread their patterns and therefore show a high cell activity.

Wolfram suspects that the class-four CA rules are the rules that are capable of universal computation (see [25], pp. 691 to 693). Indeed, it was shown in [5] that this is at least the case for one particular class-four elementary CA. However, a proof for the general case is still missing, since Wolfram also omits a clear definition of the classification scheme (other than looking at the space-time diagrams and interpreting the CA’s behaviour). Despite its informal definition, Wolfram extends his classification scheme to more general classes of CAs in more than one dimension.

Another approach to classifying CA rules was later taken by Langton (see [8] for details). Instead of studying the behaviour of the automaton as it advances over time, Langton formulates a statistic of the CA rule, the λ value. He claims that λ corresponds to the ability of the rule to show long-time dynamic behaviour in terms of state changes. For a binary state CA with S = {0, 1}, this value is simply the fraction of rule table entries that are mapped to “1” by the update rule Φ. Interestingly, Langton notes that CA rules falling into different ranges of λ values exhibit different behaviour and can be characterised by properties similar to the ones Wolfram used for his classification scheme. Langton claims that there is a λ region whose corresponding CA rules show behaviour somewhere between ordered and chaotic, and that those rules are in fact the rules that Wolfram would have classified as class four (see [15]).
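For the binary case described above, the λ value can be computed directly from the rule table. The following is a minimal sketch that assumes the rule is given as a list of output bits, one per neighbourhood configuration; the function name is illustrative.

```python
def langton_lambda(rule_table):
    """Fraction of rule table entries that map to state 1 (binary CA only)."""
    if not rule_table:
        raise ValueError("rule table must not be empty")
    return sum(rule_table) / len(rule_table)


# The r = 1 majority rule maps exactly half of its 8 entries to 1.
example_lambda = langton_lambda([0, 0, 0, 1, 0, 1, 1, 1])
```

Note that λ depends only on the rule table and not on any particular run of the automaton, which is what distinguishes it from a classification based on space-time diagrams.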

Besides CAs according to Definition 2.1, other types of CAs have also been studied, such as those whose cells may possess different update rules or which are based on an irregular connection structure. Sipper applied this approach to the density classification task in [20].


[Figure: a row of cells at time step t and at time step t + 1; the rule table Φ maps the states of the neighbours N1 to N6 and the cell state S_t to the next cell state S_t+1.]

Figure 2.1: CA for the density classification task. Φ can be given in the form of a boolean truth table or lookup table, also called rule table.

2.1.2 The Density Classification Task

Definition 2.2. When considering a one-dimensional binary state CA with S = {0, 1}, n cells and a radius r < n, the density classification task is the following: Starting from a random configuration, the automaton has to relax to the fixed-point state all-cells-zero or all-cells-one after a specified number of time steps N. The task is considered to be solved successfully if and only if this end state corresponds to the state that was taken by the majority of cells in the initial configuration. The fraction of “1”s within a configuration is also denoted as density ρ (see footnote 1). Typically, n is chosen to be an odd number to avoid ρ = 0.5 for the initial configuration.
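The success criterion of Definition 2.2 can be written down in a few lines. The sketch below assumes configurations are lists of 0 and 1; the function names are illustrative and not part of the tools described later.

```python
def density(config):
    """Fraction of cells in state 1."""
    return sum(config) / len(config)


def is_classified_correctly(initial_config, final_config):
    """True if the end state is the fixed point matching the initial majority."""
    n = len(initial_config)
    ones = sum(initial_config)
    if 2 * ones == n:
        # The task is undefined for density 0.5, hence the odd n in practice.
        raise ValueError("initial configuration has no majority")
    target = 1 if 2 * ones > n else 0
    return all(cell == target for cell in final_config)
```

An end state that contains even a single cell in the minority state counts as a failure under this criterion.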

The density classification task was first addressed by Packard in 1988 (see [15] for a review of his work). Packard considered a CA with radius r = 3, so that the neighbourhood of each cell is formed by its three neighbours on each side (see Fig. 2.1). He assumed a spatially periodic boundary condition, which means that the cell lattice forms a closed ring. With a neighbourhood size of six and the dependence of the next state of each cell on its own current state, the rule table has 2^7 = 128 entries. The solution space of CA rules is therefore of size 2^128, which is far too large to be searched exhaustively. For this reason, Packard used a genetic algorithm to evolve the CA rules.
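The size of the rule space follows from a short calculation: a radius r gives 2r + 1 relevant cells, hence 2^(2r+1) rule table entries, each of which can be 0 or 1. A minimal sketch with illustrative function names:

```python
def rule_table_entries(r):
    """Number of neighbourhood configurations for a binary CA of radius r."""
    return 2 ** (2 * r + 1)


def rule_space_size(r):
    """Number of distinct binary CA rules of radius r."""
    return 2 ** rule_table_entries(r)


# For r = 3 the table has 128 entries and the rule space has 2**128 members,
# a number with 39 decimal digits.
digits_for_r3 = len(str(rule_space_size(3)))
```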

Packard’s approach includes a population of CA rule individuals whose chromosomes contain the bit-string representation of the CA rule table content, making the transfer from the genotype, the bit-string encoding, to the phenotype, the CA rule table, simple and straightforward. The genotype representation uses a lexicographic numbering scheme corresponding to the order of neighbourhood cells in the CA lattice (with the state of the particular cell in the centre), as described in Fig. 2.1. According to this representation, the cell configuration “0000000” is mapped to the first bit of the CA rule chromosome, “0000001” to the second and so on. The genetic operators performing crossover between two individuals and mutation of chromosomes are implemented by exchanging parts of the bit-strings between individuals and by random bit flipping, respectively. More precisely, one-point crossover is used. See Section 2.2 for the terminology concerning genetic algorithms.

Footnote 1: Therefore, the density classification task is sometimes also called the Majority Classification Problem or the ρ_c = 1/2 problem.

Packard’s algorithm as described in [15] is summarised below:

1. Initialise the CA rule population randomly according to a uniform density distribution over the interval [0, 1].

2. Evaluate the fitness of each individual by performing a number of runs with a fixed number of CA steps. Each run is initialised with a different randomly generated configuration, while the density of these configurations follows a uniform distribution over the interval [0, 1]. Then, compute the fraction of cells in the end state of the automaton that are in the same state as the majority of cells in the initial configuration. This is the score of the CA rule individual for the particular initial configuration. Sum up all scores to compute the overall fitness of the CA rule individual.

3. Rank all individuals in fitness order.

4. Take a fraction of the best individuals over to the next generation and form the remaining individuals by performing crossover and mutation between the best individuals.

5. Repeat from step 2 (see footnote 2).
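The five steps above can be sketched as a small program. This is an illustration under stated assumptions, not Packard's or Mitchell's code: the parameter defaults are deliberately tiny, all rules of one generation are scored on the same set of initial configurations, parents are drawn uniformly from the elite, and all function names are hypothetical. The rule table index follows the lexicographic scheme described earlier (“0000000” is entry 0, “0000001” is entry 1).

```python
import random

RADIUS = 3
TABLE_SIZE = 2 ** (2 * RADIUS + 1)  # 128 rule table entries for r = 3


def neighbourhood_index(config, i, r=RADIUS):
    """Lexicographic index of the neighbourhood of cell i on a ring."""
    index = 0
    for offset in range(-r, r + 1):
        index = (index << 1) | config[(i + offset) % len(config)]
    return index


def run_ca(rule, config, steps):
    """Run the CA for a fixed number of synchronous steps."""
    for _ in range(steps):
        config = [rule[neighbourhood_index(config, i)] for i in range(len(config))]
    return config


def random_bits(length, rng):
    """Random bit list whose density is drawn uniformly from [0, 1]."""
    ones = rng.randint(0, length)
    bits = [1] * ones + [0] * (length - ones)
    rng.shuffle(bits)
    return bits


def score(rule, config, steps):
    """Fraction of end-state cells that equal the initial majority state."""
    majority = 1 if 2 * sum(config) > len(config) else 0
    final = run_ca(rule, config, steps)
    return sum(cell == majority for cell in final) / len(final)


def one_point_crossover(a, b, rng):
    """Child takes the head of a and the tail of b."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:]


def mutate(rule, rate, rng):
    """Flip each bit independently with the given probability."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in rule]


def evolve(pop_size=20, elite=5, n_cells=21, steps=42, n_configs=10,
           generations=3, mutation_rate=0.03, seed=0):
    rng = random.Random(seed)
    # Step 1: random rules with uniformly distributed density.
    population = [random_bits(TABLE_SIZE, rng) for _ in range(pop_size)]
    for _ in range(generations):  # Step 5: repeat.
        configs = [random_bits(n_cells, rng) for _ in range(n_configs)]
        # Step 2: fitness is the sum of scores over all initial configurations.
        fitness = [sum(score(rule, c, steps) for c in configs)
                   for rule in population]
        # Step 3: rank individuals by fitness, best first.
        order = sorted(range(pop_size), key=lambda k: fitness[k], reverse=True)
        # Step 4: keep the elite, fill up with mutated crossover children.
        best = [population[k] for k in order[:elite]]
        children = [mutate(one_point_crossover(rng.choice(best),
                                               rng.choice(best), rng),
                           mutation_rate, rng)
                    for _ in range(pop_size - elite)]
        population = best + children
    return population
```

Because the fitness in step 2 counts correct cells rather than fully correct end states, a rule receives partial credit for configurations it does not classify completely.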

The focus of Packard’s work was not to develop a CA rule that can perform the density classification task well or even solve it, but to show that the genetic algorithm would evolve rules with a particular bias towards rules with a certain density, denoted as critical density. This idea was motivated by Langton’s concept of “computation at the edge of chaos” (see [15]) in connection with CA rules. Langton assumed, similar to Wolfram, that reasonable CA computation could only be possible within a region of the rule space located between order and disorder.

Mitchell and her colleagues tried to reproduce Packard’s results in [15] in order to examine the drift towards certain λ values within the CA population that Packard had reported. Although they used the same parameter values as Packard, as far as these could be reconstructed, Mitchell et al. were unable to observe the drift. Additionally, they introduced a stochastic method to determine the number of steps the CA was allowed to run before its state was considered to be the end state. In this way, they tried to avoid CA rules that alternate between the all-cells-zero and the all-cells-one state, as those were considered undesirable. The other parameter values of the genetic algorithm can be found in Tab. 2.1. Mitchell et al. compared their results with the best CA rule for the density classification task known at that time, the Gacs-Kurdyumov-Levin (GKL) rule. According to Mitchell, GKL, constructed by hand by Gacs, Kurdyumov and Levin, was originally not designed to perform any particular computation task, but rather to study computation under the effect of errors. GKL classifies local majorities well and has only two attracting end states, the all-cells-zero and the all-cells-one state. Nevertheless, it does not solve the density classification task for every initial configuration either, but it performs significantly better than the rules evolved by Mitchell’s genetic algorithm.

Footnote 2: Typical evolution run length: about 100 generations.

Table 2.1: Parameters of the genetic algorithm used by Mitchell in [15]

CA cells                                149
CA radius                               3
CA run length                           ≈ 298
CA rule population size                 100
initial configurations per generation   300
generation gap                          0.5
bit mutation rate                       0.03

[Figure: three space-time diagrams; horizontal axis: cells 1 to 149, vertical axis: time step t from 0 to 298.]

Figure 2.2: Space-time diagrams of the GKL rule. Cells in state “1” are shown in black, cells in state “0” in white. Domains that seem to be coloured in grey are in fact filled with a checkerboard pattern of alternating “0” and “1” states.

See Fig. 2.2 for some space-time diagrams of the GKL rule performing the density classification task. It is worth mentioning the different kinds of domains that appear within the diagrams. Besides the domains of consecutive strands of “1”s and “0”s, a checkerboard pattern can be seen propagating through the automaton. This pattern can be thought of as a region of cells where the majority has not yet been determined locally, but has to be considered on a greater scale from the automaton’s point of view. These kinds of domains interact with other domains at their boundaries according to certain rules. For GKL, one can easily identify the regularity behind this scheme.
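The text does not spell out the GKL update itself. The sketch below uses the definition commonly given in the literature, which is an assumption here rather than something taken from this work: a cell in state 0 takes the majority of itself, its left neighbour and the cell three positions to its left, while a cell in state 1 takes the majority of itself, its right neighbour and the cell three positions to its right. Function names are illustrative.

```python
def gkl_step(config):
    """One synchronous GKL update on a ring of binary cells (assumed definition)."""
    n = len(config)
    new_config = []
    for i, state in enumerate(config):
        if state == 0:
            # Cells in state 0 look to the left.
            votes = state + config[(i - 1) % n] + config[(i - 3) % n]
        else:
            # Cells in state 1 look to the right.
            votes = state + config[(i + 1) % n] + config[(i + 3) % n]
        new_config.append(1 if votes >= 2 else 0)
    return new_config


def run_gkl(config, steps):
    """Return the configuration after the given number of GKL steps."""
    for _ in range(steps):
        config = gkl_step(config)
    return config
```

Under this definition both homogeneous configurations are fixed points, and a checkerboard region inverts itself in every step, which is consistent with the grey-looking domains described above.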

A more general approach was studied by Das, Mitchell and Crutchfield in [7], whoexamined CA behaviour and presented a scheme to classify behaviour based on theseboundary structures, which they call particles. The particle based classification is partof the “computational mechanics” framework, which is briefly outlined in [7]: Thespace-time diagrams of the rule under consideration are scanned for recurring domainsof the same pattern of cell states. Each domain pattern can then be described by adeterministic FSM that accepts exactly the cell configurations of this pattern. Afterconstructing one for each pattern, these FSMs are employed as filters running on thespace-time diagrams to extract recurring irregularities between the domains, whichare then identified as particles. The quantity of particles and the complexity of theirinteractions can be used to determine the computational capability of the automaton.Das et al. also present the performance values for the best CA rules they were able toevolve using their algorithm, which could still not outperform GKL.

The density classification task is an interesting problem case on its own, mainly because it includes the concept of global computation: Cells have to cooperate to compute a global property based on local information. To successfully accomplish this task, they have to efficiently communicate over a possibly long distance, in terms of links between the cells. Information has to be processed efficiently on a global scale because the amount of local memory is strictly limited.

After Mitchell et al. had presented the unsatisfactory results delivered by their genetic algorithm, the research community started a race to find the best CA rule for the density classification task. However, it was proven in [13] that no binary state cellular automaton is able to solve the task perfectly. Nevertheless, an upper bound for the rule performance is still unknown and rules with increasingly better performance have been found.

In [1] for example, Koza describes an approach, based on the genetic programming paradigm, to develop rules that for the first time performed significantly better than human-generated rules like GKL. In this approach, the chromosome that encodes the candidate solution (in this case a CA rule) is represented as a high-level tree structure. So for the density classification task, instead of using the rule table entries directly, as had been done before, Koza and his colleagues represented the CA rule as a combination of boolean functional primitives to reach a denser neighbourhood structure among the candidate solutions. Consequently, this approach requires more sophisticated genetic operators for mutation and recombination but also introduces a much higher degree of freedom to model these operators. The fitness evaluation becomes more expensive because of the more complex translation from genotype to phenotype. Nevertheless, the computational expense caused by the increase in complexity is negligible in comparison with the cost of performing the CA runs.

Later, Juille and Pollack considered a quite different approach in [12]. They examined the progress of rules achieved by the genetic algorithm of Mitchell [7, 14, 15] and identified one key problem in the quickly flattening learning curve of the rules. Instead, the CA rules should be confronted step-by-step with a slightly more challenging situation, as soon as they are making progress in terms of classifying more and in some sense more “difficult” initial configurations. See the next chapter for details. Juille and Pollack were able to evolve significantly better rules than Mitchell and Koza and could establish the current record according to the knowledge of the author.

Table 2.2: Overview of rules evolved for the density classification task, partly taken from [1]

Rule                                                        Performance
Best rule evolved by standard genetic algorithm, see [7]    76.9%
GKL 1978, human-written                                     81.6%
Davis 1995, human-written                                   81.8%
Das 1995, human-written                                     82.178%
Koza 1996, evolved by genetic programming, see [1]          82.326%
Juille and Pollack 1998, evolved by coevolution, see [12]   86.0%

The particular performance values for the CA rules discovered over the years are summarised in Tab. 2.2. Note that these values are computed for a large set of initial configurations, typically 10^5 to 10^7, whose density ρ is binomially distributed with a peak at ρ = 0.5. The fraction of correctly classified configurations forms the performance. This method differs from the fitness computation typically used for CA rule individuals during the evolution process, but it better reflects the differences between the rules because most rules perform equally well for initial configurations with relatively high or low density.
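The performance measure described above can be illustrated with a minimal Python sketch. Several details are assumptions made for illustration and are not taken from this section: the GKL update is written in its commonly stated form (a cell in state 0 takes the majority of itself and its first and third left neighbours, a cell in state 1 uses the corresponding right neighbours), each run is limited to 2N steps, and the names `gkl_step`, `classifies_correctly` and `performance` are hypothetical.

```python
import random

def gkl_step(cells):
    """One synchronous update of the GKL rule on a ring of binary cells (assumed form)."""
    n = len(cells)
    nxt = []
    for i, c in enumerate(cells):
        if c == 0:
            # State 0: majority of the cell, its 1st and its 3rd left neighbour.
            votes = c + cells[(i - 1) % n] + cells[(i - 3) % n]
        else:
            # State 1: majority of the cell, its 1st and its 3rd right neighbour.
            votes = c + cells[(i + 1) % n] + cells[(i + 3) % n]
        nxt.append(1 if votes >= 2 else 0)
    return nxt

def classifies_correctly(step, cells, max_steps):
    """True if all cells agree with the initial majority after max_steps updates."""
    majority = 1 if 2 * sum(cells) > len(cells) else 0
    for _ in range(max_steps):
        cells = step(cells)
    return all(c == majority for c in cells)

def performance(step, n_cells=149, n_samples=1000, seed=0):
    """Fraction of correctly classified random initial configurations."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_samples):
        # Independent fair coin flips yield a binomially distributed density
        # with its peak at 0.5.
        cells = [rng.randint(0, 1) for _ in range(n_cells)]
        # Assumption: a run is limited to twice the number of cells.
        correct += classifies_correctly(step, cells, max_steps=2 * n_cells)
    return correct / n_samples
```

With an odd number of cells the majority is always well defined. A call such as `performance(gkl_step, n_samples=10**5)` would approximate the kind of value listed in Tab. 2.2, although a plain Python loop of this size is slow.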

2.2 Coevolution of CA rules for the Density Classification Task

First, a short review of coevolutionary optimisation algorithms is given, mainly concentrating on terminology in comparison to general evolutionary algorithms. Subsequently, the applications of the coevolution paradigm to the density classification task are described with a special focus on the approach of Juille and Pollack, which forms the starting point of the coevolutionary framework used in this work.

2.2.1 A Short Review of Coevolutionary Algorithms

The application of heuristic optimisation algorithms, such as genetic algorithms, has already proven successful for a whole range of problems. However, there are difficulties one has to overcome to model an optimisation problem in a form suitable for this type of algorithm. The representation of the candidate solutions and the genetic operators working on them, as well as finding an appropriate fitness function to guide the selection process, have to be considered. For coevolutionary genetic algorithms essentially the same kind of questions have to be answered and some new ones arise. Before going into detail, it is advisable to fix the terminology in order to highlight the difference between coevolutionary and traditional genetic algorithms later. The following definition does not capture every aspect of a genetic algorithm and should only be used to clarify terminology.

Definition 2.3. A genetic algorithm (GA) is a heuristic optimisation algorithm motivated by evolution in nature, which works on a population of candidate solutions, the individuals. It tries to emulate the vital processes involved in evolution (selection, reproduction and mutation) and includes the profound principle of “survival of the fittest”. Consequently, it is necessary to formulate a fitness function that successfully expresses the capability of an individual, which means the suitability of the candidate solution it represents for the optimisation problem.

Two representations of candidate solutions can therefore be distinguished: The first is the genotype, the bit-string encoding of the candidate solution, forming the genome³ of the individual. The second is the phenotype, the outward manifestation of the individual, which contains the traits encoded in the genotype and determines the fitness depending on environmental influences.

The GA typically proceeds stepwise, processing one generation of individuals during each cycle, as follows:

Initialisation: The initial generation is formed of randomly constructed individuals that represent valid candidate solutions.

Selection: At the beginning of the generation cycle, the fitness function is applied to each individual. Thereafter, the individuals are normally ranked according to their fitness and a sampling algorithm is used to preferably select couples of high fitness individuals for mating.

Reproduction: Each couple that was chosen for mating produces two offspring individuals by applying the crossover operator to the parents’ genomes. The simplest of such operators, the one-point crossover, chooses a cut point for both of the parent genomes and pairwise exchanges the chunks of genetic code after this point to produce the offspring.

Mutation: The newly generated offspring individuals are exposed to random bit-mutations, according to a pre-defined mutation rate.

Elitism: If the concept of elitism is included in the GA, a fraction of the generation (the individuals with the highest fitness) is taken over to the next generation without modification. Otherwise, the next generation is exclusively formed of the offspring individuals resulting from the reproduction stage. The fraction of newly created individuals for each generation is called generation gap. The next generation cycle is started from the selection stage.

³ Or rather: forming one or several chromosomes within the genome
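The generation cycle outlined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: genomes are lists of bits, selection uses linear rank weights, and all function names and default parameter values (`elite_fraction`, `mutation_rate`) are hypothetical choices rather than part of a particular published algorithm.

```python
import random

def one_point_crossover(mum, dad, rng):
    # Choose one cut point and exchange the tails of the two parent genomes.
    cut = rng.randrange(1, len(mum))
    return mum[:cut] + dad[cut:], dad[:cut] + mum[cut:]

def mutate(genome, rate, rng):
    # Flip each bit independently with probability `rate`.
    return [bit ^ 1 if rng.random() < rate else bit for bit in genome]

def next_generation(population, fitness, rng, elite_fraction=0.2, mutation_rate=0.02):
    # Selection: rank the individuals by fitness, best first.
    ranked = sorted(population, key=fitness, reverse=True)
    # Elitism: copy the best individuals unchanged.
    n_elite = int(elite_fraction * len(ranked))
    offspring = [list(g) for g in ranked[:n_elite]]
    # Assumption: linear rank weights bias the sampling towards fit parents.
    weights = [len(ranked) - r for r in range(len(ranked))]
    while len(offspring) < len(population):
        mum, dad = rng.choices(ranked, weights=weights, k=2)
        # Reproduction and mutation: each couple yields two mutated children.
        for child in one_point_crossover(mum, dad, rng):
            if len(offspring) < len(population):
                offspring.append(mutate(child, mutation_rate, rng))
    return offspring

def run_ga(fitness, genome_length, pop_size, generations, seed=0):
    rng = random.Random(seed)
    # Initialisation: random bit strings.
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population = next_generation(population, fitness, rng)
    return max(population, key=fitness)
```

For example, `run_ga(sum, 16, 20, 30)` evolves bit strings towards the all-ones genome, with `sum` acting as a toy fitness function. With these defaults the generation gap is 1 − `elite_fraction` = 0.8.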

Despite the fact that the principles of GAs are universal enough to model a wide class of optimisation tasks, there are problems when the fitness function itself is not well-defined or distorted by noise.

Consider a typical problem-learner scenario, where there is a set of problem definitions contained in a problem set. The goal of the optimisation procedure is to evolve parameters for a problem solver to be able to solve all, or at least as many of the problems as possible. Following the typical GA modelling approach, one would probably design a population of solvers and evaluate their fitness by the ability to solve elements of the problem set. However, if this set is only a subset of all possible problem cases, which may be much larger or even infinite, then the fitness evaluation is necessarily imprecise. As the fitness function is the key element in steering the evolutionary progress in the right direction, the overall performance would possibly be suboptimal.

This is the main motivation Juille and Pollack give in [12] to apply the coevolution approach to the density classification task.

Definition 2.4. The concept of a coevolutionary genetic algorithm (CGA) is the generalisation of the genetic algorithm model that includes more than one population. The individuals of different populations belong to different classes of individuals, also called species. The populations are not allowed to mix between different species or produce mutual offspring. However, they might affect each other through their fitness functions.

CGAs can be classified into cooperative models, where the species are trying to reach a mutual goal, and competitive coevolutionary genetic algorithms, where each species tries to reach its own goal, possibly defeating another species.

A simple competitive CGA for the situation described above would model two species: The species of problems and the species of problem solvers. The fitness function of the solvers would reward them in proportion to the degree they defeat problem individuals by solving the problems that these represent. Conversely, problem individuals would be rewarded for being too difficult to be solved by solvers.
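A minimal Python sketch of such a pair of competing fitness functions follows. It assumes that the outcome of all tests is available as a boolean matrix and that each side is rewarded with a simple fraction; both the name `competitive_fitness` and this particular scoring are illustrative assumptions.

```python
def competitive_fitness(solved):
    """solved[s][p] is True if solver s solves problem p."""
    n_solvers = len(solved)
    n_problems = len(solved[0])
    # Solvers are rewarded for the fraction of problems they solve.
    solver_fitness = [sum(row) / n_problems for row in solved]
    # Problems are rewarded for the fraction of solvers that fail on them.
    problem_fitness = [
        sum(1 for s in range(n_solvers) if not solved[s][p]) / n_solvers
        for p in range(n_problems)
    ]
    return solver_fitness, problem_fitness
```

For `solved = [[True, False], [True, True]]` the solvers score 0.5 and 1.0, while the problems score 0.0 and 0.5, so the problem that defeats one solver is the fitter one.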

CGAs have been applied to a number of scenarios similar to the problem-solver situation described above. See Fig. 2.3 for a comparison between the GA and CGA approach. Some included competitive models, often used in analogy to a “predator-prey” relationship between the species. Hillis was one of the first to introduce such a model for coevolving sorting networks. See [2] for details. Paredis later used a particular form of competitive coevolution to evolve solutions for a neural net learning and a constraint satisfaction problem in [18]. Others have followed the cooperative approach, by letting different populations work hand-in-hand and try to maximise the overall solution performance. Husbands, for example, applied a cooperative multi-population approach to the job scheduling problem in [9] to demonstrate the possible application of CGAs to multi-criteria and multi-constraint optimisation problems.



[Figure 2.3: two block diagrams, each relating Evaluation, Modification, Environment, Training, Exploration and Solution.
(a) Evolutionary optimisation of candidate solutions: the training environment must contain implicit knowledge of the problem situation; fixed learning gradient.
(b) Coevolutionary optimisation of candidate solutions and training set: less knowledge of the problem situation has to be integrated into the training environment; variable learning gradient.]

Figure 2.3: Evolutionary versus coevolutionary optimisation

2.2.2 Coevolution and the Density Classification Task

According to the author’s knowledge, Paredis was the first to examine the effect of CGAs on the density classification task in [19]. He encountered a problem in connection with the Red Queen effect. The Red Queen effect is normally regarded as desirable, and one could claim it as an advantage of coevolutionary genetic algorithms over traditional genetic algorithms. According to Paredis, the effect was named after a character of the novel “Alice in Wonderland”. The Red Queen explains to Alice that in the queen’s world, individuals are forced to move constantly in order to stay at the same place. Therefore, it is used as an analogy to the driving force of adaptability and evolutionary progress needed not to fall behind the opposing population, especially in competitive coevolutionary algorithms.

Paredis’ approach involves two different species, the species of initial configurations and the species of CA rules, standing in a competitive relationship to one another. The chromosomes of each species are encoded bitwise, as already seen for the CA rules while considering the standard genetic algorithm approach. The fitness functions of the two populations reflect the competition between them: An initial configuration scores if it is hard for the CA rules to classify, and for a rule it is desirable to classify as many of the configurations from the current population as possible.

In contrast to traditional genetic algorithms, which evaluate each individual once per iteration by applying the fitness function to it, Paredis followed a different approach he already introduced in [18]: The individuals of each population are evaluated continuously over their lifetime. During each iteration, a fixed number of couples of individuals from the different populations are selected and evaluated depending on each other. This is called an encounter. The fitness of each individual depends on the outcome of its last 20 encounters and the selection of the particular individuals to participate in an encounter is biased towards individuals with a high fitness. In the first version of the coevolution algorithm, the same kinds of genetic operators are used for both species: crossover, to combine two parent chromosomes to produce offspring, and random bit mutation.
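The lifetime evaluation over a sliding window of encounters can be sketched as follows in Python. The text only states that the fitness depends on the last 20 encounters; taking the plain number of wins within this window is an assumption, and the class `Individual` and the function `encounter` are hypothetical names.

```python
from collections import deque

class Individual:
    def __init__(self, genome, history_length=20):
        self.genome = genome
        # Only the outcomes of the most recent encounters are remembered.
        self.history = deque(maxlen=history_length)

    def record(self, won):
        self.history.append(1 if won else 0)

    @property
    def fitness(self):
        # Assumption: fitness is the number of wins within the window.
        return sum(self.history)

def encounter(rule, config, rule_wins):
    """Evaluate one rule against one configuration and update both histories."""
    won = rule_wins(rule.genome, config.genome)
    rule.record(won)
    # The competition is zero-sum: the configuration wins if the rule fails.
    config.record(not won)
```

Here `rule_wins` stands for any function that runs the CA rule on the configuration and reports whether the classification was correct.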

Paredis reports the following problem: After initialising each population with individuals whose density is uniformly distributed in the interval [0, 1] (as Mitchell initialised the CA rule population in [15]), the population of initial configurations quickly specialises for either a density greater than 0.5, or lower than 0.5.⁴ As just one bit is necessary to cross the border between these two regions, the population of configurations is still highly flexible after this specialisation because random bit mutations are possible. The CA rule population, on the other hand, will be trained either to classify mostly high densities correctly, or low densities, depending on which kind of configuration forms the majority. A quick consideration shows that rules with a high density more easily classify initial configurations with a high density than with a low density, and vice versa. Consequently, after the initial configurations have specialised for, say, high densities, the CA rules will experience a great selection pressure towards rules with a high density themselves because the evolutionary drive forces them to adapt (the Red Queen effect). This process, however, takes far more time for the rules than for the configurations, as the number of bits that have to be flipped might be large. Additionally, there are complex interrelations between the single bit positions of the rule table, so that bitwise mutations can easily destroy the rule’s ability to reach a consensus state.

As a consequence, the population of configurations outperforms the population of rules most of the time, until the rules adapt and themselves outperform the configurations for a little while. This process is continuously repeated, so that no high performance rules can emerge.

Paredis presents a possible solution to overcome the problem resulting from the Red Queen effect by partly violating the coevolution paradigm. The initial configurations are no longer evolved, but new configurations are generated randomly with uniformly distributed density, so that there is no inheritance between configurations anymore. Still, the selection of initial configurations is based on their fitness, so that configurations that are more difficult to classify are more often chosen for encounters and remain longer within the population. According to Paredis, the performance of the evolved rules subsequently improved and came close to the results reported by Mitchell in [15].

⁴ The number of cells was chosen to be odd, as seen before, therefore avoiding the possibility of initial configurations with density 0.5.

In addition to the problem caused by the Red Queen effect, Juille and Pollack describe another problem that coevolutionary algorithms are prone to in [12]. The authors depict a situation in which groups of average performance individuals coexist in a single population in a static manner. No changes occur inside the population because each change comes together with a lower overall performance. Juille and Pollack refer to this situation as mediocre stable or meta-stable state. They also note that the problem caused by the Red Queen Effect in past experiments is actually due to the loss of traits within the CA rule population. Rule individuals tend to lose their ability to correctly classify older configurations if those are not present in the current configuration population.

Juille and Pollack present a coevolution framework in [12] and apply it to the density classification task as a reference example. Although the main coevolution setup, consisting of two populations that belong to different species standing in a competitive relationship, is the same as Paredis used before, the two approaches are in fact quite different. In the following, the work of Juille and Pollack, published in [11, 12], will be discussed, as the approach presented in the next chapter is based on this concept.

2.2.3 Juille and Pollack’s Approach to the Density Classification Task

An outline of the basic algorithm follows. For the particular parameter values see Tab. 2.3.

1. Both populations are initialised according to a uniform density distribution over the interval [0, 1].

2. For each initial configuration individual, a new configuration individual is generated with the same density.

3. The rule individuals are tested against the configurations generated in the step before.

4. • The best 95% of initial configurations are taken over to the next generation. The remaining 5% of configurations for the next generation are created randomly according to a uniform density distribution over the interval [0, 1].

   • The best 20% of rules are taken over to the next generation. The remaining 80% of rules for the next generation are formed by recombination (including a one-point crossover and random bit mutation) between the top 20% of individuals.
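Steps 2 and 4 of this outline can be sketched in Python as shown below. This is a simplified illustration and not the implementation of Juille and Pollack: individuals are plain bit lists, the fitness is passed in as a function, parents are paired uniformly at random, only one child is produced per pairing, and all function names are hypothetical.

```python
import random

def random_config(n_cells, rng):
    # Draw the number of ones uniformly, so that the density is uniform over [0, 1].
    ones = rng.randint(0, n_cells)
    cells = [1] * ones + [0] * (n_cells - ones)
    rng.shuffle(cells)
    return cells

def resample_same_density(config, rng):
    # Step 2: a new configuration with the same number of ones.
    cells = list(config)
    rng.shuffle(cells)
    return cells

def next_configs(configs, fitness, rng, gap=0.05):
    # Step 4, first item: keep the best 95% and add 5% random configurations.
    ranked = sorted(configs, key=fitness, reverse=True)
    n_new = round(gap * len(configs))
    survivors = ranked[:len(configs) - n_new]
    return survivors + [random_config(len(configs[0]), rng) for _ in range(n_new)]

def next_rules(rules, fitness, rng, gap=0.8, mutation_rate=0.02):
    # Step 4, second item: keep the best 20% and breed the rest from them.
    ranked = sorted(rules, key=fitness, reverse=True)
    n_keep = len(rules) - round(gap * len(rules))
    parents = ranked[:n_keep]
    children = []
    while len(children) < len(rules) - n_keep:
        # Assumption: both parents are drawn uniformly from the survivors.
        a, b = rng.sample(parents, 2)
        cut = rng.randrange(1, len(a))
        child = a[:cut] + b[cut:]
        children.append([bit ^ 1 if rng.random() < mutation_rate else bit
                         for bit in child])
    return parents + children
```

The default values for `gap` and `mutation_rate` follow the generation gaps and the bit mutation rate listed in Tab. 2.3.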


Table 2.3: Summary of coevolution parameters used by Juille and Pollack

Parameter                                   Value
CA cells                                    149
CA radius                                   3
CA rule population size                     400 / 1000
CA initial configuration population size    400 / 1000
Rule generation gap                         0.8
Initial configuration generation gap        0.05
Bit mutation rate                           0.02

⁵ Typical evolution run length: several thousand generations

The main goals of the implementation of the coevolution concept that Juille and Pollack state are:

• to ensure that the CA rule individuals are provided an optimal learning gradient. This means that the initial configurations should be neither too easy (so that most rules succeed), nor too difficult to classify (so that most rules fail) because the evolution would otherwise be unable to pick the most promising individuals from the rule population. As the number of configurations that the rules are tested against is small compared to the total space of configurations, the initial configurations should be able to differentiate effectively between promising rules and those which are not. In this sense, they should be informative.

• to prevent rule individuals from losing traits, in the sense that they “forget” what they have learnt some generations before. In practice, the CA rules always have to be tested against some older initial configurations during each iteration, so that the Red Queen Effect cannot negatively influence the evolutionary process.

• to maintain a constant pressure on the rule individuals to correctly classify initial configurations that are just beyond their reach, as this favours rules that are able to adapt best to changes in the evolution environment. Conversely, it is assumed that the rules that are able to adapt best to changes in the set of configurations they have to compete against are more likely to successfully classify even more difficult configurations. Thus, Juille and Pollack claim that adaptation is the driving force for improvement.

Juille and Pollack also introduced the concept of resource sharing to the coevolution of CA rules and initial configurations. This concept had already been addressed by the same authors in [10], where they compared a genetic programming approach to the Intertwined Spirals Problem based on relative fitness computation with one based on absolute fitness computation. Juille and Pollack were able to show that for the particular problem case, a relative fitness function, such as one based on resource sharing, facilitates the evolution of better individuals compared to the individuals evolved using an absolute fitness function.

The principle of resource sharing is intuitively clear: Instead of treating the event of a successfully performed test independently for each individual and each test case, the payoff for success depends on how many other individuals have been successful for the same test case. Juille and Pollack claim that this approach supports a greater diversity within the population compared to the absolute fitness case and therefore explores the solution space more efficiently.

This claim is also comprehensible, given that in the absolute fitness case, individuals with an outstanding performance will tend to flood the population if no measures are taken to prevent this (e.g. a spatial distribution of the individuals). Using resource sharing, moderate performance individuals also have a chance to reproduce if they are able to succeed on test cases that are rarely solved correctly. At a later stage, the evolution might generate an individual that combines the different traits.

To reach their aims formulated above, Juille and Pollack introduced characteristics for CA rule and initial configuration individuals. In the following, R denotes the population of CA rule individuals and IC denotes the population of initial configuration individuals. Let covered be a |R| × |IC| matrix indicating which CA rule individual of the current rule generation could defeat which initial configuration individual of the current generation of configurations. So covered_{i,j} = 1 if rule i classified configuration j correctly, otherwise covered_{i,j} = 0.

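Definition 2.5. The weight weight_IC_i of the initial configuration individual IC_i is computed as follows:

\[
\mathit{weight\_IC}_i :=
\begin{cases}
\dfrac{1}{\sum_{k=1}^{|R|} \mathit{covered}(k, i)} & \text{if } \exists k : \mathit{covered}(k, i) = 1 \\[1ex]
1 & \text{otherwise}
\end{cases}
\]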

Intuitively, the more rules fail to classify a configuration IC_i correctly, the higher its weight. Here the concept of resource sharing among the rule individuals is implemented. For the weight of a CA rule individual, it is necessary to introduce another characteristic.
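As a small worked example, the configuration weights of Definition 2.5 can be computed from a covered matrix as in the following Python sketch. The nested-list representation and the name `ic_weights` are assumptions made for illustration.

```python
def ic_weights(covered):
    """covered[k][i] == 1 if rule k classifies configuration i correctly."""
    n_ics = len(covered[0])
    weights = []
    for i in range(n_ics):
        # Number of rules that classify configuration i correctly.
        solvers = sum(row[i] for row in covered)
        # If no rule covers the configuration, the weight is set to 1.
        weights.append(1.0 / solvers if solvers > 0 else 1.0)
    return weights
```

For three rules and three configurations with `covered = [[1, 1, 0], [1, 0, 0], [1, 0, 0]]`, the first configuration is classified by all three rules and receives the weight 1/3, whereas the second one is classified by a single rule and receives the weight 1.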

Definition 2.6. For any rule R_i and initial configuration IC_j with density ρ(IC_j) the informational value E(R_i, ρ(IC_j)) is defined as follows:

\[
E(R_i, \rho(IC_j)) :=
\begin{cases}
\log(2) + p \cdot \log(p) + q \cdot \log(q) & \text{if } p > 0 \wedge q > 0 \\
1 & \text{otherwise}
\end{cases}
\]

where p is the fraction of initial configurations with density ρ(IC_j) that R_i classifies correctly and q is the fraction of these configurations that R_i fails to classify correctly. Remark: q = 1 − p and E() can be regarded as the inverse of the entropy, giving evidence whether R_i performs strictly better or strictly worse than random guessing for configurations with density ρ(IC_j). See Fig. 2.4 for a plot of the function.
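The informational value can be evaluated directly, as in the following Python sketch. It assumes the natural logarithm and reproduces the “otherwise” branch exactly as stated in Definition 2.6; the function name `informational_value` is hypothetical.

```python
import math

def informational_value(p):
    """Informational value for a fraction p of correctly classified configurations."""
    q = 1.0 - p
    if p > 0 and q > 0:
        # Assumption: log denotes the natural logarithm.
        return math.log(2) + p * math.log(p) + q * math.log(q)
    # "Otherwise" branch as stated in Definition 2.6.
    return 1.0
```

The value vanishes for p = 0.5, where the rule does no better than random guessing, and is symmetric in p and q, so that p = 0.1 and p = 0.9 both yield approximately 0.368.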

The informational value E() gives information about the relevance of the correct or false classification of an initial configuration with the given density. Note that p cannot be determined in practice because the knowledge of the capability of the rule individual is necessarily imprecise. Instead, an approximation based on the configurations in the current generation has to be used. To merge data and get a more concise value, the range of density values is aggregated to density bins of size 2. p is then computed as the fraction of correctly classified configurations lying within the particular density bin.

[Plot of log(2) + p ∗ log(p) + q ∗ log(q); horizontal axis: p from 0.1 to 0.9, vertical axis: Informational Value E() from 0 to 0.6]

Figure 2.4: Informational value E() depending on p

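Definition 2.7. The weight weight_R_i of the CA rule individual R_i is computed as follows:

\[
\mathit{weight\_R}_i :=
\begin{cases}
\dfrac{1}{\sum_{k=1}^{|IC|} E(R_i, \rho(IC_k)) \cdot \mathit{covered}(i, k)} & \text{if } \exists k : E(R_i, \rho(IC_k)) \cdot \mathit{covered}(i, k) \neq 0 \\[1ex]
1 & \text{otherwise}
\end{cases}
\]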

If the informational value is disregarded, then one can see that the more initial configurations the rule fails to classify, the higher its weight. Although this may seem counterintuitive, one should note that the rule weight will be part of the fitness computation for initial configurations, as seen below. For an initial configuration, it is favourable that it is classified correctly only by a minimal number of rules because the populations are competing against each other. On the other hand, if a particular rule defeats it, then this event is less significant if this rule defeats all other configurations as well. Here, the concept of resource sharing is implemented for the initial configuration population. Because configurations should be informative in the sense that random guessing of rules should be avoided, the informational value is added to obtain a weighted sum.

The fitness functions for CA rule and initial configuration individuals are then defined using the characteristic values above.

Definition 2.8. The fitness f(IC_i) of an initial configuration IC_i is computed as follows:

\[
f(IC_i) := \sum_{k=1}^{|R|} \mathit{weight\_R}_k \cdot E(R_k, \rho(IC_i)) \cdot (1 - \mathit{covered}(k, i))
\]



[Three space-time diagrams; horizontal axis: cells 1 to 149, vertical axis: time t from 0 to 298]

Figure 2.5: Space-time diagrams of the Coevolution2 rule by Juille and Pollack: Cells in state “1” are indicated in black, “0”-state cells are indicated in white.

Definition 2.9. The fitness f(R_i) of a CA rule individual R_i is computed as follows:

\[
f(R_i) := \sum_{k=1}^{|IC|} \mathit{weight\_IC}_k \cdot \mathit{covered}(i, k)
\]
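The rule weight and the two fitness functions can be combined in a short Python sketch. The matrix `info`, whose entry `info[k][i]` stands for E(R_k, ρ(IC_i)), and all function names are assumptions made for illustration; how the informational values are estimated from density bins is left out.

```python
def rule_weights(covered, info):
    """Rule weights; info[i][k] stands for E(R_i, rho(IC_k))."""
    weights = []
    for cov_row, info_row in zip(covered, info):
        total = sum(e * c for e, c in zip(info_row, cov_row))
        # E is non-negative, so a non-zero sum is equivalent to the
        # existence of a non-zero term.
        weights.append(1.0 / total if total != 0 else 1.0)
    return weights

def ic_fitness(covered, info):
    """Fitness of each initial configuration: weighted sum over the rules that fail."""
    w_r = rule_weights(covered, info)
    return [
        sum(w_r[k] * info[k][i] * (1 - covered[k][i]) for k in range(len(covered)))
        for i in range(len(covered[0]))
    ]

def rule_fitness(covered):
    """Fitness of each rule: sum of the weights of the configurations it classifies."""
    n_ics = len(covered[0])
    col_sums = [sum(row[i] for row in covered) for i in range(n_ics)]
    # Configuration weight: one over the number of rules that classify it,
    # or 1 if no rule does.
    w_ic = [1.0 / s if s > 0 else 1.0 for s in col_sums]
    return [sum(w * c for w, c in zip(w_ic, row)) for row in covered]
```

For two rules and three configurations with `covered = [[1, 1, 0], [1, 0, 0]]` and a constant informational value of 0.5, the rule fitness values are 1.5 and 0.5, and the configuration fitness values are 0, 1 and 1.5. The configuration that no rule classifies is rewarded most, and the rule that additionally solves the rarely solved configuration gains far more than a plain count of successes would suggest.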

Juille and Pollack remark that no special method to avoid the negative influence of the Red Queen Effect was implemented in the algorithm. Instead, they claim an inherent property of the density classification task prevents the loss of traits after the rules arrive at a certain performance level. Experiments support this claim, as a rule that successfully classifies configurations with density ρ < ρ_crit = 0.5, but ρ ≈ ρ_crit, also performs reasonably well for configurations with density ρ′ < ρ. The same holds of course for densities ρ > ρ_crit.

Another reason for the absence of the Red Queen Effect might result from the closer intertwined relationship between the two species based on the informational value, which forces the initial configurations not to concentrate on the most difficult density region ρ_crit = 0.5. Consequently, the configurations are unable to exploit the rules completely and thereby bring the evolution to a stop, a state in which it would no longer be possible to distinguish promising CA rules from those which are not.

Juille and Pollack were able to find two high performance rules, the better of them achieving about 86%, which is a significant improvement compared to the known CA rules for the classification task at that time. See Fig. 2.5 for space-time diagrams of this rule.

Werfel, Mitchell and Crutchfield examined the coevolution framework further in [24] and studied the effect of coevolution and optional resource sharing on the quality of the evolved rules. Werfel et al. claim that the coevolution paradigm did not deserve the merit Juille and Pollack granted it because the concept of resource sharing had had the greater influence on the performance improvement during the evolutionary process. To support their claim, they study different combinations of a standard genetic algorithm with and without resource sharing, as well as the same combination for the coevolutionary approach. However, Werfel et al. agree with Juille and Pollack that without the coevolution principle no high performance rules could have been evolved.

2.3 Small-World Graphs and the Density Classification Task

Starting from a short review of small-world graphs, different graph characteristic values are introduced along with several small-world graph construction algorithms that are used in the fourth chapter. The end of this section is concerned with different applications of small-world networks to the density classification task. In this context the majority rule plays an important role, as it is used later to compare its performance to the performance of the evolved rules.

2.3.1 A Short Review of Small-World Graphs

The expression six degrees of separation is probably well-known, as almost everyone has already heard the extraordinary claim that every person on the planet is not more than six hand-shakes away from any other person. In 1967, the social scientist Stanley Milgram conducted an experiment within this context (see [22] and [4] for the details and the following). Milgram randomly selected a group of people from one region of the United States and asked them to send letters to people from a different part of the country. The recipient was previously unknown to the person receiving Milgram’s request, but that person was allowed to send the letter to someone whom he or she knew on a first name basis and who was expected to be closer to the target. The intermediate persons within the chain of recipients were asked to acknowledge the letter on its arrival, so that Milgram was able to trace its path and also collect additional information. A median of 5 resulted for the lengths of completed letter chains, which is a small number compared to the population size.

At the beginning of the 1970s, Mark Granovetter introduced the concept of strong and weak ties to social networks. He conducted a survey, asking people questions about their employment situation and how they had learnt about the open position. It turned out that for the purpose of finding a new job, acquaintances are more important than close friends. Granovetter described a social network in which each vertex belongs to a highly clustered circle of friends interconnected by strong ties, while some few long-distance weak ties connect it to other individuals, who are members of different circles themselves. This model of social networks is different from the models used in sociology before, where random networks had played a major role.


[Three graph drawings ordered by increasing randomness: (a) Lattice Graph, (b) Small-World Graph, (c) Random Graph]

Figure 2.6: Small-world graph: Interpolating between lattice and random structure

During the study of biological oscillators like crickets, Duncan Watts had to consider problems for the group of insects similar to those that had been examined in sociology before. The question of how they are connected to each other, in terms of which cricket listens to which other one, would seriously affect the network’s capability for synchronisation. Starting from there, Watts and his Ph.D. advisor Steven Strogatz developed a completely new network model, which they applied to different real world networks. This gave evidence for the widespread existence of networks in nature, which lie between two extremes: regular lattice structures and completely random graphs. Watts and Strogatz presented their research in 1998 in the journal Nature (see [23]), also including applications to dynamic systems, like disease spreading in populations of connected individuals. This article triggered a series of further publications by different researchers, who studied the range of application of this new network model, which became known as the small-world graph model.

The research conducted by Watts beyond the state of the Nature article was presented in [22]. Watts discusses different graph models to interpolate between lattice and random graphs, starting from a model whose connection structure was motivated by sociology (see Fig. 2.6). Informally, the defining properties of a small-world graph are the relatively high clustering of its vertex neighbourhoods and a small graph diameter that resembles the diameter of a random graph.

Watts’ first graph model, the α-graph model, suffers from the undesirable effectthat the resulting graph falls apart in unconnected clusters for moderate α parametervalues. In order to avoid this disadvantage, abstract from sociological properties andconcentrate solely on the network structure, Watts presents two different models, theβ and the φ-model. As those will be used later, they are described below along withthe most important graph characteristics to compare them.


2 Introduction

Watts considers all graphs to be undirected. Because the later graph based CA application (see Chapter 3) requires regular directed graphs, some of the models and characteristics have to be adjusted accordingly. The differences between this work and the corresponding aspects in [22] are discussed where they arise. In the following, it is assumed that the given graph G(n, k) is a regular directed graph with n vertices and outgoing vertex degree k. The first definition is taken from [22], Definition 2.2.2.

Definition 2.10. The characteristic path length (L) of a graph G gives evidence about the distribution of the lengths of shortest paths within the graph. Let V(G) denote the set of vertices of G. L is determined in three steps: First, for all vertices u, v ∈ V(G), let d(u, v) be the length of the shortest path connecting u and v. Second, let $d_u$ be the average value of d(u, v) over all v ∈ V(G). Finally, define L to be the median of $\{d_u \mid u \in V(G)\}$.
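The three steps of Definition 2.10 can be sketched in Python as follows. This is only an illustrative sketch, not the implementation used in this work: the adjacency-list representation `adj`, the helper `ring_lattice` and the choice to exclude d(u, u) = 0 from the average are assumptions, and the graph is assumed to be strongly connected.

```python
from collections import deque
from statistics import median


def ring_lattice(n, k):
    """Directed ring lattice: each vertex points to its k/2 nearest neighbours on either side."""
    half = k // 2
    return [[(u + s) % n for s in range(1, half + 1)] +
            [(u - s) % n for s in range(1, half + 1)] for u in range(n)]


def characteristic_path_length(adj):
    """adj[u] lists all vertices v for which a directed edge (u, v) exists."""
    n = len(adj)
    averages = []
    for source in range(n):
        # Step 1: breadth-first search gives the shortest path lengths d(source, v).
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < n:
            raise ValueError("graph is not strongly connected")
        # Step 2: average distance d_u (assumption: the vertex itself is excluded).
        averages.append(sum(dist.values()) / (n - 1))
    # Step 3: L is the median of all d_u.
    return median(averages)
```

For a ring lattice, every vertex has the same average distance, so the median coincides with that common value.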

The characteristic path length captures the degree of separation within the graph. It is maximal for the lattice graph and minimal for the random graph, as one can easily verify. The next defining property of the small-world model is the clustering coefficient.

Definition 2.11. Following Definition 2.2.5. [22]:

- The incoming neighbourhood $\Gamma(v)^{in}$ of a vertex v is the subgraph that includes all vertices u linked to v, such that there exists an edge (u, v) in G.

- The outgoing neighbourhood $\Gamma(v)^{out}$ of a vertex v is the subgraph that includes all vertices w linked to v, such that there exists an edge (v, w) in G.

Definition 2.12. According to Definition 2.2.10. [22]: The clustering coefficient $\gamma_v$ of $\Gamma_v$ describes the degree of cliquishness around any vertex v, that is, the extent to which vertices that are connected to v are connected to each other. In the presence of directed edges, there are two cases to be taken into account. In particular, it is defined:

\[ \gamma_v^{in} := \frac{|E(\Gamma(v)^{in})|}{|V(\Gamma(v)^{in})| \cdot (|V(\Gamma(v)^{in})| - 1)} \]

\[ \gamma_v^{out} := \frac{|E(\Gamma(v)^{out})|}{k \cdot (k - 1)} \]

$E(\Gamma(v)^{in})$ and $E(\Gamma(v)^{out})$ denote the sets of edges within the incoming neighbourhood of v and the outgoing neighbourhood of v respectively, and $V(\Gamma(v)^{in})$ denotes the set of vertices within the incoming neighbourhood of v. Unless mentioned otherwise, $\gamma_v^{out}$ will be taken into consideration, so that the term $\gamma_v$ is used to refer to $\gamma_v^{out}$.



Algorithm 1: β-graph model: construction algorithm

Data:
    N: number of vertices
    k: outgoing vertex degree
    β: the defining model parameter
Result: β-graph G(N, k)

begin
    /* Initialisation */
    G ← lattice graph G(N, k), whose vertices are arranged on a ring, each of them possessing a directed edge to its k/2 closest neighbouring vertices on either side;

    /* Rewiring */
    forall edges e = (u, v) do
        r ← uniformly distributed random variable in [0, 1];
        if r < β then
            new neighbour ← randomly selected vertex, with new neighbour ≠ v and ∄ e ∈ E(G): e ≡ (u, new neighbour);
            e ← new edge (u, new neighbour);
        end
    end
    return G;
end
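A minimal Python sketch of Alg. 1 is given here for illustration. The function name `beta_graph`, the adjacency-list return format and the `rng` parameter are assumptions of this sketch. In addition, the sketch never rewires an edge to the source vertex itself; Alg. 1 does not state this explicitly, so it is an assumption made here to keep the graph loop-free.

```python
import random


def beta_graph(n, k, beta, rng=None):
    """Directed beta-graph: ring lattice whose edges are rewired with probability beta."""
    rng = rng or random.Random()
    half = k // 2
    # Initialisation: directed ring lattice with k/2 neighbours on either side.
    out = [set() for _ in range(n)]
    for u in range(n):
        for s in range(1, half + 1):
            out[u].add((u + s) % n)
            out[u].add((u - s) % n)
    # Rewiring: visit every original lattice edge (u, v) exactly once.
    for u in range(n):
        for v in sorted(out[u]):
            if rng.random() < beta:
                # Candidates differ from v and do not duplicate an existing edge;
                # excluding u itself is an assumption of this sketch (loop-free graph).
                candidates = [w for w in range(n)
                              if w != u and w != v and w not in out[u]]
                if candidates:
                    out[u].remove(v)
                    out[u].add(rng.choice(candidates))
    return [sorted(neighbours) for neighbours in out]
```

Because only the target of an edge is relocated, every vertex keeps its outgoing degree k for any value of β, whereas the incoming degrees vary.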

For the later use of graphs as CA coupling structures, it was necessary to adjust the two preceding definitions to take directed edges into account. However, the captured information is very similar to that of the original definition by Watts. It will be shown later that the two clustering coefficients, $\gamma_v^{out}$ and $\gamma_v^{in}$, indeed develop very similarly for the graph models taken into account. Therefore, the restriction to $\gamma_v^{out}$ can be regarded as justified for ease of computation. The distinction between outgoing and incoming neighbourhood is motivated by the nature of the heuristic graph construction algorithms; see below.

Definition 2.13. In compliance with Definition 2.2.11. [22]: The clustering coefficient γ of G is defined as the average value of $\gamma_v$ over all v ∈ V(G).
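Definitions 2.11 to 2.13 can be illustrated by the following Python sketch. The function names and the adjacency-list representation are assumptions of this sketch, as is the convention of returning 0 for a neighbourhood with fewer than two vertices, for which the quotient is undefined.

```python
def ring_lattice(n, k):
    """Directed ring lattice: each vertex points to its k/2 nearest neighbours on either side."""
    half = k // 2
    return [[(u + s) % n for s in range(1, half + 1)] +
            [(u - s) % n for s in range(1, half + 1)] for u in range(n)]


def _coefficient(adj, neighbourhood):
    """Number of directed edges inside the neighbourhood, divided by m * (m - 1)."""
    m = len(neighbourhood)
    if m < 2:
        return 0.0  # assumption: undefined cases are counted as 0
    edges = sum(1 for u in neighbourhood for w in adj[u]
                if w in neighbourhood and w != u)
    return edges / (m * (m - 1))


def clustering_out(adj, v):
    """gamma_v^out: clustering within the outgoing neighbourhood of v."""
    return _coefficient(adj, set(adj[v]))


def clustering_in(adj, v):
    """gamma_v^in: clustering within the incoming neighbourhood of v."""
    return _coefficient(adj, {u for u in range(len(adj)) if v in adj[u]})


def clustering(adj, per_vertex=clustering_out):
    """gamma of G: average of the per-vertex coefficient over all vertices."""
    return sum(per_vertex(adj, v) for v in range(len(adj))) / len(adj)
```

For the directed ring lattice, incoming and outgoing neighbourhoods coincide, so both variants give the same value.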

In the following, the β and φ-graph models, which were used by Watts to reproduce the properties of the more intricate α-graph model, are transferred to directed, regular graphs. See Alg. 1 for the modified β-graph model. The φ-graph model requires the introduction of a new concept, the shortcut edge.

Definition 2.14. In compliance with Definition 3.1.1. [22]: The range R(u, v) of the edge (u, v) is the length of the shortest path between u and v in the absence of (u, v).


Algorithm 2: φ-graph model: construction algorithm

Data:
    N: number of vertices
    k: outgoing vertex degree
    φ: the defining model parameter
Result: φ-graph G(N, k)

begin
    /* Initialisation */
    G ← lattice graph G(N, k), whose vertices are arranged on a ring, each of them possessing a directed edge to its k/2 closest neighbouring vertices on either side;

    /* Rewiring */
    while there are fewer than φ ∗ k ∗ N shortcuts within G do
        u ← randomly selected vertex;
        v ← randomly selected neighbouring vertex of u, such that R(u, v) ≤ 2;
        w ← randomly selected vertex, such that R(u, w) > 2 and ∄ e ∈ E(G): e ≡ (u, w);
        (u, v) ← (u, w);
    end
    return G;
end

Definition 2.15. In compliance with Definition 3.1.3. [22]: An edge (u, v) with range R(u, v) > 2 is called a shortcut.

Definition 2.16. In compliance with Definition 3.1.4. [22]: The fraction of shortcut edges within a graph G is denoted by φ.
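The range of an edge and the shortcut fraction φ (Definitions 2.14 to 2.16) can be computed with a breadth-first search that ignores the edge in question. The following Python sketch is illustrative only; the function names and the adjacency-list representation are assumptions, and an edge without any alternative path is assigned an infinite range.

```python
from collections import deque


def ring_lattice(n, k):
    """Directed ring lattice: each vertex points to its k/2 nearest neighbours on either side."""
    half = k // 2
    return [[(u + s) % n for s in range(1, half + 1)] +
            [(u - s) % n for s in range(1, half + 1)] for u in range(n)]


def edge_range(adj, u, v):
    """R(u, v): length of the shortest directed path from u to v that avoids the edge (u, v)."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if x == u and y == v:
                continue  # the edge (u, v) itself must not be used
            if y not in dist:
                dist[y] = dist[x] + 1
                if y == v:
                    return dist[y]
                queue.append(y)
    return float("inf")  # assumption: no alternative path means infinite range


def shortcut_fraction(adj):
    """phi: fraction of edges (u, v) with R(u, v) > 2."""
    edges = [(u, v) for u in range(len(adj)) for v in adj[u]]
    return sum(1 for u, v in edges if edge_range(adj, u, v) > 2) / len(edges)
```

In a ring lattice with k ≥ 4, every edge can be bypassed by a path of length 2, so φ = 0; relocating a single edge to a distant vertex creates at least one shortcut.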

The φ-graph model is based on the explicit, rather than probabilistic, introduction of shortcut edges into the lattice graph with which the construction process starts. Nevertheless, the vertices to be connected by shortcut edges are chosen randomly. See Alg. 2 for a pseudo-code description of the directed φ-graph algorithm.

Comparing the two construction algorithms Alg. 1 and Alg. 2, one will notice that the underlying operation, the relocation of a single edge, is the same for both.⁶ A single relocation of the edge (u, v) to form the edge (u, w) directly affects the outgoing neighbourhood $\Gamma(u)^{out}$ and the incoming neighbourhoods $\Gamma(v)^{in}$ and $\Gamma(w)^{in}$. The incoming neighbours of vertex u are only affected indirectly, as are the outgoing neighbourhoods of v and w.⁷ This primarily local impact of single-edge relocations led to the differentiation between the two types of neighbourhoods in the first place.

⁶ In this context, the term “rewire” is used more frequently.

⁷ The lengths of shortest paths from any vertex in $V(\Gamma(u)^{in})$ to any other vertex, as well as from any vertex to $V(\Gamma(v)^{out})$ or $V(\Gamma(w)^{out})$, are subject to change caused by the relocation of the edge.

Figure 2.7: Comparison of L and γ for the β and φ-graph model with 400 vertices of outgoing vertex degree 6, averaged over 40 realisations. Panels: (a) β-model, (b) φ-model. Each panel plots scaled L, scaled $\gamma^{in}$ and scaled $\gamma^{out}$ (vertical axis, 0.2 to 1) against β or φ respectively (logarithmic horizontal axis, 0.001 to 1).

Fig. 2.7a and Fig. 2.7b depict the effect of different β and φ values on the scaled clustering coefficient and the scaled characteristic path length for both graph models, showing that these statistics behave very similarly for the two of them. The diagrams display both clustering coefficients, the averages of $\gamma_v^{in}$ and $\gamma_v^{out}$, indicating that they develop almost identically.

Watts’ experiments gave evidence for the claim that all three considered graph models, the sociologically motivated α-model, the β-model and the φ-model, basically exhibit the same changes with respect to characteristic path length L and clustering coefficient γ. While slowly increasing the defining parameter starting from 0, one can observe that L stays roughly constant at first but then experiences a significant decline. This decline happens so abruptly that Watts compares the effect with phase transitions in physical systems. The clustering coefficient shows a similar behaviour, but its transition occurs at much higher β and φ values. The graphs that lie between these two transition points are the ones with the interesting properties of small-world graphs: high clustering coefficients and short path lengths.

Watts also examined the scaling properties of L for the different models with respect to the graph size n. It turned out that L scales linearly with n for graphs from the domain before the L transition occurs (as one might expect for graphs similar to lattice graphs), but logarithmically for graphs from the small-world region before the cliff of the clustering coefficient. After the clustering coefficient has reached its random-graph limit, L continues to scale logarithmically, which according to Watts is characteristic of random graphs.

To show that graphs with small-world properties in fact exist in nature and sociology, Watts considered three example networks in [22]: the collaboration graph of actors, the power grid of the western United States and the nervous system of the worm C. elegans. At least for the first example, the small-world model is able to predict the clustering and length properties reasonably well. The prediction of the clustering coefficient for the power network works fairly well, and still better than a lattice or random-graph model could have accomplished. The nervous system, however, resists classification, which Watts traces back to the small size of the system, which has a negative effect on computing reliable model statistics.

Another application, which is also discussed in [22], is the effect of the network structure on dynamic systems. The first example refers to the spreading of an infectious disease in a population of individuals that are connected by a graph structure. Although the model is rather simple and the rules governing the spreading of the disease are simplistic from a medical point of view, the system is able to show a wide range of behaviour patterns depending on a small set of parameters. Watts’ experiments indicate that for a broad domain within the parameter space, in which the system approaches a steady state, the nature of this state (e.g. the population has died out) or the time needed to reach this state depends on the underlying graph.

The work performed by Watts and Strogatz influenced the research community, and the concept of small-world graphs was transferred to a wide range of applications. However, the consideration of one important property of real-world networks, which had been disregarded by Watts before, shaped later research efforts: the aspect of growth.

Barabási and Albert introduced the concept of scale-free networks in [4]. They realised that networks like the metabolic system within cells, the World Wide Web or food webs are in fact not as regular in terms of vertex degree as the kinds of networks that were modelled by Watts and Strogatz. Instead, there are some more significant vertices that have far more links than the majority of vertices, such that the vertex degree follows a power-law distribution. In the World Wide Web, for example, these vertices are websites that possess far more links than others because of their popularity. They are also called hubs in later publications.

In [3] Barabási describes a simple algorithm to construct scale-free graphs based on the principle of preferential attachment: The graph is built up randomly by iteratively adding new vertices, such that the added vertex is more likely to attach to vertices that already have more links attached to them than others.

A scale-free model is required later for comparison between different graph models. Therefore, an algorithm to construct a directed graph with a regular outgoing degree, whose incoming vertex degree follows a power law, is given by Alg. 3. Note that in contrast to the original model in [3], where in each step one vertex is added to the growing graph and its edges are linked to vertices already existing within the graph, here all edges are added iteratively based on preferential attachment. See Fig. 2.8 for the distribution of the incoming degree for this scale-free model.
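The edge-wise preferential attachment of Alg. 3 can be sketched in Python as follows. This is a sketch under stated assumptions rather than the implementation used in this work: the function name and the `rng` parameter are hypothetical, and the rejection of loops and duplicate edges (followed by a fresh draw of both end points) is an assumption that Alg. 3 does not spell out. The sketch further assumes N > k.

```python
import random


def scale_free_graph(n, k, rng=None):
    """Directed graph with outgoing degree k; targets are chosen by preferential attachment."""
    rng = rng or random.Random()
    out = [set() for _ in range(n)]
    degree = [0] * n  # number of edge end points (incoming plus outgoing) per vertex
    # Initialisation: a single edge between two distinct random vertices.
    u, v = rng.sample(range(n), 2)
    out[u].add(v)
    degree[u] += 1
    degree[v] += 1
    edges = 1
    # Main loop: add the remaining edges one at a time.
    while edges < n * k:
        open_vertices = [x for x in range(n) if len(out[x]) < k]
        v = rng.choice(open_vertices)  # uniform over vertices with free out-slots
        # Target drawn with probability degree(w) / (2 * |E|).
        w = rng.choices(range(n), weights=degree)[0]
        if w == v or w in out[v]:
            continue  # assumption: reject loops and duplicates, then redraw
        out[v].add(w)
        degree[v] += 1
        degree[w] += 1
        edges += 1
    return [sorted(neighbours) for neighbours in out]
```

Since the source vertex of each edge is chosen uniformly while the target is chosen proportionally to the current degree, the outgoing degree ends up regular and only the incoming degree becomes heterogeneous.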

Figure 2.8: Incoming vertex degree distribution of the directed scale-free graph model for a graph with 300 vertices and an outgoing degree of 6, averaged over 50 realisations. Axes: incoming vertex degree (logarithmic, 1 to 100) against frequency (0 to 0.15).

Algorithm 3: scale-free graph model: construction algorithm

Data:
    N: number of vertices
    k: outgoing vertex degree
Result: scale-free graph G(N, k)

begin
    /* Initialisation */
    G ← ({v1, v2, . . . , vN}, E) with E := ∅;
    u ← randomly selected vertex;
    v ← randomly selected vertex ≠ u;
    E ← {(u, v)};
    prob(w ≡ v) ← |{e ∈ E | ∃u : e ≡ (u, v) or e ≡ (v, u)}| / (2 ∗ |E|);

    /* Main */
    forall remaining edges e ∉ E do
        v ← vertex randomly selected with uniform probability over {v | outgoingDegree(v) < k};
        w ← vertex randomly selected according to probability distribution prob;
        e ← (v, w);
        E ← E ∪ e;
    end
    return G;
end

2.3.2 Applications to the Density Classification Task

Although the concept of graph based CAs had already been considered (see [20] for an example of graph based non-uniform CA), Watts and Strogatz were, to the author’s knowledge, the first to introduce it to the density classification task in [23]. Unlike Mitchell and her colleagues in [15], Watts and Strogatz did not evolve the CA rules using a genetic algorithm. Instead, they employed a fixed CA rule and changed the coupling structure by generating undirected small-world graphs using algorithms similar to the ones described above. These graphs then determined the interconnections between the cells.

Watts and Strogatz relied on the majority rule for their approach, which had already been dismissed by Mitchell et al. in [16] for standard CAs because of its inability to reach a correct end state for most initial configurations. As Mitchell and her colleagues were considering CAs with a neighbourhood radius r = 3, the majority rule had to perform a majority vote among seven cells (including the central cell) and the next state was always well-defined. Watts’ and Strogatz’ small-world construction algorithm, however, broke the regular degree structure by the addition or deletion of edges during rewiring. Therefore, they had to modify the majority rule in order to prevent an undefined next state and introduced a random variable for this purpose.

Definition 2.17. The majority rule for a binary state one-dimensional cellular automaton as used in [22, 23] is the following: Let $\Gamma(v) := \Gamma(v)^{in} = \Gamma(v)^{out}$, as undirected graphs are taken into account, and let $k_{on}(v)$ denote the number of vertices (cells) within V(Γ(v)) ∪ {v} that are in the “1”-state. The rule updates the state of cell v according to the conditions below.

- If $k_{on}(v) > k/2$, then turn to “1”-state.

- If $k_{on}(v) < k/2$, then turn to “0”-state.

- If $k_{on}(v) = k/2$, then switch to “1” or “0”-state with equal probability.
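The three cases of Definition 2.17 translate directly into a synchronous update step. The Python sketch below is illustrative only; the function name `majority_step` and the adjacency-list argument `neighbours` are assumptions, and k is taken to be the number of neighbours of the individual cell, since the degree is no longer uniform after rewiring.

```python
import random


def majority_step(neighbours, state, rng=None):
    """One synchronous update of the randomised majority rule.

    neighbours[v] lists the cells adjacent to v, state[v] is 0 or 1.
    """
    rng = rng or random.Random()
    new_state = []
    for v, adjacent in enumerate(neighbours):
        k = len(adjacent)  # assumption: k is the degree of this particular cell
        # k_on counts the "1"-states among the neighbours and the cell itself.
        k_on = state[v] + sum(state[u] for u in adjacent)
        if 2 * k_on > k:
            new_state.append(1)
        elif 2 * k_on < k:
            new_state.append(0)
        else:
            new_state.append(rng.randint(0, 1))  # tie: either state with equal probability
    return new_state
```

Comparing `2 * k_on` with `k` avoids fractional thresholds for odd k.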

Watts performed a series of runs with different parameter values for undirected β-graphs and achieved a relatively high CA performance of 0.89 for a range of β values. See Tab. 2.2 for a comparison with the previously discovered rules.⁸ He notes that the threshold value of φ for entering the high performance domain can be found close to the fraction of shortcuts that is expected to be necessary for each vertex to possess at least one shortcut. Watts claims that this value is confirmed by his observations and lies around $\varphi_{crit} = 0.2$ for n = 149, k = 6. Beyond this value, the performance of the randomised majority rule stays high until the random-graph limit. Graphs G with $\varphi(G) = \varphi_{crit} = 0.2$ are located well after the beginning of the small-world domain, that is, after the sudden drop of the characteristic path length, but still before the beginning of the random-graph domain.

Watts observed that the performance of the majority rule with an undirected β-graph coupling structure scales significantly better with increasing automaton size than that of the GKL rule. Therefore, the advantage of applying a small-world based coupling structure would increase with system size, although the automaton parameters do not really satisfy the precondition k ≫ 1 for Watts’ small-world model.

⁸ The evaluation process described in [22] indicates that only a set of 100 initial configurations has been used to compute this value. Considering the deviation of standard CA performance, this number is too small by several orders of magnitude to give reliable information about the rule’s capability.



Tomassini, Giacobini and Darabos followed Watts’ approach in [21], but instead of relying on a heuristic method to construct the graphs to be evaluated, they employed an evolutionary algorithm to evolve the coupling structures. However, Tomassini et al. also only considered the nondeterministic version of the majority rule for the density classification task. They were able to evolve high performance graphs showing properties similar to those that Watts considered promising, e.g. concerning vertex degree and fraction of shortcuts φ, although the average degree was slightly lower and φ slightly higher than the results reported by Watts.

In the approach of Tomassini et al., a spatially distributed population of graph individuals is used to reduce the selection pressure and to give sub-optimal candidate solutions the chance to improve and enrich the population. The chromosomes are modelled as adjacency lists. Only the genetic mutation operator is utilised; the crossover operator is disregarded. A mutation event occurs with probability 0.5 for each individual, whose chromosome is then altered by either removing an existing edge from a particular vertex or by adding a new edge to a randomly selected vertex. Certain bounds for the maximal and the minimal vertex degree are fixed, so that the graph always stays connected and the number of edges remains manageable. The fraction of shortcuts in the graph individuals (i.e. φ) is included in the fitness computation, therefore increasing the selective pressure towards graphs with a lower φ value.

As no recombination between the graph chromosomes is possible, one could claim that the algorithm used is not really a genetic algorithm in the original sense, but rather belongs to the class of traditional evolution strategies, which rely only on mutation as the source of new genetic information in chromosomes. The authors of [21], however, use the term “genetic algorithm”.

Worth mentioning is the analysis of the fault tolerance of the majority rule in comparison with the fault tolerance of the GKL rule, which was performed by Tomassini and his colleagues. For this purpose, a probabilistic cell-failure model was introduced: Each cell’s next state is inverted with probability $p_{fail}$, so that the probability of correct functioning is only $1 - p_{fail}$ compared to 1 in the original model. The bitwise Hamming distance between a CA under the influence of random failures and a correctly functioning CA can then be computed by comparing their state vectors. Tomassini et al. note that the majority rule significantly outperforms GKL in terms of fault tolerance. See [21] for details.
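The cell-failure model and the comparison of state vectors described above can be sketched in a few lines of Python. The function names are hypothetical and the sketch only illustrates the two operations; it does not reproduce the experimental setup of [21].

```python
import random


def apply_failures(next_state, p_fail, rng):
    """Invert each cell's next state independently with probability p_fail."""
    return [1 - s if rng.random() < p_fail else s for s in next_state]


def hamming_distance(a, b):
    """Number of positions in which two state vectors of equal length differ."""
    if len(a) != len(b):
        raise ValueError("state vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))
```

Applying `apply_failures` to the result of every update step of one automaton, and comparing its configuration with that of an undisturbed copy via `hamming_distance`, gives the kind of measure described in the text.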

To the author’s knowledge, Moreira, Mathur, Diermeier and Amaral are so far the only ones who have studied the deterministic version of the majority rule in connection with Watts’ β-graph model (see [17]). They adjusted the original model by introducing directed edges, similar to Alg. 1, thereby fixing the outgoing vertex degree, so that it was possible to directly compare the behaviour of the majority rule to the behaviour of GKL.

Moreira et al. kept the basic procedure followed by Watts: A range of β-values is used to generate graphs, which are then evaluated by transforming them into a CA coupling structure run by the majority rule performing the density classification task. The fraction of correctly classified initial configurations is then used to measure the graph’s performance.

Moreira et al. state at the beginning of their paper that the majority rule plays a key role in many different biological and social systems and was therefore chosen for evaluation. According to Moreira et al., its particular resistance to failures and noise caused by environmental influences underlines its importance. The performance of the majority rule is thereafter compared with the performance of the GKL rule under varying conditions:

1. using a regular lattice coupling structure and one based on the β-graph model

2. in a stable environment and under the influence of random noise

3. using a combination of the first two settings

The random noise model is similar to the one used by Tomassini et al. in [21], but differs in that it does not affect the next state of the particular cell itself but rather the neighbours’ reception of the cell’s state. As the same random noise model is used in Chapter 4, the model is stated in detail below.

Definition 2.18. Let $\sigma_{ij}$ be the value cell i reads for the state of cell j, $\sigma_j$ the true state of j, and η a parameter which specifies the noise intensity, ranging from η = 0 (noiseless dynamics) to η = 1 (random dynamics). Then define:

\[ \sigma_{ij} := \begin{cases} \sigma_j & \text{with probability } 1 - \frac{\eta}{2} \\ 1 - \sigma_j & \text{with probability } \frac{\eta}{2} \end{cases} \]
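A reading operation following Definition 2.18 can be sketched as below. The function name `read_state` and its parameters are assumptions of this sketch; it only illustrates that a misread happens with probability η/2, so that η = 1 yields a uniformly random reading.

```python
import random


def read_state(true_state, eta, rng):
    """Value that a neighbouring cell reads for a cell whose true state is true_state (0 or 1)."""
    if rng.random() < eta / 2:
        return 1 - true_state  # misread with probability eta / 2
    return true_state          # correct reading with probability 1 - eta / 2
```

Each neighbour draws its own reading, so two cells reading the same cell in the same time step may obtain different values.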

Despite the fact that the majority rule exhibits a very bad performance for standard CAs (see [16]), it significantly improves and eventually outperforms GKL for a β-graph based coupling structure and for noise values within a certain range. Though surprising, this effect can be explained quite easily. Running on a regular lattice CA, the automaton provided with the majority rule regularly gets stuck in a state in which blocks of consecutive “0”s stand alongside blocks of consecutive “1”s. The cells at the borders see a majority of cells in the same state they are currently in, so that the domains are unable to mix and the automaton has reached an illegal fixed point (see Fig. 2.9a).

Depending on the β value, shortcuts are introduced into the coupling structure to a greater or lesser extent, so that some cells also receive information about distant cell states. Moreira et al. note that this effect does not suffice to prevent the majority rule from reaching the described stable states, unless the graph reaches the random-graph limit. See Fig. 2.9b for an example with a non-random graph within the small-world domain using the same initial configurations as in Fig. 2.9a.

The density classification within the random-graph domain is a relatively simple task for the majority rule. According to the reasoning of Moreira et al., a cell is able to make an educated guess about the global density by observing the density prevailing in its neighbourhood, which is a random sample over the whole set of cells within the automaton. As time passes, the quality of this random sample will improve, because all cells will adjust step by step to the state that was taken by the majority of them.



Figure 2.9: Influence of the coupling structure on the behaviour of the majority rule: Space-time diagrams. Panels: (a) Lattice coupling structure, (b) Graph coupling structure, (c) Graph coupling structure in noisy environment. Each panel shows three diagrams with the cells 1 to 149 on the horizontal axis and the time steps t = 0 to 298 on the vertical axis.


Figure 2.10: Consecutive block of cells tending to keep minority consensus, see [17]; note that only k/2 = 3 edges for each vertex are shown

The poor performance of the majority rule for low β values in the absence of noise is attributed by Moreira et al. to the existence of blocks of consecutive cells that remain connected. These cells force each other to keep a consensus state within the block once it has been reached. Fig. 2.10 depicts the problematic structure.

The introduction of noise enables the movement of the consecutive blocks described above. On the one hand, noise does not significantly affect the state of cells that are located within such a block, because the rest of the neighbouring cells will still show the local majority state. On the other hand, cells that are located at the borders between the blocks can be affected by even a single misread neighbour state, which allows the block boundaries to move.

Moreira et al. also considered a variation of the CA model by performing asynchronous instead of the usual synchronous cell updates and compared the performance of the majority rule to GKL’s performance. It turned out that GKL is unable to reach a consensus state while running asynchronously on a lattice type CA. The performance increased for a larger rewiring probability and a higher but still reasonable noise rate. However, under these conditions, GKL was far outperformed by the majority rule. Moreira et al. therefore conclude that GKL is not robust and was constructed for a very restrictive environment, whereas the majority rule is “ecological efficient” because it covers a broad range of application. This might be the reason, they assume, why the majority rule is frequently encountered in biology and sociology.

Furthermore, Moreira et al. applied the concept of scale-free graphs to the density classification task by employing a preferential attachment rule. They distinguished between two cases: In the first case, only the outgoing degree (in terms of “reading data”, i.e. against the information flow) followed a power law; in the second case, the links were considered to be bidirectional, so that the incoming vertex degrees also followed a power law. For the first case, Moreira et al. note that the majority rule did not perform well, as opposed to the second case, in which it was able to reach a performance value of approximately 0.8.

Unfortunately, the precise algorithm that was used to construct both cases of scale-free networks, as well as the necessary adjustments to the majority rule for the second case, are not mentioned in their paper.

Finally, Moreira et al. determined the asymptotic behaviour of the border between inefficient and efficient β-values when the automaton size n tends to infinity. They note that this border decreases with system size, instead of increasing, as one might have expected.


3 The Coevolution Framework

The first section of this chapter presents the models used in this work, the graph based model of a cellular automaton and the coevolution framework. Similarities and differences to the coevolutionary model of Juillé and Pollack are highlighted and the formulae to compute the characteristic values are given.

The next part covers the implementation of the system, the distributed architecture and the design of the software that was used to compute the results described in the following chapter. The two main parts of the system are covered separately: the coevolutionary framework and the system to compute CA runs for fitness evaluation. Further details of the implementation are given in the appendix.

3.1 Models

As the focus of this work lies on the application of graph based cellular automata to the density classification task, the automaton model described in Chapter 2 had to be modified to consider irregular coupling structures. The introduction of these graphs to the optimisation problem also required an extension of the coevolutionary framework of Juillé and Pollack. Both parts are described in this section.

3.1.1 Graph Based Cellular Automaton Model

The graph based cellular automaton model used in this work is a generalisation of the traditional CA model. In the graph based model, the neighbourhood of each cell does not necessarily consist of the cells that are located within a radius around the particular cell. Instead, connections between distant cells are possible and determined by an underlying graph structure, whose vertices represent the cells of the automaton. To stay as close to the standard CA model as possible, the number of neighbouring cells is uniform for all cells. Therefore, the connection graph has to be directed and regular. In detail:

Definition 3.1. A graph based cellular automaton (GBCA) consists of a set of N identical units corresponding to finite-state machines (FSM), which are called cells. Each cell resides in one of a set S of possible cell states, which it updates synchronously in discrete time steps, according to a uniform update rule Φ. The next cell state depends only on the cell’s current state and on the states of the k neighbouring cells it is connected to, which are specified by an N × k matrix A. In particular:

\[ a_{u,i} := v \iff \text{link } i \text{ connects cell } u \text{ to cell } v \]



If there is a link from cell u to cell v, then u is able to read the state of cell v. In this sense, each cell possesses k input signals used to read the states of k other cells, whereas an arbitrary number of output signals transmit the cell’s own state to other cells. A cell is not allowed to read its own state via such a connection. Expressed in graph terminology, the directed graph with regular outgoing vertex degree k, whose adjacency is determined by A, is required to be loop-free. Thus, a GBCA is specified by giving N, k, S, A and $\Phi : S^{k+1} \to S$.

Remark: The update rule Φ is also called “transition rule” or “GBCA rule”. Similar to the CA model defined by Def. 2.1, the vector of cell states at time step t is also called a configuration at time step t. The start state or initial configuration of the GBCA represents the input of the automaton, and the configuration reached after a specified number of steps $t_{max}$ is regarded as its output.
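Definition 3.1 can be made concrete with a short Python sketch. The function names, the representation of A as a list of rows and the calling convention `phi(own_state, neighbour_states)` are assumptions of this sketch and do not describe the implementation used in this work.

```python
def check_coupling(A, k):
    """Validate that A describes a loop-free coupling structure with out-degree k."""
    n = len(A)
    for u, row in enumerate(A):
        if len(row) != k or u in row or any(not 0 <= v < n for v in row):
            raise ValueError(f"invalid neighbourhood for cell {u}")


def gbca_step(A, state, phi):
    """One synchronous update: cell u reads its own state and the states of A[u][0..k-1]."""
    return [phi(state[u], tuple(state[v] for v in A[u])) for u in range(len(A))]


def run_gbca(A, initial, phi, t_max):
    """Return the configuration reached after t_max steps from the initial configuration."""
    state = list(initial)
    for _ in range(t_max):
        state = gbca_step(A, state, phi)
    return state


def majority(own, neighbour_states):
    """Example rule: deterministic majority over the cell and its k neighbours (k even)."""
    return 1 if 2 * (own + sum(neighbour_states)) > len(neighbour_states) + 1 else 0
```

As a usage example, with the lattice coupling `A = [[(u - 1) % 7, (u + 1) % 7] for u in range(7)]`, the configuration `[1, 1, 1, 1, 1, 0, 0]` is a fixed point of `majority`, which illustrates how blocks of equal states can persist on a lattice.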

In the following, the term standard CA is used to refer to the CA model defined in Def. 2.1, which is equivalent to a GBCA using the regular lattice coupling structure with the same parameters N and k. Where there is no risk of confusion, the term CA will be used to refer to either of the two models, standard CA or GBCA.

Note that there is an order defined on the set of outgoing edges of each cell in a GBCA, which is determined by the column index in the matrix A. According to this order, it is possible to transform a standard one-dimensional CA, whose neighbours can be ordered from “left-most” to “right-most” or the other way round, into a GBCA by using a simple “rewiring” procedure similar to the β-model for small-world graphs (see below). Moreover, the outgoing degree is fixed by definition, as opposed to the incoming degree, which is variable in the range from 0 to N − 1.

The GBCA model is essentially the same as the one Moreira et al. used in [17]. However, their work did not include the evolution of the graph based coupling structures, so that no adaptation between GBCA rules and the structure of the cell connections could be studied.

3.1.2 Coevolutionary Model

The coevolutionary model used in this work was obtained by extending the model of Juillé and Pollack, published in [12], which includes the species of initial configurations and CA rules, by a third species, the species of directed regular graphs. These graph individuals are then combined pairwise with rule individuals, whose genotypes only include a representation of the rule table, to form a joint GBCA used for fitness computation. Therefore, a triangular dependency structure is introduced (see Fig. 3.1).

The distinction of GBCA rules from underlying graph structures is introduced for mainly two reasons. First, it is assumed that there exist different groups of rules suitable for sets of coupling structures showing similar properties. In this case, the model of a candidate solution consisting of both a rule table and a coupling structure would in fact limit the algorithm’s search through the solution space to only consider pairwise combinations of rules and graphs.


Figure 3.1: Coevolutionary optimisation of graph based cellular automata

(a) Original model by Juillé and Pollack: Interdependence between both species based on rule and configuration weights. Diagram labels: Initial Configuration Population, Rule Population, <Weight>, Competition.

(b) Approach followed in this work: Interdependence between all three species based on rule, configuration and graph weights. Additionally, the graph individuals influence the behaviour of the rule individuals on the phenotype level. For the particular formulae see below. Diagram labels: Graph Population, Initial Configuration Population, Rule Population, <Weight>, <Weight, CouplingStructure>, Cooperation, Competition.

Second, the purpose of studying the influence of structural changes on dynamic systems, such as GBCAs, motivates the distinction between rules and coupling structure. The emergence of high performance automata rules and graphs with small-world properties could furthermore indicate the natural occurrence of such systems.

Note that after the addition of the graph species to the model of Juille and Pollack described in Chapter 2, this model is no longer strictly competitive. The populations of rule and graph individuals stand in a cooperative relationship to each other, while both of them compete with the population of initial configurations.

The characteristic values for the individuals of each species are defined in accordance with Definitions 2.5 to 2.9. Let R denote the population of GBCA rule individuals, let IC denote the population of initial configuration individuals and let G denote the population of graph individuals. For the particular encoding of the genomes see below.

Let covered be a |G| × |R| × |IC| matrix indicating which graph and rule combination of the current generations defeated which configuration individual of the current generation of configurations. That is, $\mathit{covered}_{i,j,k} = 1$ if GBCA rule j using the coupling structure specified by graph i correctly classified initial configuration k, and $\mathit{covered}_{i,j,k} = 0$ otherwise. All graphs are considered directed and regular in terms of a uniform outgoing degree, therefore forming valid GBCA coupling structures with respect to Def. 3.1. In the following, the terms graph and graph individual, rule and rule individual, as well as initial configuration and initial configuration individual are used interchangeably.

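Definition 3.2. The weight $\mathit{weight\_IC}_k$ of the initial configuration $IC_k$ is computed as follows:

$$
\mathit{weight\_IC}_k :=
\begin{cases}
\dfrac{1}{\sum_{i=1}^{|G|} \sum_{j=1}^{|R|} \mathit{covered}(i,j,k)} & \text{if } \exists i, j : \mathit{covered}(i,j,k) \equiv 1 \\[2ex]
1 & \text{otherwise}
\end{cases}
$$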

Intuitively, the more pairs of GBCA rules and graphs fail to correctly classify an initial configuration $IC_k$, the higher its weight. Note that this is a straightforward generalisation of the weight of initial configurations according to Def. 2.5.
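The weight of Def. 3.2 can be illustrated with a minimal Python sketch. The nested-list layout `covered[i][j][k]` and the function name `weight_ic` are assumptions made for this illustration only; they are not taken from the implementation described later.

```python
def weight_ic(covered, k):
    """Weight of initial configuration k following Def. 3.2.

    covered[i][j][k] is 1 if rule j on graph i classified configuration k
    correctly, and 0 otherwise (hypothetical nested-list layout).
    """
    total = sum(
        covered[i][j][k]
        for i in range(len(covered))
        for j in range(len(covered[i]))
    )
    # No covering graph-rule pair at all: the weight defaults to 1.
    return 1.0 / total if total > 0 else 1.0


# 2 graphs x 2 rules x 2 configurations
covered = [
    [[1, 0], [1, 0]],
    [[1, 0], [0, 1]],
]
w0 = weight_ic(covered, 0)  # covered by three graph-rule pairs
w1 = weight_ic(covered, 1)  # covered by a single graph-rule pair
```

In this example the configuration that is covered by fewer graph-rule pairs receives the higher weight.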

Definition 3.3. For any graph $G_i$, rule $R_j$ and initial configuration $IC_k$ with density $\rho(IC_k)$, the informational value $E(G_i, R_j, \rho(IC_k))$ is defined as follows:

$$
E(G_i, R_j, \rho(IC_k)) :=
\begin{cases}
\log(2) + p \ast \log(p) + q \ast \log(q) & \text{if } p > 0 \wedge q > 0 \\
1 & \text{otherwise}
\end{cases}
$$

where p is the fraction of configurations with density $\rho(IC_k)$ that $R_j$ classifies correctly using the coupling structure specified by the graph $G_i$, and q is the fraction of these configurations that $R_j$ and $G_i$ fail to classify correctly.

Remark: q = 1 − p, and E() can be regarded as the inverse of the entropy, indicating whether $R_j$ and $G_i$ perform strictly better or strictly worse than random guessing for configurations with density $\rho(IC_k)$. See Fig. 2.4 for a plot of the function depending on p.
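As a small illustration of Def. 3.3, the following Python sketch evaluates the informational value for a given p. The base of the logarithm is not stated in the text; base 2 is assumed here because it makes the value 1 of the "otherwise" branch coincide with the limit for p → 0 and p → 1. The function name is hypothetical.

```python
import math


def informational_value(p):
    """Informational value of Def. 3.3 for a success fraction p (q = 1 - p).

    Assumption: logarithm to base 2, so that log(2) = 1.
    """
    q = 1.0 - p
    if p > 0 and q > 0:
        return math.log2(2) + p * math.log2(p) + q * math.log2(q)
    return 1.0


guessing = informational_value(0.5)  # indistinguishable from random guessing
always_right = informational_value(1.0)
always_wrong = informational_value(0.0)
```

The value vanishes for p = 0.5 and is maximal when the rule-graph pair is consistently right or consistently wrong, which is why it is combined with the covered matrix in the weight definitions.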

The definition of the informational value is essentially the generalised version of Def. 2.6, transferred to GBCA rules and graph based coupling structures. Note that p is estimated by the same method as described above: To merge performance data and obtain a more meaningful value for p, the range of density values is aggregated into density bins of size 2. However, instead of taking every graph into account, the informational value is computed on a per-graph basis because it is later used for the weight computation of graph individuals. The weight of GBCA rules is defined similarly to Def. 2.7.

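Definition 3.4. The weight $\mathit{weight\_R}_j$ of the rule individual $R_j$ is computed as follows:

$$
\mathit{weight\_R}_j :=
\begin{cases}
\dfrac{1}{\sum_{i=1}^{|G|} \sum_{k=1}^{|IC|} E(G_i, R_j, \rho(IC_k)) \ast \mathit{covered}(i,j,k)} & \text{if } \exists k, i : E(G_i, R_j, \rho(IC_k)) \ast \mathit{covered}(i,j,k) \neq 0 \\[2ex]
1 & \text{otherwise}
\end{cases}
$$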
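A minimal Python sketch of Def. 3.4 follows. It assumes that the informational values have already been evaluated into a nested list `info[i][j][k]` with the same shape as `covered`; this layout and the function name are illustrative assumptions.

```python
def weight_rule(info, covered, j):
    """Weight of rule j following Def. 3.4.

    info[i][j][k] holds E(G_i, R_j, rho(IC_k)); covered[i][j][k] is 0 or 1.
    Both are hypothetical nested lists indexed as graph, rule, configuration.
    """
    total = sum(
        info[i][j][k] * covered[i][j][k]
        for i in range(len(covered))
        for k in range(len(covered[i][j]))
    )
    return 1.0 / total if total != 0 else 1.0


# 1 graph x 2 rules x 2 configurations
info = [[[0.5, 0.25], [1.0, 1.0]]]
covered = [[[1, 1], [0, 0]]]
w_rule0 = weight_rule(info, covered, 0)  # 1 / (0.5 + 0.25)
w_rule1 = weight_rule(info, covered, 1)  # nothing covered: default 1
```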

The definitions of the weight and fitness of graph individuals are motivated by the following considerations:

1. The graph based coupling structure should enable as many GBCA rules as possible to correctly classify a maximum of initial configurations. This contains the underlying assumption described above: the existence of graphs that are suitable for a whole set of rules, possibly those rules that show similar behaviour. However, in the presence of a GBCA rule that performs significantly poorly for a large set of graphs, the fitness of the particular graph should not be influenced too much by this rule. The same holds for the rules: They should perform well for as wide a range of graphs as possible, but graphs that for some reason are not suitable for any rule might be disregarded. Note that this is essentially the effect enabled by resource sharing.

2. Random guessing should be avoided. For this reason, the informational value, which has already proven successful in the case of coevolution of standard CA rules (following the approach by Juille and Pollack in [12]), was integrated into the graph fitness function.

3. If no correct classification is possible, it is beneficial that the automata reach at least a consensus state. This should prevent the creation of circular state loops or regions of stable states within the automaton, similar to the problems experienced by the majority rule (see p. 32). Evolution results and experiments have shown that this problem does not occur while coevolving rules and coupling structures, which is perhaps an interesting result on its own. See the next chapter for details.

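Definition 3.5. The weight $\mathit{weight\_G}_i$ of the graph individual $G_i$ is computed as follows:

$$
\mathit{weight\_G}_i :=
\begin{cases}
\dfrac{\sum_{j=1}^{|R|} \sum_{k=1}^{|IC|} E(G_i, R_j, \rho(IC_k)) \ast \mathit{covered}(i,j,k)}{\sum_{j=1}^{|R|} \sum_{k=1}^{|IC|} E(G_i, R_j, \rho(IC_k))} & \text{if } \exists j, k : E(G_i, R_j, \rho(IC_k)) \neq 0 \\[2ex]
1 & \text{otherwise}
\end{cases}
$$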

The definition of the graph weight captures the issues of the first two considerations given above: The graph should enable correct classifications for as many rules as possible while avoiding random guessing of rules.
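The graph weight of Def. 3.5 can be sketched in Python as a ratio of two sums. As before, the nested lists `info[i][j][k]` (precomputed informational values) and `covered[i][j][k]` as well as the function name are assumptions made for illustration.

```python
def weight_graph(info, covered, i):
    """Weight of graph i following Def. 3.5.

    Numerator: informational value accumulated over covered triples.
    Denominator: informational value accumulated over all triples.
    """
    pairs = [
        (j, k)
        for j in range(len(info[i]))
        for k in range(len(info[i][j]))
    ]
    numerator = sum(info[i][j][k] * covered[i][j][k] for j, k in pairs)
    denominator = sum(info[i][j][k] for j, k in pairs)
    # Informational values are non-negative, so the denominator is zero
    # exactly when every value is zero; the weight then defaults to 1.
    return numerator / denominator if denominator != 0 else 1.0


info = [[[0.5, 0.25], [1.0, 1.0]]]
covered = [[[1, 1], [0, 0]]]
w_graph0 = weight_graph(info, covered, 0)  # 0.75 / 2.75
```

Triples with an informational value close to zero (random guessing) contribute little to either sum, so they barely influence the graph weight.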

The definitions of the initial configuration and rule fitness functions are closely related to the original model (see Def. 2.8 and Def. 2.9), as the sum over all graphs is taken. Note that this also includes the implicit assumption mentioned above, namely the existence of a class of rule individuals that are suitable for a wide range of similar graphs, which is not obvious in the first place.

Definition 3.6. The fitness $f(IC_k)$ of an initial configuration $IC_k$ is computed as follows:

$$
f(IC_k) := \sum_{i=1}^{|G|} \sum_{j=1}^{|R|} \mathit{weight\_G}_i \ast \mathit{weight\_R}_j \ast E(G_i, R_j, \rho(IC_k)) \ast (1 - \mathit{covered}(i,j,k))
$$

Definition 3.7. The fitness $f(R_j)$ of a rule individual $R_j$ is computed as follows:

$$
f(R_j) := \sum_{i=1}^{|G|} \sum_{k=1}^{|IC|} \mathit{weight\_G}_i \ast \mathit{weight\_IC}_k \ast \mathit{covered}(i,j,k)
$$
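The two fitness functions of Def. 3.6 and Def. 3.7 translate directly into double sums. The following Python sketch assumes precomputed weight lists and the nested-list layout `info[i][j][k]` and `covered[i][j][k]`; all names are hypothetical.

```python
def fitness_ic(w_graph, w_rule, info, covered, k):
    """Fitness of configuration k (Def. 3.6): payoff for every defeat."""
    return sum(
        w_graph[i] * w_rule[j] * info[i][j][k] * (1 - covered[i][j][k])
        for i in range(len(w_graph))
        for j in range(len(w_rule))
    )


def fitness_rule(w_graph, w_ic, covered, j):
    """Fitness of rule j (Def. 3.7): payoff for every correct classification."""
    return sum(
        w_graph[i] * w_ic[k] * covered[i][j][k]
        for i in range(len(w_graph))
        for k in range(len(w_ic))
    )


# 1 graph x 2 rules x 2 configurations with made-up weights
w_graph = [0.5]
w_rule = [2.0, 1.0]
w_ic = [1.0, 0.5]
info = [[[0.5, 0.25], [1.0, 1.0]]]
covered = [[[1, 0], [0, 0]]]
f_ic = [fitness_ic(w_graph, w_rule, info, covered, k) for k in range(2)]
f_rule = [fitness_rule(w_graph, w_ic, covered, j) for j in range(2)]
```

Note that the graph weight scales both sums, which is the property discussed in the text that follows.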



One should note that the graph weight $\mathit{weight\_G}_i$ is taken into account equally in both fitness values, rule and configuration fitness, with different goals. Recall that the weight of a graph reflects its capability to enable as many rules as possible to classify a maximum of configurations. This disregards the possibility of the evolution of a single rule-graph pair that is highly successful, while both rule and graph fail in general. However, the coevolution algorithm was designed to consider as broad a range of rule and graph combinations as possible, so this is acceptable. Furthermore, the results presented in Chapter 4 indicate that this restriction does not prevent the algorithm from evolving high performance rules and graphs.

The $\mathit{weight\_G}_i$ factor within the configuration fitness computation rewards the configuration for rules that fail to classify the configuration using an otherwise successful graph structure. Conversely, if the graph has a low weight (which corresponds to a low overall rule performance using this graph), a false classification is considered less significant.

The function of the graph weight in the fitness computation of rules is to stabilise the interdependence between the two species. The event that a single rule succeeds in classifying a configuration that defeats all other rules (which would result in a high payoff for the rule) is regarded as less meaningful for a graph structure that has proved unsuitable for the majority of rules (corresponding to a low graph weight). Conversely, if the same event happens for a graph that has proven to be suitable for a wide range of rules, the event is regarded as more significant. In this sense: The well-being of the whole population is more important than the well-being of single individuals.

The definition of the fitness of graph individuals utilises the defined weight characteristics for the species of the rules and configurations but also contains an additional factor. This factor assesses the occurrence of irregular edges contained in the graph genome. The representation of the candidate solutions in the graph genomes is similar to the encoding used by Tomassini et al. in [21]. However, Tomassini considered undirected, non-regular graphs and only took mutation into account, therefore disregarding the potential genetic diversification that accompanies pairwise recombination, such as crossover. In this work, a graph individual $G_i$ is represented by the matrix A for GBCAs, which determines the adjacency between the cells according to Def. 3.1. The genetic operators mutation and crossover are described in detail below.

Definition 3.8. For all graph individuals $G_i$, given by the N × k matrix A encoded in its genome, define: Let u, v be cell indices with 0 ≤ u, v < N and let e = (u, v) be an edge of the graph that is represented by $G_i$. Then e is called regular if it coincides with the corresponding edge in a lattice graph with the same number of vertices N and outgoing degree k as the graph individual $G_i$. In particular:

$$
\exists j.\; a_{u,j} \equiv v \;\wedge\; \left( \left( j < \frac{k}{2} \wedge \left(u + j - \frac{k}{2} + N\right) \,\%\, N \equiv v \right) \vee \left( j \geq \frac{k}{2} \wedge \left(u + j + 1 - \frac{k}{2}\right) \,\%\, N \equiv v \right) \right) \;\Leftrightarrow\; \text{the edge } e \text{ is regular}
$$


An edge that is not regular is called irregular.
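The regularity test of Def. 3.8 can be sketched in Python as follows. The sketch assumes an even outgoing degree k and stores the matrix A as a list of N rows with k entries each; the function names are hypothetical.

```python
def lattice_target(u, j, n, k):
    """End point of the j-th outgoing edge of cell u in the lattice graph."""
    half = k // 2  # assumption: k is even
    if j < half:
        return (u + j - half + n) % n
    return (u + j + 1 - half) % n


def lattice_graph(n, k):
    """Adjacency matrix A of the regular directed lattice graph."""
    return [[lattice_target(u, j, n, k) for j in range(k)] for u in range(n)]


def is_regular_edge(a, u, v):
    """True if some column j stores v and v is the lattice target for j."""
    n, k = len(a), len(a[u])
    return any(
        a[u][j] == v and lattice_target(u, j, n, k) == v for j in range(k)
    )


def irregular_fraction(a):
    """Fraction of irregular edges among all N * k edges of the graph."""
    n, k = len(a), len(a[0])
    irregular = sum(
        1 for u in range(n) for v in a[u] if not is_regular_edge(a, u, v)
    )
    return irregular / (n * k)


a = lattice_graph(8, 4)  # cell 0 points to cells 6, 7, 1 and 2
a[0][3] = 5              # introduce one short-cut
```

After the single rewiring, 1 of the 32 edges is irregular.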

Experimental results have shown a significant drift of graph individuals towards the random graph domain during an early stage of the coevolutionary process. This happens because the rules need a certain time to develop sophisticated behaviour, so that the graphs get little or no feedback before the rules reach that stage. To slow down this drift, the fraction of irregular edges was introduced as a factor in the graph fitness function. This sustains a certain bias towards graphs that are still structurally similar to lattice graphs.

Definition 3.9. The fitness $f(G_i)$ of a graph individual $G_i$ is computed as follows:

$$
f(G_i) := \sum_{j=1}^{|R|} \sum_{k=1}^{|IC|} \mathit{weight\_R}_j \ast \mathit{weight\_IC}_k \ast \mathit{covered}(i,j,k) \ast (1 - \mathit{irregular\_edges})
$$

where $\mathit{irregular\_edges}$ is the fraction of irregular edges within the graph.
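A short Python sketch of Def. 3.9 shows how the fraction of irregular edges damps the payoff of a graph. The weight lists, the `covered[i][j][k]` layout and the function name are assumptions made for illustration; the fraction of irregular edges is passed in as a number.

```python
def fitness_graph(w_rule, w_ic, covered, i, irregular_fraction):
    """Fitness of graph i (Def. 3.9).

    The factor (1 - irregular_fraction) is constant per graph, so it is
    applied once to the accumulated payoff.
    """
    payoff = sum(
        w_rule[j] * w_ic[k] * covered[i][j][k]
        for j in range(len(w_rule))
        for k in range(len(w_ic))
    )
    return payoff * (1.0 - irregular_fraction)


w_rule = [2.0, 1.0]
w_ic = [1.0, 0.5]
covered = [[[1, 0], [1, 1]]]  # 1 graph x 2 rules x 2 configurations
f_lattice = fitness_graph(w_rule, w_ic, covered, 0, 0.0)
f_rewired = fitness_graph(w_rule, w_ic, covered, 0, 0.25)
```

With identical classification results, the graph with a quarter of its edges rewired obtains three quarters of the fitness of the pure lattice graph.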

In the following, the basic coevolutionary process is outlined.

1. Initialise the rule and the initial configuration population randomly according to a uniform distribution of their densities over the interval [0, 1]. Initialise the population of graphs with identical copies of the regular directed lattice graph with the same parameters N and k.

2. Compute a run for each graph, rule and initial configuration generated in the previous step and save the results in the covered matrix. Thus, |G| ∗ |R| ∗ |IC| runs have to be calculated and checked for correct classification.

3. Compute the weight for each individual of every population.

4. Compute the fitness for each individual of every population and rank the individuals according to their fitness value.

5. To form the next generation for each species proceed as follows:

For the rules and graphs: Take a fraction of the best individuals over to the next generation without modification (the elitist group) and perform crossover and mutation between these individuals to create the remaining individuals of the new generation.

For the initial configurations: Take a fraction of the best individuals and for each of them add a new initial configuration with the same density to the next generation. Form the remaining individuals of the new generation randomly according to a uniform density distribution over [0, 1].


Footnote 1: Typical evolution run length: several thousand generations.



For the particular parameter values see Chapter 4.
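Step 5 for the initial configurations can be sketched in Python as follows. This is one possible reading of the outline, in which each of the best individuals is replaced by a fresh configuration of the same density; the function names, the parameter `elite_fraction` and the way a density is turned into a bit string are assumptions.

```python
import random


def random_configuration(n, density, rng):
    """Random bit string of length n with round(density * n) ones."""
    ones = round(density * n)
    cells = [1] * ones + [0] * (n - ones)
    rng.shuffle(cells)
    return cells


def next_ic_generation(population, fitness, elite_fraction, rng):
    """Form the next generation of initial configurations."""
    size, n = len(population), len(population[0])
    ranked = sorted(range(size), key=lambda idx: fitness[idx], reverse=True)
    new_population = []
    # For each of the best individuals: a new configuration, same density.
    for idx in ranked[: int(elite_fraction * size)]:
        density = sum(population[idx]) / n
        new_population.append(random_configuration(n, density, rng))
    # Remaining individuals: densities drawn uniformly from [0, 1).
    while len(new_population) < size:
        new_population.append(random_configuration(n, rng.random(), rng))
    return new_population


rng = random.Random(1)
population = [random_configuration(20, rng.random(), rng) for _ in range(8)]
fitness = [float(idx) for idx in range(8)]  # made-up fitness values
offspring = next_ic_generation(population, fitness, 0.25, rng)
```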

The rule and configuration individuals, as well as the genetic operators working on them, are essentially modelled in the same way as by Mitchell et al. and Juille and Pollack (see Chapter 2 for details). The mutation and crossover operators working on the graph individuals are described below.

(a) Mutation: Let A be the N × k matrix representing the genome of the graph individual $G_i$, which determines the adjacency between the cells. Random mutation works on single edges: The graph genome is traversed edge by edge. For each edge, a random value rand is drawn uniformly from [0, 1]. If rand ≤ mutation_rate, the end point of the edge is relocated; otherwise nothing happens. Let u and v be cell indices with 0 ≤ u, v < N and $a_{u,j} = v$ for a j with 0 ≤ j < k, and let (u, v) be the edge to be mutated. Two cases are distinguished:

The edge to be mutated is regular: A random vertex is chosen uniformly from all vertices within the graph such that no loops or multiple edges can form, and the edge is rewired to point to that particular vertex. In detail: Let w be a random cell index, chosen uniformly from 0 ≤ w < N with $w \neq u$ and $\nexists\, j' : a_{u,j'} \equiv w$. A is modified to A′ as follows: $a'_{u,j} = w$ and $\forall\, 0 \leq u' < N,\ \forall\, 0 \leq j' < k :\ u' \neq u \vee j' \neq j \Rightarrow a'_{u',j'} = a_{u',j'}$.

The edge to be mutated is irregular: In this case, a new random value rand2 is drawn uniformly from [0, 1]. Again, two cases are distinguished:

1. rand2 ≤ rewire_back_probability (see footnote 2): The edge is relocated to the target vertex that it would possess in the corresponding lattice graph. More precisely, A is modified to A′ as follows:

$$
a'_{u,j} :=
\begin{cases}
\left(u + j - \frac{k}{2} + N\right) \,\%\, N & \text{for } j < \frac{k}{2} \\[1ex]
\left(u + j + 1 - \frac{k}{2}\right) \,\%\, N & \text{for } j \geq \frac{k}{2}
\end{cases}
$$

and $\forall\, 0 \leq u' < N,\ \forall\, 0 \leq j' < k :\ u' \neq u \vee j' \neq j \Rightarrow a'_{u',j'} = a_{u',j'}$.

2. rand2 > rewire_back_probability: The edge is processed in the same way as a regular edge.
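The mutation operator can be sketched in Python as follows. The sketch assumes an even outgoing degree k, stores A as a list of rows and tests regularity for the column in which the edge is stored; the function names are hypothetical. Note that Python's `random()` draws from [0, 1), which differs from the closed interval only on a set of measure zero.

```python
import random


def lattice_target(u, j, n, k):
    """End point of the j-th outgoing edge of cell u in the lattice graph."""
    half = k // 2  # assumption: k is even
    if j < half:
        return (u + j - half + n) % n
    return (u + j + 1 - half) % n


def mutate(a, mutation_rate, rewire_back_probability, rng):
    """Return a mutated copy of the adjacency matrix a."""
    n, k = len(a), len(a[0])
    out = [row[:] for row in a]
    for u in range(n):
        for j in range(k):
            if rng.random() > mutation_rate:
                continue  # this edge is left untouched
            regular = out[u][j] == lattice_target(u, j, n, k)
            if not regular and rng.random() <= rewire_back_probability:
                # Irregular edge, case 1: rewire back to the lattice target.
                out[u][j] = lattice_target(u, j, n, k)
            else:
                # Regular edge, or irregular edge case 2: random new end
                # point that creates neither a loop nor a multiple edge.
                candidates = [w for w in range(n) if w != u and w not in out[u]]
                if candidates:
                    out[u][j] = rng.choice(candidates)
    return out


rng = random.Random(0)
lattice = [[lattice_target(u, j, 10, 4) for j in range(4)] for u in range(10)]
rewired = mutate(lattice, 1.0, 0.0, rng)   # every edge gets a new end point
restored = mutate(rewired, 1.0, 1.0, rng)  # every edge is rewired back
```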

(b) Recombination (crossover): A random crossover cell $v_{cross}$ is chosen uniformly from all cells within the automaton, excluding the first and the last cell in terms of their indices. Then, the adjacency records in A from all cells with an index $v < v_{cross}$ from the first parent graph and $v \geq v_{cross}$ from the second parent are copied to form the first offspring. The genome of the second offspring is formed correspondingly. In particular: Let $A^{P_1}$ and $A^{P_2}$ be N × k matrices representing the genome of the first and the second parent graph respectively, and let $A^{O_1}$ and $A^{O_2}$ be N × k matrices representing the genome of the first and the second offspring graph respectively. Then $A^{O_1}$ and $A^{O_2}$ are constructed as follows:

$$
a^{O_1}_{v,j} :=
\begin{cases}
a^{P_1}_{v,j} & \forall j,\ v < v_{cross} \\
a^{P_2}_{v,j} & \forall j,\ v \geq v_{cross}
\end{cases}
\qquad
a^{O_2}_{v,j} :=
\begin{cases}
a^{P_2}_{v,j} & \forall j,\ v < v_{cross} \\
a^{P_1}_{v,j} & \forall j,\ v \geq v_{cross}
\end{cases}
$$

Footnote 2: rewire_back_probability is a typically small value determining the probability of a rewiring operation towards the corresponding lattice graph.

Note that each of the two operations always forms valid graph individuals, in the sense that all offspring represent loop-free, directed regular graphs (see footnote 3). In fact, the restriction to these properties facilitates this simple formulation of the genetic operators, which leads to the promising results described in the following chapter. Like the fitness computation, the mutation operator also takes irregular edges into account. The reason is essentially the same as the one mentioned above: to reduce the strong drift towards random graphs at the beginning of coevolution.
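The one-point crossover on adjacency records can be sketched in Python as follows. Graph genomes are stored as lists of rows; the function names are hypothetical.

```python
import random


def crossover_at(p1, p2, v_cross):
    """Exchange the adjacency records of all cells v >= v_cross."""
    o1 = [row[:] for row in p1[:v_cross]] + [row[:] for row in p2[v_cross:]]
    o2 = [row[:] for row in p2[:v_cross]] + [row[:] for row in p1[v_cross:]]
    return o1, o2


def crossover(p1, p2, rng):
    """Crossover at a random cell, excluding the first and the last cell."""
    v_cross = rng.randrange(1, len(p1) - 1)  # 1 <= v_cross <= N - 2
    return crossover_at(p1, p2, v_cross)


p1 = [[1, 2], [2, 3], [3, 0], [0, 1]]
p2 = [[3, 1], [0, 2], [1, 3], [2, 0]]
o1, o2 = crossover_at(p1, p2, 2)
r1, r2 = crossover(p1, p2, random.Random(0))
```

Because whole rows are exchanged, every cell keeps exactly k outgoing edges and no loops are introduced, provided that both parents are valid.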

3.2 Implementation

This section presents the three main components of the system: the component for performing the cellular automata computations, the coevolutionary system and the component responsible for logging and accumulating evolution data.

First, the distributed architecture for computing CA runs is described, including its main classes. The implementation of the coevolutionary system is sketched in a bottom-up manner, describing the most important classes. Finally, the logging system is outlined, followed by a description of the composed system and an example evolution run.

See the appendix for additional information about the class design. In this section, the terms “class” and “entity” are used interchangeably if the context is clear. Class and member function names are written in italics.

In this implementation section, the term GBCA is used to refer either to standard (lattice based) cellular automata or to automata based on regular directed graphs according to Def. 3.1, as the former can also be regarded as GBCAs running on a lattice graph. Any differences concerning the implementation are noted in context.

The system was required to be capable of performing automata runs for both types, GBCA and standard CA. Therefore, both automata are integrated into the system, implemented in separate entities for the sake of runtime optimisations.

Footnote 3: The introduction of multiple edges is generally possible in the event of a mutation followed by a relocation to a regular edge pointing to the same vertex, though their appearance is unlikely depending on model parameters. However, results have shown that evolution prevents them from spreading within the population.



3.2.1 Distributed Architecture

The most time-consuming aspect of evolutionary algorithms is usually fitness evaluation. This is also the case here. As fitness evaluation may include a large number of GBCA runs, using different graph based coupling structures and initial configurations, each consisting of a large number of time-steps, particular attention was paid to the performance of this part of the system. Therefore, the evaluation of the different GBCA runs was designed to be modular, in the sense that it is possible to distribute it over a group of computer workstations. A later hardware implementation of the GBCA core, based on an FPGA (see footnote 4) system architecture, was also taken into account. See [26–28] for details on this topic. To separate the process of automata computations from the coevolutionary system, a distributed architecture was chosen on a client-server basis, including one client and potentially multiple servers. The basic setup consists of the coevolutionary system running on the client side, while the automata computations needed for fitness evaluations are performed on the server side. The following concepts and entities are introduced to model the system:

CA: The entity that performs the GBCA runs. After accepting a set of initial configurations and GBCA rules, this entity computes the final state of each of the automata for each of the configurations. If the automata are using graph based coupling structures (that is, if they are non-standard CAs), these have to be specified, too. The maximal number of steps the automata are allowed to proceed is determined before the computation starts.

CAHost: The entity that handles GBCA top-level computation requests. It provides methods to load a set of rules, initial configurations and graph coupling structures. It also includes methods to fetch the output (the end states of the automata) after the computation is finished. The CAHost contains a set of CAInterfaces, which it uses to distribute the computation load.

CAInterface: The entity that handles low-level GBCA computation requests (see below). There are three possible execution modes:

local mode: Requests are coming from the local CAHost that owns this CAInterface. They are handed over by a MessageChannel, which is also used to pass back computation results. The computations are performed on the same host in a CA entity. The CA and the CAInterface communicate with each other over a second MessageChannel. This execution mode is chosen if GBCA computations are required to be performed locally. No network traffic is necessary, which reduces communication overhead.

server mode: The request is received via a TCP connection, which is also used to transmit back computation results. In this mode, the CAInterface is part of a CAServer, which possibly holds several CAInterfaces. The computations are performed within the CAServer, in a CA entity with which the CAInterface communicates by using a MessageChannel. Each CAInterface is provided with its own CA. The remote CAInterface from which the request originates is required to run in client mode. This execution mode is chosen when the computations are to be distributed among one or several systems, so that the CAServer provides computation resources to the CAClient.

Footnote 4: Field-Programmable Gate Array

client mode: Requests are coming from the local CAHost that owns this CAInterface. They are handed over by a MessageChannel. Instead of passing the request to a local CA, the CAInterface forwards it to a remote CAInterface, which is running in server mode, by using a network connection. After the resulting data has arrived from the remote side, the CAInterface passes it back to the CAHost using the MessageChannel. This execution mode is chosen when the computations are to be distributed among one or several computer systems, so that the CAServer provides computation resources to the CAClient. However, if CAServer and CAClient are running on the same physical system, the local mode is preferable.

CAServer: The entity that hosts a set of CAInterfaces and waits for incoming connection requests. When a CAClient contacts the CAServer, a new CAInterface and CA entity are generated. The further communication is processed between the CAInterfaces directly.

MessageChannel: The CAHost, CAServer, CAInterface and CA entities run concurrently as single threads, though not necessarily on the same host. Local synchronisation and data exchange between CAHost and CAInterface, CAServer and CAInterface, and CAInterface and CA are performed using shared MessageChannel objects. The MessageChannel includes semaphores for synchronisation and provides temporary storage for data records.

computation request: This term is used to refer to a set of three groups of data records: rules, graph based coupling structures and initial configurations. If the request includes standard CAs, no graphs are needed, as the coupling structure of cells is already determined by the CA type. The CAHost or CAInterface accepting such a request is expected to return one data record per graph-rule-configuration triple, or per rule-configuration pair in the case of a standard CA. This data record typically consists of the final state of the automaton.

The different execution modes and system setups are depicted in Fig. 3.2. The outside coevolutionary system uses the CAHost entity to request GBCA runs for fitness evaluation. Therefore, it is completely transparent to the coevolutionary system how many workstations are involved in the fitness computation, or even whether the GBCA runs are performed in software or hardware. This was one of the main goals of the software design.
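The role of the MessageChannel between two concurrently running entities can be illustrated with a minimal Python sketch. It is not the implementation described here: the method names `put` and `get`, the worker function and the use of `None` as a stop signal are assumptions chosen for the illustration. A counting semaphore signals available data records, and a lock protects the temporary storage.

```python
import threading
from collections import deque


class MessageChannel:
    """Hypothetical channel: semaphore for synchronisation plus a buffer."""

    def __init__(self):
        self._records = deque()
        self._lock = threading.Lock()
        self._available = threading.Semaphore(0)

    def put(self, record):
        with self._lock:
            self._records.append(record)
        self._available.release()  # signal: one more record is waiting

    def get(self):
        self._available.acquire()  # block until a record is available
        with self._lock:
            return self._records.popleft()


def ca_worker(requests, results):
    """Stand-in for a CA thread: answers requests until it receives None."""
    while True:
        request = requests.get()
        if request is None:
            break
        results.put(("final state for", request))


requests, results = MessageChannel(), MessageChannel()
worker = threading.Thread(target=ca_worker, args=(requests, results))
worker.start()
requests.put("rule 0 / configuration 0")
reply = results.get()
requests.put(None)  # ask the worker to terminate
worker.join()
```

Using one channel per direction mirrors the description of the local mode, in which the CAInterface and the CA communicate over a MessageChannel of their own.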

Figure 3.2: Execution modes for distributed architecture

(a) Local execution: CAInterface in local mode, only communication over MessageChannel.

(b) Local/remote execution: CAInterfaces in local, client and server mode, communication over network and local MessageChannel.

(c) Remote execution: CAInterfaces in client and server mode, communication between CAInterfaces over network connection, between CAInterface and CA by using a MessageChannel.


The distribution of the computing load over the set of CAInterfaces is static and determined at system startup: Each CAInterface ifc is assigned a relative load factor $load_{ifc}$ ranging from 0 to 1, such that $\sum_{k=1}^{|IFC|} load_k = 1$. Here, IFC is the set of CAInterfaces the CAHost controls. Two cases have to be considered: First, in the case of the coevolution of graph based CA rules and coupling structures, the graphs are distributed among the IFC, which means that ifc receives $load_{ifc} \ast |G|$ graphs, |R| rules and |IC| configurations. Second, if the optimisation includes only rules and initial configurations (if the standard CA model is used), then ifc receives $load_{ifc} \ast |R|$ rules and |IC| configurations. In either case, ifc is required to produce an output state vector for each of the combinations (which corresponds to the configuration the automaton has reached after the maximal number of allowed cell update steps).

After the GBCA computations at all CAInterfaces have finished, the CAHost reassembles the data fragments and passes them back to the coevolutionary system. The CAHost could easily be extended to include a load-balancing mechanism that distributes the computation requests more flexibly over the set of available CAInterfaces, but this was not within the focus of this work.
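The static distribution by load factors can be sketched in Python as follows. The text does not state how fractional shares are rounded; rounding the cumulative share, as done here, is an assumption, as is the function name.

```python
def split_by_load(items, loads):
    """Split items into consecutive chunks proportional to the load factors.

    loads must sum to 1. Chunk boundaries are the rounded cumulative shares
    (an assumption), so every item is assigned exactly once.
    """
    if abs(sum(loads) - 1.0) > 1e-9:
        raise ValueError("load factors must sum to 1")
    n = len(items)
    bounds, accumulated = [0], 0.0
    for load in loads:
        accumulated += load
        bounds.append(round(accumulated * n))
    bounds[-1] = n  # guard against floating point drift
    return [items[bounds[i]:bounds[i + 1]] for i in range(len(loads))]


graphs = list(range(10))  # stand-ins for ten graph individuals
chunks = split_by_load(graphs, [0.5, 0.3, 0.2])
```

Each interface would additionally receive all rules and all initial configurations, so that it can compute every combination for its share of the graphs.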

3.2.2 The Coevolutionary System

Recall that there are three different species to consider: the cellular automata rules, which are represented by their rule table entries; the initial configurations, each of which forms a start state for the cellular automaton; and the regular directed graphs, whose genomes determine the adjacency of the cellular automata cells encoded as a matrix. Unless noted otherwise, the following refers to the GBCA model because the coevolution of rules and coupling structures for graph based automata is the main concern.

Generation

For each of the species, the coevolutionary system holds one population throughout the optimisation process. The state of the population during a single evolution step is included in the Generation entity. Below, the operations provided by the Generation class are outlined.

adjustWeights: This method requests the current Generation to adjust the weight values of the individuals according to the definitions given above. Note that the weights only depend on the values within the covered matrix, indicating which rule-graph pair was able to defeat which initial configuration during the last evaluation step.

advance: The population is requested to migrate to the next Generation. In practice, this method returns a new Generation object. Note that the coevolutionary system relies on elitism: For the rule and graph populations, individuals of the Matingpool are taken over to the next Generation unmodified. The new Generation is filled up with individuals created by pairwise recombinations of elements of the Matingpool. After the application of the crossover operator, mutation is applied as well.

evaluateFitness: The fitness of each individual is evaluated using the weights of individuals of other Generations, which were computed during the adjustWeights method call on these Generations. Note that this operation requires that all Generations have already completed adjusting their weights for the current evolution cycle.

evaluateOutput: This method is called at the beginning of a new Generation cycle. The Generation is requested to store the values of the global covered matrix in a format suitable for the population (see footnote 5). The stored data is accessed later during the next adjustWeights call.

generateInitialPopulation: This method is only called at the very beginning of the evolution to create the initial Generation for each population. Rules and configurations are initialised according to a uniform density distribution over the interval [0, 1]. The initial graph population consists of identical copies of a lattice graph with the same outgoing degree and number of vertices. Therefore, at the beginning of the optimisation process for GBCA rules and graphs, all cellular automata represented by a rule-graph pair are identical to the corresponding standard CA until short-cuts are introduced by mutation.

pack: This method transforms the Generation into a representation capable of being sent over a MessageChannel for local processing or over a network connection. Note that there are essentially four kinds of these data record representations: CaGraphs, CaRules, CaInputs and CaOutputs. The first three stand for graph, rule and initial configuration population data respectively. The last data record refers to the outcome of the automata computations. There are similar but distinct kinds of data sets for standard CA and GBCA. See also the appendix for the class hierarchy of participating classes.

select: A Matingpool is formed for each population. It consists of the individuals of the current Generation that are scheduled for reproduction, selected according to their fitness values.

These methods are encapsulated in higher-level operations because they have to be executed successively for each population. This task is performed by the GenerationManager, which is described next.

GenerationManager

The composition of the GenerationManager is depicted in Fig. 3.3. Note that there is one Logger entity assigned to each population. The function of these objects is mainly to compute progress output but also to save the status of each Generation to enable a later recovery of the evolved individuals for evaluation. See p. 50 for details concerning the logging system.

Footnote 5: In practice, matrix row and column permutations are performed to simplify a later access to its values.

Figure 3.3: GenerationManager: The candidate solutions for the three species are transformed into their corresponding data structures: GraphGeneration → CaGraphs, CaRuleGeneration → CaRules and CaInputGeneration → CaInputs. These are loaded into the CAHost entity for evaluation. The resulting output data record of type GraphOutputs is forwarded back to each population for fitness computation. The Logger entities are collecting Generation data for a later recovery of the coevolution data and for producing progress output during the optimisation. [Diagram entities: GenerationManager, GraphGeneration, CaRuleGeneration, CaInputGeneration, CaGraphs, CaRules, CaInputs, GraphOutputs, CAHost, GraphsLogger, CaRulesLogger, CaInputsLogger.]

The main methods of the GenerationManager are:

evaluateFitness: This method calls the pack method on all participating Generation objects within the evolutionary process and loads them into the CAHost. After the computations have finished, it fetches the resulting output from the CAHost and passes it to each Generation via the evaluateOutput operation. After requesting each Generation to adjust the individual weights (adjustWeights), the fitness computation is performed by executing the evaluateFitness method of each Generation.

select: This method calls select on each Generation.

shutdown: The shutdown operation is used to shut down the CAHost and all Logger entities. It is the responsibility of the CAHost to tear down the network connection if the run involved remote CAInterfaces. The Logger objects are required to close their data stream output and ensure that a later recovery of the evolved individuals is possible.



spawnNextGenerations: This method calls advance on each Generation and requests the corresponding Logger entities to save their status before migrating to the subsequent Generation.
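The call order inside the evaluateFitness operation of the GenerationManager can be made explicit with a small Python sketch. The stub classes only record the calls they receive; the CAHost method names `load` and `fetch_output`, the stub class names and the record strings are assumptions and do not reflect the actual class design.

```python
class StubGeneration:
    """Hypothetical stand-in for a Generation that logs received calls."""

    def __init__(self, name, log):
        self.name, self.log = name, log

    def pack(self):
        self.log.append(("pack", self.name))
        return self.name + "-records"

    def evaluateOutput(self, output):
        self.log.append(("evaluateOutput", self.name))

    def adjustWeights(self):
        self.log.append(("adjustWeights", self.name))

    def evaluateFitness(self):
        self.log.append(("evaluateFitness", self.name))


class StubCAHost:
    """Hypothetical stand-in for the CAHost (assumed method names)."""

    def __init__(self, log):
        self.log = log

    def load(self, records):
        self.log.append(("load", "host"))

    def fetch_output(self):
        self.log.append(("fetch_output", "host"))
        return "outputs"


def evaluate_fitness(generations, host):
    """Pack and load all data, fetch the output, then weights, then fitness."""
    host.load([g.pack() for g in generations])
    output = host.fetch_output()
    for g in generations:
        g.evaluateOutput(output)
    # All weights must be adjusted before any fitness value is computed.
    for g in generations:
        g.adjustWeights()
    for g in generations:
        g.evaluateFitness()


log = []
generations = [StubGeneration(n, log) for n in ("graphs", "rules", "configs")]
evaluate_fitness(generations, StubCAHost(log))
phases = [entry[0] for entry in log]
```

Separating the weight adjustment and the fitness computation into two passes over all Generations reflects the requirement that fitness values depend on the weights of the other species.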

The fitness evaluation requires the formulation of the fitness function. To facilitate a modular design, it was declared as a separate interface, the FitnessFunction.

FitnessFunction

In practice, the FitnessFunction contains only two methods: the computeFitness and the computeWeight method. However, it is necessary to include two versions of each method, to enable coevolution of two and of three species respectively. See below for the distinction between two and three population coevolution.

computeFitness: This method computes the Fitness of the current Individual, which is a member of a Generation entity. Note that because of the coevolution principle and resource sharing, the Fitness of the Individual depends on its own performance, on the performance of the opposing Generation and indirectly on the performance of the members of its own Generation.

computeWeight: This method computes the weight of the Individual according to the formulae given above.
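To make the role of the two operations more concrete, the following Python sketch declares a two-species variant of such an interface. Everything in it is an assumption for illustration: the snake-case method names, the boolean outcome matrix `covered`, and the concrete sharing scheme (an opponent's weight is the reciprocal of the number of individuals that defeat it), which follows the general idea of resource sharing as in [12] and is not meant to reproduce the formulae of this work.

```python
from abc import ABC, abstractmethod
from typing import List

# covered[i][j] is True if individual i defeats opponent j (hypothetical layout).
Outcome = List[List[bool]]


class FitnessFunction(ABC):
    """Hypothetical two-species variant of the FitnessFunction interface."""

    @abstractmethod
    def compute_weight(self, covered: Outcome, j: int) -> float:
        """Weight of opponent j."""

    @abstractmethod
    def compute_fitness(self, covered: Outcome, weights: List[float], i: int) -> float:
        """Fitness of individual i, given the opponents' weights."""


class SharedFitness(FitnessFunction):
    """Assumed sharing scheme: a defeated opponent is a shared resource."""

    def compute_weight(self, covered: Outcome, j: int) -> float:
        # The more individuals defeat opponent j, the less it is worth.
        winners = sum(1 for row in covered if row[j])
        return 1.0 / winners if winners else 0.0

    def compute_fitness(self, covered: Outcome, weights: List[float], i: int) -> float:
        # Sum the weights of all opponents that individual i defeats.
        return sum(w for w, won in zip(weights, covered[i]) if won)


# Two individuals, two opponents: individual 0 defeats both, individual 1 only the first.
covered = [[True, True], [True, False]]
ff = SharedFitness()
weights = [ff.compute_weight(covered, j) for j in range(2)]
fitness = [ff.compute_fitness(covered, weights, i) for i in range(2)]
print(weights, fitness)
```

In this toy example the individual that defeats the rarely defeated opponent collects the larger fitness, which illustrates why the Fitness of an Individual depends indirectly on the other members of its own Generation.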

Evolution

The coevolutionary system is designed to enable a broad application of the evolution principle, potentially to other problem cases or following different evolutionary strategies. For this reason, it is encapsulated in the Evolution entity, which is used to derive particular subclasses. These subclasses are implemented to reflect the following evolutionary approaches to the density classification task:

Standard genetic algorithm: The setup consists of one population of standard CA rules, which are rated according to their classification performance over a set of initial configurations. These configurations do not evolve themselves, but they are generated randomly for each rule Generation according to a uniform density distribution over [0, 1]. No resource sharing is included in the FitnessFunction. Note that this approach is essentially identical to the one followed by Mitchell in [7,14,15], which only produced suboptimal solutions so far. The class containing this evolutionary strategy is the GenericCaEvolution class. In practice, the system employs a special type of Generation, the GenericIcEvolutionGeneration, for the initial configurations, so that the main parts of the coevolution framework can be reused (see below).

Coevolution of standard CA rules and initial configurations: The setup consists of one population of CA rules and one population of initial configurations, which are coevolved. Resource sharing is present among both populations. Note that this approach is identical to the approach pursued by Juille and Pollack in [12]. The class representing this strategy is the GenericCaCoEvolution class.

Coevolution of GBCA rules, initial configurations and graph coupling structures: This is the approach presented in this work: coevolution of rules, graphs and configurations, including resource sharing for each population. The class GbcaCoEvolution extends Evolution to incorporate this strategy.

Figure 3.4: Evolution and GenerationManager: The CAClient serves as a host for the main entity, the Evolution, which can be any of the following three: single population evolution of standard CA rules included in GenericCaEvolution, two population coevolution of standard CA rules and initial configurations represented by GenericCaCoEvolution, and three population coevolution of GBCA rules, their coupling structures and initial configurations, implemented in GbcaCoEvolution. Each Evolution entity possesses a corresponding GenerationManager.

As the number and type of the participating populations differ from one variant to another, the GenerationManager acts as an abstract base class. Subclasses, providing methods to work on the set of populations as described above, are derived for each of the evolution variants. Following this approach, the subclasses of Evolution are designed to act as a container for the corresponding GenerationManager, such that the overall concept is the one depicted in Fig. 3.4. See Tab. 3.1 for the changes necessary to customise the coevolutionary system for each of the applications.

Because the standard genetic algorithm approach does not provide an evolution of the initial configurations, a stub Generation had to be introduced, whose only purpose is to implement the advance method according to the procedure to generate the random sampling of configurations over the density interval [0, 1]. Furthermore, a FitnessFunction for the standard CA rules was implemented that did not depend on any weight of the initial configurations and did not include the informational value factor.

Table 3.1: Adaptations for different evolutionary strategies

Strategy               Base Class          Derived Class
GBCA coevolution       Evolution           GbcaCoEvolution
                       GenerationManager   GbcaCoEvolutionGenerationManager
CA coevolution         Evolution           GenericCaCoEvolution
                       GenerationManager   GenericCaCoEvolutionGenerationManager
CA genetic algorithm   Evolution           GenericCaEvolution
                       GenerationManager   GenericCaEvolutionGenerationManager
                       FitnessFunction     GenericCaEvolutionFitnessFunction
                       CaInputGeneration   GenericIcEvolutionGeneration

3.2.3 The Logging System

As already noted above, there is a Logger entity for each species, which provides methods to produce specific logging data depending on the population type. The logging information includes the following:

GBCA rules, graphs and configurations:

• The actual fitness used for selection, which is calculated as described above. Furthermore, for the rules and graphs, the same logfile includes the relative and average number of correctly classified initial configurations. For the configurations, the relative and average number of rules that failed to classify the particular individual correctly is included. These values are given for the best Individual (in terms of fitness), for the elite group on average, for the whole population on average, and for the worst Individual of each Generation. This data was used to generate Fig. 4.1, for example (see the next chapter for details).

• The complete population in a serialised form, such that any individual evolved during the coevolution run can be recovered for a later evaluation (for the precise file format see the appendix). This data is written once at the end of a Generation cycle and is used during the evaluation phase at a later stage to restore single individuals and whole Generations (see the following chapter).

Additionally for GBCA rules and initial configurations:

• A distribution of the performance values over the range of possible densities of the individuals. Note that, as in the computation of the informational value, the individuals are grouped into density bins of size 2. See Def. 2.6 for the context within the computation of the informational value. For example, assume an automaton with 149 cells. The initial configuration with density 0, consisting only of “0”s, then falls into the same density bin as any of the 149 configurations with density 1/149, the class of configurations with a single cell residing in the “1”-state. The performance, which means the fraction of defeated individuals from the opposing population, is computed for each rule and configuration and averaged over all individuals within the same density bin. Fig. 4.2 is based on this data. Refer to the next chapter for details.

Additionally for GBCA rules and graphs:

• At given intervals, the best Individual of each Generation is written out to a separate file. This contains the GBCA rule or graph itself, its fitness values and the corresponding entries of the covered matrix for the GBCA rules. In a different file, the performance of this rule depending on the density of the initial configurations is written using the same partitioning of the density range described above.

Figure 3.5: GBCA computation and coevolutionary system integration; note that the MessageChannels between CAHost and CAInterface, CAServer and CAInterface and CAInterface and GBCA are omitted for the sake of clarity.
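The density binning used for the performance distribution can be illustrated with a short Python sketch. The function names and the choice to align the bins at density 0 are assumptions; the latter is suggested by the example with 149 cells above, where the configurations with zero and with one cell in the “1”-state share a bin.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def density_bin(ones: int, bin_size: int = 2) -> int:
    """Index of the density bin for an individual with `ones` set bits."""
    return ones // bin_size


def mean_performance_per_bin(individuals: Sequence[Sequence[int]],
                             performances: Sequence[float],
                             bin_size: int = 2) -> Dict[int, float]:
    """Average the performance over all individuals in the same density bin."""
    sums: Dict[int, float] = defaultdict(float)
    counts: Dict[int, int] = defaultdict(int)
    for cells, perf in zip(individuals, performances):
        b = density_bin(sum(cells), bin_size)
        sums[b] += perf
        counts[b] += 1
    return {b: sums[b] / counts[b] for b in sums}


N = 149
config_zero: List[int] = [0] * N      # density 0
config_single: List[int] = [0] * N    # density 1/149
config_single[42] = 1                 # position of the single "1" is arbitrary

per_bin = mean_performance_per_bin([config_zero, config_single], [0.2, 0.4])
print(density_bin(sum(config_zero)), density_bin(sum(config_single)), per_bin)
```

With bins of size 2, the 150 possible numbers of set cells (0 to 149) are mapped to 75 bins under this assumed alignment.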

3.2.4 Composed System

The overall system architecture for the main application considered in this work, the coevolution of GBCA rules, coupling structures, and initial configurations, is depicted in Fig. 3.5. Note that the setup shown in the figure reflects the setup visualised in Fig. 3.2c restricted to a single CAServer.



3.2.5 Example System Run

In the following, a stepwise overview of a typical coevolution run is given, including the main participating entities. Refer to Fig. 3.6 for a visualisation of the interactions between the different entities. However, only the entities involved in evaluation at the top-most level are included in the figure. The particular Generations, Loggers and CAInterfaces are omitted for the sake of clarity.

I) Initialisation: After the system startup, the CAClient reads in the configuration about the type of evolutionary run to perform. The configuration of the network system specifies whether a local CA computation is required or whether remote computation resources are to be used. As already noted above, the chosen mode of operation is transparent to the coevolutionary system. After the CAClient has created the CAHost and the Evolution objects, it initialises the Evolution entity and passes the CAHost to it. The Evolution object then starts the GenerationManager, forwards the CAHost reference to it and adds the appropriate Generation and Logger entities to the GenerationManager. The initialisation phase ends when the CAClient calls the run method on the Evolution entity to start the evolutionary process.

II) Evolution cycle: The evolution cycle can be roughly divided into two parts:

a) Evaluation stage: At the beginning of the evolution cycle, the Evolution entity calls the evaluateFitness method on the GenerationManager, which transfers each Generation it is responsible for to the CAHost for evaluation. This is done by transforming the Generation to its particular data set type, as already mentioned above. The CAHost splits the data into pieces and distributes it over the set of available CAInterfaces. Depending on the network configuration, the internal setup of the CAHost will resemble one of those depicted in Fig. 3.2. After all CAInterfaces have finished passing back the resulting data, the CAHost is able to compose the complete GraphOutputs data set, which contains the end state of each GBCA rule using each of the graph coupling structures and initial configurations, and passes it back to the GenerationManager. The GenerationManager then calls the following methods on each Generation (in this order): evaluateOutput, adjustWeight and evaluateFitness. See above for a description of these methods. The evaluateFitness method also sorts the individuals according to decreasing fitness values. Finally, the GenerationManager requests the Logger entities assigned to each Generation to perform the required logging operations, such as writing out fitness and performance values. After that, the evaluateFitness method of the GenerationManager returns and the fitness evaluation has finished.


b) Modification stage: At the beginning of the modification stage, the Evolution entity calls the select method on the GenerationManager, which propagates the select method call to all Generations within its scope. Subsequently, every Generation chooses the elements of its Matingpool depending on their fitness values. The following spawnNextGenerations call, originating from the Evolution object, translates into advance method calls on each Generation, requesting them to migrate to the next Generation. At the end of the spawnNextGenerations method call, the Logger entities are requested to save the current evolution status of their assigned population for a later recovery of any individual of the current Generation. After the spawnNextGenerations method returns, the current evolution cycle has finished.

III) Termination: At the end of the evolutionary process, the Evolution entity shuts down the GenerationManager to reach a clean system stop. The GenerationManager propagates the call to the Loggers of each population, so that they can close their open files and save the status of their Generation. It also calls the shutdownAll method of the CAHost entity, which shuts down all of the CAInterfaces. In the case of a remote system setup, the shutdown procedure of the CAInterface also includes the tear-down of the network connection. After the shutdown call returns to the Evolution entity, the coevolutionary process has ended.
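The control flow of the three phases can be summarised in a minimal Python sketch. It only mirrors the call order described above; the class layout, the snake-case method names, the `cycles` parameter and the `RecordingManager` stand-in are assumptions for illustration and do not reproduce the actual implementation.

```python
from abc import ABC, abstractmethod
from typing import List


class GenerationManager(ABC):
    """Abstract base; concrete subclasses know the participating populations."""

    @abstractmethod
    def evaluate_fitness(self) -> None: ...

    @abstractmethod
    def select(self) -> None: ...

    @abstractmethod
    def spawn_next_generations(self) -> None: ...

    @abstractmethod
    def shutdown(self) -> None: ...


class Evolution:
    """Container for a GenerationManager that drives the evolution cycles."""

    def __init__(self, manager: GenerationManager, cycles: int) -> None:
        self.manager = manager
        self.cycles = cycles

    def run(self) -> None:
        try:
            for _ in range(self.cycles):
                self.manager.evaluate_fitness()        # a) evaluation stage
                self.manager.select()                  # b) modification stage
                self.manager.spawn_next_generations()
        finally:
            self.manager.shutdown()                    # III) termination


class RecordingManager(GenerationManager):
    """Hypothetical stand-in that only records the order of the calls."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def evaluate_fitness(self) -> None:
        self.calls.append("evaluateFitness")

    def select(self) -> None:
        self.calls.append("select")

    def spawn_next_generations(self) -> None:
        self.calls.append("spawnNextGenerations")

    def shutdown(self) -> None:
        self.calls.append("shutdown")


manager = RecordingManager()
Evolution(manager, cycles=2).run()
print(manager.calls)
```

Placing the shutdown call in a `finally` block is a design choice of this sketch, so that the Loggers and the CAHost would be closed even if a cycle fails.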



Figure 3.6: Exemplary coevolution run (sequence diagram with the participants :CAClient, :CAHost, :Evolution and :GenerationManager, divided into the phases Initialisation, Evolution Cycle 1, Evolution Cycle 2 and Termination)


4 Coevolution Run and Evaluation of Rules and Graphs

This chapter presents data that was obtained during the coevolution run for GBCA rules, graphs and initial configurations. The main characteristics of all three populations and their development over generations are examined. Additionally, the characteristics of the evolved graphs are analysed, addressing the question whether they show small-world properties.

Subsequently, an evaluation based on the evolved individuals of the rule and graph populations is performed to measure their performance for a large number of independently generated initial configurations. The outcome of the evaluation is presented along with the best GBCA rule and graph individual found during this process, in the following called Gbca rule and Gbca graph respectively.

During the second stage of the evaluation, the influence of the coupling structure on the performance of Gbca rule is compared to its influence on the performance of the majority rule by considering different underlying graph models and graph model parameter ranges. The significance value is introduced as a further graph characteristic, which is used to determine the differences between the behaviour of Gbca rule and the behaviour of the majority rule for varying graph model parameters.

Finally, the performance of Gbca rule and Gbca graph under the influence of noise is examined and related to the performance of the majority rule and the GKL rule under the same conditions.

The last section contains a summary of the rule and graph properties based on the observations made during the evaluation process.

4.1 Analysis of System Runs

Note that in this context of competitive coevolutionary optimisation, there are several characteristics to measure the capability of an individual:

Fitness: This value is computed during the actual coevolution run and depends on the capability of the individual, the capability of its fellow individuals in the same generation and the capability of individuals from other generations that belong to different species (see Section 3.1 for details).

Performance: Refers to the average fraction of defeated individuals of opposing generations. In particular: For the GBCA rules, this is the fraction of correctly classified initial configurations averaged over all graphs. The performance of the configuration individuals is computed by determining the fraction of rules that fail to classify this configuration in the current generation, which is averaged over all graphs. For the graph individuals, this value is calculated by computing the fraction of initial configurations that have been classified correctly using the particular graph on average over all rules.

Evaluation performance: This term is used for the performance computed independently of the coevolutionary run, usually for a large set of 10^5 to 10^7 initial configurations. The procedure to generate these configurations is described in Section 4.2. In particular, the fraction of correctly classified configurations forms the evaluation performance for both rules and graphs.

Table 4.1: System run parameters of coevolution runs comprising 10000 generations

                        Run A    Run B
GBCA configuration
  Cells (N)             149      149
  Neighbours (k)        6        6
  Run length            298      298
Rules
  Population Size       200      250
  Mutation Rate         0.005    0.005
  Generation Gap        0.6      0.6
Configurations
  Population Size       300      400
  Generation Gap        0.2      0.2
Graphs
  Population Size       20       30
  Mutation Rate         0.008    0.008
  Generation Gap        0.4      0.4
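The three performance measures defined above can be written down compactly if the outcome of a generation is viewed as a three-dimensional table. The following Python sketch assumes a hypothetical nested list `correct[r][g][c]` that is True when rule r, run on graph g, classifies configuration c correctly; this layout and the function names are illustrative only.

```python
from typing import List

# correct[r][g][c]: rule r on graph g classifies configuration c correctly.
Outcomes = List[List[List[bool]]]


def rule_performance(correct: Outcomes, r: int) -> float:
    """Fraction of correctly classified configurations, averaged over all graphs."""
    cells = [ok for per_graph in correct[r] for ok in per_graph]
    return sum(cells) / len(cells)


def graph_performance(correct: Outcomes, g: int) -> float:
    """Fraction of correctly classified configurations, averaged over all rules."""
    cells = [ok for per_rule in correct for ok in per_rule[g]]
    return sum(cells) / len(cells)


def configuration_performance(correct: Outcomes, c: int) -> float:
    """Fraction of rules that fail on configuration c, averaged over all graphs."""
    cells = [not per_graph[c] for per_rule in correct for per_graph in per_rule]
    return sum(cells) / len(cells)


# Toy generation: two rules, one graph, two configurations.
correct = [
    [[True, False]],   # rule 0 fails on configuration 1
    [[True, True]],    # rule 1 classifies both correctly
]
print(rule_performance(correct, 0), graph_performance(correct, 0),
      configuration_performance(correct, 1))
```

Because every rule-graph pair sees the same number of configurations, averaging the flattened entries is equivalent to averaging the per-graph (or per-rule) fractions.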

Two runs under varying conditions have been performed. The outcomes of the particular runs, however, only differ in the quality of the evolved rules and graphs. Tab. 4.1 summarises the parameter values for both runs. Unless noted otherwise, the observations concerning the coevolution run reported in this chapter refer to Run B. Note that the execution of both runs, followed by the evaluation of the collected data, accounted for several weeks of computing time.

4.1.1 Fitness and Performance

Contrary to a typical genetic algorithm approach, the fitness value of an individual gives only little information about the capability of the individual, as its fitness depends directly on the individuals of the cooperating and competing species in the current generation. Nevertheless, the development of fitness for all three populations is given for the sake of completeness, as it also contains abrupt changes shortly after system start. One can see in Fig. 4.1 that the populations of rules and configurations experience dramatic changes in their fitness values within the first ten generations.

The fitness of the best rule individual of each generation temporarily reaches a value above 40 but then levels off close to 1.5. This can be explained by the discovery of a new GBCA rule “strategy”. One example of such a sudden evolutionary progression is the change from “default strategies”, always targeting the same end state, to “block expanding strategies” (refer to [14] for the details). Because the fitness of the configurations directly depends on the capability of the rules, it is affected immediately, though less strongly. Similarly, the performance values also change for all three species, including the graphs, within the first 10 generations. This is due to the fact that the graphs also benefit from the improvement of the GBCA rules.

Figure 4.1: Development of fitness and performance values within each population over the first 60 generations: (a) development of fitness for GBCA rules, (b) development of fitness for initial configurations, (c) development of fitness for graphs. The value for the individual with the highest fitness is plotted beside the average value of the population, except for the configurations: because of the significant deviation between the fitness of the best individuals of consecutive generations, the average fitness of the elite group is used instead.

After the first 20 generations, no considerable changes can be observed, neither for the fitness values nor for the performance values. However, there is a constant slight decrease of the fitness values within the graph population from 0.03 in generation 20 to 0.02 in generation 5000, where it levels off.

4.1.2 Evolutionary Changes in Rule and Configuration Populations

In the following, the development of the density of rules and configurations over the coevolution run is examined. Fig. 4.2 shows this development and also includes the average performance of rules and configurations depending on their density.

Both populations, rules and configurations, exhibit similar behaviour in terms of the density development of their individuals. As pointed out before, for standard CA rules, an increase in rule density facilitates the correct classification of configurations with a higher density than the critical density ρcrit = 0.5, and vice versa (see also p. 14). Naturally, the same holds for GBCA rules. Therefore, the rule and configuration populations try to defeat each other by repeatedly crossing ρcrit until they stabilise around ρcrit after the first 20 generations.

Figure 4.2: Changes in generation performance depending on density for GBCA rules and initial configurations: (a) early stage: density and performance distribution of initial configurations, (b) early stage: density and performance distribution of rules, (c) density and performance distribution of initial configurations, (d) density and performance distribution of rules. Greyscale values correspond to the average performance of individuals with the particular density. The number of individuals is given on the z-axis; the other two axes show density and generation.

According to Fig. 4.2, the highest performance on average can be found in the centre around density ρcrit for both populations. The configurations within this domain are the most difficult configurations to classify (a single bit modification can decide between a correct and an incorrect classification). However, instead of concentrating on this high performance range, the configurations approach it, but individuals of density ρcrit ± 1/149 never take over the population. There is a gap in the population for the density range enclosing ρcrit, which can already be recognised in 4.1.2. This is due to the fact that the high fitness domain does not match the high performance domain lying very close to ρcrit, which is a positive effect of the informational value factor within the fitness computation. Recall that the purpose of the informational value is to prevent configurations that force rules to guess their density from achieving high fitness values.

The behaviour described above is essentially the same behaviour reported by Juille and Pollack in [12], but here the stabilisation process seems to happen much earlier.¹ This might indicate that the space of GBCA rules using irregular coupling structures is more “solutions rich” compared to the space of standard CA rules using lattice shaped structures for the density classification task, as already assumed by Tomassini in [21].

4.1.3 Evolutionary Changes in Graph Population

The changes in the graph population can be characterised by the properties already used for small-world graphs in Section 2.3: clustering coefficient γ and characteristic path length L. However, both are already statistically aggregated values (the first the average over all γv, the second the median over the averaged length of shortest paths, compare to Def. 2.10 and Def. 2.13), so it is more instructive to observe the distribution of their base values within the graph population.

Fig. 4.3 compares the distribution of γv and the distribution of the length of shortest paths within an early graph generation with their particular distribution in the final generation.

Recall that the individuals of the initial graph population are constructed by forming a regular directed lattice graph with the same defining properties, which are the number of vertices N and the outgoing degree k. Therefore, all graph individuals start with values of γlattice = 0.6 and Llattice ≈ 12. In the following generations, the end points of edges are relocated, thus changing clustering coefficients and the length of paths between selected vertices.
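The starting values of the lattice can be checked with a short Python sketch. Since Def. 2.10 and Def. 2.13 are not reproduced here, the definitions used below are assumptions: γv is taken as the number of directed edges among the out-neighbours of v divided by k(k − 1), and the characteristic path length as the median over the vertices of the mean shortest-path length to all other vertices. The function names are hypothetical.

```python
from collections import deque
from statistics import mean, median
from typing import List

Graph = List[List[int]]  # row v lists the targets of the outgoing edges of v


def ring_lattice(n: int, k: int) -> Graph:
    """Regular cyclic lattice: each vertex points to k/2 neighbours per side."""
    half = k // 2
    offsets = [d for d in range(-half, half + 1) if d != 0]
    return [[(v + d) % n for d in offsets] for v in range(n)]


def clustering(adj: Graph, v: int) -> float:
    """Assumed definition: directed edges among the neighbours of v / (k(k-1))."""
    nbrs = set(adj[v])
    links = sum(1 for u in nbrs for w in adj[u] if w in nbrs)
    k = len(nbrs)
    return links / (k * (k - 1))


def mean_path_length(adj: Graph, source: int) -> float:
    """Mean length of the shortest directed paths from source (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return mean(d for d in dist.values() if d > 0)


lattice = ring_lattice(149, 6)
gamma = mean(clustering(lattice, v) for v in range(149))
path_length = median(mean_path_length(lattice, v) for v in range(149))
print(gamma, path_length)
```

Under these assumptions the sketch yields γ = 0.6 and a characteristic path length between 12 and 13, which is in line with the approximate lattice values quoted above.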

As one can see from Fig. 4.3a, the values for γv have already decreased significantly in generation 1000 and are then peaked around two values, γv = 0.2 and γv′ = 0.5. The length of the shortest paths as depicted in Fig. 4.3b has also decreased drastically and is peaked around 4. This process continues in the following generations, so that the situation in the final generation is described by Fig. 4.3c and Fig. 4.3d. The majority of γv values now lies between 0.1 and 0.2, which is significantly lower than in generation 1000, but still well above the expected clustering coefficient for random graphs, γrand ≈ 6/149 ≈ 0.04.² Only very few vertices tend to have a lower clustering coefficient than 0.1 in any graph, but still a relatively large number of them have a γv value above 0.2. The length of shortest paths within graphs of generation 10000 is almost perfectly balanced around 3, which is close to Lrand ≈ 3.3.³

From these observations, one can conclude that evolution favours graphs with shorter paths, which are still connected in a non-random fashion. In this sense, they possess the characteristic path length of a random graph while showing a higher clustering coefficient by an order of magnitude, therefore having the defining properties of small-world graphs. Any deeper considerations are postponed to the later sections.

Figure 4.3: Development of the distribution of clustering coefficients γv and the length of shortest paths: (a) early stage: distribution of γv within graph generation 1000, (b) early stage: distribution of the length of shortest paths p between any pair of vertices within graph generation 1000, (c) distribution of γv within graph generation 10000, (d) distribution of the length of shortest paths p between any pair of vertices within graph generation 10000. The frequency is plotted for each individual of the generation.

¹ After about 20 generations, compared to 1300 generations in the exemplary coevolution run given in [12].

4.2 Evaluation of Evolved Rules and Graphs

In this section, the term performance refers to evaluation performance unless noted otherwise (see p. 56).

4.2.1 Evaluation Performance and Graph Properties

In the following, an evaluation run is performed on the GBCA rule and graph populations. After the coevolution run, it is not necessarily obvious which rule and graph combinations yield the best performance values, mainly for two reasons.

First, the configurations used for overall performance evaluation to compare different CA rules are typically generated bitwise randomly, which means that the probability for each position in the initial configuration (which corresponds to the start state of a single cell within the automaton) to take either one of the values “0” or “1” is equal.⁴ In contrast, the performance that is computed during the coevolution run (as shown in Fig. 4.2) is based on the initial configurations present in that particular configuration generation.
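The bitwise random generation of initial configurations can be sketched in a few lines of Python. The function names, the seed and the sample size are arbitrary choices for illustration.

```python
import random
from typing import List


def random_configuration(n_cells: int, rng: random.Random) -> List[int]:
    """Each cell independently takes the value 0 or 1 with equal probability."""
    return [rng.getrandbits(1) for _ in range(n_cells)]


def density(config: List[int]) -> float:
    """Fraction of cells in the "1"-state."""
    return sum(config) / len(config)


rng = random.Random(2005)  # fixed seed only to make the sketch repeatable
sample = [random_configuration(149, rng) for _ in range(1000)]
mean_density = sum(density(c) for c in sample) / len(sample)
print(round(mean_density, 3))
```

Because the number of cells in the “1”-state follows a binomial distribution, the densities of such a sample concentrate around ρcrit = 0.5, so this test set consists mostly of configurations that are hard to classify.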

Second, during coevolution all GBCA rule and graph individuals are rated separately using the set of possible combinations between them and combining the outcome into a single fitness value. For this reason, it is necessary to find the optimal matching of rules and graphs.

² A directed random graph with regular outgoing vertex degree in which the target vertex for each edge is chosen uniformly over all vertices, corresponding to [22], p. 112.

³ Lrand was determined based on experiments for this case but can also be expressed by upper bounds for the graph diameter, see [22].

⁴ The density of configurations generated by this procedure is distributed binomially and peaked around ρcrit = 0.5.

All possible rule and graph pairs of each generation, starting from generation 1000, are re-evaluated using 500 bitwise randomly generated initial configurations. Then, the rules and graphs are ranked according to their performance over this set of configurations, and the 10 best individuals of each population are selected for a second evaluation. The second stage includes 30000 configurations, so that the result can be used as a better estimate for the evaluation performance. Fig. 4.4a depicts the performance development of the best GBCA rule-graph couple of each generation within the second stage, as well as the fraction of shortcuts φ and the average clustering coefficient γ of the best graph.
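The two-stage procedure described above can be outlined as follows. The sketch is not the evaluation code of this work: the `classify` callback is a hypothetical stand-in for running a rule on a graph, individuals are addressed by index, and ranking an individual by its best pairing in the first stage is an assumption, since the exact ranking criterion is not specified here. The toy run uses much smaller sample sizes than the 500 and 30000 configurations of the actual procedure.

```python
import random
from typing import Callable, List, Tuple

Config = List[int]
# (rule index, graph index, configuration) -> classified correctly?
Classifier = Callable[[int, int, Config], bool]


def random_configs(count: int, n_cells: int, rng: random.Random) -> List[Config]:
    return [[rng.getrandbits(1) for _ in range(n_cells)] for _ in range(count)]


def pair_performance(classify: Classifier, r: int, g: int, configs: List[Config]) -> float:
    return sum(classify(r, g, c) for c in configs) / len(configs)


def two_stage(classify: Classifier, n_rules: int, n_graphs: int, n_cells: int,
              rng: random.Random, first: int = 500, second: int = 30000,
              keep: int = 10) -> Tuple[Tuple[int, int], float]:
    # Stage 1: coarse evaluation of every rule-graph pair.
    stage1 = random_configs(first, n_cells, rng)
    score = {(r, g): pair_performance(classify, r, g, stage1)
             for r in range(n_rules) for g in range(n_graphs)}
    # Assumption: an individual is ranked by its best pairing in stage 1.
    best_rules = sorted(range(n_rules),
                        key=lambda r: -max(score[r, g] for g in range(n_graphs)))[:keep]
    best_graphs = sorted(range(n_graphs),
                         key=lambda g: -max(score[r, g] for r in range(n_rules)))[:keep]
    # Stage 2: finer evaluation of the selected individuals only.
    stage2 = random_configs(second, n_cells, rng)
    final = {(r, g): pair_performance(classify, r, g, stage2)
             for r in best_rules for g in best_graphs}
    return max(final.items(), key=lambda item: item[1])


def toy_classify(r: int, g: int, config: Config) -> bool:
    # Toy stand-in: only the pair (rule 3, graph 1) is always correct.
    return (r, g) == (3, 1) or (config[0] == 1 and config[1] == 1)


best = two_stage(toy_classify, n_rules=5, n_graphs=3, n_cells=9,
                 rng=random.Random(1), first=20, second=50, keep=2)
print(best)
```

The coarse first stage keeps the cost of evaluating all pairs manageable, while the larger second sample reduces the variance of the estimate for the few remaining candidates.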

Already at the starting point of the evaluation process, in generation 1000, rule and graph pairs with a performance higher than 0.8 are found. However, the variation in the evaluation performance over the subsequent generations is relatively high. It is worth mentioning the steady increase of the φ value of the best graph until generation 3000, when it finally stabilises between 0.45 and 0.55. γ, on the other hand, steadily decreases until it reaches 0.2 in generation 3000 and finally levels off around 0.15, compared to γrand ≈ 0.04 for the random graph.

Fig. 4.4b and Fig. 4.4c show the development of the distribution of γv and the length of the shortest paths within the best graph individual. As already assumed on the basis of the previous observations, the distribution of γv shifts towards lower values. This happens in several discrete stages rather than continuously. Such shifts can be identified around generations 3000 and 7000, which indicates relatively fast changes within the population of graphs closely before and after these generations. These changes are possibly caused by adaptation to modifications within the rule population or by evolutionary progress in the graph population. The distribution of the length of shortest paths changes much more smoothly and finally resembles the distribution present in the complete final generation, as shown in Fig. 4.3d.

Consequently, these observations are consistent with the development of clustering coefficients and the length of shortest paths in the generations observed before: the development towards graphs with low characteristic path length, caused by the introduction of shortcut edges, while partially retaining the structured neighbourhood derived from the originating lattice graph.

In the following, the rule-graph pair with the highest performance found during the evaluation process is compared to other rules and graph structures performing the density classification task. For this purpose, an additional graph model is introduced, the ring-based small-world graph (RingSWGraph) model.

4.2.2 Ring-Based Small-World Graph Model

The purpose of the RingSWGraph-model is to compare the influence of randomisedgraph construction algorithms, which are based on the application of a random rewiringprocedure, to a strictly deterministic construction algorithm. This algorithm is requiredto produce graphs that show the same or similar defining properties of small-worldgraphs as the graphs constructed by the heuristic models. Because the main propertyused to compare graphs from the small-world domain is their fraction of shortcuts φ(see [21, 22] for example), the resulting graphs are required to lie within a close range

[Figure 4.4, plots omitted. (a) Evaluation performance of the best rule-graph matching and the φ and γ characteristics of the best graph, plotted over generations 1000 to 10000. (b) Distribution of γv for the best graph individual of each generation (axes: γv, generation, frequency). (c) Distribution of the length of shortest paths p between any pair of vertices within the best graph of each generation (axes: p, generation, frequency).]

Figure 4.4: Evaluation process of coevolution run


[Figure 4.5, drawings omitted. (a) Start state (lattice graph). (b) First ring completed. (c) Final state.]

Figure 4.5: Construction of RingSWGraph with 17 vertices and outgoing degree 4

of the given φ parameter.

The RingSWGraph construction algorithm starts from a regular cyclic lattice graph (corresponding to the coupling structure of a standard CA with periodic boundary conditions) and introduces distinctive shortcuts. These shortcuts are formed by rewired edges, similar to the randomised β and φ-model. As a shortcut is an edge with a range higher than two, it has to be ensured that the length of the shortest path between the origin of the edge to be rewired and the new designated target vertex is at least three. For this purpose, the algorithm creates interleaving rings within the lattice graph, such that the vertices of a particular ring are interconnected by shortcut edges in both directions. See Alg. 4 for a pseudo-code version of the algorithm and Fig. 4.5 for a visualisation of the concept.

The RingSWGraph algorithm uses a two-dimensional matrix representation of the graph, which is identical to the representation of the graph individuals during the coevolution run: each row corresponds to the list of vertices that are adjacent to the vertex with the particular row index. The graph is also considered directed, although for each edge there always exists an edge in the opposite direction.

The rewiring of edges has to follow a certain order and orientation to ensure that shortcuts introduced at an early stage do not lose their shortcut status when new edges are rewired. For this reason, the outer edges to the left and right of a vertex (which correspond to the first and last index within its adjacency array, respectively) are replaced by edges to more “distant” vertices, whereas the next inner edges are rewired to “closer” vertices (in terms of the length of shortest paths).

The connectivity is retained by keeping the innermost edges unmodified, which finally become shortcuts themselves for k ≤ 6, thereby enabling the algorithm to cover a wide range of the parameter φ.



Algorithm 4: RingSWGraph construction algorithm

Data:
    N: number of vertices
    k: outgoing degree
    φmin: lower bound for the fraction of shortcuts within the graph to be constructed (compare to Def. 2.16)

Result:
    A: N × k matrix, such that a_{u,i} = v ⇔ the i-th edge going out from vertex u is connected to vertex v, and φ(G(A)) ≥ φmin

begin
    rings ← k + 1; elements ← floor(N / (k + 1)); ring_index ← 0;
    usable_vertices ← elements ∗ rings; ring_radius ← k / 2;

    A ← N × k matrix, such that a_{u,i} = v ⇔ the i-th regular edge going out from vertex u is connected to vertex v in LatticeGraph(N, k);

    if elements < 2 then
        print(“Unable to set up ring!”);
        return
    end
    else
        /* Rewire edges to form rings as long as φ value of current graph is less than φmin */
        for ring_index < rings and getPhi(A) < φmin do
            element_index ← 0;
            for element_index < elements and getPhi(A) < φmin do
                current_index ← ring_index + (element_index ∗ rings);
                A[current_index][0] ← (current_index − (ring_radius ∗ rings) + usable_vertices) mod usable_vertices;
                A[current_index][1] ← (current_index − rings + usable_vertices) mod usable_vertices;
                A[current_index][k − 2] ← (current_index + rings) mod usable_vertices;
                A[current_index][k − 1] ← (current_index + (ring_radius ∗ rings)) mod usable_vertices;
                element_index ← element_index + 1;
            end
            ring_index ← ring_index + 1;
        end
    end
    return A;
end
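The construction can also be sketched in executable form. The following Python version is only an illustration of Alg. 4 under stated assumptions, not the implementation used for the experiments: the helper names `lattice_graph`, `get_phi` and `ring_sw_graph` are hypothetical, the row order of the lattice (left neighbours first, farthest first) is assumed from the description of the adjacency array, and `get_phi` assumes that an edge counts as a shortcut when no parallel edge and no two-step detour connects its endpoints, which is one possible reading of Def. 2.16.

```python
def lattice_graph(n, k):
    """Cyclic lattice; row u lists left neighbours (farthest first), then right ones.

    The row order is an assumption made for this sketch.
    """
    half = k // 2
    offsets = list(range(-half, 0)) + list(range(1, half + 1))
    return [[(u + d) % n for d in offsets] for u in range(n)]


def get_phi(adj):
    """Fraction of directed edges with range > 2 (assumed reading of Def. 2.16)."""
    shortcuts = 0
    total = 0
    for row in adj:
        for i, v in enumerate(row):
            others = row[:i] + row[i + 1:]
            # A parallel edge or a detour u -> x -> v means the range is at most 2.
            short_detour = v in others or any(v in adj[x] for x in others)
            shortcuts += 0 if short_detour else 1
            total += 1
    return shortcuts / total


def ring_sw_graph(n, k, phi_min):
    """Rewire lattice edges ring by ring until get_phi reaches phi_min (cf. Alg. 4)."""
    rings = k + 1
    elements = n // rings
    usable = elements * rings
    radius = k // 2
    adj = lattice_graph(n, k)
    if elements < 2:
        # The pseudo-code prints a message and returns; raising is a choice of this sketch.
        raise ValueError("Unable to set up ring!")
    ring = 0
    while ring < rings and get_phi(adj) < phi_min:
        element = 0
        while element < elements and get_phi(adj) < phi_min:
            cur = ring + element * rings
            adj[cur][0] = (cur - radius * rings) % usable      # outer left edge
            adj[cur][1] = (cur - rings) % usable               # next inner left edge
            adj[cur][k - 2] = (cur + rings) % usable           # next inner right edge
            adj[cur][k - 1] = (cur + radius * rings) % usable  # outer right edge
            element += 1
        ring += 1
    return adj
```

For example, `ring_sw_graph(70, 6, 0.5)` starts from a lattice with 70 vertices and outgoing degree 6 and stops rewiring as soon as the assumed shortcut fraction reaches 0.5. Recomputing `get_phi` before every step mirrors the loop conditions of the pseudo-code and is affordable only for small graphs.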



4.2.3 Evaluation Performance Depending on Graph Models

During the evaluation run, a promising GBCA rule and graph pair was found among the individuals of generation 8793; they are called Gbca rule and Gbca graph in the following. Both are included in the appendix. The performance values of this pair, and of combinations of Gbca rule with each of the described graph models, are compared to the performance values of the majority rule. The comparison also includes standard CA rules, such as Coevolution2 and GKL. Tab. 4.2 summarises the outcome of 60 runs, each including a total of 10^5 initial configurations and one representative of the particular graph classes.

Note that the random graph model employed during the evaluation for Tab. 4.2 and in the following involves a uniform random selection of target vertices for outgoing edges, therefore retaining the condition of a uniform outgoing degree within the graph.
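A minimal sketch of such a random graph model is given below. The function name `random_out_regular_graph` is hypothetical, and the exclusion of self-loops and duplicate targets is an assumption of this sketch that the text does not state.

```python
import random


def random_out_regular_graph(n, k, seed=None):
    """Each vertex draws k distinct targets uniformly at random.

    Assumption of this sketch: no self-loops and no duplicate targets.
    """
    rng = random.Random(seed)
    graph = []
    for v in range(n):
        candidates = [u for u in range(n) if u != v]
        graph.append(rng.sample(candidates, k))
    return graph
```

Every row has exactly k entries, so the outgoing degree is uniform, whereas the incoming degree of a vertex varies.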

Apparently, the majority rule performs relatively poorly for all participating coupling structures, except the random graph and the φ and β-graphs for high parameter values, which essentially bring them close to the random graph limit. This result is supported by the observations of Moreira et al. in [17] for the majority rule without the influence of noise (see p. 30 for details). The majority rule is unable to gain even moderate performance using the RingSWGraph-model for any φ value, which indicates that the cells are not able to reach a consensus state for most configurations when using this model.

As expected, the Coevolution2 rule exhibits high performance for standard CA and performs significantly better than GKL and the majority rule using random graphs.

Gbca rule outperforms the majority rule on all graph structures, excluding random graphs. However, for moderate β and φ values, as will be seen later in this chapter, Gbca rule does not achieve a performance higher than 0.5 either, the value that would correspond to a “random guessing” strategy (e.g. for φ = 0.2, not shown in Tab. 4.2). Nevertheless, Gbca rule and Gbca graph combined outperform the best combination of the majority rule with any graph by more than one percent, which is significant in relation to the small performance improvements achieved before the discovery of Coevolution2 (compare to Tab. 2.2). It is worth mentioning that Gbca rule and Gbca graph also outperform Coevolution2.

Neither Gbca rule nor the majority rule performs very well using a scale-free based coupling structure. Moreira notes in [17] that in such a case it is not the majority of cells that determines the final state of the automaton, but the set of most connected cells within the automaton. The negative effect on both rules seems to be very similar for this kind of topology.

Recall that for this evaluation run, the particular member of each class of graphs has been fixed to enable the computation of performance deviation depending on the sets of initial configurations. In the following, the effect of graph changes for the same model parameters, such as β and φ, but different random seeds is considered. For this process, 60 elements of the β and φ-graph classes for each choice of β and φ, as well as random graphs, each of them initialised by a different random seed, are used to determine the



Rule            Graph                   Performance   Deviation (10^-4)

Majority rule   Gbca graph              0.6153        15.9143
                Random-graph            0.8319         9.6405
                Scale-free graph        0.6522        14.0314
                Lattice-graph           < 0.0001       0.053
                β-graph, β = 0.4        0.2294        13.5964
                β-graph, β = 0.5        0.7818        14.638
                β-graph, β = 0.6        0.8187        13.6393
                φ-graph, φ = 0.4        0.411         16.6933
                φ-graph, φ = 0.5        0.7207        15.0913
                φ-graph, φ = 0.6        0.8314        13.4642
                RingSWGraph, φ = 0.4    0.0007         0.9482
                RingSWGraph, φ = 0.5    0.0698         6.9564
                RingSWGraph, φ = 0.6    0.0982         9.2881

Gbca rule       Gbca graph              0.8715         9.3173
                Random-graph            0.8168        10.7684
                Scale-free graph        0.6657        16.7899
                Lattice-graph           < 0.0001       0.2554
                β-graph, β = 0.4        0.7524        14.8131
                β-graph, β = 0.5        0.8319        11.4675
                β-graph, β = 0.6        0.82          13.3925
                φ-graph, φ = 0.4        0.8205        12.4209
                φ-graph, φ = 0.5        0.8339        11.3601
                φ-graph, φ = 0.6        0.8432        10.3021
                RingSWGraph, φ = 0.4    0.6415        15.9740
                RingSWGraph, φ = 0.5    0.6706        17.7971
                RingSWGraph, φ = 0.6    0.701         12.5793

Coevolution2    (standard CA)           0.8593        10.2838
GKL             (standard CA)           0.8148        11.4707

Table 4.2: Performance comparison between Gbca rule and Gbca graph, majority rule, Coevolution2 and GKL, using different graph models: The table contains the average evaluation performance of a single rule-graph pair over 60 evaluation runs, each including a total of 10^5 initial configurations.



GBCA Rule       Graph               Performance   Deviation

Majority rule   β-graph, β = 0.4    0.4171        0.1166
                β-graph, β = 0.5    0.7398        0.0679
                β-graph, β = 0.6    0.8263        0.0206
                φ-graph, φ = 0.4    0.3964        0.0966
                φ-graph, φ = 0.5    0.7129        0.0962
                φ-graph, φ = 0.6    0.8334        0.0322
                Random-graph        0.8428        0.0067

Gbca rule       β-graph, β = 0.4    0.8055        0.0331
                β-graph, β = 0.5    0.8332        0.0066
                β-graph, β = 0.6    0.8328        0.0072
                φ-graph, φ = 0.4    0.8125        0.0417
                φ-graph, φ = 0.5    0.8423        0.0065
                φ-graph, φ = 0.6    0.8462        0.006
                Random-graph        0.8302        0.0068

Table 4.3: Performance comparison: The performance represents the average evaluation performance of 60 evaluation runs, each using a different β, φ or random graph. The performance for each run was computed individually over the same set of 10^5 initial configurations.

coupling structure for Gbca rule and majority rule. Tab. 4.3 summarises the results.

As one can see from the data, the performance deviation of the majority rule is higher than that of Gbca rule for any graph structure except random graphs, in some cases by an order of magnitude. This indicates that Gbca rule is less dependent than the majority rule on the existence or absence of certain features within the cell topology, such as consecutive cell blocks keeping minority states (see p. 32).

The overall situation stays the same: the majority rule is well suited for random graphs and correctly classifies about one percent more configurations than Gbca rule within this graph domain. For graphs that have not yet reached the random domain, Gbca rule generally performs better.

Note that the RingSWGraph-model is not taken into account during this second evaluation because its deterministic construction algorithm always produces the same graph for any fixed φ parameter value. The scale-free graph model has been disqualified by the consideration above.

4.2.4 Evaluation Performance Depending on φ

To compare the performance of Gbca rule to the performance of the majority rule for continuously varying coupling structures, a common basis has to be found to relate the different graphs. Here, the φ value is chosen for this task, as it is the generally accepted characteristic for describing the transformation from lattice type graphs

[Figure 4.6, plots omitted. (a) Majority rule. (b) Gbca rule. Both panels plot performance (0 to 0.9) against φ (0 to 0.8). Legend: Gbca graph, β-graph, φ-graph, RingSWGraph.]

Figure 4.6: Comparison of graph models: Dependency of rule performance on φ; the evaluation performance was computed for 10^4 initial configurations averaged over 30 realisations

over small-world graphs to random graphs.

Fig. 4.6 shows the performance development of both Gbca rule and majority rule depending on φ for different graph types. Besides the β, φ and RingSWGraph-model, the evolved graph Gbca graph is also included, which is transformed stepwise by random edge relocations. Three aspects are noteworthy:

1. The performance of the majority rule starts to increase at much higher values of φ than that of Gbca rule for all considered graph models. In fact, Gbca rule is able to achieve a performance higher than 0.5 already at very low values of φ, except for the RingSWGraph-model. Recall that this is the domain where all graphs are still very similar to lattice graphs and therefore highly structured, before random shortcuts are introduced to the graph. Its performance then decreases for β and φ-graphs until φ has reached a value of 0.1, at which point the performance reaches its minimum of about 0.3, before it starts to rise again. A possible explanation for this intermediate decrease is the existence of a trade-off between two competing properties of the graph: regular structure and its positive effect on the forwarding of cell state transitions, and the convergence of the cell neighbourhoods towards a truly random sampling of cell states within the automaton. This development is examined below.

2. It can be observed that the majority rule experiences a sudden increase in its performance after a certain number of shortcut edges have been introduced to the evolved Gbca graph. This happens close to a φ value of approximately 0.6, when the performance using the φ-graph model has already reached a much higher level. The sudden increase of the majority rule’s performance may indicate that there are structural properties of Gbca graph which have to be “destroyed” first by random edge-rewiring before the rule can benefit from the high number of shortcuts. The performance of the evolved Gbca rule using Gbca graph also increases slightly until φ ≈ 0.6 but then starts a slow decrease until it reaches its random graph limit.

3. Neither rule performs very well using the RingSWGraph-model. The performance of the majority rule, however, does not even exceed a value of 0.1, which demonstrates that it is unable to benefit from the regular introduction of shortcuts. In contrast, the performance of Gbca rule increases almost linearly with φ until it reaches a value of approximately 0.7. This observation effectively illustrates the different behaviour of the two rules.

In the following, a new characteristic value for graphs is introduced, partly motivated by Definition 3.1.7 and Definition 3.1.8 in [22].

4.2.5 Significance Value S

Definition 4.1. The significance value S(v) of a vertex v ∈ V gives information about the degree of “contraction” in terms of characteristic path length that is achieved in its neighbourhood. In particular, it is defined as

\[ S(v) := \frac{\operatorname{Median}\bigl(SP_{V \setminus v}(V(\Gamma(v)_{in}), V(\Gamma(v)_{out}))\bigr)}{\operatorname{Median}\bigl(SP_{V}(V(\Gamma(v)_{in}), V(\Gamma(v)_{out}))\bigr)} \]

where SP_W(W′, W″) is the set of shortest path lengths from the vertices of set W′ to the vertices of set W″ with intermediate vertices in the set W, for all W, W′, W″ ⊆ V.

Definition 4.2. The significance value S of a graph G(V, E) is defined as the average deviation (footnote 5) over the S(v), v ∈ V:

\[ S := \frac{\sum_{v \in V} \lvert S(v) - \bar{S} \rvert}{\lvert V \rvert} \]

where \bar{S} is the average significance value S(v) over all v ∈ V.

Remark: The average deviation does not attribute the weight to single strong deviations from the mean that would be attributed by the standard deviation. It is used in this context because a substantial deviation for some vertices is expected within any graph modified by random edge relocations.
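The two definitions can be turned into a short computation. The Python sketch below is an illustration under assumptions, not the code used for the evaluation: the function names are hypothetical, Γ(v)in and Γ(v)out are read as the sets of in- and out-neighbours of v, pairs with identical source and target are skipped, and an unreachable target (the “bridge” case, where S(v) is undefined) raises an error.

```python
from collections import deque
from statistics import median


def path_lengths(adj, sources, targets, forbidden=None):
    """Shortest directed path lengths from sources to targets, never visiting `forbidden`."""
    lengths = []
    for s in sources:
        dist = {s: 0}
        queue = deque([s])
        while queue:  # plain breadth-first search
            u = queue.popleft()
            for w in adj[u]:
                if w == forbidden or w in dist:
                    continue
                dist[w] = dist[u] + 1
                queue.append(w)
        for t in targets:
            if t == s:
                continue  # assumption: trivial pairs are skipped
            if t not in dist:
                raise ValueError("target unreachable, S(v) undefined")
            lengths.append(dist[t])
    return lengths


def vertex_significance(adj, v):
    """S(v): median detour length without v divided by median length with v."""
    out_nb = list(adj[v])
    in_nb = [u for u in range(len(adj)) if v in adj[u]]
    without_v = median(path_lengths(adj, in_nb, out_nb, forbidden=v))
    with_v = median(path_lengths(adj, in_nb, out_nb))
    return without_v / with_v


def graph_significance(adj):
    """S: average (mean) deviation of the S(v) values from their mean."""
    values = [vertex_significance(adj, v) for v in range(len(adj))]
    mean = sum(values) / len(values)
    return sum(abs(s - mean) for s in values) / len(values)
```

On a plain directed cycle with ten vertices and outgoing degree 2, removing a vertex lengthens the path between its two neighbours from 2 to 8, so every vertex has S(v) = 4 and the graph value S is 0, in line with the statement that completely regular graphs have no particularly significant vertices.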

According to Def. 4.1 and Def. 4.2, S captures the transformation between a relatively structured graph with several intermittent occurrences of shortcut edges (possibly concentrated on a few selected vertices) and a graph in which all edges have the same or a very similar range. For the former, the value of S would be maximal, as only few vertices possessing such shortcuts exist within the graph. These are important for their

Footnote 5: Also called mean deviation.

[Figure 4.7, scatter-plot omitted: significance value S (0 to 0.8) plotted against φ (0 to 0.8).]

Figure 4.7: Significance value S depending on φ for the different graph models; the significance value was averaged over 30 realisations of each graph type

neighbouring vertices, as they “contract” the shortest paths to all other vertices. In a graph in which all edges have a very similar range, no vertex among the neighbours of any vertex is particularly important in this sense because all vertices are expected to be separated by equally long paths. This is not only the case for random graphs, but also for completely regular lattice graphs. The intermediate values of S, however, depend on the structural properties inherent in the graph construction algorithm. Therefore, the significance value captures the structural difference between the particular graph models, as shown by the scatter-plot in Fig. 4.7.

Note that it is generally possible that a vertex v takes the position of a “bridge” between two otherwise separated sets of vertices, in which case S(v) would be undefined. However, this theoretical case is disregarded here.

The plot of S for β and φ-graphs steadily increases from φ = 0 until it reaches its maximum at φ ≈ 0.12. In contrast, the development of S for the RingSWGraph-model is determined by its fixed introduction of shortcuts, which results in a regular arrangement of vertices possessing several shortcut edges among others only connected by short range edges. This effect yields a continuous increase of S for this model, until it by far exceeds the S values of β and φ-graphs at a φ value of 0.5. The increase stops when the majority of edges within the graph have been turned into shortcuts, so that eventually the rise of the average significance value compensates for the influence of the introduction of new shortcuts on S.

The significance value of φ and β-graphs decreases after reaching the maximum value at φ ≈ 0.12, though β-graphs show a generally higher value. This can be explained by the different selection procedures of edges to be rewired in both models: for the β-graph, the events of the selection of two particular edges are independent of one another. Conversely, the φ-graph model selects exactly as many edges as required to reach the specified φ value, which introduces a dependency between the two events.
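The difference between the two selection procedures can be illustrated with a small sketch. It is a simplification under assumptions: the function names are hypothetical, only the selection step is shown (not the rewiring itself), and the φ-model is assumed here to pick a fixed number of round(φ·|E|) edges, which ignores that a rewired edge might not end up as a shortcut.

```python
import random


def select_edges_beta(edges, beta, rng):
    """β-model: every edge is selected independently with probability beta."""
    return [e for e in edges if rng.random() < beta]


def select_edges_phi(edges, phi, rng):
    """φ-model (simplified): a fixed number of edges is drawn without replacement."""
    return rng.sample(edges, round(phi * len(edges)))
```

With the β-model the number of selected edges fluctuates from graph to graph, whereas the simplified φ-model always selects the same number, so choosing one edge lowers the chance that another one is chosen.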

S for the modified evolved Gbca graph lies between S for the β and φ-graph models and also approaches zero when the graph turns into a random graph. Its

[Figure 4.8, scatter-plots omitted. (a) Majority rule. (b) Gbca rule. Both panels plot performance (0 to 0.9) against the significance value S (0 to 0.8).]

Figure 4.8: Performance of GBCA rules depending on graph significance value S; evaluation performance was computed for 10^4 initial configurations averaged over 30 realisations

initial value, however, that is, the value of the unmodified graph resulting from the coevolutionary process, lies close to 0.1.

Overall, the significance value S captures the structural changes occurring during the introduction of shortcut edges, which forms the basic operation of each of the considered graph construction algorithms. Its different development for the deterministic and randomised graph models is especially remarkable.

4.2.6 Evaluation Performance Depending on S

Fig. 4.8 depicts a scatter-plot of the performance development of both Gbca rule and the majority rule depending on S, as it was computed during the evaluation process described above.

At first glance, the plot of the performance depending on S resembles a pair of shears opening towards lower values of S. For the majority rule, this effect can be explained using the observations from Section 2.3.

The cells following the majority rule are able to benefit from the “quality” of the random sampling of cell states within their neighbourhood for random graphs. However, as the majority rule heavily relies on that concept and is also vulnerable to consecutive blocks of cells keeping minority states (see p. 30), it requires a relatively high φ value to accomplish the correct classification or even reach a valid end state.
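The vulnerability to minority blocks can be reproduced with a few lines of code. The sketch below is only an illustration: the function name `majority_step` is hypothetical, and it assumes that a cell counts its own state together with the states of its out-neighbours, which may differ in detail from the definition given in Section 2.3.

```python
def majority_step(states, adj):
    """One synchronous update: each cell adopts the majority state of itself and its neighbours.

    Assumption of this sketch: the cell's own state is included in the vote.
    """
    new_states = []
    for v, neighbours in enumerate(adj):
        ones = states[v] + sum(states[u] for u in neighbours)
        votes = len(neighbours) + 1
        new_states.append(1 if 2 * ones > votes else 0)
    return new_states


# Cyclic lattice with nine cells and outgoing degree 2.
ring = [[(v - 1) % 9, (v + 1) % 9] for v in range(9)]
# A block of three "1" cells is in the minority (density 1/3) but never dissolves.
blocked = [1, 1, 1, 0, 0, 0, 0, 0, 0]
```

Applying `majority_step(blocked, ring)` returns the same configuration again, so on the lattice the automaton is stuck in a state that is not a consensus state, although the global majority is clearly “0”.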

Besides the number of shortcuts, their spreading within the automaton also determines the performance of the majority rule: the more equally they are distributed, the more likely the local majority of neighbouring states is to resemble the global majority of all automaton cell states, provided that each cell resides in the same state as the majority of its neighbours. For this reason, either very low (approximately 0) or comparably high performance values can be found for low S values, which correspond either to rather random or to regularly structured graphs. For β and φ-graphs, the juncture between the two regions can be found close to a significance value of 0.3, which roughly corresponds to a φ range of [0.25, 0.3] depending on the graph model (see Fig. 4.7). This range of φ values matches the domain which separates the efficient from the non-efficient regime of the majority rule, in terms of classification performance, for all graph models except the RingSWGraph-model (compare to Fig. 4.6). The graphs generated by the RingSWGraph-model mostly lie beyond the “critical” S value of 0.3.

Furthermore, the performance of the majority rule decreases almost linearly in S for the β and φ-graph models until it reaches zero. This indicates that the definition of S successfully captures the transition from graphs in which shortcuts are sparsely distributed over the set of vertices to graphs containing a majority of shortcuts.

Gbca rule’s performance behaves rather differently, although the two domains in Fig. 4.8 can also be recognised for this rule. At very low values of S, approximately 0.05, Gbca rule achieves a performance slightly higher than 0.5 for the lower domain as well, except for the RingSWGraph-model. This is due to the fact that, in contrast to the majority rule, Gbca rule does not need a large fraction of shortcuts to reach a consensus state of all cells, that is, a valid end state, as long as the present shortcuts are approximately equally distributed (corresponding to a low S value). Comparing Fig. 4.9a and 4.9b with Fig. 2.9a and 2.9b shows that Gbca rule has acquired a different strategy to overcome stable minority states within the automaton: instead of relying on the dissemination of density information over randomly linked cells, it shifts blocks of local majority states through the automaton. Although Gbca rule is unable to reach a consensus state for the lattice coupling structure, just like the majority rule, only very few equally distributed shortcuts are necessary to cause collisions between these blocks and reach a valid end state. The number of steps after which a collision occurs thereby depends on the size of the block and the spreading of shortcut edges within the GBCA.

In contrast to the majority rule, Gbca rule is able to achieve a performance higher than 0.7 also for the RingSWGraph-model. Fig. 4.8 depicts an almost linear increase of the performance depending on S. This can be best understood by examining a space-time diagram of Gbca rule using a RingSWGraph coupling structure, as shown in Fig. 4.10.

As one can observe, the regular introduction of shortcuts does not prevent Gbca rule from stabilising in a valid final state. In fact, the prevailing local connections between cells, forming clustered neighbourhoods, enable the formation and progression of regular cell domains within the course of the automaton run, similar to the GKL and Coevolution2 rule (footnote 6). In this way, the evolved rule differs significantly from the majority rule, as the latter only contains the basic domains of continuous “0” and “1” blocks.

Gbca rule’s capability of utilising the regular shortcuts, introduced by the RingSWGraph-model with increasing S, can also provide an indication why its performance does not drop as sharply as the performance of the majority rule for increasing

Footnote 6: The boundaries between the domains were called particles within the context of standard CA rules, compare to Section 2.1.

[Figure 4.9, space-time diagrams omitted. (a) Lattice coupling structure. (b) φ-graph coupling structure: φ = 0.01. (c) φ-graph coupling structure: φ = 0.4. Time t runs from 0 to 298.]

Figure 4.9: Space-time diagrams of Gbca rule for increasing φ values

[Figure 4.10, space-time diagrams omitted. Time t runs from 0 to 298.]

Figure 4.10: Space-time diagrams of Gbca rule for the RingSWGraph-model with parameter φmin = 0.5

S values in the range of ]0, 0.3]: both rules exhibit a performance value higher than 0.8 for graphs in the random domain (the upper left domain of the performance scatter-plots in Fig. 4.8a and Fig. 4.8b). When the fraction of shortcuts φ decreases and some vertices gain a higher significance value than others, S starts to increase from a value close to 0. The majority rule quickly loses its basis for good classification, the quality of the random samplings represented by cell neighbourhoods. In contrast, Gbca rule benefits from the increasing clustering and shows its highest performance in this domain, just before φ falls below a critical level, causing S to rise above approximately 0.2. Thereafter, Gbca rule experiences a performance loss. Recall that Gbca graph, the graph structure evolved by coevolution, possesses a significance value of approximately 0.1, which lies within the efficient S range of [0, 0.2].

4.2.7 Performance Development in Noisy Environment

It was stated in [17] that the majority rule using a graph based coupling structure is especially resistant to the influence of noise. In contrast, the performance of GKL was shown to be significantly reduced even for moderate noise levels. In the following, the behaviour of Gbca rule using Gbca graph is examined under the influence of noise, and its performance is compared to the performance of the majority rule and GKL under the same conditions. The noise model employed during the GBCA runs is the model of Moreira et al., see Def. 2.18.

The scatter-plot in Fig. 4.11 depicts the rule performance depending on noise level η for Gbca rule and majority rule using Gbca graph to determine the GBCA cell topology. Additionally, the performance of GKL run on the same set of configurations is included. 10^4 initial configurations were randomly constructed according to a uniform

[Figure 4.11, plots omitted. (a) Gbca rule, majority rule and GKL comparison: performance against noise level η. (b) Gbca rule, (c) Majority rule, (d) GKL rule: performance against density ρ and noise level η.]

Figure 4.11: Rule performance comparison under the influence of noise, using 10^4 configurations with uniformly distributed density

density distribution. Note that this process differs from the technique used to generate the configurations for computing the evaluation performance. The evaluation performance was calculated for bitwise randomly generated configurations, whose density is peaked around ρcrit = 0.5. In contrast, for the configurations with uniformly distributed density, each density is equally likely to be generated, so that the rule’s performance can be determined over the whole range of possible densities, instead of mainly focusing on ρcrit. Note that this is the same procedure that Packard introduced to the density classification task (see p. 7).
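The two ways of generating initial configurations can be contrasted in a short sketch. The function names are hypothetical and the code only illustrates the sampling schemes described above; it is not taken from the evaluation software.

```python
import random


def bitwise_random_configuration(n, rng):
    """Each cell is 0 or 1 with equal probability; densities cluster around 0.5."""
    return [rng.randint(0, 1) for _ in range(n)]


def uniform_density_configuration(n, rng):
    """The number of ones is drawn uniformly from 0..n, then placed at random cells."""
    ones = rng.randint(0, n)
    cells = [1] * ones + [0] * (n - ones)
    rng.shuffle(cells)
    return cells


def density(cells):
    """Fraction of cells in state 1."""
    return sum(cells) / len(cells)
```

For a configuration of 149 cells, the bitwise scheme yields densities with a standard deviation of roughly 0.04 around 0.5, whereas the uniform scheme covers the whole interval from 0 to 1.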

Fig. 4.11a shows the performance of all three rules depending on η. One can observe that the majority rule performs best for the whole range of considered noise levels, whereas Gbca rule stays relatively close to it for 0 ≤ η ≤ 0.1 but then drops behind the majority rule, which also continues its decrease. GKL’s performance, however, drops much earlier than that of both considered GBCA rules, and its decline is also sharper.

Fig. 4.11b, 4.11c and 4.11d show the performance of each of the three rules depending on the density ρ of the configuration to be classified and on η. The performance of the majority rule reaches furthest into the range of noise levels, and the rule is still able to classify relatively high and low densities after η has risen to a value of 0.4. Additionally, its performance for configurations lying close to ρcrit does not collapse, as happens for the GKL rule. GKL does not perform well even for very high or low densities at reasonable noise levels. This might be due to the noise destroying the particles that are used by GKL to disseminate majority state information throughout the automaton (see p. 8). Gbca rule performs similarly to the majority rule, in the sense that it neither collapses in the domain of ρcrit nor prematurely fails to classify very high or low densities. However, its performance decreases earlier than that of the majority rule.

4.3 Summary

The analysis in this chapter indicates that the properties of the evolved GBCA rule are indeed very different from the properties of the majority rule, which has been primarily used because of its simplicity and its easy adaptation to the application of graph based CA. Contrary to the majority rule, whose performance correlates positively with the randomness of cell interconnections, Gbca rule relies on both principles, regular structure and random shortcuts. As was shown during the different evaluations presented in this chapter, Gbca rule exhibits a higher performance using the evolved graph structure than the majority rule based on a random connection structure. Both rules, however, also share certain properties. Besides the decreased length of automata runs, which has not been considered here but which is obvious from observing the space-time diagrams, both rules show similar resistance to noise.

Considering the graph models used in the literature, Gbca rule possesses an average performance for moderate parameter values, which have been shown to be completely insufficient for the majority rule. In this sense, the evolved rule seems to be more suitable for graph based coupling structures originating from the small-world domain.

Besides the comparisons with the majority rule as an example for GBCA rules, the evolved rule has also been related to standard CA rules performing the density classification task, such as Coevolution2 or GKL. The evaluations have shown that Gbca rule in connection with Gbca graph outperforms standard CA rules according to different criteria: Gbca rule has a higher performance, a better resistance to noise and, finally, requires fewer automaton update cycles to reach a consensus state. Furthermore, it might be argued that graph based systems are indeed more common in nature than purely lattice shaped systems. Therefore, a GBCA would be considered a more realistic model for a cellular computing system requiring global coordination. Gbca rule also shares a property with GKL and Coevolution2 that is considered typical for more elaborate standard cellular automata rules: it forms blocks of continuous cell states that can be observed in space-time diagrams to “move” through the automaton, possibly interacting with other blocks at their boundaries.

Overall, the coevolutionary system has proven able to coevolve high performance cellular automata rules along with coupling structures determining the connections between their cells. Even though the problem formulation was strongly focused on the particular definition of the density classification task, it should also be possible to apply it to more general CA problem situations, as the coevolutionary framework did not require any particular dependency on the evaluation of the individuals.


5 Conclusion

This work originates mainly from three domains, which have been addressed individually in the introduction chapter: the field of cellular automata, evolutionary algorithms, particularly coevolutionary optimisation, and graph theory, mostly covered by the application of the small-world paradigm. Each of the domains contributed to the finding of different automata providing a partial solution of the density classification task. The fields and their contributions are summarised below.

• Cellular automata theory has gained importance in a wide range of applications over the last years. Its significance as a nonlinear modelling paradigm for biological, sociological and physical processes increased its popularity in general. Early research conducted within the context of cellular automata theory addressed the density classification task by trying to determine properties needed for CA rules to successfully perform the task, e.g. following the idea of “computation at the edge of chaos”.

• Coevolutionary optimisation algorithms have been applied to various problems, which usually included some kind of bipartite problem versus solution situation. Following the earlier approaches, coevolutionary algorithms have also been utilised for the density classification task, focused on the study of dynamics necessary to evolve high performance candidate solutions in rough and noisy fitness landscapes. Research performed in this context led to the discovery of the best performing standard CA rules for the task.

� The “discovery” of small-world networks has triggered a wide range of researchstudies originating from different disciplines. These have been partly concernedwith the search and verification of small-world networks within biological, so-ciological and evolving technical networks, such as the Internet. The study ofnetwork structures exhibiting small-world properties and their influence on dy-namic systems included the application of special graph based coupling structuresto CA, particularly to the density classification task. This work by Watts andStrogatz originated from the claim that the majority rule using small-world graphbased cellular automata is able to develop high performance for this task, whichhad been questioned at first (see [22], page 247).

This work considers a combination of all three approaches, including aspects from every field. The resulting coevolutionary framework was designed to evolve three different species in parallel instead of two: rules, graphs and initial configurations. The main issues addressed in this work, as stated at the beginning, are the following:

1. Coevolution of a high performance pair of cellular automaton rule and coupling structure for the density classification task.


2. Analysis of whether evolution tends to develop graphs with small-world properties.

3. Performance evaluation of the best evolved rule and coupling structure, characterising the dependence of the rule on changing graph structures and determining the influence of noise on the rule’s performance.

4. Classification of the rule’s behaviour, comparing it to other well known rules, such as the majority rule, GKL and Coevolution2.

The coevolutionary system described in Chapter 3 succeeded in evolving a pair of GBCA rule and graph based coupling structure that shows a higher performance than the currently best known standard CA rule, the Coevolution2 rule developed by Juille and Pollack. Moreover, this pair, consisting of Gbca rule and Gbca graph, also outperforms the majority rule using a random graph connection structure. An analysis of the behaviour of Gbca rule and Gbca graph has been performed in Chapter 4. Some essential differences compared to the behaviour of both standard CA and the majority rule using different graph models could be found. These properties indicate that the coevolutionary system facilitates the evolution of interdependent rules and graphs. Considering the clustering coefficient γ and the characteristic path length L, one could observe that the evolved graphs indeed possess characteristics typical for small-world graphs.
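To make the two graph measures mentioned above concrete, the following minimal sketch computes γ and L for a regular ring lattice, the usual starting point of small-world constructions. It assumes the undirected definitions of Watts and Strogatz; the thesis works with directed graphs, so its exact definitions may differ. The function names and the choice of 149 vertices with six neighbours each are illustrative assumptions.

```python
from collections import deque


def ring_lattice(n, k):
    """Undirected ring lattice: every vertex is linked to its k/2 nearest
    neighbours on each side (k is assumed to be even)."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for offset in range(1, k // 2 + 1):
            w = (v + offset) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj


def clustering_coefficient(adj):
    """Average, over all vertices, of the fraction of neighbour pairs that
    are themselves connected (gamma in the text)."""
    total = 0.0
    for v, neighbours in adj.items():
        nbrs = list(neighbours)
        k = len(nbrs)
        if k < 2:
            continue  # such a vertex contributes zero
        links = sum(1 for i in range(k) for j in range(i + 1, k)
                    if nbrs[j] in adj[nbrs[i]])
        total += links / (k * (k - 1) / 2)
    return total / len(adj)


def characteristic_path_length(adj):
    """Mean shortest-path distance over all ordered vertex pairs (L in the
    text), using one breadth-first search per vertex. The graph is assumed
    to be connected."""
    n = len(adj)
    dist_sum = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        dist_sum += sum(dist.values())
    return dist_sum / (n * (n - 1))


# Illustrative dimensions only (assumption): 149 vertices, degree 6.
lattice = ring_lattice(149, 6)
gamma = clustering_coefficient(lattice)
L = characteristic_path_length(lattice)
print(f"gamma = {gamma:.3f}, L = {L:.3f}")
```

A small-world graph in this sense keeps γ close to the lattice value while L drops towards the value of a random graph of the same size and degree.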

As the main focus of this work was the optimisation of GBCAs for this very specific problem situation, the density classification task deserves a discussion about its relevance. In the following, the task will first be examined from the GBCA point of view and then potential similarities to other problem situations are highlighted.

The density classification task represents an ideal problem case for the evolution of sophisticated GBCA rules showing complex behaviour. Recall that for the standard CA, complex behaviour is usually associated with a relatively long transient time before the automaton reaches its final state. However, for GBCAs the number of intermediate steps necessary to reach a consensus state of all cells is typically much lower, depending on the diameter of the graph coupling structure. The most important properties of the classification task within this context are:

• The density classification task requires an efficient connection structure between the cells in order to compute a global automaton property based on local neighbourhood information only.

• It requires long range coordination and synchronisation of cells, in the presence of a small neighbourhood size in comparison to the number of cells within the automaton.

• The interconnections between the cells can be directly modelled as a graph. However, if the size of the neighbourhood should remain the same for all cells, then it is necessary to consider directed graphs with a regular vertex degree.

• The evaluation performance is clear, well defined and easy to compute, though its computation requires a significant amount of time if the parallel CA has to be emulated by a serial software system.


• It has been proven in [13] that no two-state CA exists which is able to solve the task perfectly. Nevertheless, no upper bound for the performance of CA rules has been found yet.

These properties qualify the density classification task for serving as an ideal test case to study the impact of local changes on the dynamic behaviour of the GBCA. The effect of minor modifications within the connection structure or within the rule table can be observed directly by evaluating changes in the classification performance of the automaton.
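The properties listed above can be illustrated with a small simulation. The sketch below is not the evaluation code of this work; it runs the plain majority rule on a randomly wired directed graph with a regular input degree and estimates the fraction of correctly classified random initial configurations. All names, the sampling of configurations (each cell set independently with probability 0.5) and the parameter values are assumptions made for illustration.

```python
import random


def random_inputs(n, degree, rng):
    """Directed coupling structure: every cell reads from `degree` other
    cells drawn uniformly at random (repeated inputs are allowed)."""
    graph = []
    for v in range(n):
        others = [u for u in range(n) if u != v]
        graph.append([rng.choice(others) for _ in range(degree)])
    return graph


def majority_step(state, graph):
    """Synchronous update: each cell adopts the majority state of itself and
    its input cells. An even degree gives an odd number of votes, so no ties
    occur; with an odd degree, ties resolve to 0."""
    new_state = []
    for v, inputs in enumerate(graph):
        ones = state[v] + sum(state[u] for u in inputs)
        new_state.append(1 if 2 * ones > len(inputs) + 1 else 0)
    return new_state


def classifies_correctly(state, graph, max_steps):
    """True if all cells reach the majority state of the initial
    configuration within max_steps updates."""
    target = 1 if 2 * sum(state) > len(state) else 0
    for _ in range(max_steps):
        if all(cell == target for cell in state):
            return True
        successor = majority_step(state, graph)
        if successor == state:
            return False  # stuck in a fixed point other than the target
        state = successor
    return all(cell == target for cell in state)


def performance(graph, trials, max_steps, rng):
    """Fraction of random initial configurations classified correctly."""
    n = len(graph)
    correct = 0
    for _ in range(trials):
        config = [rng.randint(0, 1) for _ in range(n)]
        correct += classifies_correctly(config, graph, max_steps)
    return correct / trials


# Illustrative parameters (assumptions): 149 cells, six inputs per cell.
rng = random.Random(0)
graph = random_inputs(149, 6, rng)
print(performance(graph, trials=100, max_steps=100, rng=rng))
```

Changing a single entry of `graph` or replacing `majority_step` by a table lookup and re-running `performance` shows directly how a local modification affects the classification result, which is the kind of experiment the paragraph above refers to.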

Some of the properties denoted above can also be transferred to a more general context, therefore enabling the GBCA to serve as a simplified model of a distributed computing system. Though the automata rules used for the density classification task are remarkably primitive and the state information stored for each cell is reduced to a single bit, the automaton exhibits complex behaviour in terms of both the development of cell states (observable in space-time diagrams) and the evaluation performance. Therefore, to find candidates that could be modelled by GBCAs one has to find systems whose behaviour is not determined by the capability of their single components but is governed by their connectivity and cooperation. In this sense, the system might be considered to “be more than the sum of its parts”. Other characteristics are global coordination and synchronisation of components, solely based on local neighbourhoods instead of a centralised entity, which means the system ought to be self-organising.

In the following, examples of systems are given that fulfil some or all of the criteria denoted above.

• Ad hoc networks, consisting of cellular, possibly mobile devices, which use wireless communication channels. Note that ad hoc networks are especially required to operate in a decentralised manner and to show a high degree of self-organisation.

• Computing architectures following the cellular computing approach, which has been discussed recently.

• Distributed (global) computing approaches, which rely heavily on the optimisation of communication overhead.

• Ubiquitous computing related applications, as distributed computation also requires effective communication and coordination of potentially primitive components.

• Widespread, large-scale wireless LANs, whose performance depends on the capability and location of access points and their interconnections.

Following the approach presented in this work, one can consider coevolution of topology and underlying protocols for networks originating from one of the fields mentioned above. During this process one would probably focus on one or a few possible applications that are required to run on the network and therefore determine the capability of the candidate solutions. Thereafter, conclusions could be drawn from the outcome, which would be used to guide the design and implementation of the final system.


This of course requires a computational model of the system to be optimised, which might not be available in every case. Additionally, the system has to be suitable to be modelled within the concept of evolutionary algorithms in the first place. To overcome this problem, the coevolutionary algorithm could employ the genetic programming paradigm, which enables the optimisation to take place at a much higher, algorithmic level.

Finally, the property to be optimised, e.g. transmission throughput or network lifetime in the presence of energy constraints, might not be suitable to serve as an objective function for the evolutionary algorithm. However, the stochastic nature and the size of the solution space of the relatively simple density classification task already yield a highly irregular fitness landscape for the space of GBCA rules and graphs. Despite these obstacles, the coevolutionary algorithm succeeds in evolving rule and graph individuals that are adapted to one another, so that they exhibit a high performance when they are combined.

The application of “graph-aware” principles or protocols to the networks or computing architectures denoted above may increase their performance and scalability. However, there are still open questions to be answered. New criteria for the dependency of dynamic systems on their topology have to be considered in the years to come. The dynamics of small-world networks are still not completely understood, especially concerning the aspect of growth or locality and mobility, as well as their behaviour under the influence of environmental changes, such as component failures. New tools and objectives have to be developed to meet the requirements of large-scale graph and network analysis.


Bibliography

[1] David Andre, Forrest H Bennett III, and John R. Koza. Discovery by genetic programming of a cellular automata rule that is better than any known rule for the majority classification problem. In John R. Koza, David E. Goldberg, David B. Fogel, and Rick L. Riolo, editors, Genetic Programming 1996: Proceedings of the First Annual Conference, pages 3–11, Stanford University, CA, USA, 28–31 1996. MIT Press.

[2] Peter J. Angeline and Jordan B. Pollack. Competitive environments evolve better solutions for complex tasks. In Stephanie Forrest, editor, Proceedings of the 5th International Conference on Genetic Algorithms, ICGA-93, pages 264–270, University of Illinois at Urbana-Champaign, 17-21 July 1993. Morgan Kaufmann.

[3] Albert-Laszlo Barabasi. Linked. Penguin USA, 2003.

[4] Albert-Laszlo Barabasi and Reka Albert. Emergence of scaling in random networks. Science, 286:509–512, October 1999.

[5] Matthew Cook. Universality in elementary cellular automata. Complex Systems, 15, 2004.

[6] James P. Crutchfield and Melanie Mitchell. The evolution of emergent computation. In Proceedings of the National Academy of Sciences, USA 92:23, pages 10742–10746, 1995.

[7] Rajarshi Das, Melanie Mitchell, and James P. Crutchfield. A genetic algorithm discovers particle-based computation in cellular automata. In PPSN, pages 344–353, 1994.

[8] Tino Gramss, Stefan Bornholdt, Michael Gross, Melanie Mitchell, and T. Pellizzari, editors. Computation in Cellular Automata: A Selected Review. Wiley-VCH, Weinheim, Berlin, 1998.

[9] Phil Husbands. Distributed coevolutionary genetic algorithms for multi-criteria and multi-constraint optimisation. In Evolutionary Computing, AISB Workshop, volume 865 of Lecture Notes in Computer Science, pages 150–165. Springer, 1994.

[10] Hugues Juille and Jordan B. Pollack. Co-evolving intertwined spirals. In Lawrence J. Fogel, Peter J. Angeline, and Thomas Baeck, editors, Evolutionary Programming V: Proceedings of the Fifth Annual Conference on Evolutionary Programming, pages 461–467, San Diego, February 29-March 3 1996. MIT Press.

[11] Hugues Juille and Jordan B. Pollack. Coevolutionary learning: a case study. In Proceedings of the Fifteenth International Conference on Machine Learning, pages 251–259, 1998.


[12] Hugues Juille and Jordan B. Pollack. Coevolving the ideal trainer: Application to the discovery of cellular automata rules. In Proceedings of the Third Annual Genetic Programming Conference, 1998.

[13] Mark Land and Richard K. Belew. No perfect two state cellular automata for density classification exists. Physical Review Letters, 74(25):5148–5150, 1995.

[14] Melanie Mitchell, James P. Crutchfield, and Peter T. Hraber. Evolving cellular automata to perform computations: Mechanisms and impediments. Physica D, 75:361–391, 1994.

[15] M. Mitchell, P. T. Hraber, and J. P. Crutchfield. Revisiting the edge of chaos: Evolving cellular automata to perform computations. Complex Systems, pages 89–130, 1993. Santa Fe Institute Working Paper 93-03-014.

[16] Melanie Mitchell, James Crutchfield, and Rajarshi Das. Evolving cellular automata with genetic algorithms: A review of recent work. In Proceedings of the First International Conference on Evolutionary Computation and its Applications (EvCA’96). Russian Academy of Sciences, 1996.

[17] Andre A. Moreira, Abhishek Mathur, Daniel Diermeier, and Luís A. N. Amaral. Efficient system-wide coordination in noisy environments. In Proceedings of the National Academy of Sciences of the United States of America, volume 101(33), pages 12085–12090, 2004.

[18] Jan Paredis. Coevolutionary computation. Artificial Life, 2(4):355–375, 1995.

[19] Jan Paredis. Coevolving cellular automata: Be aware of the red queen. In Thomas Back, editor, Proceedings of the Seventh International Conference on Genetic Algorithms (ICGA97), San Francisco, CA, 1997. Morgan Kaufmann.

[20] M. Sipper and E. Ruppin. Co-evolving architectures for cellular machines. Physica D, 99:428–441, 1997.

[21] Marco Tomassini, Mario Giacobini, and Christian Darabos. Evolution of small-world networks of automata for computation. In Xin Yao, Edmund Burke, Jose A. Lozano, Jim Smith, Juan J. Merelo-Guervos, John A. Bullinaria, Jonathan Rowe, Peter Tino, Ata Kaban, and Hans-Paul Schwefel, editors, Parallel Problem Solving from Nature - PPSN VIII, volume 3242 of LNCS, pages 672–681, Birmingham, UK, 18-22 September 2004. Springer-Verlag. Only workshop paper available electronically.

[22] Duncan J. Watts. Small worlds: the dynamics of networks between order and randomness. Princeton University Press, Princeton, NJ, USA, 1999.

[23] Duncan J. Watts and Steven H. Strogatz. Collective dynamics of “small-world” networks. Nature, 393:440–442, 1998.

[24] Justin Werfel, Melanie Mitchell, and James P. Crutchfield. Resource sharing and coevolution in evolving cellular automata. IEEE Trans. Evolutionary Computation, 4(4):388–393, 2000.


[25] Stephen Wolfram. A new kind of science. Wolfram Media Inc., Champaign, Illinois, USA, 2002.

[26] Peter Zipf, Oliver Soffke, Andre Schumacher, Radu Dogaru, and Manfred Glesner. On a hardware architecture for the evolution of cellular automata functionality. In International Symposium on Signals, Circuits and Systems, ISSCS 2005, volume 1, pages 91–94, 2005.

[27] Peter Zipf, Oliver Soffke, Andre Schumacher, Radu Dogaru, and Manfred Glesner. Programmable and reconfigurable hardware architectures for the rapid prototyping of cellular automata. In International Conference on Field Programmable Logic and Applications, 2005, pages 329–334, 2005.

[28] Peter Zipf, Oliver Soffke, Andre Schumacher, Clemens Schlachta, Radu Dogaru, and Manfred Glesner. A hardware-in-the-loop system to evaluate the performance of small-world cellular automata. In International Conference on Field Programmable Logic and Applications, 2005, pages 335–340, 2005.


A Implementation

A.1 Class Diagrams

[Class diagram (a). Classes shown: CA; Gbca, GenericCa; NativeGbca, NativeGenericCa.]

(a) The cellular automata classes; GenericCa and Gbca represent the standard lattice CA and the GBCA model respectively. For each of them there is one implementation based on the Java Native Interface (JNI), which is written in C (NativeGenericCa and NativeGbca).

[Class diagram (b). Classes shown: Evolution; GenericCaEvolution, GenericCaCoEvolution, GbcaCoEvolution.]

(b) The Evolution base class and its particular subclasses, which implement the different strategies used for evolution, as described in 3.2.2: GenericCaEvolution, which evolves a single population of rules, GenericCaCoEvolution, which coevolves rules and initial configurations, and GbcaCoEvolution, which coevolves graph based automata rules, coupling structures and initial configurations.


[Class diagram (c). Classes shown: GenerationManager; GenericCaEvolutionGenerationManager, GenericCaCoEvolutionGenerationManager, GbcaCoEvolutionGenerationManager.]

(c) As described in 3.2.2, the task of the GenerationManager is to operate on the current Generation of each species. Therefore, for each of the strategies, there exists a corresponding GenerationManager.

[Class diagram (d). Classes shown: Generation; CellularGeneration, GraphGeneration; CaInputGeneration, CaRuleGeneration; GraphAdjListGeneration.]

(d) The classes implementing the Generation of candidate solutions. There is one type of Generation for each species, which is composed of individuals of the corresponding type. The rule and configuration generations are abstracted to the CellularGeneration class, which represents a Generation of cellular binary individuals. CellularGeneration provides methods to select individuals based on their density, for example. Note that CaInputGeneration models a generation of initial configurations. The naming was chosen to abstract from configurations and potentially consider other types of CA inputs as well. See also 3.2.2.


[Class diagram (e). Classes shown: Individual; CellularIndividual, GraphIndividual; CaInputIndividual, CaRuleIndividual; GraphAdjListIndividual.]

(e) The implementation of the Individual types reflects the same concept as the Generation subclasses. Each species possesses unique properties and methods to operate on them.

[Class diagram (f). Classes shown: Genome; CellularGenome, GraphGenome; CaInputGenome, CaRuleGenome; GraphAdjListGenome.]

(f) As described in 3.1.2, there are different genetic operators for each species. These are implemented to work on the corresponding Genome, which is part of each Individual.


[Class diagram (g). Classes shown: «interface» FitnessFunction; GenericCaEvolutionFitnessFunction, CaRuleFitnessFunction, CaInputFitnessFunction, GraphFitnessFunction.]

(g) The FitnessFunction embodies the central element for the evaluation of the particular Individual. Depending on the performance of the candidate solution that is encoded in its Genome, the FitnessFunction computes the fitness value for the particular Individual. In practice, the GenerationManager creates a FitnessFunction of the correct type and passes it to the Individual, which is required to call the computeFitness() method of the FitnessFunction. See also 3.2.2.

[Class diagram (h). Classes shown: Logger; CALogger, GraphsLogger; CaInputsLogger, CaRulesLogger.]

(h) There are different Logger classes, which are responsible for logging evolution data during the system run. See 3.2.3 and 3.2.5 for a description of the logging system and A.2 for the population data file format. The CaInputsLogger and CaRulesLogger classes log CaInputGeneration and CaRuleGeneration data respectively, whereas GraphsLogger is used for GraphGeneration.


[Class diagram (i). Classes shown: «interface» ConfigRecord; CaConfig; GenericCaConfig, GbcaConfig.]

(i) The ConfigRecord interface declares methods to initialise and check configuration records. It is used for parsing the configuration files, such as client.xml, genetic.xml and server.xml, for storing configuration data within the population data files and to pass configuration data over the CAInterface. The two subclasses GenericCaConfig and GbcaConfig of CaConfig contain the configurations of the two particular cellular automata types. They correspond to the <generic-ca-config> and <gbca-config> sections in genetic.xml.

[Class diagram (j). Classes shown: NetConfig, NetworkConfig, ClientConfig, ServerSetConfig, ServerConfig.]

(j) These classes are used to process and store network configuration data. They correspond to the <network-config>, <client-config>, <server-set-config> and <server-config> sections in client.xml and server.xml. See above for a short description of the ConfigRecord interface.


[Class diagram (k). Classes shown: GeneticConfig; EvolutionConfig; GbcaCoEvolutionConfig, GenericCaCoEvolutionConfig, GenericCaEvolutionConfig.]

(k) The GeneticConfig class is the top-level container for all configuration records, except the network configuration, and corresponds to the <genetic-config> section in genetic.xml. The configuration of the coevolution run, e.g. the type of optimisation to perform, is included in the particular subclass of EvolutionConfig. A GenerationConfig entity is part of each EvolutionConfig (see below). In particular, the subclasses correspond to the <gbca-co-evolution-config>, <generic-ca-co-evolution-config> and <generic-ca-evolution-config> sections of genetic.xml. Each of them relates to one of the three strategies described in 3.2.2. See above for a short description of the ConfigRecord interface.

[Class diagram (l). Classes shown: GenerationConfig; PopulationConfigRecord; CaRulePopulation, CaInputPopulation, GraphPopulation, GenericIcEvolutionPopulation.]

(l) The GenerationConfig class holds a set of PopulationConfigRecords, which specify the type and properties of the particular populations involved in the evolutionary optimisation. It corresponds to the <generation-config> section in genetic.xml. PopulationConfigRecord serves as an abstract base class which is extended for each species appropriately. Its subclasses correspond to the <ca-rule-population>, <ca-input-population>, <graph-population> and <generic-ic-evolution-population> sections of genetic.xml, the last of which is used to specify configurations for GenericCaEvolution. See above for a short description of the ConfigRecord interface.


[Class diagram (m). Classes shown: DataSet; CADataSet; CaGraphs, CaInputs, CaOutputs, CaRules, IterativeCaOutputs.]

(m) The DataSet class is an abstract base class for the data record classes used to exchange data over MessageChannel objects and between CAInterfaces by using a network connection. Besides the classes for the main types of data records, such as rules, graphs and initial configurations, there is also the IterativeCaOutputs class, which is used to store iterative automata runs. These differ from the runs used for evaluation because they contain all intermediate steps for the purpose of visualisation.

[Class diagram (n). Classes shown: DataSet; CADataSet; «interface» OutputSet; CaOutputs, IterativeCaOutputs, GraphOutputs, IterativeGraphOutputs.]

(n) All data records containing output generated by cellular automata runs are required to implement the OutputSet interface. This interface declares methods the CAHost uses to merge output data, which was received from multiple sources, i.e. CAServers or CAInterfaces in local mode. A GraphOutputs entity is composed of several CaOutputs (one for each coupling structure), which are transmitted sequentially. After arrival, the CaOutputs are recomposed to a GraphOutputs object. See above for a short description of the IterativeCaOutputs class.


[Class diagram (o). Classes shown: Graph; LatticeGraph, RandomGraph, ScaleFreeGraph; SWGraph; BetaGraph, PhiGraph, RingSWGraph.]

(o) The Graph base class and its particular subclasses implement the different graph models used for the evaluation of GBCA performance based on graph properties. See 2.3.1 for the β, φ and scale-free graph models and 4.2.2 for the RingSWGraph model. Note that the SWGraph class constitutes a small-world graph on the basis of a rewiring operation, which is part of any of the considered graph construction algorithms, except for the scale-free model. As the basis of all these model-dependent rewiring operations is a regular directed lattice graph, the SWGraph class extends LatticeGraph.


A.2 Population Data File Format


In the following, the file format used to store individual data during a run of the coevolutionary system is described in detail. As already mentioned in Section 3.2, there exists one data file for each population, which is accessed by the Logger responsible for the particular Generation when requested by the GenerationManager. The population data file is created at system startup and is used to store the current Generation object during each evolution cycle.

However, for efficiency reasons, not all data contained in every Individual is written to the file. The “volatile” fields, such as fitness values, which can still be reconstructed by repeating the evaluation step, are left out. Nevertheless, the basic principle followed during the implementation was to facilitate a simple restoration of the evolution data, enabling a recursive serialisation and deserialisation procedure. Also, some redundant data fields are included in the file, such as the density value for the CellularIndividual, which is included to simplify a later search through the data records at the cost of additional disc space. The population data files facilitate the later recovery of any arbitrary individual or any complete generation evolved during the evolutionary process.

All three population data files can be split into three sections: the header, the configuration section and the data section. Below, the sections of the three different population data files are compared. See Fig. A.1 for the different concepts.

Header: The header is identical for all three species (rules, configurations and graphs). It consists of a string marker written in Java Unicode, namely “# Population data file\n”, followed by a Java long value specifying the version, 1010 in this case. The next Java long value specifies the number of generations stored in the file.

Configuration Section: For all three species this section holds the PopulationConfigRecord, which contains the properties of the population along with the evolutionary settings, such as mutation rate or generation gap. The data file used to store CA rule data also includes the CaConfig object, which contains the parameters for the CA.
The configuration section is needed to ensure data integrity for a later evaluation of the evolved individuals. It also simplifies the post mortem analysis performed by the GbcaEvaluator tool.

Data Section: The data section contains the actual saved generation data, formed by a list of generations, each of which consists of a list of individuals. Although there are some differences between the data formats for the three individual types, the representation of a single generation is essentially the same: the first record holds the index of the generation within the evolution, followed by the number of individuals in the generation and the current value of the individual counter, which is used to assign Id values to newly created individuals. The next field holds the serialised RandomSet of the generation, which consists of a set of random generators associated with the Generation. These generators are used, for example, to provide random input to the genetic operators crossover, mutation and selection.
The representation of the individuals varies from species to species. The structure of rules and the structure of configurations are basically the same; only the offspring counter was omitted for configuration individuals. This byte value is incremented each time the particular rule or graph individual produces offspring. The density was included in the representation as an integer value to enable a faster search through the data file when only certain density values are of interest, although it could also be computed from the genome. Rule and configuration genomes are composed of a length record and a consecutive byte set of the specified length, which is an encoded version of the bit vector forming the rule and configuration genotype. The graph genome holds the number of vertices within the graph (the value of the length field) and the outgoing degree. Thereafter, a list of adjacency records follows.

Figure A.1: File Format. (a) CA rule population data file. (b) Initial configuration population data file. (c) Graph population data file.
[Byte-layout diagrams not reproduced. Recoverable field labels: “Header” string “# Population data file\n” (size = 26), Version and Generations; a length-prefixed CaConfig (rule file only) and a length-prefixed population record (CaRulePopulation, CaInputPopulation or GraphPopulation); per generation: Generation Counter, IndividualsSize, Id Counter and a length-prefixed Random Set; per individual: Id, Offspring (rule and graph files only), ρ (rule and configuration files only) and the genome, of size = length for rules and configurations and size = length ∗ node degree for graphs. Legend: one box ≡ 1 byte.]

For the sake of simplicity (and disc space), it was decided to restrict the cell (or vertex) indices to the range [0, 255], therefore restricting the maximal number of cells or vertices to 256. However, this restriction only concerns the data files; it does not affect the surrounding system and could easily be adjusted.
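The consequence of the one-byte index restriction can be sketched as follows. The example packs a graph genome as a vertex count, an outgoing degree and length ∗ degree adjacency bytes, and reads it back. It is only an illustration of the layout described above: the widths and byte order of the two leading fields (32-bit big-endian integers here) and the function names are assumptions and do not reproduce the actual Java serialisation.

```python
import struct

HEADER = ">ii"  # assumed: two 32-bit big-endian integers (length, degree)


def pack_graph_genome(adjacency):
    """Encode an adjacency list with regular degree into bytes, one byte
    per neighbour index."""
    length = len(adjacency)
    degree = len(adjacency[0])
    if length > 256:
        raise ValueError("at most 256 vertices fit one-byte indices")
    if any(len(row) != degree for row in adjacency):
        raise ValueError("all vertices must have the same degree")
    # bytes() raises ValueError for any index outside the range [0, 255]
    body = bytes(index for row in adjacency for index in row)
    return struct.pack(HEADER, length, degree) + body


def unpack_graph_genome(data):
    """Decode the byte string produced by pack_graph_genome."""
    length, degree = struct.unpack_from(HEADER, data, 0)
    offset = struct.calcsize(HEADER)
    body = data[offset:offset + length * degree]
    return [list(body[v * degree:(v + 1) * degree]) for v in range(length)]


# Hypothetical three-vertex example with outgoing degree 2.
example = [[1, 2], [2, 0], [0, 1]]
packed = pack_graph_genome(example)
print(len(packed), unpack_graph_genome(packed) == example)
```

Lifting the restriction would amount to widening each adjacency entry, for instance to two bytes, at the cost of doubling the size of the adjacency block.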

A.3 Random Generators

Primarily, the basic random generator, originating from the java.util.Random class of the Java 2 platform, has been used to produce the necessary random input for coevolution and subsequent evaluation. According to the Sun Java documentation, it uses a 48-bit seed, which is modified using a linear congruential formula.

However, as the random generator is a crucial factor for the significance of the results, a native version of the methods to generate initial configurations was implemented, which employs random generators of the GNU Scientific Library (GSL). Several runs were performed using different generators declared in <gsl/gsl_rng.h>, but the differences between the performance results for different generators lay within the deviation computed in 4.2.3. Therefore, the evaluation performance of the evolved rules could be confirmed. However, it is hard to estimate the influence of the random generator on the evolutionary optimisation process.
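For illustration, the linear congruential formula documented for java.util.Random can be re-implemented in a few lines. The sketch below follows the constants given in the Java API documentation (multiplier 0x5DEECE66D, increment 0xB, 48-bit state); the class and method names are chosen here for illustration and are not part of the system described in this work.

```python
class JavaRandom:
    """Sketch of the generator documented for java.util.Random."""

    MULTIPLIER = 0x5DEECE66D
    INCREMENT = 0xB
    MASK = (1 << 48) - 1  # the state is kept to 48 bits

    def __init__(self, seed):
        # The documented constructor scrambles the seed with the multiplier
        self.seed = (seed ^ self.MULTIPLIER) & self.MASK

    def next_bits(self, bits):
        """Advance the state and return its upper `bits` bits (1 to 32)."""
        self.seed = (self.seed * self.MULTIPLIER + self.INCREMENT) & self.MASK
        value = self.seed >> (48 - bits)
        if value >= 1 << 31:
            value -= 1 << 32  # mimic Java's signed 32-bit int
        return value

    def next_int(self):
        return self.next_bits(32)

    def next_boolean(self):
        return self.next_bits(1) != 0


generator = JavaRandom(0)
print(generator.next_int())
```

Because only the upper bits of the 48-bit state are returned, the well-known weakness of the low-order bits of such generators is partly hidden, which is one reason why a cross-check against independent generators remains useful.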


B Results

Run B, which was analysed in Section 4.1, yielded the discovery of Gbca rule and Gbca graph. In the following, the steps that led to its evolution are briefly outlined, as well as the command parameters used to obtain the evaluation data for Section 4. Note that the configuration files that were used during the process, client.xml, server.xml and genetic.xml, are the ones given in C.1.

B.1 Coevolution

Coevolutionary process: The coevolution is run for 10000 generations, given the specified parameters (providing data for Fig. 4.1 and Fig. 4.2).
# start_client.sh -n gbca_1 -e gbca_co_evol_conf

Re-evaluation of rules: The rules and graphs are re-evaluated starting from generation 1000 (providing data for Fig. 4.3 and Fig. 4.4; see footnote 1).
# start_gbca_evaluator.sh -r log/ca/population_ser.data -g log/graph/population_ser.data -o gbca_evaluator/eval -t 30000 -s 0 -d -a 1000 -b 10000

Analysis of evaluation performance: Visual inspection of the data obtained from the previous steps shows several rule and graph combinations that exhibit a relatively high performance, among them rule individual 1319126 and graph individual 105475 in generation 8792. The corresponding line in eval_rules_best_performance.log shows:
8792 0.8723666666666666 26171 0 3 5.893333333333328 # rule ID: 1319126 graph ID: 105475

Extraction from population data file: Using the IndividualScanner one is able to obtain the individuals:
# start_individual_scanner.sh -p log/ca/population_ser.data -r 1319126 > gbca_rule
# start_individual_scanner.sh -p log/graph/population_ser.data -g 105475 > gbca_graph

Computing the evaluation performance: Using the GbcaTestBench the evaluation performance value of 0.87152 can be computed:
# start_gbca_testbench.sh -r gbca_rule -g gbca_graph -n gbca_1 -o testbench_gbca -t 100000 -s 0

Footnote 1: The -d flag was used to evaluate rules with density ρ = 0.5 only, because experiments indicated that these show the highest performance. The restriction was imposed to reduce the run time.


For comparison, a performance value of 0.85962 is computed for Coevolution2:
# start_genca_testbench.sh -r rules/coevolution2.rule -n generic_1 -o testbench_ca -t 100000 -s 0

B.2 Analysis

For the data concerning the progress of the coevolutionary algorithm that was gathered by the Loggers during the evolution process, see 3.2.3. The evaluation data was obtained as described below.

Tab. 4.2 and Tab. 4.3:

# start_gbca_graph_comparator.sh -r gbca_rule -g gbca_graph -n gbca_1 -o graph_comparator/gcmp3 -t 100000 -s 2 -b '0.4' -p '0.4'

# start_gbca_graph_comparator.sh -r gbca_rule -g gbca_graph -n gbca_1 -o

# start_gbca_graph_comparator.sh -r gbca_rule -g gbca_graph -n gbca_1 -o


Fig. 4.6, Fig. 4.7 and Fig. 4.8:

# start_gbca_graph_testbench.sh -r gbca_rule -g gbca_graph -n gbca_1 -o graph_testbench/gtb -t 10000 -s 0

Fig. 4.11:

# start_gbca_noise_testbench.sh -r gbca_rule -g gbca_graph -n gbca_1 -o noise_testbench/ntb -t 10000 -s 0

B.3 Gbca rule

In the following, a representation of the evolved Gbca rule is given:

 0 00000001 00000001 00010111 00000101
32 00000001 00010111 00010111 01011111
64 00010101 00000111 01010111 01011111
96 00010111 01011111 01111111 01111111
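The rule table can be read mechanically. The following Python sketch is not part of the thesis tools; it only assumes that the leading number of each row is the index of the first bit in that row. It concatenates the bit groups and computes the rule density, i.e. the fraction of table entries that map to state 1. The table has 128 entries, which would be consistent with a neighbourhood of six cells plus the cell itself (2^7 entries); how a table index maps to a neighbourhood state is not specified here and is left open.

```python
# The four rows of the rule table as printed in B.3: a bit offset followed
# by four 8-bit groups (32 bits per row, 128 bits in total).
RULE_TABLE = """
0  00000001 00000001 00010111 00000101
32 00000001 00010111 00010111 01011111
64 00010101 00000111 01010111 01011111
96 00010111 01011111 01111111 01111111
"""


def parse_rule_table(text):
    """Concatenate the bit groups of all rows into one list of 0/1 ints."""
    bits = []
    for line in text.strip().splitlines():
        offset, *groups = line.split()
        # Assumption: the leading number is the index of the row's first bit.
        assert int(offset) == len(bits), "row offset does not match position"
        bits.extend(int(ch) for group in groups for ch in group)
    return bits


def rule_density(bits):
    """Fraction of table entries that map to state 1."""
    return sum(bits) / len(bits)


bits = parse_rule_table(RULE_TABLE)
print(len(bits), rule_density(bits))  # prints: 128 0.5
```

The density of 0.5 agrees with the restriction to rules of density 0.5 that was applied during re-evaluation in B.1.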

B.4 Gbca graph

The representation of Gbca graph is given on the following pages. Note that the left column refers to cell/vertex indices of the automaton.


C N0 N1 N2 N3 N4 N5
0 146 147 148 5 124 130
1 111 2 0 2 98 4
2 148 0 1 3 44 5
3 0 1 2 4 63 6
4 1 78 3 5 6 7
5 79 3 115 66 7 8
6 3 101 5 105 8 9
7 75 5 6 8 9 10
8 100 64 61 74 10 101
9 6 7 8 11 56 12
10 7 8 9 11 12 13
11 23 9 10 108 129 14
12 9 10 11 13 3 141
13 10 11 12 14 15 16
14 11 12 13 15 16 17
15 12 13 113 16 17 18
16 13 99 15 17 63 81
17 14 131 16 18 145 115
18 15 16 17 19 46 21
19 16 146 18 20 21 22
20 17 18 144 135 85 112
21 18 56 20 22 47 24
22 19 20 128 43 94 69
23 20 119 22 104 92 26
24 21 22 23 109 146 27
25 22 23 148 26 27 8
26 80 137 25 27 7 29
27 24 25 26 28 29 77
28 25 26 139 122 30 31
29 26 27 28 30 31 32
30 27 28 29 31 32 33
31 105 29 65 32 79 34
32 29 57 87 33 34 35
33 53 3 32 34 35 3
34 136 32 18 35 36 118
35 32 133 34 37 37 103
36 33 34 35 108 38 65
37 34 64 108 76 39 40
38 35 120 37 39 40 41
39 36 10 38 40 41 42
40 130 25 59 41 42 70

C N0 N1 N2 N3 N4 N5
41 112 40 40 42 106 44
42 28 107 79 69 44 45
43 40 41 60 35 72 46
44 41 42 50 45 28 47
45 82 43 44 82 47 147
46 43 44 45 47 113 49
47 44 45 81 48 49 50
48 45 46 47 106 120 4
49 46 47 48 50 51 39
50 47 48 49 105 118 53
51 48 58 86 55 53 54
52 49 50 51 53 25 110
53 50 51 130 136 55 38
54 51 52 91 55 11 130
55 52 26 54 56 97 58
56 84 54 1 57 58 59
57 46 55 99 58 59 28
58 55 39 104 59 60 61
59 56 105 124 60 61 62
60 38 81 6 61 62 53
61 1 127 60 19 125 36
62 59 60 0 63 119 65
63 60 61 62 140 55 66
64 106 62 93 65 112 67
65 46 63 23 66 67 68
66 63 64 65 123 91 69
67 116 16 66 68 69 70
68 65 66 67 93 51 71
69 66 95 3 70 71 72
70 67 90 19 71 72 80
71 68 69 70 72 73 30
72 21 57 52 133 137 75
73 23 71 144 74 75 76
74 71 72 73 75 87 49
75 72 107 74 83 77 78
76 138 92 75 85 78 3
77 74 139 97 91 19 80
78 137 76 77 91 57 81
79 100 77 1 95 114 82
80 119 78 101 81 128 64
81 78 117 80 82 45 24

C N0 N1 N2 N3 N4 N5
82 79 80 8 66 43 85
83 80 81 31 12 85 86
84 81 49 83 139 86 87
85 82 77 84 54 113 88
86 84 84 85 87 88 89
87 84 25 86 88 89 102
88 85 141 87 89 90 91
89 86 87 88 48 79 92
90 107 88 89 53 92 31
91 55 89 90 23 93 94
92 89 90 78 93 94 95
93 90 71 92 94 95 96
94 91 14 93 9 111 97
95 29 109 37 12 77 98
96 93 94 95 71 140 99
97 94 95 96 27 99 100
98 95 96 138 145 100 8
99 39 97 98 30 101 75
100 104 98 134 101 102 37
101 98 32 14 102 103 113
102 99 77 51 129 4 121
103 100 58 102 104 105 30
104 137 102 103 58 106 107
105 85 103 121 106 98 36
106 136 104 22 107 108 109
107 130 97 127 108 109 143
108 105 106 107 109 97 111
109 76 146 38 110 111 60
110 56 108 109 111 80 113
111 40 109 78 112 98 104
112 109 110 12 119 15 115
113 110 26 53 127 115 117
114 111 63 0 115 5 0
115 112 113 73 116 117 118
116 113 114 115 117 118 43
117 114 142 116 118 119 120
118 115 100 6 70 123 121
119 116 117 10 132 121 14
120 25 114 143 85 122 45
121 83 119 120 36 123 0
122 119 52 121 4 124 125

C N0 N1 N2 N3 N4 N5
123 120 121 90 93 83 126
124 121 122 29 125 126 62
125 92 123 67 126 32 57
126 87 124 68 19 128 129
127 105 44 15 103 129 130
128 52 126 127 129 138 131
129 122 127 128 33 31 132
130 127 128 56 131 89 133
131 128 119 90 24 133 134
132 82 130 131 133 144 145
133 55 131 132 1 135 73
134 131 132 133 59 136 137
135 132 125 134 136 137 117
136 133 134 104 137 138 139
137 134 135 114 138 139 97
138 135 105 39 125 140 141
139 136 137 138 140 141 142
140 87 138 8 42 142 145
141 138 139 84 142 143 144
142 139 140 52 143 71 145
143 35 141 142 144 145 146
144 36 142 143 116 82 147
145 142 143 42 146 147 148
146 72 144 21 147 122 22
147 68 148 146 148 0 1
148 145 146 123 124 74 2


B.5 Density-Performance Diagram

Figure B.1: Performance versus ρ0. [Plot not reproduced: performance (vertical axis, 0.5 to 1) against ρ (horizontal axis, 0.42 to 0.58) for four curves: gbca rule & gbca graph, majority rule & random graph, Coevolution2, GKL.]

A common representation for the capability of a cellular automaton rule performing the density classification task is a diagram that depicts the performance depending on the density of the configurations to be classified. Fig. B.1 displays such a representation, which was computed for 100000 bitwise randomly generated initial configurations.
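How such a diagram can be tallied is sketched below in Python. This is an illustration under stated assumptions, not the code used for Fig. B.1: "bitwise randomly generated" is read as drawing every cell independently with probability 0.5, the classifier is a hypothetical callable, and results are grouped by the exact number of ones rather than by density intervals.

```python
import random
from collections import defaultdict

N_CELLS = 149  # number of cells used in the configuration records of C.1.2


def random_configuration(rng, n_cells=N_CELLS):
    """Bitwise random configuration: each cell is 0 or 1 with probability 0.5."""
    return [rng.randint(0, 1) for _ in range(n_cells)]


def performance_by_ones(classify, n_cases, seed=0):
    """Map the number of ones in a configuration to the fraction classified correctly.

    `classify` is a hypothetical callable that returns the automaton's
    verdict (0 or 1) for a configuration.
    """
    rng = random.Random(seed)
    correct = defaultdict(int)
    total = defaultdict(int)
    for _ in range(n_cases):
        config = random_configuration(rng)
        ones = sum(config)
        expected = 1 if 2 * ones > len(config) else 0  # true majority state
        total[ones] += 1
        correct[ones] += int(classify(config) == expected)
    return {ones: correct[ones] / total[ones] for ones in sorted(total)}


def always_one(config):
    """Trivial stand-in classifier that always answers 'majority of ones'."""
    return 1


if __name__ == "__main__":
    # One line per observed density: density, performance.
    for ones, perf in performance_by_ones(always_one, 10000).items():
        print(f"{ones / N_CELLS:.3f} {perf:.3f}")
```

With the trivial stand-in classifier the curve is a step function at ρ = 0.5; a real rule would be plugged in through the `classify` argument.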


C Tools and Utilities

In the following, the configuration files and the usage of the coevolution start scripts and evaluation tools are described. Further information is given in the source code documentation.

C.1 Configuration Files

There are three configuration files: client.xml, which holds the configuration for the client side, server.xml, the server configuration, and genetic.xml, which specifies all properties of the evolutionary runs to be performed.

C.1.1 client.xml

The configuration file for the client. Contains the IP address of the interface to be used for network communication and a set of IP addresses of potential CAServers, together with a load value assigned to each of them. This load value specifies to which extent the CAServers should be used for computation requests. Note that the load values of all CAServers must sum up to 1 (see p. 43). The bootstrap port will be used to contact the server and to exchange information about which ports will be used for data transfer.

If a CAServer’s IP record is set to localhost the corresponding CAInterface will be run in local mode, so that the CA run computations take place on the client side.

<?xml version="1.0"?>



<network-config id="net_conf">

<client-config id="client_conf_1"

ip="192.168.2.4"

/>

<server-set-config id="server_conf_1">

<server-config id="remote1"

load="0.4"

ip="192.168.2.1"

bootstrap_port="8888"

/>

<server-config id="local"

load="0.3"


ip="localhost"


/>

<server-config id="remote2"

load="0.3"

ip="192.168.2.2"


/>

</server-set-config>

</network-config>
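The constraint that all load values sum up to 1 can be checked before a run is started. The Python sketch below is an illustration only and not one of the tools described in this appendix; it parses an abridged copy of the example above with the standard library and compares the sum with a small tolerance, since decimal fractions such as 0.4 and 0.3 are not exactly representable as floating point numbers. The function names are hypothetical.

```python
import math
import xml.etree.ElementTree as ET

# Abridged copy of the client.xml example (bootstrap ports omitted).
CLIENT_XML = """<?xml version="1.0"?>
<network-config id="net_conf">
  <client-config id="client_conf_1" ip="192.168.2.4"/>
  <server-set-config id="server_conf_1">
    <server-config id="remote1" load="0.4" ip="192.168.2.1"/>
    <server-config id="local" load="0.3" ip="localhost"/>
    <server-config id="remote2" load="0.3" ip="192.168.2.2"/>
  </server-set-config>
</network-config>
"""


def server_loads(xml_text):
    """Return {server id: load value} for every server-config record."""
    root = ET.fromstring(xml_text)
    return {s.get("id"): float(s.get("load")) for s in root.iter("server-config")}


def loads_are_valid(loads, tol=1e-9):
    """True if the load values sum up to 1 within the tolerance."""
    return math.isclose(sum(loads.values()), 1.0, abs_tol=tol)


def local_mode_servers(xml_text):
    """Ids of the records whose IP is localhost (run in local mode)."""
    root = ET.fromstring(xml_text)
    return [s.get("id") for s in root.iter("server-config") if s.get("ip") == "localhost"]


loads = server_loads(CLIENT_XML)
print(loads, loads_are_valid(loads), local_mode_servers(CLIENT_XML))
```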

C.1.2 genetic.xml

This configuration file consists of two parts: First, a set of named CA configuration records, which specify the number of cells and the size of the cell neighbourhood of the particular automaton, as well as the maximal number of steps the automaton is allowed to proceed during one run. Note that there are two types of automata implementations considered here: The GenericCa (corresponding to a standard lattice CA) and the Gbca type (an implementation of the GBCA model).

The setting run_mode=iterative identifies the special run mode that is used to produce space-time diagrams.

The field noise_rate determines the rate of environment noise that is supposed to be present during the run, following the noise model defined by Def. 2.18 (noise_rate = 2 ∗ η). It is possible to assign values to noise_rate in the range of [0, 1].
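Since noise_rate = 2 ∗ η, the admissible range [0, 1] for noise_rate corresponds to η in [0, 0.5]. A minimal Python sketch of the conversion follows; the function name is hypothetical and the noise model of Def. 2.18 itself is not reproduced here.

```python
def eta_from_noise_rate(noise_rate):
    """Convert the noise_rate field into the parameter eta (noise_rate = 2 * eta)."""
    if not 0.0 <= noise_rate <= 1.0:
        raise ValueError("noise_rate must lie in [0, 1]")
    return noise_rate / 2.0


# noise_rate="0.1" is the value used in the gbca_view_1 record below.
print(eta_from_noise_rate(0.1))
```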

The second part of the configuration file consists of a set of evolution configuration records, which determine the type of evolutionary strategy to be used for the optimisation (see 3.2.2) and its length in generations. The parameters of the particular populations are also part of the configuration records. Their most important fields are: The size of the population, the generation gap and the mutation rate to be used. Note that none of the implemented evolutionary models supports mutation for the population of initial configurations.




<genetic-config id="gen_conf_1">

<generic-ca-config id="generic_1"

cells_size="149"

neighbors_size="6"

run_length="298"

/>




<generic-ca-config id="gen_view_1"

cells_size="149"

neighbors_size="6"

run_length="298"

run_mode="iterative"

/>

<gbca-config id="gbca_1"

cells_size="149"

neighbors_size="6"

run_length="298"

/>



<gbca-config id="gbca_view_1"

cells_size="149"

neighbors_size="6"

run_length="298"

run_mode="iterative"

noise_rate="0.1"

/>

<generic-ca-co-evolution-config id="gen_co_evol_conf" cycles="10000"

random_seed="200">

<generation-config>

<ca-rule-population size="100" mutation_rate="0.005"

generation_gap="0.6" log_dir="/tmp/log_var/ca"

best_individual_log_interval="2"

/>

<ca-input-population size="300"

generation_gap="0.2" log_dir="/tmp/log_var/ic"

/>

</generation-config>

</generic-ca-co-evolution-config>

<generic-ca-evolution-config id="gen_evol_conf" cycles="10000"

random_seed="200">

<generation-config>


generation_gap="0.6" log_dir="/tmp/log_var/ca"


/>

<generic-ic-evolution-population size="500"/>


</generic-ca-evolution-config>

<gbca-co-evolution-config id="gbca_co_evol_conf"

cycles="10000" random_seed="4444">

<generation-config>


generation_gap="0.6" log_dir="/home/sandre/log/ca"


/>

<ca-input-population size="400"

generation_gap="0.2" log_dir="/home/sandre/log/ic"

/>

<graph-population size="30" mutation_rate="0.008"

generation_gap="0.4" log_dir="/home/sandre/log/graph"


/>


</gbca-co-evolution-config>

</genetic-config>

C.1.3 server.xml

The configuration file for the server side. Contains the IP address of the network interface to listen to, as well as the corresponding bootstrap port. Furthermore, two port ranges are given that are used for the control and the data sockets respectively. Therefore, incoming connections to the CAServer, using any of the ports within the two ranges, must be possible at run time.

When the client establishes a connection to the server it first uses the bootstrap port to open a TCP connection. After accepting the incoming connection, the server receives the CA configuration from the client and sends back a control and a data port, if it is willing to accept the connection. Subsequently, it closes the connection on the bootstrap port and starts a CAInterface thread which then listens on the control and data port for incoming connections from the client’s CAInterface. All further communication is then processed between the CAInterfaces directly.




<network-config mode="server">

<server-set-config>

<server-config id=""

ip="192.168.2.4"

bootstrap_port = "8888"


ctrl_port_start="8000" ctrl_port_stop="8999"

data_port_start="9000" data_port_stop="9999"

/>

</server-set-config>

</network-config>
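The server's decision during the bootstrap exchange can be modelled without any networking. The Python sketch below is a simplified illustration, not the CAServer implementation: the class and function names are hypothetical, the port ranges are assumed to be inclusive, and the acceptance test is a placeholder. It shows only the step in which the server either hands out one control port and one data port or declines.

```python
class PortPool:
    """Hands out ports from an inclusive range (inclusiveness is an assumption)."""

    def __init__(self, start, stop):
        self._free = list(range(start, stop + 1))

    def acquire(self):
        """Return a free port, or None if the range is exhausted."""
        return self._free.pop(0) if self._free else None

    def release(self, port):
        """Make a port available again."""
        self._free.append(port)


def handle_bootstrap(ca_config, ctrl_pool, data_pool, accept=lambda cfg: True):
    """Model the server's reply on the bootstrap connection.

    Returns (control_port, data_port) if the server is willing and able to
    serve the request, otherwise None.
    """
    if not accept(ca_config):
        return None
    ctrl = ctrl_pool.acquire()
    data = data_pool.acquire()
    if ctrl is None or data is None:
        # Give back whatever was taken if one of the ranges is exhausted.
        if ctrl is not None:
            ctrl_pool.release(ctrl)
        if data is not None:
            data_pool.release(data)
        return None
    return ctrl, data


ctrl_pool = PortPool(8000, 8999)  # ctrl_port_start / ctrl_port_stop
data_pool = PortPool(9000, 9999)  # data_port_start / data_port_stop
print(handle_bootstrap({"cells_size": 149}, ctrl_pool, data_pool))  # (8000, 9000)
```

In the real system the returned pair would be sent to the client over the bootstrap connection before that connection is closed.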

C.2 Main Tools

In this section, the term performance refers to the (evolutionary) performance achieved during the coevolution run (see p. 55).

C.2.1 CAClient

Usage: start_client.sh -n <ca_config_name> -e <evolution_config_name> [-i <genetic_config_file>] [-c <ca_client_config_file>]

Starts the client side of the system including the evolutionary algorithm.

The configuration options for the CA are taken from the configuration entry <ca_config_name> in <genetic_config_file>. If <genetic_config_file> is not specified, then the default conf/genetic.xml is used. The parameters for the evolutionary algorithm (including the different options for the populations) are given by the entry named <evolution_config_name> in the same file. All network related parameters are taken from the file <ca_client_config_file>. If none is given, the default conf/client.xml is used.

Output description (see also 3.2.3):

{GENERATION-LOG-PREFIX}/main.log: This file is the main log file and contains the development of performance and fitness of the population over time. In particular: The performance and fitness of the best individual, the average over the elite group, the average over the complete generation and the performance and fitness of the worst individual.

{GENERATION-LOG-PREFIX}/density_fitness_distr.log: Contains the distribution of the performance and fitness values over time. For each generation and each density interval (using intervals of size two, see p. 36) the average performance and fitness of the individuals within this density range is written out. Furthermore, the total number of entries within the corresponding density bin is added. Note that this output should be plotted in 3d, e.g. with gnuplot:

<gnuplot> set palette rgbformulae 33,13,10

<gnuplot> set pm3d at s

<gnuplot> splot "density_fitness_distr.log" using 1:2:3 with pm3d



{GENERATION-LOG-PREFIX}/{BEST-INDIVIDUAL-PREFIX}_{N}_data: The best individual of generation N is written out to this file, together with its fitness value and other information. BEST_INDIVIDUAL_PREFIX can be set in <genetic_config_file> by setting the "best_individual_prefix" option. The default value is "best_individual". The output is written every {BEST_INDIVIDUAL_LOG_INTERVAL} generations (can be set by the best_individual_log_interval field in <genetic_config_file>). These output files are suitable to serve as input for the TestBench tools, which is especially valuable for estimating the progress of the system.

{GENERATION-LOG-PREFIX}/{BEST-INDIVIDUAL-PREFIX}_{N}_offspring_stat: Contains a list of individual Ids and the number of offspring they have produced so far during the coevolution run. Note that this number is therefore accumulated over several generations, starting from the generation in which the individual first appeared in the population. Furthermore, it is generally possible and not unlikely that different individuals possess the same genotype, or that the genotype of an individual that drops out of the population gets re-introduced at a later stage of the evolution. This output is only produced for the rule population and for the graph population, if the evolution contains one.

{GENERATION-LOG-PREFIX}/{BEST-INDIVIDUAL-PREFIX}_{N}_performance: The distribution of initial configurations that have been classified correctly by the best rule individual of generation N. The parameter of the distribution is the density of ones in the initial configurations. This output is only generated for the rule population. {BEST_INDIVIDUAL_PREFIX} is constructed as described above.

Example:

# start_client.sh -n generic_1 -e gen_co_evol_conf

Starts the program to perform a coevolution process to optimise standard CA rules. The parameters for the coevolution process, such as the size of each single population, are taken from the configuration entry gen_co_evol_conf of the conf/genetic.xml configuration file. The program will use the CAServers named in the network configuration in conf/client.xml for performing the CA computations. The CA parameters (including number of cells, size of the neighbourhood and length of the CA run) are taken from the generic_1 configuration record in conf/genetic.xml.

C.2.2 CAServer

Usage: start_server.sh [-s <server_config_file>]

Starts the CAServer, which will listen for incoming connections. The port and interface to be used are taken either from <server_config_file> or from conf/server.xml.

C.3 Evaluation Utilities

In this section, the term performance refers to evaluation performance, see 4.1.

C.3.1 GbcaEvaluator

Usage: start_gbca_evaluator.sh -r <rule_population_data> -g <graph_population_data> [-o <output_prefix>] [-t <test_cases>] [-s <seed>] [-c <ca_client_config_file>] [-d] [-a <start_index>] [-b <stop_index>]

Performs a post mortem analysis of a former Gbca coevolution run. The population data for the rule and graph individuals is taken from <rule_population_data> and <graph_population_data> respectively.

The population data is “sieved through” generation by generation in two stages: First, all rule individuals are combined with all graphs to compute their performance over a relatively small set of initial configurations with binomially distributed densities. The number of configurations to be used is controlled by the SCAN_INPUTS_BUFFER_SIZE constant in util.GbcaEvaluator. Then, the individuals from both populations are ranked according to their performance and the best MAX_BEST_RULES_EVAL_SIZE rule individuals, as well as the best MAX_BEST_GRAPHS_EVAL_SIZE graph individuals, are taken to the next stage. Now, the best rules and graphs are pairwise evaluated over a typically much larger set of <test_cases> initial configurations.
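The two stages can be summarised in a few lines of Python. The sketch below is an illustration and not the GbcaEvaluator code: `evaluate` is a hypothetical callable that returns the fraction of correctly classified configurations, rules and graphs are represented by hashable identifiers, and ranking each individual by its best first-stage result over all partners is an assumption, since the ranking criterion is not fixed by the description above.

```python
def two_stage_sieve(rules, graphs, evaluate, small_set, large_set,
                    max_best_rules, max_best_graphs):
    """Return {(rule, graph): performance} for the pairs that reach stage two."""
    # Stage 1: every rule with every graph on the small configuration set.
    stage1 = {(r, g): evaluate(r, g, small_set) for r in rules for g in graphs}

    # Rank each individual by its best stage-1 result over all partners
    # (an assumption; see the lead-in).
    rule_score = {r: max(stage1[r, g] for g in graphs) for r in rules}
    graph_score = {g: max(stage1[r, g] for r in rules) for g in graphs}
    best_rules = sorted(rules, key=rule_score.get, reverse=True)[:max_best_rules]
    best_graphs = sorted(graphs, key=graph_score.get, reverse=True)[:max_best_graphs]

    # Stage 2: pairwise evaluation of the survivors on the large set.
    return {(r, g): evaluate(r, g, large_set)
            for r in best_rules for g in best_graphs}


def toy_evaluate(rule, graph, configs):
    """Stand-in for a real evaluation; ignores the configurations."""
    return (10 * rule + graph) / 100


result = two_stage_sieve([1, 2, 3], [1, 2], toy_evaluate,
                         small_set=[], large_set=[],
                         max_best_rules=2, max_best_graphs=1)
print(result)  # {(3, 2): 0.32, (2, 2): 0.22}
```

The point of the first stage is to keep the number of expensive second-stage evaluations at max_best_rules times max_best_graphs per generation, independent of the population sizes.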

The generation index to start and stop the evaluation process can be given by <start_index> and <stop_index>. It is also possible to restrict the evaluation process to Gbca rules with a density of 0.5, as experiments support the claim that they show the best performance. This is optional and enabled by the -d switch.

In the following, the term “best graph” refers to the graph individual that enabled the best rule individual (the individual with the best performance in the second stage for any of the possible graph structures) to reach its maximal performance over the set of coupling structures.

Output description:

{<output_prefix>}_{GENERATION}.log: Holds the performance values for all rule-graph individual pairs evaluated in the first stage.

{<output_prefix>}_{GENERATION}_graphs_clustering.log: The distribution of clustering coefficients within this graph generation is written to this file.

{<output_prefix>}_{GENERATION}_graphs_pathlengths.log: Contains the distribution of pathlengths within this graph generation.



{<output_prefix>}_graphs_best_avg_edge_range.log: Contains the average edge range for each vertex for the best graph individual of each generation.

{<output_prefix>}_graphs_best_clustering.log: This file consists of the distribution of clustering coefficients for the best graph individual of each generation.

{<output_prefix>}_graphs_best_pathlengths.log: The distribution of pathlengths for the best graph individual of each generation is written to this file.

{<output_prefix>}_graphs_best_performance.log: This file contains the performance value and rank of the best graph individual and also its properties, like the characteristic path length and edge range, clustering coefficient, fraction of shortcuts (φ value) and the significance values.

{<output_prefix>}_graphs_best_node_degree.log: Holds the distribution of incoming degrees for the best individual of each graph generation.

{<output_prefix>}_graphs_overall_performance.log: Contains the performance of each graph generation, as computed during the first evaluation stage. Note that the first ranked individual does not necessarily match the best graph individual according to the terminology given above.

{<output_prefix>}_rules_best_performance.log: This file consists of the performance values for the best rule individual of each generation. Additionally, the rank of the best rule and of the corresponding best graph and the difference to the GKL rule performance are included.

{<output_prefix>}_rules_overall_performance.log: Contains the performance of each rule generation, as computed during the first evaluation stage. If the -d switch is provided on the command line, the number of individuals contained in this list will vary depending on the number of individuals with density 0.5.

C.3.2 GbcaGraphComparator

Usage: start_gbca_comparator.sh -r <rule_data> -g <graph_data> -n <ca_config_name> -b <beta> -p <phi> [-o <output_file>] [-t <test_cases>] [-s <seed>] [-i <genetic_config_file>] [-c <ca_client_config_file>]

Determines the performance of the given Gbca rule and graph under the influence of different graph models, including β and φ-graphs, random, scale-free, RingSWGraph and lattice graphs, using the model parameters given on the command line. Additionally, the given graph structure specified by <graph_data> is considered. For this purpose, INPUT_REALISATIONS sets of initial configurations, each consisting of <test_cases> configurations, are generated and the average performance of the given rule is compared to the average performance of the majority rule for the same configurations and graphs.



As a second step, the deviation of the rule’s performance depending on particular elements of the β, φ and the random graph class is determined by generating GRAPH_REALISATIONS realisations and computing the standard deviation of the performance value. A set of <test_cases> initial configurations is fixed during this step.

Output description:

{<output_file>}:

1st step: The performance and performance standard deviation of <rule_data> and the majority rule for the following graphs (in this order): The graph given by <graph_data>, a single element of the class of β, lattice, φ, random, scale-free and RingSWGraph graphs.

2nd step: The performance and performance standard deviation of <rule_data> and the majority rule for the following graphs (in this order): β, φ and random graphs.

C.3.3 GbcaGraphTestBench

Usage: start_gbca_graph_testbench.sh -r <rule_data> -g <graph_data> -n <ca_config_name> [-o <output_prefix>] [-t <test_cases>] [-s <seed>] [-i


Compares the performance of the given Gbca rule <rule_data> using the coupling structure <graph_data> to the same rule’s performance using different graph models, including β, φ, random and scale-free graphs for a wide parameter range. Furthermore, a lattice graph structure is considered that is turned into a random graph under stepwise edge modifications. The same random modifications are applied to the given coupling structure as well and the impact on the rule’s performance is measured.

Overall, <test_cases> initial configurations with binomially distributed densities are used to evaluate the performance of each rule. During each iteration, a different value for the parameter that determines the structure of the particular graph model is chosen, gradually increasing it. For the two graphs under random modification, the given input graph and the former lattice graph, the iteration index matches the number of randomly modified edges. To provide a basis for estimating the impact of topology changes on the performance, the configurations remain unchanged throughout all iterations. The number of iterations to be computed can be changed by modifying the ITERATIONS constant in util.GbcaGraphTestBench. Each iteration is performed REALISATIONS many times for the same parameter set and the performance is averaged. Additionally, for studying the relationship between rule performance and coupling structure, the tool also records the changes in the graph characteristic values under modification (see below).
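The stepwise modification of a lattice graph can be sketched as follows in Python. This is an illustration under assumptions, not the GbcaGraphTestBench code: the lattice is taken to be a ring in which every vertex points to its three nearest neighbours on either side (which matches the unmodified rows of the table in B.4, e.g. vertex 13), and one modification redirects a single randomly chosen outgoing edge to a random target. Whether the original tool excludes self-loops or duplicate edges is not stated and is not handled here.

```python
import random


def ring_lattice(n, k):
    """Adjacency lists of a ring lattice: each vertex points to its k/2
    nearest neighbours on either side (k is assumed to be even)."""
    half = k // 2
    offsets = [d for d in range(-half, half + 1) if d != 0]
    return [[(v + d) % n for d in offsets] for v in range(n)]


def modify_random_edge(adj, rng):
    """Redirect one randomly chosen outgoing edge to a random target vertex."""
    v = rng.randrange(len(adj))
    slot = rng.randrange(len(adj[v]))
    adj[v][slot] = rng.randrange(len(adj))


rng = random.Random(0)
graph = ring_lattice(149, 6)
reference = ring_lattice(149, 6)
for iteration in range(1, 4):
    modify_random_edge(graph, rng)
    # Count the adjacency entries that differ from the untouched lattice.
    changed = sum(a != b for row, ref in zip(graph, reference)
                  for a, b in zip(row, ref))
    print(iteration, changed)
```

The outgoing degree of every vertex stays constant under this kind of modification, so the graph remains usable with a fixed neighbourhood size.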



Refer to GbcaTestBench for the rest of the command line parameters.

Output description:

{<output_prefix>}_graph{N}.log: This file is kept for each graph type N. For each iteration it contains: The characteristic path length, the clustering coefficient, the fraction of shortcuts φ, the characteristic edge range¹ and the significance values of the graph. Furthermore, the performance of the given Gbca rule and the majority rule running on the particular coupling structure N are included, as well as for the GKL rule using a standard CA with the same number of cells and neighbourhood size.

{<output_prefix>}_iteration{I}_graph{N}_density_performance_distribution.log: Similar to the output files generated by the TestBenches, this file contains the CA rule performance distribution for this particular graph type and iteration within the modification process. For comparison, it also includes the performance of the GKL rule running on a standard CA and the majority rule running on the same graph. Note that additional initial configurations with uniformly distributed densities are appended to the list of configurations to produce this output. Otherwise, the data would be too sparse for plotting, especially for very low and very high densities. The number of additional configurations can be changed by modifying the DENS_PERF_TEST_CASES constant in util.GbcaGraphTestBench.

{<output_prefix>}_vertex_sig.log: The significance values are written out for each vertex and each graph type. This file is appended from one iteration to another, providing 3d plotting data.

{<output_prefix>}_sig_dist.log: The distribution of the different significance values for each graph is written in this file. Similar to the data described above, this data is suitable for 3d plotting.

C.3.4 GbcaNoiseTestBench

Usage: start_gbca_noise_testbench.sh -r <rule_data> -g <graph_data> -n <ca_config_name> [-o <output_prefix>] [-t <test_cases>] [-s <seed>] [-i


GbcaNoiseTestBench follows a similar approach as GbcaGraphTestBench but instead of considering the impact of topology changes on the performance of the rule given by <rule_data>, the effect of noise is examined. For comparison, the majority rule is tested under the same conditions, running on the same coupling structure, as well as the GKL rule for an exemplary comparison with the standard CA model.

¹ The median of the range of all edges within the graph.



In contrast to most other tools, the initial configurations generated for the GbcaNoiseTestBench evaluation are not generated according to a binomial density distribution. Instead, their densities are uniformly distributed over the interval [0, 1] to support the data aggregation for the performance values depending on configuration density.

The noise level is successively incremented starting from 0, until it reaches MAX_NOISE_RATE. To change the number of intermediate steps, modify the ITERATIONS constant in util.GbcaNoiseTestBench.
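Both ingredients of this evaluation, the noise schedule and the uniformly distributed densities, are easy to sketch in Python. The code below is an illustration, not the GbcaNoiseTestBench implementation: it assumes evenly spaced noise levels that include both 0 and the maximum, and it obtains a uniform density by first drawing the number of ones uniformly and then shuffling the cells. The function names are hypothetical.

```python
import random


def noise_levels(max_noise_rate, iterations):
    """Evenly spaced noise rates from 0 up to and including max_noise_rate."""
    return [max_noise_rate * i / iterations for i in range(iterations + 1)]


def uniform_density_configuration(rng, n_cells=149):
    """Configuration whose number of ones is uniform over 0..n_cells."""
    ones = rng.randint(0, n_cells)
    config = [1] * ones + [0] * (n_cells - ones)
    rng.shuffle(config)
    return config


print(noise_levels(1.0, 4))  # [0.0, 0.25, 0.5, 0.75, 1.0]
print(sum(uniform_density_configuration(random.Random(0))))
```

Drawing each bit independently would instead concentrate the densities around 0.5, which is why a separate generator is needed for this tool.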

Output description:

{<output_prefix>}_dens_noise_perf_dist.log: Contains the performance of each of the three rules depending on the initial configuration density for each of the noise levels.

{<output_prefix>}_main.log: Holds the performance of each of the rules with increasing noise level. Note that this data is actually the accumulated data included in {<output_prefix>}_dens_noise_perf_dist.log.

C.3.5 GbcaTestBench

Usage: start_gbca_testbench.sh -r <rule_data> -g <graph_data> -n <ca_config_name> [-o <output_file>] [-t <test_cases>] [-s <seed>]


This tool works very similarly to the GenericCaTestBench. It starts <test_cases> many runs for the Gbca automaton rule given by the individual data file <rule_data>, each initialised with a different configuration, whose densities follow a binomial distribution. During these computations, the automaton will use the graph structure specified in <graph_data>. The input format of the graph data is the following: Starting from the first line that does not begin with a '#' it must contain one list of vertex indices per line, representing adjacent vertices. In total, the file must contain N ∗ k vertex indices, separated by spaces, in N lines and k columns, where N is the number of vertices within the graph and k its outgoing degree. Therefore, the file essentially contains the same information as an adjacency list representation of the graph.

All information about the CA is taken from the <ca_config_name> record in <genetic_config_file> (default: conf/genetic.xml). The number of cells within the CA must match the number of vertices in the graph data file <graph_data>, and the neighbourhood size must match the outgoing degree. The configuration record must also contain the length of the automaton run and may contain a noise rate for the CA execution.
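The input format and the two consistency conditions can be checked with a small parser. The following Python sketch is an illustration, not the loader used by the GbcaTestBench: the function name is hypothetical, and skipping empty lines after the header is an assumption that goes beyond the format description.

```python
def parse_graph_data(text, n_cells, neighbors_size):
    """Parse graph data into a list of neighbour lists, one per vertex.

    Lines before the first line that does not start with '#' are skipped.
    Raises ValueError if the result is not an n_cells x neighbors_size table
    of valid vertex indices.
    """
    lines = text.splitlines()
    start = 0
    while start < len(lines) and lines[start].startswith("#"):
        start += 1
    rows = [[int(tok) for tok in line.split()]
            for line in lines[start:] if line.strip()]
    if len(rows) != n_cells:
        raise ValueError(f"expected {n_cells} lines, found {len(rows)}")
    for vertex, row in enumerate(rows):
        if len(row) != neighbors_size:
            raise ValueError(f"vertex {vertex}: expected {neighbors_size} neighbours")
        if any(not 0 <= idx < n_cells for idx in row):
            raise ValueError(f"vertex {vertex}: neighbour index out of range")
    return rows


EXAMPLE = """# toy graph with 4 vertices and outgoing degree 2
3 1
0 2
1 3
2 0
"""
print(parse_graph_data(EXAMPLE, n_cells=4, neighbors_size=2))
# [[3, 1], [0, 2], [1, 3], [2, 0]]
```

For the record gbca_1 of C.1.2 the corresponding call would use n_cells=149 and neighbors_size=6.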

Output description:



Command Line: Evaluation performance of <rule_data> and <graph_data> (which is the fraction of correctly classified configurations from the set of <test_cases> configurations) compared to the evaluation performance of GKL for the same configurations.

{<output_file>}: This output is similar to the performance output produced during the coevolution run for the rule individuals: The performance depending on the density of the initial configurations. It also contains the performance of the GKL rule, run on the same set of initial configurations.

C.3.6 GenericCaTestBench

Usage: start_genca_testbench.sh -r <rule_data> -n <ca_config_name> [-o <output_file>] [-t <test_cases>] [-s <seed>]


Starts <test_cases> runs for the standard CA rule given by the individual data file <rule_data>, each initialised with a different configuration whose densities follow a binomial distribution. All CA parameters are taken from the <ca_config_name> record in <genetic_config_file> (default: conf/genetic.xml). These are: Number of cells in the CA, number of neighbours of each cell, the length of the automaton run and a potential noise rate. The performance of the particular standard CA rule individual is then compared to the performance of the GKL rule by using the same set of initial configurations.

Output description:

Command Line: Evaluation performance of <rule_data> (which is the fraction of correctly classified configurations from the set of <test_cases> configurations) compared to the evaluation performance of GKL for the same configurations.

{<output_file>}: This output is similar to the performance output produced during the coevolution run for the rule individuals: The performance depending on the density of the initial configurations. It also contains the performance of the GKL rule, run on the same set of initial configurations.

C.4 Visualisation Tools

C.4.1 GbcaViewer

Usage: start_gbca_viewer.sh -r <rule_data> -g <graph_data> -n <ca_config_name> [-o <output_prefix>] [-t <test_cases>] [-s <seed>]

This tool is similar to the GenericCaViewer. Note that the graph used for the Gbca runs is specified by the input file <graph_data>. The input format is the same as for the GbcaTestBench. It is required that the Gbca configuration given by <ca_config_name> contains the setting run_mode="iterative" (see also C.4.2).

C.4.2 GenericCaViewer

Usage: start_genca_viewer.sh -r <rule_data> -n <ca_config_name> [-o <output_prefix>] [-t <test_cases>] [-s <seed>]


The viewer tool can be used to produce image files for standard CA runs (called space-time diagrams). The output file is in PGM² image format and can be used as input to almost any image processing tool.

For each of the <test_cases> initial configurations, an output file containing the run of the automaton, using the rule specified by <rule_data>, is created. The other parameters are the same as the ones used for the GenericCaTestBench. It is required that the GenericCa configuration given by <ca_config_name> contains the setting run_mode="iterative". All output files share the prefix <output_prefix>.
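To illustrate what such an image file can look like, the Python sketch below renders a list of configurations (one per time step) as a plain-text PGM image. It is not the viewer's implementation: the choice of the plain "P2" variant of the format and the mapping of state 1 to black are assumptions, and the function name is hypothetical.

```python
def space_time_to_pgm(rows):
    """Render rows of 0/1 cell states as a plain-text (P2) PGM image.

    Each row is one time step. State 1 is drawn black (grey value 0) and
    state 0 white (grey value 1); this colour assignment is an assumption.
    """
    height = len(rows)
    width = len(rows[0])
    lines = ["P2", f"{width} {height}", "1"]  # magic number, size, max grey value
    for row in rows:
        lines.append(" ".join("0" if cell else "1" for cell in row))
    return "\n".join(lines) + "\n"


# Two time steps of a toy automaton with two cells.
print(space_time_to_pgm([[0, 1], [1, 1]]), end="")
```

The resulting text can be written to a file with the extension .pgm and opened with common image tools.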

C.5 Miscellaneous

C.5.1 ConfigDumper

Usage: start_config_dumper <population_data_file> <generation_type>

Extracts the PopulationConfigRecord from a serialised population data file and writes it to the standard output. If the data file consists of generations of the type CaRuleGeneration, the CA configuration record is written out, too.

The type of the generation in question has to be given on the command line by specifying <generation_type>, e. g. genetic.carule.CaRuleGeneration, genetic.cainput.CaInputGeneration or genetic.graph.GraphAdjListGeneration.

C.5.2 IndividualScanner

Usage: start_individual_scanner.sh -p <population_data_file> (-r | -g) <id>

² Portable Grey Map



Scans for an individual with the given Id <id> in the serialised population file <population_data_file>. The type of the generation has to be given by providing either the -r switch for a generation of the type genetic.carule.CaRuleGeneration, or -g for a generation of the type genetic.graph.GraphAdjListGeneration.

If the scan succeeds, i.e. the individual with the given Id is found, the individual is written to standard output; otherwise an error message is printed. Note that this output can be used to provide input for the GenericCaTestBench or the GbcaTestBench, depending on the CA type.
