ALMA MATER STUDIORUM - UNIVERSITÀ DI BOLOGNA
SECONDA FACOLTÀ DI INGEGNERIA
Corso di Laurea Specialistica in
Ingegneria Informatica
TOWARDS
BOOLEAN NETWORK DESIGN
FOR ROBOTICS APPLICATIONS
Tesi di Laurea elaborata nel corso di:
Intelligenza Artificiale L-S
Tesi di Laurea di:
Mattia Manfroni
Relatore:
Prof. Andrea Roli
Correlatori:
Prof. Marco Dorigo
Dott. Ing. Mauro Birattari
Ing. Carlo Pinciroli
Anno Accademico 2009/2010
Sessione I
KEY WORDS
Boolean Networks
metaheuristics
robotics
Acknowledgements
First of all, I would like to thank Andrea Roli for offering me the
possibility to undertake this work and for always being very kind and helpful.
I would also like to thank Marco Dorigo for giving me the possibility to
work at IRIDIA, and Mauro Birattari for his advice and support.
Thanks also to Carlo Pinciroli, who has always supported me and, more
importantly, has proved to be a very good friend and a “vecchio cuore
rossonero” (an old Rossonero heart, i.e., a Milan supporter) like me.
I thank also my IRIDIA family: Alessandro, Alexander, Ali, Antal, Arne,
Eliseo, Franco, Gianpiero, Giovanni, Manu, Marco, Nadir, Nithin, Prasanna,
Rachael and Sara. Thank you for the great time we spent together.
I would also like to thank all my university mates, with a special mention
for Bat and Frison, who cooperated with me on several projects and were
always very helpful.
I also thank all my friends, especially Bakken, Bat (again), Giulia, Monta
and my team-mates, for the good times we spent together over the last years.
Thanks also to Ettore, for giving me a smile and a big hug every day.
A huge thank you, Alice, for your patience, care and love. Without
you this work would not have been possible.
Finally, I would like to thank my parents for giving me the possibility
to complete my studies, for their continuous support in all my decisions and
for giving me a warm place I can always return to.
Contents

Acknowledgements

1 Introduction
  1.1 Outline of the work

2 Background: robotic agent programs
  2.1 Rational agents
  2.2 Evolutionary Robotics

3 Boolean Networks
  3.1 Introduction
  3.2 A formal definition
  3.3 Dynamics
  3.4 Random Boolean Networks
  3.5 Boolean Network design: state of the art

4 Methodology
  4.1 Background concepts
  4.2 Introduction to metaheuristics
  4.3 Methodology description

5 Preliminary studies
  5.1 Experiments and results
  5.2 Result analysis

6 Robotics applications
  6.1 Path follower
  6.2 Phototaxis and antiphototaxis

7 Conclusions

A The e-puck robot

Bibliography
List of figures
List of tables
1. Introduction
For thousands of years, human beings have tried to understand how the
human brain can perceive and predict the environment and select the actions
to manipulate it. One of the goals of Artificial Intelligence is not only
to understand such features, but also to build intelligent entities (e.g., an
intelligent robot).
In order to achieve this objective, several design methodologies have been
proposed in recent years. The earliest approach consists in designing
intelligent entities by hand, whereas more recent studies propose automatic
design techniques. The latter approach is very interesting, for example
because an automatic design methodology may return solutions not evident a
priori to the designer. The main components of this kind of technique are
an abstract model of the entity behaviour and an optimisation algorithm.
The behaviour model is characterized by some parameters, whose optimal
values are initially unknown; the optimisation algorithm is used to find
the optimal set of parameters for such a model.
We identify Boolean Networks (BNs) as a suitable behaviour model: they
were introduced by Stuart Kauffman as a model for genetic regulatory
networks and as an abstraction of complex systems with which to study the
mechanisms of evolutionary processes in living beings. The choice of BNs
as behaviour models is due to their capability to show complex dynamics
notwithstanding the compactness and simplicity of their definition.
Moreover, the availability of tools for their analysis enables the designer
to study the result of the automatic design and makes it possible to reuse
it as a building block for future, even more complex, applications.
A suitable optimisation algorithm for BN design is provided by
metaheuristic techniques, which are general search strategies upon which a
specific algorithm can be designed. Metaheuristics are particularly
appropriate for tackling BN design because they are usually able to find
(near-)optimal solutions in huge search spaces in a limited amount of time.
In this thesis, developed in collaboration with the Institut de Recherches
Interdisciplinaires et de Développements en Intelligence Artificielle (IRIDIA)
of the Université Libre de Bruxelles, we propose an automatic methodology
to build intelligent robotic entities, based on Boolean Network design by
metaheuristic techniques.
The goal of this work is to provide a proof of concept: at first, the
methodology is validated and tested on abstract case studies; then, it is
applied to robotics applications in a simulated environment, in particular
to a task that requires the robot to develop a sort of internal memory in
order to be accomplished. Finally, the designed BNs are ported to and
tested on a real robotic platform.
1.1 Outline of the work
The remainder of the thesis is organized as follows.
In Chapter 2, we introduce some background concepts, such as that of
rational agents, which represent a kind of intelligent entity, and the main
features of the techniques used to program them. We also outline a
prominent example of one of these methodologies, namely Evolutionary Robotics.
In Chapter 3, we present Boolean Networks (BNs), describing their dy-
namics and properties. We also illustrate the most relevant analytical studies
and the state of the art of BN design.
Chapter 4 provides the description of the proposed methodology: we
model the automatic BN design as an optimisation problem and we adopt a
metaheuristic technique to solve it.
In Chapter 5 we apply our methodology to abstract case studies in order
to validate it. Then, we analyse BNs resulting from the validation process in
terms of their dynamic regime.
In Chapter 6 the methodology is applied to robotics applications. The
first case study aims to acquire familiarity with the methodology and
concerns a simple task (i.e., path following) in which a robot selects its
actions based only on current sensory inputs. The second case study
presents a more complex task, which requires the robot to develop a sort of
internal memory in order to achieve it. The automatic BN design for both
case studies is executed in a simulated environment. At the end of the
process, we port the obtained behaviour models (i.e., BNs) onto a real
robotic platform and undertake extensive experiments in order to evaluate
their effectiveness.
Finally, Chapter 7 draws some conclusions and gives an outlook on future
work.
2. Background: robotic agent programs
This chapter introduces background concepts that will be used in the
remainder of this thesis. In Section 2.1, we introduce the concept of
rational agent. We will use the term agent in a generic sense, meaning
anything that can be viewed as perceiving its environment through sensors
and acting upon that environment through actuators. Moreover, we illustrate
the main characteristics of rational agent programming techniques. Finally,
we outline a prominent example of one of these methodologies, namely
Evolutionary Robotics.
2.1 Rational agents
One of the goals of artificial intelligence consists of synthesizing rational agents¹.
An agent is anything that can be viewed as perceiving its environment
through sensors and acting upon that environment through actuators (see
Figure 2.1).
Figure 2.1: Agents interact with the environment through sensors and actuators.

¹ To introduce this concept we follow Russell and Norvig’s approach [44].

Following the notation of Russell and Norvig [44], we name percept the
perception inputs of the agent in a given instant of time, and percept
sequence the complete sequence of inputs that the agent has perceived so far.
In general, an agent establishes the action to perform in a given instant
of time according to the percept sequence collected until that instant; the
mapping between any percept sequence and the respective action is called
the agent function. The agent function is an abstract mathematical function
and it is implemented by the agent program.
Having defined the concept of agent, we now have to clarify the meaning
of rationality. As a first approximation, we can assert that a rational agent
is one that chooses the right action, where the right action is the one that
will cause the agent to be most successful. We use the term performance
measure to denote the criteria that objectively determine how successful an
agent is.
In general, the sequence of the agent actions is meant to achieve a task. In
order to completely define a task and its features, we introduce the concept
of task environment, which consists of a set of data concerning the following
items:
• performance measure: a function that quantifies the quality of the
agent actions. It depends on the requirements of the task to solve;
• environment: sometimes all the information the agent needs is
observable at each instant of time, sometimes not; moreover, the
characteristics of the environment may significantly condition the
agent's actions;
• sensors: the types of information the agent can perceive depend on its
sensors; furthermore, the kind and precision of the sensors define the
agent capabilities;
• actuators: the types of actuators determine which actions the
agent can perform.
The diverse nature of task environments calls for two main kinds of agent
program to accomplish the corresponding tasks:
• memoryless: this kind of agent program produces actions based
only on its current sensory information. It is utilised when the agent
does not need to display any sort of memory of its interactions with
the environment (e.g., when the information the agent needs to select
the best action is observable at every instant of time);

• memory based: in this case the agent program must be able to keep
memory of (part of) previous input patterns, as well as of its actions,
in an internal state. It then produces actions based on both its internal
dynamics and current sensory information. This kind of agent program
is suitable when the choice of the action to perform also depends on
the past history of the agent. Thus, memory based agent programs are
more complex but also more powerful than memoryless ones.
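To make the distinction concrete, the two kinds of agent program can be sketched as follows. This is a minimal toy illustration with invented names and a made-up obstacle-avoidance percept (a pair of left/right readings); none of it comes from the thesis itself:

```python
# Minimal sketch of the two kinds of agent program (toy percepts and
# names of our own invention). The percept is a pair (left, right) of
# obstacle readings.

def memoryless_program(percept):
    """Maps the CURRENT percept directly to an action."""
    left, right = percept
    return "turn_right" if left > right else "turn_left"

class MemoryBasedProgram:
    """Keeps an internal state summarising past percepts."""

    def __init__(self):
        self.state = 0  # -1: obstacle last seen on the left, +1: on the right

    def step(self, percept):
        left, right = percept
        # update the internal state from the current percept
        if left > right:
            self.state = -1
        elif right > left:
            self.state = 1
        if left == right:
            # ambiguous percept: fall back on the internal state (memory)
            return "turn_right" if self.state < 0 else "turn_left"
        return "turn_right" if left > right else "turn_left"
```

A memoryless program always returns the same action for identical percepts, whereas the memory-based one can answer an ambiguous percept differently depending on its history.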
The task environment description may also determine how to design the
agent program. For example, the task environment sometimes may not be
completely defined, either because the designer may not be able to clearly
define the performance measure, or because sensory input noise cannot be
easily modeled, or because the environment may present unpredictable events.
On the other hand, sometimes the task environment may be completely and
accurately defined, but the designer either may not be able to find a
solution or may not be interested in particular features of the solution,
demanding only an agent program that lets the agent achieve the task. In
general, designers can face these kinds of problems by choosing between two
different approaches: the first one consists of designing the agent program
by hand, then testing and debugging it and, if necessary, reverse
engineering it; this process can be iterated until the agent program
satisfies the requirements. The second approach consists of using a
methodology that automatically finds the solution and implements the agent
program. In general, automatic methodologies are
based on two main components:
• agent program model: a parametric model of the agent program.
The value of the parameters is initially unknown;
• optimisation algorithm: its purpose is to find a good set of param-
eters for the agent program model.
In order to illustrate the main characteristics and issues of the automatic
methodology, which is the one adopted in this thesis, in the next section we
briefly present Evolutionary Robotics, one of the most notable examples of
this kind of design methodology.
2.2 Evolutionary Robotics
Evolutionary Robotics (ER) is a methodological tool to automate the design
of agent programs [40]. It is inspired by the Darwinian theory of natural
selection, that is the principle of selective reproduction of the fittest individ-
ual in a population. This means that the individual that adapts best to its
environment has a higher probability to reproduce and to pass its genetic
material to the following generations. In particular, ER is based on the ap-
plication of evolutionary computation techniques [18] [23] to sets of agent
programs. The process can be described as follows: given a certain task
environment, a population of genotypes is generated randomly (the genotypes
being the individuals of the population), with each genotype corresponding
to an agent program. Then, the process consists of a number of iterations
called generations. For each generation, genotypes are tried one by one in the environment
and evaluated in achieving the task. The evaluation of a genotype is given by
a fitness function, which often corresponds to the performance measure of the
task environment. Genotypes undergo a process called selective reproduction,
in which the fitter the genotype, the higher its reproduction frequency.
The reproduction of a genotype into the next generation usually consists of
a modified copy of the genotype itself. Modifications usually happen as re-
sult of genetic operators such as mutation of individual genes or crossover
(i.e., mixing multiple genotypes together). The process typically ends when
a certain limit on the number of generations is reached, or the designer is
satisfied with the solution(s) obtained.
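The generational loop just described can be sketched in a few lines. This is a toy example under our own assumptions, not the actual ER setup of the thesis: genotypes are bit strings, the fitness simply counts ones (a stand-in for the task performance measure), selection keeps the fitter half, and reproduction copies parents with bit-flip mutation:

```python
import random

# Toy generational evolutionary loop (representation and fitness are our
# own invention, not from the thesis).

def fitness(genotype):
    return sum(genotype)  # stand-in for the task performance measure

def evolve(pop_size=20, length=16, generations=50, p_mut=0.05, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # selective reproduction: only the fitter half reproduces
        scored = sorted(population, key=fitness, reverse=True)
        parents = scored[:pop_size // 2]
        population = []
        for _ in range(pop_size):
            child = list(rng.choice(parents))
            for i in range(length):          # mutation of individual genes
                if rng.random() < p_mut:
                    child[i] = 1 - child[i]
            population.append(child)
    return max(population, key=fitness)

best = evolve()
```

Crossover (mixing multiple genotypes) is omitted here for brevity; in a real ER run the fitness evaluation would involve trying the genotype's agent program in the (simulated) environment.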
In ER, agent program models may consist of rule sets or decision trees,
but the most commonly used are artificial neural networks (ANNs) [21] [19].
ANNs are computational models that attempt to simulate structural or
functional aspects of the biological central nervous system. For the
most part, we distinguish between two kinds of ANNs: feedforward ANN [50]
and recurrent ANN [4] [13]. The first category behaves as a memoryless agent
program, thus it is utilised in tasks in which the agent does not need to dis-
play any sort of memory of its interactions with the environment. Instead,
recurrent ANNs present the same features as a memory based agent program,
thus they are used in tasks in which the agent bases decisions both on current
sensory information and previous input patterns.
In general, ER presents several good qualities. For example, the use of
artificial evolution minimises the incorporation of design prejudices and
constraints, which are left to the dynamics of the evolution [20]. Furthermore,
with ER techniques it is possible to optimise the ANN topology as well as
some aspects of hardware design. Artificial evolution often finds solutions to
the problem that were not a priori evident to the experimenter [40]. In addi-
tion, the properties of ANNs guarantee versatility, generalisation capabilities
and tolerance to noisy sensory input.
On the other hand, ER has still to face many challenges [36] and presents
some limits both in the agent program model (i.e., the ANN) and in the
optimisation algorithm (i.e., the genetic algorithm).
The biggest limit is the impossibility to analyse the solutions found and
to reverse-engineer them. This is mainly due to the complexity of the ANN
model; moreover, given that the vast majority of robotic platforms offer
low-level processing units, ANNs often prove to be too computationally
demanding. For example, platforms that do not offer floating point
operations impose severe limitations on the implementation of recurrent
ANNs.
The impossibility of analysing the solutions found makes the obtained agent
programs a kind of black box.
In this thesis we propose another methodology for automatic design of
agent programs in order to tackle these limits. We propose the use of Boolean
Networks as agent program model: this choice is mainly due to the
availability of tools for their analysis and to the simplicity of the model.
Such features also enable us to use an optimisation algorithm simpler than
genetic algorithms.
In Chapter 3 we introduce Boolean Networks and their properties, focus-
ing on the features that will be the subject of our methodology.
3. Boolean Networks
In the first part of this chapter we will introduce Boolean Networks,
describing their dynamics and properties and some analytical studies. Then,
we will focus on the features that are the subject of this thesis: Random
Boolean Networks (Section 3.4) and the design of Boolean Networks, whose
state of the art is outlined in Section 3.5.
3.1 Introduction
Boolean Networks (BNs) were introduced by Stuart Kauffman as a model
for genetic regulatory networks [27] [28] [30] and as an abstraction of
complex systems with which to study the mechanisms of evolutionary
processes in living beings.
The BN structure can be thought of as a directed graph with N nodes.
The number of ingoing arcs of each node is referred to as K. Each node x_i
is associated with a Boolean variable and a Boolean function; the arguments
of the Boolean function of x_i are the Boolean variables of the nodes whose
outgoing arcs are connected to x_i. A simple example of BN is shown in
Figure 3.1. The topologies of a BN can be various and depend on the
specific application field.
The Boolean variables of all the nodes at a given instant of time t
represent the state of the BN at instant t. Since a Boolean variable can
assume only two different values, the state space size is 2^N.

Figure 3.1: An example of Boolean Network with N = 4 and K = 2.
The BN dynamic behaviour is characterized by a sequence of state up-
dates (i.e., each node changes the value of its Boolean variable according to
the associated Boolean function). Several kinds of update rules and dynam-
ics have been proposed [17]. The most studied one consists of a synchronous
state update for all the nodes, i.e., the Boolean variables are all updated
at the same instant. This update rule is also the one used in the remainder
of this thesis. Boolean functions are deterministic, that is, the output of a
Boolean function is unambiguously computed from its arguments.
The succession of synchronous and deterministic state updates makes the
dynamics of BNs deterministic as well.
The initial state of a BN can be arbitrary or randomly chosen, or it may
depend on the specific application. Since the state space is finite and the
dynamics is deterministic, the state succession assumes this structure:
• initially, the trajectory is characterized by a transient, that is a state
succession in which each state is different from all the previous ones.
The transient can have length 0;
Figure 3.2: The table describes the dynamics of the BN, showing the successor
of each state.

• eventually a state, or a sequence of states, will be repeated. Such
sequences are named attractors of the BN and they can be classified
in cycle attractors with period τ > 1 and point attractors with τ = 1.
Point attractors are also known as fixpoints. The set of initial states
that flow towards an attractor is named basin of attraction.
3.2 A formal definition
In more formal terms, a BN is defined as a discrete-state and discrete-time
dynamical system with binary values.
The state of the BN is an array of N Boolean variables x = (x_1, ..., x_N),
with x_i ∈ {0, 1}. As described in Section 3.1, the dynamics is determined
by a succession of state updates; we denote a state update by the following
state transition function:

F : {0, 1}^N → {0, 1}^N.
Since a state update is determined by the Boolean functions and their
arguments, in the following we define the state transition function in
terms of both.
We now formally introduce the definition of projection function, used to
denote the K arguments of the Boolean function of each node. Let p_i be
the projection function that projects element i from the N-dimensional
space to the K-dimensional space:

p_i : {0, 1}^N → {0, 1}^K, 1 ≤ i ≤ N.

Thus p_i selects the subset of Boolean variables belonging to the nodes with
arcs ingoing to x_i, and the cardinality of this subset is equal to K. For
example, considering the BN of Figure 3.1, p_1(x) = (x_2, x_3).
Then for each node x_i we define the respective Boolean function:

f_i : {0, 1}^K → {0, 1}.

Considering a node x_i and its inputs, f_i associates a value for x_i to each
input configuration. The function f_i can be described by Boolean expressions
or truth tables. Note that, once K is defined, the number of possible
functions for each node is 2^(2^K).

Then, considering a state x ∈ {0, 1}^N, we can write F in terms of the f_i:

F(x) = (f_1(p_1(x)), f_2(p_2(x)), ..., f_N(p_N(x))).
Finally, a BN can be completely defined by the tuple B = (Π, Φ, N, K)
where:

N ∈ {1, 2, ...}, K ∈ {0, 1, 2, ...}, K ≤ N;
Π = {p_1, p_2, ..., p_N};
Φ = {f_1, f_2, ..., f_N}.

The p_i functions denote the links between the nodes, that is, the topology
of the BN; the functional part of the BN is instead denoted by the f_i
functions.
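The tuple B = (Π, Φ, N, K) translates almost directly into code. The sketch below is ours (the data-structure choices are assumptions, not from the thesis): each p_i is stored as a tuple of K input indices and each f_i as a truth table mapping K-bit input tuples to 0/1:

```python
# Sketch of a BN as the tuple B = (Π, Φ, N, K); representation choices
# (tuples of indices, dict truth tables) are our own.

class BooleanNetwork:
    def __init__(self, projections, functions):
        # projections[i]: tuple of K node indices feeding node i  (p_i)
        # functions[i]:   dict from K-bit input tuples to 0/1     (f_i)
        self.projections = projections
        self.functions = functions
        self.N = len(projections)

    def step(self, state):
        """Synchronous state transition F: all nodes update at once."""
        return tuple(
            self.functions[i][tuple(state[j] for j in self.projections[i])]
            for i in range(self.N)
        )

# Example: N = 2, K = 1; node 0 copies node 1, node 1 negates node 0.
bn = BooleanNetwork(
    projections=[(1,), (0,)],
    functions=[{(0,): 0, (1,): 1},   # f_0 = identity
               {(0,): 1, (1,): 0}],  # f_1 = NOT
)
```

Iterating `step` from any initial state traces the deterministic trajectory discussed in the next section.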
3.3 Dynamics
In the following, we define with a more precise terminology the already hinted
concepts of transient, attractor and basin of attraction.
Given t ∈ {0, 1, 2, ...}, let x_t be the state of the BN at time t; then, the
state at instant t + 1 is computed as follows:

x_{t+1} = F(x_t).
Defining the space of possible states as S_N = {0, 1}^N, then, as t
increases, the vector x describes a trajectory in S_N. Let x_0 be the
initial state; since a BN is a synchronous and deterministic system,

∃ t ∈ {0, 1, 2, ...}, τ ∈ {1, 2, ...} : x_{t+τ} = x_t.

Then, for t′ ≥ t the BN goes through a cyclic trajectory or remains always
in the same state; such a trajectory defines the attractor of the BN
starting from the state x_0. A BN can have several attractors, depending on
the topology and on the Boolean functions f_i chosen.
Given a Boolean Network B, let Γ be the state set of an attractor of B
and let t be the instant of time at which the BN enters the attractor; then:

Γ = {x ∈ S_N | x = F(x_{t′}), t′ ≥ t}, 1 ≤ |Γ| ≤ 2^N.

The attractor represents the stationary part of the dynamics, whereas the
trajectory covered for t′ < t represents the transient part.
Naming T the state set of a transient, then:

T = {x ∈ S_N | x = F(x_{t′}), t′ < t}, 0 ≤ |T| ≤ 2^N.
We distinguished between the trajectory and the state set belonging to the
trajectory because the trajectory denotes not only the state set, but also
the state sequence. In fact, a trajectory is a function that matches a given
instant of time with a specific state belonging to S_N. Nevertheless, to
keep the notation simpler, we will not maintain this distinction any further.
We can define the basin of attraction for an attractor Γ of a BN B as
follows:

X_Γ = {x_0 ∈ S_N | ∃ t ∈ {0, 1, 2, ...} : F^t(x_0) ∈ Γ}.

That is, X_Γ contains all the initial states that lead the BN into the
attractor Γ.
An attractor is considered stable if, keeping the connections between nodes
fixed and perturbing the value of a node, after some updates most of the
nodes of the BN assume the same values as before the perturbation. In
biology, this phenomenon is named homeostasis.
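Since the dynamics is deterministic over a finite state space, transient and attractor can be extracted simply by iterating F until a state repeats. A sketch (function names and the toy transition function are ours, not from the thesis):

```python
# Iterate F from x0 until a state repeats; everything before the first
# repeated state is the transient T, the rest is the attractor Γ.
# `step` stands for any deterministic state transition function F.

def transient_and_attractor(step, x0):
    seen = {}          # state -> time of first visit
    trajectory = []
    x, t = x0, 0
    while x not in seen:
        seen[x] = t
        trajectory.append(x)
        x = step(x)
        t += 1
    entry = seen[x]    # time at which the repeated state first occurred
    return trajectory[:entry], trajectory[entry:]  # (transient, attractor)

# Toy point-attractor example: every state maps to all zeros in one step.
to_zero = lambda s: (0,) * len(s)
transient, attractor = transient_and_attractor(to_zero, (1, 0, 1))
```

For the toy rule above, any nonzero initial state gives a transient of length 1 and the fixpoint (0, ..., 0) as point attractor (τ = 1).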
3.4 Random Boolean Networks
Besides the classical definition of BN, many variants exist, including asyn-
chronous dynamics and probabilistic update rules [17]. Moreover, extensions
to continuous values have also been studied. Among these variants, the most
studied one is the Random Boolean Network. Random Boolean Networks
(RBNs) are also a brainchild of Kauffman [27] [30].
The peculiarity of a RBN is that functions and connections of the nodes
are generated randomly. This kind of approach is useful when the specific
structure and/or functions of a system are very complex or unknown. Most
of the research activity on this kind of network concerns the field of
evolutionary biology. By analysing the properties of RBNs, researchers have
made hypotheses about the origin of life, as well as about gene activation
mechanisms, cellular differentiation and the evolution of complex systems.
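Generating a RBN is straightforward: for each node, draw K input connections and a random truth table. A minimal sketch (parameter names are ours, and the output format matches the plain-tuple representation used in the examples of this thesis only by our own choice):

```python
import random

# Random generation of a RBN: for each of the N nodes, pick K distinct
# input nodes at random and fill its 2^K-entry truth table with random
# bits (our own minimal sketch).

def random_rbn(N, K, seed=None):
    rng = random.Random(seed)
    projections, functions = [], []
    for i in range(N):
        inputs = tuple(rng.sample(range(N), K))    # random wiring (p_i)
        table = {}
        for m in range(2 ** K):                    # random function (f_i)
            config = tuple((m >> b) & 1 for b in range(K))
            table[config] = rng.randint(0, 1)
        projections.append(inputs)
        functions.append(table)
    return projections, functions

proj, funcs = random_rbn(N=8, K=2, seed=42)
```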
Concerning the dynamics of RBNs, three kinds of regime can be
distinguished: ordered, chaotic and critical. To understand the differences
between these regimes, we can imagine plotting the nodes of a RBN on a
square lattice and letting the dynamics flow, colouring the changing nodes
in green and the frozen ones in red.
• In the ordered regime, after a transient in which many nodes change
(green), the dynamics stabilises and most of the nodes become static
(red). Therefore, there are only few green “islands” surrounded by a
red “frozen sea”.
• In the chaotic regime, most of the nodes change constantly, so the
scenario is a green sea of changing nodes with some red frozen islands,
that is, the opposite of the ordered regime.
The existence of a critical line between these two regimes has been known
since the early studies on RBNs [45] [3] [46]. In physics, the crossing of
such a critical line corresponds to a phase transition.
Resuming the metaphor of the square lattice used to illustrate the ordered
and chaotic regimes, the scenario corresponding to the critical regime, also
named the “edge of chaos”, consists of an ordered red sea that breaks into
green islands, while the red islands join and percolate through the
lattice [31].
Features concerning the chaotic regime have been also studied in a more
general context, such as the one of nonlinear equation systems [47], and with
respect to the notion of frustrated chaos in biological networks [5].
An interesting and well-studied feature of these dynamic regimes is related
to sensitivity to initial conditions and robustness to perturbations: as
briefly hinted in Section 3.3, by flipping the state of a node, we can
measure how the perturbation spreads. This can be done by comparing the
evolution of the original network with that of the perturbed one.
In the ordered regime usually the perturbation does not spread, and the
perturbed network returns to the state before the perturbation. Instead, in
the chaotic regime, perturbations tend to propagate throughout the network.
At the edge of chaos, perturbations can propagate but usually not through
all the network.
The propagation of perturbations in RBNs can be measured in several
ways. A well-known technique is the following: given two copies of the same
RBN, flip h nodes in one copy and map the Hamming distance between the
states of the two copies (i.e., the number of nodes with different values)
onto the distance obtained after one update of both copies. Repeating this
procedure several times and varying h between 0 and N (N being the number
of nodes), we plot the averages for each value of h. Finally, it is possible
to distinguish between the ordered, chaotic and critical regimes by
measuring the slope of the resulting plot. This plot is the Derrida
plot [10], and modifications of this technique have been proposed [15].
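One point of this measurement can be sketched as follows. A toy "freeze to zero" transition function stands in for a real RBN update, and all names are our own; in an actual Derrida plot the procedure is averaged over many networks, states and flips:

```python
import random

# One point of the Derrida-style measurement: take two copies of the same
# state, flip h nodes in one copy, update both once, and compare Hamming
# distances before and after (toy transition function, names ours).

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def derrida_point(step, state, h, rng):
    flipped = list(state)
    for i in rng.sample(range(len(state)), h):   # flip h distinct nodes
        flipped[i] = 1 - flipped[i]
    perturbed = tuple(flipped)
    d_before = hamming(state, perturbed)          # equals h by construction
    d_after = hamming(step(state), step(perturbed))
    return d_before, d_after

# Toy ordered-regime rule: every node freezes to 0, so perturbations die out.
freeze = lambda s: (0,) * len(s)
rng = random.Random(0)
before, after = derrida_point(freeze, (1, 0, 1, 1, 0, 1), 2, rng)
```

A slope below 1 in the resulting before/after plot indicates the ordered regime (perturbations shrink), a slope above 1 the chaotic one, and a slope of 1 the critical line.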
Another important feature of these dynamic regimes is that, by changing the
value of K (the number of inputs of a node), the dynamic regime of the RBN
changes too. This dependency derives from a particular feature of Boolean
functions: a Boolean function is named canalizing if the output value can be
determined by a single input value (e.g., in the OR function one input equal
to 1 is enough to have 1 as output). The number of canalizing functions
influences the dynamic behaviour of the RBN. For example, in a RBN with
K = 2, 16 different functions are possible and 14 of them are canalizing; in
these conditions the BN tends to show an ordered behaviour. More precisely,
RBNs with K ≤ 2 are in the ordered regime, and those with K ≥ 3 are in the
chaotic regime. In order to numerically find the
critical line corresponding to the critical regime, we have to introduce the
concept of homogeneity: considering the truth table of a node, homogeneity
is defined as the probability p of having an entry with 0 as output value.
Then, the critical line is defined by the following equation [3]:

2p(1 − p) = 1/K.
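Solving the equation for K gives the critical connectivity K_c = 1/(2p(1 − p)); for unbiased functions (p = 0.5) this yields K_c = 2, consistent with K ≤ 2 networks being ordered and K ≥ 3 networks being chaotic. A quick numeric check (function name ours):

```python
# Numeric check of the critical line 2p(1 - p) = 1/K, solved for K.

def critical_K(p):
    """Critical connectivity K_c derived from 2p(1 - p) = 1/K."""
    return 1.0 / (2.0 * p * (1.0 - p))

print(critical_K(0.5))  # unbiased functions: K_c = 2
print(critical_K(0.9))  # biasing the functions pushes the critical line higher
```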
In the study of gene regulatory networks, the notion of criticality plays a
fundamental role in resolving the dichotomy between evolvability and
robustness to noise. Researchers conjecture that living beings are in a
critical state [34] [31]: the critical state marries the inherent robustness
of the ordered regime with the flexibility of the chaotic one. Researchers
often refer to this concept as “life at the edge of chaos”.
Figure 3.3: Relationship between p and ⟨K⟩, showing the “order” and “chaos”
regions separated by the critical line.
3.5 Boolean Network design: state of the art
Most of the research activity on BNs concerns the field of evolutionary
biology; nevertheless, BNs have also been studied as computational learning
systems [41] [29] [11] and proven capable of tackling hard problems [43].
Despite the high number of analytical studies on the properties and
dynamics of BNs, their synthesis has not been studied thoroughly.
The first contribution in this direction is due to Kauffman and Smith [32]:
an application of evolutionary algorithms to RBNs. The goal is to obtain
RBNs whose dynamics reaches an attractor including a target state. This
study was followed up by Lemke et al. [35], who add a constraint on the
attractor length to the goal. The main objective of those studies is to
investigate the impact of evolution on RBNs. The results obtained raise
interesting and fundamental questions on the structure of the search
landscape and on how the evolutionary dynamics depends on RBN structural
characteristics.
More recent studies found that RBNs designed by a simple search algorithm
to maximize robustness present a set of properties which characterize RBNs
in the ordered, critical and chaotic regimes [48] [16]. Moreover, other
studies on the evolvability of robustness have also been presented [2] [8] [14].
Finally, Roli et al. [42] discuss experimental results on the evolution of
RBNs applying a simple genetic algorithm whose goal is to obtain an attrac-
tor with a given length.
In Chapter 4 we introduce a new methodology to design RBNs in order
to automatically synthesize agent programs.
4. Methodology
In Chapter 2 we introduced techniques to automatically synthesize robotic
agent programs, reaching the conclusion that it could be useful to develop
a methodology based on Boolean Network (BN) design as a further possibility
alongside the existing approaches. In fact, BNs represent a simple agent
program model, yet they can be characterized by complex dynamics (see
Chapter 3). Moreover, both the availability of analysis tools and the
ongoing research on BNs motivate their use as agent program model.
The goal of this chapter is to outline such a methodology. To do so, in Section 4.1 we describe some background concepts¹. Then, in Section 4.2 we introduce metaheuristics, the computational method we will use to design BNs. Finally, Section 4.3 reports a description of the methodology.
4.1 Background concepts
A combinatorial problem is a mathematical problem whose answer depends on the values assumed by a certain set of variables and a certain set of parameters.
This kind of problem can be viewed as the set of all its instances, where an
instance consists of a specification of all the parameters. For each instance,
combinations of variable values form the potential solutions for that instance.
Among combinatorial problems, we distinguish two correlated categories:
¹To introduce such concepts we follow Hoos and Stützle's approach [25].
• decision problems: for this problem class, the solutions of a given
instance are specified by a set of logical conditions. They present two
variants:
- decision variant: given a problem instance, the goal is to establish
whether or not a solution exists;
- search variant: given a problem instance, the objective is the
same as in the decision variant, with the additional requirement of
finding a solution (if one exists). Thus, algorithms able to solve the
search variant can always be used to solve the decision variant.
Moreover, for many combinatorial decision problems, the converse
also holds;
• optimisation problems: they present the same features as decision
problems, with the addition of an objective function used to evaluate
the quality of candidate solutions. Any optimisation problem can
be stated as a minimisation problem or as a maximisation problem,
depending on whether the given objective function is to be minimised
or maximised. Optimisation problems can also be handled as decision
problems by setting a bound on the objective function. Optimisation
problems present two variants:
- search variant: given a problem instance, it consists of finding a
solution with minimal (or maximal) objective function value;
- evaluation variant: it consists of finding the optimal value of the
objective function.
Both decision and optimisation problems can be solved by searching for solutions in the space of candidate solutions; yet, given an instance of such problems, the set of candidate solutions is typically at least exponential in the size of that instance.
In this context, we are interested in the time required to solve an instance of a combinatorial problem as a function of the size of that instance: this question is the core of computational complexity theory, i.e., the theory that classifies computational problems in terms of the computation time and memory space required to solve them.
The complexity of a computation, concerning a certain class of problems, is defined according to the size of a problem instance and the efficiency of the adopted algorithm. Precisely, we define the complexity of a problem by referring to the complexity of the best algorithm for that problem. Since time complexity is generally the more restrictive factor, problems are often categorised into complexity classes with respect to their asymptotic worst-case time complexity.
Two particularly interesting complexity classes are P and NP. P is the class of problems that can be solved by a deterministic machine in polynomial time, where a deterministic machine is a machine model that takes decisions unambiguously, based on its current internal state. NP, instead, is the class of problems that can be solved by a nondeterministic machine in polynomial time, where a nondeterministic machine is a machine model that, given a current state, takes a decision by choosing among a set of possible decisions. Precisely, this kind of machine does not take random decisions: it is a hypothetical machine which has the ability to make correct guesses for certain decisions.
Every problem in P is also contained in NP, because a nondeterministic machine can emulate deterministic computations. On the other hand, for many relevant problems in NP we do not know any polynomial-time deterministic algorithm, but only exponential-time ones: as soon as an instance of such a problem grows, it becomes intractable. Many of these hard problems in NP can be translated into one another in polynomial deterministic time (such translations are also called polynomial reductions). Given a problem, if every problem in NP can be polynomially reduced to it, then it is at least as hard as any other problem in NP; problems with this property are called NP-hard. NP-hard problems do not necessarily belong to the class NP themselves, because their complexity may actually be higher. NP-hard problems that are contained in NP are called NP-complete.
Many practically relevant combinatorial problems are NP-hard, but techniques to solve them efficiently do exist. To describe a possible way to solve these kinds of problems, we need to distinguish between two main categories of solving algorithms:
• exact algorithms: they are guaranteed to find an optimal solution,
in bounded time, for every finite size instance of a problem. On the
other hand, they might need computation times too high for practical
purposes (e.g., NP-hard problems);
• approximate algorithms: they sacrifice the guarantee of finding op-
timal solutions for the sake of getting solutions in a significantly reduced
amount of time.
In the following, we discuss the usefulness of adopting approximate methods to solve hard problems. In particular, in Section 4.2 we introduce metaheuristics, the approximate method adopted in our methodology.
4.2 Introduction to metaheuristics
Metaheuristics are general search strategies upon which a specific algorithm for solving an optimisation problem can be designed [7] [43] [25]. They belong to the category of approximate algorithms and represent the current state-of-the-art approach for a wide variety of optimisation problems, in areas such as bioinformatics, logistics, engineering, and business. They combine diverse concepts for exploring the space of candidate solutions and also apply learning strategies in order to find (near-)optimal solutions efficiently. These techniques have been successfully applied to optimisation problems for decades and are particularly effective in tackling problems in which the objective function is rather complex or even an approximation of the actual optimisation criterion.
Metaheuristics can usually be classified into trajectory methods and population-based methods. The former describe a trajectory over a search graph (e.g., local search algorithms); the latter perform a search process characterised by an iterative sampling of the search space (e.g., genetic algorithms). Such methods can also be combined and integrated with other techniques from Artificial Intelligence and Operations Research, giving origin to Hybrid Metaheuristics [6], which are nowadays the leading-edge technology in optimisation.
Metaheuristics have been applied to parameter tuning in algorithms [26] and biological models [39] [38]. Moreover, applications to algorithm design [53], neural network training [1] [40] and system design [12] [22] have also been presented. All these applications share a common approach, which lies in the possibility of modelling the design goals as an optimisation problem. The decision variables of the problem are the system parameters which can be set externally; for example, for a neural network they could be the network weights. The objective function is implicitly defined by evaluating a parameter configuration (i.e., an assignment to the decision variables) through a simulation of the system. In the case of neural networks, the error of the network over the training set can be computed, while for parameter tuning the performance of the system with respect to a target output can be estimated. Both local search and population-based methods have been used for system design and tuning. Hutter et al. [26] and Xu et al. [53] apply an iterated local search over the algorithm parameter space, while Nicosia and Sciacca [39] and Montagna and Roli [38] propose methods based on genetic algorithms and particle swarm optimisation, respectively.
Figure 4.1: Methodology approach (the metaheuristic proposes the Boolean functions of the BN; an evaluator simulates the network against the target requirements and returns the objective function value).
4.3 Methodology description
Automatic techniques to synthesize robotic agent programs are characterised by two main components: the agent program model and the optimisation algorithm (see Section 2.1). In our methodology, the agent program model is given by a BN and the optimisation algorithm consists of a metaheuristic technique.
The design of a BN that satisfies given criteria can be modelled as a constrained combinatorial optimisation problem by properly defining the set of decision variables, the constraints and the objective function. The approach that we follow is illustrated in Figure 4.1: the starting point is an RBN (see Section 3.4), whose parameters are the number K of ingoing arcs per node and the homogeneity p. The topology of the BN does not change during the design process. The process consists of several iterations: at each iteration, the metaheuristic algorithm operates on decision variables which encode the Boolean functions of the BN. A complete assignment to those variables defines an instance of a BN. This network is then simulated and evaluated against the specific target requirements by a dedicated software component. For example, in robotics applications, the simulation of the BN coincides with the simulation of a robot that tries to achieve a required task: the behaviour of the robot is governed by the agent program based on the BN, and the objective function is the performance measure of the robot. Finally, the objective function value is returned to the metaheuristic algorithm, which can thus proceed with the search. The process ends when a given number of iterations is reached or a given value of the objective function is obtained; both these parameters are set a priori by the experimenter.
Metaheuristics are particularly appropriate for tackling the automatic design of BNs because they can usually explore huge search spaces efficiently, and heuristic information can be easily integrated. Moreover, since the design problem can be tackled by successive refinements of both the search model and the algorithms, metaheuristics are certainly a very promising technique for achieving successful results, because they can be developed incrementally, moving from a simple strategy to more sophisticated ones. It is also important to remark that, in design problems in general, finding a proven optimal solution is not particularly relevant, because the criteria defining the quality of solutions are an approximation of an (unknown) actual quality function. Therefore, in these cases, metaheuristics are definitely preferable to exact algorithms, because their time complexity scales polynomially with the instance size. In addition, metaheuristics can be easily combined with constraint satisfaction and constraint programming techniques, which can be used to reduce the search space explored.
Since this thesis represents a first attempt to apply the methodology, in the following we adopt a simple search strategy, namely a stochastic iterative improvement local search algorithm. In general, for a given instance of a combinatorial problem, the search for solutions takes place in the space of candidate solutions. The local search process starts by selecting an initial candidate solution and then proceeds by iteratively moving from one candidate solution to a neighbouring one, where the decision on each search step is based on a limited amount of local information only. In our specific case, the local search operates as follows: each move consists of randomly choosing both a node of the BN and an entry in the truth table of the Boolean function characterising that node. Then, the value of that entry is flipped. If a given move does not lead to an improvement (i.e., a better value of the objective function), the move is retracted; otherwise, the move is accepted and the modified BN becomes the new candidate solution. This kind of local search is known as stochastic local search [25]: to a first approximation, it is a local search algorithm in which search initialisation and move choice are randomised.
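A single step of this stochastic iterative improvement can be sketched as follows (a minimal illustration only: the representation of the network as a list of truth tables, one per node, and the function names are our assumptions, not the thesis implementation):

```python
import random

def local_search_step(truth_tables, objective, best_value):
    """One stochastic iterative improvement step: flip one random
    truth-table entry and keep the flip only if the objective improves."""
    node = random.randrange(len(truth_tables))
    entry = random.randrange(len(truth_tables[node]))
    truth_tables[node][entry] ^= 1          # tentative move: flip the bit
    value = objective(truth_tables)
    if value < best_value:                  # improvement: accept the move
        return value
    truth_tables[node][entry] ^= 1          # no improvement: retract it
    return best_value
```

Repeated calls to this step, starting from a randomly generated network, implement the whole search described above.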
In Chapter 5 we execute experiments in an abstract context to validate
the proposed methodology. Then, in Chapter 6 we apply the methodology
to robotics case studies in order to provide a proof of concept.
5. Preliminary studies
In order to validate the methodology described in Chapter 4, we undertake experiments whose objective, in general, is to automatically design BNs that satisfy target requirements on their trajectory in the state space. In Section 5.1 we report experiments and results concerning four diverse case studies.
Another goal of this chapter is to analyse the BNs resulting from these experiments, focusing on the relation between BN dynamic regimes and the performance of the methodology. The results of this analysis are reported in Section 5.2.
5.1 Experiments and results
In this section, we apply our methodology to four diverse case studies. The difference among them lies in the diverse target requirements that the BNs must satisfy. Nevertheless, since these case studies share many common features, we introduce those aspects before discussing them separately.
Since the adopted optimisation algorithm is a stochastic local search, for each case study we execute 90 independent experiments, each of which corresponds to a different BN. Each BN has 100 nodes and is autonomous (i.e., its dynamics flows without receiving any external input). For each BN, initial connections and functions are randomly generated with K = 3 (i.e., each node has 3 ingoing arcs) and homogeneity p (i.e., considering the truth table of a node, the probability that an entry has 0 as output value is equal to p): precisely, among the 90 BNs, 30 are generated with p = 0.5, 30 with p = 0.788675 and the last 30 with p = 0.85. These homogeneity values statistically correspond to the chaotic, critical and ordered regimes, respectively (see Section 3.4). The initial state of each BN is randomly generated.
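The generation of such initial networks, together with the synchronous update they undergo during simulation, can be sketched as follows (an illustrative sketch under the parameters stated above; the data representation and function names are our assumptions):

```python
import random

N, K = 100, 3          # number of nodes and ingoing arcs per node, as in the text

def random_bn(p, rng=random):
    """Generate a random BN: each node gets K random input nodes and a
    truth table of 2**K entries, each entry being 0 with probability p."""
    inputs = [rng.sample(range(N), K) for _ in range(N)]
    tables = [[0 if rng.random() < p else 1 for _ in range(2 ** K)]
              for _ in range(N)]
    return inputs, tables

def step(state, inputs, tables):
    """One synchronous update: every node reads its K inputs in the
    current state and looks up its next value in its truth table."""
    next_state = []
    for node in range(N):
        idx = 0
        for j in inputs[node]:
            idx = (idx << 1) | state[j]     # build the truth-table index
        next_state.append(tables[node][idx])
    return next_state
```

Simulating a trajectory of T = 1000 steps then amounts to calling `step` 1000 times from a random initial state.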
For each experiment, 100000 iterations of the optimisation algorithm are
executed. Each iteration corresponds to a simulation of the respective BN
trajectory lasting T steps, with T = 1000 (i.e., 1000 BN state updates).
Concerning the target requirements, in general they consist of reaching
a target state in different temporal windows. The target state is randomly
generated.
As a first attempt, we used a stochastic local search that, at each iteration, randomly selects a BN node and randomly flips an entry of its Boolean function. Yet, we noticed that this approach does not provide good performance. Thus, we decided to use a variant of the described algorithm: at each iteration of the optimisation algorithm, it chooses a BN variable that does not match the target state and randomly flips an entry of its Boolean function. The results reported in the following refer to this variant of the algorithm; precisely, for each case study we report the respective run length distribution [24], that is, the probability of finding a solution within a certain number of local search steps.
5.1.1 First case study
The goal of the first case study is to design a BN whose trajectory must reach a given target state at least once within the temporal interval (0, T ].
At each simulation, a BN is evaluated on the basis of the simulation step in which the BN presents the highest number of Boolean variables matching the target state. Let u(t) be the function returning the number of Boolean variables matching the target state at simulation step t, with t ∈ (0, T ]; the objective function can then be written as

    min_{t ∈ (0,T]} (1 − u(t)/N).

Figure 5.1: Run length distribution of the first case study (success frequency vs. local search iterations, for p = 0.5, p = 0.788675 and p = 0.85).
The results obtained are shown in Figure 5.1: note that all the BNs with initial homogeneity p = 0.85 reach the goal within 80000 iterations of the optimisation algorithm. BNs initially in the critical regime also perform well, whereas only 10% of the chaotic BNs reach the goal.
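Given a simulated trajectory (a list of BN states, each a list of N Boolean values), this objective function can be computed as in the following sketch (illustrative only; the function names are hypothetical):

```python
def matches(state, target):
    """u(t): number of Boolean variables matching the target state."""
    return sum(1 for s, z in zip(state, target) if s == z)

def objective_first_case(trajectory, target):
    """min over t in (0, T] of 1 - u(t)/N; a value of 0 means the
    target state is reached at some step of the trajectory."""
    n = len(target)
    return min(1 - matches(state, target) / n for state in trajectory)
```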
5.1.2 Second case study
The goal of the second case study is to design a BN whose trajectory must reach a given target state exactly at instant t̄, with t̄ ∈ (0, T ].
At each simulation, a BN is evaluated by checking the number of Boolean variables matching the target state at t̄. However, we also partially reward those states in the BN trajectory that either have a high number of Boolean variables matching the target state, or occur at simulation steps close to t̄. To implement this reward rule, we define a family of functions f(t; γ) on the interval [0, T ] as follows:

    f(t; γ) =
        0                        if t < t̄ − τ
        0                        if t > t̄ + τ
        1 − |(t − t̄)/τ|^γ        otherwise.

The function f(t; γ) is plotted in Figure 5.2: note that BN states close to t̄ can be rewarded in diverse ways, depending on the values of the parameters γ and τ.

Figure 5.2: Function f(t; γ) used in the second case study to assign a certain reward to partially successful BNs (plotted for γ = 0.5, γ = 1 and γ = 2).
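The reward function can be written directly from its definition (a sketch; the target instant t̄ is passed as `t_bar`):

```python
def reward(t, t_bar, tau, gamma):
    """f(t; gamma): 0 outside the window [t_bar - tau, t_bar + tau],
    and 1 - |(t - t_bar)/tau|**gamma inside it."""
    if t < t_bar - tau or t > t_bar + tau:
        return 0.0
    return 1.0 - abs((t - t_bar) / tau) ** gamma
```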
Figure 5.3: Run length distribution of the second case study (success frequency vs. local search iterations, for p = 0.5, p = 0.788675 and p = 0.85).
Let u(t) be the function returning the number of nodes matching the target state at simulation step t, with t ∈ (0, T ]; the objective function can then be written as

    min_{t ∈ (0,T]} (1 − f(t; γ) · u(t)/N).

We tried different values of t̄, τ and γ, without noticing relevant differences in performance. In Figure 5.3 we report the results obtained with t̄ = 500, τ = 10 and γ = 2: note that the performance is slightly worse than in the first case study, but this can be explained by the fact that this case study imposes a tighter constraint than the first one, namely reaching the target state at a specific simulation step. However, Figure 5.3 confirms the discrepancy in performance among the chaotic, critical and ordered initial regimes of the BNs, as observed in the first case study.
Figure 5.4: Function f(t; γ) used in the third and fourth case studies to assign a certain reward to partially successful BNs (plotted for γ = 0.5, γ = 1 and γ = 2).
5.1.3 Third case study
The goal of the third case study is to design a BN whose trajectory must reach a given target state at least once within the temporal interval [t̄, T ], with t̄ ∈ (0, T ], but not before t̄.
As in the second case study, we also assign a certain reward to those BNs whose trajectory reaches a state that is either almost congruent to the target one, or occurs at most τ simulation steps before t̄. To implement this reward rule, we define a family of functions f(t; γ) on the interval [0, T ] as follows:

    f(t; γ) =
        0                        if t < t̄ − τ
        1 − |(t − t̄)/τ|^γ        if t̄ − τ ≤ t ≤ t̄
        1                        if t > t̄.

The function f(t; γ) is plotted in Figure 5.4: note that BN states before t̄ can be rewarded in diverse ways, depending on the values of the parameters γ and τ.

Figure 5.5: Run length distribution of the third case study (success frequency vs. local search iterations, for p = 0.5, p = 0.788675 and p = 0.85).
Let u(t) be the function returning the number of nodes matching the target state at simulation step t, with t ∈ (0, T ]; the objective function can then be written as

    1                                          if u(t)/N = 1, t ∈ (0, t̄ − τ)
    min_{t ∈ (0,T]} (1 − f(t; γ) · u(t)/N)     otherwise.          (5.1)

Note that this objective function gives no reward at all to those BNs whose trajectory reaches the target state before t̄ − τ.
In Figure 5.5, we report the results obtained adopting the same parameters as in the second case study, that is, t̄ = 500, τ = 10 and γ = 2. Note that the performance is almost the same as in the second case study. Moreover, the discrepancy in performance among the chaotic, critical and ordered initial regimes of the BNs is confirmed in this case as well.
5.1.4 Fourth case study
The goal of the fourth case study is the same as the third one, with a further constraint: once the BN trajectory reaches the target state, that state must be kept. In other words, the target state must be a fixpoint of the BN. As in the second and third case studies, we also assign a certain reward to those BNs whose trajectory reaches a state that is either almost congruent to the target one, or occurs at most τ simulation steps before t̄. At each simulation, let t′ ∈ [t̄ − τ, T ] be the simulation step corresponding to the state with the highest number of Boolean variables congruent to the target state. To verify that the BN state at t′ is a fixpoint of the BN, it is enough to check whether the state at t′ + 1 is equal to the one at t′. If this occurs, then we can assert that the BN trajectory has reached a fixpoint. This statement is valid because we are considering BNs with deterministic dynamics and synchronous state updates.
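This check can be sketched in a couple of lines (illustrative; `update` stands for one synchronous BN update function):

```python
def is_fixpoint(state, update):
    """Under deterministic, synchronous dynamics, a state is a fixpoint
    if and only if one update step leaves it unchanged."""
    return update(state) == state
```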
Analysing the requirements, two main features can be noticed. First of all, the network has to reach the target state, but not before t̄ − τ: to evaluate this aspect, we can use the same objective function as in the third case study, defined by Equation (5.1). The second aspect consists of making the target state a fixpoint of the BN. A way to merge these two aspects is to create an objective function based on a weighted mean, as follows:

    1                                 if u(t)/N = 1, t ∈ (0, t̄ − τ)
    min(α x(t′) + (1 − α) y(t′))      otherwise,
Figure 5.6: Run length distribution of the fourth case study (success frequency vs. local search iterations, for p = 0.5, p = 0.788675 and p = 0.85).
where x(t′) is defined by Equation (5.1), and y(t′) is a function that compares the BN states at t′ and t′ + 1, returning the ratio between the number of non-congruent Boolean variables and the total number of BN nodes. Thus, when y(t′) = 0, the BN state at t′ is a fixpoint of the BN.
Different values of α account for the different relative importance of reaching the target state and keeping it. We tried various values of this parameter (i.e., α ∈ {0.25, 0.5, 0.75}) to explore its effect on the optimisation process, and we noticed that small values of α (i.e., favouring the reaching of a fixpoint regardless of the target state) correspond to better performance.
performances. In Figure 5.6, we report the results obtained with γ = 0.5
and other parameters setted to the same values as the results showed for the
second and the third case study, that are t = 500, τ = 10 and γ = 2. Note
that the performances are much better with respect to all the previous case
studies: we conjecture that, introducing the constraint of reaching a fixpoint,
the exploration of the search space is different and local search can explore
38 Preliminary studies
Ordered networks
Non-optimal Optimal
initial BN 0.7354200 0.7290419
final BN 1.077538 1.158360
Table 5.1: Derrida’s parameters for BNs initially ordered and then designed
by local search.
Critical networks
Non-optimal Optimal
initial BN 0.9500681 0.9419760
final BN 1.165276 1.197605
Table 5.2: Derrida’s parameters for BNs initially critical and then designed
by local search.
Chaotic networks
Non-optimal Optimal
initial BN 1.414794 1.425388
final BN 1.3674872 0.9226405
Table 5.3: Derrida’s parameters for BNs initially chaotic and then designed
by local search.
it more effectively.
5.2 Result analysis
The results described in Section 5.1 show that the proposed methodology can obtain good performance, and they also represent a validation of the tool. In this section, we analyse the BNs obtained by the optimisation algorithm in terms of their dynamic regimes. As described in Section 3.4, Derrida's plots represent a useful tool for this kind of study. We estimate Derrida's parameter of a BN as the ratio between the Riemann integrals of the Derrida curve and of the bisector, over [0, 10].
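Assuming the Derrida curve is available as a list of sampled points, this estimate can be sketched as follows (illustrative only: the trapezoidal sums stand in for the Riemann integrals, and all names are hypothetical):

```python
def derrida_parameter(distances, curve):
    """Estimate the Derrida parameter as the ratio between the integrals
    (approximated here by trapezoidal sums) of the Derrida curve and of
    the bisector y = x over the same interval.
    `distances` are Hamming distances at time t; `curve` holds the
    average distances at time t+1 (both lists of equal length)."""
    def integral(xs, ys):
        return sum((ys[i] + ys[i + 1]) / 2 * (xs[i + 1] - xs[i])
                   for i in range(len(xs) - 1))
    return integral(distances, curve) / integral(distances, distances)
```

A parameter close to 1 indicates a curve close to the bisector, i.e., the critical regime.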
Our analysis proceeds by considering the BNs obtained from all the experiments described in Section 5.1: we separate these BNs into 3 groups, on the basis of their initial dynamic regime (i.e., ordered, critical and chaotic). Then, we split each group into successfully designed BNs and unsuccessfully designed ones. For each subgroup, we compute the average of Derrida's parameters of the initial BNs and the average of Derrida's parameters of the BNs designed by local search. The obtained averages are reported in Tables 5.1, 5.2 and 5.3.
Note that, in general, Derrida's parameter of the BNs designed by local search tends to 1 (i.e., the Derrida's parameter corresponding to the critical regime). This is even more evident if we consider only the optimal BNs. These results are very interesting because, as described in Section 3.4, the critical regime is the most studied and promising one, since it combines the inherent robustness of the ordered regime with the flexibility of the chaotic one. Future work will concern deeper studies in this direction. However, the goal of this thesis is to prove the applicability of the methodology to robotics applications. Thus, now that the methodology has been validated, in Chapter 6 we apply it to that kind of case study.
6. Robotics applications
In this chapter we apply our methodology to robotics case studies. Precisely, the goal is to verify its capability to automatically devise a memory-based agent program, that is, an agent program able to keep memory of previous input patterns in an internal state (Section 2.1).
Our starting point is the case study described in Section 6.1. It aims to test our methodology on a very simple task, one that can actually be solved by a memoryless agent program. Since our main goal is to obtain memory-based agent programs, once we find a solution we move on to more complex tasks. For this reason, we do not report a complete analysis of this first case study.
In Section 6.2, we apply our methodology to a case study that needs a memory-based agent program to be achieved. This kind of task enables us to exploit the powerful dynamics of BNs. All the agent programs have been obtained in simulation. To evaluate their effectiveness in a more realistic setup, we also port them onto a real robotic platform and conduct extensive experiments.
6.1 Path follower
In order to acquire familiarity with the use of our methodology in robotics applications, we simulate, as a first attempt, an agent facing a path following task: the agent has to follow a given path, e.g., a line or a circle.
Figure 6.1: Path follower environment (the agent and the circular path inside the arena).
6.1.1 Task definition
In order to completely define the task, we describe its task environment:
• environment: it consists of a square arena (1 m × 1 m) in which a path is traced. In particular, the path is defined by drawing two parallel lines, and the agent follows the path by advancing between them (see Figure 6.1). The simulation is time discrete, that is, at each instant of time the agent perceives the environment and selects the action to perform.
• performance measure: given a certain simulation time T, the agent must satisfy two properties to achieve the task successfully: spending as much time as possible on the path, without going out of it, and advancing as much as possible along the path (e.g., covering a full circle on a circular path). We define the performance measure by minimising an error function E ∈ [0, 1]. This error function is given by a weighted mean of two contributions. The first one measures the time the agent spends along the path: since both the task simulation and the BN dynamics are time discrete, at each step we can check whether the agent is on the path or not. If the agent is on the path, it is rewarded with 1 point, otherwise it is not rewarded (0 points). The second contribution measures how much the agent has advanced along the path. Thus, we can write the error function E as follows:

    E = α (1 − (∑_{i=1}^{T} s_i) / T) + (1 − α) (1 − (pmax − pf)),

where:

    ∀i ∈ [1, T ],  s_i = 1 if the agent is on track, 0 otherwise

and:
- pmax is the position in the arena that the agent can reach in the optimal case, i.e., if the agent advances along the path at every simulation step;
- pf is the position in the arena actually reached by the agent.
The performance measure is evaluated by minimising E: the smaller E is, the better the agent's performance.
Note that this kind of performance measure enables the experimenter to favour one contribution over the other by changing the value of α. For example, if α = 0.25, then the result will be an agent that covers a long portion of the path even though it sometimes goes off the path.
• sensors: they allow the agent to know its position with respect to the path at each instant of time. Precisely, the agent knows whether it is on the path, or the path is on its right side, or the path is on its left side, but not its distance from the path.
• actuators: the agent is equipped with two motors, each of which controls a wheel. The speed of each wheel can assume only two values, ON or OFF.
To achieve this task a memoryless agent program is enough (see Section 2.1) because, at each instant of time, the only piece of information necessary to set the wheel speeds correctly is whether or not the robot is on the path.
6.1.2 BN setup
On applying our methodology, we use as first attempt a small BN with 10
nodes, i.e., N = 10. Initial connections and functions of the nodes are ran-
domly generated with K = 3 (i.e., the number of ingoing arcs for each node
is equal to 3) and homogeneity p = 0.5 (i.e., considering the truth table of
a node, the probability to have an entry with 0 as output value is equal to
0.5).
For robotics applications, we use BNs with some output nodes and some
input nodes. Output nodes are nodes whose Boolean variable, at each instant
of time, is used to set actuator values. To do this, it is necessary to
define a suitable mapping between Boolean values and the value range of the
actuators. For example, we may associate a BN output node with an agent
wheel: if, at a given instant of time, the Boolean variable is equal to 1,
then the wheel assumes a certain constant speed; otherwise, the wheel is
stopped. Input nodes are nodes whose Boolean variable, at each instant of
time, is not determined by the BN dynamics, but is set according to the
value of the agent's sensors. Thus, in order to set these node variables,
it is necessary to define a suitable mapping between the value range of the
sensors and Boolean values. For example, we may associate a sound sensor
with an input node: if, at a given instant of time, the sound sensor
perceives a sound, then the Boolean variable of the input node assumes
value 1, otherwise 0.
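A synchronous update step with clamped input nodes might look like this (an illustrative sketch under our own naming and indexing conventions, not the thesis implementation):

```python
def bn_step(state, inputs, tables, input_nodes, sensor_values):
    """One synchronous BN update: input nodes are clamped to sensor values,
    every other node reads its K inputs from the *old* state and looks up
    the result in its truth table."""
    new_state = list(state)
    for node, value in zip(input_nodes, sensor_values):
        new_state[node] = value
    for node in range(len(state)):
        if node in input_nodes:
            continue
        index = 0
        for src in inputs[node]:
            index = (index << 1) | state[src]  # truth-table row index
        new_state[node] = tables[node][index]
    return new_state
```

Output nodes need no special treatment during the update: the actuators simply read their Boolean values after each step.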
Percept x1 x2
Agent on path 0 0
Agent on right of path 0 1
Agent on left of path 1 0
Table 6.1: Mapping between agent sensors and BN input nodes for the path
following task.
Actuators
x3 x4 Left wheel Right wheel
0 0 OFF OFF
0 1 OFF ON
1 0 ON OFF
1 1 ON ON
Table 6.2: Mapping between agent actuators and BN output nodes for the
path following task.
For this case study, in order to map agent sensors and actuators onto the
BN, we set two nodes (x1, x2) as input nodes and two other nodes (x3, x4) as
output nodes. In particular, the mapping between perceptions/actions and
node values is described in Table 6.1 and Table 6.2.
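In code, the two mappings of Tables 6.1 and 6.2 amount to a pair of small lookups (illustrative Python; the percept names are ours):

```python
# Table 6.1: percept -> values of input nodes (x1, x2).
PERCEPT_TO_INPUTS = {
    "on_path": (0, 0),
    "path_on_right": (0, 1),
    "path_on_left": (1, 0),
}

def wheels_from_outputs(x3, x4):
    """Table 6.2: values of output nodes (x3, x4) -> wheel commands."""
    return ("ON" if x3 else "OFF", "ON" if x4 else "OFF")
```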
6.1.3 Experiments
The experiments are executed using two different kinds of path: a line and
a circle. We empirically estimated that 1500 simulation steps are enough to
let the agent advance until the end of the path, thus each simulation lasts
1500 steps, i.e., T = 1500.
Executing several independent experiments, we notice that the optimisa-
tion algorithm returns a successful BN after less than 1 hour of computation
time, for both the straight and the circular path. In particular, the error
function returns values below 0.1, meaning that the agent advances on the
path in at least 90% of the simulation steps.
Since the goal of this first experimental session is just to test and set up
our methodology in robotics, we do not dwell deeply on the analysis of the
obtained results. However, we report an example of a successful simulation
in Figure 6.2. The ease of obtaining these preliminary results makes it
possible to move on to more complex and interesting tasks (Section 6.2).
Since these represent the main goal of this thesis, we will provide a deeper
analysis of the results and tests on real robots.
(a) t = 0; (b) t = 180; (c) t = 400; (d) t = 650; (e) t = 850; (f) t = 1200
Figure 6.2: An example of successful path following on a circular path.
6.2 Phototaxis and antiphototaxis
The case study analysed in the following consists of an agent that selects
actions with respect to a light source. Precisely, the agent must be able
to perform two different behaviours: going towards the light (phototaxis)
or moving away from it (antiphototaxis). In the beginning, the agent must
perform phototaxis; then, it must switch its behaviour to antiphototaxis
after perceiving a clap. Thus, the agent needs to remember whether the
clap has been perceived in order to select the best action at each simulation
step. This means that a memory-based agent program is necessary to
achieve this task.
The goal of this section is to demonstrate that our methodology is capable
of automatically building a memory-based agent program. Thus, in this
context we do not focus on a statistical analysis of the methodology's
success rate; rather, we focus on providing a proof of concept.
6.2.1 Task definition
In order to completely define the task, we describe its task environment:
• environment: it consists of a square arena (1 m x 1 m) with a light
source positioned at the origin (see Figure 6.3).
• performance measure: at the beginning of the experiment, the agent
is positioned close to the vertex of the arena opposite to the light.
Then, given a certain simulation time T, the agent must satisfy two
properties to successfully achieve the task:
- it must go towards the light at each simulation step, until a clap
is perceived (phototaxis);
[Two plots of the 1 m x 1 m arena at t = 0, showing the positions of the light source and of the agent.]
Figure 6.3: Phototaxis and antiphototaxis environment.
- let tc be the instant of time in which the agent perceives the clap,
then the agent must move away from the light (antiphototaxis)
for all the simulation steps subsequent to tc.
We define the performance measure by minimising an error function
E ∈ [0, 1]. This error function is a weighted mean of the phototaxis
term and the antiphototaxis term: in fact, since both the simulation
and the BN dynamics are time-discrete, at each step we can check
whether the agent is moving in the correct direction with respect to
the light, rewarding it with 1 point if it is (0 points otherwise). Thus,
we can write the error function E as follows:

E = α (1 − (∑_{i=1}^{tc} s_i) / tc) + (1 − α) (1 − (∑_{i=tc+1}^{T} s_i) / (T − (tc + 1))),
where:
∀i ∈ [1, tc], s_i = 1 if the agent went towards the light, 0 otherwise;
∀i ∈ [tc + 1, T], s_i = 1 if the agent moved away from the light, 0 otherwise.
The performance measure is evaluated by minimising E, i.e., the smaller
E is, the better the performance.
Note that this kind of performance measure allows the experimenter
to favour one term over the other by changing the value of α. For
example, if α = 0.25, the result will be an agent that performs
phototaxis well, but either cannot perform antiphototaxis at all or
starts performing it several steps after the clap.
• sensors: light sensors allow the agent to know its position with respect
to the light at each instant of time. The agent has a circular body and
is equipped with 4 light sensors whose values are binary (ON/OFF).
Different combinations of switched-on sensors allow the agent to perceive
the light at 8 different angles around its body. We denote the 8 possible
perceptions by assigning each of them a numerical identifier, called a
"percept ID". The light sensor disposition and the 8 percept IDs are
outlined in Figure 6.4. Moreover, the agent is equipped with a sound
sensor whose value can be ON or OFF: at each simulation step the
value is ON if the agent perceives a clap, OFF otherwise.
• actuators: the agent is equipped with two motors, each controlling
one wheel. Each wheel's speed can assume only two values: ON or OFF.
To achieve this task, it is necessary to produce a memory-based agent pro-
gram. In fact, at each simulation step, in order to choose between phototaxis
and antiphototaxis, the agent must remember whether or not a clap has been
perceived.
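The error function defined above translates directly into code. This sketch indexes simulation steps from 1 and keeps the denominator T − (tc + 1) exactly as written in the formula:

```python
def error_function(s, tc, T, alpha=0.5):
    """s[i] is 1 if the agent moved in the correct direction at step i
    (towards the light before the clap step tc, away from it after);
    s is 1-indexed, so s[0] is unused."""
    photo = sum(s[i] for i in range(1, tc + 1)) / tc
    anti = sum(s[i] for i in range(tc + 1, T + 1)) / (T - (tc + 1))
    return alpha * (1 - photo) + (1 - alpha) * (1 - anti)
```

For example, with α = 0.5, an agent that performs phototaxis perfectly but never antiphototaxis scores E = 0.5, matching the observation made later about BNs whose error median equals 0.5.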
Figure 6.4: Agent light sensors exploited in the phototaxis and antiphototaxis
case study: the 4 circles represent the light sensors and the number labels
outline the 8 percept IDs.
6.2.2 BN setup
In this case study we use a BN with more nodes than the one used for the
path following task (Section 6.1.2). The reason for this choice is the
complexity required: here, the BN dynamics needs to keep an internal state
and, in order to create this kind of dynamics, we consider it appropriate
to use a higher number of nodes than in a BN associated with a memoryless
agent program. Moreover, in this case study the number of sensors is higher
than that of the agent used in the path following task, which means that a
higher number of BN input nodes is also required. Therefore, we use as a
first attempt a BN with 20 nodes (N = 20). Initial connections and functions
of the nodes are randomly generated with K = 3 (i.e., each node has 3
ingoing arcs) and homogeneity p = 0.5 (i.e., considering the truth table of
a node, the probability that an entry has 0 as output value is 0.5).
In order to map sensors and actuators onto the BN, we use one input
Percept ID x2 x3 x4 x5
1 1 0 0 0
2 1 1 0 0
3 0 1 0 0
4 0 1 1 0
5 0 0 1 0
6 0 0 1 1
7 0 0 0 1
8 1 0 0 1
Table 6.3: Mapping between light percept IDs and BN input nodes in the
phototaxis and antiphototaxis case study.
Percept x1
Clap perceived 1
Clap not perceived 0
Table 6.4: Mapping between clap perceptions and BN input node in the
phototaxis and antiphototaxis case study.
node (x1) for the clap sensor value, four input nodes (x2, x3, x4, x5)
for the four light sensor values and two other nodes (x6, x7) as output
nodes to drive the two wheels. The mapping between perceptions/actions
and node values is described in Tables 6.3, 6.4 and 6.5.
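The mappings of Tables 6.3 and 6.4 can be written as a lookup that clamps the five input nodes (illustrative Python; we use 0-based indices 0..4 for x1..x5):

```python
# Table 6.3: percept ID -> values of input nodes (x2, x3, x4, x5).
LIGHT_INPUTS = {
    1: (1, 0, 0, 0), 2: (1, 1, 0, 0), 3: (0, 1, 0, 0), 4: (0, 1, 1, 0),
    5: (0, 0, 1, 0), 6: (0, 0, 1, 1), 7: (0, 0, 0, 1), 8: (1, 0, 0, 1),
}

def set_input_nodes(state, percept_id, clap_heard):
    """Clamp x1 (clap, Table 6.4) and x2..x5 (light, Table 6.3)."""
    state = list(state)
    state[0] = 1 if clap_heard else 0
    state[1:5] = LIGHT_INPUTS[percept_id]
    return state
```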
6.2.3 Experiment outline
Considering the size of the arena and the speed of the agent wheels, we em-
pirically estimated that 1000 simulation steps are enough to let the agent
achieve the task.
Actuators
Left wheel Right wheel x6 x7
OFF OFF 0 0
OFF ON 0 1
ON OFF 1 0
ON ON 1 1
Table 6.5: Mapping between actuators and BN output nodes in the photo-
taxis and antiphototaxis case study.
Since the optimisation algorithm consists of a stochastic local search, we
execute 30 independent experiments, each corresponding to a different
initial BN (randomly generated as described in Section 6.2.2). In each
experiment we train the agent to achieve the task in a simulated environment,
regardless of initial conditions such as its initial position in the arena
and its orientation. The set of initial conditions forms the training set.
At the end of the training process, we test the obtained BNs in a simulated
environment on further initial conditions. Then, we port the best BN
resulting from this testing process onto a real robotic platform and
undertake extensive experiments in order to evaluate its effectiveness. In
order to obtain a suitable agent program, we iterate the whole process
(i.e., training in a simulated environment, testing in a simulated
environment and testing on the real robotic platform).
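The training step can be pictured as a stochastic local search over the Boolean functions. This chapter does not spell out the exact move and acceptance scheme, so the flip-one-bit, accept-if-not-worse policy below is an assumption made for illustration:

```python
import random

def local_search(evaluate, tables, iterations, seed=0):
    """Minimise evaluate(tables) by flipping single truth-table bits and
    keeping each flip only if the error does not worsen."""
    rng = random.Random(seed)
    best = evaluate(tables)
    for _ in range(iterations):
        node = rng.randrange(len(tables))
        bit = rng.randrange(len(tables[node]))
        tables[node][bit] ^= 1            # tentative move
        err = evaluate(tables)
        if err <= best:
            best = err                    # accept
        else:
            tables[node][bit] ^= 1        # revert
    return tables, best

# Toy objective: count the 0-bits (optimum: all-ones tables).
def count_zeros(ts):
    return sum(b == 0 for t in ts for b in t)

tables, best = local_search(count_zeros, [[0, 0, 0, 0], [0, 0, 0, 0]], 500)
```

In the actual methodology, evaluate() would run simulations over the training set and return an aggregate of the error function values.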
In Sections 6.2.5, 6.2.6 and 6.2.7 we outline the features and results of
three different training sessions and the respective tests on real robots.
Before discussing these training sessions and the tests on real robots, we
describe how we embed a BN in a real robot.
Figure 6.5: Disposition and IDs of e-puck proximity sensors.
6.2.4 Robot setup
In order to test on a real robot the agent programs obtained by training,
we use the e-puck robot. Here we focus only on the e-puck features that are
necessary to achieve the task; a more detailed description of the e-puck is
reported in Appendix A.
To perceive the light, the robot is equipped with 8 IR proximity sensors
whose disposition is shown in Figure 6.5. Since the disposition of the
proximity sensors is different from that of the light sensors of the
simulated agent, we combine the values of the e-puck proximity sensors to
obtain the same configuration as in simulation (see Figure 6.4). The
correspondence between percept IDs and BN input node values is the same as
for the simulated agent; a complete summary of these mappings is reported
in Table 6.6.
Concerning sound perception, the e-puck is equipped with 3 microphones;
the clap can be perceived by setting thresholds on the readings and keeping
the highest value. The BN input node concerning the clap perception is then
updated according to the same schema as in simulation (see Table 6.4).
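A minimal version of this detection scheme (the threshold value is purely illustrative):

```python
def clap_detected(mic_readings, threshold=0.6):
    """Keep the loudest of the three microphone readings and compare it
    against a fixed threshold."""
    return max(mic_readings) > threshold
```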
min(e-puck sensors) Percept ID x2 x3 x4 x5
mean(IR0,IR7) 1 1 0 0 0
IR1 2 1 1 0 0
IR2 3 0 1 0 0
mean(IR2,IR3) 4 0 1 1 0
mean(IR3,IR4) 5 0 0 1 0
mean(IR4,IR5) 6 0 0 1 1
IR5 7 0 0 0 1
IR6 8 1 0 0 1
Table 6.6: Mapping between e-puck IR proximity sensors and BN input
nodes in the phototaxis and antiphototaxis case study. The minimum among
the values in the first column determines the agent's percept ID and the
BN input node values.
Finally, the BN output node values determine the velocity of e-puck
wheels according to the same mapping as in simulation (see Table 6.5).
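Table 6.6 can be implemented as an argmin over the combined readings. We assume here that a smaller reading corresponds to a stronger light in the given direction, which is the convention implied by the table's "minimum" rule:

```python
def percept_from_ir(ir):
    """ir: the 8 e-puck proximity readings IR0..IR7; returns the percept
    ID (1..8) whose combined reading, per Table 6.6, is the minimum."""
    candidates = [
        ((ir[0] + ir[7]) / 2, 1),
        (ir[1], 2),
        (ir[2], 3),
        ((ir[2] + ir[3]) / 2, 4),
        ((ir[3] + ir[4]) / 2, 5),
        ((ir[4] + ir[5]) / 2, 6),
        (ir[5], 7),
        (ir[6], 8),
    ]
    return min(candidates)[1]
```

The resulting percept ID is then mapped onto input nodes x2..x5 exactly as in simulation (Table 6.3).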
6.2.5 First training: Zoolander
The goal of the training is to obtain an agent able to achieve the task
regardless of its specific initial position, its orientation and the instant
of time at which the clap is perceived.
In order to reach this goal, we use a training set composed of 30 dif-
ferent initial conditions for each experiment. Such conditions are generated
randomly according to the following parameters:
• Position: x ∈ [0.5, 1] ∧ y ∈ [0.5, 1];
• Orientation: θ ∈ [0, 2π];
• Clap time: tc ∈ [400, 600].
For each independent experiment we execute 100000 iterations of the op-
timisation algorithm, corresponding to about 4 days of computation time.
However, in some experiments 5000 iterations (i.e., 5 hours of computation
time) are enough to obtain a BN with a good performance measure, that is,
a median of the error function below 0.1.
At the end of the training, we test the obtained agent programs on further
initial conditions to verify their actual ability to achieve the task
regardless of the initial conditions.
The results of this first training and the relative testing are shown in Fig-
ure 6.6: in the training graph (Figure 6.6(a)) we can observe that the 6 BNs
on the left obtain a good performance measure, i.e., the medians of their
error functions are below 0.1. Concerning the testing results (Figure 6.6(b)),
the performance measures are in general somewhat worse, but for the first 6
BNs the medians of the error functions are still below 0.1, as in the
training graph.
Porting the best BN obtained onto the e-puck, we immediately observe
that the robot successfully performs the task only in some specific cases.
Precisely, while it is performing phototaxis, the light may initially be in
front of the robot and, after some time, be perceived off-centre: if the
robot perceives the light on its right, then it is able to correct its orien-
tation and straighten up towards the light1; otherwise, if the light is
perceived on its left, then the robot starts to perform antiphototaxis
without having perceived any clap2. Figure 6.6(b) reflects this problem:
since the BN used on the e-puck is the first one on the left, note that
there are several outliers (i.e., high error function values).
We label the result obtained from this training Zoolander, because the
obtained behaviour recalls the main character of the movie of the same
name [52]: a male model who is unable to turn left.
1 See Video 1 available at http://iridia.ulb.ac.be/supp/IridiaSupp2010-010/
2 See Video 2 available at http://iridia.ulb.ac.be/supp/IridiaSupp2010-010/
[Boxplots of error function values for the 30 trained BNs ordered by median: (a) training, (b) testing on the 30 trained BNs.]
Figure 6.6: Results of the first training (a) and the relative testing (b).
In order to fix this problem, we apply a new training on the same BNs,
adding a noise component to increase the robustness of the agent program.
6.2.6 Second training: Garrincha
The goal of this training is to fix the Zoolander problem, i.e., the robot
must be able to straighten up towards the light whether it perceives the
light on its left or on its right. In order to reach this goal, we apply
the same training set as in the first training (described in Section 6.2.5)
with the addition of actuation noise. This noise consists of a change of
the agent's orientation at a certain instant of time tn during the
simulation. Let θn be the orientation variation; the value of θn must be
such that the light is perceived off-centre at tn + 1. Thus, each element
of the training set is characterised by the following parameters:
• Position: x ∈ [0.5, 1] ∧ y ∈ [0.5, 1];
• Orientation: θ ∈ [0, 2π];
• Clap time: tc ∈ [400, 600];
• Orientation variation: θn ∈ [−π/8, π/8];
• Noise time: tn ∈ [100, 400].
Moreover, considering that in some experiments of the Zoolander training
we obtained good performance after 5 hours of computation time, in this
training we decide to reduce the number of iterations of the optimisation
algorithm from 100000 to 25000, corresponding to 1 day of computation
time.
At the end of the experiments, we notice that the obtained agent pro-
grams are able to correct their attitude with respect to the light after the
application of the actuation noise, but they are not able to perform
antiphototaxis after perceiving the clap. To fix this behaviour, we repeat
the same experiments, but split the training into two phases:
• in the first 5000 iterations of the optimisation algorithm, the simula-
tions last only 500 time steps and no clap is presented. The goal is to
obtain agent programs able to perform phototaxis and robust to
actuation noise;
• in the subsequent 20000 iterations, the simulations last 1000 time steps
and they involve the clap. Thus, the idea is to train the agent gradually.
In the testing, we also add a sensor noise component in order to verify
more strictly the robustness of the obtained agent programs.
The results of training and testing are shown in Figure 6.7: since it is
harder to achieve the task in a noisy environment than in an ideal one,
the performance measure values are in general worse than in the first
training (Figure 6.6). However, observing the second training graph
(Figure 6.7(a)), we note 2 BNs with a median of the error functions close
to 0.1 and a compact span of values. Concerning the testing graph
(Figure 6.7(b)), since we add a sensor noise component not present in the
training, worse performance is to be expected. Nevertheless, also in
testing we obtain 1 BN with a good performance measure, both in terms of
its error function median and in terms of the compactness of its span of
values. Moreover, note that most of the BNs show a median of the error
functions equal to 0.5: this is because most of the obtained agent programs
are able to perform phototaxis, but are not able to perform antiphototaxis
at all. Nevertheless, since our goal is to provide a proof of concept, that
is, an agent program based on a BN that achieves the task, in this thesis
we do not try to obtain a higher percentage of successful BNs. However, it
is clear that some features of our methodology can be improved or changed
in order to obtain a higher number of successes.
Porting the best BN obtained onto the e-puck robot, we notice that the
[Boxplots of error function values for the 30 trained BNs ordered by median: (a) training, (b) testing on the 30 trained BNs.]
Figure 6.7: Results of the second training attempt (a) and the relative testing
(b).
agent program presents an imperfection concerning the bearing of the robot:
its movement is not smooth but meandering, because the dynamics of this
specific BN activates the robot's wheels alternately3. We label the result
obtained from this training Garrincha, because the meandering movement
of the robot recalls the famous Brazilian footballer Garrincha [51].
To solve this imperfection we apply a post-processing technique: each
wheel is activated or deactivated according to the average of the
corresponding output node values over a temporal window. In this way, the
robot moves smoothly and achieves the task, showing two important
properties4:
• every time the robot perceives the light off-centre, either on its left or
on its right, it is able to correct its attitude with respect to the light
and to achieve the task;
• the robot clearly shows memory of its previous perceptions: if we turn
the robot by π, either during phototaxis or antiphototaxis, it is able to
recover the correct position and to achieve the task.
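The post-processing can be sketched as a moving average over each output node; the window length and the 0.5 activation threshold below are our assumptions, as this chapter does not report the exact values:

```python
from collections import deque

class SmoothedWheel:
    """Drive a wheel from the average of the last `window` output-node
    values instead of the instantaneous Boolean value."""
    def __init__(self, window=5):
        self.history = deque(maxlen=window)

    def update(self, output_bit):
        self.history.append(output_bit)
        # Wheel is ON when at least half the recent outputs were 1.
        return sum(self.history) / len(self.history) >= 0.5
```

One such filter per wheel suffices: the BN itself is left untouched, only the actuator mapping changes.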
6.2.7 Third training: back and forth
Although we have already reached the goal of obtaining a memory-based agent
program with our methodology, we try to make the task more complex by
increasing the number of claps within the same simulation. In particular,
we execute a new training, with the same parameters as the previous ones,
but presenting 3 claps in the same simulation. This requires a longer
simulation time than the previous training, because with three claps the
agent has to perform two phototaxis and two antiphototaxis phases. Consid-
ering the speed of the agent and the size of the arena, we estimated that a
3 See Video 3 available at http://iridia.ulb.ac.be/supp/IridiaSupp2010-010/
4 See Video 4 available at http://iridia.ulb.ac.be/supp/IridiaSupp2010-010/
[Boxplots of error function values for the 30 trained BNs ordered by median.]
Figure 6.8: Results of the third training attempt.
simulation lasting 2500 steps is enough (T = 2500). Then, let t′c and t′′c be
the instants of time in which the second and the third claps take place; such
parameters are randomly generated in the following intervals:
• t′c ∈ [1150, 1300];
• t′′c ∈ [1700, 1850].
The results of this training are shown in Figure 6.8: since the task is more
complex, it is expected that the performance measures are in general worse
than in the previous trainings. However, we obtain 2 BNs with a median of
the error functions less than or equal to 0.1.
Porting the best BN obtained onto the e-puck, we notice that the robot
achieves the task5.
Finally, the results obtained in the phototaxis and antiphototaxis case
study represent a proof of concept of the proposed methodology. In Chap-
ter 7, we describe the conclusions of this thesis and outline future work
aimed at improving the methodology, concerning both the agent program
model (i.e., the BN) and the optimisation algorithm.
5 See Video 5 available at http://iridia.ulb.ac.be/supp/IridiaSupp2010-010/
7. Conclusions
In this thesis we have proposed an automatic methodology to synthesise
robotic agent programs. Such techniques are characterised by two main
components: the agent program model and the optimisation algorithm. In
our methodology, the agent program model is given by a Boolean network
(BN) and the optimisation algorithm consists of a metaheuristic technique.
Among metaheuristics, in this work we have adopted a stochastic local
search algorithm.
We have modelled BN design as a constrained combinatorial optimi-
sation problem by properly defining the set of decision variables, the
constraints and the objective function. Precisely, the Boolean functions of
the nodes represent the decision variables; they are initially randomly
generated and the optimisation algorithm must determine their optimal
values according to the objective function.
The methodology has been first validated by experiments on abstract
case studies. Analysing the BNs obtained by automatic design, we noticed
their tendency towards the critical dynamical regime, which is the most
interesting and most studied one, because researchers conjecture that
living beings are also in a critical state, referring to this concept as
"life at the edge of chaos". Moreover, the critical regime efficiently
combines robustness and flexibility.
After the validation, we applied the methodology to robotics case studies.
In order to acquire familiarity with this context, we experimented with a
simple task (i.e., path following), in which the robot selects its actions
based only on current sensory inputs. We obtained good performance in a
short computation time, so we decided to move on to more complex scenarios.
In particular, we focused on a task (i.e., phototaxis and antiphototaxis)
in which the robot is required to keep a sort of internal memory to achieve
a given goal. At the end of the automatic design process, the obtained
agent programs were ported onto a real robotic platform and we undertook
extensive experiments in order to evaluate their effectiveness. The
obtained results show that BN dynamics is suitable for realising complex
behaviours notwithstanding the simplicity of the model. Moreover, the
results have been obtained in limited computation time by a simple
stochastic local search algorithm. However, the goal of this thesis has
been to provide a proof of concept, without focusing on statistical
properties of the methodology, such as its success rate on robotics case
studies.
Future work will consist of analysing the BNs obtained by automatic de-
sign in order to understand their dynamical properties and reuse them as
building blocks for future, even more complex, robotics applications.
In order to improve the methodology, we plan to experiment with variants
of the adopted BN model (e.g., asynchronous dynamics and probabilistic up-
date rules). Moreover, both the number of BN nodes and the BN topology
could be subject to automatic design, as well as the Boolean functions.
Improvements will also concern the optimisation algorithm: for example, we
will develop hybrid techniques with the aim of addressing advanced design
goals. We plan to pursue this research by decomposing the problem into
interdependent sub-problems and solving each of them with the most
suitable technique. For example, BN design can be decomposed into two
phases: in the first phase, the BN architecture is defined (for instance by
an Evolutionary Algorithm or Ant Colony Optimisation) and in the second
phase the functions regulating the BN are defined (for instance, by means
of an Iterated Local Search).
A. The e-puck robot
The e-puck (Figure A.1) is a robot designed by Dr. Francesco Mondada and
Michael Bonani in 2006 at EPFL with the purpose of being both a research
and an educational tool in universities [37]. Both the hardware design and
the software libraries have been entirely developed as open-source projects;
free access to the related documents has greatly increased its success and
has stimulated researchers to develop their own extensions to enrich the
robot's capabilities.
The core of the robot is a dsPIC processor, a low-energy microcontroller
produced by Microchip Technology. In addition, Microchip provides a com-
plete toolchain based on the popular GCC compiler [49]. As shown in Table
A.1, a standard e-puck is equipped with several sensors and actuators.
Skilled readers will have noted the main bottlenecks of this system:
a slow CPU and a small amount of RAM constrain the computational ca-
pabilities. For example, in the context of this thesis, we can port onto
the e-puck Boolean networks (i.e., the agent program) with a maximum
number of nodes (i.e., 216) and a maximum number of ingoing arcs per node
(16). These limits become more evident when we compare the e-puck
with other robots commonly used in education such as, for example, the
Mindstorms NXT. In fact, the Lego Mindstorms NXT provides a powerful
32-bit ARM7 microcontroller with 64 KB of RAM, eight times the amount
provided by the e-puck [33].
Figure A.1: The e-puck robot.
Dimensions 70 mm diameter, 55 mm height, 150 g
Battery autonomy 5Wh LiION providing about 3 hours autonomy
Processor dsPIC 30F6014A @ 60 Mhz (∼15 MIPS)
16 bit microcontroller with DSP core
Memory 8 KB RAM, 144 KB FLASH
Motors 2 stepper motors
Speed Max: 15 cm/s
IR sensors 8 infra-red sensors for ambient light
and proximity measuring
Camera VGA color camera with resolution of 480x640
Microphones 3 omni-directional microphones
Accelerometer 3D accelerometer along the X, Y and Z axis
LEDs 8 red LEDs on the ring, green LEDs in the body,
1 strong red LED in front
Speaker On-board speaker capable of WAV
and tone sound playback
Connections Serial port, Bluetooth
API language GNU C, C99 (partially)
Table A.1: e-puck features [9]
Bibliography
[1] E. Alba and R. Marti. Metaheuristic procedures for training neural net-
works. Springer-Verlag New York Inc, 2006.
[2] M. Aldana, E. Balleza, S.A. Kauffman, and O. Resendiz. Robustness
and evolvability in genetic regulatory networks. Journal of theoretical
biology, 245(3):433–448, 2007.
[3] U. Bastolla and G. Parisi. A numerical study of the critical line of
Kauffman networks. Journal of theoretical biology, 187(1):117–133, 1997.
[4] R.D. Beer and J.C. Gallagher. Evolving dynamical neural networks for
adaptive behavior. Adaptive behavior, 1(1):91, 1992.
[5] H. Bersini and V. Calenbuhr. Frustrated chaos in biological networks.
Journal of theoretical biology, 188:187–200, 1997.
[6] C. Blum, M.J.B. Aguilera, A. Roli, and M. Sampels. Hybrid Metaheuris-
tics: An Emerging Approach to Optimization. Studies In Computational
Intelligence, page 290, 2008.
[7] C. Blum and A. Roli. Metaheuristics in combinatorial optimiza-
tion: Overview and conceptual comparison. ACM Computing Surveys
(CSUR), 35(3):268–308, 2003.
[8] S. Braunewell and S. Bornholdt. Reliability of genetic networks is evolv-
able. Physical Review E, 77(6):60902, 2008.
[9] CYBEROBOTICS. e-Puck specification, 2007.
[10] B. Derrida and G. Weisbuch. Evolution of overlaps between configu-
rations in random Boolean networks. Journal de physique, 47(8):1297–
1303, 1986.
[11] M. Dorigo. Learning by probabilistic boolean networks. In Proceedings
of World Congress on Computational Intelligence – IEEE International
Conference on Neural Networks, pages 887–891, Orlando, Florida, USA,
1994.
[12] J. Dreo. Metaheuristics for hard optimization: methods and case studies.
Springer Verlag, 2006.
[13] J.L. Elman. Finding structure in time. Connectionist Psychology: A
Text with Readings, 1999.
[14] A. Esmaeili and C. Jacob. Evolution of discrete gene regulatory models.
In Proceedings of the 10th annual conference on Genetic and evolution-
ary computation, pages 307–314. ACM, 2008.
[15] C. Fretter, A. Szejka, and B. Drossel. Perturbation propagation in ran-
dom and evolved boolean networks. May 2009.
[16] C. Fretter, A. Szejka, and B. Drossel. Perturbation propagation in ran-
dom and evolved Boolean networks. New Journal of Physics, 11:033005,
2009.
[17] C. Gershenson. Introduction to random boolean networks. CoRR,
nlin.AO/0408006, 2004.
[18] D.E. Goldberg. Genetic algorithms in search, optimization, and machine
learning. Addison-Wesley Professional, Upper Saddle River,NJ, USA,
1989.
[19] D. Graupe. Principles of artificial neural networks. World Scientific Pub
Co Inc, 2007.
[20] I. Harvey, E.D. Paolo, R. Wood, M. Quinn, and E. Tuci. Evolutionary
robotics: A new scientific tool for studying cognition. Artificial Life,
11(1-2):79–98, 2005.
[21] S. Haykin. Neural Networks: A Comprehensive Foundation. Prentice
Hall, 1999.
[22] P.F. Hingston, L.C. Barone, and M. Zbigniew. Design by Evolution:
Advances in Evolutionary Design. Natural Computing Series, page 352,
2008.
[23] J.H. Holland. Adaptation in Natural and Artificial Systems: An Intro-
ductory Analysis with Applications to Biology, Control, and Artificial
Intelligence. The MIT Press, April 1992.
[24] H.H. Hoos and T. Stutzle. Towards a characterisation of the behaviour
of stochastic local search algorithms for SAT. Artificial Intelligence,
112(1-2):213–232, 1999.
[25] H.H. Hoos and T. Stutzle. Stochastic local search: Foundations and
applications. Morgan Kaufmann, 2005.
[26] F. Hutter, H.H. Hoos, K. Leyton-Brown, and T. Stutzle. ParamILS:
an automatic algorithm configuration framework. Journal of Artificial
Intelligence Research, 36(1):267–306, 2009.
[27] S.A. Kauffman. Metabolic stability and epigenesis in randomly con-
nected genetic nets. Journal of Theoretical Biology, 22:437–467, 1968.
[28] S.A. Kauffman. Requirements for evolvability in complex systems: or-
derly dynamics and frozen components. Physica D: Nonlinear Phenom-
ena, 42(1-3):135–152, 1990.
[29] S.A. Kauffman. Antichaos and adaptation. Scientific American,
265(2):78, 1991.
[30] S.A. Kauffman. The Origins of Order: Self-Organization and Selection in Evolution. Oxford University Press, USA, 1st edition, June 1993.
[31] S.A. Kauffman. Investigations. Oxford University Press, USA, 2002.
[32] S.A. Kauffman and R.G. Smith. Adaptive automata based on Darwinian
selection. Physica D: Nonlinear Phenomena, 22(1-3):68–82, 1986.
[33] F. Klassner. A case study of LEGO Mindstorms™ suitability for artificial intelligence and robotics courses at the college level. In SIGCSE ’02: Proceedings of the 33rd SIGCSE technical symposium on Computer science education, pages 8–12, New York, NY, USA, 2002. ACM.
[34] C.G. Langton. Computation at the edge of chaos: phase transitions and emergent computation. Physica D: Nonlinear Phenomena, 42(1-3):12–37, 1990.
[35] N. Lemke, J. Mombach, and B.E.J. Bodmann. A numerical investigation of adaptation in populations of random Boolean networks. Physica A: Statistical Mechanics and its Applications, 301(1-4):589–600, 2001.
[36] M. Mataric and D. Cliff. Challenges in evolving controllers for physical
robots. Robotics and autonomous systems, 19(1):67–84, 1997.
[37] F. Mondada, M. Bonani, X. Raemy, J. Pugh, C. Cianci, A. Klaptocz, S. Magnenat, J.C. Zufferey, D. Floreano, and A. Martinoli. The e-puck, a robot designed for education in engineering. In Proceedings of the 9th Conference on Autonomous Robot Systems and Competitions, volume 1, pages 59–65, Portugal, 2009. IPCB: Instituto Politécnico de Castelo Branco.
[38] S. Montagna and A. Roli. Parameter tuning of a stochastic biological simulator by metaheuristics. In AI*IA 2009: Emergent Perspectives in Artificial Intelligence, pages 466–475, 2009.
[39] G. Nicosia and E. Sciacca. Robust parameter identification for biological circuit calibration. In 8th IEEE International Conference on BioInformatics and BioEngineering (BIBE 2008), pages 1–6, 2008.
[40] S. Nolfi and D. Floreano. Evolutionary Robotics: The Biology, Intelligence, and Technology of Self-Organizing Machines. MIT Press/Bradford Books, Cambridge, MA, 2000.
[41] S. Patarnello and P. Carnevali. Learning networks of neurons with Boolean logic. Europhysics Letters, 4(4):503–508, 1987.
[42] A. Roli, C. Arcaroli, M. Lazzarini, and S. Benedettini. Boolean Networks
Design by Genetic Algorithms.
[43] A. Roli and M. Milano. MAGMA: A multiagent architecture for metaheuristics. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 34(2), 2004.
[44] S. Russell and P. Norvig. Artificial Intelligence: A Modern Approach. Prentice-Hall, Englewood Cliffs, NJ, 2nd edition, 2003.
[45] R.V. Solé and B. Luque. Phase transitions and antichaos in generalized Kauffman networks. Physics Letters A, 196(1-2):331–334, 1994.
[46] R.V. Solé, B. Luque, and S.A. Kauffman. Phase transitions in random networks with multiple states. Working papers, Santa Fe Institute, 2000.
[47] S.H. Strogatz. Nonlinear dynamics and chaos: With applications to physics, biology, chemistry, and engineering. Westview Press, 2000.
[48] A. Szejka and B. Drossel. Evolution of canalizing Boolean networks. The European Physical Journal B - Condensed Matter and Complex Systems, 56(4):373–380, 2007.
[49] Microchip Technology. dsPIC Language Tools Libraries, 2004.
[50] T.L. Fine. Feedforward neural network methodology. Springer-Verlag, 1999.
[51] Wikipedia. Garrincha — Wikipedia, the free encyclopedia, 2010. [Online; accessed 05-July-2010].
[52] Wikipedia. Zoolander — Wikipedia, the free encyclopedia, 2010. [Online; accessed 05-July-2010].
[53] L. Xu, F. Hutter, H.H. Hoos, and K. Leyton-Brown. SATzilla: portfolio-based algorithm selection for SAT. Journal of Artificial Intelligence Research, 32(1):565–606, 2008.
List of Figures
2.1 Agent representation
3.1 Example of Boolean Network
3.2 Boolean Network dynamics
3.3 Relationship between homogeneity p and number of ingoing arcs K
4.1 Methodology approach
5.1 Run length distribution of the first preliminary case study
5.2 Function f(t; γ) used in the second preliminary case study
5.3 Run length distribution of the second case study
5.4 Function f(t; γ) used in the third and fourth preliminary case studies
5.5 Run length distribution of the third preliminary case study
5.6 Run length distribution of the fourth preliminary case study
6.1 Path follower environment
6.2 An example of successful path following on a circular path
6.3 Phototaxis and antiphototaxis environment
6.4 Agent light sensors exploited in the phototaxis and antiphototaxis case study
6.5 e-puck proximity sensors
6.6 Results of the first training and the relative testing
6.7 Results of the second training attempt and the relative testing
6.8 Results of the third training attempt
A.1 The e-puck robot
List of Tables
5.1 Derrida’s parameters for BNs initially ordered
5.2 Derrida’s parameters for BNs initially critical
5.3 Derrida’s parameters for BNs initially chaotic
6.1 Sensor mapping for the path following task
6.2 Actuator mapping for the path following task
6.3 Light sensor mapping for the phototaxis and antiphototaxis case study
6.4 Clap sensor mapping for the phototaxis and antiphototaxis case study
6.5 Actuator mapping for the phototaxis and antiphototaxis case study
6.6 e-puck IR proximity sensor mapping for the phototaxis and antiphototaxis case study
A.1 e-puck features