evolution of coordination and communication in groups of embodied agents

Evolution of Coordination and Communication in Groups of Embodied Agents A PhD Thesis Presentation by Olaf Witkowski Department of Computer Science University of Tokyo 19 January 2015

Upload: olaf-witkowski

Post on 17-Jul-2015


TRANSCRIPT

Page 1: Evolution of Coordination and Communication in Groups of Embodied Agents

Evolution of Coordination and Communication in Groups of Embodied Agents

A PhD Thesis Presentation by Olaf Witkowski

Department of Computer Science

University of Tokyo

19 January 2015

Page 2: Evolution of Coordination and Communication in Groups of Embodied Agents

Introduction

• Biological cells, insect swarms, and bird flocks all self-organize into groups that display collective behavior.

• Individuals interacting together produce adaptive behavior, i.e. behavior that increases their chances of survival and reproduction.

2

Myxobacteria form wolf packs to share digestive enzymes

Honey bees exchange information to optimize foraging

Weaver ants build bridges with their own bodies

Bigeye fish form schools to avoid predation

Page 3: Evolution of Coordination and Communication in Groups of Embodied Agents

Research questions

• Under which conditions does collective behavior emerge in a group of autonomous agents?

• Can individuals work together more effectively when they rely on a communication system?

3

Page 4: Evolution of Coordination and Communication in Groups of Embodied Agents

Significance is twofold

• This thesis is relevant for both scientific and technological purposes.

• First, it helps shed light on the evolution of coordination and communication.

• Second, a better understanding of the fundamental principles of collective behavior may also lead to innovative methods in multi-agent systems, ubiquitous computing devices and swarm computation.

4

Page 5: Evolution of Coordination and Communication in Groups of Embodied Agents

Outline

• Introduction & Background

• Methods

• Gene-culture coevolution (ch. 7)

• Synchronization vs. variability (ch. 6)

• 3D signal-swarming models (ch. 4)

• 3D spatial Prisoner’s Dilemma (ch. 5)

• Conclusion

(chs. 4–7 above: contributions)

5

Page 6: Evolution of Coordination and Communication in Groups of Embodied Agents

Methods

6

Page 7: Evolution of Coordination and Communication in Groups of Embodied Agents

Methods

• 3-block model = Darwinian evolution + “Robots” + Environment

7


Darwinian evolution

Robots

Environment

Page 8: Evolution of Coordination and Communication in Groups of Embodied Agents

Methods — Agent-Based Model (ABM)

• Agent-based modeling: computational models which simulate the actions and interactions of autonomous creatures in a simulated environment.

• The agent’s actions affect its survival, just like in real environments.

8

Example of ABM by Wischmann, Floreano & Keller (2012)

Page 9: Evolution of Coordination and Communication in Groups of Embodied Agents

Methods — Artificial Neural Networks (ANN)

• Artificial neural network (McCulloch & Pitts 1943, Rosenblatt 1958)

9

[Diagram: neural network (“brain”), neuron, connection weights]

Page 10: Evolution of Coordination and Communication in Groups of Embodied Agents

Methods — Genetic Algorithm (GA)

• Connection weights are encoded in a genotype and evolved by a genetic algorithm (Fisher 1958, Holland 1995).

10

Genotype = vector of ANN connection weights = (w1, w2, …, wn)

The fitness value of each genotype is determined by the agent’s performance on a predefined task.

[Diagram: population of genotypes (w1 w2 w3 … wn), evolution environment, GA operators]
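The loop described on this slide can be sketched as follows. This is a minimal illustration, not the thesis implementation: the function `evolve`, the toy fitness, truncation selection, one-point crossover, and all numeric values are assumptions chosen for brevity.

```python
import random

def evolve(fitness, n_weights=8, pop_size=20, generations=30,
           mutation_rate=0.05, sigma=0.3, seed=0):
    """Evolve real-valued genotypes (stand-ins for ANN weight vectors)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(n_weights)]
           for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        history.append(fitness(scored[0]))
        elite = scored[: pop_size // 2]           # truncation selection (assumed)
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_weights)     # one-point crossover (assumed)
            child = a[:cut] + b[cut:]
            # Gaussian mutation applied independently to each gene
            child = [w + rng.gauss(0, sigma) if rng.random() < mutation_rate else w
                     for w in child]
            children.append(child)
        pop = elite + children                    # elite survive unchanged
    return max(pop, key=fitness), history

def toy_fitness(genotype):
    """Hypothetical task: weights should approach 0.5."""
    return -sum((w - 0.5) ** 2 for w in genotype)

best, history = evolve(toy_fitness)
```

In an evolutionary robotics setting, `toy_fitness` would be replaced by running the agent in its environment and scoring its behavior.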

Page 11: Evolution of Coordination and Communication in Groups of Embodied Agents

Methods — Evolutionary Robotics (ER)

• Evolutionary robotics = Genetic algorithm + Agent-based modeling

11


Darwinian evolution

Robots

Environment


Cliff, Harvey & Husbands (1992) Floreano & Mondada (1994)

Page 12: Evolution of Coordination and Communication in Groups of Embodied Agents

Methods - Asynchronous GA

• The generations of genotypes are overlapping: each agent’s fitness is evaluated every iteration.

• When the agent gets enough energy, it replicates: the offspring is added to the running simulation.
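A rough sketch of such an asynchronous scheme is given below. The energy gain rule, the energy split between parent and offspring, the population cap, and every numeric value are illustrative assumptions, and `run_async` is a hypothetical name.

```python
import random

def run_async(steps=150, start_pop=10, threshold=10.0, cost=0.01,
              max_pop=100, seed=1):
    """Overlapping generations: agents gain or lose energy every iteration
    and replicate as soon as they cross the energy threshold."""
    rng = random.Random(seed)
    # One number stands in for the whole genotype (assumption for brevity).
    agents = [{"gene": rng.random(), "energy": 2.0} for _ in range(start_pop)]
    for _ in range(steps):
        newborns = []
        for agent in agents:
            # Hypothetical gain: a better gene collects more energy per step.
            agent["energy"] += 0.2 * agent["gene"] - cost
            can_grow = len(agents) + len(newborns) < max_pop
            if agent["energy"] >= threshold and can_grow:
                agent["energy"] /= 2              # parent shares its energy (assumed)
                gene = min(1.0, max(0.0, agent["gene"] + rng.gauss(0, 0.05)))
                newborns.append({"gene": gene, "energy": agent["energy"]})
        # Agents without energy die; offspring join the running simulation.
        agents = [a for a in agents if a["energy"] > 0] + newborns
    return agents

population = run_async()
```

There is no explicit generation counter: selection happens implicitly because agents that collect energy faster replicate more often.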

12

Page 13: Evolution of Coordination and Communication in Groups of Embodied Agents


Outline

• Introduction & Background

• Methods

• Gene-culture coevolution (ch. 7)

• Synchronization vs. variability (ch. 6)

• 3D signal-swarming models (ch. 4)

• 3D spatial Prisoner’s Dilemma (ch. 5)

• Conclusion

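(chs. 4–7 above: contributions)

13

Focus of each contribution:

• Gene-culture coevolution: generic gene/culture coordination (0D)

• Synchronization vs. variability: seasonal coordination through communication (0–2D)

• 3D models: spatial coordination with communication (3D)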

Page 14: Evolution of Coordination and Communication in Groups of Embodied Agents

Neutral selection in gene-culture coevolution

14

Goal: analyze the evolution of generic communication in a gene-culture model

Signal matching task

Page 15: Evolution of Coordination and Communication in Groups of Embodied Agents

Spread of Indo-European languages through time

Bouckaert et al. (2012), Mapping the Origins and Expansion of the Indo-European Language Family, Science, vol. 337, no. 6097, pp. 957-960.

15

Page 16: Evolution of Coordination and Communication in Groups of Embodied Agents

• Gene-culture models have been used to investigate language evolution, due to the lack of empirical data (Boyd & Richerson 1992, Christiansen & Kirby 2003).

• We use a genetic algorithm, artificial neural networks, and different social networks for learning.

16

Signal matching task

Neutral selection in gene-culture coevolution

Page 17: Evolution of Coordination and Communication in Groups of Embodied Agents

Neutral selection in gene-culture coevolution

17

• Agents produce signals

• Agents need to match their signals with their neighbours

• Best performing agents are selected and replicated through a genetic algorithm

[Diagram: three agents each emitting a signal, with a match between neighbouring signals]

Page 18: Evolution of Coordination and Communication in Groups of Embodied Agents

Neutral selection in gene-culture coevolution

• Culture: each agent learns by imitating its neighbors’ signals

• Gene: each agent is then evaluated for reproduction

18

[Diagram: learning phase (learner and teacher on the social network), then evaluation phase (learner and neighbor)]

Page 19: Evolution of Coordination and Communication in Groups of Embodied Agents

• If the learned culture becomes uniform over the population, the selection pressure on the genes is relaxed, leading to a neutral selection space.

19

Neutral selection in gene-culture coevolution

Social networks: learning in lattice; fitness in lattice; reproduction in row

Genes = weights before learning

Cultures = weights after learning

[Diagram: time axis; reproduction network = rows; communication network = lattice]
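The gene/culture distinction can be illustrated with a deliberately simplified sketch. It assumes a ring instead of a lattice, scalar signals that are equal to the weights themselves (no neural network), and a hypothetical imitation rule; none of these choices come from the thesis.

```python
import random

def mismatch(signals):
    """Mean absolute difference between each agent's signal and the signal
    of its right-hand neighbor on a ring (lower = better matching)."""
    n = len(signals)
    return sum(abs(signals[i] - signals[(i + 1) % n]) for i in range(n)) / n

def learn(genes, rounds=50, rate=0.2):
    """Culture starts as a copy of the genes (weights before learning);
    each round, every agent moves toward the mean of its two neighbors."""
    culture = list(genes)
    n = len(culture)
    for _ in range(rounds):
        teachers = [(culture[i - 1] + culture[(i + 1) % n]) / 2
                    for i in range(n)]
        culture = [s + rate * (t - s) for s, t in zip(culture, teachers)]
    return culture  # weights after learning

rng = random.Random(0)
genes = [rng.uniform(-1, 1) for _ in range(20)]
culture = learn(genes)
before, after = mismatch(genes), mismatch(culture)
```

Because the evaluation only sees the learned signals, very different `genes` can end up with nearly identical `culture`, which is the intuition behind the relaxed selection pressure mentioned on the slide.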

Page 20: Evolution of Coordination and Communication in Groups of Embodied Agents

• In this model the agents’ task was directly to coordinate their communication.

• The results show neutral selection and offer new insights through analogies with the Potts model, oscillator theory, and swarming models.

Conclusion

20

• Next, we will go further by studying tasks that indirectly require coordination via communication.

Task

Page 21: Evolution of Coordination and Communication in Groups of Embodied Agents

Synchronization in dynamic environments

21

Goal: study agent strategies for variable resources, comparing energy saving with synchronisation via communication

[Diagram: resource variation, signal]

Page 22: Evolution of Coordination and Communication in Groups of Embodied Agents

Animal behavior in winter Source: National Geographic & BBC documentaries, 2014

22

Food hoarding

Bird migration

Hibernation

Page 23: Evolution of Coordination and Communication in Groups of Embodied Agents

• Population of agents in an environment with seasonal food availability

• Each agent controlled by a simple neural network evolved by genetic algorithm

Synchronization in dynamic environments

23

Simple neural network (Elman 1990)

Page 24: Evolution of Coordination and Communication in Groups of Embodied Agents

Synchronization in dynamic environments

24

3 experimental setups

• 1D, ring world: synced wake-up using signaling

• 2D, grid world: synced wake-up using signaling

• 0D, action-based: speciated resource-saving behaviors

[Figure: ring world with food patches FP-0 to FP-P and agents A-0 to A-N. Legend: FP-x: Food Patch x, x ∈ {0, ..., P}; A-y: Agent y, y ∈ {0, ..., N}; A-y(sv): signal value of Agent y, sv ∈ {0, ..., Patch Spacing}]

Page 25: Evolution of Coordination and Communication in Groups of Embodied Agents

Synchronization in dynamic environments

• Signaling agents showed better collective performances than non-signalling agents.

• The agents wake up from hibernation based on other agents’ signals.

25

[Figure: ring map (1D) and lattice map (2D) showing food and agents, in summer and in winter]

Page 26: Evolution of Coordination and Communication in Groups of Embodied Agents

Population vs. size vs. time: shows an evolutionarily stable strategy

26

• Without direct communication, agents develop specific strategies to survive winters.

• Strategies: fast reproduction, resource saving and hibernation.

Synchronization in dynamic environments

[Figure: number of individuals by agent’s size over time steps, for small, mid-sized and large agents. Action-cost model: cycles detected]

Page 27: Evolution of Coordination and Communication in Groups of Embodied Agents

• In dynamic environments, agents synchronize foraging with seasons using communication.

• Without direct communication, agents use specific strategies to save resources.

• Next, we will consider static resources in a minimalist system.

[Diagram: resource variation, signal]

Conclusion

27

Olaf Witkowski, Geoff Nitschke and Takashi Ikegami. July 2012. When is happy hour: An agent’s concept of time. Proceedings of the Thirteenth International Conference on The Synthesis and Simulation of Living Systems, 13, 544–545.

Olaf Witkowski and Geoff Nitschke. September 2013. The Transmission of Migratory Behaviors. Proceedings of the Twelfth European Conference on Artificial Life, 12, 1218–1220.

Olaf Witkowski and Nathanaël Aubert. July 2013. Size Does Matter: The Impact of Size on Hoarding Behaviour. Proceedings of the Thirteenth International Conference on The Synthesis and Simulation of Living Systems, 13, 542–543.

Page 28: Evolution of Coordination and Communication in Groups of Embodied Agents

Signal-based swarming

28

Goal: use minimalist 3D simulation to explore the emergence of swarming based on signaling


Page 29: Evolution of Coordination and Communication in Groups of Embodied Agents

29

Starling murmuration A Bird Ballet by Neels Castillon

Page 30: Evolution of Coordination and Communication in Groups of Embodied Agents

Signal-based swarming

• Reynolds’ basic flocking model (1986) consisted of three simple steering behaviors that determined how individual boids should maneuver based on their velocity and position within the flock.

30

Separation Alignment Cohesion
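The three rules can be written down in a few lines. The sketch below is a 2D simplification with hypothetical names (`steer`, `radius`, `min_dist`) and illustrative thresholds; it returns the three steering vectors separately rather than a weighted sum.

```python
def steer(i, positions, velocities, radius=5.0, min_dist=1.0):
    """Return (separation, alignment, cohesion) steering vectors for boid i."""
    px, py = positions[i]
    sep = [0.0, 0.0]
    avg_v = [0.0, 0.0]
    center = [0.0, 0.0]
    n = 0
    for j, (qx, qy) in enumerate(positions):
        if j == i:
            continue
        dx, dy = qx - px, qy - py
        dist = (dx * dx + dy * dy) ** 0.5
        if dist > radius:
            continue                  # only flockmates within the radius count
        n += 1
        if dist < min_dist:           # separation: move away from close boids
            sep[0] -= dx
            sep[1] -= dy
        avg_v[0] += velocities[j][0]
        avg_v[1] += velocities[j][1]
        center[0] += qx
        center[1] += qy
    if n == 0:
        return (0.0, 0.0), (0.0, 0.0), (0.0, 0.0)
    # alignment: steer toward the average heading of the neighbors
    alignment = (avg_v[0] / n - velocities[i][0], avg_v[1] / n - velocities[i][1])
    # cohesion: steer toward the average position of the neighbors
    cohesion = (center[0] / n - px, center[1] / n - py)
    return (sep[0], sep[1]), alignment, cohesion

positions = [(0.0, 0.0), (0.5, 0.0), (2.0, 0.0)]
velocities = [(1.0, 0.0), (0.0, 1.0), (0.0, 1.0)]
separation, alignment, cohesion = steer(0, positions, velocities)
```

A full flocking model would add a weighted sum of the three vectors to each boid's velocity at every step.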

Page 31: Evolution of Coordination and Communication in Groups of Embodied Agents

Signal-based swarming

• Gradual improvements of the model, adding rules or fixed leaders (Mataric 1992, Hartman & Benes 2006, Cucker & Huepe 2008, Su et al. 2009, Yu et al. 2010, Chiew et al. 2013)

• Swarming can be developed using an evolutionary robotics approach, often with complex sensors and pressures such as predators (Tu and Terzopoulos 1994, Ward et al. 2001, Olson et al. 2013)

31

Hartman & Benes (2006)

Page 32: Evolution of Coordination and Communication in Groups of Embodied Agents

Signal-based swarming

32

• In our 3D simulation, blind sound-emitting agents look for a hidden food resource. An asynchronous reproduction scheme is used to evolve the agents’ controllers.

• The model shows (a) the emergence of collective motion from the combination of a signaling system and a foraging task, and (b) that clustering improves the search.

Page 33: Evolution of Coordination and Communication in Groups of Embodied Agents

Signal-based swarming

• Each agent is equipped with 1 signaling device and 6 sensors.

• The sensors detect signals produced by other agents from 6 directions.

33

[Diagram: agent with one signal emitter and six receivers, numbered 1 to 6]

Page 34: Evolution of Coordination and Communication in Groups of Embodied Agents

Simulation — Agent survival & reproduction

Energy cost = 0.01 + [ 0.0 ; 0.001 ]

34

Energy > 10 -> replication with mutation

Energy = 0 (no energy) -> death

Energy gain: depends on the carrying capacity and the distance to goal

Survival

Page 35: Evolution of Coordination and Communication in Groups of Embodied Agents

Model — Neural controller

35

M1 = pitch; M2 = yaw; S = produced signal; S1..6 = sensed signals

Elman simple recurrent network architecture

(Elman 1990)
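As an illustration of this architecture, here is a minimal Elman-style forward pass with six sensed signals as inputs and three outputs (M1, M2, S). The hidden layer size, the tanh activations, the absence of biases, and the random weights are assumptions; in the thesis the weights are evolved rather than drawn at random.

```python
import math
import random

class ElmanController:
    """Simple recurrent network: the hidden layer also receives a copy of
    its own previous activation (the context units)."""

    def __init__(self, n_in=6, n_hidden=10, n_out=3, seed=0):
        rng = random.Random(seed)

        def rand(rows, cols):
            return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

        self.w_in = rand(n_hidden, n_in)        # input -> hidden
        self.w_ctx = rand(n_hidden, n_hidden)   # context -> hidden
        self.w_out = rand(n_out, n_hidden)      # hidden -> output
        self.context = [0.0] * n_hidden

    def step(self, sensed):
        hidden = [
            math.tanh(sum(w * s for w, s in zip(w_i, sensed))
                      + sum(w * c for w, c in zip(w_c, self.context)))
            for w_i, w_c in zip(self.w_in, self.w_ctx)
        ]
        self.context = hidden                   # remembered for the next step
        # outputs: [M1 (pitch), M2 (yaw), S (produced signal)]
        return [math.tanh(sum(w * h for w, h in zip(row, hidden)))
                for row in self.w_out]

ctrl = ElmanController()
silent = ctrl.step([0.0] * 6)                    # nothing sensed, empty context
heard = ctrl.step([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
echo = ctrl.step([0.0] * 6)                      # nothing sensed, but context remembers
```

The third call shows why the context units matter: with identical (empty) input, the output differs from the first call because the network still carries a trace of the signal it just heard.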

Page 36: Evolution of Coordination and Communication in Groups of Embodied Agents

Results — Emergence of swarming

• Agents self-organize into swarms without any other external control than the fitness they get from being closer to the goal.

• The agents go through three phases: (1) random motion, (2) dynamically changing clusters, and (3) a compact ball around the resource.

36

[Figure: snapshots of phases (1), (2) and (3)]

Page 37: Evolution of Coordination and Communication in Groups of Embodied Agents

Results — Neighborhood analysis

37

[Figure: average number of neighbors (10 runs) with signalling ON vs. OFF. x-axis: time steps (×10^5, 0 to 12); y-axis: average number of neighbors (0 to 0.5)]

Page 38: Evolution of Coordination and Communication in Groups of Embodied Agents

Results — distance to goal areas (signal on/off)

38

[Figure, two panels: average distance to goal at every iteration, for a regular run (signal on) and for the silent control simulation (signal off). x-axis: time steps (×10^5, 0 to 12); y-axis: distance to goal (0 to 500)]

Page 39: Evolution of Coordination and Communication in Groups of Embodied Agents

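• The transfer entropy (Schreiber 2000) T from a random process X to another process Y is a measure of the amount of directed transfer of information from X to Y:

\[
T_{X \to Y} = H\left(Y_t \mid Y_{t-1:t-L}\right) - H\left(Y_t \mid Y_{t-1:t-L},\, X_{t-1:t-L}\right)
\]

where H is the Shannon entropy (Shannon & Weaver 1949) and L is the length of the history taken into account.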
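For discrete time series, this quantity can be estimated by counting. The sketch below is a plug-in estimator under the assumption of a history length of 1; the function names and the synthetic test series are illustrative and are not the analysis code used for the results.

```python
import random
from collections import Counter
from math import log2

def conditional_entropy(samples):
    """Plug-in estimate of H(A | B) in bits from a list of (a, b) samples."""
    joint = Counter(samples)
    given = Counter(b for _, b in samples)
    n = len(samples)
    return -sum(count / n * log2(count / given[b])
                for (_, b), count in joint.items())

def transfer_entropy(x, y):
    """T_{X->Y} = H(Y_t | Y_{t-1}) - H(Y_t | Y_{t-1}, X_{t-1})."""
    own_past = [(y[t], y[t - 1]) for t in range(1, len(y))]
    both_pasts = [(y[t], (y[t - 1], x[t - 1])) for t in range(1, len(y))]
    return conditional_entropy(own_past) - conditional_entropy(both_pasts)

# Synthetic check: y copies x with a delay of one step, so information
# flows from x to y but not from y to x.
rng = random.Random(0)
x = [rng.randint(0, 1) for _ in range(5000)]
y = [0] + x[:-1]
te_xy = transfer_entropy(x, y)
te_yx = transfer_entropy(y, x)
```

On this synthetic pair, `te_xy` comes out close to one bit and `te_yx` close to zero, which is the asymmetry that makes the measure usable for distinguishing leaders from followers.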

Results — Transfer entropy

39

Page 40: Evolution of Coordination and Communication in Groups of Embodied Agents

Results — Measure of following behavior

40

[Figure: inward neighbourhood transfer entropy over time steps, signal on vs. signal off]

Page 41: Evolution of Coordination and Communication in Groups of Embodied Agents

Results — Measure of individual leadership

41

[Figure: outward neighborhood transfer entropy over time steps]

Page 42: Evolution of Coordination and Communication in Groups of Embodied Agents

Phylogenetic tree & neutral selection

42

[Figure: Principal Component Analysis (color = iteration, radius = swarming). Axes: PC 1, PC 2; color scale: simulation time]

Biplot of a PCA on genotypes of all agents in a typical run, over one million iterations. Each circle represents one agent’s genotype, the diameter representing the average number of neighbors around the agent over its lifetime, and the color showing its time of death.

Page 43: Evolution of Coordination and Communication in Groups of Embodied Agents


• In this chapter, we used a minimalist model to demonstrate the emergence of swarming behavior.

• The agents exchange signals in order to swarm together, which in turn improves their foraging.

Conclusion

43

Olaf Witkowski and Takashi Ikegami. Expected mid-2015 (In preparation). Signal-based swarming and neutral selection. Submitted to PLoS Computational Biology. <Paper>

Olaf Witkowski, Geoff Nitschke and Takashi Ikegami. January 2015 (In press). Signal drives genetic diversity: an agent-based approach to speciation. Proceedings of the Twentieth International Symposium on Artificial Life and Robotics, 20. <Paper, Talk>

Olaf Witkowski and Takashi Ikegami. July 2014. Asynchronous evolution: Emergence of signal-based swarming. Proceedings of the Fourteenth International Conference on the Simulation and Synthesis of Living Systems, 14, 302–309. <Paper, Talk>

• Next, we will explore the same model with a different task.

PD

Page 44: Evolution of Coordination and Communication in Groups of Embodied Agents

Swarming in dynamic 3D Prisoner’s Dilemma

44

Goal: find impact of cooperation/defection game on agents’ collective behavior


PD

Page 45: Evolution of Coordination and Communication in Groups of Embodied Agents

Food sharing in vampire bats

Attenborough, D. (2011). Friends and Rivals. BBC documentary.

45

Page 46: Evolution of Coordination and Communication in Groups of Embodied Agents

Iterated Prisoner’s Dilemma (IPD)

• Prisoner’s Dilemma (Flood & Dresher 1950): each player can Cooperate (C) or Defect (D)

• Iterated version (Axelrod 1984)

• Spatial version (Nowak & May 1993)

• Our version: dynamic & spatial

46

Spatial prisoner’s Dilemma

PD Reward matrix

Page 47: Evolution of Coordination and Communication in Groups of Embodied Agents

Dynamic Spatial IPD

47

• Agent moves on 3D map

• Agent controls direction (constant speed)

• Communication through signals (2 channels) to detect “friendly neighbors”

• Agent chooses to cooperate/defect
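Constant-speed motion with a controllable direction can be sketched as below. It assumes the heading is expressed as yaw and pitch angles, as with the M1/M2 outputs of the earlier controller; the function name `move` and the default speed are hypothetical.

```python
import math

def move(position, yaw, pitch, speed=1.0):
    """Advance one time step at constant speed along the heading given by
    yaw (rotation in the horizontal plane) and pitch (elevation), in radians."""
    x, y, z = position
    dx = speed * math.cos(pitch) * math.cos(yaw)
    dy = speed * math.cos(pitch) * math.sin(yaw)
    dz = speed * math.sin(pitch)
    return (x + dx, y + dy, z + dz)

start = (0.0, 0.0, 0.0)
forward = move(start, yaw=0.0, pitch=0.0)
climbing = move(start, yaw=0.7, pitch=0.3)
step_length = math.dist(start, climbing)
```

Whatever angles the controller outputs, the displacement keeps the same length, so an agent can only stay near a location by turning, for example by looping in circles.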

Cooperation (blue) or Defection (red)

Simulation visualization

Page 48: Evolution of Coordination and Communication in Groups of Embodied Agents

Differences with previous model

Ch. 4: Task: distance to resource. Reproduction: offspring added globally.

Ch. 5: Task: play Prisoner’s Dilemma. Reproduction: offspring added locally.

48

Page 49: Evolution of Coordination and Communication in Groups of Embodied Agents

Agent’s Controller

49

Outputs: Movement, Communication, Cooperate or Defect

Sensors

Hidden Units

Context Units

I12

Elman (1990)

Previous controller

Page 50: Evolution of Coordination and Communication in Groups of Embodied Agents

• We extend the reward per iteration from Chiong & Kirley (2012) to take into account spatial continuity:

Coop. vs Def. Costs & Payoff Matrix

50

Figure 2: Architecture of the agent’s controller, composed of 12 input neurons, 10 hidden neurons, 10 context neurons and 5 output neurons.

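The reward, which takes spatial continuity into account, is defined by:

\[
\begin{cases}
C: & b \displaystyle\sum_{\text{coop} \in \text{radius}} \frac{1}{1 + \text{distance}(\text{coop}, \text{me})} \;-\; c \sum_{\text{any} \in \text{radius}} \frac{1}{1 + \text{distance}(\text{any}, \text{me})} \\[2ex]
D: & b \displaystyle\sum_{\text{coop} \in \text{radius}} \frac{1}{1 + \text{distance}(\text{coop}, \text{me})}
\end{cases}
\tag{1}
\]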

With b the bonus, c the cooperation cost, b > c > 0, and distance the Euclidean distance between two agents. Radius represents the sphere of radius radius around the agent. Note that the agent itself is not considered part of its neighborhood. The distance is not part of the original fitness, which made sense since Chiong and Kirley (2012) base their simulation on a lattice, where the distance is always the same. Our version integrates nicely the fact that interactions with distant agents should be much weaker than with closer ones.

Another advantage of this fitness is that defection can also be assimilated to not playing (no cost). Note that there is also no cost and no reward for cooperating when alone.

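We can see that this fitness is equivalent to the traditional PD game, since, for two agents A and B at a distance d of each other, (1) yields the payoff matrix (rows: own move; columns: the other agent’s move):

\[
\begin{array}{c|cc}
 & C & D \\ \hline
C & \dfrac{b - c}{1 + d} & -\dfrac{c}{1 + d} \\[2ex]
D & \dfrac{b}{1 + d} & 0
\end{array}
\]

It is clear that for the conditions b > c > 0, this matrix corresponds to a PD.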
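The reward (1) and the resulting two-agent payoffs can be checked numerically. The sketch below is an illustration only: the function name `reward`, the data layout for neighbors, and the values b = 3, c = 1 and radius = 5 are assumptions (the excerpt only requires b > c > 0).

```python
import math

def reward(me, cooperates, others, b=3.0, c=1.0, radius=5.0):
    """Reward (1) for one agent at position `me`.
    `others` is a list of (position, cooperates) pairs for all other agents."""
    coop_sum = 0.0   # sum of 1 / (1 + d) over cooperators within the radius
    any_sum = 0.0    # sum of 1 / (1 + d) over all agents within the radius
    for position, other_cooperates in others:
        d = math.dist(me, position)
        if d <= radius:
            weight = 1.0 / (1.0 + d)
            any_sum += weight
            if other_cooperates:
                coop_sum += weight
    if cooperates:
        return b * coop_sum - c * any_sum
    return b * coop_sum

# Two agents at distance d = 1 reproduce the payoff matrix.
a, other = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
R = reward(a, True, [(other, True)])     # C vs C: (b - c) / (1 + d)
S = reward(a, True, [(other, False)])    # C vs D: -c / (1 + d)
T = reward(a, False, [(other, True)])    # D vs C: b / (1 + d)
P = reward(a, False, [(other, False)])   # D vs D: 0
```

With these values the usual Prisoner's Dilemma ordering T > R > P > S holds, and a lone cooperator (empty `others`) receives neither cost nor reward.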

Based on the outcome of the match, agents can choose a new direction, which is similar to leaving the group in the walk away strategy (Aktipis 2004), the main difference being that, in our case, it is also possible for groups to split. It is also similar in another aspect: there is a cost to leaving a group, as a lone agent may need time to meet others.

Table 1: Parameters used for the simulation.

Initial energy: 2
Maximum age: 5000
Maximum energy: 20
Maximum population size: 500
Population threshold: 100
Reproduction threshold: 10
Reproduction cost: 2
Reproduction radius: 2
Survival cost per turn: 2
Mutation rate (per gene): 0.05

Evolution/Parameters

Evolution is done continuously. Agents with negative or zero energy are removed, while agents with energy above a threshold are forced to reproduce, within the limits of one infant per time step. The reproduction cost is low enough, considering the threshold, to not put the life of the agent at risk. Table 1 indicates the various parameters used for evolution.

Results

Results were obtained on a set of 10 runs, with additional sets used for control. In our setting, all agents have a constant speed, but can choose in which direction they are heading. This allows for pseudo-static behaviors by looping in circles.

While some characteristics, such as agents’ movement, were strongly run dependent, the overall dynamics of the system was not. At the beginning of the run, the environment is seeded with random agents. Since all weights in their neural network are set at random, roughly half of the agents initially choose to cooperate while the other half choose to defect. This leads to a fast extinction of cooperators (Figure 3, until approximately 50000 time steps), until a group emerges strong enough to survive. The second phase follows, in which cooperators are quickly increasing in number due to the autocatalytic nature of this strategy (Figure 3). A third step happens eventually, where defectors invade the cluster, followed either by the survival of the cluster due to cooperators running away or a reboot of the cycle. In case of survival, oscillations in the proportion of cooperators can be observed. However, this phenomenon is averaged away over multiple runs, since period and phase of the oscillations are not correlated from one experiment to the other.

As a control, we ran the simulation after removing the possibility for agents to move. In this case, cooperators


Page 51: Evolution of Coordination and Communication in Groups of Embodied Agents

Simulation

51

Observed behaviors: (a) seek and destroy; (b) cluster with high mobility / high reproduction rate

Cooperation (blue) or Defection (red)

Simulation visualization

Page 52: Evolution of Coordination and Communication in Groups of Embodied Agents

Simulation - Cooperators increase

52

[Figure: cooperation proportion. x-axis: time steps; y-axis: proportion of cooperators in the population]

Cooperation (blue) or Defection (red)

Simulation visualization

Page 53: Evolution of Coordination and Communication in Groups of Embodied Agents

Simulation - Cooperators’ invasion

53

Cooperation (blue) or Defection (red)

Simulation visualization

Page 54: Evolution of Coordination and Communication in Groups of Embodied Agents

Simulation - Cooperators’ stronger signal

54

[Figure: signaling strength. Axis labels: time steps; proportion of cooperators in the population]

Cooperation (blue) or Defection (red)

Simulation visualization

Page 55: Evolution of Coordination and Communication in Groups of Embodied Agents

Simulation - Cooperators are moving faster

[Figure: Average displacement of agents over a 100-step sliding window. Axis labels: proportion of cooperators in the population; time steps.]
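One possible way to compute a sliding-window displacement measure is sketched below. The slide does not define the measure precisely, so this assumes it is the straight-line distance between an agent's positions 100 steps apart, averaged over agents; the function names and the two example trajectories are hypothetical.

```python
import math

def windowed_displacement(trajectory, window=100):
    """Net distance between positions `window` steps apart, for one agent.

    `trajectory` is a list of (x, y, z) positions, one per time step.
    """
    return [math.dist(trajectory[t], trajectory[t - window])
            for t in range(window, len(trajectory))]

def average_displacement(trajectories, window=100):
    """Average the windowed displacement over all agents at each time step."""
    per_agent = [windowed_displacement(tr, window) for tr in trajectories]
    return [sum(values) / len(values) for values in zip(*per_agent)]

# Hypothetical data: one agent drifting along x, one agent standing still.
n_steps = 300
mover = [(0.5 * t, 0.0, 0.0) for t in range(n_steps)]
idler = [(1.0, 2.0, 3.0)] * n_steps

avg = average_displacement([mover, idler])
print(len(avg), avg[0])
```

Here the drifting agent covers 50 units per window and the other agent none, so the average is 25 at every step. Note that this measures net displacement, not path length: an agent that moves quickly but circles back to its starting point would score close to zero.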

Page 56: Evolution of Coordination and Communication in Groups of Embodied Agents

Conclusion

• In this chapter, we gained the insight that cooperation requires grouping of collaborating agents.

• This grouping emerges as a degenerate form of the swarming behavior from the previous chapter, with agents using the communication channel to find other cooperators.


• Reminder of goals: add schema; conclusions summary

Olaf Witkowski and Nathanaël Aubert-Kato. July 2014. Pseudo-static cooperators: Moving isn’t always about going somewhere. Proceedings of the Fourteenth International Conference on the Simulation and Synthesis of Living Systems, 14, 392–397. <Paper, Talk>


Page 57: Evolution of Coordination and Communication in Groups of Embodied Agents

Conclusion


Page 58: Evolution of Coordination and Communication in Groups of Embodied Agents

Conclusion

Summary of the specific focus of every chapter:

• 3D signal-swarming models (ch. 4)

• 3D spatial Prisoner’s Dilemma (ch. 5)

• Synchronization vs variability (ch. 6)

• Gene-culture coevolution (ch. 7)

Page 59: Evolution of Coordination and Communication in Groups of Embodied Agents

Conclusion

• In this thesis, using evolutionary robotics, we demonstrated how groups of agents can evolve efficient collective behavior based on communication.

• Cooperation through the exchange of information is essential for groups of animals to optimize their behavior in an environment.

• Future swarm computation will need robots that are not directly controlled by human-written rules, but that interact with each other to solve problems.

Page 60: Evolution of Coordination and Communication in Groups of Embodied Agents

I am so thankful to…

Takashi Ikegami

Nathanaël Aubert-Kato, Geoff Nitschke, Julien Hubert, Luke McCrohon

Everyone in Ikegami Lab

Jun’ichi Tsujii, Reiji Suda, Masami Hagiya, all the committee members

My loving family & truly awesome friends

Page 61: Evolution of Coordination and Communication in Groups of Embodied Agents

Thank you

Page 62: Evolution of Coordination and Communication in Groups of Embodied Agents

Publications and conferences

Olaf Witkowski and Takashi Ikegami. Expected mid-2015 (In preparation). Signal-based swarming and neutral selection. PLoS Computational Biology. <Paper>

Olaf Witkowski, Geoff Nitschke and Takashi Ikegami. January 2015 (In press). Signal drives genetic diversity: an agent-based approach to speciation. Proceedings of the Twentieth International Symposium on Artificial Life and Robotics, 20. <Paper, Talk>

Olaf Witkowski and Nathanaël Aubert-Kato. July 2014. Pseudo-static cooperators: Moving isn’t always about going somewhere. Proceedings of the Fourteenth International Conference on the Simulation and Synthesis of Living Systems, 14, 392–397. <Paper, Talk>

Olaf Witkowski and Takashi Ikegami. July 2014. Asynchronous evolution: Emergence of signal-based swarming. Proceedings of the Fourteenth International Conference on the Simulation and Synthesis of Living Systems, 14, 302–309. <Paper, Talk>

Olaf Witkowski and Geoff Nitschke. September 2013. The Transmission of Migratory Behaviors. Proceedings of the Twelfth European Conference on Artificial Life, 12, 1218–1220. <Paper, Talk>

Olaf Witkowski and Nathanaël Aubert. July 2013. Size Does Matter: The Impact of Size on Hoarding Behaviour. Proceedings of the Thirteenth International Conference on The Synthesis and Simulation of Living Systems, 13, 542–543. <Extended Abstract, Talk>

Olaf Witkowski, Geoff Nitschke and Takashi Ikegami. July 2012. When is happy hour: An agent’s concept of time. Proceedings of the Thirteenth International Conference on The Synthesis and Simulation of Living Systems, 13, 544–545. <Extended Abstract, Poster>

Olaf Witkowski and Nathanaël Aubert. May 2012. Size Does Matter: The Impact of Size on Hoarding Behaviour. Bio UT International Life Sciences Symposium. <Abstract, Poster>

Olaf Witkowski, Geoff Nitschke and Takashi Ikegami. March 2012. Time To Migrate: The Effect of Lifespan on Imitation and Culturally Learned Migration. Seventh International Workshop on Natural Computing. <Abstract, Talk>

Luke McCrohon and Olaf Witkowski. August 2011. Devil in the details: Analysis of a coevolutionary model of language evolution via relaxation of selection. Advances in Artificial Life, ECAL 2011. Proceedings of the Eleventh European Conference on the Synthesis and Simulation of Living Systems, 522–529. <Paper, Talk>

Olaf Witkowski. September 2011. A Two-Speed Language Evolution: Exploring the Linguistic Carrying Capacity. Proceedings of Ways to Protolanguage 2 (Protolang 2011). <Paper, Talk>

Olaf Witkowski. July 2011. Can Cultural Adaptation Lead to Evolutionary Suicide? At HBES 2011 (23rd Annual Human Behavior & Evolution Society Conference). <Abstract, Poster>

Olaf Witkowski. August 2010. A Two-Speed Language Evolution. At Freelinguistics 2010 (4th Annual International Free Linguistics Conference). <Abstract, Talk>

(In reverse chronological order)

Page 63: Evolution of Coordination and Communication in Groups of Embodied Agents
