language evolution as a darwinian process: … · language evolution as a darwinian process:...

15
RESEARCH REPORT Language evolution as a Darwinian process: computational studies Pierre-Yves Oudeyer Æ Fre ´de ´ ric Kaplan Received: 28 August 2006 / Revised: 6 December 2006 / Accepted: 11 December 2006 / Published online: 12 January 2007 Ó Marta Olivetti Belardinelli and Springer-Verlag 2007 Abstract This paper presents computational experi- ments that illustrate how one can precisely conceptu- alize language evolution as a Darwinian process. We show that there is potentially a wide diversity of rep- licating units and replication mechanisms involved in language evolution. Computational experiments allow us to study systemic properties coming out of popula- tions of linguistic replicators: linguistic replicators can adapt to specific external environments; they evolve under the pressure of the cognitive constraints of their hosts, as well as under the functional pressure of communication for which they are used; one can ob- serve neutral drift; coalitions of replicators may ap- pear, forming higher level groups which can themselves become subject to competition and selection. Keywords Language evolution Language change Darwinian evolution Genes Memes Replicators Learning bias Functional pressure Self-organization Mapping biology to language Biological evolution happens through the differential replication and selection of genes, with variation coming from mutations or cross-over. Cultural evolu- tion, and in particular language evolution, happens through the social interactions between people, medi- ated by many different kinds of behaviours and con- texts. While these two processes seem at first sight rather different, researchers have in fact considered strong parallels between the two since the invention of the theory of natural selection by Charles Darwin (1859). As a matter of fact, soon after Darwin pub- lished his book On the Origins of Species, describing how mechanisms of inheritance, variation, and selec- tion could explain the evolution of biological organ- isms, August Schleicher published a book using very similar concepts to describe the birth and death of languages, and even the phenomena of language spe- ciation (Schleicher 1863). Interestingly, Schleicher had developed the notion of language trees, representing their genealogy, which inspired the biologists’ clado- gram or phylogenetic trees (Mufwene 2005). This mapping between biological evolution and language evolution was put forward again and made more detailed when the substrate and the structure of the genetic code was discovered (Watson and Crick 1953). There are two strands of correspondences that were developed. A first strand tried to map units and structures in the genetic space directly to units and structures in the linguistic space. It was based on the observation that genetic information was organized as a sequence of discrete nucleotides, forming DNA molecules. The particularity of this organization is that there is a molecular system for reading the sequence, in which nucleotides are read three by three (these groups are called codons) and command the formation of a corresponding sequence of amino-acids, which will then fold up and form a three-dimensional protein. The P.-Y. Oudeyer (&) Sony Computer Science Laboratory Paris, Paris, France e-mail: [email protected] URL: http://www.csl.sony.fr/~py F. Kaplan Ecole Polytechnique Federale de Lausanne, CRAFT, Lausanne, Switzerland URL: http://www.fkaplan.com 123 Cogn Process (2007) 8:21–35 DOI 10.1007/s10339-006-0158-3

Upload: vankhue

Post on 29-Aug-2018

221 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Language evolution as a Darwinian process: … · Language evolution as a Darwinian process: computational studies ... Darwin ian process . ... selves in the gene pool by leapi ng

RESEARCH REPORT

Language evolution as a Darwinian process:computational studies

Pierre-Yves Oudeyer Æ Frederic Kaplan

Received: 28 August 2006 /Revised: 6 December 2006 /Accepted: 11 December 2006 / Published online: 12 January 2007! Marta Olivetti Belardinelli and Springer-Verlag 2007

Abstract This paper presents computational experi-ments that illustrate how one can precisely conceptu-alize language evolution as a Darwinian process. Weshow that there is potentially a wide diversity of rep-licating units and replication mechanisms involved inlanguage evolution. Computational experiments allowus to study systemic properties coming out of popula-tions of linguistic replicators: linguistic replicators canadapt to specific external environments; they evolveunder the pressure of the cognitive constraints of theirhosts, as well as under the functional pressure ofcommunication for which they are used; one can ob-serve neutral drift; coalitions of replicators may ap-pear, forming higher level groups which can themselvesbecome subject to competition and selection.

Keywords Language evolution ! Language change !Darwinian evolution ! Genes ! Memes ! Replicators !Learning bias ! Functional pressure ! Self-organization

Mapping biology to language

Biological evolution happens through the differentialreplication and selection of genes, with variation

coming from mutations or cross-over. Cultural evolu-tion, and in particular language evolution, happensthrough the social interactions between people, medi-ated by many different kinds of behaviours and con-texts. While these two processes seem at first sightrather different, researchers have in fact consideredstrong parallels between the two since the invention ofthe theory of natural selection by Charles Darwin(1859). As a matter of fact, soon after Darwin pub-lished his book On the Origins of Species, describinghow mechanisms of inheritance, variation, and selec-tion could explain the evolution of biological organ-isms, August Schleicher published a book using verysimilar concepts to describe the birth and death oflanguages, and even the phenomena of language spe-ciation (Schleicher 1863). Interestingly, Schleicher haddeveloped the notion of language trees, representingtheir genealogy, which inspired the biologists’ clado-gram or phylogenetic trees (Mufwene 2005).

This mapping between biological evolution andlanguage evolution was put forward again and mademore detailed when the substrate and the structure ofthe genetic code was discovered (Watson and Crick1953). There are two strands of correspondences thatwere developed. A first strand tried to map units andstructures in the genetic space directly to units andstructures in the linguistic space. It was based on theobservation that genetic information was organized asa sequence of discrete nucleotides, forming DNAmolecules. The particularity of this organization is thatthere is a molecular system for reading the sequence, inwhich nucleotides are read three by three (these groupsare called codons) and command the formation of acorresponding sequence of amino-acids, which willthen fold up and form a three-dimensional protein. The

P.-Y. Oudeyer (&)Sony Computer Science Laboratory Paris, Paris, Francee-mail: [email protected]: http://www.csl.sony.fr/~py

F. KaplanEcole Polytechnique Federale de Lausanne, CRAFT,Lausanne, SwitzerlandURL: http://www.fkaplan.com

123

Cogn Process (2007) 8:21–35

DOI 10.1007/s10339-006-0158-3

Page 2: Language evolution as a Darwinian process: … · Language evolution as a Darwinian process: computational studies ... Darwin ian process . ... selves in the gene pool by leapi ng

association between triplets of nucleotides and amino-acids seems to be largely arbitrary. The discovery ofthese structures led researchers to think of nucleotidesas an alphabet, codons as words, and genes as sen-tences whose meaning would be the proteins associatedto the genes (Berlinski 1972). This led to the proposalthat the genetic code itself was a language (Searls2002). Nevertheless, this mapping is controversial anda number of strong arguments against the relevance ofthese detailed correspondences at the levels of unitswere developed (Stegmann 2004; Tsonis et al. 1997).

A second kind of parallel was developed in whichthe focus was on the process of evolution rather that onthe units themselves (Mufwene 2005; Croft 2000,2002; Steels 2004). This was in fact a continuation ofthe nineteenth century view of language evolution asa (neo-)Darwinian process. The common processbetween genome and language evolution was here thefollowing: (1) there exists a population of units capableof replication, (2) replication is not perfect: modifica-tions can appear, (3) the units have different levels ofefficiency in replication, which produces differentialreplication. This high-level formulation, sometimesconceptualized as a generalization of Darwin’s theoryof natural selection (Hull 1988), has the advantage ofspecifying neither the structure of units nor themechanisms of replication and variation. And indeed,researchers found ways to instantiate it into biologicalor language evolution by filling in those missing slotswith the corresponding specific structures and mecha-nisms (Croft 2000). As far as biology is concerned, theunits are genes, the mechanisms of replication arethose associated with meiosis/mitosis, and the mecha-nisms of variation are mutations and cross-over. As faras language is concerned, a wide variety of instantia-tions have been proposed. The units of replicationwere conceived as ideas, mnemotypes, idene, culture-type, socio-genes, tuition (van Driem 2003), rangingfrom simple abstract concepts like words or expres-sions to complex neural structures implementingassociations between phonological forms and meaning.Perhaps the most well-known notion of cultural unit ofreplication is the meme introduced by (Dawkins 1976):

Examples of memes are tunes, ideas, catch-phra-ses, clothes fashions, ways of making pots or ofbuilding arches. Just as genes propagate them-selves in the gene pool by leaping from body tobody via sperms or eggs, so memes propagatethemselves in the meme pool by leaping frombrain to brain via a process which, in the broadsense, can be called imitation. If a scientist hears,

or reads about, a good idea, he passed it on to hiscolleagues and students. He mentions it in hisarticles and his lectures. If the idea catches on, itcan be said to propagate itself, spreading frombrain to brain. As my colleague N.K. Humphreyneatly summed up an earlier draft of this chapter:‘... memes should be regarded as living structures,not just metaphorically but technically. When youplant a fertile meme in my mind you literallyparasitize my brain, turning it into a vehicle forthe meme’s propagation in just the way that avirus may parasitize the genetic mechanism of ahost cell. And this isn’t just a way of talking—thememe for, say, ‘belief in life after death’ is actu-ally realized physically, millions of times over, asa structure in the nervous systems of individualmen the world over’. (Chapter 11 from RichardDawkins, ‘‘The Selfish Gene’’)

Linguistic memes, sometimes called linguemes(Croft 2000), are themselves a population of very di-verse kinds of units: phonological features, phonemes,syllables, rules of phoneme sequencing, lexicons, rulesof syntax, semantic categories, systems of world cate-gorization, constructions mapping combinations ofwords and complex meanings, prosodic structures, so-cial conventions involving gestures and gaze to coor-dinate linguistic interactions, etc. Dawkins givesimitation as an example of mechanism of replicationfor units of language evolution. As a matter of fact, allkinds of linguistic activities, which can be much morecomplex than just imitation, like conversation orreading, provoke the replication of linguistic units. Theconsequence is that ‘‘leaping from brain to brain’’ is avery complex process that can happen through a vari-ety of mechanisms. What provokes variation is there-fore also very diverse: bad perception, erroneousinterpretation, exaggeration, etc.

This shows that the conceptualization of languageevolution as a Darwinian process may take quitedifferent forms for different authors and is oftenpresented only at a rather general level. Yet, in orderto be useful, this conceptualization must be precise,detailed and operational. A first attempt to do so isto focus on particular linguistic examples, typicallyparticular phonemes or words for which we havegood data about their evolution in the past, and tofind out a detailed and causal explanation in terms ofDarwinian processes (Croft 2000; Blevins 2004, 2006).Nevertheless, if language evolution is a Darwinianprocess, then it means that many of its features aresystemic: they are the outcome of the complex

22 Cogn Process (2007) 8:21–35

123

Page 3: Language evolution as a Darwinian process: … · Language evolution as a Darwinian process: computational studies ... Darwin ian process . ... selves in the gene pool by leapi ng

interactions between replicators and environment.But this kind of system is extremely complicated, andthis is why in biology the use of mathematical andcomputational modelling, in particular in populationgenetics, has been crucial in order to test and developthe neo-Darwinian theory of evolution (Crow andKimura 1970; Maynard-Smith 1982; Ancel and Fon-tana 2000). These kinds of models could also be usedfor the study of language evolution as a Darwinianprocess. Computational models of the origins oflanguage have been flourishing in the recent years(Cangelosi and Parisi 2002; Kirby 2002; Steels 2003).They allow us to develop intuitions about the com-plex phenomena characterizing language dynamics.Thus, they should be considered as tools for thought,and a number of them already had a significant im-pact on the debates and theories of the origins oflanguage (Cangelosi et al. 2006). Some of them weredesigned to model the origins of particular featuresof language in a cultural Darwinian perspective.Nevertheless, there exists to our knowledge no gen-eral description focusing on the cultural Darwinianperspective and showing how a range of computa-tional models presented in the light of Darwinianconcepts can illustrate the diverse properties of theassociated mechanisms. This is the purpose of thispaper. A number of experimental studies will now bepresented, showing operational implementations oflinguistic replicators and their associated mechanismsof differential replication. These computational stud-ies are tools for thought. We will try to show howthey can help articulate the possible similarities andthe differences between linguistic and genetic repli-cators. The common point of all experiments is thatthey consist of populations of agents initially devoidof linguistic convention, and that will progressivelyand culturally build each time a new (simple) lin-guistic system. A first series of examples will focus onthe coherence constraints imposed on linguistic re-plicators due to their use as communication systemsby their hosts. In particular we will see that only veryspecific mechanisms of replication allow for the effi-cient formation of shared linguistic conventions.Then, we will present an experiment showing howlinguistic replicators can adapt and evolve under thespecific constraints due to the external environment.We will then review an experiment studying the roleof learning biases in the replication process and seehow it can influence language evolution. Finally, wewill describe an experiment in which we can observethe formation of coalition of replicators, forminggroups which are themselves subject to competitionand selection.

Viewing language evolution as a Darwinian process:experimental studies

Linguistic replicators have specific properties com-pared to biological replicators: they form a system thatpermits communication. For instance, in a vocabularyin which words are associated to concepts/meaningsand are used to draw the attention of several speakerstowards a particular referent in a given context, syn-onymy and homonymy tend to be reduced ensuringefficient communication. We will see that not everydifferential replication process permits the emergenceof such communication systems. Typically, each lin-guistic interaction involves the semiotic triangle: thereis a form (e.g. a word), an associated meaning, and anassociated referent in a particular context. This entailsthat three kinds of entities can be replicated throughcommunication: forms, meanings, and associationswithin certain forms and certain meanings. As a matterof fact, each of these macro-entities consists itself of avariety of entities which are also replicators. Forexample, words are composed of syllables which arethemselves composed of vowels and consonantsthrough phonotactic rules, and all these hierarchicalentities can replicate differentially. Although all thesereplicators are constantly interacting, the experimentswe will now present make a number of simplificationsthat allow us to develop a better understanding of thefundamental dynamics associated to various functional,internal and environmental constraints. For example,in the first experiment we will describe, we supposethat there is only one meaning in the world that agentsinhabit, and that two possible words can be associatedto this meaning. This experiment will show some basicproperties of the replication mechanism so that asimple convention can be adopted by a population, i.e.so that linguistic coherence can be reached (speakersassociate the same word to the same meaning).This experiment will then be made more complex infollowing sections, allowing to study progressivelymore complex phenomena like linguistic distinctive-ness (speakers associate different words to differentmeanings).

Basic dynamics of linguistic coherence

Let us consider a simple problem: N agents have tochoose between two conventional names c1 and c2. Wewill consider three simple models representative formany more complex ones studied in the field. The firstmodel is an imitation-based model (model A) and thetwo others are frequency-based (model B and C). Inmodel A (last heard policy), the speaker simply

Cogn Process (2007) 8:21–35 23

123

Page 4: Language evolution as a Darwinian process: … · Language evolution as a Darwinian process: computational studies ... Darwin ian process . ... selves in the gene pool by leapi ng

produces the conventional name he heard last as alistener. In model B (most frequently heard policy), thespeaker produces the name that he has heard mostfrequently as a listener. In model C (proportional tothe frequency heard policy), the speaker produces anyname that he has heard with a frequency f as a listenerwith a probability proportional to f. These three typesof replication processes could be seen a possiblemodels of how cultural replication occurs.

Figures 1, 2, 3 present four representative sampleevolutions respectively for models A, B and C. Withimitation-based model A, the population eventuallyconverges to a state of complete co-ordination. How-ever, convergence happens only after a long series ofoscillations. With model B, convergence towards asingle conventional name also occurs. However, theoscillations observed are much smaller. As soon as aconvention spreads more in the population than theother, its domination seems to amplify even more overtime. For model C, on the contrary, dynamics tend tomaintain the distribution of c1 and c2 over time, afteran initial drift.

An in-depth study of these three models revealthat despite their apparent similarity the types ofdynamics they create are extremely different. Amongthe three models studied, only model B creates self-reinforcing dynamics that permit a fast coordinationof the entire population towards the use of a singleconventional name. Model A is approximativelysimilar to a random walk, converging in quadratictime. On the contrary, the dynamics of model C tend

to maintain the distribution of the convention at afixed level.1

What we must remember from these results is thatnot all cultural transmission systems create a differen-tial replication process that ensures the domination ofsome linguemes over others. Simple models of culturaltransmission solely based on imitation are not suffi-cient to permit linguistic co-ordination. In that sensethe dynamics of linguistic replication are likely to bedifferent from the ones characterizing epidemiologicalprocesses, which have often been presented as a pos-sible metaphor for cultural transmission (Sperber1984).

Implicit evaluation

The positive feedback loop introduced in model Bcreates a winner-take-all situation where one conven-tional name dominates. The replicator that eventuallywins has no special properties and any new run of thesimulation would lead to the selection of a differentone. Let us now consider a set of replicators of unequalqualities, i.e. for example words which sounds are moreor less prone to be deformed when transferred fromone agent to the other due to noise. We model these

0 0.5 1 1.5 2

x 104

0

20

40

60

80

100

t

N1,

N2

0 0.5 1 1.5 2

x 104

0

20

40

60

80

100

t

N1,

N2

0 0.5 1 1.5 2

x 104

0

20

40

60

80

100

t

N1,

N2

0 0.5 1 1.5 2

x 104

0

20

40

60

80

100

t

N1,

N2

Fig. 1 Model A (last heardpolicy). The speaker simplyproduces the conventionalname he heard last as alistener. Competitionbetween two conventionalnames c1 and c2 in apopulation of 100 agents. N1and N2 represent,respectively, the number ofagent using c1 and c2. Initially,50 agents choose c1 and 50other agents choose c2.Several oscillations areobserved before convergence(reprinted from Kaplan 2005)

1 Experimental results and qualitative interpretations suggestthat self-reinforcing dynamics of model B converge in N!log(N),where N is the population size. These experimental results canalso be interpreted using various formalisms including Markovchains, stochastic games and Polya processes (see Kaplan 2005for a review).

24 Cogn Process (2007) 8:21–35

123

Page 5: Language evolution as a Darwinian process: … · Language evolution as a Darwinian process: computational studies ... Darwin ian process . ... selves in the gene pool by leapi ng

differences in a crude way by associating to each re-plicator a mutation probability Pm(ci) between 0 and100%. During each interaction, a random test is doneto check whether the replicator has been transmittedsuccessfully or not. In case of failure, the replicator istransformed to another replicator randomly pickedamong all the possible ones.

More precisely, let us consider a new experiment inwhich there is still only one meaning possible, but thistime there are 50 possible words/replicators that can beassociated to it by individuals. Replicators have amutation probability that grows linearly with their in-dex.2 Figure 4 shows, for 1,000 simulations, the distri-bution of the winning replicator for a population of 50agents. Thirty-nine percentage of the runs convergetowards the three best replicator (replicators with lowmutation rates). The agents perform a collective opti-mization: they spontaneously converge towards thebest names. The phenomena is based on an implicitevaluation of the solutions similar to the one describedfor foraging behaviour in ant colonies (Dorigo et al.1997). It means that the agents are not evaluatingindividually the quality of each convention for choos-ing the more robust ones. Ill-adapted replicators simply

mutate more often and cannot propagate as easily asthe others (Croft 2000).

Reorganization in the presence of a population flux

Figure 4 shows that although there is a clear tendencyto focus towards the most robust replicators, subop-timal solutions are sometimes collectively chosen.This is due to premature convergence. If, for instance,a very good replicator appears in the population laterduring the experiment, it is probable that it will notbe picked up because the positive feedback loopwould have already caused the agents to convergetowards a suboptimal one. The situation is different inan open system where agents are entering and leavingthe population. Indeed, new agents entering thepopulation have no special preference for the domi-nant replicator. They can discover a better replicatorand maybe, if it is really more robust than the onecurrently dominating, such an ‘‘outsider‘‘ mighteventually win.

However, introducing new agents in the populationis not without risk. Intuitively, we can imagine that ifthe flux of new agents is too big, convergence towards ashared system of replicators may not be possible any-more. Let us define Pr as the probability of replacingan old agent before every new interactions during anexperiment. Figure 5 summarizes the results of a largenumber of experiments investigating the relationship

0 1000 2000 30000

20

40

60

80

100

t

N1,

N2

0 1000 2000 30000

20

40

60

80

100

t

N1,

N2

0 1000 2000 30000

20

40

60

80

100

t

N1,

N2

0 1000 2000 30000

20

40

60

80

100

t

N1,

N2

Fig. 2 Model B (mostfrequently heard policy). Thespeaker produces the namethat he has heard mostfrequently as a listener.Competition between twoconventional names c1 and c2in a population of 100 agents.N1 and N2 representrespectively the number ofagent using c1 and c2. Initially,50 agents choose c1 and 50other agents choose c2.Dominance of one name overthe other tends to increaseover time (reprinted fromKaplan 2005)

2 Thus for replicator ci, the probability of mutation can be ex-pressed by the following formula:

Pm"ci# $i

Number of replicators: "1#

Cogn Process (2007) 8:21–35 25

123

Page 6: Language evolution as a Darwinian process: … · Language evolution as a Darwinian process: computational studies ... Darwin ian process . ... selves in the gene pool by leapi ng

between the average convergence time Tc over 100simulations for different population size and agent re-newal rate Pr. The results suggests the existences of athreshold over which convergence is either not possibleor extremely long. This threshold depends on thepopulation size.

It can be argued that this behaviour shows someanalogies with phase transitions occurring in certainphysical systems, for instance in experiments aboutferromagnetism. The renewal rate Pr plays the role ofthe temperature in such systems. A too high tempera-ture leads to a complete disorganization of the ferro-magnetic system. In the same manner, if the renewalrate is too important the cultural transmission processis broken and convergence towards a shared set ofreplicators is not obtained any more. In physics, this

threshold is called critical temperature. By analogy wecan use the term critical flux (Kaplan 2001).

High temperatures have both good and bad effectson physical systems. They can lead to their disorga-nization but also they can permit restructurationsleading to more optimal configurations (optimizationtechniques such as simulated annealing are based onthis effect). Following our initial remark and in orderto see if an agent flux can lead to a better selection ofthe replicator, we can measure for several values of Pr

the proportion of simulation runs that end up withone of the three best replicators dominating in theexperiment similar to the one of Fig. 4. Figure 6 plotsthe proportion of simulation runs converging towardone of the three best solutions for different popula-tion flux Pr before the critical flux is reached.

0 2000 4000 6000 8000 100000

20

40

60

80

100

t

N1,

N2

0 2000 4000 6000 8000 100000

20

40

60

80

100

t

N1,

N2

0 2000 4000 6000 8000 100000

20

40

60

80

100

t

N1,

N2

0 2000 4000 6000 8000 100000

20

40

60

80

100

t

N1,

N2

Fig. 3 Model C (proportionalto the frequency heardpolicy). The speaker producesa name that he has heard as alistener with a probabilityproportional to the frequency.Competition between twoconventional names c1 and c2in a population of 100 agents.N1 and N2 representrespectively the number ofagent using c1 and c2. Initially,50 agents have a bias towardc1 and 50 others a bias towardc2. After an initial driftperiod, the distribution tendsto be maintained (reprintedfrom Kaplan 2005)

Fig. 4 Implicit evaluation oflinguistic replicators.Distribution for 1,000simulation runs of thewinning replicator for apopulation of 50 agents.Thirty-nine percent of theruns converge towards thethree best replicators(replicators with lowmutation rates)

26 Cogn Process (2007) 8:21–35

123

Page 7: Language evolution as a Darwinian process: … · Language evolution as a Darwinian process: computational studies ... Darwin ian process . ... selves in the gene pool by leapi ng

Although the effect is small, the results suggest thatreorganization is optimal near the edge of thisthreshold. In these cases, the flux permits to decreasesuboptimal convergences.

To summarize, if each agent uses the most widelyspread replicator from its own point of view, then apositive feedback loop is created, leading to thedomination of one replicator. This dynamic is ‘‘blind’’and does not prefer any replicator per se. But if somereplicators are more difficult to transmit, they willimplicitly be left aside. Thus, the best replicator tendsto be chosen by the population. Finally, the presenceof a flux of agents in the system avoids prematureconvergence. Better replicators (more robust, easierto learn) tend to be selected. The arrival of newagents enables a continuous parallel search for solu-tions that can replace the ones currently dominating.If needed it can cause reorganization in the linguisticsystem.

Basic dynamics of linguistic distinctiveness

For efficient communication, it is better that differentwords are associated with different meanings and viceversa. This obvious remark actually constrains systemsof linguistic replicators in many important ways. In-deed, the replication process must not only address theissue on linguistic coherence but also permit linguisticdistinctiveness. Let us consider a model in whichindividuals have to establish conventionalized associ-ations between several words and several meanings.Each agent is now equipped with an associativememory, which is a list of word-meaning pairs with anumeric score. It is used to find the best word associ-ated to a given meaning and reversely to find the bestmeaning associated to a given word. As in the modelsof the previous section, agents choose the associationwith the highest score when several solutions are pos-sible. The associative memories of the agents are ini-tially empty. Associations are progressively created asthe agent interacts with other agents.

Studies of such systems were initiated by Steels(1996) in the mid 1990s. Several other experimentsrapidly showed how collective dynamics could permitthat each name eventually becomes associated with asingle context and each context with a single conven-tion (Hutchins and Hazlehurst 1995; Oliphant 1997;Arita and Koyama 1998; Kaplan 2001; De jong andSteels 2003; Vogt 2005; Baronchelli et al. 2006).

Some of the most interesting dynamics of such self-organizing lexicons are obtained in the presence ofnoise. Let us consider that each word/replicator ci ismodelled with an integer value between 0 and 1,000.Each time a word/replicator is transmitted, a randomnumber between –B/2 and + B/2 is added to its value.Thus, B is a measure of the global noise level. Each

Fig. 6 Proportion of simulation runs (100 trials) convergingtoward one of the three best solutions for different populationflux Pr for a population of N = 100 agents. Although the effect issmall, the results suggest that reorganization is optimal near theedge of this threshold. In these cases, the flux permits to decreasesuboptimal convergences

Fig. 5 Comparison of meanconvergence time Tc fordifferent population flux Pr

and for different populationsize N. The results suggest theexistence of a thresholddepending on the populationsize N above whichconvergence is either notpossible or extremely long.This threshold characterizesthe maximal renewal rate thatthe system can tolerate

Cogn Process (2007) 8:21–35 27

123

Page 8: Language evolution as a Darwinian process: … · Language evolution as a Darwinian process: computational studies ... Darwin ian process . ... selves in the gene pool by leapi ng

agent is equipped with a filter permitting to select allthe words/replicators in its associative memory ofwhich the values are at a distance D less than D = B.The structure of an interaction is the following: agent 1randomly chooses a meaning s1 among the differentmeanings available and uses a word c1 to express thismeaning. If it does not have words associated with thismeaning, the agent creates a new one (a randominteger between 0 and 1,000). Then, c1 is transmitted toagent 2 with an alternation between –B/2 and + B/2.Then, agent 2 selects all the possible associations with aword close to the integer received (at a distance lessthan B). If several associations are possible, agent 2chooses the one with the highest score: (c2, s2). Ifs1 = s2 the interaction is a success, in the other cases theinteraction is a failure. If no association is close enoughin agent 2’s memory, the agent creates a new associa-tion between the received integer and the meaning s1.In case of success, agent 2 increases the score of theassociation (c2, s2) with + d and diminishes the score ofcompeting associations ((c2, *) and (*, s2)) where * isany meaning or word in the memory of the agent) with–d. In case of failure, agent 2 decreases the score of(c2,s2) with –d (see Kaplan 2001; De jong and Steels2003; Vogt 2005 for discussions of the importance ofsuch forms of lateral inhibition). Associations are ini-tially created with a 0 score.

We have previously shown that collective implicitevaluation led to choose the ‘‘best‘‘ word/replicatorassociated to a given meaning. What are the bestwords/replicators in the current model? A good word/replicator is a replicator that an agent will not confusewith another one that has a different usage. A ‘‘good’’lexical system should have sets of words clearly distinct

from one another depending on the meanings theyassociated to. Figure 7 shows the evolution in the wordspace (i.e. an idealized acoustic space) of the wordsassociated with five meanings in an experimentinvolving ten agents with a noise level B = 100. Afteran initial ambiguity period, five well separated bands inthe word space are clearly identifiable. Agents do notconverge towards a unique value for each context.Each agent uses a different one. But these values tendto be very close. The ‘‘band’’ for one context is clearlydistinct from bands associated with other contexts. Noconfusion is possible. Figure 7 plots also the ‘average’value of each band. Thus, it is easier to see the col-lective optimization of distinctivity leading to a solu-tion compatible with the level of noise present in theenvironment. We also see on this graph that with thislevel of noise we approach the limit of expressivenesspossible in this medium. If the agents had to commu-nicate about a larger number of distinct meanings,ambiguity will inevitably arise.

Compromise between distinctivity and robustness

Words are certainly not well modelled as integers.Some words are long and difficult to transmit, othersare short but can be easily be confused with one an-other. To study this issue, let us now consider a modelwhere each word/replicator is now a numeric chain ofvariable length. Each character of the chain is a num-ber between 1 and 9. Noise is modelled by a probabilityof alteration Pm equal for each character. When acharacter mutates, it is simply replaced by a randomcharacter between 1 and 9. As in the previous model,agent 2 can look up the chains that are ‘‘close‘‘ to thetransmitted word in its memory. We can define a dis-tance Dc between chains, similar to the traditionalHamming distance.3 In the interaction agent 2 selectsthe chains which are at a distance less than thethreshold D.

Fig. 7 Evolution in the word space (idealized acoustic space) ofthe words associated with five meanings in an experimentinvolving ten agents with a noise level B = 100. After a firstperiod of ambiguity, five well separated bands appear associatedwith each meaning. Evolution of the ‘average’ values (in thepopulation) of the words associated with each meaning ishighlighted in the middle of each band

3 Let c1 and c2 be two chains the length of c1 being either smalleror equal to the length of c2. Let k1(i) and k2(i) be the character inposition i in each of the chain. We define Df as being the sum ofthe distance between the character of both chains to which isadded ten times their length difference, l2–l1.

Dc"c1; c2# $X

i

kk1"i# % k2"i#k& 10:"l2 % l1# "2#

For instance the chains 1-4-5-2 and 1-4-5-7-3 are at a distance5 + 10 = 15. The arbitrary details of this particular distance arenot important for the dynamics characterized. They are providedhere only to permit a reproduction of the experimental resultsdescribed.

28 Cogn Process (2007) 8:21–35

123

Page 9: Language evolution as a Darwinian process: … · Language evolution as a Darwinian process: computational studies ... Darwin ian process . ... selves in the gene pool by leapi ng

With such a model, too long or too short chains arenaturally less adapted. Indeed, the longer a chain is,the more risk it has to be altered during transmission.In a previous model, we have seen that such conven-tions generally lose the competition. But on the otherside, if the conventional system is only composed ofvery short words, a single mutation might very oftenlead to confusion. Short words are robust but easilyconfused, long words are easy to distinguish but diffi-cult to transmit correctly. A compromise betweenrobustness and distinctivity must be found (for a longerdiscussion of this aspect refer to Kaplan 2001). Wehave analysed the distribution of all the words used bythe agents after 5,000 games (at this point, we haveobserved experimentally that the replicator systemreaches a stable state). The results of the distribution ofthe chain length are shown in Fig. 8. The distributionhas a ‘‘peak‘‘ around chains of length 3. As expected,words/replicators that are too long or too short are lesspresent in the final lexical systems.

These types of compromise is commonly observed innatural languages. Language typically contain a smallset of short words used very frequently (e.g. auxilia-ries). On the contrary, long words tend to have veryspecific meanings, used rarely.

Neutral drift

External factors like language contact between popu-lations are often cited as a major cause of languageevolution. But it is also known that language canchange spontaneously based on internal dynamics(Labov 1994). We have seen with the previous modelthat in a noisy environment, agents can converge on astable system in which distinct bands in the word/acoustic space are associated with distinct contexts. Aswe see in Fig. 7, this repartition in separated bands

does not evolve anymore once a stable solution hasbeen found. Figure 9 shows the evolution of the aver-age word in the presence of an agent flux defined by aprobability of replacing an old agent by a new onePr = 0.01, for a population of 20 agents and with twopossible meanings. The centre of the bands are spon-taneously evolving as new agents are entering thesystem. This is an example of a neutral drift.

This effect is easily understandable. A new agenttends to converge on words belonging to the existingbands for each meaning to express. But within thisband, it does not converge towards the exact centre ofthe band. Thus the centre is moving as the flux of newagents enters the system. The higher the agent toler-ance on noise, the higher the amplitude of this drift(see Steels and Kaplan 1998 for a first description ofthis phenomenon).

In this experiment, words/replicators are evolvingspontaneously without any functional drive. However,external pressures can direct these dynamics in onedirection or another. This neutral drift provides nov-elty and thus can lead to a more efficient reorganiza-tion if needed. In some way, this effect is similar to therole of neutral mutation in evolution (Kimura 1983).

Experiments on computational models of phono-logical systems have shown how similar collectivedynamics in the presence of noise lead a population ofagents to converge towards a set of vowels optimallydistributed in the phonological space in order to favourdistinctiveness between them (de Boer 1997; De Boer1999; Oudeyer 2005b). Such emerging phonetic sys-tems have high similarity with real ones as observed innatural languages.

To summarize, noise during word transmission fa-vours sets of words that are clearly distinct from oneanother when they are associated with differentmeanings. When words can have different complexi-ties, a compromise must be found between distinctivityand resilience to noise. Short words are easy to trans-mit but easily confused, long words are difficult totransmit correctly but are easily distinguishable fromone another. We experimentally observe the conver-gence towards words of intermediary sizes. Finally, inthe presence of noise and agent flux, we experimentallyobserve a spontaneous non functional evolution. Thiscontinuous exploration can lead to a more efficientreorganization of the replicator system if needed.

Linguistic replicators adapt to particularenvironments

Until now, we have only considered simple modelsof linguistic replication. Linguistic phenomena are

0

1000

2000

3000

4000

5000

6000

1 3 7 9 10

Num

ber o

f wor

ds

Word Length2 4 5 6 8

Fig. 8 Distribution of word length on 100 simulation runs for tenagents and 20 contexts with D = 20

Cogn Process (2007) 8:21–35 29

123

Page 10: Language evolution as a Darwinian process: … · Language evolution as a Darwinian process: computational studies ... Darwin ian process . ... selves in the gene pool by leapi ng

obviously more complex. The previous experimentshave focused on the replication of words, but theirassociated meanings and semantic categories are alsoentities that can replicate from brain to brain throughlinguistic interactions. In many computational models,these categories are modelled as points in categoryspace. But more complex systems of meanings, alsoreferred as categories or concepts, were also investi-gated (Steels and Belpaeme 2005). The Talking Headsexperiment conducted between 1999 and 2000 by Steelsand co-workers have provided a large set of data onhow systems of categories can adapt to particularenvironments (Steels and Kaplan 2002; Kaplan 2001).In this experiment robotic agents were capable of seg-menting the image perceived through the camera intoobjects and of collecting various sensory data abouteach object, such as the colour, position, size, shape, etc.A couple of robots were placed in front of a white boardon which various types of objects were placed. At eachinteraction, the speaker chose one object from thiscontext, reused or constructed a category that wouldidentify this object from the other object present in thebackground and uttered a word associated with thatcategory. Based on this word, the other robot had toguess which object was named (Fig. 10).

In the first run of the experiment, a total of 8,000words and 500 concepts were created, with a corevocabulary consisting of 10 basic words expressingconcepts like up, down, left, right, green, red, large,small, etc. The dynamics that pushed the populationtowards coherence and distinctivity ensured the col-lective choice of a set of word-category associationsadapted to the environment that the robots were per-ceiving. Interestingly, some features like shape wereused very rarely to specify categories whereas position,colour and size categories were preferred. Interpretingsuch type of ‘‘preferences’’ is not always easy as they

can result both from the categorization mechanismused by the agents and the specific types of environ-ments that they encounter. In the present case, to referto specific objects in the types of scenes the robotsperceived, position and colours were largely sufficient.Moreover, shapes were more difficult to distinguishthan colour and position based on the features avail-able to the agents, and given the kinds of objectspresent in their environment.

Such types of indirect competitions between per-ceptual categories were observed during the wholeexperiment. Some categories were general and otherspecific (e.g. one was used to describe a particularshade of green, and another one to describe greencontexts in general). Usually, general categories werepreferred because they were both easier to learn by theagent and adapted to a larger number of contexts (seealso Smith 2005 for another series of experiments inthis line). However, in several cases, a precise categoryadapted to reoccurring specific context survived asother categories were present to ‘‘back it up‘‘. There-fore, when analyzing these types of complex dynamics,considering competition between isolated categories isnot always sufficient. The quality of a category needs tobe evaluated regarding the category set to which itbelongs and the adaptivity of the whole to particularenvironments.

Another interesting phenomenon was observed.Most of the words of the core vocabulary werecoherently interpreted as having distinct meanings.However, in some cases, two competing meanings

Fig. 9 Example of a neutral drift: spontaneous evolution of the‘average’ forms in presence of an agent flux

Fig. 10 The Talking Heads set up. Two robotic cameras areplaced in front of white board. On the board, objects of variousshaped and colours are placed. The robots have to constructcategories and words to name the objects on the board and havethe other agent guess the right object based on that word.Categories referring to colour, position, size or shape can be used

30 Cogn Process (2007) 8:21–35

123

Page 11: Language evolution as a Darwinian process: … · Language evolution as a Darwinian process: computational studies ... Darwin ian process . ... selves in the gene pool by leapi ng

co-occurred for a long time. For instance the word‘bozopite’ was associated concurrently with two typesof categories: large area (large) and large width (wide).This co-occurrence was due to the fact that in the typesof environments that the robotic agents encounteredmost objects that were large in area were also large inwidth. This is an example of residual polysemy.

This brings us to a remark. As collective dynamicsselect sets of replicators that are well adapted to theenvironment in which the agents are communicating,we might be tempted to say that the ‘‘quality’’ of thereplicators increases. But, like for species in naturalevolution, optimization stops once adaptation isreached. We have seen in the previous section that inthe presence of noise, well separated bands of repli-cators were emerging. However, once a stable solutionwas found, this optimization of distinctivity stops. Thesame effect occurs in more complex architectureswhere residual polysemy is observed. In all these sit-uations, there is no absolute optimization, only thesearch for stable adapted solutions.

Linguistic replicators adapt to the cognitiveconstraints of their hosts

When linguistic replicators leap from brain to brain,they in fact do so through perpetual cycles of produc-tion, perception and learning. Whatever these replica-tors are, they need to go through a representation in thebrain of speakers and hearers at some point. The pro-cess of updating one’s brain to incorporate some newinformation defines learning. Learning theory, and inparticular machine learning theory (Mitchell andWeinmann 1997), has shown that all learning systemsare characterized by a number of biases which meanthat every single system will be good at learning certainthings and bad at learning other things. For example,learning algorithms such as recurrent neural networksare good to learn to predict complex time series but theyare quite inefficient to learn fine categorical distinctionsin high-dimensional static spaces, whereas supportvector machines are good in high-dimensional staticspaces but pretty bad when they have to learn time-dependent phenomena (Duda et al. 2001). Learningbiases also apply to human brains. For example, whenthe human brain learns a new concept or a new sound, itwill do so typically by using the representation of analready known concept or sound and modify it a littlebit. The consequence is that learning a new concept or anew sound will only be effective if the correspondingbrain already knows not too dissimilar concepts orsounds. This imposes strong constraints on the replica-tion of linguistic memes, which are not only defined by

the generic cognitive constraints of all human brains,but also by the particular cognitive structures that werebuilt during the ontogeny of each of them. This meansthat for a given brain, some linguistic memes will beeasily learnt and replicated, but some other linguisticmemes will be strongly deformed often to the point thatno replication at all takes place. And the linguisticmemes which are easy to learn and replicate for thisbrain may prove to be difficult for another brain whichhad a different history.

What is then the consequence of all this on thedynamics of language evolution? We will now presentthe outline of a computational model of the origins ofsyllable systems which provides an answer (this modelis described in detail in Oudeyer 2005a). This modelinvolves a population of agents which can produce,hear, and learn syllables, based on an auditory and amotor apparatus that are linked by abstract neuralstructures. These abstract neural structures are imple-mented as a set of prototypes or templates, each ofthem being an association between a motor programthat has been tried through babbling and the corre-sponding acoustic trajectory. Thus, agents store in theirmemory only acoustic trajectories that they have al-ready managed to produce themselves. The set of theseprototypes is initially empty for all agents, and growsprogressively through babbling. The babblings of eachagent can be heard by nearby agents, and this influ-ences their own babbling. Indeed, when an agent hearsan acoustic trajectory, this activates the closest proto-type in its memory and triggers some specific motorexploration of small variation of the associated motorprogram. This means that if an agent hears a syllable Sthat it does not already know, two cases are possible:(1) he already knows a quite similar syllable and has agreat chance to stumble upon the motor program for Swhen exploring small variations of the known syllable,(2) he does not already know a similar syllable and sothere is little chance that he incorporates in its memorya prototype corresponding to S. This process meansthat if several babbling agents are put together, someislands of prototypes, i.e. networks of very similarsyllables, will form in their memory and they will de-velop a shared skill corresponding to the perceptionand production of the syllables in these networks.Nevertheless, the space of possible syllables was largein these experiments, and so the first thing that wasstudied was whether agents in the same simulationcould develop a large and shared repertoire of sylla-bles. This was shown to be the case (Oudeyer 2005a).Interestingly, if one runs two simulations, the popula-tion of agents will always end up with their own par-ticular repertoire of syllables.

Cogn Process (2007) 8:21–35 31

123

Page 12: Language evolution as a Darwinian process: … · Language evolution as a Darwinian process: computational studies ... Darwin ian process . ... selves in the gene pool by leapi ng

Then, a second experiment was run: some freshagents were tested for learning syllable systems thatwere formed by another population of interactingagents, and some other fresh agents were tested forlearning a syllable system which was generated artifi-cially as a list of random syllables. The results, illus-trated in Fig. 11, were that the fresh agents werealways good at learning the syllable systems developedby other similar agents, but on the contrary rather badat learning the random syllable systems. In other terms,the syllable systems developed culturally by agentswere adapted to their cognitive biases, and the randomsystems were not. Thus, the replicators constituted bysyllables evolved and were selected in a cultural Dar-winian process so as to fit to the ecological niche de-fined by the cognitive structures of agents, fitness beinghere learnability.

Several other computational systems have beendeveloped to study the mechanisms that allow thecultural selection for learnability of linguistic replica-tors. Zuidema (2003) presented abstract simulations ofthe formation of syntactic structures and detailed theinfluence of cognitive constraints upon the generatedsyntax. Brighton et al. (2005) presented a thoroughstudy of several simulations of the origins of syntax(Kirby 2001) which were re-described in the light ofthis paradigm of cultural selection for learnability.

The self-organization of higher-level replicators:from individuals to groups

In the last section, we saw how it was possible that theability of a linguistic replicator to replicate from onebrain to another depended on the presence of otherreplicators in the receiving brain. As a consequence,this allows in principle for the formation of groups orcoalition of mutually reinforcing replicators. In thissection, we will illustrate with a related computersimulation how these groups can themselves becomeunits subject to competition and selection.

This simulation is based on the same general prin-ciples as the simulation presented in the last section butthe neural substrate was implemented in a lower-leveland more realistic manner (Oudeyer 2006). In partic-ular, prototypes were represented directly in terms ofbundles of artificial neurons, and one of the crucialcharacteristics of these neurons was that their survivaldepended on their activation: neurons often activatedwere kept alive and neurons with low activation werepruned. Another difference was that initially there wasa phase of generation of a large number of randomneurons which then were progressively modified andrecruited to represent particular acoustic or motorpatterns. This implied that a given syllable could pos-sibly recruit many redundant bundles of neurons, andanother syllable could be encoded only by very fewneurons. This introduces two interacting sources ofcompetition for syllables: (1) competition for activa-tion, (2) competition for neural resources. Indeed, theactivity of agents was here merely babbling, whichamounted to the random activation of neurons. Theconsequence was that the more a syllable got neuralresources, the more it was produced, and so the morethe neural resources were activated, both within thespeaker’s network and in the hearer’s network. Thisactivation was of course at the expense of other sylla-ble’s neural resources activation, and thus contributedto their progressive death. This introduced a positivefeedback loop: the more a syllable was produced, thehigher the probability to produce it again. Now, aninteresting point was that the production of a syllabledid not only activate the corresponding neural struc-ture in agents, but also the neural structures of rathersimilar syllables, i.e. syllables which shared parts of theacoustic or motor trajectory. This made the formationpossible of mutually re-inforcing groups of syllables.

Now, what happened precisely? Some systematicexperiments were conducted in abstract and simpleacoustic/motor spaces, so that they could be depictedas in Fig. 12. Here, we represent only the articulatoryspace, which is one-dimensional, and syllables are

Fig. 11 Evolution of the rate of successful imitations for a childagent which learns a syllable system established by a populationof agents (top curve), and for a child agent which learns a syllablesystem established randomly by the experimenter (bottomcurve). The child agent can only perfectly learn the vocalizationsystems which evolved in a population of agents. Such vocaliza-tion systems were selected for learnability (reprinted fromOudeyer 2005a)

32 Cogn Process (2007) 8:21–35

123

Page 13: Language evolution as a Darwinian process: … · Language evolution as a Darwinian process: computational studies ... Darwin ian process . ... selves in the gene pool by leapi ng

defined by a beginning and an end point in this space.The beginning points are represented on the x-axis,and the ending points are represented on the y-axis.The little crosses in the middle represent neural bun-dles corresponding to particular syllables. Figure 12shows the set of these neurons in two agents at thebeginning of a simulation: they are randomly spreadacross the space, meaning that the syllables that agentsproduce cover uniformly the space of possible syllablesin that space. Figure 13 shows what happens if one letsagents interact a few thousands times. We observe thatlines and columns have appeared, which means thatagents do not produce all possible syllables anymorebut only a small subset, which is the same in bothagents, and which is structured in a very particularmanner. Indeed, now agents produce syllables com-posed of systematically re-used beginning or endpoints. In other terms, this corresponds to the transi-tion from holistic to phonemically coded syllable

systems. The lines and columns show coalitions ofsyllables either sharing their beginning point or theirend point, which have entered into competition withsingle syllables or other groups that died meanwhile.The state shown on Fig. 13 is in fact a convergent statein which the population of lines and columns hasreached a threshold such that there is enough activa-tion for all neurons (and thus their correspondingsyllables) to survive.

A last point is that in these simulations, not all self-organized phonemes get re-used in the same manner:for example, on Fig. 13, if we call the self-organizedphonemes

p1; p2; . . . ; p8

then we can summarize the repertoire of allowedsequences by:

"p6; '#; "p8; '#; "';p7#

Fig. 12 The neural maps oftwo agents at the beginning ofthe simulation. The neuralmap of one agent isrepresented on the left, andthe neural map of the otheragent is represented on theright

Fig. 13 The neural maps ofthe same two agents after1,000 interactions. Weobserve that many neuronshave died and the survivingones are organised into linesand columns: this means thatphonotactic rules haveappeared, that the repertoireof vocalisation can beorganised into patterns, andthat some phonemes get re-used systematically forbuilding vocalisations, i.e.vocalisations are nowcombinatorial

Cogn Process (2007) 8:21–35 33

123

Page 14: Language evolution as a Darwinian process: … · Language evolution as a Darwinian process: computational studies ... Darwin ian process . ... selves in the gene pool by leapi ng

where * means ‘‘any phoneme in p1,...,p8’’. This cor-responds to the formation of rules of sound syntaxregulating the possible phonemic combinations, and iscalled the phonotactics of a syllable system in pho-nology: this is indeed a feature present in all humansyllable systems. What is interesting here is that this isa self-organized side-effect of the process of coopera-tion/competition between groups of syllables: therewas no functional pressure in the system that forced itto appear. Nevertheless, it has been widely described inthe linguistics literature how phonotactics are used asconstraints that help speakers in multiple ways: forexample, it is useful for better speech recognition andfor morpho-syntactic analysis. So one can envisioneasily an evolutionary scenario of language in whichphonotactics appeared first as a side effect of a morefundamental process, i.e. the formation of a shared setof speech sounds, and was recruited later to constrainvarious linguistic processing functions: this correspondsto the well-known concept of exaptation in biology(Gould and Vrba 1982).

Conclusion

In this paper, we have shown how computationalexperiments could allow us to elaborate and refine thescientific understanding of language evolution as aDarwinian process. In particular, we have seen thatthere is potentially a wide variety of linguistic repli-cators, different both in nature and in replicationmechanisms, and interacting at multiple levels. Thisshows that the most relevant analogy between lan-guage and biological evolution is probably neither atthe level of units nor at the level of replication mech-anisms, but rather at the general level of systems de-fined as unspecified sets of units subject to differentialreplication with inheritance and variation. Usingcomputational experiments allows us to validate thecoherence of this view of language evolution as aDarwinian process, which in turn participates in therefoundation of linguistics (Croft 2000). Indeed, in thelast 50 years language has been mainly considered as afixed and idealized system which could be studiedindependently of its use and of its users. In this tradi-tional body of theories, individual variation, and morebroadly language evolution were either ignored or leftunsolved, and biological evolution was used to explainthe particularly good adaptation of our brains to thelearning of nowadays idiosyncratic languages. On thecontrary, viewing language as a system of replicatorsconstantly replicating from particular brains to partic-ular brains, with variation as a central concept, allows

one to understand how languages change over time,why there is so much linguistic diversity, and provides adifferent account of the ease with which children learnlanguages. Indeed, as we showed in ‘‘Linguistic repli-cators adapt to the cognitive constraints of their hosts’’,this framework allows to understand that languagesthemselves probably evolved in a cultural Darwinianmanner so as to become easily learnable by their users.And the peculiarities of the pre-existing learning sys-tems of these users can explain the apparent idiosyn-cratic properties of languages. The paradigm shiftinduced by viewing language evolution as a Darwinianprocess also sets up new problems to be solved. Inparticular, it highlights the fact that the sharing ofcomplex and intricate linguistic conventions must beexplained: how can a system of competing replicatorsinteracting at the level of individuals converge to acoherent and distinctive system adopted by all thepopulation? We have shown with several computa-tional experiments how this problem could be solvedfor lexical systems, thanks to the use of specific repli-cation mechanisms based on positive feedback loopsand self-organization. Yet, future work will have toshow how several interacting levels of conventions,ranging from phonology to grammar and pragmatics,can be formed through a cultural Darwinian process.

Acknowledgments This research has been partially supportedby the ECAGENTS project founded by the Future andEmerging Technologies programme (IST-FET) of the EuropeanCommunity under EU R&D contract IST-2003-1940.

References

Ancel LW, Fontana W (2000) Plasticity, evolvability, andmodularity in RNA. J Exp Zool 288:242–283

Arita T, Koyama Y (1998) Evolution of linguistic diversity in asimple communication system. In: Adami C, Belew R,Kitano H, Taylor C (eds) Proceedings of Artificial Life VI.The MIT Press, Cambridge, MA, pp 9–17

Baronchelli A, Felici M, Loreto V, Caglioti E, Steels L (2006)Sharp transition towards shared vocabularies in multi-agentsystems. J Stat Mech P06014

Berlinski D (1972) Philosophical aspects of molecular biology.J Philos 69(12):319–335

Blevins J (2004) Evolutionary phonology: the emergence ofsound patterns. Cambridge University Press, Cambridge

Blevins J (2006) New perspectives on english sound patterns:natural and unnatural in evolutionary phonology. J EnglLinguist 34:6–25

de Boer B (1997) Generating vowel systems in a population ofagents In: Husbands P, Harvey I (eds) Proceedings of theFourth European Conference on Artificial Life. The MITPress, Cambridge, MA

Brighton H, Kirby S, Smith K (2005) Cultural selection forlearnability: three hypotheses underlying the view thatlanguage adapts to be learnable. In: Tallerman M (ed)

34 Cogn Process (2007) 8:21–35

123

Page 15: Language evolution as a Darwinian process: … · Language evolution as a Darwinian process: computational studies ... Darwin ian process . ... selves in the gene pool by leapi ng

Language origins: perspective on evolution. Oxford Univer-sity Press, Oxford

Cangelosi A, Parisi D, (2002) Simulating the evolution oflanguage. Springer, Heidelberg

Cangelosi A, Smith AM, Smith K (eds) (2006) The evolution oflanguage. In: Proceedings of the 6th International Confer-ence on the Evolution of Language. World Scientific,Singapore

Croft W (2000) Explaining language change. Linguistics. Long-man, London

Croft W (2002) The darwinization of linguistics. Selection 3:75–91

Crow J, Kimura M (1970) An introduction to population geneticstheory. Harper and Row, New York

Darwin C (1859) On the origin of species. John Murray, LondonDawkins R (1976) The selfish gene. Oxford University Press,

OxfordDe Boer B (1999) Self-organizing phonological systems. Ph.D.

thesis, VUB University, BrusselsDe Jong E, Steels L (2003) A distributed learning algorithm for

communication development. Complex Syst 14(4–5):315–334

Dorigo M, Maniezzo V, Colorni A (1997) The ant system:optimisation by a colony of cooperating agents. Trans SystMan Cybern Part B 26(1):29–41

van Driem G (2003) The language organism: the leiden theory oflanguage evolution. In: Mırovsky J, Kotesovcova A, Haji-cova E (eds) Proceedings of the XVIIth InternationalCongress of Linguists Prague: Matfyzpress vydavatelstvıMatematicko-fyzikalnı fakulty Univerzity Karlovy, Prague

Duda R, Hart P, Stork D (2001) Pattern classification. Wiley,New York

Gould SJ, Vrba E (1982) Exaptation—a missing term in thescience of form. Paleobiology 8:4–15

Hull D (1988) Science as a process: an evolutionary account ofthe social and conceptual development of science. Univer-sity of Chicago Press, Chicago

Hutchins E, Hazlehurst B (1995) How to invent a lexicon: thedevelopment of shared symbols in interaction. In: Gilbert N,Conte R (eds) Artificial societies: the computer simulationof social life. UCL Press, London, pp 157–189

Kaplan F (2001) La naissance d’une langue chez les robots.Hermes Science, Paris

Kaplan F (2005) Simple models of distributed co-ordination.Connect Sci 17(3–4):249–270

Kimura M (1983) The neutral theory of molecular evolution.Cambridge University Press, Cambridge

Kirby S (2001) Spontaneous evolution of linguistic structure: aniterated learning model of the emergence of regularity andirregularity. IEEE Trans Evol Comput 5(2):102–110

Kirby S (2002) Natural language and artificial life. Artif life8:185–215

Labov W (1994) Principles of linguistic change, vol 1. Internalfactors. Blackwell, Oxford

Maynard-Smith J (1982) Evolution and the theory of games.Cambridge University Press, Cambridge

Mitchell B, Weinmann L (1997) Creative design for the WWW.Lecture Notes, SIGGRAPH 1997

Mufwene SS (2005) Language evolution: the population geneticsway, vol 29. Gene, Sprachen, und ihre Evolution, pp 30–52

Oliphant M (1997) Formal approaches to innate and learnedcommunicaton: laying the foundation for language. Ph.D.thesis, University of California, San Diego

Oudeyer P-Y (2005a) How phonological structures can beculturally selected for learnability. Adapt Behav 13(4):269–280

Oudeyer P-Y (2005b) The self-organization of speech sounds.J Theor Biol 233(3):435–449

Oudeyer P-Y (2006) Self-organization in the evolution of speech.Oxford University Press, Oxford

Schleicher A (1863) Die darwinsche Theorie und die Sprachwis-senschaft: Offenes Sendschreiben an Herrn Dr. ErnstHackel. Bohlau, Weimar

Searls DB (2002) The language of genes. Nature 420:211–217Smith ADM (2005) The inferential transmission of language.

Adapt Behav 13(4):311–324Sperber D (1984) Anthropology and psychology: towards and

epidemiology of representations (the malinowsju memoriallecture 1984). Man 20:73–89

Steels L (1996) A self-organizing spatial vocabulary. Artif Life J2(3):319–332

Steels L (2003) Evolving grounded communication for robots.Trends Cogn Sci 7(7):308–312

Steels L (2004) Analogies between genome and languageevolution. In: Pollck J (ed) Artificial Life IX: Proceedingsof the Ninth International Conference on the Simulationand Synthesis of Living Systems. The MIT Press, Cam-bridge, MA, pp 200–207

Steels L, Belpaeme T (2005) Coordinating perceptuallygrounded categories through language: a case study forcolour. Behav Brain Sci 28:469–529

Steels L, Kaplan F (2002) Bootstrapping grounded word seman-tics. In: Briscoe T (ed) Linguistic evolution throughlanguage acquisition: formal and computational models.Cambridge University Press, Cambridge, pp 53–74

Steels L, Kaplan F (1998) Spontaneous lexicon change. In:Proceedings of COLING-ACL 1998. Morgan Kaufmann,San Francisco, CA, pp 1243–1250

Stegmann UE (2004) The arbitrariness of the genetic code. BiolPhilos 19(2):205–222

Tsonis AA, Elsner JB, Tsonis PA (1997) Is DNA a language?J Theor Biol 184:25–29

Vogt P (2005) The emergence of compositional structure inperceptually grounded language games. Artif Intell 167(1–2):206–242

Watson J, Crick F (1953) Molecular structure of nucleic acids.Nature 171:737–738

Zuidema W (2003) How the poverty of the stimulus solves thepoverty of the stimulus. In: Becker S, Obermayer K (eds)Advances in Neural Information Processing 15. MIT Press,Cambridge, pp 51–68

Cogn Process (2007) 8:21–35 35

123