UNIVERSIDAD POLITÉCNICA DE MADRID
FACULTAD DE INFORMÁTICA
Response Threshold Models, Stochastic Learning
Automata and Ant Colony Optimization-based
Decentralized Self-Coordination Algorithms for
Heterogeneous Multi-Tasks Distribution in
Multi-Robot Systems
Ph.D Thesis
Alma Yadira Quinonez Carrillo
M.Sc. in Artificial Intelligence
Madrid, 2012
DEPARTAMENTO DE INTELIGENCIA ARTIFICIAL
Thesis Advisors
Javier de Lope Asiaín
PhD. in Informatics
Darío Maravall Gómez-Allende
PhD. Telecommunications Engineer
Committee appointed by His Magnificence and Excellency the Rector of the
Universidad Politécnica de Madrid, on the ____ of ________ 2012.
President: ____________________
Member: ____________________
Member: ____________________
Member: ____________________
Secretary: ____________________
Substitute: ____________________
Substitute: ____________________
The defense and reading of the thesis took place on the ____ of ________
2012 at the Facultad de Informática.
MEMBER MEMBER MEMBER
PRESIDENT SECRETARY
I would like to dedicate this thesis to my Mother and my Brothers.
Acknowledgements
After such a great experience, I obviously have many people to thank...
First, I want to thank all my family members. Thanks for being there and
supporting me in every decision. Thank you for believing in me and giving me
the strength to face even the most difficult things. It is thanks to all of you
that I was able to achieve this objective.
I would also like to take this opportunity to thank my supervisors, Javier de
Lope and Darío Maravall, who have helped me enormously to further my
understanding and expand my horizons in the field of robotics. Above all, I
am very grateful to them for their unfailing interest, guidance and wisdom during
the development of this project.
I sincerely thank the Consejo Nacional de Ciencia y Tecnología,
the Universidad Autónoma de Sinaloa and the Universidad Politécnica de Madrid
for their financial support in carrying out this PhD thesis.
A heartfelt thanks also to the members of the Lab for making my stay more
comfortable, but in particular, to Antonio Fernandez and Juan Bekios for their
comments and suggestions.
Finally, but no less importantly, I would like to express my gratitude to all
the friends I met here in Madrid, who not only encouraged me during
my research career but also gave me many great moments. Thanks to
Marinela, Ivan, Lindsay, Miguel, Jez, Boris, Gonzalo, Juan, Tony, Ernesto, Raul,
Ghislain, David and Monse for sharing so many lunches with me and for talking
about more than just work. I have enjoyed these last few years enormously! Thank you
all for your support, friendship and conviviality.
Yadira Quinonez
Abstract
In recent decades, there has been an increasing interest in systems comprised of
several autonomous mobile robots, and as a result, there has been a substantial
amount of development in the field of Artificial Intelligence, especially in Robotics.
Several studies in the literature focus on the creation of intelligent machines
and devices capable of imitating the functions and movements of living beings.
Multi-Robot Systems (MRS) can often deal with tasks that are difficult, if not
impossible, for a single robot to accomplish. In the context of MRS, one of the
main challenges is
the need to control, coordinate and synchronize the operation of multiple robots
to perform a specific task. This requires the development of new strategies and
methods which allow us to obtain the desired system behavior in a formal and
concise way.
This PhD thesis studies the coordination of multi-robot systems and, in
particular, addresses the problem of the distribution of heterogeneous multi-tasks.
The main interest in these systems is to understand how, from simple rules inspired
by the division of labor in social insects, a group of robots can perform tasks in
an organized and coordinated way. We are mainly interested in truly distributed
or decentralized solutions in which the robots themselves, autonomously and in
an individual manner, select a particular task so that all tasks are optimally
distributed.
In general, to distribute multi-tasks among a team of robots, the robots
have to synchronize their actions and exchange information. Under this
approach we can speak of multi-tasks selection instead of multi-tasks assignment,
which means that the agents or robots select the tasks instead of being assigned a
task by a central controller. The key element in these algorithms is the estimation
of the stimuli and the adaptive update of the thresholds. This means that each
robot performs this estimate locally depending on the load or the number of
pending tasks to be performed. In addition, it is of great interest to evaluate
the results of each approach, comparing the results obtained when noise is
introduced in the number of pending loads in order to simulate
the robot’s error in estimating the real number of pending tasks.
The main contribution of this thesis can be found in the approach based on
self-organization and division of labor in social insects. An experimental scenario
for the coordination problem among multiple robots, the robustness of the ap-
proaches and the generation of dynamic tasks have been presented and discussed.
The particular issues studied are:
• Threshold models: The experiments conducted to test the response
threshold model are presented, with the objective of analyzing the system
performance index for the problem of the distribution of heterogeneous
multi-tasks in multi-robot systems. Additive noise has also been introduced
in the number of pending loads, and dynamic tasks have been generated over time.
• Learning automata methods: The experiments to test the learning
automata-based probabilistic algorithms are described. The approach was tested to
evaluate the system performance index with additive noise and with dynamic
tasks generation for the same problem of the distribution of heterogeneous
multi-tasks in multi-robot systems.
• Ant colony optimization: The goal of the experiments presented is to test
the ant colony optimization-based deterministic algorithms, to achieve the
distribution of heterogeneous multi-tasks in multi-robot systems. In the
experiments performed, the system performance index is evaluated by in-
troducing additive noise and dynamic tasks generation over time.
Resumen
In recent decades, there has been growing interest in systems composed
of several autonomous mobile robots and, as a result, a substantial amount
of development has emerged in the field of artificial intelligence, especially
in robotics. Several studies in the literature focus on the creation of
intelligent machines and devices capable of imitating the functions and
movements of living beings. Multi-robot systems (MRS) can often deal with
tasks that are difficult, if not impossible, for a single robot to perform. In
the context of MRS, one of the main challenges is the need to control,
coordinate and synchronize the operation of multiple robots to perform a
specific task. This requires the development of new strategies and methods that
make it possible to obtain the desired system behavior in a formal
and concise way.
This thesis aims to study the coordination of multi-robot systems and, in
particular, addresses the problem of the distribution of multiple heterogeneous
tasks. The main interest in this type of system is to understand how,
from simple rules inspired by the division of labor in social insects,
a group of robots can perform tasks in an organized and
coordinated way. We are mainly interested in truly distributed or
decentralized solutions in which the robots themselves, autonomously
and individually, select a particular task so that all
tasks are optimally distributed.
In general, to distribute multiple tasks among a team of robots,
the robots have to synchronize their actions and exchange information. Under
this approach we can speak of the selection of multiple tasks instead of the
assignment of multiple tasks, that is, the agents or robots select
the tasks instead of being assigned a task by a central controller. The
key element in these algorithms is the estimation of the stimuli and the
adaptive update of the thresholds. This means that each robot performs
this estimate locally depending on the load or the number of tasks
pending execution. In addition, it is of great interest to evaluate the
results of each approach, comparing the results obtained when
noise is introduced in the number of pending loads to simulate the robot's
error in estimating the real number of pending tasks.
The main contribution of this thesis can be found in an approach based
on self-organization and division of labor in social insects. An experimental
scenario for the coordination problem among multiple robots, the
robustness of the approaches and the generation of dynamic tasks have been presented
and discussed. The specific topics studied are the following:
• Threshold models: the experiments conducted to test the response
threshold model are presented, with the objective of analyzing the system
performance index for the problem of the distribution of multiple
heterogeneous tasks in multi-robot systems; additive noise has also been
introduced in the number of pending loads, and dynamic tasks have been
generated over time.
• Learning automata methods: the experiments to test the learning
automata-based probabilistic algorithms are described.
The approach was tested to evaluate the system performance index with
additive noise and with dynamic tasks generation for the same problem of
the distribution of multiple heterogeneous tasks in multi-robot systems.
• Ant colony optimization: the goal of the experiments presented is to
test the ant colony optimization-based deterministic algorithms
in order to achieve the distribution of
multiple heterogeneous tasks in multi-robot systems. In the experiments
performed, the system performance index was evaluated by
introducing additive noise and dynamic tasks generation over
time.
Contents
Acknowledgements
Abstract
Resumen
Contents
List of Figures
List of Tables
I Goals and Background
1 Introduction
1.1 Motivation
1.2 Thesis Objectives
1.2.1 General Objective
1.2.2 Specific Objectives
1.3 Main Contributions and Publications
1.3.1 Main Contributions
1.3.2 Publications
1.4 Thesis Structure
2 State of the Art
2.1 Multi-Robot Systems
2.1.1 Coordination in Multi-Robot Systems
2.1.2 Architectures for Multi-robot Systems
2.1.2.1 Centralized Architectures
2.1.2.2 Hierarchical Architectures
2.1.2.3 Decentralized Architectures
2.1.2.4 Hybrid Architectures
2.1.3 Main Problems among a Group of Robots
2.1.4 Coordination Schemes: Cooperative and Competitive
2.2 Fields of Application
2.2.1 Cooperative Manipulation
2.2.2 Unstructured Environments
2.2.3 Formation Control
2.2.4 Biologically-Inspired
2.3 Previous and Related Work
2.3.1 Formal Methods in Relation to Coordination
2.3.1.1 Multi-Agent Systems
2.3.1.2 Swarm Robots
2.3.1.3 Multi-Robot Systems
II Setting the Problem
3 Problem Description
3.1 Problem Statement
3.2 Formal description of the problem
3.3 Application Scenario
3.4 Description of the Proposed Solution
III Foundations
4 Theoretical Fundamentals
4.1 Introduction
4.2 Threshold Models
4.2.1 An Overview of Response Threshold Model
4.2.2 Model
4.3 Learning Automata Methods
4.3.1 A Brief Introduction
4.3.2 Definition of Stochastic Processes
4.3.3 Basic Definition of Learning Automata
4.3.4 Stochastic Reinforcement Algorithms based on Reward and Penalty
4.4 Ant Colony Optimization
4.4.1 A Brief Introduction
4.4.2 Biological Inspiration
4.4.3 The Ant System Approach
IV Experimentation and Conclusions
5 Experimental Results
5.1 Preliminaries of the Experimentation
5.1.1 Evaluation of the Performance Index
5.1.1.1 Additive Noise Generation
5.1.1.2 Dynamic Tasks Generation
5.2 Experiments with Threshold Models
5.2.1 Goals
5.2.2 Evaluation of the Approach with Additive Noise
5.2.3 Evaluation of the Approach with dynamic tasks
5.2.4 Results and Discussion
5.3 Experiments with Learning Automata-based Probabilistic Algorithms
5.3.1 Goals
5.3.2 Evaluation of the Approach with Additive Noise
5.3.3 Evaluation of the Approach with Dynamic Tasks
5.3.4 Results and Discussion
5.4 Experiments with Ant Colony Optimization-based Deterministic Algorithms
5.4.1 Goals
5.4.2 Evaluation of the Approach with Additive Noise
5.4.3 Evaluation of the Approach with Dynamic Tasks
5.4.4 Results and Discussion
6 Conclusions and Further Work
6.1 Conclusions
6.2 Future Research Work
Bibliography
List of Figures
2.1 Taxonomy: coordination dimensions in multi-robot systems
2.2 Multi-robot system
2.3 Box-Pushing Mission [59; 107; 160; 166] and group of mobile robots
designed to work cooperatively lifting columns
(http://birg.epfl.ch/page28710.html)
2.4 Exploration in unstructured environments. (a) The Mars exploration
rovers, Spirit and Opportunity, with a manipulator arm in
front, (b) a conceptual drawing for robotic rescue of Hubble space
telescope, (c) The Pathfinder rover, Sojourner and (d) Rocky 4.
2.5 Formation Control. (a) Flying in Formation Takes Aircraft Farther,
Dylan Ashe (http://www.popsci.com/). (b) Image of Vicon cameras
overlooking a group of Khepera III robots;
3 cameras shown, 8 cameras total [98]
2.6 Bio-inspired robotics
3.1 Experimental scenario
3.2 Procedure for the selection of multi-tasks
4.1 Threshold function
4.2 Semi-logarithmic plot with different thresholds (θ = 1, 5, 20, 50)
and with n = 2.
4.3 Interaction of learning automaton with random environment
4.4 Experimental setting presented in [12] that shows the shortest-path
finding capability of ant colonies
5.1 Learning curves with the evolution of the system performance index
for self-election of tasks using Response Threshold Models with
noise = 0.10
5.2 Learning curves with the evolution of the system performance index
for self-election of tasks using Response Threshold Models with
noise = 0.25
5.3 Dynamic tasks generation: learning curves with the evolution of
the system performance index for self-election of tasks using
Response Threshold Models
5.4 Learning curves with the evolution of the system performance index
for self-election of tasks using Learning Automata-based
probabilistic algorithms with noise = 0.10
5.5 Learning curves with the evolution of the system performance index
for self-election of tasks using Learning Automata-based
probabilistic algorithms with noise = 0.25
5.6 Dynamic tasks generation: learning curves with the evolution of
the system performance index for self-election of tasks using
Learning Automata-based probabilistic algorithms
5.7 Learning curves with the evolution of the system performance index
for self-election of tasks using Ant Colony Optimization-based
deterministic algorithms with noise = 0.10
5.8 Learning curves with the evolution of the system performance index
for self-election of tasks using Ant Colony Optimization-based
deterministic algorithms with noise = 0.25
5.9 Dynamic tasks generation: learning curves with the evolution of
the system performance index using Ant Colony Optimization-based
deterministic algorithms
5.10 The index k represents the number of tasks expected to be generated
during a time interval for different values of λ and P (X = k)
describes the probability that a value of variable X with a given
probability distribution is equal to k
5.11 Number of tasks performed by each robot
List of Tables
2.1 Taxonomies multi-robot
5.1 Experiments performed without dynamic tasks and their respective
variants
5.2 Experiments performed with dynamic tasks and their respective
variants
Part I
Goals and Background
Chapter 1
Introduction
A rational and fruitful discussion is
impossible unless the participants share a
common framework of basic assumptions
or, at least, unless they have agreed on
such a framework for the purpose of the
discussion.
Karl R. Popper
SUMMARY: This chapter details the aspects related to the research
area. Section 1.1 gives the reasons why it is important
to carry out this research work. Section 1.2 defines the general and specific
objectives. Section 1.3 presents the main contributions of the thesis and
the results obtained, which have been presented at several international
conferences and published in peer-reviewed international scientific journals.
Finally, Section 1.4 briefly describes the organization of the
thesis.
1.1 Motivation
Systems formed by multiple mobile robots, also known as Multi-Robot Systems
(MRS), are employed for different reasons. One of the main motivations
is that MRS can be used to increase the system effectiveness in terms of
time and quality, providing greater flexibility in task execution. Generally
speaking, the term multi-robot system includes different types of robotic systems,
for example, several industrial manipulators, mobile robots with manipulators on
board, or teams of autonomous vehicles. In this thesis, however, the term will be used
to refer to a team of cooperating mobile robots that carries out the distribution of
heterogeneous multi-tasks.
The problem of coordination in MRS has been discussed in the literature
in many forms; each of the proposed methods is applied to groups of robots
that work closely together to accomplish a task composed of multiple sub-tasks.
As is typical for many complex systems, mathematical models are needed to
reason about the tradeoffs and accuracy of a system. The main advantages of
these systems are that the robots can perform multiple tasks with
much greater precision than humans and, above all, that they can be extremely
efficient: they perform calculations quickly, minimize risk and
complete a task in less time. Probably one of the most promising directions for
research in this area is based on the coordination of multiple robots.
In recent decades, a large amount of research on autonomous mobile robots
has addressed the coordination between them [3; 18;
24; 75]. These investigations have been directed toward finding efficient and
robust methods for controlling these groups of mobile robots. With this increase,
new problems have also arisen that require the execution of bigger and more
complex tasks. A very useful solution to this problem is to use multiple
cooperative robots to accomplish a certain task, since the cost is generally lower
for several robots than it would be for one single robot. In addition, a group of
robots can perform more tasks, and perform them faster, than a single independent
robot could.
For example, a group of unmanned aerial vehicles (UAVs) can be deployed to
perform dangerous tasks, to improve the chance of success and to study the
consequences in case of a natural disaster. In some applications, such as reconnaissance
missions, mine detection, surveillance and victim rescue, groups of robots can
augment and even replace humans in order to avoid possible injury to those that
protect us. During these missions, it is necessary to maintain communication
within the team of robots to carry out the task at hand successfully.
MRS can often deal with tasks that are difficult, if not impossible, to be
accomplished by a single robot. In the context of MRS, one of the challenges is
the need to control, coordinate and synchronize the operation of multiple robots
to perform a specific task. This requires the development of new strategies and
methods to obtain the desired system behavior by means of simple rules inspired
by the division of labor in social insects, so that a group of robots can
perform tasks in an organized and coordinated way.
1.2 Thesis Objectives
This PhD thesis focuses on the self-coordination problem of MRS and in particu-
lar addresses the distribution of heterogeneous multi-tasks in a robust and efficient
manner. We take into account a specifically distributed or decentralized approach
as we are particularly interested in experimenting with truly autonomous and de-
centralized techniques in which the robots themselves are responsible for choosing
a particular task in an autonomous and individual way. In this regard, we have
experimented with different techniques: firstly, the application of the response
threshold models inspired by division of labor in social insects, secondly, the
application of the reinforcement learning algorithm based on learning automata
theory, and finally, ant colony optimization-based deterministic algorithms.
There are different strategies to address the task assignment problem, but
this thesis presents different self-organizing, biologically inspired approaches
that address multi-tasks selection instead of multi-tasks assignment.
This thesis will attempt, first, to answer the following questions:
• Is it possible for agents or robots to select the tasks instead of being
assigned them?
• Is it possible to obtain an optimal distribution of the tasks when noise
is introduced in the approaches?
1.2.1 General Objective
The main goal of this PhD thesis is:
“Study, analyze and propose a set of techniques or methods for the
problem of coordinating multi-robot systems, specifically in the
distribution of heterogeneous multi-tasks, and experiment with
different approaches based chiefly on biologically inspired
self-organization and emergence.”
1.2.2 Specific Objectives
The main goal is decomposed into the following specific objectives for this
research:
• Investigate decentralized approaches inspired by the division of labor in
social insects and apply them to the problem of distribution of heterogeneous
multi-tasks in MRS.
• Define the experimental scenario.
• Define the number of robots and the number of tasks in the system.
• Design the auto-assignment algorithm for multi-tasks with response thresh-
old models.
• Design the auto-assignment algorithm for multi-tasks using the reinforcement
learning algorithm based on learning automata theory.
• Design the auto-assignment algorithm for multi-tasks using ant colony optimization-
based deterministic algorithms.
• Analyze the robustness of the approaches by introducing noise to the meth-
ods.
• Generate dynamic tasks over time.
1.3 Main Contributions and Publications
1.3.1 Main Contributions
The thesis presents several contributions to the self-coordination problem of
multi-robot systems in the distribution of heterogeneous multi-tasks, using
different biologically inspired approaches. The results obtained are based
on papers that have been presented at and published in several international
conferences and journals.
The main contributions of the thesis are:
• A bio-inspired solution based on response threshold models to solve the
problem of self-coordination of multiple robots, through the distribution of
heterogeneous and specialized multi-tasks in multi-robot systems.
• A solution using a learning automata-based probabilistic algorithm that
focuses on the general problem of coordinating multiple robots, specifically
on self-coordination in the selection of heterogeneous multi-tasks in
multi-robot systems.
• A solution using two different approaches by applying ant colony optimization-
based deterministic algorithms as well as learning automata-based prob-
abilistic algorithms which addresses the general problem of coordinating
multiple robots specifically for decentralized distribution of multi-tasks in
heterogeneous robot teams.
• A solution using two different approaches by applying response threshold
models and stochastic learning automata to solve the problem correspond-
ing to self-coordination in the distribution of heterogeneous multi-tasks in
multi-robot systems.
• An experimental scenario for all approaches has been proposed in order to
analyze the coordination problem among multiple robots. The robustness of
each method has been studied by introducing noise, which perturbs
the number of pending loads. The performance index with generation of
tasks over time has also been analyzed.
1.3.2 Publications
The results presented here form the basis of this thesis and have been
published in several international conferences and journals. They appear in
the IEEE library, the ACM library, the ISI Web of
Knowledge, Lecture Notes in Computer Science and Lecture Notes in Artificial
Intelligence by Springer-Verlag. The publications are listed below.
Journal Publications:
• De Lope, J., Maravall, D. and Quinonez, Y. (2012). Response threshold
models and stochastic learning automata for self-coordination of heteroge-
neous multi-tasks distribution in multi-robot systems. Robotics and Au-
tonomous Systems - Impact Factor: 1.313 [31].
International Conference Publications:
• Quinonez, Y., De Lope, J. and Maravall, D. (2009). Communication and
coordination of robots teams in dynamic environments. Twelfth International
Conference on Computer Aided Systems Theory, EUROCAST 2009,
pp. 150–151 [128].
• Quinonez, Y., Baca, J., De Lope, J., Ferre, M. and Aracil, R. (2010). Self-
alignment approach based on cooperative behaviors for the docking process
of modular mobile robots. IEEE International Conference on Electronics,
Robotics and Automotive Mechanics, CERMA 2010, pp. 445–450 [130].
• Quinonez, Y., Maravall, D. and De Lope, J. (2012). Application of self-
organizing techniques for the distribution of heterogeneous multi-tasks in
multi-robot systems. IEEE International Conference on Electronics, Robotics
and Automotive Mechanics, CERMA 2012, pp. 66–71 [133].
Book Chapter Publications:
• Quinonez, Y., De Lope, J. and Maravall, D. (2009). Cooperative and com-
petitive behaviors in a multi-robot system for surveillance tasks. Computer
Aided Systems Theory, EUROCAST 2009. Revised Selected Papers, LNCS
5717. R. Moreno-Diaz, F. Pichler, A. Quesada (Eds.) Springer-Verlag,
Berlin Heidelberg, pp. 437–444 [129].
• Quinonez, Y., De Lope, J. and Maravall, D. (2011). Bio-inspired decentral-
ized self-coordination algorithms for multi-heterogeneous specialized tasks
distribution in multi-robot systems. Foundations on Natural and Artificial
Computation, LNCS 6686. J.M. Ferrandez et al. (Eds.) Springer-Verlag,
Berlin Heidelberg, pp. 30–39 [131].
• Quinonez, Y., De Lope, J. and Maravall, D. (2011). Stochastic learning
automata for self-coordination in heterogeneous multi-tasks selection in
multi-robot systems. International Conference on Advances in Artificial
Intelligence, MICAI 2011, Part I, LNAI 7094, pp. 443–453 [132].
• De Lope, J., Maravall, D. and Quinonez, Y. (2012). Decentralized multi-
tasks distribution in heterogeneous robot teams by means of ant colony
optimization and learning automata. International Conference on Hybrid
Artificial Intelligence Systems, HAIS 2012, Part I, LNCS 7208, pp. 103–114
[32].
1.4 Thesis Structure
This document is organized into chapters, whose contents are briefly described
below.
• Chapter 2. State of the Art
This chapter explains the main features of systems formed by multiple robots
and gives an overview of previous work related to this research, in order to
cover the necessary background and contextualize the associated domains. It
presents an overview of the main issues of multi-robot systems: control
architectures, coordination schemes and the main problems within these systems.
In addition, it describes some applications of robotics that involve multiple
robots in different fields, such as cooperative manipulation, unstructured
environments, formation control and biologically-inspired robotics. Finally, it
briefly describes the main previous works related to multi-robot systems and
the formal methods used.
• Chapter 3. Problem Description
This chapter defines the problem statement of the thesis, presents a formal
description of the problem and describes the experimental scenario. Finally,
it details the proposed solution to the previously defined problem.
• Chapter 4. Theoretical Fundamentals
This chapter reviews the mathematical concepts used throughout the thesis. Its
main objective is to describe mathematical or probabilistic models based on
distributed or decentralized approaches inspired by the division of labor in
social insects. After a brief introduction to mathematical models, it first
gives an overview of the response threshold model and, specifically, of its
mathematical formulation. Secondly, it introduces learning automata methods:
basic definitions from the theory of stochastic processes, a basic definition
of learning automata, and stochastic reinforcement algorithms based on reward
and penalty. Finally, it gives a brief introduction to ant colony optimization,
its biological inspiration and a description of the ant system algorithm.
• Chapter 5. Experimental Results
This chapter presents the experimental results obtained by applying the
different decentralized approaches inspired by the division of labor in social
insects: the response threshold model, the ant colony optimization-based
deterministic algorithms and the learning automata-based probabilistic
algorithms. We analyze the experimental results, evaluating the performance
index when additive noise is introduced into the number of pending loads and
when tasks are generated dynamically over time.
• Chapter 6. Conclusions and Further Work
This chapter presents the conclusions of the thesis and details the future
research lines derived from this work.
Chapter 2
State of the Art
Science, despite its incredible advances, is
not and will never be able to explain
everything. It will continue to conquer new
areas that today are beyond our
understanding. But the frontiers of
knowledge, however high these may be
raised, will always have an infinite world
of mystery.
Gregorio Marañón
SUMMARY: This chapter explains the main features of systems formed by multiple
robots and gives an overview of previous work related to this research, in
order to cover the necessary background and contextualize the associated
domains. Section 2.1 provides an overview of the main issues of multi-robot
systems: control architectures, coordination schemes and the main problems
within these systems. Section 2.2 presents some applications of robotics that
involve multiple robots in different fields, such as cooperative manipulation,
unstructured environments, formation control and biologically-inspired
robotics. Finally, Section 2.3 presents and reviews the main previous works
related to multi-robot systems and the formal methods used.
2.1 Multi-Robot Systems
Multi-robot systems (MRS) are one of the characteristic applied areas of
Artificial Intelligence and have grown remarkably since their inception
[50; 55]. The area has made very significant progress in various fields of
application [124], becoming a fundamental tool to produce, work and perform
dangerous jobs on Earth and beyond.
In recent years, MRS have been increasingly used in highly dynamic or
contradictory environments to deal with complex tasks [83]. They are quickly
becoming a vast research area that includes several different topics and ideas,
as shown in various works [4; 35; 49; 65; 80; 122]. An MRS consists of a set of
robots that interact with each other in the same environment to achieve a
common goal [53], thereby trying to improve effectiveness, performance and
robustness. These systems provide greater flexibility in performing tasks and
possible fault tolerance. Getting several robots to coordinate with each other
to perform a specific mission is not a trivial task: they must be designed to
operate in dynamic environments where, in addition to the classical problems of
autonomous robotics (e.g. uncertainty and unforeseen changes, which are always
present), new difficulties arise from the influence of the team's robots on the
environment and on the task goal.
The main advantages of these systems over a single robot are higher
flexibility, efficiency and reliability. They achieve more robust behavior by
accomplishing coordinated tasks that are not possible for single robots; they
can perform complex tasks much faster and execute tasks beyond the limits of
single robots. In fact, a multi-robot system can be robust to malfunctions
such as unreliable communication and robot failures. Arai et al. [4] and
Parker [122] have identified the following primary research topics within MRS:
• biological inspirations;
• communication;
• architectures, task allocation, and control;
• localization, mapping, and exploration;
• object transport and manipulation;
• motion coordination;
• reconfigurable robots;
• learning.
In recent years, the scientific community has made progress in cooperative
robotics with respect to mechanisms for coordination and communication [85].
Dudek et al. [49] present a taxonomy for multi-agent robotic systems,
proposing a classification based on the size of the team, communication
parameters (communication range, bandwidth and topology), the
reconfigurability of the team, the processing capacity of each member and the
team composition (homogeneous vs. heterogeneous robots).
A taxonomy for the classification of coordination approaches in MRS has been
proposed in [53; 80]. It is based on different levels of coordination
(unaware, aware but not coordinated, weakly coordinated and strongly
coordinated systems) and is characterized by two groups of dimensions: the
coordination dimensions (cooperation, knowledge, coordination and
organization) and the system dimensions (communication, team composition,
system architecture and team size). The term dimension refers to specific
features that are grouped together in the taxonomy. Fig. 2.1 shows a
hierarchical structure for the coordination dimensions of the taxonomy. The
levels of the structure are a cooperation level, a knowledge level, a
coordination level and an organization level.
The first level of the taxonomy is concerned with the ability of the system to
cooperate in order to accomplish a specific task. The second level is concerned
with how much knowledge each robot in the system has about the presence of
other robots. The third level is concerned with the mechanism that is used in
order to achieve cooperation in the system. The fourth level is concerned with the
way the decision system is realized within the MRS. Finally, the work in [66]
presents a taxonomy based on coordination mechanisms and on multi-robot task
allocation.
[Figure 2.1: hierarchical diagram with four levels. Cooperation level:
Cooperative. Knowledge level: Aware, Unaware. Coordination level: Strongly
Coordinated, Weakly Coordinated, Not Coordinated. Organization level: Strongly
Centralized, Weakly Centralized, Distributed.]
Figure 2.1: Taxonomy: coordination dimensions in multi-robot systems
Some researchers have proposed taxonomies or classification systems that help
to organize and control a multi-robot system. Table 2.1 summarizes the most
significant features of some multi-robot taxonomies presented in the
literature.
2.1.1 Coordination in Multi-Robot Systems
Coordination, the act of organizing a group of mobile robots, is of
fundamental importance for any MRS. That is, coordination in MRS implies that
a group of robots works together to accomplish specified actions
simultaneously, which can result in the completion of an overall system goal
at the global level. Cooperation refers to the simultaneous action of two or
more agents that work together and produce the same effect. In the context of
multi-robot systems, cooperation is defined as the constructive and
synergistic interaction of robots in a system to exchange information in an
intelligent manner and thus execute tasks more quickly and efficiently.

Taxonomy               Domain              Description
Yuta et al. [169]      Multi-robot         Defined from the objectives and mechanisms of decision.
Fulbright et al. [60]  Multi-agent         Establishes three classifications according to the coupling of agents.
Cao [19]               Cooperative robots  Based on problems and solutions of the cooperation.
Balch [8]              Multi-robot         Useful in systems that employ reinforcement learning (tasks and rewards).
Stone et al. [145]     Multi-agent         Studies the homogeneity of the agents and their level of communication.
Todt [154]             Multi-robot         Based on coordination between robots.

Table 2.1: Multi-robot taxonomies

Explicit definitions of cooperation and coordination in an MRS are given in
[80; 87] as follows:
COORDINATION: Cooperation in which the actions performed by each robot
take into account the actions executed by the other robots in such a way that the
whole ends up being a coherent and high performance operation.
COOPERATION: Situation in which several robots operate together to per-
form some global task that either cannot be achieved by a single robot, or whose
execution can be improved by using more than one robot, thus obtaining higher
performances.
Coordination is an essential characteristic of a group of robots and an
important research issue [84], because it requires the development of new
techniques for control and coordination that enable the robots to interact
with each other and with the environment to solve problems together. The
coordination between the robots can vary, but there are usually four kinds of
architectures for coordinating multiple robots: centralized, distributed,
hierarchical and hybrid architectures.
2.1.2 Architectures for Multi-robot Systems
Robot architectures are designed to facilitate the concurrent execution of task-
achieving behaviors. At a very low level, robots must be able to react quickly to
dynamic changes in the environment and perform reactive routines in order to
accomplish tasks such as obstacle avoidance. At a higher level, robots must be
able to coordinate with each other, performing asynchronous tasks such as
cooperative search or highly synchronized tasks such as cooperative
transportation. Several different kinds of control architectures for MRS have
been presented in the literature; however, the main distinction can be made
between centralized, hierarchical, decentralized and hybrid architectures
[124].
2.1.2.1 Centralized Architectures
Centralized multi-robot systems were developed as a method to coordinate
communication between robots and the system. Centralization allows the main
processing and computational requirements to be removed from the individual
robots and completed on an external computer [149]. In centralized systems, a
central unit collects and manages information about the environment and
optimizes the coordination among the robots to ensure the proper achievement
of the mission; moreover, it can easily manage faults of some of the robots.
In these approaches, the central unit plays a key role because it handles the
whole system: it has to coordinate the information received from the sensors,
manage global information about the environment, take all decisions and
communicate with all robots of the team. Therefore, it must be powerful enough
to satisfy all technological requirements.
2.1.2.2 Hierarchical Architectures
Hierarchical architectures are realistic for some applications. In this control ap-
proach, each robot oversees the actions of a relatively small group of other robots,
each of which in turn oversees yet another group of robots, and so forth, down
to the lowest robot, which simply executes its part of the task. This architecture
scales much better than centralized approaches, and is reminiscent of military
command and control. A point of weakness for the hierarchical control architec-
ture is recovering from failures of robots high in the control tree [124].
2.1.2.3 Decentralized Architectures
In decentralized control architectures, the act of coordination is
significantly more complex [170]. Decentralized multi-robot systems stem from
the inability to adapt a fully centralized system to specific environments.
Developing a fully centralized system is often difficult due to the number of
robots or the capabilities of the central processor [99], and therefore
decentralized systems are needed. These systems are highly scalable to large
multi-robot systems and applicable to unknown outdoor environments [25].
Decentralized systems can easily be made tolerant to possible faults; however,
one major drawback is the complexity of the communications network that needs
to be developed between the robots [90], since each robot works independently
and the resources are distributed among all the robots. Each robot uses its
own sensors to obtain local information about the environment and the relative
position of the closest robots in order to take its own decisions. As a
result, it is more difficult to coordinate the robots and optimize the
execution of the mission, and a lot of cooperation must be developed so that
the system can work together.
2.1.2.4 Hybrid Architectures
Hybrid control architectures combine local control with higher-level control ap-
proaches to achieve both robustness and the ability to influence the entire team’s
actions through global goals, plans, or control. Many multi-robot control ap-
proaches make use of hybrid architectures [124].
Several works in the literature have proposed such schemes, with experiments
on the coordination of multi-robot systems [24; 75; 88; 105]. There are
several examples of different multi-robot specific architectures, employing
different control strategies. Below we briefly describe three prominent
architectures that have been proposed in the literature:
1. The ALLIANCE architecture, developed by Parker [121], is a fault-tolerant,
reliable and adaptive architecture for the cooperative control of teams of
heterogeneous mobile robots performing missions composed of loosely coupled
subtasks that may have ordering dependencies. ALLIANCE is a fully distributed,
behavior-based architecture that incorporates the use of
mathematically-modeled motivations. The ALLIANCE architecture is implemented
on each robot in the cooperative team and delineates several behavior sets,
each of which corresponds to some high-level task-achieving function. The
primary mechanism enabling a robot to select a high-level function to activate
is the motivational behavior.
2. The Layered Architecture for the coordination of mobile robots, developed
by Simmons et al. [144], enables multiple robots to explicitly coordinate
actions at multiple levels of abstraction. It has three layers that enable
robots to interact directly at the behavioral level, the executive level and
the planning level. This architecture ensures that at all levels the robots
utilize coordinated behaviors, coordinated task execution and coordinated
planning. Each robot has these three layers; within an individual robot the
layers can exchange information, while between robots the corresponding layers
(e.g. the executive layers) talk to each other.
3. The CAMPOUT architecture, designed by Huntsberger et al. [78], is an
architecture that is able to autonomously adapt to the uncertainties of a
dynamic environment. “CAMPOUT is a distributed control architecture
based on a multi-agent behavior-based methodology, wherein higher-level
functionality is composed by coordination of more basic behaviors under the
downward task decomposition of a multi-agent planner. Basically CAM-
POUT provides the infrastructure, tools and guidelines that consolidate a
number of diverse techniques to allow the efficient use and integration of
these components for meaningful interaction and operation”. CAMPOUT
comprises five different architectural mechanisms: behavior representation,
behavior composition, behavior coordination, group coordination and
communication behaviors.
The above architectures are but a few of the complex architectures that have
been developed strictly for multi-robot systems; other architectures have been
proposed and presented in [23; 56; 151; 159; 171].
2.1.3 Main Problems among a Group of Robots
Communication plays an important role in multi-robot systems and can increase
their capacity and effectiveness. However, it is one of the main problems
among a group of autonomous robots due to its complexity and dynamism, as it
depends both on environmental conditions and on the interaction between the
robots themselves. The amount of information exchanged at a time can vary from
one problem to another, and consequently the degree of coordination increases
with system complexity [85].
Communication in a multi-robot system is the ability of the members of the
system to transmit and receive information between them. In a system of
multiple robots there can be two types of communication [163]. The first is
intentional or direct communication, in which dedicated devices are used to
ensure effective communication. In this type, messages have a defined receiver
that always gets the information; that is, communication is transmitted and
received via some sort of protocol or language as a medium. The second type is
non-intentional or indirect communication, in which information is transmitted
through environmental changes or through the visible state of the agents, also
known as stigmergy. In this type of communication there is no specific
receiver for messages; that is, agents can leave marks and trails that convey
information to other agents, which will recognize these changes in the
environment.
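As an illustration of indirect communication, the following minimal sketch
models an environment in which agents deposit and sense marks without
addressing any specific receiver. The class name, the grid-cell keys and the
evaporation rule are assumptions introduced here for illustration (evaporation
is borrowed from pheromone-based models); they are not taken from [163].

```python
from collections import defaultdict


class MarkedEnvironment:
    """Hypothetical shared environment used for indirect communication."""

    def __init__(self, evaporation=0.1):
        # Fraction of every mark that fades at each time step (assumed rule).
        self.evaporation = evaporation
        self.marks = defaultdict(float)

    def deposit(self, cell, amount=1.0):
        """An agent leaves a mark at a cell; no receiver is specified."""
        self.marks[cell] += amount

    def sense(self, cell):
        """Any agent visiting the cell can read the current mark intensity."""
        return self.marks.get(cell, 0.0)

    def step(self):
        """Advance time by one step, letting all marks fade."""
        for cell in list(self.marks):
            self.marks[cell] *= 1.0 - self.evaporation


env = MarkedEnvironment(evaporation=0.5)
env.deposit((2, 3))          # robot A marks a cell and moves on
env.step()                   # time passes, the mark fades
trail = env.sense((2, 3))    # robot B later reads what is left of the mark
```

The two robots never exchange a message: the only coupling between them is
the state of the environment, which is the defining property of this type of
communication.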
Several investigations have addressed the problem of communication and
information flow between multiple robots. Different works focusing on this
problem have been presented taking into account limited communication
[30; 58; 110], and recently there has been increased interest in the
self-emergence of a common lexicon in robot teams [96; 101; 104].
2.1.4 Coordination Schemes: Cooperative and Competitive
Cooperative and competitive methods provide a means of coordinating behavioral
responses for conflict resolution, with cooperative methods offering an
alternative to competitive ones. In competitive methods, coordination can be
viewed as a competition among behaviors; this type of competitive strategy can
be performed in a variety of ways. Generally, a coordination function (serving
as an arbiter) selects a single behavioral response. The function can take the
form of either a prioritization network (in which a strict behavioral
dominance hierarchy exists) or an action-selection method (in which, on the
basis of sensor information, only the most active behavior is selected).
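The two forms of coordination function described above can be sketched as
follows. This is a minimal illustration under assumed names: the behavior
labels, the function names and the representation of activation levels as
numbers are hypothetical and do not come from the cited architectures.

```python
from typing import Dict, List, Optional


def prioritization_arbiter(active: Dict[str, bool],
                           priority: List[str]) -> Optional[str]:
    """Strict dominance hierarchy: the first active behavior in the
    fixed priority order wins, regardless of the others."""
    for name in priority:
        if active.get(name, False):
            return name
    return None


def action_selection_arbiter(activations: Dict[str, float]) -> Optional[str]:
    """Action selection: only the behavior with the highest activation
    level (computed elsewhere from sensor information) is selected."""
    if not activations:
        return None
    return max(activations, key=activations.get)


# Hypothetical behaviors for a patrolling robot.
chosen_by_priority = prioritization_arbiter(
    {"avoid_obstacle": True, "patrol": True},
    ["avoid_obstacle", "chase", "patrol"],
)
chosen_by_activation = action_selection_arbiter(
    {"avoid_obstacle": 0.2, "chase": 0.9, "patrol": 0.5}
)
```

In both cases exactly one behavioral response is passed on to the actuators;
the difference lies in whether the winner is fixed by design (priority order)
or recomputed from the current activation levels.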
As previously noted, an MRS has several advantages over a single robot;
however, many problems need to be considered in a dynamic environment, for
example multiple moving objects, various obstacles and team members, among
others. All this makes it more difficult to achieve coordination between
robots. Currently, one of the main interests of the international community is
to design communication and coordination strategies for MRS that allow robots
to modify their behavior to cope with environmental changes or actions
performed by other robots, in order to obtain cooperative behavior that allows
them to achieve a common goal.
In previous works [128; 129] we presented a control architecture to achieve
cooperative and competitive behaviors in an MRS in an unknown environment. We
established a surveillance scenario with two teams of robots: the red robots
must patrol and detect the blue robots in an office-like environment (see
Fig. 2.2). The objective of the red robots is to work in a coordinated way in
order to catch the blue robots (cooperative), while the goal of the blue
robots is to avoid being caught by any of the red robots (competitive).
Figure 2.2: Multi-robot system

In another work [130], we proposed two alignment strategies for the
self-reconfiguration of modular mobile robots by means of cooperative
behaviors. The strategies are based on a modular robot system [5; 168] using
mobile reconfigurations and were simulated to accomplish the task. The
cooperative behaviors allow robots to modify their behavior to cope with
environmental changes or actions performed by other robots, in order to
achieve a common goal. According to the experimental results obtained in both
works, the coordination of multi-robot systems in dynamic environments
requires a well-structured control architecture, and achieving collaborative
behavior between members of a system requires a combination of behaviors
associated with each robot. The results demonstrate that implementing the
cooperative behaviors on both robots is the fastest way to achieve
self-alignment for the docking process.
2.2 Fields of Application
Currently, many fields of application require a group of robots able to
exhibit versatile behavior and to adapt flexibly to a great variety of
situations. For this reason, research on MRS has increased and the field is
widely studied. Traditionally, robotics applications [124] were focused mainly
on the industrial sector (e.g. welding, assembly, processing, workpiece
handling and material cutting by robot), where the main objective was massive
automation to increase productivity, flexibility and quality and, above all,
to improve safety by reducing the risk to people in dangerous tasks.
In the past two decades, the application fields of robotics have been extended
to other sectors [17]. Some examples are: construction robots [6; 71] (e.g.
buildings, tunnels, roads, bridges, walls); domestic service robots [29; 137]
(e.g. vacuum cleaners, lawn mowing, window cleaning, pool bottoms, tanks,
tubes and pipes); defense, rescue and safety robots [94; 112; 138] (e.g.
rescuing victims, mine deactivation, fire fighting and explosives,
surveillance and security systems); assistive robots [70; 97] (e.g.
wheelchairs for disabled people, operational rehabilitation robots, wearable
rehabilitation robots and other welfare functions); and robots in medicine
[118; 134] (e.g. diagnostic methods, surgical and interventional robotics,
robot-assisted recovery and rehabilitation, behavioral therapy, personalized
care for special-needs populations).
At present, applications of multi-robot systems span a broad spectrum of
areas, including human-unreachable environments, such as space, underwater,
and rescue; challenging domains, such as construction and teams of unmanned
aerial vehicles; and adversarial domains, such as robot soccer. Various
specific tasks are addressed, e.g., foraging and coverage of a given area,
multi-target observation, object pushing and transportation, exploration and
flocking [158]. Several areas of research are currently being explored in the
field of MRS, focusing mainly on issues of coordination, cooperation,
communication, localization, resource conflicts and architectures, among
others. These applications require more than one robot to complete a specific
task, and the robots need to be controlled simultaneously to ensure
synchronicity between them.
MRS have numerous applications and can involve different fields of robotics,
for example industrial, military and service robotics, or the research and
study of biological systems. They can greatly affect different types of
missions, for example exploration, box pushing, military operations,
navigation in an unstructured environment, traffic control, entertainment and
simulations of biological systems (see Fig. ??). Some industrial applications,
for example, involve moving large objects for which a single robot can hardly
be powerful enough to push the object alone and cannot apply forces in all
generalized directions. Therefore, a multi-robot solution can be useful to
share the needed power among multiple robots.
2.2.1 Cooperative Manipulation
Some tasks require transporting objects (see Fig. 2.3). Getting a team of
robots to cooperate to carry a large object in an environment containing
static and dynamic obstacles is not an easy task. Different works on MRS have
been presented in the literature to achieve this type of mission, generally
called the Box-Pushing Mission. For example, [107] presents experimental
results of box pushing using two legged robots; the works [59; 160; 166]
present different methods for the problem of transporting objects by multiple
mobile robots; and [152] presents an approach to carry a deformable object by
means of two mobile robots with on-board manipulators. Some have addressed the
aerial transport of objects using cables [57; 109], and [77] proposes a
solution to the box-pushing problem with multiple autonomous robotic fish in
an underwater environment. Finally, others have taken inspiration from ant
societies [9; 89].
Figure 2.3: Box-Pushing Mission [59; 107; 160; 166] and group of mobile robots designed to work cooperatively lifting columns (http://birg.epfl.ch/page28710.html)
2.2.2 Unstructured Environments
The exploration of unknown environments (see Fig. 2.4) with a team of mobile
robots is another kind of application that has been extensively studied in the
literature in many forms. To achieve this mission cooperatively, all the
robots must be coordinated to explore different parts of the environment, with
the goal of covering the whole environment in less time than a single robot.
Several authors have proposed multi-robot exploration strategies based on
market principles, in which robots place bids on subtasks of the exploration
effort and no central agent is required. In [142], a distributed bidding
algorithm for multiple robots in exploration tasks is proposed that addresses
the problem caused by limited communication range. The work in [18] presents
an approach to exploring an unstructured environment that has been implemented
on real robots for different environments. Other market-based approaches for
the coordination of multiple robots were proposed in [141; 172].
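The bidding idea can be illustrated with a minimal sketch in which each
exploration subtask is announced in turn, every robot bids its travel distance
and the lowest bid wins. This is a generic single-item auction written under
assumptions made here (Euclidean distance as the bid, the winner relocating to
the target, a plain loop standing in for the message exchange between robots);
it is not the algorithm of [142] or of any other cited work.

```python
import math


def auction_tasks(robots, tasks):
    """Assign each task to the robot with the lowest bid.

    robots: dict mapping a robot name to its (x, y) position.
    tasks:  dict mapping a task name to its (x, y) target.
    Returns a dict mapping each task name to the winning robot name.
    """
    positions = dict(robots)  # working copy, updated as robots commit
    assignment = {}
    for task, target in tasks.items():
        # Every robot computes its own bid from purely local information.
        bids = {name: math.dist(pos, target)
                for name, pos in positions.items()}
        winner = min(bids, key=bids.get)
        assignment[task] = winner
        # Assumption: the winner is treated as located at the target when
        # bidding for the following tasks.
        positions[winner] = target
    return assignment


result = auction_tasks(
    robots={"r1": (0, 0), "r2": (10, 0)},
    tasks={"frontier_a": (1, 0), "frontier_b": (9, 0)},
)
```

In a real team the bids would be exchanged over the communication network, so
robots out of range could not take part in a given auction; handling that case
is precisely the difficulty that limited communication range introduces.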
Figure 2.4: Exploration in unstructured environments. (a) The Mars exploration rovers, Spirit and Opportunity, with a manipulator arm in front, (b) a conceptual drawing for robotic rescue of Hubble space telescope, (c) The Pathfinder rover, Sojourner and (d) Rocky 4.
2.2.3 Formation Control
Research on formation control involves a collection of decision-making agents
with limited processing capabilities, locally sensed information and limited
inter-agent communication, all seeking to achieve a collective objective (see
Fig. 2.5). In recent years, there has been growing interest in distributed
control due to its many advantages, such as energy saving, scalability and
robustness [92; 95]. Formation control is one of the most studied problems in
MRS, and many researchers have started working on consensus-based formation
control [20; 36; 54; 127; 135; 164]. In the leader-follower approach, each
robot is assigned a leader with respect to which it must maintain certain
constraints [27; 52; 67; 147; 148; 162].
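A minimal sketch of the leader-follower idea is given below, assuming that the
constraint is a fixed planar offset from the leader and that the follower
applies a simple proportional correction at each step. The function name, the
`offset` and `gain` parameters and the controller itself are assumptions made
for illustration, not the method of any of the cited works.

```python
def follower_step(follower, leader, offset, gain=0.5):
    """Move the follower a fraction `gain` of the way toward the point
    leader + offset, i.e. toward its desired slot in the formation."""
    target = (leader[0] + offset[0], leader[1] + offset[1])
    return (follower[0] + gain * (target[0] - follower[0]),
            follower[1] + gain * (target[1] - follower[1]))


# The follower should end up one unit behind a stationary leader.
leader = (4.0, 0.0)
position = (0.0, 0.0)
for _ in range(50):
    position = follower_step(position, leader, offset=(-1.0, 0.0))
```

With a gain between 0 and 1 the distance to the desired slot shrinks by a
constant factor at every step, so the follower settles at the leader position
plus the offset. Chaining such pairs, with each follower acting as the leader
of the next robot, gives one simple way to build larger formations.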
Figure 2.5: Formation Control. (a) Flying in Formation Takes Aircraft Farther, Dylan Ashe (http://www.popsci.com/). (b) Image of Vicon cameras overlooking a group of Khepera III robots; 3 cameras shown, 8 cameras total [98]
2.2.4 Biologically-Inspired
The range of applications of multi-robot systems has increased in recent
years. Several investigations have focused on biological inspiration, as
biological systems provide fascinating examples of functional collective
behavior [119; 136] in conditions characterized by rapid changes, high
uncertainty, indefinite richness and limited availability of information.
These examples have been useful for studying such behavior and applying the
findings to the design of multi-robot systems. The first works inspired by the
behavior of social insects and other animals (e.g., ants, bees, birds and
fish) in relation to the study of group behavior were presented in
[91; 106; 120]. Most bio-inspired robots are designed for specific tasks and
for different environments (see Fig. 2.6), in order to cope with uncertain
situations and react quickly to unforeseen changes in the environment. Pfeifer
et al. [125] have presented a study on self-organization, embodiment and
biologically inspired robotics.
Figure 2.6: Bio-inspired robotics
2.3 Previous and Related Work
Several researchers have addressed the problem of coordination in MRS.
Currently, several studies focus mainly on the coordination of a set of robots
using different techniques in order to solve a specific problem. In the
following subsections, we review some research trends related to the
coordination of multi-agent systems, swarm robots and multi-robot systems. In
particular, we focus on previous and related work on coordination in MRS,
reviewing some of the approaches to coordination that employ formal methods.
2.3.1 Formal Methods in Relation to Coordination
In the last decade, there has been increasing interest in systems comprised of
several autonomous mobile robots and, as a result, a substantial amount of
development in this field; several researchers have studied the use of formal
methods for the coordination and control of MRS. These works focus mainly on
the coordination of a set of robots using different techniques in order to
solve a specific problem. With regard to the optimal task assignment problem,
a brief review of some research trends related to the coordination of
multi-agent systems, swarm robots and multi-robot systems is presented here.
The discussion focuses on the recent literature in the area of coordination
with multiple robots.
2.3.1.1 Multi-Agent Systems
Research in multi-agent systems on self-organization and emergence focuses on
naturally inspired approaches [62; 82] and socially-based approaches [72];
several mechanisms leading to self-organization have been studied and tested
[10].
Price and Tino suggest a number of strategies to address task allocation
problems in multi-agent systems, based on the principle of self-organization
of social insects through the mathematical model developed by Bonabeau. They
compare decentralized algorithms (FIFO and Greedy) to measure and evaluate the
effectiveness of each strategy in processing the mail while minimizing the
number of changes [126]. The problem considered for these adaptation
algorithms is a variation of the mail retrieval problem proposed by Bonabeau.
Shang and Wang [140] have addressed a similar problem of congestion of public
resources in multi-agent systems: the famous “El Farol” bar problem, in which
a population of N agents has to self-coordinate its attendance at a place with
limited capacity C, much lower than N. This strategy provides a simple
mechanism for a large collection of decentralized decision makers to solve a
complex congestion problem.
Agassounon and Martinoli [2] have proposed a system for collecting objects
based on a completely deterministic variant of the response threshold model:
as soon as the stimulus exceeds a given threshold, execution of the task
begins. In this case, the time needed to find an object is used as the
stimulus to decide whether a robot should perform the task or rest.
2.3.1.2 Swarm Robots
In general, researchers in swarm robotics are inspired by the decentralized self-
organizing biological systems and collective behavior of social insects in particular
[68]. Swarm robotics is a novel approach to robotics which tries to circumvent
problems with classical, monolithic robots like inflexibility and individual com-
plexity by applying the principles of swarm intelligence to the field of robotics
[44]. Typically these systems are composed of robots that, at the individual level,
have relatively limited capacity to solve the task and limited knowledge about
their environment. The general paradigm is often referred to as swarm intelli-
gence [16; 47; 61].
Baglietto et al. have presented a coordination approach for swarm robots that
covers both navigation and task allocation, based on Radio Frequency
Identification (RFID). RFID devices are distributed a priori in the
environment to build a navigation chart; each RFID device contains navigation
instructions that allow the robots to follow routes from one place to another.
Robots cannot communicate with each other directly, but may do so indirectly
by writing to and reading from RFID devices. To perform the distributed task
allocation, the algorithm defines an auction: a central server takes the work
to be undertaken by a team of robots, analyzes it and decides the number of
robots required, and the robots are then informed about the new tasks. The
allocation is the result of negotiations that each robot carries out on its
own. The robots likewise use the RFID devices to communicate, leaving messages
for one another, for example records of assignments and of zone exits. The
system has been implemented in Player/Stage and the navigation algorithm has
been tested in MATLAB [7].
Yang et al. [167] have proposed a foraging mission for swarm robots, using
response threshold mechanisms with a nondeterministic selection of the task
to be performed. The experiments have been implemented in TeamBots.
2.3.1.3 Multi-Robot Systems
One of the most popular approaches based on auction market mechanisms for
the coordination of multi-robot systems was introduced by Dias and Stentz [33]
in 2000. They consider that in auction-based multi-robot systems, the robots
are designed as self-interested agents operating in a virtual economy. The
tasks are assigned to the robots through auction market mechanisms: for each
task it completes, a robot earns revenue in the form of virtual money for
providing a service to the team. However, when executing a task, the robot
consumes resources such as fuel or network bandwidth and therefore incurs
costs to pay for the resources used to complete the task. In 2004, Dias [34]
developed a coordination mechanism called Traderbots, which is designed to
inherit the effectiveness and flexibility of a market economy. This approach
improved the cost estimates in order to increase the efficiency of the team;
then, in 2006 [86], the mechanism was applied to robot teams searching for
treasure in an unknown environment.
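
The virtual economy described above can be illustrated with a minimal sketch.
It is not the protocol of [33; 34]; it only assumes, for illustration, a single
task with a fixed revenue and one cost estimate per robot, and awards the task
to the robot with the highest profit (revenue minus cost). The names
`run_auction`, `revenue` and `costs` are hypothetical.

```python
from typing import Dict, Tuple


def run_auction(revenue: float, costs: Dict[str, float]) -> Tuple[str, float]:
    """Award one task to the robot whose profit (revenue - cost) is highest.

    Assumption: the revenue is the same for every robot, so the highest
    profit corresponds to the lowest estimated cost.
    """
    winner = min(costs, key=costs.get)   # robot with the lowest cost estimate
    profit = revenue - costs[winner]     # virtual money earned by the winner
    return winner, profit


# Hypothetical cost estimates (e.g. fuel, bandwidth) submitted as bids.
bids = {"r1": 4.0, "r2": 2.5, "r3": 6.0}
winner, profit = run_auction(10.0, bids)
```

With these hypothetical bids the task goes to `r2`, whose profit is 7.5 units
of virtual money.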
Shiroma and Campos have proposed a framework called CoMutaR (Coalition
formation based on Multi-tasking robots) for the coordination and distribution
of tasks among a set of heterogeneous mobile robots, allowing the robots to
perform multiple tasks at the same time. It is based on the Contract Net
Protocol and forms concurrent coalitions through a single-round auction
process. They considered two specific experiments: (1) two robots cooperating
to push a box and (2) a set of three tasks performed by two robots [143].
Gerkey and Mataric have proposed an auction method for multi-robot coor-
dination in their MURDOCH system [64]. A variant of the Contract Net Pro-
tocol, MURDOCH produces a distributed approximation to a global optimum
of resource usage. The work basically shows the effectiveness of distributed ne-
gotiation mechanisms such as MURDOCH for coordinating physical multi-robot
systems. In most of the previous work, the communication between robots is as-
sumed to be perfect, which makes their algorithms unable to handle unexpected,
occasional communication link breakdowns.
Song et al. have proposed a Distributed Bidirectional Auction algorithm for
the coordination of multi-robot systems. A task is divided into n sub-tasks,
and a robot can only run one sub-task. The allocation of sub-tasks is decided
by both the auctioneer and the bidders: the auctioneer chooses the pre-winners
by ordering the offered prices, while each bidder chooses, among all the
sub-tasks it has pre-won, the one with the lowest price. After the first
round, the sub-tasks that were not chosen by any bidder enter a second auction
round based on the initial auction price; this process is repeated until all
sub-tasks have been completed [146].
In [93], Lim et al. have presented an architecture based on the auction market
for the cooperation of a team of robots. On this platform, each team of robots
is controlled by its own MRS Client program and communicates through a ZigBee
Wireless Personal Area Network (WPAN). Each WPAN is assigned a different
identity (ID) so that the security of the communicated data is preserved. A
client program that acts as a buyer is used to submit the users' tasks to the
market. Then, a task coordination server program is used to match the buyers'
demand with the supply from the sellers. These programs are based on a
client/server architecture and are connected through a Local Area Network
(LAN) using the Transmission Control Protocol (TCP) and the Internet Protocol
(IP).
Part II
Setting the Problem
Chapter 3
Problem Description
Most of the fundamental ideas of science
are essentially simple, and may, as a rule,
be expressed in a language comprehensible
to everyone.
Albert Einstein
SUMMARY: This chapter defines the problem statement proposed in
this thesis. Section 3.1 establishes the idea of research, describing the
issues related to the problem of coordinating a team of robots, mainly, the
task assignment between them. Section 3.2 presents a formal description
of the problem. Section 3.3 shows the experimental scenario established to
carry out the experiments with different decentralized approaches. Finally,
section 3.4 details the description of the proposed solution to the previously
defined problems.
3.1 Problem Statement
Research topics on MRS have been studied by many researchers in the scientific
community due to the complexity of these systems. The problem of coordinating
a team of robots involves a series of challenges that go beyond the
manipulation, modeling and navigation of a single robot: the team has to
accomplish a large task that can be divided into smaller parallel subtasks,
with a group of robots working on each individual subtask. For example, Zlot
et al. have demonstrated the ability to handle task decomposition and loosely
coordinated tasks using market-based techniques [173; 174].
Task assignment implies determining the order in which sub-tasks should be
completed, which groups must carry out each sub-task, and which robots should
belong to which groups. Once the task assignment is completed, the robots
should join their new groups. In addition, groups should be able to
communicate with other groups to ensure that the overall task is completed.
In MRS, optimal task/job allocation or assignment is an active research
problem, for which several central or global allocation methods have been
proposed [79]. Probabilistic approaches have been used to address major
challenges in mobile robotics, yielding new and innovative solutions to
important problems such as navigation, localization, tracking and robot
control. Such approaches can also be applied to the problem of coordinating
multiple robots in the self-election of heterogeneous specialized tasks.
3.2 Formal description of the problem
The optimal multi-task selection problem in multi-robot systems can be formally
defined as follows:
• “Let $L = \{l_1(t), l_2(t), \ldots, l_J(t)\}$ be the different specialized
tasks. Each $l_j \in L$ has a number of $j$ jobs or pending loads where
$J = \{j_1, j_2, \ldots, j_K\}$. Let $R = \{r_1, r_2, \ldots, r_N\}$ be the
set of N heterogeneous mobile robots. We made several assumptions concerning
the problem description mentioned above; we have supposed that all members
$R = \{r_1, r_2, \ldots, r_N\}$ are able to participate in any jobs or pending
loads $l_j$”.
The goal is to perform the multi-task selection so as to obtain an optimal
distribution of a team of N heterogeneous robots, with K different robot roles
or jobs, among the K different types of heterogeneous specialized tasks;
equivalently, the robots themselves, autonomously and individually, select a
particular task such that all the existing tasks
$L = \{l_1(t), l_2(t), \ldots, l_K(t)\}$ are optimally executed in the
shortest time.
3.3 Application Scenario
We have established the following experimental scenario (Fig. 3.1) in order to
analyze a particular strategy or solution for the coordination of multi-robot
systems as regards the optimal distribution of the existing tasks. Given a set
of N heterogeneous mobile robots in a region, the aim is to achieve an optimal
distribution of the robots among the different types of tasks. The set of N
robots will form sub-teams for each type of task $l_j$. The sub-teams are
dynamic over time, i.e. the same robots will not always be part of the same
sub-team, and the members of each sub-team can vary depending on the
situation.
Most of the proposed solutions in the technical literature are of a centralized
nature, in the sense that an external controller is in charge of distributing
the tasks among the robots by means of conventional optimization methods and
based on global information about the system state [65]. However, we are mainly
interested in truly decentralized solutions in which the robots themselves,
autonomously and in an individual and local manner, select a particular task
so that all the tasks are optimally distributed and executed. In this regard,
we have experimented with different techniques: first, response threshold
models inspired by the division of labor in social insects; second,
reinforcement learning algorithms based on learning automata theory; and
finally, ant colony optimization-based deterministic algorithms.
3.4 Description of the Proposed Solution
Research in multi-robot systems has increased considerably, to the point that
systems with hundreds of robots have been proposed [75; 88]. To accomplish a
given task, the robots must share information; thus, increasing the size of
the team requires an increase in resources (for example, time, sensory effort
and communication bandwidth). In this sense, communication features such as
the network topology, the communication bandwidth, the message coordination
strategies and the information traffic between the robots remain open issues
for mobile robot applications.
Figure 3.1: Experimental scenario (panels: distribution of robots; task
distribution and allocation; robots perform tasks; change task, where the
robots change to another task when their task is completed; task types L1 to
L4)
This work is mainly based on the study of the coordination of multi-robot
systems and, in particular, on the problem of distributing heterogeneous
multi-tasks. To address this problem, and in accordance with the general and
specific objectives of this research, we propose experimenting with different
techniques based chiefly on biologically inspired self-organization and
emergence. Under this approach we can speak of multi-task selection instead of
multi-task allocation, meaning that the agents or robots select the tasks
themselves instead of being assigned a task by a central controller.
The key element in these algorithms is the estimation of the stimuli and the
adaptive update of the thresholds. Each robot performs this estimation
locally, depending on the load or the number of pending tasks to be performed.
In addition, it is of particular interest to evaluate each approach by
comparing the results obtained when noise is introduced into the number of
pending loads, in order to simulate the robot's error in estimating the real
number of pending tasks.
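
The noisy local estimate mentioned above could be simulated as in the
following minimal sketch. The noise model is an assumption: the text does not
specify a distribution, so zero-mean Gaussian noise is used here purely for
illustration, and the names `noisy_load_estimate` and `noise_std` are
hypothetical.

```python
import random


def noisy_load_estimate(true_load: int, noise_std: float,
                        rng: random.Random) -> float:
    """Return a robot's local estimate of the pending load of one task.

    Assumption: the estimation error is zero-mean Gaussian with standard
    deviation `noise_std`; the estimate is clipped at zero because a
    negative number of pending loads is meaningless.
    """
    estimate = true_load + rng.gauss(0.0, noise_std)
    return max(0.0, estimate)


rng = random.Random(42)                    # fixed seed for repeatable runs
exact = noisy_load_estimate(10, 0.0, rng)  # no noise: estimate equals 10.0
perturbed = noisy_load_estimate(10, 2.0, rng)
```

Setting `noise_std` to zero recovers the noise-free case, which makes it easy
to compare an approach with and without estimation error.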
Fig. 3.2 shows the flow chart of the approaches used to carry out the
multi-task selection among a group of robots.
[Flow chart nodes: Start; Define the number of robots and the types of tasks;
Generate the total number of loads for each type of task; Get the
probabilities of each robot for each task; Select the task; Perform the task;
decision "Did the robot complete the task?" (Yes/No); decision "Are there any
more tasks?" (Yes/No); End]
Figure 3.2: Procedure for the selection of multi-tasks
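
The procedure of Fig. 3.2 can be sketched as a simple simulation loop. This is
a minimal illustration under stated assumptions, not the implementation used
in the thesis: the selection probability is taken to be proportional to the
pending load of each task type (a placeholder rule), and performing a task is
modeled as completing exactly one load unit. The name `run_selection` is
hypothetical.

```python
import random
from typing import List, Tuple


def run_selection(n_robots: int, loads: List[int],
                  seed: int = 0) -> Tuple[int, List[int]]:
    """Simulate the loop of Fig. 3.2; return (rounds, remaining loads)."""
    rng = random.Random(seed)
    pending = list(loads)          # total number of loads per task type
    rounds = 0
    while sum(pending) > 0:        # "Are there any more tasks?"
        for _ in range(n_robots):
            # Only task types that still have pending loads are candidates.
            candidates = [j for j, p in enumerate(pending) if p > 0]
            if not candidates:
                break
            # Assumed placeholder rule: probability proportional to the
            # pending load of each candidate task type.
            weights = [pending[j] for j in candidates]
            task = rng.choices(candidates, weights=weights)[0]  # select
            pending[task] -= 1     # perform: one load unit is completed
        rounds += 1
    return rounds, pending


rounds, remaining = run_selection(n_robots=3, loads=[4, 3, 3])
```

Because every selection completes one load unit in this simplified model,
three robots clear ten pending loads in four rounds regardless of the random
choices; a probability rule from Chapter 4 could replace the placeholder.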
Part III
Foundations
Chapter 4
Theoretical Fundamentals
When you can measure what you are
speaking about, and express it in numbers,
you know something about it; but when you
cannot measure it, when you cannot
express it in numbers, your knowledge is of
a meagre and unsatisfactory kind.
Lord Kelvin
SUMMARY: This chapter describes the mathematical and probabilistic models
that have been used to address the problem stated in this thesis, based on
distributed or decentralized approaches inspired by the division of labor
in social insects. Section 4.1 presents a brief introduction to mathematical
models. Section 4.2 gives an overview of the response threshold model and,
specifically, a description of the mathematical model of response
thresholds. Section 4.3 gives a brief introduction to learning automata
methods, basic definitions from the theory of stochastic processes, a basic
definition of learning automata and stochastic reinforcement algorithms
based on reward and penalty. Finally, section 4.4 gives a brief introduction
to ant colony optimization, its biological inspiration and a description of
the ant system algorithm.
4.1 Introduction
The theory of self-organization was originally introduced in the context of physics
and chemistry in order to describe the emergence of macroscopic patterns out
of processes and interactions defined at the microscopic level [14]. This theory
can explain the behavioral aspects of social insects, in particular, it shows how
the complexity of collective behavior of these insects may arise from the inter-
action among individuals who exhibit a simple behavior. Bonabeau and other
researchers say the discovery of the theory of self-organization not only has im-
plications for the study of social insects, but also has been a great tool to transfer
knowledge to the area of distributed artificial systems.
The biological influence on research in collective robotics grows every day,
because collective behaviors provide evidence that systems composed of simple
agents can perform complex tasks in the real world. It is known that the
cognitive abilities of these insects are very limited and that complex
behaviors emerge from interactions among a multitude of insects obeying simple
rules, which are mainly based on the existence of social units and individuals
that interact to produce a global, emergent collective behavior.
In insect societies, many factors contribute to an individual's decision to
perform a task, including genotypic, environmental, temporal, morphological,
physiological, and social factors. However, certain signals can be dominant in
stimulating the performance of a particular task. Within a group of insects,
an individual performs a task if it observes sufficient signals indicating
demand for the task to be performed. These signals might be environmental or
take the form of messages from fellow members of the society. Such signals can
be categorised according to the task to perform, hence the name
task-associated stimulus, or hereafter simply stimulus.
In many problems that involve modeling the behavior of some system, we lack
sufficiently detailed information to determine how the system behaves, or the
behavior of the system is so complicated that an exact description of it becomes
irrelevant or impossible. In that case, probabilistic and deterministic models are
often useful.
Probabilistic algorithms are those algorithms that model a problem or search
a problem space using a probabilistic model of candidate solutions. Many
metaheuristics and computational intelligence algorithms may be considered
probabilistic, although what distinguishes probabilistic algorithms is the
explicit (rather than implicit) use of the tools of probability in problem
solving.
In deterministic models, good decisions bring about good outcomes. Given a
particular input, a deterministic algorithm will always produce the same
output, and the underlying machine will always pass through the same sequence
of states; therefore, the outcome is deterministic. One simple model for
deterministic algorithms is the mathematical function, since a function always
produces the same output given a certain input. The difference is that
algorithms describe precisely how the output is obtained from the input,
whereas abstract functions may be defined implicitly.
In this sense, a learning automaton is a model for making adaptive decisions
using only stochastic information from the environment, without relying on
detailed models or estimates of the parameters. It learns to choose the
optimal action from a specific, finite set of actions called its action set,
based only on noisy feedback from its environment. At each time instant, the
automaton randomly chooses an action from its action set according to its
current action probability distribution. Using the feedback from the
environment, the automaton updates the action probability distribution and
uses the updated distribution to select the next action.
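
The choose/observe/update cycle just described can be sketched as follows.
The paragraph does not fix an update rule, so this sketch assumes the
well-known linear reward-inaction scheme: on a reward, the probability of the
chosen action is increased and the others are scaled down; on a penalty,
nothing changes. The class name and the `learning_rate` parameter are
hypothetical.

```python
import random
from typing import List


class LearningAutomaton:
    """Minimal learning automaton with a linear reward-inaction update."""

    def __init__(self, n_actions: int, learning_rate: float = 0.1,
                 seed: int = 0) -> None:
        # Initially, equal probabilities are attached to all actions.
        self.p: List[float] = [1.0 / n_actions] * n_actions
        self.a = learning_rate
        self.rng = random.Random(seed)

    def choose(self) -> int:
        """Sample an action from the current probability distribution."""
        return self.rng.choices(range(len(self.p)), weights=self.p)[0]

    def update(self, action: int, rewarded: bool) -> None:
        """Reinforce the chosen action on reward; do nothing on penalty."""
        if not rewarded:
            return
        for j in range(len(self.p)):
            if j == action:
                self.p[j] += self.a * (1.0 - self.p[j])
            else:
                self.p[j] *= 1.0 - self.a


automaton = LearningAutomaton(n_actions=2, learning_rate=0.5)
automaton.update(action=0, rewarded=True)    # p becomes [0.75, 0.25]
automaton.update(action=1, rewarded=False)   # penalty: p is unchanged
```

The update keeps the probabilities summing to one, since the mass added to the
rewarded action equals the mass removed from the others.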
Therefore, for both the learning automata-based probabilistic algorithms and
the ant colony optimization-based deterministic algorithms, the robots make
their decision concerning the task election based on the response signals
emitted by the environment: a reward signal whenever the state of affairs is
correct, that is, there is no pending task to be executed and all the robots
are busy, and a penalty signal whenever there are idle robots or pending
tasks. In contrast, in the response threshold model the decision of each robot
is based on its own estimation of the current state of affairs. In the
multi-agent technical literature, this kind of decision, based on the agents'
use of a model of the state of the world, is known as inductive learning,
following the terminology introduced by Brian Arthur in the El Farol Bar
problem [140].
4.2 Threshold Models
4.2.1 An Overview of Response Threshold Model
Threshold models are based on an understanding of the decentralized mechanisms
that underlie the organization of natural swarms such as ants, bees, birds and
fish. Social insects provide one of the best-known examples of biological
self-organized behavior. By means of local and limited communication, they are
able to accomplish impressive behavioral feats, such as maintaining the health
of the colony and caring for their young.
A social insect colony operates without any central control: no one is in
charge, and no colony member directs the behavior of another. Working in this
decentralized way, the colony exhibits flexibility and robustness, two
features that are desirable in an artificial system [89]. Social insect
colonies are formed by highly cooperative groups that are expert at
manipulating and exploiting their environment and at defending resources and
brood, and that allow for task specialization among group members.
The response threshold model assumes that individuals have inherent
thresholds for responding to stimuli associated with specific tasks and that,
in a group, the individuals with the lowest threshold for a task will perform
this task more often. Division of labor emerges from the differences between
individuals in their
thresholds. Different versions of the response threshold model have looked at
the effect of threshold reinforcement [108; 156], colony size [63; 81], number of
tasks [69] and genetic diversity [51]. These studies assume that task stimuli are
well-mixed in the environment; the cues used by individuals to choose tasks are
therefore global.
Insect societies are characterized by the division of labor, communication
between individuals and the ability to solve complex problems [13]. These
characteristics have long been a source of inspiration and the subject of
numerous studies, acquiring great relevance for many researchers both in
robotics and in biology. On the one hand, biologists try to test their
theories of social insects on robots; on the other hand, researchers in
robotics seek solutions to problems that cannot be solved by a single robot.
Seeley et al. [139] have considered the following experiment to study
collective behavior in a colony of insects, focusing on the work performed by
bees to get honey. Two food sources are presented to the colony at 8:00 A.M.
at the same distance from the hive: source A is characterized by a sugar
concentration of 1.0 mol/l and source B by a concentration of 2.5 mol/l.
Between 8:00 A.M. and noon, source A is visited 12 times and source B 91
times. At noon, the sources are modified: source A is now characterized by a
sugar concentration of 2.5 mol/l and source B by 0.75 mol/l. Between noon and
4:00 P.M., source A is visited 121 times and source B only 10 times. These
results show that a bee has a relatively high probability of going to a good
food source and of abandoning a poor food source.
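
As a quick check on the figures reported above, the share of visits received
by the richer source in each period can be computed directly from the visit
counts (the symbols $f_B$ and $f_A$ are introduced here only for
illustration):

```latex
f_B^{\text{morning}} = \frac{91}{12 + 91} = \frac{91}{103} \approx 0.88,
\qquad
f_A^{\text{afternoon}} = \frac{121}{121 + 10} = \frac{121}{131} \approx 0.92
```

In both periods roughly nine out of ten visits go to the source with the
higher sugar concentration, and the preference reverses when the richer source
changes from B to A.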
4.2.2 Model
These observations suggest that simple behavioral rules allow the bees to
select the best-quality source. On this basis, Eric Bonabeau 1 et al. have
proposed a simple mathematical model of response thresholds for the regulation
of the division of labor in insect societies [15]. In this model it is assumed
that each task is associated with a stimulus or set of stimuli, and that
individuals can detect the intensity of each of the different stimuli (see
Fig. 4.1); they can therefore assess the demand for a particular task when
they are in contact with the associated stimulus.
Figure 4.1: Threshold function
1http://www.icosystem.com/about-us/management-team/bonabeau/
Let s be the intensity of a stimulus associated with a particular task; s can
be a number of encounters, a chemical concentration, or any quantitative cue
sensed by individuals. A response threshold θ, expressed in units of stimulus
intensity, is an internal variable that determines the tendency of an
individual to respond to the stimulus s and perform the associated task. More
precisely, θ is such that the probability of response is low for s < θ and
high for s > θ. A mathematical model that satisfies this requirement is given
by:
$$T_{\theta_{ij}}(s_j) = \frac{s_j^n}{s_j^n + \theta_{ij}^n}, \quad n > 1 \tag{4.1}$$
where $T_{\theta_{ij}}(s_j)$ is the probability that the robot $r_i$ responds
to the stimulus $s_j$ and executes the task $l_j$, and $n > 1$ determines the
steepness of the threshold. Following the recommendations of other authors
[2; 15], the value of $n$ is set to 2 in all experiments.
Fig. 4.2 shows the values of Equation 4.1 for different values of the
threshold θ. It can be seen more clearly that, for s well below θ, the
probability of engaging in task performance is close to 0, and for s well
above θ, this probability is close to 1; at s = θ it is exactly 1/2. Thus, the
probability that an individual will perform a task depends on s.
Figure 4.2: Semi-logarithmic plot with different thresholds (θ = 1, 5, 20, 50) and with n = 2.
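
Equation 4.1 can be evaluated directly. The following is a minimal sketch,
assuming n = 2 by default as in the experiments; the function name
`response_probability` is hypothetical and is not taken from the thesis
implementation.

```python
def response_probability(s: float, theta: float, n: float = 2.0) -> float:
    """Evaluate Eq. 4.1: T = s^n / (s^n + theta^n).

    s is the stimulus intensity (s >= 0) and theta the response threshold
    (theta > 0); n > 1 controls the steepness of the threshold.
    """
    if s < 0 or theta <= 0:
        raise ValueError("expected s >= 0 and theta > 0")
    s_n = s ** n
    return s_n / (s_n + theta ** n)


# Probabilities for the thresholds plotted in Fig. 4.2, at stimulus s = 5.
thresholds = [1.0, 5.0, 20.0, 50.0]
probabilities = [response_probability(5.0, theta) for theta in thresholds]
```

For a fixed stimulus, the probability decreases as the threshold grows, and it
equals one half for the threshold that matches the stimulus (θ = 5 here).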
The underlying idea is very simple: when the level of the stimulus associated
with a task exceeds the response threshold of an individual, that individual
is likely to respond to the stimulus and engage in the task. The intensity of
a stimulus decreases as individuals perform the task; therefore, individuals
with high thresholds are unlikely to perform the task when other individuals,
with lower thresholds, keep the stimulus intensity below their thresholds.
However, when individuals with low thresholds do not perform the task,
individuals with high thresholds may engage in the task because the stimulus
intensity eventually exceeds their thresholds. Algorithm 1 describes the
implementation of this approach.
Algorithm 1: Algorithm of response threshold for the robot ri
Input: L = list of unselected tasks
for all lj ∈ L do
    if sj > θij then
        return lj (begin running the task lj)
    end if
end for
return null
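
A direct transcription of Algorithm 1 might look like the sketch below. It
assumes that the stimuli and the thresholds of one robot are stored in lists
indexed by task, and that tasks are identified by their index; the function
name `select_task` and this data layout are hypothetical.

```python
from typing import Optional, Sequence


def select_task(stimuli: Sequence[float], thresholds: Sequence[float],
                unselected: Sequence[int]) -> Optional[int]:
    """Return the first unselected task whose stimulus exceeds the threshold.

    stimuli[j] is the stimulus s_j of task j and thresholds[j] is the
    threshold of this robot for task j. None means no task is selected.
    """
    for j in unselected:
        if stimuli[j] > thresholds[j]:
            return j          # the robot begins running task j
    return None


chosen = select_task(stimuli=[1.0, 8.0, 3.0],
                     thresholds=[5.0, 5.0, 2.0],
                     unselected=[0, 1, 2])
```

Here the robot skips task 0 (stimulus below threshold) and selects task 1,
the first task in the list whose stimulus exceeds its threshold.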
The tasks can be constant or time-dependent. The stimuli associated with each
task can vary considerably from one task to another, depending on the nature
of the tasks, the task demand and the number of robots executing the task.
Each task is associated with a demand expressed in the form of a stimulus;
when a robot performs a task, it tends to reduce the intensity of the
associated stimulus and, as a result, modifies the intensity of the stimuli
for the tasks that are not being run.
Each robot $r_i$ has a set of response thresholds
$\theta_i = \{\theta_{i,1}, \theta_{i,2}, \ldots, \theta_{i,J}\}$. Each
threshold $\theta_{i,j}$ corresponds to a task type
$l_j \in \{l_1, l_2, \ldots, l_J\}$ that the robot is capable of performing.
The initial values of the thresholds are randomized to ensure that the robots'
roles are not predetermined; when a robot engages in performing a task $l_j$,
the threshold associated with that task is decremented by a small amount, as
follows:
$$\theta_{i,j}^{new} = \theta_{i,j}^{old} - \sigma \tag{4.2}$$
Conversely, the thresholds of the other tasks, which the robot is not running,
are incremented by the same small amount. For every task $l_k$ with
$k \neq j$:
$$\theta_{i,k}^{new} = \theta_{i,k}^{old} + \sigma \tag{4.3}$$
where $\sigma > 0$ is an increment or decrement factor that allows the
thresholds to vary over time, depending on the performance of tasks.
Algorithm 2 describes how the thresholds vary when a robot engages in
performing a task.
Algorithm 2: Update of the response thresholds for the robot ri
if just engaged in lj then
    θnew(i,j) ← θold(i,j) − σ
    if θnew(i,j) < θmin then
        θnew(i,j) ← θmin
    end if
    for k = 1 → J do
        if k ≠ j then
            θnew(i,k) ← θold(i,k) + σ
            if θnew(i,k) > θmax then
                θnew(i,k) ← θmax
            end if
        end if
    end for
end if
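
The threshold update of Equations 4.2 and 4.3, with the clamping to
[θmin, θmax] used in Algorithm 2, can be sketched as follows. The sketch
interprets the loop as ranging over the other tasks of the same robot, which
is an assumption based on the surrounding text; the function name
`update_thresholds` and its parameters are hypothetical.

```python
from typing import List


def update_thresholds(thresholds: List[float], engaged: int, sigma: float,
                      theta_min: float, theta_max: float) -> List[float]:
    """Return the updated thresholds of one robot after it engages in a task.

    The threshold of the engaged task is lowered by sigma (Eq. 4.2) and the
    thresholds of all other tasks are raised by sigma (Eq. 4.3); every value
    is kept inside [theta_min, theta_max].
    """
    updated = []
    for k, theta in enumerate(thresholds):
        if k == engaged:
            updated.append(max(theta_min, theta - sigma))
        else:
            updated.append(min(theta_max, theta + sigma))
    return updated


new_thresholds = update_thresholds([5.0, 5.0, 9.5], engaged=0, sigma=1.0,
                                   theta_min=1.0, theta_max=10.0)
```

In this example the engaged task drops to 4.0, the second task rises to 6.0,
and the third is capped at the maximum of 10.0, so repeated engagement makes a
robot progressively more specialized in the tasks it performs.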
4.3 Learning Automata Methods
4.3.1 A Brief Introduction
Automata models of learning systems, introduced in the 1960s, were popularized
as learning automata in a survey paper in 1974 [114]. Learning automata [116]
have been widely studied and have attracted considerable interest in recent
years. The first research on learning automata models was carried out in
mathematical psychology; it describes the use of stochastic automata with
updating of action probabilities, which results in a reduction in the number
of states in comparison with deterministic automata. Learning automata can be
applied to a broad range of modeling and control problems, such as the control
of manufacturing plants, pattern recognition and path planning for
manipulators, among others. An important point to note is that the decisions
must be made with very little knowledge of the environment, so as to guarantee
robust behavior without complete knowledge of the system. In a purely
mathematical context, the goal of a learning system is the optimization of a
function that is not known explicitly [114].
Learning is defined as any permanent change in behavior as a result of past
experience, and an automaton is a machine or control mechanism designed to
automatically follow a predetermined sequence of operations or respond to
encoded instructions [117]. The definition of a learning automaton is given in
[157] as follows: “The stochastic automaton attempts a solution of the problem without
any information on the optimal action (initially, equal probabilities are attached
to all the actions). One action is selected at random, the response from the en-
vironment is observed, action probabilities are updated based on that response,
and the procedure is repeated. A stochastic automaton acting as described to
improve its performance is called a learning automaton”.
Stochastic learning automata operating in stationary as well as non-stationary
random environments have been studied extensively [116; 153]. Various
algorithms have been proposed in the literature (e.g., the $L_{R-I}$
algorithm, the $L_{R-P}$ algorithm, the pursuit algorithm, etc.) for the
automaton to update its action probability vector [116]. The objective of a
stochastic learning automaton is to determine how the choice of the action at
any stage should be guided by past actions and responses: when a specific
action is performed, the environment provides a random response which is
either favorable or unfavorable [102].
4.3.2 Definition of Stochastic Processes
Stochasticity or uncertainty appears in all systems but, so far, it has not
been possible to solve optimization problems for large systems while
considering it explicitly. Uncertainty may be due to a lack of reliable data,
to measurement errors, or to parameters that represent information about the
future. Deterministic optimization assumes that the parameters of the problem
are known with certainty, possibly through their average values. Stochastic
optimization relaxes this condition: the parameter values are not known, only
their distributions, and it is usually assumed that these are discrete with a
finite number of possible states.
Loosely speaking, a stochastic process is simply a collection of random
variables indexed by time $t$, where $t$ takes values in an index set $T$. The
index set may be discrete, $T = \{0, 1, 2, \ldots\}$, that is, the process is
a countable collection of random variables indexed by the non-negative
integers (usually $T = \mathbb{N}$), in which case we speak of a stochastic
process in discrete time. It may also be continuous, $T = [0, \infty)$ or
$T = [0, a]$ with $0 < a < \infty$, that is, an uncountable collection of
random variables indexed by the non-negative real numbers, in which case we
speak of a stochastic process in continuous time. It may be denoted by
$X = \{X(t, \omega);\ t \in T,\ \omega \in \Omega\}$.
A more precise definition may be given as follows.
Definition. A stochastic process is a family of indexed random variables
$X = \{X(t, \omega);\ t \in T,\ \omega \in \Omega\}$ defined on a probability
space $(\Omega, F, P)$ and taking values in a measurable space $(S, A)$.
$\Omega$ is the sample space, $F$ is a sigma-algebra defined on the sample
space and $P$ is a probability measure on $(\Omega, F)$. $T$ is an arbitrary
set.
A stochastic process can be viewed in several ways:
• For each choice of t ∈ T, X(t, ω) is a random variable.
• For each choice of ω ∈ Ω, X(t, ω) is a function of t.
• For each choice of ω and t, X(t, ω) is a number.
• In general, it is an ensemble (family) of functions X(t, ω), where t and ω can take different possible values.
A stochastic process X = {X_t : t ∈ T} can be considered as a mapping that depends on two arguments:
\[ X : T \times \Omega \to S, \qquad (t, \omega) \mapsto X(t, \omega) = X_t(\omega) \]
For fixed t, we obtain X(t, ·) = X_t(·), which is a random variable defined on (Ω, F, P) and taking values in (S, A).
Let {X_t : t ∈ T} be a stochastic process and {t_1, ..., t_n} a finite subset of T. The multivariate distribution of the random vector (X_{t_1}, ..., X_{t_n}) is called a finite-dimensional distribution of the process.
Given a probability space (Ω, F, P), a discrete-time stochastic process is any sequence of random variables {X_n}_{n∈N}, all defined on the same space. We consider real random variables, i.e. X_n : (Ω, F, P) → (R, B(R)), where B(R) is the Borel sigma-algebra on R. For each ω ∈ Ω, the stochastic process {X_n}_{n∈N} yields a sequence of real numbers, which is called the trajectory of the process associated with ω.
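The notion of a trajectory can be illustrated with a minimal Python sketch. The symmetric random walk and the function name `random_walk_trajectory` are illustrative assumptions, not part of the thesis; the seed plays the role of the outcome ω, so fixing it fixes the whole sequence of real numbers.

```python
import random


def random_walk_trajectory(n_steps, seed):
    """Return one trajectory (X_0, ..., X_n) of a symmetric random walk.

    The seed stands in for the outcome omega: the same seed always
    produces the same sequence of values.
    """
    rng = random.Random(seed)
    x = 0
    path = [x]
    for _ in range(n_steps):
        x += rng.choice((-1, 1))  # each step is +1 or -1 with equal probability
        path.append(x)
    return path


if __name__ == "__main__":
    # Two different outcomes omega give two different trajectories in general.
    print(random_walk_trajectory(10, seed=1))
    print(random_walk_trajectory(10, seed=2))
```

Each call with a fixed seed corresponds to fixing ω and reading off X_n(ω) for n = 0, 1, ..., n_steps.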
4.3.3 Basic Definition of Learning Automata
A learning automaton is a sextuple <x, Q, u, $\vec{P}(t)$, G, R>, where x is the finite set of inputs, Q = {q_1, q_2, ..., q_m} is a finite set of internal states, u is the set of outputs, $\vec{P}(t)$ = (p_1(t), p_2(t), ..., p_m(t)) is the state probability vector at time instant t, G : Q → u is the output function (normally considered deterministic and one-to-one), and R is an algorithm called the reinforcement scheme, which generates $\vec{P}(t+1)$ from $\vec{P}(t)$ and the particular input at a discrete instant t.
The automaton operates in a random environment and chooses its current state according to the input received from the environment. The new state probability distribution $\vec{P}(t+1)$ reflects the information obtained from the environment. The random environment has a set of inputs u, and its set of outputs is frequently binary, {0, 1}, with '0' corresponding to the reward response and '1' to the penalty response. If the input to the environment is u_i, the environment produces a penalty response with probability c_i.
Fig. 4.3 shows the feedback configuration of a learning automaton operating in a random environment. At each instant t the environment evaluates the action of the automaton with either a penalty '1' or a reward '0'. The performance of the automaton's behavior is measured by the average penalty
\[ I(t) = \frac{1}{m} \sum_{i=1}^{m} p_i(t)\, c_i \tag{4.4} \]
which must be minimized. In order to minimize the expected penalty (4.4), the reinforcement scheme modifies the state probability vector $\vec{P}$. The basic idea is to increase p_i if state q_i generates a reward and to decrease p_i when the same state has produced a penalty. A great number of reinforcement schemes for minimizing the expected penalty have been studied and compared. One of the most serious difficulties that arise in learning automata is the trade-off between learning speed and accuracy: if the speed of convergence is increased in any particular reinforcement scheme, this is almost invariably accompanied by an increased probability of convergence to an undesired state [113; 115].
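As a minimal sketch of the quantities defined above, the following Python fragment evaluates the average penalty of eq. (4.4) and simulates the binary response of the random environment. The function names and the example values are assumptions chosen for illustration only.

```python
import random


def average_penalty(p, c):
    """Average penalty I(t) of eq. (4.4) for probabilities p and penalty probabilities c."""
    m = len(p)
    return sum(p_i * c_i for p_i, c_i in zip(p, c)) / m


def environment_response(i, c, rng=random):
    """Return 1 (penalty) with probability c[i], otherwise 0 (reward)."""
    return 1 if rng.random() < c[i] else 0


if __name__ == "__main__":
    p = [0.5, 0.5]  # state probabilities (illustrative)
    c = [0.2, 0.6]  # penalty probabilities (illustrative)
    print(average_penalty(p, c))
    print(environment_response(0, c))
```

Moving probability mass towards the state with the smallest c_i lowers I(t), which is what a reinforcement scheme tries to achieve.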
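[Figure 4.3: block diagram. The learning automaton (states q1, q2, ..., qm; probability vector p(t)) sends actions u1, u2, ..., um to the random environment (penalty probabilities c1, c2, ..., cm), which returns a response in {0, 1}.]
Figure 4.3: Interaction of learning automaton with random environment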
4.3.4 Stochastic Reinforcement Algorithms based on Reward and Penalty
In the technical literature, a widely used stochastic reinforcement algorithm is $L_{R-I}$, which stands for Linear Reward-Inaction.
Let us suppose that the action chosen by the automaton at instant t is φ_i. For $L_{R-I}$, the action probabilities are updated as follows [102]:
\[ p_i(t+1) = p_i(t) + \lambda \beta(t)\,[1 - p_i(t)] \tag{4.5} \]
\[ p_j(t+1) = p_j(t) - \lambda \beta(t)\, p_j(t) \quad \forall j \neq i,\ 1 \le j \le N \tag{4.6} \]
where 0 < λ < 1 is the learning rate and β(t) is the environment's response: β = 1 (favorable response, or reward) or β = 0 (unfavorable response, or penalty, in which case the algorithm does not change the probabilities, i.e. inaction).
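The update of eqs. (4.5) and (4.6) can be sketched in a few lines of Python. The function name `lri_update` and the default learning rate are illustrative assumptions.

```python
def lri_update(p, i, beta, lam=0.2):
    """One Linear Reward-Inaction step.

    p    : list of action probabilities (sums to 1)
    i    : index of the action chosen at instant t
    beta : environment response, 1 = reward, 0 = penalty
    lam  : learning rate, 0 < lam < 1 (default value is an assumption)
    """
    return [
        p_j + lam * beta * (1.0 - p_j) if j == i  # eq. (4.5), chosen action
        else p_j - lam * beta * p_j               # eq. (4.6), all other actions
        for j, p_j in enumerate(p)
    ]


if __name__ == "__main__":
    p = [0.25, 0.25, 0.25, 0.25]
    print(lri_update(p, i=0, beta=1))  # reward: probability of action 0 grows
    print(lri_update(p, i=0, beta=0))  # penalty: inaction, vector unchanged
```

Because the amount added to the chosen action equals the total amount removed from the others, the vector remains a probability distribution after every step.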
Let us suppose that there are K different specialized tasks. We denote by p_ij(t) the probability that, at instant t, robot r_i selects task l_j. These probabilities satisfy:
\[ 0 \le p_{ij}(t) \le 1; \quad \sum_{j=1}^{K} p_{ij}(t) = 1; \quad i = 1, 2, \ldots, N \text{ robots}; \quad j = 1, 2, \ldots, K \text{ tasks} \tag{4.7} \]
Initially, when the robots have no previous experience, these probabilities are set to the "indifference" position as follows:
\[ p_{ij}(0) = \frac{1}{K} \quad \text{for } i = 1, 2, \ldots, N \text{ robots and } j = 1, 2, \ldots, K \text{ tasks} \tag{4.8} \]
The learning process then starts, in which each robot updates its selection probabilities according to the following conventional updating rule:
\[ p_{ij}(t+1) = p_{ij}(t) + \lambda \beta(t)\,[1 - p_{ij}(t)] \tag{4.9} \]
where 0 < λ < 1 is the learning rate, with a fixed value of 0.2, and β(t) is the usual reward signal generated by the environment of the learning automaton, with the following interpretation. β(t) = 1 (reward) if and only if, for the corresponding task l_j at instant t, #R_j(t) ≤ #L_j(t), i.e. the number of robots performing task l_j does not exceed the number of tasks l_j to be executed. β(t) = 0 (penalty) if and only if #R_j(t) > #L_j(t), i.e. the number of robots performing task l_j is greater than the number of tasks l_j; the automaton also receives a penalty signal whenever there are no pending tasks to be executed. In short, at each instant t the environment evaluates the action of the automaton: a response of 1 means that the action is "favorable" and a response of 0 means that it is "unfavorable", as follows:
\[ \beta_{L_j}(t) = \begin{cases} 1 \ (\text{reward}) & \text{if } \#R_j / \#L_j \le 1 \\ 0 \ (\text{penalty}) & \text{if } \#R_j / \#L_j > 1 \end{cases} \tag{4.10} \]
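The following Python sketch puts eqs. (4.8) to (4.10) together for a single robot. All function names are hypothetical. One point is an assumption rather than something stated in the text: eq. (4.9) only gives the update of the selected task, so the probabilities of the other tasks are reduced here as in eq. (4.6) to keep each robot's vector normalized.

```python
def init_probabilities(n_robots, n_tasks):
    """Eq. (4.8): every robot starts at the 'indifference' position 1/K."""
    return [[1.0 / n_tasks] * n_tasks for _ in range(n_robots)]


def reward_signal(robots_on_task, pending_tasks):
    """Eq. (4.10): 1 (reward) if #R_j <= #L_j, otherwise 0 (penalty).

    No pending tasks always yields a penalty, as stated in the text.
    """
    if pending_tasks <= 0:
        return 0
    return 1 if robots_on_task <= pending_tasks else 0


def update_robot(p_i, j, beta, lam=0.2):
    """Update one robot's probability vector after it selected task j.

    The selected task follows eq. (4.9); the remaining entries are shrunk
    as in eq. (4.6) so that the vector still sums to 1 (assumption).
    """
    return [
        p + lam * beta * (1.0 - p) if k == j else p - lam * beta * p
        for k, p in enumerate(p_i)
    ]


if __name__ == "__main__":
    probs = init_probabilities(n_robots=3, n_tasks=4)
    beta = reward_signal(robots_on_task=2, pending_tasks=5)
    probs[0] = update_robot(probs[0], j=1, beta=beta)
    print(probs[0])
```

With λ = 0.2 and a reward, the probability of the selected task moves from 0.25 to 0.4 and the other three drop to 0.2 each.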
4.4 Ant Colony Optimization
4.4.1 A Brief Introduction
For many years, communities or colonies of social insects have been studied in depth [119; 136], as they provide fascinating examples of functional collective behavior. They are also a clear example of decentralized problem solving, as shown by the way these insects perform tasks such as finding food, building or expanding their nests, and dividing labor. Another important feature is that they solve problems very flexibly, adapting robustly to environmental changes. A great deal of research has therefore been devoted to figuring out how social insects achieve these feats. This research has allowed computer scientists to design a variety of "ant algorithms", all of which attempt to capture some of the remarkable qualities of social insects, such as self-organization, flexibility, and robustness.
Ant Colony Optimization (ACO) is a meta-heuristic approach that was introduced in the early 1990s by Marco Dorigo [37; 38]. Since its introduction, a growing number of researchers have been involved in developing it further. The general idea of the ACO approach is to solve combinatorial optimization problems by imitating the behavior of real ants; more specifically, the source of inspiration is the way ants find shortest paths between food sources and their nest [12] (see Fig. 4.4). ACO algorithms are stochastic search procedures based on a colony of artificial ants (computational agents) that work cooperatively and communicate through artificial pheromone trails [43], by means of a parameterized probabilistic model [45] that the authors call "the pheromone model".
ACO algorithms use a population of artificial ants to construct feasible solutions to a discrete optimization problem. The solutions are evaluated according to a fitness function and, following a pre-defined rule, the ants deposit their solution information in a global memory known as a pheromone mapping, where each component of the pheromone mapping corresponds to an individual connection of the problem being optimized. Ant algorithms are based on two essential principles [42]: (1) self-organization, in which global behavior arises from a myriad of low-level interactions, and (2) stigmergy, in which the individuals interact with one another indirectly, using the environment as an intermediary. That is, one individual changes its surroundings (e.g., by laying a pheromone trail), and other individuals then react to those changes at a later time.
Figure 4.4: Experimental setting presented in [12] that shows the shortest-path-finding capability of ant colonies
ACO algorithms imitate the foraging behavior of natural ants and have been successfully applied to hard combinatorial optimization problems such as the travelling salesman problem [26; 39; 48; 150; 161], the quadratic assignment problem [100], and the job shop scheduling problem [39]. Later, they were applied to many other discrete optimization problems [1; 21; 22; 41; 165] and to further combinatorial optimization problems [11; 76].
4.4.2 Biological Inspiration
“Greater understanding of biology in modern times has enabled significant break-
throughs in improving healthcare, quality of life, and eliminating many diseases
and congenital illnesses. Simultaneously there is a move towards imitating nature
and copying many of the wonders uncovered in biology, resulting in biologically
inspired systems” [73]. Biological inspiration can play many different roles. One of the most important is that biology can serve as an existence proof that some desirable behavior is possible. That is, a biological system may operate according to principles that are applicable to non-biological computing problems. By studying the biological system, one may be able to derive or understand the relevant principles and use them to help solve a non-biological problem.
Ants are able to find the shortest path between their nest and a food source by following the trail of a chemical substance called “pheromone” [28]. If no pheromone trails are available, ants move randomly, but in the presence of pheromone they tend to follow the trail. That is, ants choose the path to follow by a simple rule: the stronger the pheromone trail, the higher the desirability of the path, so the probability of choosing a path increases with the amount of pheromone on it. Since shorter paths accumulate pheromone faster, this behavior allows ants to identify the shortest paths between their nest and the food source. What is even more amazing is that these emergent properties seem to exist without any centralized control [74].
ACO algorithms are based on the following ideas:
• Each path followed by an ant is associated with a candidate solution for a
given problem.
• When an ant follows a path, the amount of pheromone deposited on that
path is proportional to the quality of the corresponding candidate solution
for the target problem.
• When an ant has to choose between two or more paths, the path(s) with a
larger amount of pheromone have a greater probability of being chosen by
the ant.
As a result, the ants eventually converge to a short path, hopefully the opti-
mum or a near-optimum solution for the target problem, as explained before for
the case of natural ants. In essence, the design of an ACO algorithm involves the
specification of [38]:
• An appropriate representation of the problem, which allows the ants to
incrementally construct/modify solutions through the use of a probabilistic
transition rule, based on the amount of pheromone in the trail and on a
local, problem-dependent heuristic.
• A method to enforce the construction of valid solutions, that is, solutions
that are legal in the real-world situation corresponding to the problem def-
inition.
• A problem-dependent heuristic function (η) that measures the quality of
items that can be added to the current partial solution.
• A rule for pheromone updating, which specifies how to modify the pheromone
trail (τ).
• A probabilistic transition rule based on the value of the heuristic function
(η) and on the contents of the pheromone trail (τ) that is used to iteratively
construct a solution.
Artificial ants have several characteristics similar to real ants, namely:
• Artificial ants have a probabilistic preference for paths with a larger amount
of pheromone.
• Shorter paths tend to have larger rates of growth in their amount of pheromone.
• The ants use an indirect communication system based on the amount of
pheromone deposited on each path.
4.4.3 The Ant System Approach
Ant System was first introduced and applied to the TSP by Dorigo et al. [39; 40; 46]. Initially, each ant is placed on a randomly chosen city. During the construction of a feasible solution, ants select the next city to be visited through a probabilistic decision rule. When ant k is in city i and has constructed a partial solution, the probability of moving to a neighboring city j is given by:
\[ p^{k}_{ij}(t) = \begin{cases} \dfrac{[\tau_{ij}(t)]^{\alpha}\,[\eta_{ij}]^{\beta}}{\sum_{u \in J_k(i)} [\tau_{iu}(t)]^{\alpha}\,[\eta_{iu}]^{\beta}} & \text{if } j \in J_k(i) \\ 0 & \text{otherwise} \end{cases} \tag{4.11} \]
where τ_ij is the intensity of the pheromone trail on edge (i, j), η_ij = 1/d_ij is the heuristic visibility of edge (i, j), with d_ij the distance between cities i and j, and J_k(i) is the set of cities that remain to be visited by ant k when it is at city i. α and β are two adjustable positive parameters that control the relative weights of the pheromone trail and of the heuristic visibility.
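The decision rule of eq. (4.11) can be sketched as follows. The function name, the data layout (nested lists for τ and d) and the default values of α and β are assumptions made for illustration.

```python
def transition_probabilities(i, unvisited, tau, dist, alpha=1.0, beta=2.0):
    """Probabilities of moving from city i to each city in `unvisited`.

    tau[i][j]  : pheromone intensity on edge (i, j)
    dist[i][j] : distance d_ij, so the visibility is eta_ij = 1 / d_ij
    Cities that are not in `unvisited` implicitly have probability 0.
    """
    weights = {
        j: (tau[i][j] ** alpha) * ((1.0 / dist[i][j]) ** beta)
        for j in unvisited
    }
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


if __name__ == "__main__":
    tau = [[1.0] * 3 for _ in range(3)]
    dist = [[0.0, 1.0, 2.0], [1.0, 0.0, 1.0], [2.0, 1.0, 0.0]]
    # With equal pheromone everywhere, the nearer city is preferred.
    print(transition_probabilities(0, [1, 2], tau, dist, alpha=1.0, beta=1.0))
```

With uniform pheromone, α = β = 1 and distances 1 and 2, the two candidate cities receive probabilities 2/3 and 1/3.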
After each ant completes its tour, the pheromone amount on each path is adjusted according to
\[ \tau_{ij}(t+1) = (1-\rho)\,\tau_{ij}(t) + \Delta\tau_{ij}(t) \tag{4.12} \]
where
\[ \Delta\tau_{ij}(t) = \sum_{k=1}^{m} \Delta\tau^{k}_{ij}(t) \tag{4.13} \]
\[ \Delta\tau^{k}_{ij}(t) = \begin{cases} Q / L_k & \text{if } (i,j) \in \text{tour done by ant } k \\ 0 & \text{otherwise} \end{cases} \tag{4.14} \]
(1 − ρ) is the pheromone decay parameter (0 < ρ < 1), which models the evaporation of the trail as the ants move from city to city; Q is a constant; L_k is the length of the tour performed by ant k; and m is the number of ants.
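A minimal sketch of the pheromone update of eqs. (4.12) to (4.14) is given below. The function name and the default values of ρ and Q are illustrative assumptions, and the deposit is applied in both directions of each edge on the assumption that distances are symmetric.

```python
def update_pheromone(tau, tours, lengths, rho=0.5, q=1.0):
    """Return the pheromone matrix after all m ants have completed their tours.

    tours   : one list of city indices per ant (closed tours)
    lengths : tour length L_k of each ant
    """
    n = len(tau)
    delta = [[0.0] * n for _ in range(n)]
    for tour, length in zip(tours, lengths):
        # Edges of a closed tour: the last city connects back to the first.
        for a, b in zip(tour, tour[1:] + tour[:1]):
            delta[a][b] += q / length  # eq. (4.14), summed over ants as in (4.13)
            delta[b][a] += q / length  # symmetric problem (assumption)
    return [
        [(1.0 - rho) * tau[a][b] + delta[a][b] for b in range(n)]  # eq. (4.12)
        for a in range(n)
    ]


if __name__ == "__main__":
    tau = [[1.0] * 4 for _ in range(4)]
    new_tau = update_pheromone(tau, tours=[[0, 1, 2, 3]], lengths=[4.0], rho=0.5, q=4.0)
    print(new_tau[0][1], new_tau[0][2])  # edge on the tour vs. edge off the tour
```

Edges that belong to short tours receive larger deposits, while edges that no ant uses only evaporate.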
In our case, a generic robot r_i selects its tasks in a deterministic way based on “forces” f_ij(t). After being initialized at the “indifference” position, these forces are updated as follows:
\[ f_{ij}(t+1) = \rho f_{ij}(t) + (1-\rho)\,\beta(t); \quad 0 \le \rho \le 1 \tag{4.15} \]
where ρ is the usual learning rate of ant colony optimization-like algorithms and β(t) is the reward/penalty signal at instant t, with exactly the same interpretation as for the learning automata-based probabilistic algorithms.
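The force update of eq. (4.15) and a deterministic selection based on it can be sketched as follows. The function names and the value ρ = 0.8 are assumptions for illustration; the text does not fix a value for ρ.

```python
def update_force(f_ij, beta, rho=0.8):
    """Eq. (4.15): exponential smoothing of the reward/penalty signal."""
    return rho * f_ij + (1.0 - rho) * beta


def select_task(forces):
    """Deterministic choice: index of the task with the largest force."""
    return max(range(len(forces)), key=forces.__getitem__)


if __name__ == "__main__":
    forces = [0.25, 0.25, 0.25, 0.25]  # 'indifference' position for K = 4 tasks
    forces[2] = update_force(forces[2], beta=1)  # task 2 was rewarded
    print(forces, select_task(forces))
```

A force is a running average of past rewards, so it stays in [0, 1] and drifts towards 1 for tasks that are repeatedly rewarded and towards 0 for tasks that are repeatedly penalized.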
Part IV
Experimentation and Conclusions
Chapter 5
Experimental Results
Always doubt yourself, until the data leaves
no doubt.
Louis Pasteur
SUMMARY: This chapter presents the experimental results obtained by applying the different decentralized approaches inspired by the division of labor in social insects. Section 5.1 details the preliminaries of the experimentation: the evaluation of the performance index, the introduction of additive noise in the number of pending loads, and the dynamic generation of tasks over time. Section 5.2 presents the experiments with threshold models: the goals and the evaluation of the approach with additive noise and dynamic tasks. Section 5.3 shows the learning curves with the evolution of the system using learning automata-based probabilistic algorithms, including the experiments with additive noise and dynamic tasks. Section 5.4 describes the experiments with ant colony optimization-based deterministic algorithms, presenting the goals and the evaluation of the approach with additive noise and dynamic tasks.
5.1 Preliminaries of the Experimentation
We have conducted several experiments to evaluate the system performance index obtained by applying response threshold models, learning automata-based probabilistic algorithms and ant colony optimization-based deterministic algorithms to the optimal distribution of tasks among the N robots, so that all tasks are executed by the minimum number of robots. The ideal objective is that the performance index, or learning curve, corresponding to the load l_j(t) of each task tends asymptotically to zero for all curves in the minimum time, using the minimum possible number of robots for task execution.
In the simulations we have considered several variants: the size of the multi-robot system, different loads l_j(t) for each type of task, two different ways of carrying out task selection, the generation of additive noise to simulate the robots' error, and the dynamic generation of tasks l_j(t) over time. Based on the values obtained with eq. 4.1, eq. 4.9 and eq. 4.15, we have also employed two different mechanisms for the response threshold model and for the learning automata-based probabilistic algorithms, for the selection of tasks:
1. Maximum principle (MP): at each instant t, choose the task with the highest value of T_θij(s_j), p_ij(t) or f_ij(t), depending on the approach.
2. Strictly random method (SRM): use T_θij(s_j), p_ij(t) or f_ij(t) as probabilities in the strict sense of the word; that is, generate a random number uniformly distributed in (0, 1) and select the task that corresponds to the value obtained, using the inversion method for discrete probability distributions.
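The two selection mechanisms can be sketched in Python as follows. The function names are hypothetical, and the rescaling by the sum of the values is an assumption added so that the same routine also works when the values (for example thresholds or forces) do not sum exactly to 1.

```python
import random


def maximum_principle(values):
    """MP: index of the task with the highest value."""
    return max(range(len(values)), key=values.__getitem__)


def strictly_random(values, u=None):
    """SRM: inversion method for a discrete distribution.

    u is a uniform random number in [0, 1); it is drawn here if not given.
    The task returned is the first one whose cumulative value exceeds u.
    """
    if u is None:
        u = random.random()
    target = u * sum(values)  # rescale in case the values are not normalized
    cumulative = 0.0
    for j, v in enumerate(values):
        cumulative += v
        if target < cumulative:
            return j
    return len(values) - 1  # guard against floating-point rounding


if __name__ == "__main__":
    p = [0.1, 0.6, 0.3]
    print(maximum_principle(p))  # always task 1
    print(strictly_random(p))    # task 1 about 60% of the time
```

MP always exploits the current best estimate, whereas SRM keeps exploring in proportion to the current values.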
5.1.1 Evaluation of the Performance Index
The performance index is an indicator that evaluates the efficiency of each method
concerning the optimal distribution of the existing tasks so that all of them are
executed by means of the minimum number of robots. In other words, the per-
formance index or learning curve for each task is the corresponding load Lj(t)
versus time, the ideal objective being that all these curves tend asymptotically
to zero in the minimum time and also with the additional constraint of using the
minimum possible number of robots for task execution.
For all experiments, the graphs show the performance index for 4 types of tasks with different loads. Each task is represented by a different color (for example, task 1 is red, task 2 is blue, task 3 is green and task 4 is purple). A continuous line shows the performance index without noise, and a dotted line shows it with noise.
5.1.1.1 Additive Noise Generation
To evaluate the evolution of the performance index we have introduced additive noise, perturbing the number of pending loads to simulate the robots' error in estimating the real number of pending tasks. The noise is modeled using a normal distribution (“white noise”) as follows:
\[ Noise = R + R \cdot S = R\,(1 + S) \tag{5.1} \]
where Noise is the perturbed number of pending loads l_j(t), R is the unperturbed value, to whose amplitude the perturbation is proportional, and S is a Gaussian random variable with mean 0 and standard deviation 0.005, N(0, 0.005).
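Eq. (5.1) amounts to a multiplicative Gaussian perturbation of the true load, which can be sketched as follows. The function name `noisy_load` and the `rng` parameter are assumptions; the default standard deviation is the value 0.005 given above.

```python
import random


def noisy_load(r, sigma=0.005, rng=random):
    """Eq. (5.1): perturbed estimate R * (1 + S), with S drawn from N(0, sigma)."""
    s = rng.gauss(0.0, sigma)
    return r * (1.0 + s)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so that the run is repeatable
    true_load = 100.0
    print([round(noisy_load(true_load, rng=rng), 2) for _ in range(5)])
```

Because the perturbation is proportional to R, large loads receive larger absolute errors than small ones, and a load of zero is never perturbed.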
Tables 5.1 and 5.2 show a scheme of the experiments performed with their respective variants.

                            Without Noise            With Noise
Approaches                  (MP and SRM)             (MP and SRM)
Non-dynamic tasks:
Threshold Models            Fig. 5.1 and Fig. 5.2    Fig. 5.1 and Fig. 5.2
Learning Automata           Fig. 5.4 and Fig. 5.5    Fig. 5.4 and Fig. 5.5
Ant Colony Optimization     Fig. 5.7 and Fig. 5.8    Fig. 5.7 and Fig. 5.8

Table 5.1: Experiments performed without dynamic tasks and their respective variants
5.1.1.2 Dynamic Tasks Generation
In the previous experiments, the number of loads for each type of task is fixed at the beginning of the simulation and does not change until the end of the execution. To evaluate the performance of the algorithm we have also generated dynamic tasks, that is, new tasks that appear in the environment over time. This idea is borrowed from classical queue simulation models, so we have used the Poisson distribution to determine the probability of generating a given number of tasks over time:
\[ f(k; \lambda) = \frac{e^{-\lambda} \lambda^{k}}{k!} \tag{5.2} \]
Specifically, we have a different distribution for k = 1 to 100. Each λ is a positive real number that represents the number of tasks expected to be generated during a time interval. So that the expected number of generated tasks decreases over time, and the system is therefore stable, we have parameterized λ as follows:
\[ \lambda(t) = \sigma - \alpha\, t \tag{5.3} \]
where σ is the initial value (for example, 10 or 20) and α is a “task reduction” factor, which we have initially set to 1. Finally, t is the execution time at each instant.
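A minimal sketch of the dynamic task generator defined by eqs. (5.2) and (5.3) is shown below. The function names are hypothetical. Two choices are assumptions rather than part of the text: the rate is clamped at zero once σ − αt becomes negative, and samples are drawn with Knuth's multiplication method, which is adequate for small values of λ.

```python
import math
import random


def poisson_pmf(k, lam):
    """Eq. (5.2): probability of generating exactly k tasks."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def rate(t, sigma=10.0, alpha=1.0):
    """Eq. (5.3): expected number of new tasks at time t, never below 0 (assumption)."""
    return max(0.0, sigma - alpha * t)


def sample_poisson(lam, rng=random):
    """Draw one Poisson-distributed task count (Knuth's method)."""
    if lam <= 0.0:
        return 0
    limit = math.exp(-lam)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


if __name__ == "__main__":
    rng = random.Random(0)
    # New tasks stop appearing once lambda(t) reaches zero.
    print([sample_poisson(rate(t), rng) for t in range(12)])
```

With σ = 10 and α = 1 the expected number of new tasks falls linearly and no tasks are generated from t = 10 onwards.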
                            Without Noise    With Noise
Approaches                  (MP and SRM)     (MP and SRM)
Dynamic tasks:
Threshold Models            Fig. 5.3         Fig. 5.3
Learning Automata           Fig. 5.6         Fig. 5.6
Ant Colony Optimization     Fig. 5.9         Fig. 5.9

Table 5.2: Experiments performed with dynamic tasks and their respective variants
5.2 Experiments with Threshold Models
5.2.1 Goals
In this subsection we present the experiments conducted to test the response threshold model proposed by Bonabeau et al., described in subsection 4.2.2, on the problem of heterogeneous multi-tasks distribution in multi-robot systems. We have introduced additive noise in the number of pending loads and we have generated dynamic tasks over time. The objective of the experiments is to analyze the performance index of the system. In the following sections we describe the experiments performed and the preliminary results obtained.
5.2.2 Evaluation of the Approach with Additive Noise
Fig. 5.1 and Fig. 5.2 show the evolution of the system performance index obtained for the self-selection of heterogeneous specialized tasks through response threshold models, using both mechanisms (maximum principle and strictly random method), with a team of 20 to 30 heterogeneous robots and 4 types of heterogeneous specialized tasks with different loads. Each experiment has been run 10 times and the results shown are the mean of all runs.
Fig. 5.1 shows the performance index obtained with response threshold models for the two task selection mechanisms with noise = 0.10, and Fig. 5.2 presents the results obtained with noise = 0.25. In all cases the additive noise does not degrade the performance of the approach; on the contrary, in most cases better results are obtained with noise.
[Figure 5.1: two panels, "The strictly random method" and "Maximum principle"; x-axis: Time; y-axis: Tasks; curves for tasks J0, J1, J2 and J3, without noise and with noise (0.10).]
Figure 5.1: Learning curves with the evolution of the system performance index for self-election of tasks using Response Threshold Models with noise = 0.10
[Figure 5.2: two panels, "Maximum principle" and "The strictly random method"; x-axis: Time; y-axis: Tasks; curves for tasks J0, J1, J2 and J3, without noise and with noise (0.25).]
Figure 5.2: Learning curves with the evolution of the system performance index for self-election of tasks using Response Threshold Models with noise = 0.25
5.2.3 Evaluation of the Approach with Dynamic Tasks
Fig. 5.3 shows the evolution of the system performance index with dynamic task generation over time using the Poisson distribution. Experiments have been performed 10 times and the results shown are the mean of all runs; additive noise is also generated in the loads, with both the maximum principle and the strictly random method. The results show the dynamic generation of tasks, with the number of generated tasks decreasing over time. All learning curves tend to zero and the performance is not affected by the introduction of noise. Better results are obtained with the maximum principle than with the strictly random method.
5.2.4 Results and Discussion
We have presented and evaluated a method for multi-tasks distribution among a team of robots. The experimental results show that the proposed method is effective and can be applied efficiently to solve this self-coordination problem in multi-robot systems.
[Figure 5.3: two panels, "Maximum principle" and "The strictly random method"; x-axis: Time; y-axis: Tasks; curves for tasks J0, J1, J2 and J3, without noise and with noise (0.20).]
Figure 5.3: Dynamic tasks generation: learning curves with the evolution of the system performance index for self-election of tasks using Response Threshold Models
5.3 Experiments with Learning Automata-based Probabilistic Algorithms
5.3.1 Goals
The learning automata-based probabilistic algorithms tested in these experiments are described in subsections 4.3.3 and 4.3.4. The approach was tested to evaluate the performance index of the system with additive noise and dynamic task generation, for the same problem of heterogeneous multi-tasks distribution in multi-robot systems.
5.3.2 Evaluation of the Approach with Additive Noise
In the same way, Fig. 5.4 and Fig. 5.5 present the evolution of the learning curves obtained for the self-selection of heterogeneous specialized tasks through learning automata-based probabilistic algorithms, using both mechanisms: maximum principle and strictly random method. The experiments involve 20 to 30 heterogeneous robots and 4 types of heterogeneous specialized tasks with different loads. Each experiment has been run 10 times and the results shown are the mean of all runs.
Fig. 5.4 shows the performance index using learning automata-based probabilistic algorithms for both mechanisms with noise = 0.10, and Fig. 5.5 shows the results with noise = 0.25. The learning curves corresponding to the load l_j(t) of each task tend asymptotically to zero. However, when additive noise is introduced in this approach, it can clearly be seen that in some cases more time is required for the execution of the tasks.
[Figure 5.4: two panels, "The strictly random method" and "Maximum principle"; x-axis: Time; y-axis: Tasks; curves for tasks J0, J1, J2 and J3, without noise and with noise (0.10).]
Figure 5.4: Learning curves with the evolution of the system performance index for self-election of tasks using Learning Automata-based probabilistic algorithms with noise = 0.10
According to the previous results, system performance with the learning automata approach is more affected by the introduction of noise than with the response threshold models approach.
5.3.3 Evaluation of the Approach with Dynamic Tasks
Fig. 5.6 shows the evolution of the system performance index with dynamic task generation over time using the Poisson distribution. Experiments have been performed 10 times and the results shown are the mean of all runs; additive noise is also generated in the loads, with both the maximum principle and the strictly random method. The results show the dynamic generation of tasks, with the number of generated tasks decreasing over time. All learning curves tend to zero with both mechanisms and the noise does not affect the performance of the approach; however, better results are obtained with the strictly random method than with the maximum principle.
[Figure 5.5: two panels, "Maximum principle" and "The strictly random method"; x-axis: Time; y-axis: Tasks; curves for tasks J0, J1, J2 and J3, without noise and with noise (0.25).]
Figure 5.5: Learning curves with the evolution of the system performance index for self-election of tasks using Learning Automata-based probabilistic algorithms with noise = 0.25
[Figure 5.6: two panels, "Maximum principle" and "The strictly random method"; x-axis: Time; y-axis: Tasks; curves for tasks J0, J1, J2 and J3, without noise and with noise (0.10).]
Figure 5.6: Dynamic tasks generation: learning curves with the evolution of the system performance index for self-election of tasks using Learning Automata-based probabilistic algorithms
5.3.4 Results and Discussion
We have presented the learning automata-based probabilistic algorithm, applied to the self-coordination problem of multi-robot systems. In particular, it addresses the distribution of heterogeneous multi-tasks to be executed by a team of heterogeneous mobile robots. We have evaluated the robustness of the approach by introducing noise that disturbs the real number of pending tasks and by generating dynamic tasks over time using the Poisson distribution. The results confirm that the robots are capable of selecting the existing tasks autonomously and individually, without the intervention of any global or central task scheduler.
5.4 Experiments with Ant Colony Optimization-based Deterministic Algorithms
5.4.1 Goals
The goal of the experiments presented in this subsection is to test the ability of the ant colony optimization-based deterministic algorithms, described in subsection 4.4, to achieve a distribution of heterogeneous multi-tasks in multi-robot systems. The performance index of the system is evaluated in these experiments under the introduction of additive noise and the dynamic generation of tasks over time.
5.4.2 Evaluation of the Approach with Additive Noise
In this case, Fig. 5.7 and Fig. 5.8 show the evolution of the system performance index obtained through ant colony optimization when additive noise is introduced in the number of pending loads (noise = 0.10 and noise = 0.25). Each experiment has been run 10 times and the results shown are the mean of all runs. To carry out the self-election of heterogeneous tasks we have used both mechanisms, maximum principle and strictly random method, with a team of 20 to 30 heterogeneous robots and 4 types of heterogeneous specialized tasks with different loads.
[Figure 5.7: two panels, "Maximum principle" and "The strictly random method"; x-axis: Time; y-axis: Tasks; curves for tasks J0, J1, J2 and J3, without noise and with noise (0.10).]
Figure 5.7: Learning curves with the evolution of the system performance index for self-election of tasks using Ant Colony Optimization-based deterministic algorithms with noise = 0.10
[Figure 5.8: two panels, "Maximum principle" and "The strictly random method"; x-axis: Time; y-axis: Tasks; curves for tasks J0, J1, J2 and J3, without noise and with noise (0.25).]
Figure 5.8: Learning curves with the evolution of the system performance index for self-election of tasks using Ant Colony Optimization-based deterministic algorithms with noise = 0.25
According to the results shown in Fig. 5.7 and Fig. 5.8, in all cases the best results are obtained with the maximum principle rather than with the strictly random method. All learning curves tend to zero; however, when additive noise is introduced in the number of pending tasks, the performance index of the system is affected, and it can clearly be seen that in most cases more time is required for the execution of the tasks.
5.4.3 Evaluation of the Approach with Dynamic Tasks
Finally, we present the evolution of the system performance index when tasks
are generated dynamically over time following a Poisson distribution and ant
colony optimization is applied (see Fig. 5.9). As before, each experiment was
run 10 times and the results shown are the mean over all runs. Additive noise
was again introduced in the loads, with both the maximum principle and the
strictly random method. The results show the dynamic generation of tasks over
time, with the number of generated tasks decreasing over time. All learning
curves tend to zero with both mechanisms, and the introduction of additive
noise does not degrade the performance; in some cases the results are even
better with noise.
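Dynamic task generation of this kind can be illustrated with a small sampler.
The sketch below assumes that the number of new tasks arriving at each time
step is an independent Poisson draw whose rate is supplied by the caller. The
rate schedule, the function names and the use of Knuth's multiplication method
are illustrative choices, not details taken from the experiments.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method.

    Suitable for small rates; the loop multiplies uniform draws until the
    product falls to exp(-lam) or below.
    """
    threshold = math.exp(-lam)
    k = 0
    product = rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def generate_arrivals(rates, rng):
    """Number of new tasks at each time step, one Poisson draw per rate.

    Assumption: arrivals at different steps are independent.
    """
    return [sample_poisson(lam, rng) for lam in rates]


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical schedule: the arrival rate decays linearly over 10 steps.
    rates = [5.0 * (1.0 - t / 10.0) for t in range(10)]
    print(generate_arrivals(rates, rng))
```

Passing a decreasing list of rates gives a number of generated tasks that falls
over time on average, which is one possible reading of the behaviour described
above.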
[Figure: two panels, "The strictly random method" and "Maximum principle";
x-axis: Time, y-axis: Tasks; curves for J0, J1, J2, J3, without noise and with
noise (0.10).]
Figure 5.9: Dynamic tasks generation: learning curves with the evolution of the
system performance index using Ant Colony Optimization-based deterministic
algorithms
Fig. 5.10 shows the probability mass function and the cumulative distribution
function obtained in experiments with dynamic tasks generation using the Poisson
distribution.
Figure 5.10: The index k represents the number of tasks expected to be generated
during a time interval for different values of λ, and P(X = k) describes the
probability that a value of variable X with a given probability distribution is
equal to k
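For reference, the two functions in Fig. 5.10 follow the standard Poisson
definitions: P(X = k) = λ^k e^(-λ) / k!, with the cumulative distribution
function being the sum of these probabilities from 0 to k. A minimal sketch,
with hypothetical function names:

```python
import math


def poisson_pmf(k, lam):
    """Probability mass function: P(X = k) for a Poisson rate lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def poisson_cdf(k, lam):
    """Cumulative distribution function: P(X <= k)."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))


if __name__ == "__main__":
    for lam in (1.0, 4.0, 10.0):  # example rates, chosen for illustration
        row = [round(poisson_pmf(k, lam), 4) for k in range(6)]
        print("lambda =", lam, "pmf(0..5) =", row)
```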
Fig. 5.11 shows a summary of the number of tasks l_j performed by each robot
R_i using both mechanisms, the maximum principle and the strictly random
method, for the approaches proposed in this thesis. It can be clearly seen that
each robot specializes in a particular task and, after completing the current
task, moves on to perform another task.
5.4.4 Results and Discussion
We have evaluated the efficiency of the approach concerning the optimal
distribution of the existing tasks so that all of them are executed by means of
the minimum number of robots. In the experiments conducted, we evaluated the
performance index of the system under additive noise and under dynamic task
generation over time. According to the results obtained, the approach can be
efficiently applied to solve this self-coordination problem in multi-robot
systems.
[Figure: three rows of two panels each, "Maximum principle" and "The strictly
random method"; x-axis: Robots, y-axis: Number of tasks; legend: Tasks J0, J2,
J3, J4.]
(a) Using the response threshold approach in Fig. 5.1
(b) Using learning automata-based probabilistic algorithms in Fig. 5.4
(c) Using ant colony optimization-based deterministic algorithms in Fig. 5.7
Figure 5.11: Number of tasks performed by each robot
Chapter 6
Conclusions and Further Work
There are two modes of acquiring
knowledge, namely, by reasoning and
experience. Reasoning draws a conclusion
and makes us grant the conclusion, but
does not make the conclusion certain, nor
does it remove doubt so that the mind may
rest on the intuition of truth unless the
mind discovers it by the path of experience.
Roger Bacon
SUMMARY: This chapter summarizes the results of the thesis and con-
cludes by suggesting possible future extensions to the presented work.
6.1 Conclusions
The research described in this thesis concerns the coordination of multi-robot
systems and focuses on the self-coordination problem in the distribution of
heterogeneous multi-tasks using different approaches: response threshold
models, a reinforcement learning algorithm based on learning automata theory
and, finally, ant colony optimization-based deterministic algorithms. We have
focused our interest on truly decentralized solutions, in the sense that the
robots have to select the existing tasks autonomously and individually, so that
all the tasks are optimally executed without the intervention of any global and
central task scheduler. After a brief overview of the experimental results
obtained, we present the conclusions of this research work in detail as follows:
• We have proposed and presented a bio-inspired solution based on response
threshold models to solve the problem corresponding to the multi-tasks
distribution. More specifically, it addresses the self-election of heteroge-
neous and specialized tasks by autonomous robots, as opposed to the usual
multi-tasks allocation problem in multi-robot systems in which an exter-
nal controller distributes the existing tasks among the individual robots.
According to the results obtained, we have shown that the bio-inspired
threshold model can be efficiently applied to solve this self-coordination
problem in multi-robot systems [131].
• We have proposed and presented a solution based on a learning automata-based
probabilistic algorithm, applied to the self-coordination problem of
multi-robot systems, taking into account the distribution of heterogeneous
multi-tasks in a team of mobile robots. The performance indexes or learning
curves obtained for each task, corresponding to the load L_i(t) versus time,
confirm that the robots are capable of selecting the existing tasks in an
autonomous and individual manner without the intervention of any global and
central task scheduler. We have shown that the algorithm can be efficiently
applied to solve this self-coordination problem in multi-robot systems,
obtaining truly decentralized solutions [132].
• We have compared two different approaches and proposed a solution to the
self-coordination problem of multi-robot systems in the distribution of
heterogeneous multi-tasks by applying Ant Colony Optimization-based
deterministic algorithms as well as Learning Automata-based probabilistic
algorithms. We have evaluated the efficiency of each method concerning the
optimal distribution of the existing tasks so that all of them are executed by
means of the minimum number of robots. According to the results obtained, we
can speak of multi-tasks selection instead of multi-tasks allocation; that is,
the agents or robots select the tasks instead of being assigned a task by a
central controller. We have shown that both approaches can be efficiently
applied to solve this self-coordination problem in multi-robot systems,
obtaining truly decentralized solutions [32].
• Apart from the analysis mentioned above of the performance indexes achieved
by each approach, we have also analyzed the robustness of each method with
regard to the estimation error or noise, as it is an important and critical
parameter concerning the practical viability of these methods in real
multi-robot scenarios. We have perturbed the number of pending loads to
simulate the robot’s error in estimating the real number of pending tasks, and
we have also studied the performance index with dynamic generation of loads
over time. To carry out the selection of tasks in the approaches we used two
mechanisms: the maximum principle and the strictly random method. In most
experiments, the best results are obtained with the strictly random method
instead of the maximum principle. According to the results obtained, the noise
generated does not affect the performance of the approaches, since the best
results are obtained by generating noise in the pending loads [32].
• Finally, we have experimented with response threshold models and learning
automata-based probabilistic algorithms applied to the general problem of
coordinating multiple robots. We have conducted several experiments to evaluate
the evolution of the performance index considering several variants, such as
the multi-robot system size, different loads for each type of task, two
different ways to carry out the task selection, the additive noise generation
to simulate the robot’s error and the dynamic generation of tasks over time.
According to the results obtained, the noise generated does not affect the
performance of the response threshold models approach, since the best results
are obtained by generating noise in the pending loads; however, when applying
learning automata-based probabilistic algorithms, in some cases more time is
required for the execution of tasks. We have also shown that both approaches
can be efficiently applied to solve this self-coordination problem in
multi-robot systems, obtaining truly decentralized solutions.
6.2 Future Research Work
This PhD thesis describes in detail a study of the coordination problem in
multi-robot systems; in particular, it addresses the distribution of
heterogeneous multi-tasks among multiple robots. The solutions presented in
this work were useful to complete the goals proposed at the beginning of this
thesis. However, the development and the results obtained with the proposed
methods suggest revisions and improvements that could lead to new research
lines in many ways. Next, we summarize possible future research lines arising
from this PhD thesis:
• We acknowledge the need for more flexible inter-robot and inter-group
coordination, because environments may not always be fully known and
communication will not be perfect. A major contributor to the complexity of
multi-robot problems is task assignment. Therefore, an interesting topic of
research would be to study and test other sophisticated techniques for
optimizing the distribution of multi-tasks.
• With respect to the mathematical part, it would be interesting to perform
the task generation following a periodic pattern (hours, days, months, etc.)
through manipulation of sinusoidal functions. In addition, it would be
interesting to define tasks with priorities, that is, tasks with penalty costs
due to inactivity in certain tasks or non-compliance with some important tasks.
• It would be interesting to study and implement these results in some robotic
simulators (e.g. Player&Stage, Pyrobotics, Webots, RoboCup) and specify
which multi-robot simulator will be the most appropriate to carry out the
implementation in real robots.
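The periodic task generation suggested in the list above could, for instance,
be driven by a sinusoidally modulated arrival rate. The following is a purely
hypothetical sketch of such a rate function: the functional form, the parameter
names and the example values are assumptions made for illustration and are not
part of the work presented in this thesis.

```python
import math


def periodic_rate(t, base_rate, amplitude, period):
    """Hypothetical task-arrival rate at time t.

    A base rate is modulated by a sinusoid with the given period.
    Restricting amplitude to [0, 1] keeps the rate non-negative.
    """
    if not 0.0 <= amplitude <= 1.0:
        raise ValueError("amplitude must lie in [0, 1]")
    return base_rate * (1.0 + amplitude * math.sin(2.0 * math.pi * t / period))


if __name__ == "__main__":
    # Example: a 24-step cycle sampled at four points.
    for t in (0, 6, 12, 18):
        print(t, round(periodic_rate(t, base_rate=5.0, amplitude=0.5, period=24), 3))
```

Such a rate could then be used as the parameter of a Poisson task generator at
each time step.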
Bibliography
Of the various instruments invented
by man, the most amazing is the
book; all others are extensions of his
body. Only the book is an extension
of the imagination and memory.
Jorge Luis Borges
[1] Alaya, I., Solnon, C. and Ghedira, K. (2007). Ant colony optimization
for multi-objective optimization problems. In Proceedings of the 19th IEEE
International Conference on Tools with Artificial Intelligence, pp. 450–457.
[2] Agassounon, W. and Martinoli, A. (2002). Efficiency and robustness of
threshold-based distributed allocation algorithms in multi-agent systems.
In 1st International Joint Conference on Autonomous Agents and Multi-Agents
Systems, pp. 1090–1097.
[3] Arcak, M. (2007). Passivity as a design tool for group coordination. In IEEE
Transactions on Automatic Control, 52(8):1380–1390.
[4] Arai, T., Pagello, E. and Parker, L.E. (2002). Guest editorial advances in
multirobot systems. In IEEE Transactions on Robotics and Automation,
volume 18, pages 655–661.
[5] Baca, J.A. (2011). A heterogeneous modular robotic system towards the
execution of cooperative tasks. Ph.D. thesis, Universidad Politécnica de Madrid.
[6] Baeksuk, C., Kyungmo, J., Youngsu, C., Daehie, H., Myo-Taeg, L., Shin-
suk, P., Yongkwun, L., Sung-Uk, L., Min, C.K. and Kang, H.K. (2009).
Robotic automation system for steel beam assembly in building construc-
tion. In IEEE 4th International Conference on Autonomous Robots and
Agents, pages 655–661.
[7] Baglietto, M., Cannata, C., Capezio, F., Grosso, A. and Sgorbissa, A.
(2009). A multi-robot coordination system based on RFID technology. In
IEEE International Conference on Advanced Robotics, pages 1–6.
[8] Balch, T. (1998). Taxonomies of multirobot task and reward. Technical
Report, Carnegie Mellon University.
[9] Berman, S., Lindsey, Q., Sakar, M., Kumar, V. and Pratt, S. (2010). Study
of group food retrieval by ants as a model for multi-robot collective transport
strategies. Robotics: Science and Systems, The MIT Press.
[10] Bernon, C., Chevrier, V., Hilaire, V. and Marrow, P. (2005). Applications
of self-organising multi-agent systems: an initial framework for comparison.
Informatica, 30:73–82.
[11] Blum, C. and Dorigo, M. (2004). The hyper-cube framework for ant colony
optimization. IEEE Transactions on Systems, Man, and Cybernetics, Part
B, 34(2):1161–1172.
[12] Blum, C. (2005). Ant colony optimization: introduction and recent trends.
Physics of Life Reviews, 2(4):353–373.
[13] Bonabeau, E., Theraulaz, G. and Deneubourg, J. (1996). Quantitative
study of the fixed threshold model for the regulation of division of labour
in insects societies. Proceedings Biological Science, pages 1565–1569.
[14] Bonabeau, E., Theraulaz, G., Deneubourg, J.L., Aron, S. and Camazine, S.
(1997). Self-organization in social insects. Trends in Ecology & Evolution,
12(5):188–193.
[15] Bonabeau, E., Theraulaz, G. and Deneubourg, J. (1998). Fixed response
thresholds and the regulation of division of labor in insect societies. Bulletin
of Mathematical Biology, pages 753–807.
[16] Bonabeau, E., Dorigo, M. and Theraulaz, G. (1999). Swarm intelligence:
from natural to artificial systems. New York: Oxford Univ. Press.
[17] Braunl, T. (2008). Embedded robotics: mobile robot design and applications
with embedded systems. Springer-Verlag Berlin Heidelberg.
[18] Burgard, W., Moors, M., Stachniss, C. and Schneider, F. (2005). Coordinated
multi-robot exploration. IEEE Transactions on Robotics, 21(3):376–386.
[19] Cao, Y., Fukunaga, A.S. and Kahng, A.B. (1997). Cooperative mobile
robotics: antecedents and directions. Autonomous Robots, 4:1–23.
[20] Cao, Y., Ren, W. and Li, Y. (2009). Distributed discrete-time coordinated
tracking with a time-varying reference state and limited communication.
Automatica, 45(5):1299–1305.
[21] Chaharsooghi, S.K. and Meimand Kermani, A.H. (2008). An intelligent
multi-colony multi-objective ant colony optimization (ACO) for the 0-1
knapsack problem. In IEEE Congress on Evolutionary Computation, pages
1195–1202.
[22] Chaharsooghi, S.K. and Meimand Kermani, A.H. (2008). An effective
ant colony optimization algorithm (ACO) for multi-objective resource
allocation problem (MORAP). Applied Mathematics and Computation,
200(1):167–177.
[23] Chaimowicz, L., Sugar, T., Kumar, V. and Campos, M. (2001). An architecture
for tightly coupled multi-robot cooperation. In IEEE International
Conference on Robotics and Automation, volume 4, pages 2292–2297.
[24] Chaimowicz, L., Grocholsky, B., Keller, J.F., Kumar, V. and Taylor, C.J.
(2004). Experiments in multirobot air-ground coordination. In IEEE International
Conference on Robotics and Automation, volume 4, pages 4053–4058.
[25] Chunyang, L., Yingwei, M. and Chang’an, L. (2009). Cooperative multi-
robot map-building under unknown environment. In Proceedings of the 2009
International Conference on Artificial Intelligence and Computational
Intelligence, volume 3, pages 392–396.
[26] Colorni, A., Dorigo, M. and Maniezzo, V. (1991). Distributed optimiza-
tion by ant colonies. In Proceedings of ECAL91 - European Conference on
Artificial Life, pages 134–142.
[27] Dai, Y. and Lee, S.G. (2011). Leader-follower formation control based on
hybrid formation control framework and waypoint in cone method. In IEEE
International Conference on Robot, Vision and Signal Processing, pages
233–236.
[28] Detrain, C., Deneubourg, J.L. and Pasteels, J. (1999). Decision-making in
foraging by social insects. In C. Detrain, J.L. Deneubourg, and J. Pasteels,
editors, Information Processing in Social Insects.
[29] De Almeida, A.T. and Fong, J. (2011). Domestic service robots. IEEE
Robotics and Automation Magazine, 18(3):18–20.
[30] De Hoog, J., Cameron, S. and Visser, A. (2010). Dynamic team hierarchies
in communication-limited multi-robot exploration. In IEEE International
Workshop on Safety Security and Rescue Robotics, pages 1–7.
[31] De Lope, J., Maravall, D. and Quinonez, Y. (2012). Response threshold
models and stochastic learning automata for self-coordination of heterogeneous
multi-tasks distribution in multi-robot systems. Robotics and Autonomous
Systems, DOI: 10.1016/j.robot.2012.07.008.
[32] De Lope, J., Maravall, D. and Quinonez, Y. (2012). Decentralized multi-
tasks distribution in heterogeneous robot teams by means of ant colony
optimization and learning automata. In Hybrid Artificial Intelligent Systems,
volume 7208, pages 103–114.
[33] Dias, B. and Stentz, A. (2000). A free market architecture for distributed
control of a multirobot system. In 6th International Conference on Intelligent
Autonomous Systems, pages 115–122.
[34] Dias, B. (2004). Traderbots: A new paradigm for robust and efficient multi-
robot coordination in dynamics environments. Ph.D. dissertation, Robotics
Institute, Carnegie Mellon University, Pittsburgh.
[35] Dias, M.B., Zlot, R., Kalra, N. and Stentz, A. (2006). Market-based
multi-robot coordination: a survey and analysis. Proceedings of the IEEE,
94(7):1257–1270.
[36] Dimarogonas, D.V. and Johansson, K.H. (2010). Stability analysis for
multi-agent systems using the incidence matrix: Quantized communication
and formation control. Automatica, 46(4):695–700.
[37] Dorigo, M., Maniezzo, V. and Colorni, A. (1991). The ant system: an
autocatalytic optimizing process. Technical Report TR91-016, Politecnico
di Milano.
[38] Dorigo, M. (1992). Optimization, learning and natural algorithms. Ph.D.
thesis, Dipartimento di Elettronica, Politecnico di Milano, Milan.
[39] Dorigo, M., Maniezzo, V. and Colorni, A. (1996). The ant system: optimiza-
tion by a colony of cooperating agents. In IEEE Transactions on Systems,
Man, and Cybernetics-Part B, 26(1):29–41.
[40] Dorigo, M. and Gambardella, L.M. (1997). Ant Colony System: A co-
operative learning approach to the traveling salesman problem. In IEEE
Transactions on Evolutionary Computation, 1(1):53–66.
[41] Dorigo, M., Di, C. and Gambardella, L.M. (1999). Ant algorithms for
discrete optimization. Artificial Life, 5(2):137–172.
[42] Dorigo, M., Bonabeau, E. and Theraulaz, G. (2000). Ant algorithms and
stigmergy. Future Generation Computer Systems, 16(9):851–871.
[43] Dorigo, M. and Stutzle, T. (2004). Ant colony optimization. MIT Press,
Cambridge, MA.
[44] Dorigo, M. (2005). Swarm-bot: An experiment in swarm robotics. In Proc.
of the 2005 IEEE Swarm Intelligence Symp, pages 192–200.
[45] Dorigo, M. and Blum, C. (2005). Ant colony optimization theory: a survey.
Theoretical Computer Science, 344(2-3):243–278.
[46] Dorigo, M., Birattari, M. and Stutzle, T. (2006). Ant colony optimization:
artificial ants as a computational intelligence technique. IEEE Computational
Intelligence Magazine, 1(4):28–39.
[47] Dorigo, M. and Birattari, M. (2007). Swarm intelligence. Scholarpedia,
2(9):1462.
[48] Duan, H. and Xiufen, Y. (2007). Hybrid ant colony optimization using
memetic algorithm for traveling salesman problem. In IEEE International
Symposium on Approximate Dynamic Programming and Reinforcement
Learning, pages 92–95.
[49] Dudek, G., Jenkin, M.R.M., Milios, E. and Wilkes, D. (1996). A taxonomy
for multi-agent robotics. Autonomous Robots, 3(4):375–397.
[50] Duro, R.J., Grana, M. and de Lope, J. (2010). On the potential contribu-
tions of hybrid intelligent approaches to Multicomponent Robotic System
development. Information Sciences, 180(14):2635–2648.
[51] Eckholm, B., Anderson, K., Weiss, M. and DeGrandi-Hoffman, G. (2011).
Intracolonial genetic diversity in honeybee (Apis mellifera) colonies in-
creases pollen foraging efficiency. Behavioral Ecology and Sociobiology,
65(5):1037–1044.
[52] Emrani, S., Dirafzoon, A. and Talebi, H.A. (2011). Leader-follower forma-
tion control of autonomous underwater vehicles with limited communica-
tions. In IEEE International Conference on Control Applications, pages
921–926.
[53] Farinelli, R., Iocchi, L. and Nardi, D. (2004). Multirobot systems: A clas-
sification focused on coordination. IEEE Transactions on Systems, Man,
and Cybernetics, Part B, 34(5):2015–2028.
[54] Feng, S. and Zhang, H. (2011). Formation control for wheeled mobile robots
based on consensus protocol. In IEEE International Conference on Information
and Automation, pages 696–700.
[55] Ferrandez, J.M., de la Paz, F. and De Lope, J. (2010). Intelligent robotics
and neuroscience. Robotics and Autonomous Systems, 58(12):1221–1222.
[56] Fierro, F., Das, A., Spletzer, J., Esposito, J., Kumar, V., Ostrowski, J.P.,
Pappas, G., Taylor, K.J., Hur, Y., Alur, R., Lee, I., Grudic, G. and Southall,
B. (2002). A framework and architecture for multi-robot coordination. The
International Journal of Robotics Research, 21(10-11):977–995.
[57] Fink, J., Michael, N., Kim, S. and Kumar, V. (2009). Planning and con-
trol for cooperative manipulation and transportation with aerial robots.
International Symposium on Robotics Research, pages 324–334.
[58] Fox, D., Ko, J., Konolige, K., Limketkai, B., Schulz, D. and Stewart, B.
(2006). Distributed multi-robot exploration and mapping. Proceedings of
the IEEE, 95(7):1325–1339.
[59] Fujii, M., Inamura, W., Murakami, H., Tanaka, K. and Kosuge, K. (2007).
Cooperative control of multiple mobile robots transporting a single object
with loose handling. In IEEE International Conference on Robotics and
Biomimetics, pages 816–822.
[60] Fulbright, R. and Stephens, L.M. (1994). Classification of multiagent
systems, USC Technical Report ECE 06-94-02.
[61] Garnier, S., Gautrais, J. and Theraulaz, G. (2007). The biological principles
of swarm intelligence. Swarm Intelligence, 1(1):3–31.
[62] Gabbai, J.M.E., Yin, H., Wright, W.A. and Allinson, N.M. (2005). Self-
organization, emergence and multi-agent systems. In IEEE International
Conference on Neural Networks and Brain, pages 13–15.
[63] Gautrais, J., Theraulaz, G., Deneubourg, J.L. and Anderson, C. (2002).
Emergent polyethism as a consequence of increased colony size in insect
societies. Journal of Theoretical Biology, 215(3):363–373.
[64] Gerkey, B.P. and Mataric, M.J. (2002). Sold!: auction methods for mul-
tirobot coordination. IEEE Transactions on Robotics and Automation,
18(5):758–768.
[65] Gerkey, B.P. and Mataric, M.J. (2003). Multi-robot task allocation: ana-
lyzing the complexity and optimality of key architectures. In IEEE Interna-
tional Conference on Robotics and Automation, volume 3, pages 3862–3868.
[66] Gerkey, B. and Mataric, M.J. (2004). A formal analysis and taxonomy of
task allocation in multi-robot systems. International Journal of Robotics
Research, 23(9):939–954.
[67] Ghommam, J., Mehrjerdi, H. and Saad, M. (2011). Leader-follower forma-
tion control of nonholonomic robots with fuzzy logic based approach for
obstacle avoidance. In IEEE/RSJ International Conference on Intelligent
Robots and Systems, pages 2340–2345.
[68] Gordon, D.M. (2007). Control without hierarchy. Nature, 446(7132):143.
[69] Gove, R., Hayworth, M., Chhetri, M. and Rueppell, O. (2009). Division of
labour and social insect colony performance in relation to task and mating
number under two alternative response threshold models. Insectes Sociaux,
56(3):319–331.
[70] Guglielmelli, E., Johnson, M.J. and Shibata, T. (2009). Guest editorial
special issue on rehabilitation robotics. In IEEE Transactions on Robotics,
volume 25, pages 447–480.
[71] Hanjong, J., ChiSu, S., Kyunghun, K., Kyunghwan, K. and Jaejun, K.
(2007). A study on the advantages on high-rise building construction which
the application of construction robots take. In IEEE Control, Automation
and Systems, pages 1933–1936.
[72] Hassas, S., Di Marzo-Serugendo, G., Karageorgos, A. and Castelfranchi,
C. (2006). Self-Organising mechanisms from social and business/economics
approaches. Informatica, 30(1):63–71.
[73] Hinchey, M.G. and Sterritt, R. (2007). 99% (Biological) inspiration.... In
Proceedings of the Fourth IEEE International Workshop on Engineering of
Autonomic and Autonomous Systems, pages 187–195.
[74] Hirsh, A.E. and Gordon, D.M. (2001). Distributed problem solving in social
insects. Annals of Mathematics and Artificial Intelligence, 31(1-4):199–221.
[75] Howard, A., Parker, L.E. and Sukhatme, G.S. (2006). Experiments with a
large heterogeneous mobile robot team: exploration, mapping, deployment
and detection. The International Journal of Robotics Research, 25(5-6):431–
447.
[76] Hu, X., Zhang, J. and Li, Y. (2008). Orthogonal methods based ant colony
search for solving continuous optimization problems. Journal of Computer
Science and Technology, 23(1):2–18.
[77] Hu, Y., Wang, L., Liang, J. and Wang, T. (2011). Cooperative box-pushing
with multiple autonomous robotic fish in underwater environment. In IET
Control Theory and Applications, volume 5, pages 2015–2022.
[78] Huntsberger, T.L., Pirjanian, P., Trebi-Ollennu, A., Nayar, H.D., Aghazar-
ian, H., Ganino, A.J., Garrett, M., Joshi, S.S. and Schenker, P.S. (2003).
CAMPOUT: a control architecture for tightly coupled coordination of mul-
tirobot systems for planetary surface exploration. IEEE Transactions on
Systems, Man and Cybernetics, Part A: Systems and Humans, 33(5):550–
559.
[79] Huntsberger, T.L., Trebi-Ollennu, A., Aghazarian, H., Schenker, P.S., Pir-
janian, P. and Nayar, H.D. (2004). Distributed control of multi-robot sys-
tems engaged in tightly coupled tasks. Autonomous Robots, 17(1):79–92.
[80] Iocchi, L., Nardi, D. and Salerno, M. (2001). Reactivity and deliberation:
a survey on multi-robot systems. In Balancing Reactivity and Social
Deliberation in Multi-Agent Systems, pages 9–34.
[81] Jeanson, R., Fewell, J.H., Gorelick, R. and Bertram, S. (2007). Emergence
of increased division of labor as a function of group size. Behavioral Ecology
and Sociobiology, 62(2):289–298.
[82] Jelasity, M., Babaoglu, O. and Laddaga, R. (2006). Guest editors’ introduc-
tion: self-management through self-organization. IEEE Intelligent Systems,
21(2):8–9.
[83] Jones, C., Shell, D., Mataric, M.J. and Gerkey, B. (2004). Principled
approaches to the design of multi-robot systems. In Proc. of the Workshop
on Networked Robotics, IEEE/RSJ International Conference on Intelligent
Robots and Systems, pages 71–80.
[84] Jones, C. and Mataric, M.J. (2004). The use of internal state in multi-
robot coordination. In Proceedings of the Hawaii International Conference
on Computer Sciences, pages 27–32.
[85] Jones, C. and Mataric, M.J. (2005). Behavior-based coordination in
multi-robot systems. Autonomous Mobile Robots: Sensing, Control,
Decision-Making, and Applications.
[86] Jones, E., Browning, B., Dias, B., Argall, B., Veloso, M. and Stentz, A.
(2006). Dynamically formed heterogeneous robot teams performing
tightly-coordinated tasks. In IEEE International Conference on Robotics and
Automation, pages 570–575.
[87] Khamis, A.M., Kamel, M.S. and Salichs, M.A. (2006). Cooperation: con-
cepts and general typology. In IEEE International Conference on Systems,
Man and Cybernetics, volume 2, pages 1499–1505.
[88] Konolige, K., Fox, D., Ortiz, C., Agno, A., Eriksen, M., Limketkai, B., Ko,
J., Morisset, B., Schulz, D., Stewart, B. and Vicent, R. (2006). Centibots:
very large scale distributed robotic teams. In Experimental Robotics IX:
The 9th International Symposium, Springer Tracts in Advanced Robotics,
volume 9, pages 131–140.
[89] Kube, R.C. and Bonabeau, E. (2000). Cooperative transport by ants and
robots. Robotics and Autonomous Systems, 30:85–101.
[90] Lacroix, P., Polotski, V. and Cohen, P. (1999). Decentralized control of
cooperative multi-robot systems. Integrated Computer-Aided Engineering,
6(4):259–274.
[91] Langer, D., Rosenblatt, J.K. and Hebert, M. (1994). A Behavior-based
system for off-road navigation. In IEEE Transactions on Robotics and
Automation, volume 10, pages 776–782.
[92] Lawton, J.R.T., Beard, R.W. and Young, B.J. (2003). A decentralized
approach to formation maneuvers. IEEE Transactions on Robotics and
Automation, 19(6):933–941.
[93] Lim, C., Mamat, R. and Braunl, T. (2009). Market-based approach for
multi-team robot cooperation. In IEEE International Conference on
Autonomous Robots and Agents, pages 62–67.
[94] Linder, T., Tretyakov, V., Blumenthal, S., Molitor, P., Holz, D., Murphy,
R., Tadokoro, S. and Surmann, H. (2010). Rescue robots at the collapse
of the municipal archive of cologne city: a field report. In International
Workshop on Safety Security and Rescue Robotics, pages 1–6.
[95] Liu, S., Chen, C., Xie, L. and Chang, Y.H. (2010). Formation control of
multi-robot systems. In International Conference on Control Automation
Robotics and Vision, pages 1057–1062.
[96] Loula, A., Gudwin, R., El-Hani, C.N. and Queiroz, J. (2010). Emergence of
self-organized symbol-based communication in artificial creatures. Cognitive
Systems Research, 11(2):131–147.
[97] Low, K.H. (2011). Robot-assisted gait rehabilitation: from exoskeletons to
gait systems. In Defense Science Research Conference and Expo (DSR),
pages 1–10.
[98] Macdonald, E.A. (2011). Multi-robot assignment and formation control.
M.S. thesis, Georgia Institute of Technology.
[99] Madhavan, R., Fregene, K. and Parker, L.E. (2002). Distributed heteroge-
neous outdoor multi-robot localization. In IEEE International Conference
on Robotics and Automation, pages 374–381.
[100] Maniezzo, V., Dorigo, M. and Colorni, A. (1994). The ant system applied
to the quadratic assignment problem. Technical Report IRIDIA/94-28,
Université Libre de Bruxelles, Belgium.
[101] Maravall, D., De Lope, J. and Domínguez, R. (2010). Self-emergence of
lexicon consensus in a population of autonomous agents by means of evo-
lutionary strategies. In Proceedings of the 5th International Conference on
Hybrid Artificial Intelligence Systems - Volume Part II, pages 77–84.
[102] Maravall, D. and De Lope, J. (2011). Fusion of learning automata theory
and granular inference systems: ANLAGIS. Applications to Pattern
Recognition and Machine Learning, pages 1237–1242.
[103] Maravall, D., De Lope, J. and Domínguez, R. (2011). Coordination of
communication in robot teams by reinforcement learning. In Proceedings of the
4th International Conference on Interplay between Natural and Artificial
Computation - Volume Part I, pages 156–164.
[104] Maravall, D., De Lope, J. and Domínguez, R. (2012). Self-emergence of a
common lexicon by evolution in teams of autonomous agents.
Neurocomputing, 75(1):106–114.
[105] Marshall, J.A., Fung, T., Broucke, M.E., Deleuterio, G. and Francis, B.
(2006). Experiments in multirobot coordination. Robotics and Autonomous
Systems, 54(3):265–275. 17
[106] Mataric, M.J. (1993). Designing emergent behaviors: from local interactions
to collective intelligence. In From Animals to Animats: International Conference on
Simulation of Adaptive Behavior, volume 2, pages 432–441. 25
[107] Gerkey, B. and Mataric, M.J. (1995). Cooperative multi-robot
box-pushing. In IEEE International Conference on Robotics and Automation,
pages 3862–3868. xviii, 23
[108] Merkle, D. and Middendorf, M. (2004). Dynamic polyethism and competi-
tion for tasks in threshold reinforcement models of social insects. Adaptive
Behavior - Animals, Animats, Software Agents, Robots, Adaptive Systems,
12(3-4):251–262. 44
[109] Michael, N., Fink, J. and Kumar, V. (2011). Cooperative manipulation and
transportation with aerial robots. Autonomous Robots, 30(1):73–86. 23
[110] Mosteo, A.R., Montano, L. and Lagoudakis, M.G. (2008). Multi-robot rout-
ing under limited communication range. In IEEE International Conference
on Robotics and Automation, pages 1531–1536. 19
[111] Murphy, R.R. (2000). An introduction to AI robotics (intelligent robotics
and autonomous agents), The MIT Press.
[112] Nagatani, K., Okada, Y., Tokunaga, N., Yoshida, K., Kiribayashi, S., Ohno,
K., Takeuchi, E., Tadokoro, S., Akiyama, H., Noda, I., Yoshida, T. and Koy-
anagi, E. (2009). Multi-robot exploration for search and rescue missions: a
report of map building in RoboCupRescue 2009. In International Workshop
on Safety Security and Rescue Robotics, pages 1–6. 22
[113] Narendra, K. and Viswanathan, R. (1972). A two-level system of stochastic
automata for periodic random environments. IEEE Transactions on Sys-
tems, Man, and Cybernetics, pages 285–289. 52
[114] Narendra, K.S. and Thathachar, M.A.L. (1974). Learning automata: a
survey. IEEE Transactions on Systems, Man, and Cybernetics, pages 323–
334. 48, 49
[115] Narendra, K., Wright, E. and Mason, L. (1977). Applications of learning
automata to telephone traffic routing and control. IEEE Transactions on
Systems, Man, and Cybernetics, pages 785–792. 52
[116] Narendra, K.S. and Thathachar, M.A.L. (1989). Learning automata: an
introduction. Englewood Cliffs, NJ: Prentice-Hall, Inc. 48, 49
[117] Obaidat, M., Papadimitriou, G. and Pomportsis, A. (2002). Guest editorial
learning automata: theory, paradigms, and applications. IEEE Transac-
tions on Systems, Man, and Cybernetics, pages 706–709. 49
[118] Okamura, A.M., Mataric, M.J. and Christensen, H.I. (2010). Medical and
health-care robotics. IEEE Robotics and Automation Magazine, 17(3):26–
37. 22
[119] Oster, G. and Wilson, E. (1978). Caste and ecology in the social insects.
Monographs in Population Biology, Princeton University Press. 25, 54
[120] Parker, L.E. (1993). Designing control laws for cooperative agent teams.
In IEEE International Conference on Robotics and Automation, volume 3,
pages 582–587. 25
[121] Parker, L.E. (1998). ALLIANCE: An Architecture for Fault Tolerant
Multi-Robot Cooperation. IEEE Transactions on Robotics and Automa-
tion, 14(2):220–240. 17
[122] Parker, L.E. (2003). Current research in multi-robot systems. Journal of
Artificial Life And Robotics, 7(1-2):1–5. 12
[123] Parker, L.E. and Tang, F. (2006). Building Multirobot Coalitions Through
Automated Task Solution Synthesis. Proceedings of the IEEE, 94(7):1289–
1305.
[124] Parker, L.E. (2008). Multiple Mobile Robot Systems. In: Siciliano, B. and
Khatib, O. (eds.) Springer Handbook of Robotics. 12, 16, 17, 21
[125] Pfeifer, R., Lungarella, M. and Iida, F. (2007). Self-organization, embodi-
ment, and biologically inspired robotics. American Association for the Ad-
vancement of Science, volume 318, pages 1088–1093. 25
[126] Price, R. and Tino, P. (2004). Evaluation of adaptive nature inspired task
allocation against alternate decentralised multiagent strategies. PPSN VIII,
LNCS 3242, pages 982–990. 27
[127] Qu, Z., Wang, J. and Hull, R.A. (2008). Cooperative control of dynamical
systems with application to autonomous vehicles. In IEEE Transactions on
Automatic Control, volume 53, pages 894–911. 25
[128] Quinonez, Y., De Lope, J. and Maravall, D. (2009). Communication and
coordination of robots teams in dynamic environments. In Twelfth International
Conference on Computer Aided Systems Theory - EUROCAST 2009,
pages 150–151. 7, 20
[129] Quinonez, Y., De Lope, J. and Maravall, D. (2009). Cooperative and com-
petitive behaviors in a multi-robot system for surveillance tasks. In Com-
puter Aided Systems Theory - EUROCAST 2009, volume 5717, pages 437–
444. 8, 20
[130] Quinonez, Y., Baca, J., De Lope, J., Ferre, M. and Aracil, R. (2010). Self-
Alignment approach based on cooperative behaviors for the docking process
of modular mobile robots. In Electronics, Robotics and Automotive Mechan-
ics Conference (CERMA), pages 445–450. 7, 20
[131] Quinonez, Y., De Lope, J. and Maravall, D. (2011). Bio-inspired decentral-
ized self-coordination algorithms for multi-heterogeneous specialized tasks
distribution in multi-robot systems. In Proceedings of the 4th International
Conference on Interplay between Natural and Artificial Computation - Vol-
ume Part I, pages 30–39. 8, 78
[132] Quinonez, Y., De Lope, J. and Maravall, D. (2011). Stochastic learning
automata for self-coordination in heterogeneous multi-tasks selection in
multi-robot systems. In Advances in Artificial Intelligence, volume 7094,
pages 443–453. 8, 78
[133] Quinonez, Y., Maravall, D. and De Lope, J. (2012). Application of self-
organizing techniques for the distribution of heterogeneous multi-tasks in
multi-robot systems. In Electronics, Robotics and Automotive Mechanics
Conference (CERMA), pages 66–71. 7
[134] Reed, K.B., Majewicz, A., Kallem, V., Alterovitz, R., Goldberg, K., Cowan,
N.J. and Okamura, A.M. (2011). Robot-assisted needle steering. IEEE
Robotics and Automation Magazine, 18(4):35–46. 22
[135] Ren, W. (2010). Consensus tracking under directed interaction topologies:
algorithms and experiments. In IEEE Transactions on Control Systems
Technology, volume 18, pages 230–237. 25
[136] Robinson, G. (1992). Regulation of division of labor in insect societies.
Annual Review of Entomology, 37(1):637–665. 25, 54
[137] Sahin, H. and Guvenc, L. (2007). Household robotics: autonomous de-
vices for vacuuming and lawn mowing. IEEE Control Systems Magazine,
27(2):20–90. 21
[138] Santana, P., Barata, J., Cruz, H., Mestre, A., Lisboa, J. and Flores, L.
(2005). A multi-robot system for landmine detection. In IEEE Conference
on Emerging Technologies and Factory Automation, volume 1, pages 721–
728. 22
[139] Seeley, T., Camazine, S. and Sneyd, J. (1991). Collective decision-making in
honey bees: how colonies choose among nectar sources. Behavioral Ecology
and Sociobiology, pages 277–290. 45
[140] Shang, L. and Wang, X.F. (2004). Decentralized PI control for a congestion
game. In IEEE International Conference on Control, Automation, Robotics
and Vision, pages 316–319. 27, 43
[141] Sheng, W., Yang, Q., Ci, S. and Xi, N. (2004). Multi-robot area exploration
with limited-range communications. In IEEE/RSJ International Confer-
ence on Intelligent Robots and Systems, volume 3, pages 1414–1419. 24
[142] Sheng, W., Yang, Q., Tan, J. and Xi, N. (2006). Distributed multi-
robot coordination in area exploration. Robotics and Autonomous Systems,
54(12):945–955. 24
[143] Shiroma, P. and Campos, M. (2009). CoMutaR: A framework for multi-
robot coordination and task allocation. In IEEE/RSJ International Con-
ference on Intelligent Robots and Systems, pages 4817–4824. 29
[144] Simmons, R., Smith, T., Dias, M.B., Goldberg, D., Hershberger, D., Stentz,
A. and Zlot, R. (2002). A Layered architecture for coordination of mobile
robots. In Multi-Robot Systems: From Swarms to Intelligent Automata,
Proceedings from the 2002 NRL Workshop on Multi-Robot Systems, Kluwer
Academic Publishers. 18
[145] Stone, P. and Veloso, M. (2000). Multiagent Systems: A Survey from a
Machine Learning Perspective. Autonomous Robots, 8(3):345–383. 15
[146] Song, T., Yan, X., Liang, A., Chen, K. and Guan, H. (2009). A distributed
bidirectional auction algorithm for multirobot coordination. In IEEE In-
ternational Conference on Research Challenges in Computer Science, pages
145–148. 30
[147] Soorki, M.N., Talebi, H.A. and Nikravesh, S.K.Y. (2011). A robust dynamic
leader-follower formation control with active obstacle avoidance. In IEEE
International Conference on Systems, Man, and Cybernetics, pages 1932–
1937. 25
[148] Soorki, M.N., Talebi, H.A. and Nikravesh, S.K.Y. (2011). Robust leader-
following formation control of multiple mobile robots using Lyapunov re-
design. In 37th Annual Conference on IEEE Industrial Electronics Society,
pages 277–282. 25
[149] Spletzer, J., Das, A.K., Fierro, R., Taylor, C.J., Kumar, V. and Ostrowski,
J.P. (2001). Cooperative localization and control for multi-robot manipu-
lation. In IEEE/RSJ International Conference on Intelligent Robots and
Systems, volume 2, pages 631–636. 16
[150] Stutzle, T. and Hoos, H. (1997). MAX-MIN ant system and local search
for the travelling salesman problem. In IEEE International Conference on
Evolutionary Computation, pages 309–314. 55
[151] Tambe, M., Pynadath, D.V., Chauvat, N., Das, A. and Kaminka, G.A.
(2000). Adaptive agent integration architectures for heterogeneous team
members. In Proceedings of the International Conference on Multiagent Sys-
tems, pages 301–308. 19
[152] Tanner, H.G., Loizou, S.G. and Kyriakopoulos, K.J. (2002). Nonholonomic
navigation and control of cooperating mobile manipulators. In IEEE Trans-
actions on Robotics and Automation, volume 19, pages 53–64. 23
[153] Thathachar, M.A.L. (2002). Varieties of learning automata: an overview.
IEEE Transactions on Systems, Man, and Cybernetics, 32(6):711–722. 49
[154] Todt, E., Rausch, G. and Suarez, R. (2000). Analysis and classification of
multiple robot coordination methods. In IEEE International Conference on
Robotics and Automation, volume 4, pages 3158–3163. 15
[155] The player and stage project: http://playerstage.sourceforge.net.
[156] Theraulaz, G., Bonabeau, E. and Deneubourg, J.L. (1998). Response
threshold reinforcement and division of labour in insect societies. Proceed-
ings of the Royal Society B: Biological Sciences, 265:327–332. 44
[157] Unsal, C. (1997). Stochastic Learning Automata. Chapter 3 of the dissertation
"Intelligent navigation of autonomous vehicles in an automated highway
system: learning methods and interacting vehicles approach". 49
[158] Veloso, M.M. and Nardi, D. (2006). Special issue on multirobot systems.
Proceedings of the IEEE, 94(7):1253–1256. 22
[159] Volpe, R., Nesnas, I., Estlin, T., Mutz, D., Petras, R. and Das, H. (2001).
The CLARAty architecture for robotic autonomy. In IEEE Proceedings on
Aerospace Conference, volume 1, pages 121–132. 19
[160] Wang, Z., Nakano, E. and Takahashi, T. (2003). Solving function distribu-
tion and behavior design problem for cooperative object handling by multi-
ple mobile robots. IEEE Transactions on Systems, Man, and Cybernetics,
Part A, 33(5):537–549. xviii, 23
[161] Wei, L. and Yuren, Z. (2010). An effective hybrid ant colony algorithm for
solving the traveling salesman problem. In Proceedings of the International
Conference on Intelligent Computation Technology and Automation, volume
1, pages 497–500. 55
[162] Weihua, Z. and Go, T.H. (2010). Robust cooperative Leader-follower forma-
tion flight control. In 11th International Conference on Control Automation
Robotics and Vision, pages 275–280. 25
[163] Xiao-Lin, L., Jing-Ping, J. and Kui, X. (2004). Towards multirobot commu-
nication. In IEEE International Conference on Robotics and Biomimetics,
pages 307–312. 19
[164] Xiao, F., Wang, L., Chen, J. and Gao, Y. (2009). Finite-time formation
control for multi-agent systems. Automatica, 45(11):2605–2611. 25
[165] Yagmahan, B. and Yenisey, M.M. (2008). Ant colony optimization for
multi-objective flow shop scheduling problem. Computers and Industrial
Engineering, 54(3):411–420. 56
[166] Yamashita, A., Arai, T., Ota, J. and Asama, H. (2003). Motion planning of
multiple mobile robots for cooperative manipulation and transportation. In
IEEE Transactions on Robotics and Automation, volume 19, pages 223–237.
xviii, 23
[167] Yang, Y., Zhou, C. and Tian, Y. (2009). Swarm robots task allocation
based on response threshold model. In IEEE International Conference on
Autonomous Robots and Agents, pages 171–176. 28
[168] Yerpes, A., Baca, J., Escalera, J.A., Ferre, M. and Aracil, R. (2008).
Modular robot based on 3 rotational DoF modules. In Proceedings of the
IEEE/RSJ International Conference on Intelligent Robots and Systems,
pages 889–894. 20
[169] Yuta, S. and Premvuti, S. (1992). Coordinating autonomous and centralized
decision making to achieve cooperative behaviors between multiple mobile
robots. In Proceedings of the 1992 IEEE/RSJ International Conference on
Intelligent Robots and Systems, volume 3, pages 1566–1574. 15
[170] Zhang, W. and Hu, J. (2008). Optimal multi-agent coordination under tree
formation constraints. In IEEE Transactions on Automatic Control, volume
53, pages 692–705. 17
[171] Zhu, A. and Yang, S.X. (2006). A SOM-based multi-agent architecture for
multirobot systems. Int. J. Robot. Autom., volume 21, pages 91–99. 19
[172] Zlot, R., Stentz, A., Dias, B. and Thayer, S. (2002). Multi-robot explo-
ration controlled by a market economy. In IEEE International Conference
on Robotics and Automation, volume 3, pages 3016–3023. 24
[173] Zlot, R. and Stentz, A. (2006). Market-based multirobot coordination for
complex tasks. The International Journal of Robotics Research, 25(1):73–
101. 34
[174] Zlot, R. and Stentz, A. (2006). Market-based multirobot coordination using
task abstraction. In Field and Service Robotics, volume 24, pages 167–177.
34