

IEEE TRANSACTIONS ON ROBOTICS AND AUTOMATION, VOL. 14, NO. 2, APRIL 1998: 220-240

ALLIANCE: An Architecture for Fault Tolerant Multi-Robot Cooperation

Lynne E. Parker
Center for Engineering Systems Advanced Research

Oak Ridge National Laboratory
P. O. Box 2008, Mailstop 6355, Oak Ridge, TN 37831-6355

email: [email protected], phone: (423) 241-4959, fax: (423) 574-0405
URL: http://avalon.epm.ornl.gov/~parkerle/

Abstract: ALLIANCE is a software architecture that facilitates the fault tolerant cooperative control of teams of heterogeneous mobile robots performing missions composed of loosely coupled subtasks that may have ordering dependencies. ALLIANCE allows teams of robots, each of which possesses a variety of high-level functions that it can perform during a mission, to individually select appropriate actions throughout the mission based on the requirements of the mission, the activities of other robots, the current environmental conditions, and the robot's own internal states. ALLIANCE is a fully distributed, behavior-based architecture that incorporates the use of mathematically-modeled motivations (such as impatience and acquiescence) within each robot to achieve adaptive action selection. Since cooperative robotic teams usually work in dynamic and unpredictable environments, this software architecture allows the robot team members to respond robustly, reliably, flexibly, and coherently to unexpected environmental changes and modifications in the robot team that may occur due to mechanical failure, the learning of new skills, or the addition or removal of robots from the team by human intervention. The feasibility of this architecture is demonstrated in an implementation on a team of mobile robots performing a laboratory version of hazardous waste cleanup.

Keywords: Multi-Robot Cooperation, ALLIANCE, Fault Tolerance, Control Architecture

I. Introduction

A. Motivation

A key driving force in the development of mobile robotic systems is their potential for reducing the need for human presence in dangerous applications. Applications such as the cleanup of toxic waste, nuclear power plant decommissioning, planetary exploration, fire fighting, search and rescue missions, security, surveillance, and reconnaissance tasks have elements of danger in which human casualties are possible, or even likely. In all of these applications, it is desirable to reduce the risk to humans through the use of autonomous robot technology. Other applications, such as manufacturing or industrial and/or household maintenance, are of a highly repetitive nature, creating tasks that humans find monotonous or fatiguing. In these cases, the quality of the solution may be increased by employing autonomous agents.

One approach to creating an autonomous robotic solution to a given application is to try to build a single robot to address that application. This robot would be designed to have all of the capabilities necessary to complete the mission on its own, perhaps with some guidance from a human controller. For smaller-scale applications, the single robot approach is often feasible. However, many of the human solutions to these real world applications of interest employ multiple humans supporting and complementing each other. Rather than having one human perform the task alone, a team of workers is formed that has a variety of specialized skills. These workers are available to help each other, and to provide individualized expertise when needed in the application. Each human team member is typically assigned a role (or roles) to fulfill during the mission that is based upon that human's skills and experience. The humans will also share some common capabilities that allow them to perform some tasks interchangeably, depending upon the workload of the individual during the mission. Unexpected events may occur during the mission that require a dynamic reallocation of tasks by the humans to address the new circumstances. Many examples of human teams of this type can be found that are very successful and efficient in performing their missions. Real-world applications that are well-suited for team-based approaches include the U.S. Department of Energy applications of decontamination and decommissioning of legacy manufacturing facilities and hazardous waste cleanup; the U.S. Department of Defense applications of surveillance and reconnaissance and remote warfare; the NASA applications of space exploration; and commercial and private applications of industrial and household maintenance, fire fighting, search and rescue, and security. Most of these applications are composed of tasks that are inherently distributed, either in space, time, or functionality, and thus require a distributed solution.

Since realistic human solutions to these types of applications require multiple humans to work together, it is feasible to examine the use of robot teams for automated solutions to these tasks. There are a number of potential advantages to using a distributed mobile robot system. It may be possible to reduce the total cost of the system by constructing multiple simpler robots rather than a monolithic single robot. The complexity of many environments or missions may actually require a mixture of robotic capabilities that is too extensive to design into a single robot. In the general case, a robot team consists of a variety of types of robots, each type of which specializes in (possibly overlapping) areas of capability. Utilizing a team of robots may make it possible to increase the robustness of the system by taking advantage of the parallelism and redundancy of multiple robots. Additionally, time constraints may require the use of multiple robots working simultaneously on different aspects of the mission in order to successfully accomplish the objective. Certainly, a distributed solution is the only viable approach for the application domains mentioned above that are inherently distributed in time, space, and/or functionality. Thus, in the most general case, the robot team would consist of a variety of heterogeneous types of robots working together to accomplish a mission that no individual robot could accomplish alone.

However, the use of multiple robots is not without its disadvantages. If not properly constructed, multiple-robot systems may actually increase the complexity of an automated solution rather than simplifying it. In multi-robot approaches, one has to deal with many challenging issues that do not arise in single-robot systems, such as achieving coherence, determining how to decompose and allocate the problem among a group of robots, determining how to enable the robots to interact, and so forth. In fact, the difficulties in designing a cooperative team are significant. In [5], Bond and Gasser describe the basic problems the field of Distributed Artificial Intelligence must address; those aspects directly related to situated multi-robot systems include the following:

• How do we formulate, describe, decompose, and allocate problems among a group of intelligent agents?

• How do we enable agents to communicate and interact?

• How do we ensure that agents act coherently in their actions?

• How do we allow agents to recognize and reconcile conflicts?

These issues have been studied extensively, yet many open research questions remain. As described in the following section, while much research in recent years has addressed the issues of autonomous robots and multi-robot cooperation, current robotics technology is still far from achieving many of these real world applications. We believe that one reason for this technology gap is that previous work has not adequately addressed the issues of fault tolerance and adaptivity. Here, by fault tolerance, we mean the ability of the robot team to respond to individual robot failures or failures in communication that may occur at any time during the mission. The fault tolerant response of interest in this article is the dynamic re-selection (or re-allocation) of tasks by robot team members due to robot failures or a dynamically changing environment. We want the robot team as a whole to be able to complete its mission to the greatest extent possible in spite of any single-point failure. By adaptivity, we mean the ability of the robot team to change its behavior over time in response to a dynamic environment, changes in the team mission, or changes in the team capabilities or composition, to either improve performance or to prevent unnecessary degradation in performance.

The ALLIANCE architecture described in this article offers one solution to multi-robot cooperation that not only addresses the issues inherent in any multi-robot team, but also allows robotic teams to be fault tolerant, reliable, and adaptable. Requiring fault tolerance in a cooperative architecture emphasizes the need to build teams that minimize their vulnerability to individual robot outages (either full or partial outages). Reliability refers to the dependability of a system, and whether it functions properly and correctly each time it is utilized. This concept differs slightly from fault tolerance in that we want to be assured that a robot team correctly accomplishes its mission even when individual robot failures do not occur. One measure of the reliability of the architecture is its ability to guarantee that the mission will be solved, within certain operating constraints, when applied to any given cooperative robot team. Adaptivity in a cooperative team allows that team to be responsive to changes in individual robot skills and performance, to dynamic environmental changes, and to changes in the robot team composition as robots dynamically join or leave the cooperative team.

This article describes the control architecture, ALLIANCE, that we have developed, which facilitates fault tolerant, reliable, and adaptive cooperation among small- to medium-sized teams of heterogeneous mobile robots, performing (in dynamic environments) missions composed of independent tasks that can have ordering dependencies. We begin by describing the related work in this area, followed by a detailed discussion of the features of ALLIANCE. We then illustrate the viability of this architecture by describing the results of implementing ALLIANCE on a team of robots performing a laboratory version of hazardous waste cleanup, which requires the robots to find the initial locations of two spills, move the two spills to a goal destination, and periodically report the team's progress to a human monitoring the mission. In the appendix, we supply a proof of ALLIANCE's mission termination for a restricted set of cooperative robotic applications.

II. Related Work

The amount of research in the field of cooperative mobile robotics has grown substantially in recent years. This work can be broadly categorized into two groups: swarm-type cooperation and "intentional" cooperation. (This paper addresses the second area of cooperation.) The swarm-type approach to multi-robot cooperation deals with large numbers of homogeneous robots. This approach is useful for non-time-critical applications involving numerous repetitions of the same activity over a relatively large area, such as cleaning a parking lot or collecting rock samples on Mars. The approach to cooperative control taken in these systems is derived from the fields of neurobiology, ethology, psychophysics, and sociology, and is typically characterized by teams of large numbers of homogeneous robots, each of which has fairly limited capabilities on its own. However, when many such simple robots are brought together, globally interesting behavior can emerge as a result of the local interactions of the robots. Such approaches usually rely on mathematical convergence results (such as the random walk theorem [12]) that indicate the desired outcome over a sufficiently long period of time. A key research issue in this scenario is determining the proper design of the local control laws that will allow the collection of robots to solve a given problem.

A number of researchers have studied the issues of swarm robotics. Deneubourg et al. [16] describe simulation results of a distributed sorting algorithm. Theraulaz et al. [36] extract cooperative control strategies, such as foraging, from a study of Polistes wasp colonies. Steels [34] presents simulation studies of the use of several dynamical systems to achieve emergent functionality as applied to the problem of collecting rock samples on a distant planet. Drogoul and Ferber [17] describe simulation studies of foraging and chain-making robots. In [25], Mataric describes the results of implementing group behaviors such as dispersion, aggregation, and flocking on a group of physical robots. Beni and Wang [4] describe methods of generating arbitrary patterns in cyclic cellular robotics. Kube and Zhang [22] present the results of implementing an emergent control strategy on a group of five physical robots performing the task of locating and pushing a brightly lit box. Stilwell and Bay [35] present a method for controlling a swarm of robots using local force sensors to solve the problem of the collective transport of a palletized load. Arkin et al. [1] present research concerned with sensing, communication, and social organization for tasks such as foraging. The CEBOT work, described in [19] and many related papers, has many goals similar to other swarm-type multi-robotic systems; however, the CEBOT robots can be one of a number of robot classes, rather than purely homogeneous.

The primary difference between these approaches and the problem addressed in this article is that the above approaches are designed strictly for homogeneous robot teams, in which each robot has the same capabilities and control algorithm. Additionally, issues of efficiency are largely ignored. However, in heterogeneous robot teams such as those addressed in this article, not all tasks can be performed by all team members, and even if more than one robot can perform a given task, they may perform that task quite differently. Thus the proper mapping of subtasks to robots is dependent upon the capabilities and performance of each robot team member. This additional constraint brings many complications to a workable architecture for robot cooperation, and must be addressed explicitly to achieve the desirable level of cooperation.

The second primary area of research in cooperative control deals with achieving "intentional" cooperation among a limited number of typically heterogeneous robots performing several distinct tasks. In this type of cooperative system, the robots often have to deal with some sort of efficiency constraint that requires a more directed type of cooperation than is found in the swarm approach described above. Furthermore, this second type of mobile robotic mission usually requires that several distinct tasks be performed. These missions thus usually require a smaller number of possibly heterogeneous mobile robots involved in more purposeful cooperation. Although individual robots in this approach are typically able to perform some useful task on their own, groups of such robots are often able to accomplish missions that no individual robot can accomplish on its own. Key issues in these systems include robustly determining which robot should perform which task so as to maximize the efficiency of the team, and ensuring the proper coordination among team members to allow them to successfully complete their mission.

Two bodies of previous research are particularly applicable to this second type of cooperation. First, several researchers have directly addressed this cooperative robot problem by developing control algorithms and implementing them either on physical robots or on simulations of physical robots that make reasonable assumptions about robot capabilities. Examples of this work include Noreils [26], who describes a sense-model-plan-act control architecture which includes three layers of control: the planner level, which manages coordinated protocols, decomposes tasks into smaller subunits, and assigns the subtasks to a network of robots; the control level, which organizes and executes a robot's tasks; and the functional level, which provides controlled reactivity. He reports on the implementation of this architecture on two physical mobile robots performing convoying and box pushing. In both of these examples, one of the robots acts as a leader, and the other acts as a follower.

Caloud et al. [9] describe another sense-model-plan-act architecture which includes a task planner, a task allocator, a motion planner, and an execution monitor. Each robot obtains goals to achieve either based on its own current situation, or via a request by another team member. They use Petri Nets for interpretation of the plan decomposition and execution monitoring. In this paper they report on plans to implement their architecture on three physical robots.

In [2] and elsewhere, Asama et al. describe their decentralized robot system called ACTRESS, addressing the issues of communication, task assignment, and path planning among heterogeneous robotic agents. Their approach revolves primarily around a negotiation framework which allows robots to recruit help when needed. They have demonstrated their architecture on mobile robots performing a box pushing task.

Wang [37] addresses an issue similar to that addressed in this article, namely, dynamic, distributed task allocation when more than one robot can perform a given task. He proposes the use of several distributed mutual exclusion algorithms that use a "sign-board" for inter-robot communication. These algorithms are used to solve problems including distributed leader finding, the N-way intersection problem, and robot ordering. However, this earlier paper does not address issues of dynamic reallocation due to robot failure, or efficiency issues due to robot heterogeneity.

Cohen et al. [13] propose a hierarchical subdivision of authority to address the problem of cooperative fire fighting. They describe their Phoenix system, which includes a generic simulation environment and a real-time, adaptive planner. The main controller in this architecture is called the Fireboss, which maintains a global view of the environment, forms global plans, and sends instructions to agents to activate their own local planning.

A huge amount of research in optimal task allocation and scheduling has been accomplished previously (e.g., [14]). However, these approaches alone are not directly applicable to multi-robot missions, since they do not address multi-robot performance in a dynamic world involving robot heterogeneity, sensing uncertainty, and the nondeterminism of robot actions.

The second, significantly larger, body of research related to intentional cooperation comes from the Distributed Artificial Intelligence (DAI) community, which has produced a great deal of work addressing "intentional" cooperation among generic agents. However, these agents are typically software systems running as interacting processes to solve a common problem rather than embodied, sensor-based robots. In most of this work, the issue of task allocation has been the driving influence that dictates the design of the architecture for cooperation. Typically, the DAI approaches use a distributed, negotiation-based mechanism to determine the allocation of tasks to agents. See [5] for many of the seminal papers in this field.

Our approach is distinct from these earlier DAI approaches in that we embed task-oriented missions in behavior-based systems and directly address solutions to fault tolerant, adaptive control. Although the need for fault tolerance is noted in the earlier architectures, they typically either make no serious effort at achieving fault tolerant, adaptive control, or they assume the presence of unrealistic "black boxes" that continually monitor the environment and provide recovery strategies (usually involving unspecified replanning mechanisms) for handling various types of unexpected events. Thus, in actuality, if one or more of the robots or the communication system fails under these approaches, the entire team is subject to catastrophic failure. Experience with physical mobile robots has shown, however, that robot failure is very common, not only due to the complexity of the robots themselves, but also due to the complexity of the environment in which these robots must be able to operate. Thus, control architectures must explicitly address the dynamic nature of the cooperative team and its environment to be truly useful in real-world applications. Indeed, the approach to cooperative control developed in this article has been designed specifically with the view toward achieving fault tolerance and adaptivity.

Additionally, the earlier approaches break the problem into a traditional AI sense-model-plan-act decomposition rather than the functional decomposition used in behavior-based approaches. The traditional approach has likely been favored because it presents a clean subdivision between the job planning, task decomposition, and task allocation portions of the mission to be accomplished, a segmentation that may, at first, appear to simplify the cooperative team design. However, the problems with applying these traditional approaches to physical robot teams are the same problems that currently plague these approaches when they are applied to individual situated robots. As argued by Brooks in [8] and elsewhere, approaches using a sense-model-plan-act framework have been unable to deliver real-time performance in a dynamic world because of their failure to adequately address the situatedness and embodiment of physical robots. Thus, a behavior-based approach to cooperation was utilized in ALLIANCE to increase the robustness and adaptivity of the cooperative team.

Refer to [11] for a detailed review of much of the existing work in cooperative robotics.

III. ALLIANCE

A. Assumptions

In the design of any control scheme, it is important to make explicit the assumptions underlying the approach. Thus, before describing the ALLIANCE architecture in detail, we first discuss the assumptions that were made in the design of this architecture. Note that these assumptions are made within the context (described earlier) of solving the problem of multi-robot cooperation for small- to medium-sized teams of heterogeneous robots performing missions composed of independent subtasks that may have ordering dependencies.

Our assumptions are as follows:

1. The robots on the team can detect the effect of their own actions, with some probability greater than 0.

2. Robot ri can detect the actions of other team members for which ri has redundant capabilities, with some probability greater than 0; these actions may be detected through any available means, including explicit broadcast communication.

3. Robots on the team do not lie and are not intentionally adversarial.

4. The communications medium is not guaranteed to be available.

5. The robots do not possess perfect sensors and effectors.

6. Any of the robot subsystems can fail, with some probability greater than 0.

7. If a robot fails, it cannot necessarily communicate its failure to its teammates.

8. A centralized store of complete world knowledge is not available.

We make the first assumption, that a robot can detect the effect of its own actions, to ensure that robots have some measure of feedback control and do not perform their actions purely with open-loop control. However, we do not require that robots be able to measure their own effectiveness with certainty, because we realize this rarely happens on real robots.

The second assumption deals with the problem of action recognition, the ability of a robot to observe and interpret the behavior of another robot. Previous research in cooperative robotics has investigated several possible ways of providing action recognition to robot teams, from implicit cooperation through sensory feedback to explicit cooperation using the exchange of communicated messages.


For example, Huber and Durfee [20] have developed a multiple resolution, hierarchical plan recognition system to coordinate the motion of two interacting mobile robots based upon belief networks. Other researchers have studied the effect of communication in providing action knowledge. For example, MacLennan [24] investigates the evolution of communication in simulated worlds and concludes that the communication of local robot information can result in significant performance improvements; Werner and Dyer [38] examine the evolution of communication which facilitates the breeding and propagation of artificial creatures; and Balch and Arkin [3] examine the importance of communication in robotic societies performing forage, consume, and graze tasks, finding that communication can significantly improve performance for tasks involving little implicit communication through the world, and that communication of current robot state is almost as effective as communication of robot goals.

In the current article, we do not require that a robot be able to determine a teammate's actions through passive observation, which can be quite difficult to achieve. Instead, we enable robots to learn of the actions of their teammates through an explicit communication mechanism, whereby robots broadcast information on their current activities to the rest of the team.

Third, we assume that the robots are built to work on a team, and are neither in direct competition with each other, nor attempting to subvert the actions of their teammates. Although conflicts may arise at a low level due to, for example, a shared workspace, we assume that at a high level the robots share compatible goals. (Note, however, that some multi-robot team applications may require the ability to deal with adversarial teams, such as military applications or team competitions (e.g., robot soccer). This is not covered in this article.)

We further assume that any subsystem of the team, such as communications, sensors, effectors, or individual robots, is subject to failure, thus leading to assumptions four through seven.

Finally, we assume that robots do not have access to some centralized store of world knowledge, and that no centralized agent is available that can monitor the state of the entire robot environment and make controlling decisions based upon this information.

B. Overview of ALLIANCE

As already discussed, a major design goal in the development of ALLIANCE is to address the real-world issues of fault tolerance and adaptivity when using teams of fallible robots with noisy sensors and effectors. Our aim is to create robot teams that are able to cope with failures and uncertainty in action selection and action execution, and with changes in a dynamic environment. Because of these design goals, we developed ALLIANCE to be a fully distributed, behavior-based software architecture which gives all robots the capability to determine their own actions based upon their current situation. No centralized control is utilized, so that we can investigate the power of a fully distributed robotic system to accomplish group goals. The purpose of this approach is to maintain a purely distributed cooperative control scheme which affords an increased degree of robustness; since individual agents are always fully autonomous, they have the ability to perform useful actions even amidst the failure of other robots.

ALLIANCE defines a mechanism that allows teams of robots, each of which possesses a variety of high-level functions that it can perform during a mission, to individually select appropriate actions throughout the mission based on the requirements of the mission, the activities of other robots, the current environmental conditions, and the robot's own internal states. This mechanism is based upon the use of mathematically-modeled motivations within each robot, such as impatience and acquiescence, to achieve adaptive action selection. Under the behavior-based framework, the task-achieving behaviors of each robot receive sensory input and control some aspect of the actuator output. Lower-level behaviors, or competences, correspond to primitive survival behaviors such as obstacle avoidance, while the higher-level behaviors correspond to higher goals such as map building and exploring. The output of the lower-level behaviors can be suppressed or inhibited by the upper layers when necessary. Within each layer of competence may be a number of simple modules interacting via inhibition and suppression to produce the desired behavior. This approach has been used successfully in a number of robotic applications, several of which are described in [7].

Extensions to this approach are necessary, however, when a robot must select among a number of competing actions, that is, actions which cannot be pursued in parallel. Unlike typical behavior-based approaches, ALLIANCE delineates several behavior sets that are either active as a group or are hibernating. Figure 1 shows the general architecture of ALLIANCE and illustrates three such behavior sets. Each behavior set of a robot corresponds to those levels of competence required to perform some high-level task-achieving function. Because of the alternative goals that may be pursued by the robots, the robots must have some means of selecting the appropriate behavior set to activate. This action selection is controlled through the use of motivational behaviors, each of which controls the activation of one behavior set. Due to conflicting goals, only one behavior set is active at any point in time (implemented via cross-inhibition of behavior sets). However, other lower-level competences such as collision avoidance may be continually active regardless of the high-level goal the robot is currently pursuing. Examples of this type of continually active competence are shown generically in figure 1 as layer 0, layer 1, and layer 2.
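The mutual exclusion of behavior sets described above can be sketched in code. The following is an illustrative sketch only, not the paper's implementation; the class, the threshold value, and the behavior-set names are invented for this example, and the activation levels are set by hand rather than computed by motivational behaviors:

```python
# Illustrative sketch (not the paper's implementation) of cross-inhibition
# among behavior sets: at most one behavior set's output is allowed to
# reach the actuators at any point in time.

THRESHOLD = 1.0  # activation threshold (illustrative value)

class BehaviorSet:
    def __init__(self, name):
        self.name = name
        self.motivation = 0.0  # activation level, a non-negative number

def select_active_set(behavior_sets):
    """Return the single behavior set allowed to control the actuators.

    Cross-inhibition: the first set whose motivation exceeds the
    threshold wins, and every other set is suppressed (motivation
    reset), so only one high-level task is pursued at a time.
    """
    for bs in behavior_sets:
        if bs.motivation >= THRESHOLD:
            for other in behavior_sets:
                if other is not bs:
                    other.motivation = 0.0  # inhibited by the active set
            return bs
    return None  # no set active; only low-level competences run

sets = [BehaviorSet("find-spill"), BehaviorSet("move-spill"),
        BehaviorSet("report")]
sets[1].motivation = 1.2  # set by hand for the example
active = select_active_set(sets)  # "move-spill" wins; the others are reset
```

In the full architecture the winning set is determined by the motivational behaviors' activation levels, not by list order; this sketch only illustrates the gating and mutual exclusion, while low-level layers such as collision avoidance would keep running regardless of which set wins.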

C. Motivational Behaviors

In ALLIANCE, the ability of robots to respond to unexpected events, robot failures, and so forth is provided through the use of motivations. These motivations are designed to allow robot team members to perform tasks only as long as they demonstrate their ability to have the desired effect on the world. This differs from the commonly


[Figure 1: block diagram of the ALLIANCE architecture, showing Behavior Sets 0, 1, and 2, each gated by its own Motivational Behavior and linked by cross-inhibition; always-active Layers 0, 1, and 2; and connections to the sensors, actuators, and inter-robot communication.]

Fig. 1. The ALLIANCE architecture, implemented on each robotin the cooperative team, delineates several behavior sets, eachof which correspond to some high-level task-achieving function.The primary mechanism enabling a robot to select a high-levelfunction to activate is the motivational behavior. The symbolsthat connect the output of each motivational behavior with theoutput of its corresponding behavior set (vertical lines with shorthorizontal bars) indicate that a motivational behavior either al-lows all or none of the outputs of its behavior set to pass throughto the robot's actuators. The non-bold, single-bordered rectan-gles correspond to individual layers of competence that are alwaysactive.

used technique for task allocation that begins with break-ing down the mission (or part of the mission) into subtasks,and then computing the \optimal" robot-to-task mappingbased upon agent skill levels, with little recourse for robotfailures after the allocation has occurred.

The motivations are utilized in each motivational behavior, which is the primary mechanism for achieving adaptive action selection in this architecture. At all times during the mission, each motivational behavior receives input from a number of sources, including sensory feedback, inter-robot communication, inhibitory feedback from other active behaviors, and internal motivations. This input is combined (in a manner described below) to generate the output of a motivational behavior at any point in time. This output defines the activation level of its corresponding behavior set, represented as a non-negative number. When this activation level exceeds a given threshold, the corresponding behavior set becomes active.

Two types of internal motivations are modeled in ALLIANCE: robot impatience and robot acquiescence. The impatience motivation enables a robot to handle situations when other robots (outside itself) fail in performing a given task. The acquiescence motivation enables a robot to handle situations in which it, itself, fails to properly perform its task. Intuitively, a motivational behavior works as follows. A robot's motivation to activate any given behavior set is initialized to 0. Then over time, that robot's motivation to perform a given behavior set increases at a fast rate of impatience (defined explicitly below) as long as the task corresponding to that behavior set is not being accomplished by any robot team member. The robot, however, should also be responsive to the actions of its teammates, adapting its task selection to the activities of other robot team members. Thus, if the ith robot ri is aware that the kth robot rk is working on a particular task, ri should be satisfied for some period of time that that task is going to be accomplished even without its own participation in the task, and thus go on to some other applicable action. Robot ri's motivation to activate its corresponding behavior set continues to increase, but at a slower rate of impatience. This characteristic prevents robots from replicating each other's actions and thus wasting needless energy.

As a simple example, consider a team of two robots unloading boxes from a truck and placing them on one of two conveyor belts, depending upon the labeling on the box. Both of the robots, call them A and B, have the ability to unload boxes from the truck to a temporary storage location, and the ability to move them from the temporary storage location to the appropriate conveyor belt. (We assume that, due to the way the loading dock is designed, the robots cannot move boxes immediately from the truck to the conveyor belt.) At the beginning of the mission, say robot A elects to unload the boxes from the truck. Robot B is then satisfied that the boxes will be unloaded, and proceeds to move the boxes from the temporary location to the conveyor belt when applicable. As the mission progresses, however, let us assume that robot A's box-detection sensor (e.g. a camera) becomes dirty and prevents A from locating boxes. Since no more boxes are arriving at the temporary location, robot B becomes more and more impatient to take over the task of unloading boxes. After a period of time, B takes over the task of unloading boxes, even though robot A is still attempting to accomplish that task, unaware that its sensor is returning faulty readings.

Before continuing with this example, we note here that detecting and interpreting the actions of other robots is not a trivial problem, and often requires perceptual abilities that are not yet possible with current sensing technology. As it stands today, the sensory capabilities of even the lower animals far exceed present robotic capabilities. Thus, to enhance the robots' perceptual abilities, ALLIANCE utilizes a simple form of broadcast communication to allow robots to inform other team members of their current activities, rather than relying totally on sensing through the world. Thus, at some pre-specified rate, each robot ri broadcasts a statement of its current action, which other robots may listen to or ignore as they wish. No two-way conversations are employed in this architecture. Each robot is designed to be somewhat impatient in the effect of communicated messages, however, in that a robot ri is only willing for a certain period of time to allow the communicated messages of another robot to affect its own motivation to activate a given behavior set. Continued sensory feedback indicating that a task is not getting accomplished overrides the statements of another robot that it is performing that task. This characteristic allows robots to adapt to failures of other robots, causing them to ignore the activities of a robot that is not successfully completing its task.


A complementary characteristic in these robots is that of acquiescence. Just as the impatience characteristic reflects the fact that other robots may fail, the acquiescence characteristic indicates the recognition that a robot itself may fail. This feature operates as follows. As a robot ri performs a task, its willingness to give up that task increases over time as long as the sensory feedback indicates the task is not being accomplished. As soon as some other robot rk indicates it has begun that same task and ri feels it (i.e., ri) has attempted the task for an adequate period of time, the unsuccessful robot ri gives up its task in an attempt to find an action at which it is more productive. However, even if another robot rk has not taken over the task, robot ri may give up its task anyway if the task is not completed in an acceptable period of time. This allows ri the possibility of working on another task that may prove to be more productive rather than becoming stuck performing the unproductive task forever. With this acquiescence characteristic, therefore, a robot is able to adapt its actions to its own failures. Continuing our earlier simple example, this characteristic of acquiescence is demonstrated in robot A, which gives up trying to unload the truck when robot B takes over that task. Depending upon its self-monitoring capabilities, robot A will then either attempt to perform the second task of moving boxes to the conveyor belt (probably unsuccessfully) or give up altogether. Robot B would then be motivated to move the boxes from the storage site to the conveyor belt after it had completed the unloading task. In ALLIANCE, the mission could also be designed in such a way that robot B interleaves a period of unloading the truck with a period of moving boxes to the conveyor belt.

The design of the motivational behaviors in ALLIANCE also allows the robots to adapt to unexpected environmental changes which alter the sensory feedback. The need for additional tasks can suddenly occur, requiring the robots to perform additional work (e.g. another truck of boxes arrives in our simple example), or existing environmental conditions can disappear and thus relieve the robots of certain tasks (e.g. the truck of boxes drives away). In either case, the motivations fluidly adapt to these situations, causing robots to respond appropriately to the current environmental circumstances.

We note that the parameters controlling motivational rates of robots under the ALLIANCE architecture can be adapted over time based upon learning. A learning mechanism we have developed allows the rates of impatience and acquiescence to be context and environment sensitive. An extension to ALLIANCE, called L-ALLIANCE (for Learning ALLIANCE), provides the mechanisms for accomplishing this adaptation. This parameter adaptation capability is described in section III-E. For more details, refer to [31].

C.1 Comparison to Negotiation

We now compare the ALLIANCE task allocation mechanism to a common approach used in distributed artificial intelligence (DAI) systems for multi-agent task allocation: a negotiation-based mechanism. In general, the negotiation schemes have no centralized agent that has full control over which tasks individual team members should perform. Instead, many agents know which subtasks are required for various portions of the mission to be performed, along with the skills required to achieve those subtasks. These agents then broadcast a request for bids to perform these subtasks, to which other agents may respond if they are available and want to perform these tasks. The broadcasting agent then selects an agent from those that respond and awards the task to the winning agent, who then commences to perform that task, recruiting yet other agents to help if required. One example of a popular negotiation protocol that has been used extensively is the contract-net protocol [15], [33]; other negotiation schemes are described in [18], [21], [32], [39].

However, although DAI work has demonstrated success with this approach in a number of domains (e.g. distributed vehicle monitoring [23] and distributed air traffic control [10]), the proposed solutions have not been adequately demonstrated in situated agent (i.e. robotic) teams, which have to live in, and react to, dynamic and uncertain environments amidst noisy sensors and effectors, frequent agent failures, and a limited bandwidth, noisy communication mechanism. The DAI approaches typically assume the presence of "black boxes" to provide high-level, nearly-perfect sensing and action capabilities. Moreover, these DAI approaches typically ignore or only give brief treatment to the issues of robot performance of those tasks after they have been allocated. Such approaches usually assume the robots will eventually accomplish the task they have been assigned, or that some external monitor will provide information to the robots on dynamic changes in the environment or robot performance. However, to realistically design a cooperative approach to robotics, we must include mechanisms within the software control of each robot that allow the team members to recover from dynamic changes in their environment, or failures in the robot team or communication mechanism.

Thus, we developed the ALLIANCE architecture to explicitly address the issue of fault tolerance amidst possible robot and communication failures. The ALLIANCE architecture provides each robot with sufficient autonomy to make decisions on its actions at all times during a mission, taking into consideration the actions of other agents and their effects upon the world. While some efficiency may be lost as a consequence of not negotiating the task subdivision in advance, robustness is gained if robot failures or other dynamic events occur at any time during the mission. We speculate that a hybrid combination of negotiation and the ALLIANCE motivational mechanisms would enable a system to experience the benefits of both approaches. This is a topic for future research.

D. Discussion of Formal Model of ALLIANCE

Let us now look in detail at how our philosophy regarding action selection is incorporated into the motivational behavior mechanism.

First, let us formally define our problem as follows. Let the set R = {r1, r2, ..., rn} represent the set of n heterogeneous robots composing the cooperative team, and the set T = {task1, task2, ..., taskm} represent m independent subtasks which compose the mission. We use the term high-level task-achieving function to correspond intuitively to the functions possessed by individual robots that allow the robots to achieve tasks required in the mission. These functions map very closely to the upper layers of the subsumption-based control architecture [6]. In the ALLIANCE architecture, each behavior set supplies its robot with a high-level task-achieving function. Thus, in the ALLIANCE architecture, the terms high-level task-achieving function and behavior set are synonymous. We refer to the high-level task-achieving functions, or behavior sets, possessed by robot ri in ALLIANCE as the set Ai = {ai1, ai2, ...}. Since different robots may have different ways of performing the same task, we need a way of referring to the task a robot is working on when it activates a behavior set. Thus, we define the set of n functions {h1(a1k), h2(a2k), ..., hn(ank)}, where hi(aik) returns the task in T that robot ri is working on when it activates behavior set aik.

In the discussion below, we first discuss the threshold of activation of a behavior set, and then describe the five primary inputs to the motivational behavior. We conclude this section by showing how these inputs are combined to determine the current level of motivation of a given behavior set in a given robot. Throughout this section, a number of parameters are defined that regulate the motivational levels. A discussion on how the parameter values are obtained and adapted over time is provided in section III-E.

D.1 Threshold of activation

The threshold of activation of a behavior set is given by one parameter, θ. This parameter determines the level of motivation beyond which a given behavior set will become active. Although different thresholds of activation could be used for different behavior sets and for different robots, in ALLIANCE one threshold is sufficient since the rates of impatience and acquiescence can vary across behavior sets and across robots.

D.2 Sensory feedback

The sensory feedback provides the motivational behavior with the information necessary to determine whether its corresponding behavior set needs to be activated at a given point during the current mission. This feedback is assumed to be noisy, and can originate from either physical robot sensors or virtual robot sensors. We define a simple function to capture the notion of sensory feedback in the motivational level computation as follows:

sensory_feedback_ij(t) =
    1   if the sensory feedback in robot ri at time t indicates that behavior set aij is applicable
    0   otherwise

D.3 Inter-robot communication

Two parameters are utilized in ALLIANCE to control the broadcast communication among robots: ρ_i and τ_i. The first parameter, ρ_i, gives the rate at which robot ri broadcasts its current activity. The second parameter, τ_i, provides an additional level of fault tolerance by giving the period of time robot ri allows to pass without receiving a communication message from a specific teammate before deciding that that teammate has ceased to function. While monitoring the communication messages, each motivational behavior of robot ri must also note when a team member is pursuing task hi(aij). To refer to this type of monitoring in the formal model, the function comm_received is defined as follows:

comm_received(i, k, j, t1, t2) =
    1   if robot ri has received a message from robot rk concerning task hi(aij) in the time span (t1, t2), where t1 < t2
    0   otherwise
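The message bookkeeping behind comm_received can be illustrated concretely. The following is a minimal sketch, not part of ALLIANCE itself; the class and method names are our own, and the half-open interval convention is an implementation choice:

```python
class MessageLog:
    """Records broadcast messages heard from teammates.

    Each entry maps (sender_id, task) -> list of receipt times, so that
    comm_received(i, k, j, t1, t2) reduces to a range query over the log.
    """

    def __init__(self):
        self._log = {}  # (sender_id, task) -> [timestamps]

    def record(self, sender_id, task, timestamp):
        self._log.setdefault((sender_id, task), []).append(timestamp)

    def comm_received(self, sender_id, task, t1, t2):
        """Return 1 if a message from sender_id about task arrived in (t1, t2]."""
        times = self._log.get((sender_id, task), [])
        return 1 if any(t1 < t <= t2 for t in times) else 0

# Usage: robot rk (id 2) broadcasts that it is working on task "unload".
log = MessageLog()
log.record(sender_id=2, task="unload", timestamp=5.0)
log.comm_received(2, "unload", 0.0, 10.0)   # -> 1
log.comm_received(2, "unload", 6.0, 10.0)   # -> 0
```

In a running system the same log would serve both the τ_i timeout check (has this teammate been heard from recently at all?) and the per-task query used by the impatience and acquiescence functions.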

D.4 Suppression from active behavior sets

When a motivational behavior activates its behavior set, it simultaneously begins inhibiting other motivational behaviors within the same robot from activating their respective behavior sets. At this point, a robot has effectively "selected an action". In the formal model, this activity suppression is modeled by the following simple function:

activity_suppression_ij(t) =
    0   if another behavior set aik, k ≠ j, is active on robot ri at time t
    1   otherwise

This function says that behavior set aij is being suppressed at time t on robot ri if some other behavior set aik is currently active on robot ri at time t.

D.5 Robot impatience

Three parameters are used to implement the robot impatience feature of ALLIANCE: φ_ij(k, t), δ_slow_ij(k, t), and δ_fast_ij(t). The first parameter, φ_ij(k, t), gives the time during which robot ri is willing to allow robot rk's communication message to affect the motivation of behavior set aij. Note that robot ri is allowed to have different φ parameters for each robot rk on its team, and that these parameters can change during the mission, as indicated by the dependence on t. This allows ri to be influenced more by some robots than others, perhaps due to reliability differences across robots.

The next two parameters, δ_slow_ij(k, t) and δ_fast_ij(t), give the rates of impatience of robot ri concerning behavior set aij either while robot rk is performing the task corresponding to behavior set aij (i.e. hi(aij)) or in the absence of other robots performing the task hi(aij), respectively. We assume that the fast impatience parameter corresponds to a higher rate of impatience than the slow impatience parameter for a given behavior set in a given robot. The reasoning for this assumption should be clear: a robot ri should allow another robot rk the opportunity to accomplish its task before becoming impatient with rk; however, there is no reason for ri to remain idle if a task remains undone and no other robot is attempting that task.

The question that now arises is the following: what slow rate of impatience does a motivational behavior controlling behavior set aij use when more than one other robot is performing task hi(aij)? The method used in ALLIANCE is to increase the motivation at a rate that allows the slowest robot rk still under its allowable time φ_ij(k, t) to continue its attempt.

The specification of when the impatience rate for a behavior set aij should grow according to the slow impatience rate and when it should grow according to the fast impatience rate is given by the following function:

impatience_ij(t) =
    min_k(δ_slow_ij(k, t))   if (comm_received(i, k, j, t - τ_i, t) = 1) and
                                (comm_received(i, k, j, 0, t - φ_ij(k, t)) = 0)
    δ_fast_ij(t)             otherwise

Thus, the impatience rate will be the minimum slow rate, δ_slow_ij(k, t), if robot ri has received communication indicating that robot rk is performing the task hi(aij) in the last τ_i time units, but not for longer than φ_ij(k, t) time units. Otherwise, the impatience rate is set to δ_fast_ij(t).

The final detail to be addressed is to cause a robot's motivation to activate behavior set aij to go to 0 the first time it hears about another robot performing task hi(aij). This is accomplished through the following:

impatience_reset_ij(t) =
    0   if ∃k : ((comm_received(i, k, j, t - δt, t) = 1) and
                 (comm_received(i, k, j, 0, t - δt) = 0)),
        where δt = time since last communication check
    1   otherwise

This reset function causes the motivation to be reset to 0 if robot ri has just received its first message from robot rk indicating that rk is performing task hi(aij). This function allows the motivation to be reset no more than once for every robot team member that attempts task hi(aij). Allowing the motivation to be reset repeatedly by the same robot would allow a persistent, yet failing robot to jeopardize the completion of the mission.
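The two-rate impatience rule can be sketched as a small helper. This is an illustrative sketch only; the function name, the dictionary-based bookkeeping, and the boolean-flag inputs (which stand in for the comm_received queries) are our own:

```python
def impatience_rate(heard_recently, heard_too_long, delta_slow_k, delta_fast):
    """Select the impatience rate for one behavior set at the current time.

    heard_recently[k]  -- True if comm_received(i, k, j, t - tau_i, t) = 1
    heard_too_long[k]  -- True if comm_received(i, k, j, 0, t - phi_ij(k, t)) = 1
    delta_slow_k[k]    -- delta_slow_ij(k, t) for each teammate k
    delta_fast         -- delta_fast_ij(t)
    """
    # Teammates whose messages are still allowed to slow us down:
    candidates = [delta_slow_k[k] for k in delta_slow_k
                  if heard_recently.get(k) and not heard_too_long.get(k)]
    # Grow at the minimum slow rate while such a teammate exists (giving the
    # slowest still-allowed robot time to finish); otherwise grow fast.
    return min(candidates) if candidates else delta_fast

# Teammate 2 is actively reporting work on the task:
impatience_rate({2: True}, {2: False}, {2: 0.1}, 1.0)   # -> 0.1
# Nobody is reporting work, so impatience grows quickly:
impatience_rate({}, {}, {2: 0.1}, 1.0)                  # -> 1.0
```

Taking the minimum over the still-allowed slow rates matches the text above: the motivation grows no faster than the slowest teammate still within its allotted time φ_ij(k, t).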

D.6 Robot acquiescence

Two parameters are used to implement the robot acquiescence characteristic of ALLIANCE: ψ_ij(t) and λ_ij(t). The first parameter, ψ_ij(t), gives the time that robot ri wants to maintain behavior set aij activation before yielding to another robot. The second parameter, λ_ij(t), gives the time robot ri wants to maintain behavior set aij activation before giving up to possibly try another behavior set.

The following acquiescence function indicates when a robot has decided to acquiesce its task:

acquiescence_ij(t) =
    0   if ((behavior set aij of robot ri has been active for more than
             ψ_ij(t) time units at time t) and
            (∃x : comm_received(i, x, j, t - τ_i, t) = 1))
        or
        (behavior set aij of robot ri has been active for more than
         λ_ij(t) time units at time t)
    1   otherwise

This function says that a robot ri will not acquiesce behavior set aij until one of the following conditions is met:

- ri has worked on task hi(aij) for a length of time ψ_ij(t) and some other robot has taken over task hi(aij)

- ri has worked on task hi(aij) for a length of time λ_ij(t)
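The two conditions above can be sketched as a small gate function. This is a hedged sketch, not the paper's code; psi and lam stand for ψ_ij(t) and λ_ij(t), and other_robot_active stands in for the comm_received check:

```python
def acquiescence(active_time, psi, lam, other_robot_active):
    """Return 0 (acquiesce the task) or 1 (keep it).

    active_time        -- how long behavior set a_ij has been active
    psi                -- psi_ij(t): time before yielding to another robot
    lam                -- lambda_ij(t): time before giving up regardless
    other_robot_active -- True if a teammate reports working on this task
    """
    if active_time > psi and other_robot_active:
        return 0  # yield: a teammate has taken over and ri has had its chance
    if active_time > lam:
        return 0  # give up: the task is taking too long even with no takeover
    return 1

acquiescence(active_time=12.0, psi=10.0, lam=30.0, other_robot_active=True)   # -> 0
acquiescence(active_time=12.0, psi=10.0, lam=30.0, other_robot_active=False)  # -> 1
```

Note that λ is normally much larger than ψ, so a robot yields quickly to a willing teammate but only abandons an unclaimed task after a long solo attempt.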

D.7 Motivation calculation

All of the inputs described above are combined into the calculation of the levels of motivation as follows:

m_ij(0) = 0
m_ij(t) = [m_ij(t - 1) + impatience_ij(t)]
          × sensory_feedback_ij(t)
          × activity_suppression_ij(t)
          × impatience_reset_ij(t)
          × acquiescence_ij(t)                (1)

Initially, the motivation to perform behavior set aij in robot ri is set to 0. This motivation then increases at some positive rate impatience_ij(t) unless one of four situations occurs: (1) the sensory feedback indicates that the behavior set is no longer needed, (2) another behavior set in ri activates, (3) some other robot has just taken over task hi(aij) for the first time, or (4) the robot has decided to acquiesce the task. In any of these four situations, the motivation returns to 0. Otherwise, the motivation grows until it crosses the threshold θ, at which time the behavior set is activated and the robot can be said to have selected an action. Whenever some behavior set aij is active in robot ri, ri broadcasts its current activity to other robots at a rate of ρ_i.

A topic of future research is to investigate other possible combinations of the above input values to compute the motivational behavior. The current linear form was selected for its simplicity, and resulted in good performance. Additional studies are needed to determine the utility of any specific combination.
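One update step of equation (1) can be sketched as follows. This is an illustrative sketch only: the four gating functions are collapsed into boolean flags rather than the full formal definitions, and the function signature is our own:

```python
def update_motivation(m_prev, impatience, *, applicable, suppressed,
                      first_message_heard, acquiesce, theta):
    """One time step of equation (1) for behavior set a_ij on robot r_i.

    Returns (new_motivation, behavior_set_active).
    """
    m = m_prev + impatience
    # Any of the four gating conditions zeroes the motivation:
    if not applicable:        # sensory_feedback_ij(t) = 0
        m = 0.0
    if suppressed:            # another behavior set is active on this robot
        m = 0.0
    if first_message_heard:   # impatience_reset_ij(t) = 0
        m = 0.0
    if acquiesce:             # acquiescence_ij(t) = 0
        m = 0.0
    return m, m > theta

# Motivation accumulates while the task is needed and nobody else does it:
m, active = 0.0, False
for _ in range(5):
    m, active = update_motivation(m, 1.0, applicable=True, suppressed=False,
                                  first_message_heard=False, acquiesce=False,
                                  theta=3.0)
# m == 5.0 and active is True (the threshold was crossed)

# Hearing a teammate's first message about the task resets the motivation:
m, active = update_motivation(m, 1.0, applicable=True, suppressed=False,
                              first_message_heard=True, acquiesce=False,
                              theta=3.0)
# m == 0.0 and active is False
```

Because every gate multiplies the accumulated sum, a single zero from any source returns the motivation to 0, exactly as the four-situation discussion above describes.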

E. Parameter Settings

As with any parameter-based system, the parameter settings in ALLIANCE strongly influence the global performance of the system. A robot's selection of actions under ALLIANCE is dependent upon the parameter settings of the motivational behaviors. In addition, a number of efficiency issues are affected by the parameter settings, such as the amount of robot idle time, the time of response to the need for task reallocation, the selection between alternative methods of accomplishing a task, and so forth. In practice, finding the proper parameter settings in ALLIANCE has not proved to be difficult. This architecture has been implemented on a number of different robotic applications [31], one of which is reported later in this article, and parameter tuning did not prove to be a problem. However, ideally, the robot team members should be able to initialize and adapt these values with experience to find the proper parameter settings, rather than relying on human tuning. In the design of a parameter update strategy, we wanted to ensure that we did not sacrifice the desirable characteristics of fault tolerance and adaptivity that are present in ALLIANCE while enabling increases in robot team efficiency. We have thus developed an extension to ALLIANCE, called L-ALLIANCE, which provides mechanisms that allow the robots to dynamically update their parameter settings based upon knowledge learned from previous experiences. In this section, we outline the approach utilized in L-ALLIANCE for parameter tuning. Refer to [31] for a more complete discussion of L-ALLIANCE.

The general idea for dynamically updating the parameters is to have each robot "observe", evaluate, and catalogue the performance (measured in task execution time) of any robot team member whenever it performs a task of interest to that robot. The performance measurements are then used to update the ALLIANCE parameters in the manner specified below. These updates allow the robots to adapt their action selections over time in response to changes in the robot team composition, the robot team capabilities, or in the environment.

In this approach to parameter updates, we make two assumptions: (1) a robot's average performance in executing a specific task over a few recent trials is a reasonable indicator of that robot's expected performance in the future, and (2) if robot ri is monitoring environmental conditions C to assess the performance of another robot rk, and the conditions C change, then the changes are attributable to robot rk.

Figure 2 illustrates the L-ALLIANCE extension to ALLIANCE, which incorporates the use of performance monitors for each motivational behavior within each robot. Formally, robot ri, programmed with the b behavior sets Ai = {ai1, ai2, ..., aib}, also has b monitors MONi = {moni1, moni2, ..., monib}, such that monitor monij observes the performance of any robot performing task hi(aij), keeping track of the time of task completion (or other appropriate performance quality measure) of that robot. Monitor monij then uses the mechanism described below to update the control parameters of behavior set aij based upon this learned knowledge.

Once the robot performance data has been obtained, it must be input to a control mechanism that allows the robot team to improve its efficiency over time while not sacrificing the fault tolerant characteristics of the behavior-based ALLIANCE architecture. Through an extensive series of empirical studies reported in [31], we derived an action selection algorithm that specifies how the parameters should be updated to achieve increased efficiency in the robot team's action selection.

Fig. 2. The L-ALLIANCE architecture, which adds in the capability for dynamic parameter adaptation.

The algorithm derived in [31] specifies that each robot ri should obey the following algorithm in selecting its tasks:

1. Divide the tasks into two categories:
   (a) Those tasks which ri expects to be able to perform better than any other team member, and which no other robot is currently performing.
   (b) All other tasks ri can perform.
2. Repeat the following until sensory feedback indicates that no more tasks are left:
   (a) Select tasks from the first category according to the longest task first approach, unless no more tasks remain in the first category.
   (b) Select tasks from the second category according to the shortest task first approach.

If a robot has no learned knowledge about team member capabilities, all of its tasks fall into the second category.
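The two-category selection rule can be sketched as follows. This is a hedged sketch under our own naming: my_time and best_time stand for the learned task-time estimates (r_i's own, and the best over all teammates), and claimed is the set of tasks some other robot is currently reported to be performing:

```python
def choose_next_task(tasks, my_time, best_time, claimed):
    """Pick the next task for robot r_i under the two-category ordering.

    tasks     -- remaining task names
    my_time   -- my_time[t]: r_i's expected completion time for task t
    best_time -- best_time[t]: best expected time over all team members
    claimed   -- set of tasks some other robot is currently performing
    """
    # Category 1: tasks r_i expects to perform at least as well as any
    # teammate, and which no other robot is currently performing.
    cat1 = [t for t in tasks if my_time[t] <= best_time[t] and t not in claimed]
    if cat1:
        return max(cat1, key=lambda t: my_time[t])   # longest task first
    # Category 2: all other tasks r_i can perform.
    if tasks:
        return min(tasks, key=lambda t: my_time[t])  # shortest task first
    return None

my_time   = {"unload": 40.0, "sort": 10.0, "sweep": 25.0}
best_time = {"unload": 40.0, "sort": 8.0,  "sweep": 25.0}
choose_next_task(["unload", "sort", "sweep"], my_time, best_time, claimed=set())
# -> "unload" (the longest of the tasks r_i expects to do best)
```

Committing early to the longest tasks a robot does best, while mopping up the rest shortest-first, is the ordering heuristic the empirical studies in [31] arrived at; with no learned estimates, category 1 is empty and only the shortest-first branch applies.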

Note that although the above algorithm is stated in terms of a centralized controlling mechanism, the algorithm is in fact distributed across the behavior sets of ALLIANCE through the motivational behavior parameter settings.

The key parameters in ALLIANCE that affect the action selection are:

- δ_fast_ij(t): the rate of impatience of ri at time t concerning the behavior set aij when no other robot is performing task hi(aij)

- δ_slow_ij(k, t): the rate of impatience of ri at time t concerning the behavior set aij when robot rk is performing task hi(aij)

- ψ_ij(t): the time ri will maintain aij's activity before acquiescing to another robot

Page 11: IEEE TRANSA CTIONS ON R OBOTICS AND A UTOMA ...web.eecs.utk.edu/~leparker/publications/TRA.pdfIEEE TRANSA CTIONS ON R OBOTICS AND A UTOMA TION, V OL. 14, NO. 2, APRIL 1998: 220-240

PARKER: ALLIANCE: AN ARCHITECTURE FOR FAULT TOLERANT MULTI-ROBOT COOPERATION 11

We now describe how these parameters are set and updated during the mission. First, we define the performance metric, task completion time, as:

task_time_i(k, j, t) = (average time over the last μ trials of rk's performance of task hi(aij)) + (one standard deviation of these μ attempts, as measured by ri)

The robot impatience parameters δ_slow_ij(k, t) and δ_fast_ij(t) are then updated as follows:

φ_ij(k, t) = time during which robot ri is willing to allow robot rk's
             communication message to affect the motivation of behavior set aij
           = task_time_i(i, j, t)

δ_slow_ij(k, t) = rate of impatience of robot ri concerning behavior set aij
                  after discovering robot rk performing the task corresponding
                  to this behavior set
                = θ / φ_ij(k, t)

min_delay = minimum allowed delay
max_delay = maximum allowed delay
high = max over k, j of task_time_i(k, j, t)
low = min over k, j of task_time_i(k, j, t)
scale_factor = (max_delay - min_delay) / (high - low)

δ_fast_ij(t) = rate of impatience of robot ri concerning behavior set aij in
               the absence of other robots performing a similar behavior set
             = θ / (min_delay + (task_time_i(i, j, t) - low) × scale_factor)
                   if task_category_ij(t) = 2
             = θ / (max_delay - (task_time_i(i, j, t) - low) × scale_factor)
                   otherwise

The robot acquiescence parameters are updated as follows:

ψ_ij(t) = time robot ri wants to maintain behavior set aij's activity before
          yielding to another robot
        = task_time_i(i, j, t)

These parameter settings cause the robot team to effectively select their actions according to the algorithm given earlier in this subsection. Refer to [31] for the derivation of this algorithm.
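The update formulas above can be sketched numerically as follows. This is an illustrative sketch under stated assumptions: theta, min_delay, and max_delay are designer-chosen constants (the values here are invented), and the function names are ours:

```python
from statistics import mean, stdev

def task_time(trial_times):
    """Performance metric: mean of the last mu trials plus one standard deviation."""
    return mean(trial_times) + (stdev(trial_times) if len(trial_times) > 1 else 0.0)

def delta_fast(theta, t_ij, low, high, min_delay, max_delay, category):
    """delta_fast_ij(t): fast impatience rate mapped from the expected task time."""
    scale = (max_delay - min_delay) / (high - low)
    if category == 2:
        delay = min_delay + (t_ij - low) * scale   # shortest task first
    else:
        delay = max_delay - (t_ij - low) * scale   # longest task first
    return theta / delay

# Example: threshold theta = 10, delays bounded in [5, 50] time units.
t = task_time([20.0, 22.0, 18.0])   # 22.0 (mean 20 + sample stdev 2)
rate = delta_fast(10.0, t, low=10.0, high=40.0, min_delay=5.0, max_delay=50.0,
                  category=2)
# In category 2, a shorter expected task time yields a shorter delay and hence
# a larger rate, so shorter tasks cross the activation threshold theta sooner;
# category 1 inverts the mapping, favoring the longest task first.
```

Dividing θ by the delay makes the motivation reach the threshold after exactly that delay (at one update per time unit), which is how the time-valued task estimates become impatience rates.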

IV. Results

The ALLIANCE architecture has been successfully implemented in a variety of proof-of-concept applications on both physical and simulated mobile robots. The applications implemented on physical robots include two versions of a hazardous waste cleanup mission and a cooperative box pushing demonstration [28]. The applications using simulated mobile robots include a janitorial service mission [27] and a bounding overwatch mission (reminiscent of military surveillance) [31]. All of these missions using the ALLIANCE architecture have been well-tested. Over 60 logged (and many videotaped) physical robot runs of the hazardous waste cleanup mission and over 30 physical robot runs (many of which were videotaped) of the box pushing demonstration were completed to elucidate the important issues in heterogeneous robot cooperation. The missions implemented on simulated robots encompass dozens of runs each, most of which were logged in the study of the action selection mechanism.

The experimental mission we describe here to illustrate the fault tolerant action selection features of ALLIANCE is a laboratory version of hazardous waste cleanup. (Refer to [29], [31] for a somewhat different version of the hazardous waste cleanup mission, which involved the use of only one spill, rather than the two spills described below.) We first describe the robots used in these experimental studies, followed by a description of the mission the robots were given. We then describe the behavior set design of the robots for this mission, followed by the results of the implementation. The results described below are available on videotape [30].

A. The Robots

Our empirical studies were conducted on teams of three R-2 robots purchased commercially from IS Robotics. Each of these robots is a small, fully autonomous wheeled vehicle measuring approximately 25 centimeters wide, 31 centimeters deep, and 35 centimeters tall. The R-2 has two drive wheels arranged as a differential pair, two caster wheels in the rear for stability, and a two-degree-of-freedom parallel jaw gripper for grasping objects. The robot sensory suite includes eight infrared proximity sensors for use in collision avoidance, piezoelectric bump sensors distributed around the base of the robot for use in collision detection, and additional bump sensors inside the gripper for use in measuring gripping force.

We note here that although these robots are of the same type and thus have the potential of maximum redundancy in capabilities, mechanical drift and failure can cause them to have quite different actual abilities. For example, one of our robots had full use of its side infrared (IR) sensors, which allowed it to perform wall-following, whereas the side IR sensors of two of the other robots had become dysfunctional. The L-ALLIANCE learning and parameter update system outlined in section III-E gives these robots the ability to take advantage of these differences and thus determine from trial to trial which team member is best suited for which task.

A radio communication system allows robot team members to communicate with each other. This radio system is integrated with a positioning system, which consists of a transceiver unit attached to each robot plus two sonar base stations for use in triangulating the robot positions. The positioning system is accurate to about 15 centimeters and is useful for providing robots with information on their own position with respect to their environment and with respect to other robot team members.


B. The Hazardous Waste Cleanup Mission

Illustrated in figure 3, the laboratory version of the hazardous waste cleanup mission requires two artificially "hazardous" waste spills in an enclosed room to be cleaned up by a team of three robots. This mission requires robot team members to perform the following distinct tasks: the robot team must locate the two waste spills and move the two spills to a goal location, while also periodically reporting the team progress to humans monitoring the system. These tasks are referred to in the remainder of this article as find-locations, move-spill(left), move-spill(right), and report-progress, where left and right refer to the locations of the two spills relative to the room entrance.

Fig. 3. The experimental mission: hazardous waste cleanup. (The diagram shows the room entrance, the robots' starting positions, the two initial spill locations, the desired final spill location, and the site from which to report progress.)

A difficulty in this mission is that the human monitor does not know the exact location of the spills in robot coordinates, and can only give the robot team qualitative information on the initial location of the two spills and the final desired location to which the robots must move the spills. Thus, the robots are told qualitatively that one spill is located in the right half of the front third of the room, while the other spill is located in the left half of the front third of the room. Furthermore, the robots are also told that the desired final location of the spill is in the back, center of the room, relative to the position of the entrance. This information is used as described below to locate the initial and final spill locations. To prevent interference among robots, ideally only one robot at a time attempts to find the spill, broadcasting the computed locations to the other team members once the task is complete.

Each robot was preprogrammed to have the following behavior sets, which correspond to high-level tasks that must be achieved on this mission: find-locations-methodical, find-locations-wander, move-spill(loc), and report-progress. A low-level avoid-obstacles behavior was active at all times in these robots except during portions of the move-spill task, when it was suppressed to allow the robot to pick up the spill object. The organization of the behavior sets for this mission is shown in figure 4.

Fig. 4. The ALLIANCE-based control of each robot in the hazardous waste cleanup mission. Not all sensory inputs to the behavior sets are shown here. In this figure, the high-level task-achieving functions find-locations-methodical and find-locations-wander are abbreviated as find-locs-meth and find-locs-wander, respectively.

Two behavior sets are provided which both accomplish the task of finding the initial and final spill locations, find-locations-methodical and find-locations-wander, both of which depend upon the workspace being rectangular and on the sides of the room being parallel to the axes of the global coordinate system. Because of these assumptions, these behavior sets do not serve as generally applicable location-finders. However, we made no attempt to generalize these algorithms, since the point of this experiment is to demonstrate the adaptive action selection characteristics of ALLIANCE. Shown in more detail in figure 5, the methodical version of finding the spill location is much more reliable than the wander version, and involves the robot first noting its starting (or home) x, y position and then following the walls of the room using its side IRs until it has returned to its home location, while tracking the minimum and maximum x and y positions it reaches. It then uses these x, y values to calculate the coordinates of the right and left halves of the front third of the room (for the two initial spill locations) and the back center of the room (for the final spill location). These locations are then made available to the move-spill(loc) behavior set, which requires this information to perform its task.

The wander version of finding the initial and desired final spill locations, shown in figure 6, avoids the need for side IR sensors by causing the robot to wander in each of the four directions (west, north, east, and south) for a fixed time period. While the robot wanders, it tracks the minimum and maximum x and y positions it discovers. Upon the conclusion of the wandering phase, the robot calculates the desired initial and final locations from these minimum and maximum x, y values.
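The location calculation that both behavior sets share can be sketched as follows. This is an illustrative reconstruction under stated assumptions (an axis-aligned rectangular room with the entrance on the min-y wall); the function name and coordinate conventions are ours, not the paper's code.

```python
def spill_and_goal_locations(min_x, max_x, min_y, max_y):
    """Given the workspace bounding box tracked while wall-following or
    wandering, return approximate centers of the two initial spill regions
    (left and right halves of the front third of the room) and the final
    goal region (back center of the room)."""
    width = max_x - min_x
    depth = max_y - min_y
    front_y = min_y + depth / 6.0            # middle of the front third
    left_spill = (min_x + width * 0.25, front_y)
    right_spill = (min_x + width * 0.75, front_y)
    goal = (min_x + width * 0.50, max_y - depth / 6.0)  # back center
    return left_spill, right_spill, goal

# Example: a 6 m x 9 m room with its origin at (0, 0)
left, right, goal = spill_and_goal_locations(0.0, 6.0, 0.0, 9.0)
```

Either behavior set (wall-following or wandering) only needs to supply the bounding box; the region computation is identical, which is what lets the two behavior sets substitute for one another.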

Fig. 5. The robot control organization within the find-locations-methodical behavior set.

Fig. 6. The robot control organization within the find-locations-wander behavior set.

The move-spill(loc) behavior set, shown in more detail in figure 7, can be activated whenever there are spill objects needing to be picked up at loc, the locations of the initial and final spill positions are known, and the robot is not aware of any other robot currently working on the spill at loc. It involves having the robot (1) move to the vicinity of the initial spill location, (2) wander in a straight line through the area of the spill while using its front IR sensors to scan for spill objects, (3) "zero in" on a spill object once it is located to center it in the gripper, (4) grasp and lift the spill object, (5) move to the vicinity of the final spill location, and then (6) lower and release the spill object. To minimize interference among robots in a relatively small space, ideally only one robot at a time should work on a given spill.
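The six steps above can be sketched as a simple sequential state machine. The step names and the execute_step callback are hypothetical placeholders for illustration, not the paper's implementation.

```python
# Hypothetical sketch of the six move-spill(loc) steps as an ordered
# sequence; each entry corresponds to one numbered step in the text.
MOVE_SPILL_STEPS = [
    "go_to_spill_vicinity",   # (1) move to the initial spill location
    "scan_for_object",        # (2) straight-line wander, front IRs scanning
    "center_in_gripper",      # (3) "zero in" on a detected spill object
    "grasp_and_lift",         # (4) close the gripper and lift
    "go_to_goal_vicinity",    # (5) move to the final spill location
    "lower_and_release",      # (6) put the object down at the goal
]

def run_move_spill(execute_step):
    """Run the steps in order; execute_step(name) returns True on success.
    Returns the name of the first failed step, or None if all succeed."""
    for name in MOVE_SPILL_STEPS:
        if not execute_step(name):
            return name
    return None
```

Returning the failed step is one simple way such a failure could be surfaced to a monitoring layer; in ALLIANCE itself, failure manifests through the motivational mechanism (the robot's lack of demonstrated progress) rather than an explicit return value.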

Fig. 7. The robot control organization within the move-spill(loc) behavior set.

Fig. 8. The robot control organization within the report-progress behavior set.

The report-progress behavior set, shown in figure 8, corresponds to the high-level task that the robot team is required to perform approximately every 4 minutes during the mission. This task involves returning to the room entrance and informing the human monitoring the system of the activities of the robot team members and some information regarding the success of those activities. Note that this task only needs to be performed by the team as a whole every 4 minutes, not by all team members. In a real-life application of this sort, the progress report would most likely be delivered via a radio message to the human. However, in this experiment no actual progress information was maintained (although it could easily be accomplished by logging the radio-transmitted robot activities), and delivering the report consisted of playing an audible tune on the robot's piezoelectric buzzer from the room entrance rather than relaying a radio message.

C. Experiments

We report here the experiments we conducted to testthe ability of ALLIANCE to achieve fault-tolerant coop-erative control of our team of mobile robots performingthe hazardous waste cleanup mission. In all of the follow-ing experiments, teams of three R-2 robots were utilized inan environmental setup very similar to that depicted in �g-ure 3; we will refer to these robots individually as GREEN,BLUE, and GOLD. All the robots began their missions atthe room entrance, as shown in �gure 9.


Fig. 9. The robot team at the beginning of the hazardous waste cleanup mission.

Fig. 10. Typical robot actions selected during experiment with no robot failures. This is one instance of many runs of this mission. In this and the following figures showing traces of action selections, the meanings of the abbreviations are as follows: RP stands for report-progress; MS(L) and MS(R) stand for move-spill(left) and move-spill(right), respectively; FLW stands for find-locations-wander; and FLM stands for find-locations-methodical.

Figure 10 shows the action selection results of a typical experimental run when no robot failures occur; figure 11 shows the corresponding motivation levels during this run. As reflected in these figures, at the beginning of the mission, GREEN has the highest motivation to perform behavior set find-locations-methodical, causing it to initiate this action. This causes BLUE and GOLD to be satisfied for a while that the initial and final spill locations are going to be found; since no other task can currently be performed, they sit idle, waiting for the locations to be found. However, they do not idle forever waiting on the locations to be found. As they wait, they become more and more impatient over time, which can cause one of BLUE or GOLD to decide to find the spill and goal locations.
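This takeover dynamic can be illustrated with a toy simulation. The rates and the threshold below are invented for illustration; in ALLIANCE they would be derived from the robots' impatience parameters and the observed progress of the active robot.

```python
# Toy illustration: GREEN works on find-locations but never finishes, so
# the idle robots' impatience toward that task rises each time step until
# one of them crosses its activation threshold and takes the task over.
THRESHOLD = 10.0
impatience_rate = {"BLUE": 1.2, "GOLD": 1.0}  # BLUE is slightly more impatient
motivation = {"BLUE": 0.0, "GOLD": 0.0}

takeover_robot, takeover_time = None, None
for t in range(1, 100):
    for robot in ("BLUE", "GOLD"):
        motivation[robot] += impatience_rate[robot]
        if motivation[robot] >= THRESHOLD:
            takeover_robot, takeover_time = robot, t
            break
    if takeover_robot:
        break

print(takeover_robot, takeover_time)
```

With these illustrative rates, BLUE crosses the threshold first and takes over the task, mirroring the experiment in which BLUE activates find-locations-wander after GREEN fails to make progress.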

Fig. 11. Motivational levels for behavior sets during experiment with no robot failures. The dashed lines represent the thresholds of activation of each behavior set.

Indeed, this does happen in a situation as shown in the photograph in figure 12, in which we intentionally interfere with GREEN's ability to find the spill and goal locations. As shown in the action trace of figure 13 (with the corresponding motivational trace in figure 14), this leads one of the remaining robots, namely BLUE, to activate its find-locations-wander behavior set. (Note that BLUE does not activate its find-locations-methodical behavior set because its side infrared sensors failed during previous runs, preventing BLUE from successfully accomplishing that behavior set. This behavior set is left in BLUE's repertoire to allow it to respond to some potential future event that may restore the working order of the infrared sensors. Its motivations were altered based upon the L-ALLIANCE mechanism outlined in section III-E.) In this case, GREEN acquiesces its attempt to find the spill and goal locations to BLUE, since GREEN realized it was encountering difficulties of some sort. In either case, the robot finding the spill and goal locations reports these locations to the rest of the team.

Fig. 12. Here, we interfere with GREEN, which is attempting to locate the spill and goal locations, thus preventing it from completing this task.

Fig. 13. Typical robot actions selected during experiment when the initial robot action of finding the spill fails. This is one instance of many runs of this mission.

At this point, the environmental feedback and knowledge of the spill and goal locations indicate to the robot team that the move-spill(loc) behavior set is applicable. As we see in figure 10, GREEN selects to move the left spill while BLUE selects to move the right spill. Since only one robot at a time should work on a given spill (as described in section IV-B), GOLD sits idle, satisfied that the left and right spills are going to be moved. Figure 15 shows a photograph of the robots at this stage in the mission.

In the meantime, the robots' impatience motivations to report the team's progress are increasing. Since GOLD is not performing any other tasks, it is the first to activate its report-progress behavior set. This reporting satisfies the remainder of the team, so they continue to move the two spills. This periodic reporting of the progress throughout the mission by GOLD is reflected in the diagrams in figures 10 and 13. In these particular examples, GOLD has effectively specialized as the progress reporting robot, whereas GREEN and BLUE have specialized as the move-spill robots. The mission continues in this way until both spills are moved from their starting location to the goal destination. Figure 16 shows two of the robots delivering spill objects to the goal destination.

Fig. 14. Motivational levels for behavior sets during experiment with spill finding error. The dashed lines represent the thresholds of activation of each behavior set.

Fig. 15. Now knowing the location of the two spills, two R-2 robots are in the process of moving their respective spills to the goal location.

Fig. 16. Robots delivering spill objects to the goal destination.

To illustrate the effect of unexpected events on the action selection of the team, we next experimented with dynamically altering the composition of the team during the mission. Figure 17 shows the effect on the mission when we removed BLUE from the team. Figures 18, 19, and 20 show the corresponding motivation, impatience, and acquiescence levels for the three robots GREEN, BLUE, and GOLD. The removal of BLUE caused GOLD to become impatient that the right spill was not being moved, which in turn caused GOLD to activate its behavior set to move the right spill. However, this then effectively removes from the team the robot that is performing all of the progress reports, leading the remaining two robots, GREEN and GOLD, to have to interrupt their spill-moving activities to occasionally report the progress. A similar effect can be observed in figure 21 (with the corresponding motivations shown in figure 22), when we remove GOLD from the team.

Fig. 17. Typical robot actions selected during experiment when one of the robot team members which is moving a spill is removed. This is one instance of many runs of this mission. In this figure, the "X" indicates the point in time when the corresponding robot was removed from the robot team.

D. Discussion

These experiments illustrate a number of primary characteristics we consider important in developing cooperative robotic teams. First of all, the cooperative team under ALLIANCE control is robust, in that robots are allowed to continue their actions only as long as they demonstrate their ability to have the desired effect on the world. This was illustrated in the experiments by BLUE and GOLD becoming gradually more impatient with GREEN's search for the spill. If GREEN did not locate the spill in a reasonable length of time, then one of the remaining robots would take over that task, with GREEN acquiescing the task.

Secondly, the cooperative team is able to respond autonomously to many types of unexpected events, either in the environment or in the robot team, without the need for external intervention. As we illustrated, at any time during the mission, we could disable or remove robot team members, causing the remaining team members to perform those tasks that the disabled robot would have performed.

Fig. 18. Motivational, impatience, and acquiescence levels for behavior sets of GREEN during experiment when a team member that is moving a spill is removed. The dashed lines represent the thresholds of activation.

Fig. 19. Motivational, impatience, and acquiescence levels for behavior sets of BLUE during experiment when a team member that is moving a spill is removed. The dashed lines represent the thresholds of activation.


Fig. 20. Motivational, impatience, and acquiescence levels for behavior sets of GOLD during experiment when a team member that is moving a spill is removed. The dashed lines represent the thresholds of activation.

Fig. 21. Typical robot actions selected during experiment when one of the robot team members which is reporting the progress is removed. In this figure, the "X" indicates the point in time when the corresponding robot was removed from the robot team. This is one instance of many runs of this mission.

Fig. 22. Motivational levels for behavior sets during experiment when a team member that is reporting the progress is removed. The dashed lines represent the thresholds of activation of each behavior set.

Clearly, we could also have easily increased or decreased the size of the spill during the mission, and the robots would not be adversely affected.

Third, the cooperative team need have no a priori knowledge of the abilities of the other team members to effectively accomplish the task. As previously noted, the learning/parameter update system, L-ALLIANCE, allows the team to improve its efficiency on subsequent trials whenever familiar robots are present.

Other characteristics of ALLIANCE have also been studied, which show that this architecture allows robot teams to accomplish their missions even when the communication system providing it with the awareness of team member actions breaks down. Although the team's performance in terms of time and energy may deteriorate, at least the team is still able to accomplish its mission. Refer to [31] for a deeper discussion of these and related issues.

The primary weakness of ALLIANCE is its restriction to independent subtasks. As it is designed, one has to explicitly state ordering dependencies in the preconditions if the order of subtask completion is important. No mechanism for protecting subgoals is provided in ALLIANCE. For example, consider a janitorial service team of two robots that can both empty the garbage and clean the floor. However, let us say that the robots are clumsy in emptying the garbage, and usually drop garbage on the floor while they empty the trash. Then, it is logical that the trash should be emptied before beginning to clean the floor. Under ALLIANCE, this mission can be accomplished in one of three ways:

1. An explicit ordering dependency can be stated in advance between the two subtasks, causing the robots to automatically choose to empty the garbage first,

2. The robots may fortuitously select to empty the garbage first, because of the parameter settings, or

3. The robots will be inefficient in their approach, cleaning the floor before or during the garbage emptying task, which leads to the need to clean the floor again.

Although all of these events lead to the mission being completed, efficiency is lost in the third situation, and additional explicit knowledge has to be provided in the first situation. For the applications that have been implemented in ALLIANCE (e.g. janitorial service, bounding overwatch, mock hazardous waste cleanup, cooperative box pushing), this limitation has not been a serious problem. Nevertheless, enhancing the ALLIANCE action selection mechanism to enable more efficient execution of these special case situations without the need for providing additional controlling information is a topic of future research.
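The first option, stating an ordering dependency in advance, can be sketched as an explicit precondition check. The task names and data structures below are ours for illustration, not ALLIANCE's actual precondition mechanism.

```python
# Hypothetical sketch of option 1: an ordering dependency encoded as an
# explicit precondition, so clean-floor becomes applicable only after
# empty-garbage has been completed.
completed = set()  # tasks the team has finished so far
preconditions = {"clean-floor": {"empty-garbage"}}

def applicable(task):
    """A task is applicable once all of its declared predecessors are done."""
    return preconditions.get(task, set()) <= completed

order = []
for _ in range(2):
    # greedily pick any applicable, not-yet-done task
    for task in ("clean-floor", "empty-garbage"):
        if task not in completed and applicable(task):
            order.append(task)
            completed.add(task)
            break
```

Even though clean-floor is considered first on each pass, the precondition forces the garbage to be emptied before the floor is cleaned, which is exactly the extra controlling knowledge the text notes must be supplied by hand.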

V. Conclusions

We have presented a fully distributed, behavior-based architecture called ALLIANCE, which facilitates fault tolerant mobile robot cooperation. A number of key characteristics of ALLIANCE provide these fault tolerant cooperative features. ALLIANCE enhances team robustness through the use of the motivational behavior mechanism, which constantly monitors the sensory feedback of the tasks that can be performed by an individual robot, adapting the actions selected by that robot to the current environmental feedback and the actions of its teammates. Whether the environment changes to require the robots to perform additional tasks or to eliminate the need for certain tasks, ALLIANCE allows the robots to handle the changes fluidly and flexibly. This same mechanism allows robot team members to respond to their own failures or to failures of teammates, leading to adaptive action selection to ensure mission completion. ALLIANCE further enhances team robustness by making it easy for robot team members to deal with the presence of overlapping capabilities on the team. The ease with which redundant robots can be incorporated on the team provides the human team designer the ability to utilize physical redundancy to enhance the team's fault tolerance. The feasibility of this architecture for achieving fault tolerance has been illustrated through an example implemented on a physical robot team performing a laboratory version of hazardous waste cleanup.

Acknowledgements

The author wishes to thank Prof. Rodney A. Brooks of the Massachusetts Institute of Technology's Artificial Intelligence Laboratory, who supervised this research, and the three anonymous reviewers who provided valuable suggestions for improving earlier versions of this paper.

Appendix

A. Proofs of Termination

When evaluating a control architecture for multi-robot cooperation, it is important to be able to predict the team's expected performance using that architecture in a wide variety of situations. One should be justifiably wary of using an architecture that can fail catastrophically in some situations, even though it performs fairly well on average. At the heart of the problem is the issue of reliability: how dependable the system is, and whether it functions properly each time it is utilized. To properly analyze a cooperative robot architecture we should separate the architecture itself from the robots on which the architecture is implemented. Even though individual robots on a team may be quite unreliable, a well-designed cooperative architecture could actually be implemented on that team to allow the robots to very reliably accomplish their mission, given a sufficient degree of overlap in robot capabilities. On the other hand, an architecture should not be penalized for a team's failure to accomplish its mission even though the architecture has been implemented on extremely reliable robots, if those robots do not provide the minimally acceptable mix of capabilities. A major difficulty, of course, is defining reasonable evaluation criteria and evaluation assumptions by which an architecture can be judged. Certain characteristics of an architecture that extend its application domain in some directions may actually reduce its effectiveness for other types of applications. Thus, the architecture must be judged according to its application niche, and how well it performs in that context.

ALLIANCE is designed for applications involving a significant amount of uncertainty in the capabilities of robot team members, which themselves operate in dynamic, unpredictable environments. Within this context, a key point of interest is whether the architecture allows the team to complete its mission at all, even in the presence of robot difficulties and failure. This section examines this issue by evaluating the performance of ALLIANCE in certain dynamic environments.

Let us consider realistic applications involving teams of robots that are not always able to successfully accomplish their individual tasks; we use the term limitedly-reliable robot to refer to such robots. The uncertainty in the expected effect of robots' actions clearly makes the cooperative control problem quite challenging. Ideally, ALLIANCE's impatience and acquiescence factors will allow a robot team to successfully reallocate actions as robot failures or dynamic changes in the environment occur. With what confidence can we know that this will happen in general? As we shall see below, in many situations ALLIANCE is guaranteed to allow a limitedly-reliable robot team to successfully accomplish its mission.

It is interesting to note that with certain restrictions on parameter settings, the ALLIANCE architecture is guaranteed to allow the robot team to complete its mission for a broad range of applications. We describe these circumstances here, along with the proof of mission termination.

We �rst de�ne the notions of goal-relevant capabilitiesand task coverage.

De�nition 1: The goal-relevant capabilities of robot ri,GRCi, are given by the set:

Page 19: IEEE TRANSA CTIONS ON R OBOTICS AND A UTOMA ...web.eecs.utk.edu/~leparker/publications/TRA.pdfIEEE TRANSA CTIONS ON R OBOTICS AND A UTOMA TION, V OL. 14, NO. 2, APRIL 1998: 220-240

PARKER: ALLIANCE: AN ARCHITECTURE FOR FAULT TOLERANT MULTI-ROBOT COOPERATION 19

GRCi = faij jhi(aij) 2 Tg

where T is the set of tasks required by the current mission. In other words, the capabilities of robot r_i that are relevant to the current mission (i.e., goal) are simply those high-level task-achieving functions which lead to some task in the current mission being accomplished.

We use the term task coverage to give a measure of the number of capabilities on the team that may allow some team member to achieve a given task. However, we cannot always predict robot failures; thus, at any point during a mission, a robot may reach a state from which it cannot achieve a task for which it has been designed. This implies that the expected task coverage for a given task in a mission may not always equal the true task coverage once the mission is underway.

Definition 2: Task coverage is given by:

task_coverage(task_k) = Σ_{i=1}^{n} Σ_j c_ijk,   where c_ijk = 1 if h_i(a_ij) = task_k, and 0 otherwise

The task coverage measure is useful for composing a team of robots to perform a mission from an available pool of heterogeneous robots. At a minimum, we need the team to be composed so that the task coverage of all tasks in the mission equals 1. This minimum requirement ensures that, for each task required in the mission, a robot is present that has some likelihood of accomplishing that task. Without this minimum requirement, the mission simply cannot be completed by the available robots. Ideally, however, the robot team is composed so that the task coverage for all tasks is greater than 1. This gives the team a greater degree of redundancy and overlap in capabilities, thus increasing the reliability and robustness of the team amidst individual robot failures.

Let us now define the notion of sufficient task coverage as follows:

Condition 1: (Sufficient task coverage):

∀(task_k ∈ T). (task_coverage(task_k) ≥ 1)
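As a concrete illustration, Condition 1 is straightforward to check when composing a team from a pool of available robots. The sketch below is a minimal, hypothetical encoding: robot names, task labels, and the dictionary layout are invented for this example and are not taken from the architecture itself.

```python
# Illustrative encoding of Definitions 1-2 and Condition 1. Each robot maps
# to the list of tasks its behavior sets achieve, i.e. the multiset
# {h_i(a_ij)}; duplicate entries model redundant behavior sets.

def task_coverage(team, task):
    """Definition 2: count the behavior sets across the team with h_i(a_ij) = task."""
    return sum(caps.count(task) for caps in team.values())

def sufficient_task_coverage(team, mission):
    """Condition 1: every task in the mission T is covered at least once."""
    return all(task_coverage(team, t) >= 1 for t in mission)

team = {
    "r1": ["find-spill", "move-spill"],
    "r2": ["move-spill"],          # redundant coverage of move-spill
}
mission = {"find-spill", "move-spill"}

assert sufficient_task_coverage(team, mission)
assert task_coverage(team, "move-spill") == 2    # coverage > 1: redundancy
assert not sufficient_task_coverage(team, mission | {"report-progress"})
```

A coverage of 2 for "move-spill" reflects the redundancy the text recommends: either r1 or r2 failing still leaves the task achievable.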

This condition ensures that, barring robot failures, all tasks required by the mission should be able to be accomplished by some robot on the team.

Now, we define the notion of an active robot team, since we consider our robots to be useful only if they can be motivated to perform some action:

Definition 3: An active robot team is a group of robots, R, such that:

∀(r_i ∈ R). ∀(a_ij ∈ GRC_i). ∀(r_k ∈ R). ∀t. [(δ_slow_ij(k, t) > 0) ∧ (δ_fast_ij(t) > 0) ∧ (θ is finite)]

In other words, an active robot has a monotonically increasing motivation to perform any task of the mission which that robot has the ability to accomplish. Additionally, the threshold of activation of all behavior sets of an active robot is finite.
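Definition 3 can also be checked mechanically for a candidate team. The sketch below assumes each robot's impatience rate parameters (standing in for δ_slow_ij(k, t) and δ_fast_ij(t)) are available as plain numbers; the data layout and names are illustrative, not part of the architecture.

```python
import math

# Illustrative check of Definition 3 (an "active" robot team).
# rates[robot][task] holds that robot's (slow, fast) impatience rates.

def is_active_team(rates, theta):
    """True iff every impatience rate is strictly positive and the
    threshold of activation theta is finite."""
    if not math.isfinite(theta):
        return False
    return all(
        slow > 0 and fast > 0
        for robot in rates.values()
        for (slow, fast) in robot.values()
    )

rates = {
    "r1": {"find-spill": (0.1, 2.0), "move-spill": (0.2, 1.5)},
    "r2": {"move-spill": (0.05, 1.0)},
}
assert is_active_team(rates, theta=1.0)
assert not is_active_team(rates, theta=math.inf)   # infinite threshold
rates["r2"]["move-spill"] = (0.0, 1.0)             # zero slow rate
assert not is_active_team(rates, theta=1.0)
```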

Finally, we define a condition that holds in many multi-robot applications.

Condition 2: (Progress when Working):
Let z be the finite amount of work remaining to complete a task w. Then whenever robot r_i activates a behavior set corresponding to task w, either (1) r_i remains active for a sufficient, finite length of time ε such that z is reduced by a finite amount which is at least some constant δ greater than 0, or (2) r_i experiences a failure with respect to task w. Additionally, if z ever increases, the increase is due to an influence external to the robot team.

Condition 2 ensures that even if robots do not carry a task through to completion before acquiescing, they still make some progress toward completing that task whenever the corresponding behavior set is activated for some time period at least equal to ε. One exception, however, is if a robot failure has occurred that prevents robot r_i from accomplishing task w, even if r_i has been designed to achieve task w.

This condition also implies that if more than one robot is attempting to perform the same task at the same time, the robots do not interfere with each other's progress so badly that no progress towards completion of the task is made. The rate of progress may be slowed somewhat, or even considerably, but some progress is made nevertheless.

Finally, Condition 2 implies that the amount of work required to complete the mission never increases as a result of robot actions. Thus, even though robots may not be any help towards completing the mission, at least they are not making matters worse. Although this may not always hold true, in a wide variety of applications this is a valid assumption. As we shall see, this assumption is necessary to prove the effectiveness of ALLIANCE in certain situations. Of course, this does not preclude dynamic environmental changes from increasing the workload of the robot team, which ALLIANCE allows the robots to handle without problem.
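The termination argument implicit in Condition 2 can be made concrete: if each qualifying activation removes at least δ > 0 units of work and the remaining work z is finite and never increased by the team, the number of activations needed is bounded. The numbers below are purely illustrative.

```python
import math

# Sketch of the bound implicit in Condition 2: each activation of length
# >= epsilon by a non-failed robot removes at least delta > 0 units of
# work, so finite remaining work z is exhausted after at most
# ceil(z / delta) such activations.

def max_activations(z, delta):
    return math.ceil(z / delta)

z, delta = 10.0, 3.0
assert max_activations(z, delta) == 4   # 10 - 4*3 <= 0: done by the 4th activation
```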

What we now show is that whenever Conditions 1 and 2 hold for a limitedly-reliable, active robot team, then either ALLIANCE allows the robot team to accomplish its mission, or some robot failure occurs. Furthermore, if a robot failure occurs, then we can know that any task that remains incomplete at the end of the mission is either a task that the failed robot was designed to accomplish, or a task that is dependent upon the capabilities of that robot.

We can now show the following:

Theorem 1: Let R be a limitedly-reliable, active robot team, and M be the mission to be solved by R, such that Conditions 1 and 2 hold. Then either (1) ALLIANCE enables R to accomplish M, or (2) a robot failure occurs. Further, if robot r_f fails, then the only tasks of M that are not completed are some subset of (a) the set of tasks r_f was designed to accomplish, unioned with (b) the set of tasks dependent upon the capabilities of r_f.

Proof:

First, we show that the calculation of the motivational behavior guarantees that each robot eventually activates a behavior set whose sensory feedback indicates that the corresponding task is incomplete. From equation 1 in Section III-D.7, we see that at time t, robot r_i's motivation m_ij(t) to perform behavior set a_ij either (1) goes to 0, or (2) changes from m_ij(t-1) by the amount impatience_ij(t). The motivation goes to 0 in one of four cases: (1) if the sensory feedback indicates that the behavior set is no longer applicable, (2) if another behavior set becomes active, (3) if some other robot has taken over task h_i(a_ij), or (4) if the robot has acquiesced its task. If the sensory feedback indicates that the behavior set is no longer applicable, we know either that the task h_i(a_ij) must be successfully accomplished, or the robot's sensory system has failed. If another behavior set a_ik becomes active in r_i, then at some point task h_i(a_ij) will either become complete, thus allowing r_i to activate behavior set a_ij, or the robot has failed. If some other robot has taken over task h_i(a_ij), then either that other robot will eventually accomplish task h_i(a_ij), thus eliminating the need to activate task a_ij, or robot r_i will become impatient with that other robot. Since r_i is active, we know that impatience_ij(t) is greater than or equal to min_k(δ_slow_ij(k, t)), which is greater than 0. Therefore, we can conclude that an idle, yet active robot always has a strictly increasing motivation to perform some incomplete task. At some point, the finite threshold of activation, θ, will thus be surpassed for some behavior set, causing r_i to activate the behavior set corresponding to task h_i(a_ij).
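The case analysis above can be sketched as a toy update rule. This is an illustrative paraphrase of the behavior described for equation 1, not the architecture's actual implementation; the flag names are invented for readability, and in the real system these conditions are computed from sensory feedback and inter-robot broadcasts.

```python
# Toy paraphrase of the motivation update described above: reset to 0 in
# the four listed cases; otherwise grow by the current impatience rate.

def update_motivation(m_prev, impatience, *, applicable, other_active,
                      taken_over, acquiesced):
    if (not applicable) or other_active or taken_over or acquiesced:
        return 0.0
    return m_prev + impatience

# An idle but active robot: impatience >= min_k(delta_slow) > 0, so the
# finite threshold theta is crossed in finitely many steps.
theta, m, steps = 5.0, 0.0, 0
while m < theta:
    m = update_motivation(m, 1.5, applicable=True, other_active=False,
                          taken_over=False, acquiesced=False)
    steps += 1
assert steps == 4   # 4 * 1.5 = 6.0 >= theta
# Another robot taking over the task resets the motivation to 0.
assert update_motivation(m, 1.5, applicable=True, other_active=False,
                         taken_over=True, acquiesced=False) == 0.0
```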

We now build upon these observations to prove that either the mission becomes accomplished, or a robot failure occurs.

PART I (Either ALLIANCE succeeds or a robot fails): Assume no robot fails. Then after a robot r_i has performed a task w for any period of time greater than ε, one of five events can occur:

1. Robot r_j takes over task w, leading robot r_i to acquiesce.
2. Robot r_i gives up on itself and acquiesces w.
3. Robot r_j takes over task w, but r_i does not acquiesce.
4. Robot r_i continues w.
5. Robot r_i completes w.

Since Condition 2 holds, we know that the first four cases reduce the amount of work left to complete task w by at least a positive, constant amount δ. Since the amount of work left to accomplish any task is finite, the task must eventually be completed in finite time. In the fifth case, since task w is completed, the sensory feedback of the robots no longer indicates the need to perform task w, and thus the robots will go on to some other task required by the mission.

Thus, for every task that remains to be accomplished, either (1) a robot able to accomplish that task eventually attempts the task enough times so that it becomes complete, or (2) all robots designed to accomplish that task have failed.

PART II (Incomplete tasks are dependent upon a failed robot's capabilities): Let F be the set of robots that fail during a mission, and AF be the union of (a) the tasks that the robots in F were designed to accomplish and (b) those tasks of the mission that are dependent upon a task that a robot in F was designed to accomplish.

First, we show that if a task is not in AF, then it will be successfully completed. Let w be some task required by the mission that is not included in AF. Since Condition 1 holds and this robot team is active, there must be some robot on the team that can successfully accomplish w. Thus, as long as w remains incomplete, one of these successful robots will eventually activate its behavior set corresponding to the task w; since Condition 2 holds, that task will eventually be completed in finite time. Thus, all tasks not dependent upon the capabilities of a failed robot are successfully completed in ALLIANCE.

Now, we show that if a task is not completed, it must be in AF. Let w be a task that was not successfully completed at the end of the mission. Assume by way of contradiction that w is not in AF. But we know from Part I that all tasks w not in AF must be completed. Therefore, task w must be in AF.

We can thus conclude that if a task is not accomplished, then it must be a task for which all robots with that capability have failed, or which is dependent upon some task for which all robots with that capability have failed. ∎
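Theorem 1's conclusion can be illustrated with a toy fixed-point computation: tasks complete unless every robot capable of them has failed, or they depend (transitively) on such a task. The robots, task names, and the dependency encoding below are all hypothetical, invented only for this sketch.

```python
# Toy illustration of Theorem 1. 'deps' maps a task to the tasks that must
# complete first; a task completes iff some non-failed robot covers it and
# its dependencies are met. Iterate to a fixed point.

def incomplete_tasks(team, failed, mission, deps):
    done = set()
    progress = True
    while progress:
        progress = False
        for task in mission - done:
            if any(d not in done for d in deps.get(task, [])):
                continue                      # ordering dependency unmet
            if any(task in caps and r not in failed
                   for r, caps in team.items()):
                done.add(task)
                progress = True
    return mission - done

team = {"r1": {"find", "report"}, "r2": {"move"}, "r3": {"move"}}
mission = {"find", "move", "report"}
deps = {"move": ["find"], "report": ["move"]}

# No failures: the mission completes.
assert incomplete_tasks(team, set(), mission, deps) == set()
# Both robots covering 'move' fail: 'move' and its dependent 'report'
# remain incomplete, exactly a subset of the theorem's set AF.
assert incomplete_tasks(team, {"r2", "r3"}, mission, deps) == {"move", "report"}
```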

Note that it is not required here that robot team members be aware of the actions of their teammates in order to guarantee that ALLIANCE allows the team to complete its mission under the above conditions. However, awareness does have an effect on the quality of the team's performance, both in terms of the time and the energy required to complete the mission. These effects on team performance are discussed in [31].

References

[1] Ronald C. Arkin, Tucker Balch, and Elizabeth Nitz. Communication of behavioral state in multi-agent retrieval tasks. In Proceedings of the 1993 International Conference on Robotics and Automation, pages 588-594, 1993.

[2] H. Asama, K. Ozaki, A. Matsumoto, Y. Ishida, and I. Endo. Development of task assignment system using communication for multiple autonomous robots. Journal of Robotics and Mechatronics, 4(2):122-127, 1992.

[3] Tucker Balch and Ronald C. Arkin. Communication in reactive multiagent robotic systems. Autonomous Robots, 1(1):1-25, 1994.

[4] Gerardo Beni and Jing Wang. On cyclic cellular robotic systems. In Japan-USA Symposium on Flexible Automation, pages 1077-1083, Kyoto, Japan, 1990.

[5] Alan Bond and Les Gasser. Readings in Distributed Artificial Intelligence. Morgan Kaufmann, 1988.

[6] Rodney A. Brooks. A robust layered control system for a mobile robot. IEEE Journal of Robotics and Automation, RA-2(1):14-23, March 1986.

[7] Rodney A. Brooks. Elephants don't play chess. Robotics and Autonomous Systems, 6:3-15, 1990.

[8] Rodney A. Brooks. New approaches to robotics. Science, 253:1227-1232, September 1991.

[9] Philippe Caloud, Wonyun Choi, Jean-Claude Latombe, Claude Le Pape, and Mark Yim. Indoor automation with many mobile robots. In Proceedings of the IEEE International Workshop on Intelligent Robots and Systems, pages 67-72, Tsuchiura, Japan, 1990.

[10] Stephanie Cammarata, David McArthur, and Randall Steeb. Strategies of cooperation in distributed problem solving. In Proceedings of the 8th International Joint Conference on Artificial Intelligence, pages 767-770, 1983.

[11] Y. Uny Cao, Alex Fukunaga, Andrew Kahng, and Frank Meng. Cooperative mobile robotics: Antecedents and directions. In Proceedings of the 1995 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS '95), pages 226-234, 1995.

[12] K. L. Chung. Elementary Probability Theory with Stochastic Processes. Springer-Verlag, New York, 1974.

[13] Paul Cohen, Michael Greenberg, David Hart, and Adele Howe. Real-time problem solving in the Phoenix environment. COINS Technical Report 90-28, University of Massachusetts, Amherst, 1990.

[14] R. W. Conway, W. L. Maxwell, and L. W. Miller. Theory of Scheduling. Addison-Wesley, Reading, Massachusetts, 1967.

[15] Randall Davis and Reid Smith. Negotiation as a metaphor for distributed problem solving. Artificial Intelligence, 20(1):63-109, 1983.

[16] J. Deneubourg, S. Goss, G. Sandini, F. Ferrari, and P. Dario. Self-organizing collection and transport of objects in unpredictable environments. In Japan-U.S.A. Symposium on Flexible Automation, pages 1093-1098, 1990.

[17] Alexis Drogoul and Jacques Ferber. From Tom Thumb to the Dockers: Some experiments with foraging robots. In Proceedings of the Second International Conference on Simulation of Adaptive Behavior, pages 451-459, 1992.

[18] Edmund Durfee and Thomas Montgomery. A hierarchical protocol for coordinating multiagent behaviors. In Proceedings of the Eighth National Conference on Artificial Intelligence, pages 86-93, 1990.

[19] T. Fukuda, S. Nakagawa, Y. Kawauchi, and M. Buss. Self organizing robots based on cell structures - CEBOT. In Proceedings of the 1988 IEEE International Workshop on Intelligent Robots and Systems (IROS '88), pages 145-150, 1988.

[20] Marcus Huber and Edmund Durfee. Observational uncertainty in plan recognition among interacting robots. In Proceedings of the 1993 IJCAI Workshop on Dynamically Interacting Robots, pages 68-75, 1993.

[21] Thomas Kreifelts and Frank von Martial. A negotiation framework for autonomous agents. In Proceedings of the Second European Workshop on Modeling Autonomous Agents and Multi-Agent Worlds, pages 169-182, 1990.

[22] C. Ronald Kube and Hong Zhang. Collective robotic intelligence. In Proceedings of the Second International Workshop on Simulation of Adaptive Behavior, pages 460-468, 1992.

[23] Victor R. Lesser and Daniel D. Corkill. The distributed vehicle monitoring testbed: A tool for investigating distributed problem solving networks. AI Magazine, pages 15-33, Fall 1983.

[24] Bruce MacLennan. Synthetic ethology: An approach to the study of communication. In Proceedings of the 2nd Interdisciplinary Workshop on Synthesis and Simulation of Living Systems, pages 631-658, 1991.

[25] Maja Mataric. Designing emergent behaviors: From local interactions to collective intelligence. In J. Meyer, H. Roitblat, and S. Wilson, editors, Proceedings of the Second International Conference on Simulation of Adaptive Behavior, pages 432-441. MIT Press, 1992.

[26] Fabrice R. Noreils. Toward a robot architecture integrating cooperation between mobile robots: Application to indoor environment. The International Journal of Robotics Research, 12(1):79-98, February 1993.

[27] Lynne E. Parker. Adaptive action selection for cooperative agent teams. In Jean-Arcady Meyer, Herbert Roitblat, and Stewart Wilson, editors, Proceedings of the Second International Conference on Simulation of Adaptive Behavior, pages 442-450. MIT Press, 1992.

[28] Lynne E. Parker. ALLIANCE: An architecture for fault tolerant, cooperative control of heterogeneous mobile robots. In Proceedings of the 1994 IEEE/RSJ/GI International Conference on Intelligent Robots and Systems (IROS '94), pages 776-783, Munich, Germany, September 1994.

[29] Lynne E. Parker. An experiment in mobile robotic cooperation. In Proceedings of the ASCE Specialty Conference on Robotics for Challenging Environments, Albuquerque, NM, February 1994.

[30] Lynne E. Parker. Fault tolerant multi-robot cooperation. MIT Artificial Intelligence Lab Videotape AIV-9, December 1994.

[31] Lynne E. Parker. Heterogeneous Multi-Robot Cooperation. PhD thesis, Massachusetts Institute of Technology, Artificial Intelligence Laboratory, Cambridge, MA, February 1994. MIT-AI-TR 1465 (1994).

[32] Jeffery Rosenschein and M. R. Genesereth. Deals among rational agents. In Proceedings of the International Joint Conference on Artificial Intelligence, pages 91-99, 1985.

[33] Reid G. Smith and Randall Davis. Frameworks for cooperation in distributed problem solving. IEEE Transactions on Systems, Man, and Cybernetics, SMC-11(1):61-70, January 1981.

[34] Luc Steels. Cooperation between distributed agents through self-organization. In Yves Demazeau and Jean-Pierre Muller, editors, Decentralized A.I. Elsevier Science, 1990.

[35] Daniel Stilwell and John Bay. Toward the development of a material transport system using swarms of ant-like robots. In Proceedings of the IEEE International Conference on Robotics and Automation, pages 766-771, 1993.

[36] Guy Theraulaz, Simon Goss, Jacques Gervet, and Jean-Louis Deneubourg. Task differentiation in Polistes wasp colonies: a model for self-organizing groups of robots. In Proceedings of the First International Conference on Simulation of Adaptive Behavior, pages 346-355, 1990.

[37] Jing Wang. DRS operating primitives based on distributed mutual exclusion. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 1085-1090, Yokohama, Japan, 1993.

[38] Gregory M. Werner and Michael G. Dyer. Evolution of communication in artificial organisms. In Proceedings of the 2nd Interdisciplinary Workshop on Synthesis and Simulation of Living Systems, pages 659-687, 1991.

[39] Gilad Zlotkin and Jeffery Rosenschein. Negotiation and conflict resolution in non-cooperative domains. In Proceedings of the Eighth National Conference on AI, pages 100-105, 1990.