Design of an adaptive robot controller for a predator-prey task using e-puck robots

MENG PROJECT

Design of an adaptive robot controller for a predator-prey task using e-puck robots

The Goal

To design an adaptive robot controller capable of performing a predator-prey task in a reconfigurable maze.

Project Specifics

Hardware

• IR Sensors;

• Camera;

• Obstacles;

• Maze.

Software

• Programming in C;

• Player/Stage.

Project Breakdown

The project can be broken down into 4 separate problems:

• Obstacle Avoidance – how to prevent a robot from colliding with obstacles in the maze;

• Object Identification – how the robots can identify each other and some additional objects;

• Predator-Prey Task – how to make a predator chase a prey and a prey escape from a predator;

• Evolution – how to evolve the controllers so that their performance increases. Evolution can also give a controller some adaptability.

Behaviour-based Architectures

[Figure: behaviour-based (BB) and traditional architectures]

Obstacle Avoidance

Infra-red sensors


Rules included


Exploring


Monitor the turnrate of the robot for a given period of time;

If the variance of the turnrate is below a defined threshold, then the robot must perform a turn;

Otherwise, maintain course.
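The exploring rule above can be sketched in C as follows. This is only an illustration under stated assumptions, not the project's code: the function names, the use of a population variance and the way samples are collected are all hypothetical.

```c
#include <stddef.h>

/* Population variance of n turn-rate samples collected over the
 * monitoring period (hypothetical helper, not from the slides). */
double turnrate_variance(const double *samples, size_t n)
{
    double mean = 0.0, var = 0.0;
    size_t i;

    if (n == 0)
        return 0.0;
    for (i = 0; i < n; i++)
        mean += samples[i];
    mean /= (double)n;
    for (i = 0; i < n; i++)
        var += (samples[i] - mean) * (samples[i] - mean);
    return var / (double)n;
}

/* Returns 1 if the robot must perform a turn (variance below the
 * threshold), 0 if it should maintain its course. */
int must_turn(const double *samples, size_t n, double variance_threshold)
{
    return turnrate_variance(samples, n) < variance_threshold;
}
```

A robot that has been driving almost straight yields a variance close to zero and is told to turn, whereas a robot that has been steering around obstacles keeps its course.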


• Bigger chance of sweeping the whole maze;

• Increased number of different perspectives.


Object Identification

First solution:

• Predator should identify yellow objects as its food;

• Prey should identify green objects as its food and yellow objects as predators.

A problem comes up: how to isolate an object of a certain colour in an image? Image subtraction was the first solution implemented.


Which images should be subtracted?


[Figure: red, blue and green channels of the camera image, and the six candidate subtractions: Red - Blue, Red - Green, Blue - Green, Blue - Red, Green - Blue, Green - Red]


Segmenting the chosen images using the Otsu Method

• Due to poor results, green is ruled out as a possible colour for the prey’s food. Blue and red are the new candidates.


Problem found: when no coloured objects are present in the image, the subtraction and Otsu Method give false results.
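A minimal sketch of the subtraction and segmentation steps is given below, to make the problem concrete. It assumes 8-bit channels stored as flat arrays and a saturating difference; the function names are hypothetical, and the Otsu implementation is a textbook version rather than the one used in the project.

```c
#include <stddef.h>
#include <stdint.h>

/* Saturating difference of two 8-bit channels: out = max(a - b, 0). */
void subtract_channels(const uint8_t *a, const uint8_t *b,
                       uint8_t *out, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        out[i] = (a[i] > b[i]) ? (uint8_t)(a[i] - b[i]) : 0;
}

/* Otsu's method: returns the threshold t that maximises the
 * between-class variance. Pixels with value > t are foreground. */
int otsu_threshold(const uint8_t *img, size_t n)
{
    unsigned long hist[256] = {0};
    unsigned long w_bg = 0;
    double sum_all = 0.0, sum_bg = 0.0, best_var = -1.0;
    int t, best_t = 0;
    size_t i;

    for (i = 0; i < n; i++)
        hist[img[i]]++;
    for (t = 0; t < 256; t++)
        sum_all += (double)t * (double)hist[t];

    for (t = 0; t < 256; t++) {
        unsigned long w_fg;
        double mean_bg, mean_fg, var_between;

        w_bg += hist[t];
        if (w_bg == 0)
            continue;              /* no background pixels yet */
        w_fg = (unsigned long)n - w_bg;
        if (w_fg == 0)
            break;                 /* no foreground pixels left */
        sum_bg += (double)t * (double)hist[t];
        mean_bg = sum_bg / (double)w_bg;
        mean_fg = (sum_all - sum_bg) / (double)w_fg;
        var_between = (double)w_bg * (double)w_fg *
                      (mean_bg - mean_fg) * (mean_bg - mean_fg);
        if (var_between > best_var) {
            best_var = var_between;
            best_t = t;
        }
    }
    return best_t;
}
```

With an image that contains only low-level noise, for example the pixel values {3, 4, 3, 5, 4, 3}, the function still returns a threshold (3 in this case) and the brighter pixels are labelled as an object. Otsu's method always splits the histogram in two, which is consistent with the false results reported when no coloured object is in view.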

Object Detection

New solution: include an mbed board, which has 4 bright blue LEDs.


Establish a new fixed threshold of 205, determined experimentally.

New feature: food is available for 3 minutes and unavailable for 20 seconds, cyclically.
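The availability cycle can be expressed as a simple function of elapsed time. The sketch below assumes that the cycle starts in the available phase and that time is counted in whole seconds; the names are hypothetical.

```c
/* Durations taken from the slide; the names are illustrative. */
#define FOOD_ON_SECONDS  180UL  /* food available for 3 minutes     */
#define FOOD_OFF_SECONDS 20UL   /* food unavailable for 20 seconds  */

/* Returns 1 if food is available at the given elapsed time,
 * assuming the cycle starts with the food available. */
int food_available(unsigned long elapsed_seconds)
{
    unsigned long phase =
        elapsed_seconds % (FOOD_ON_SECONDS + FOOD_OFF_SECONDS);
    return phase < FOOD_ON_SECONDS;
}
```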


[Figure: red, blue and green channels of an image containing a yellow object]


By looking at the images from all 3 channels, one can see that the yellow object is very dark in the blue channel, but bright in both the red and green channels.

By processing each pixel individually:

If both the red and green components of a pixel are 40% larger than the blue component, then that pixel is considered to be yellow.
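The per-pixel rule above could be written as shown below. The sketch assumes 8-bit RGB components and reads "40% larger" as strictly greater than 1.4 times the blue component; the function name is hypothetical.

```c
#include <stdint.h>

/* Returns 1 if both red and green exceed 1.4 times blue.
 * Integer arithmetic (10 * x > 14 * b) avoids floating point;
 * the operands are promoted to int, so there is no overflow. */
int is_yellow(uint8_t r, uint8_t g, uint8_t b)
{
    return (10 * r > 14 * b) && (10 * g > 14 * b);
}
```

Note that a completely black pixel is rejected, because zero is not strictly greater than zero, and a white or grey pixel is rejected because all three components are equal.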


Results of the yellow object identification


How to retrieve information from the thresholded images?


A scope is then defined for both food searching and predator avoidance:

• Food scope – if the target’s centre of mass is within the scope, no turning occurs. It provides additional stability to the controller.

• Avoid scope – if the centre of mass of the object identified is outside the scope, no turning takes place. If it is within the scope, then the robot must turn to escape. It is equivalent to the field of view in the natural world.
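One way to picture the two scopes is sketched below. It assumes that the thresholded image is a row-major binary mask, that only the horizontal centre of mass matters, and that each scope is a band of pixel columns centred on the image. None of these details are confirmed by the slides, and all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Horizontal centre of mass of the set pixels in a binary mask.
 * Returns -1.0 when no pixel is set (no object detected). */
double centre_of_mass_x(const uint8_t *mask, size_t width, size_t height)
{
    double sum_x = 0.0;
    size_t count = 0, x, y;

    for (y = 0; y < height; y++)
        for (x = 0; x < width; x++)
            if (mask[y * width + x]) {
                sum_x += (double)x;
                count++;
            }
    return count ? sum_x / (double)count : -1.0;
}

/* Returns 1 if column cx lies inside a band of scope_width columns
 * centred on the image. */
int in_scope(double cx, size_t width, double scope_width)
{
    double offset = cx - ((double)width - 1.0) / 2.0;
    if (offset < 0.0)
        offset = -offset;
    return offset <= scope_width / 2.0;
}

/* Food scope: turn towards the target only when it is outside the scope. */
int food_must_turn(double cx, size_t width, double food_scope)
{
    return cx >= 0.0 && !in_scope(cx, width, food_scope);
}

/* Avoid scope: turn away only when the object is inside the scope. */
int avoid_must_turn(double cx, size_t width, double avoid_scope)
{
    return cx >= 0.0 && in_scope(cx, width, avoid_scope);
}
```

The two rules are mirror images of each other: a narrow food scope makes the robot correct its heading more often, while a narrow avoid scope makes it ignore objects that are not almost straight ahead.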

Predator-Prey Task

The main premises of the task are:

• The predator must chase the prey, and get as close to it as possible;

• The prey must try to escape from the predator;

• The prey, like the predator, also feeds. In its case, the food is the blue light source.


An energy level was included in each controller, to reflect the events of feeding and dying. Therefore, new specifications are made:

Each robot starts off with a random energy level between 80 and 100%;

If the robot’s energy falls below a defined Hungry Level, then the robot becomes hungry and starts to look for food. The predator begins searching for the prey, and the prey begins searching for the blue light source;

When a prey is caught by the predator, its energy level becomes 0% (death);

When a predator catches a prey, it eats until it is no longer hungry;

A prey, when eating, should take approximately 3 seconds to eat enough to gain 20 percentage points of energy. It should eat until it is no longer hungry;

To be allowed to “eat”, the distance between the robot and the food source must not be larger than 5 cm.
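The energy rules above can be summarised in a small model. The sketch below is an assumption-laden illustration: the structure, the function names, the linear feeding rate and the cap at 100% are hypothetical, while the numeric constants come from the slide.

```c
/* Hypothetical energy state of one robot, in percent (0 to 100). */
typedef struct {
    double energy;
    double hungry_level;
} Robot;

#define EAT_DISTANCE_CM 5.0          /* maximum distance to the food     */
#define PREY_EAT_RATE   (20.0 / 3.0) /* percentage points per second     */

/* Starting energy between 80% and 100%, given u uniform in [0, 1]. */
double initial_energy(double u)
{
    return 80.0 + 20.0 * u;
}

int is_hungry(const Robot *r)
{
    return r->energy < r->hungry_level;
}

int can_eat(double distance_cm)
{
    return distance_cm <= EAT_DISTANCE_CM;
}

/* Advances prey feeding by dt seconds. Returns 1 if energy was gained,
 * 0 if the prey is not hungry or is too far from the food. */
int prey_eat_step(Robot *r, double distance_cm, double dt)
{
    if (!is_hungry(r) || !can_eat(distance_cm))
        return 0;
    r->energy += PREY_EAT_RATE * dt;
    if (r->energy > 100.0)
        r->energy = 100.0;
    return 1;
}

/* A prey caught by the predator dies: its energy drops to 0%. */
void mark_caught(Robot *r)
{
    r->energy = 0.0;
}
```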


Additional features included:

• Randomness in rebirth – when a robot dies, it is reborn after a few seconds. When it is reborn, there is a 51% chance it will be reborn with the same role it had before, and a 49% chance it will be reborn with the other role (i.e., predator becomes prey, and vice-versa);

• “Cannibalism” – one predator might decide to eat another predator. This is possible because of randomness in rebirth;

• Role-changing – if a predator that is very low on energy attacks a prey that has a very high energy value, then they change roles;

• 360° “sweep” – the robot comes to a halt and rotates 360 degrees on the spot, hoping to find food.
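The rebirth rule can be sketched as follows. The random draw is passed in as an argument so that the function stays deterministic; in practice it might come from something like `rand() % 100`. The type and function names are hypothetical.

```c
typedef enum { ROLE_PREDATOR, ROLE_PREY } Role;

Role other_role(Role r)
{
    return (r == ROLE_PREDATOR) ? ROLE_PREY : ROLE_PREDATOR;
}

/* draw is assumed to be a uniform integer in [0, 99].
 * Values 0..50 (51%) keep the previous role; 51..99 (49%) swap it. */
Role reborn_role(Role previous, int draw)
{
    return (draw < 51) ? previous : other_role(previous);
}
```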


Communication between robots


Evolution

[Figure: evolution cycle – Birth, Life, Death, Evaluation, Selection, Mutation, and back to Birth]


After a robot dies, the following processes take place:

• Evaluation – if the controller’s fitness value for the role it is playing is equal to or greater than the best fitness value found up to that point, then the controller is selected. Otherwise, the controller is discarded, and the process skips to Mutation.

• Selection – if the controller is selected, then its parameters become the best for that role, and are stored in the robot’s memory.

• Mutation – the best set of parameters for the role it is playing is loaded from the robot’s memory, and the parameters are slightly changed. Each parameter has its value changed by an amount within [-RANGE; RANGE].

The range is defined for each parameter as a percentage of its value: the larger the parameter, the wider the range.

After these tasks are performed, the robot is reborn with its new set of parameters, and competes until it runs out of energy. The process is then repeated.
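The three steps can be sketched as below. This is an illustration under assumptions, not the project's implementation: the data layout, the names and `MAX_PARAMS` are hypothetical, the range is treated as a fraction of the parameter's value, and the random numbers are passed in (each assumed uniform in [-1, 1]) so that the example stays deterministic.

```c
#include <stddef.h>
#include <string.h>

#define MAX_PARAMS 16 /* hypothetical upper bound on parameters per role */

/* Best parameter set found so far for one role, kept in memory. */
typedef struct {
    double params[MAX_PARAMS];
    double fitness;
} Best;

/* Evaluation and Selection: a controller whose fitness is equal to or
 * greater than the best so far replaces the stored set. Returns 1 if
 * the controller was selected, 0 if it was discarded. */
int evaluate_and_select(Best *best, const double *params, size_t n,
                        double fitness)
{
    if (n > MAX_PARAMS || fitness < best->fitness)
        return 0;
    memcpy(best->params, params, n * sizeof(double));
    best->fitness = fitness;
    return 1;
}

/* Mutation: each new parameter is the stored best value changed by an
 * amount in [-range * value, +range * value]. */
void mutate(const Best *best, const double *range,
            const double *unit_noise, double *out, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        out[i] = best->params[i] * (1.0 + range[i] * unit_noise[i]);
}
```

Because only one stored set per role is mutated after each death, the scheme resembles a simple hill climber: a mutated controller replaces the stored set only if it performs at least as well.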


How to design the fitness function?

First of all, it is important to be aware that if one chooses the fractional configuration for the fitness function:

[Equation: fractional form of the fitness function, not recovered]

the variables should be included in the function as shown.

The three variables chosen to compose the fitness function are:

o Number of times eaten by the predator;

o Number of times that the robot fed;

o Ratio between chases in which the robot caught food and total chases that the robot performed.


Keeping the fitness model in mind, the first value has to decrease the fitness, so it goes in the denominator. The other two contribute positively to the robot’s performance, and therefore should go in the numerator of the fraction model.
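The placement of the variables can be illustrated with the sketch below. The exact formula is not given in the text, so the way the terms are combined here (a sum in the numerator and one plus the count in the denominator, to avoid division by zero) is purely an assumption, as are the names.

```c
/* Hypothetical fractional fitness. Only the placement of the variables
 * (numerator versus denominator) follows the text; the sum and the
 * "+ 1" in the denominator are assumptions. */
double fitness(unsigned times_eaten, unsigned times_fed,
               unsigned successful_chases, unsigned total_chases)
{
    double ratio = total_chases
        ? (double)successful_chases / (double)total_chases
        : 0.0;
    return ((double)times_fed + ratio) / (1.0 + (double)times_eaten);
}
```

Under this reading, being eaten once halves the fitness of a robot that had never been eaten, while every successful feed raises it.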


The parameters chosen to be included in the Evolutionary Algorithm are as follows:

• Prey’s Food Gain – gain associated with rule 20, if prey;

• Prey’s Avoid Gain – gain associated with rule 21, if prey;

• Predator’s Food Gain – gain associated with rule 20, if predator;

• Predator’s Avoid Gain – gain associated with rule 21, if predator;

• Size of “food scope” – width of the scope previously mentioned;

• Size of “avoid scope” – width of the “field of view”;

• Hungry Level – energy level below which the robot becomes hungry;

• Energy Danger Level – energy level below which the robot knows it is about to die;

• Variance Threshold – the limit that regulates the exploratory behaviour of the robot.


Each robot then has two different sets of parameters: the ones assigned to the predator and the ones assigned to the prey. Since two robots participate in the experiment, and they evolve differently, 4 sets of parameters will evolve in different ways.


Standard Configuration refers to when robot A is the predator and robot B the prey.

Swapped Configuration refers to when robot B is the predator and robot A the prey.

Results

The controllers were allowed to run for a few generations, and an interesting result came up:

[Figures: results for the Standard Configuration and the Swapped Configuration]


The prey’s fitness value increased dramatically because a prey-prey scenario occurred. Without anything to decrease its fitness value, the parameters’ path of evolution became corrupted. The environment would make the prey even less suitable to compete with the predator.

Therefore, to prevent both prey-prey and predator-predator scenarios, randomness in rebirth and cannibalism were excluded from the project.

Even so, the 4 sets of parameters all get a chance to evolve, since the role-changing still occurs when a predator is weak and catches a strong prey.

Evolution

Evolution was restarted, and tests were carried out. During the rest of the project, the controllers were left to evolve, without any interference.

The Standard Configuration prey evolved over 141 generations, and the predator over 29.

The Swapped Configuration prey evolved over 72 generations, and the predator over 19.

It is important to be aware that all sets of parameters evolve from the same seed. The original set of parameters was defined through some basic experimentation.

Results

[Figure: fitness values of each generation, Standard Configuration]

[Figure: fitness values of each generation, Swapped Configuration]

To keep it simple, the results will now focus on the Standard Configuration, which evolved over more generations.

This is how the parameters evolved in the Standard Configuration:

[Figures: evolution of the prey and predator parameters]

However, it is difficult to look at this data and draw conclusions. There are, nevertheless, some comments to be made:

• The predator develops a much narrower food scope, in order to quickly deal with an evasive prey.

• For some reason, the evolution of the Hungry Level and also of the Energy Danger Level parameters in both predator and prey is very similar. What can this mean?

• The predator’s variance threshold seems to have a tendency to become smaller, whereas the prey’s seems to be rising. Why? Maybe because the predator already performs periodic 360° sweeps, and therefore does not need to explore the maze as much as the prey.

• The avoid scope is the strange result. Since it represents the robot’s “field of view” for the objects it is trying to avoid, shouldn’t the prey develop a wide avoid scope?

The fitness clearly increased, but has the performance of the controller followed?

To check this, a battery of tests was carried out, setting up competitions between the Swapped Configuration predator/prey, the Standard Configuration predator/prey, and also the predator/prey with the original non-evolved set of parameters.

For these tests, evolution was halted. Each competition lasted approximately 20 minutes.


Major observations:

• Both predator and prey of the Standard Configuration show the best performances on all tests. The predator shows a fitness improvement of 680%, and when competing against a non-evolved prey it did not even die. The prey shows a smaller improvement, of around 66%.

• Both Standard and Swapped Configurations appear to have improved the controllers’ performance. The Swapped Configuration, having had less time to evolve, had a milder improvement in performance, of about 115% for the predator, and the prey actually suffered a 3% decrease in performance. This may be because the number of generations is simply too small.

• Also interesting is the fact that the Standard Configuration prey only dies once without eating when competing against a non-evolved predator. This is a major achievement.

• There are also some parameters that evolve in totally different ways in each configuration. For instance, the prey’s avoid gain and avoid scope size have a tendency to become large in the Swapped Configuration, and an opposing tendency to become small in the Standard Configuration. Comparing the performances, maybe it is better for the prey to have no fear of the predator.

• The Standard Configuration predator (the best) has also developed a food scope about 5 times smaller than the Swapped Configuration one. That may prove to be decisive in improving its performance.

Conclusions

Implementing a behaviour-based robotic controller using only a set of rules proved to be a clean and simple way to solve problems such as Obstacle Avoidance and Exploring, or even chasing a target.

The global behaviour of the robot is made up of small individual behaviours, competing and cooperating between themselves.

The image processing part was the bottleneck of the project. A lot of time was spent on finding alternative solutions to the problem.

It is also the most sensitive part of the system. If lighting conditions change significantly, the Object Identification might yield false results, compromising the predator-prey task.

A larger number of generations would have produced a larger amount of data, which would have been useful for drawing more conclusions.

On one pair of parameter sets, the performance of both predator and prey improved, thus confirming the Red Queen Effect.


If the maze changes, the parameters should be reset, and evolution should make the robots adapt to new conditions.

Without a map of their surroundings, the controllers are unable to come up with new strategies, and they are also unable to calculate either an absolute or a relative localization in the maze.

Without localization, they cannot effectively understand their surroundings. Adaptation to the environment is therefore not really possible, and the robots end up adapting to each other instead.

Future Work

To add further constraints to the robot’s movement, one needs only to include new rules, and incorporate them in the functions that control the actuators.

The program is ready to incorporate a GRN.

A camera with improved quality could really boost the performance of the controllers, and allow for simpler processing.

A third robot might produce some very interesting developments in the project. With 2-on-1 scenarios, the fitness values might change completely, and even help speed up evolution.

Changes to the fitness function could also speed up evolution. For instance, rewarding a prey for the amount of time it stays alive.

The End

That’s it, thank you for your attention!
