Artificial Neural Networks for Robot Control

Neural Networks 15/16

Why use ANNs for robotics?

All (!) we need for robot control is some method of transforming sensory input into motor output, so why use ANNs?

1. Argument from existence proof: the most successful adaptive machines we know of have some form of neural network.

While our nets are impoverished imitations of nature, this supports the idea that networks of relatively simple units can generate adaptive behaviour over time.

If we want to reproduce similar types of adaptivity, it might seem sensible to start from similar types of system.

[Figure: an ANN maps sensory data (input data) to motor outputs (outputs)]

2. ANNs are extremely flexible, with many ways in which the network can be modified (from changing weights to changing the entire architecture).

3. ANNs are well suited to incorporating mechanisms such as lifetime learning, potentially enabling agents to adapt to changing environments

4. ANNs can take input from a variety of sources, including both continuous and discrete sensor readings, and similarly produce discrete or analog motor outputs

5. Memory can easily be incorporated into the network by retaining activity over time.

6. ANNs are quite robust to noisy input data as would be expected from sensor data from non-trivial environments

Training procedures

The majority of the training methods we have seen are supervised learning methods: this implies the existence of a target output for each input pattern

This is true of some robot control tasks where desired behaviour is specified precisely eg Case Western cockroach: used supervised techniques to set parameters to produce a particular gait

However, consider a robot navigating to a target: don’t necessarily know ‘correct’ trajectory; trajectory will depend on starting conditions; environment could be dynamic; we need output over TIME: inherently difficult to do function approximation over time and also how do we know at which point we went wrong???

Also behaviour may be reliant on sensory data from previous time-steps: ANNs for robot control are dynamical systems changing over time.

Also gradient descent techniques require continuously differentiable functions, thus focus is on feedforward fully-connected nets (though limited recurrence is possible) and varying continuous variables (weights)

Not always possible to differentiate error term

Also consider the change in error from eg removing a node in a network: all or nothing procedure and could be vastly disruptive. VERY bad for gradient descent

Can use unsupervised methods like reinforcement learning but generally need short trials and arenas where majority of possible inputs are available, eg robot football: shooting

Use Evolutionary Algorithms!

Eg GAs, simulated annealing, evolutionary strategies, genetic programming, evolutionary programming etc (as long as

Good points: Avoids many of the problems of gradient descent as we only need to know network’s overall performance. Also can incorporate lifetime learning, changes of architecture etc and can work with wide range of network models (as long as genetic operators can be defined): VERY useful as we do not know what types of network are good

Here we will focus on GAs (Sussex - and my - bias)

NOT a panacea for all search problem ills: introduce a whole host of other problems which we shall explore

Not the only approach eg hill climbing, net crawling etc

Basic GA

In a GA we have a population of genotypes which encode potential solutions to the problem

Every generation they are all tested on the problem and assigned a score based on their success known as their fitness

Offspring are created for the next generation of solutions via recombination between two parents, followed by application of a mutation operator

The probability of a genotype being picked as a parent is proportional to its fitness (or to its fitness rank within the population)

Artificial natural selection where traditionally recombination emphasised as the driving force of evolution

However recent work has focussed on role of mutation

- Initialise population of N genotypes

- Evaluate initial genotype fitnesses

- Repeat until termination criteria met:

  - Repeat until N offspring placed in new population:

    - Two parents P selected probabilistically (proportional to fitness)

    - Two offspring O created through recombination of P

    - Offspring O mutated

    - Offspring O evaluated

    - Fittest offspring placed in new population

  - Replace current population with new population

There are an enormous number of variations on the canonical genetic algorithm above, many of which blur the distinction between GAs and other evolutionary algorithms
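The canonical loop can be sketched in a few lines of Python. This is only an illustration under stated assumptions: genotypes are bit strings, fitness is non-negative, recombination is one-point crossover and mutation is a per-bit flip. The function name `run_ga` and all parameter values are hypothetical, not taken from the lecture.

```python
import random


def run_ga(fitness, genome_len=16, pop_size=20, generations=50,
           mutation_rate=0.05, seed=0):
    """Canonical GA over bit strings. Returns (best_score, best_genotype).

    Assumes `fitness` returns a non-negative number for a list of 0/1 values.
    """
    rng = random.Random(seed)
    # Initialise population of N genotypes and evaluate them
    pop = [[rng.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    scores = [fitness(g) for g in pop]
    best = max(zip(scores, pop))

    for _ in range(generations):
        new_pop, new_scores = [], []
        # Tiny offset keeps the selection weights valid if all fitnesses are 0
        weights = [s + 1e-9 for s in scores]
        while len(new_pop) < pop_size:
            # Two parents selected with probability proportional to fitness
            p1, p2 = rng.choices(pop, weights=weights, k=2)
            # Two offspring created by one-point recombination
            cut = rng.randrange(1, genome_len)
            children = [p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]]
            # Offspring mutated by independent bit flips
            children = [[1 - b if rng.random() < mutation_rate else b
                         for b in child] for child in children]
            # Offspring evaluated; the fitter one enters the new population
            child_scores = [fitness(c) for c in children]
            i = 0 if child_scores[0] >= child_scores[1] else 1
            new_pop.append(children[i])
            new_scores.append(child_scores[i])
        # Replace current population with new population
        pop, scores = new_pop, new_scores
        best = max(best, max(zip(scores, pop)))
    return best
```

For example, `run_ga(sum)` maximises the number of 1s in the string. For robot control the `fitness` argument would instead decode the genotype into a network and score the resulting behaviour.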

Genetic operators: mutation and crossover

The genetic operators used to create offspring must do 2 things:

There must be a significant amount of similarity between the parents and offspring, heredity, so as to allow exploitation of current solution

There must be variation in order for evolution to discover new solutions: exploration of nearby areas

Operators depend crucially on the solution representation used, cf binary bit-flips vs real-valued Gaussian mutations.

Also want to generate viable solutions (cf telecoms networks must be connected) so genetic operators must be matched to the problem at hand

[Figure: crossover and mutation illustrated on 8-bit strings]
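To make the dependence on representation concrete, here is a minimal sketch of one-point crossover together with the two mutation styles mentioned above (binary bit-flips and real-valued Gaussian perturbations). The function names and the choice of one-point crossover are illustrative assumptions.

```python
import random


def one_point_crossover(a, b, rng):
    """Swap the tails of two equal-length genotypes at a random cut point."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def bit_flip_mutation(genotype, rate, rng):
    """Binary representation: flip each bit independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in genotype]


def gaussian_mutation(genotype, sigma, rng):
    """Real-valued representation: add zero-mean Gaussian noise to every gene."""
    return [g + rng.gauss(0.0, sigma) for g in genotype]
```

A small `rate` or `sigma` keeps offspring close to their parents (heredity), while larger values give more variation (exploration).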

Solution representation: Encoding schemes

Before we design our genetic operators, must first decide how to represent problem solutions

GAs use a string of values (binary, real, letters etc) which must be used to encode all the parameters of the network

Split into 2 styles (though really a continuum of types): direct and indirect

Direct Encoding

Direct schemes code all parameters directly. Eg take matrix of weight values (0 for no connection) and write out as one long string. Can work well. Grows with network size.

[Figure: network with inputs x1 … xn and one sub-network circled]

However, can be bad with respect to heredity

Eg what if circled bit is ‘good’ bit of network: how can we retain this bit without taking the other nodes?

Also can have problems if we want networks to grow and shrink …
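A minimal sketch of direct encoding, assuming the network is described only by a weight matrix stored as a list of rows, with 0 meaning no connection. The helper names `encode` and `decode` are hypothetical.

```python
def encode(weights):
    """Write the weight matrix out row by row as one long genotype."""
    return [w for row in weights for w in row]


def decode(genotype, n_rows, n_cols):
    """Rebuild the weight matrix from a directly encoded genotype."""
    return [genotype[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]


# A 3-node network; 0.0 marks an absent connection
weights = [[0.0, 0.8, 0.0],
           [-0.3, 0.0, 1.2],
           [0.0, 0.0, 0.5]]
genotype = encode(weights)
```

The genotype for a fully specified n-node network has n*n genes, which is the sense in which this encoding grows with network size.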

Problems of variable length genotypes

Often want networks to have the capacity to grow and shrink, thus we will have genotypes of different lengths in our search spaces

Can cause many problems eg if using direct encoding and basic crossover could get 5x5 matrix crossed with 2x3 matrix at position 19 …

Also get problems with connections: if above matrices are crossed at position 5, ie 1st 5 from 2x3 matrix, these weights are now all weights to neuron 1: this is NOT what they were in the 2x3 matrix which has no knowledge of extra nodes of the 5x5

While these are not insurmountable they illustrate the deeper problem that children are unlike parents
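The 5x5 versus 2x3 example can be reproduced directly. The sketch below assumes row-by-row direct encoding and a basic crossover that takes the head of one genotype and the tail of the other; the gene values are made up so that each gene's origin is easy to see.

```python
def flatten(matrix):
    """Direct encoding: write the matrix out row by row."""
    return [w for row in matrix for w in row]


big = [[10 * r + c for c in range(5)] for r in range(5)]   # 5x5, 25 genes
small = [[-1, -2, -3],
         [-4, -5, -6]]                                      # 2x3, 6 genes
g_big, g_small = flatten(big), flatten(small)

# A cut at position 19 is valid for the 5x5 parent but lies beyond
# the end of the 2x3 parent's genotype
cut_19_is_valid_for_small = 19 < len(g_small)

# Cut at position 5: first 5 genes from the 2x3 parent, rest from the 5x5
cut = 5
child = g_small[:cut] + g_big[cut:]
child_matrix = [child[r * 5:(r + 1) * 5] for r in range(5)]

# In the 2x3 parent, genes -1..-3 were row 0 and -4, -5 were row 1.
# In the child they all sit in row 0, so their meaning has changed.
first_row = child_matrix[0]
```

Here `first_row` is `[-1, -2, -3, -4, -5]`: weights that belonged to two different neurons in the small parent are all read as weights of the first neuron in the child.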

Indirect Encoding

Can avoid some of these problems by use of indirect encoding schemes

Various forms: developmental schemes (where genotype encodes growth process of phenotype), cellular encoding, tree structures and various wild and wacky ideas

Can be useful in eg getting heredity or in getting repeated self structure (cf Gruau cellular encoding: could replicate network features n-fold)

Often developed task-specifically

One problem is that while fitness is evaluated on phenotype, movement of population is in genotype space: extra layer of complexity in working out eg how crossover affects phenotype

Some Examples

GasNet encoding scheme: to allow for problems of variable length genotypes and to allow nodes to have similar properties across neurons have used a spatial connection scheme. In this way, node x will always connect to nodes in the same region of the plane. Also node properties kept together so crossover can’t mess up a node

2 styles of encoding a telecoms network: Indirect – genotype encoded sets of 2x2 matrices which represented eigenvalues of dynamical systems, ie sets of attractors in 2d space. Connections sent out from attractors and attracted to others to form connectivity. Also had self similarity operator. Result?

Rubbish! Didn’t work very well at all. Chaotic dynamical system so smallest change of genotype could in principle change whole network. No heredity.

Other style semi-direct: want to ensure connected network so made basic genotype a minimum spanning tree (ensures there’s a path from every point to every other in least amount of connections). Then added in extra connections. Result? OK

Not sure about heredity with respect to spanning tree, but had nice crossover operator in general: add a connection to child with probability 0.9 if both parents have it, and 0.4 if only one has it. By varying probabilities can get smaller/bigger nets
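The connection-level crossover just described might look like the sketch below. It assumes each parent is represented as a set of connections (pairs of node ids) and it deliberately ignores the spanning-tree part of the genotype, so it does not by itself guarantee a connected network. The function name and data layout are assumptions.

```python
import random


def connection_crossover(parent_a, parent_b, p_both=0.9, p_one=0.4, rng=None):
    """Build a child connection set from two parent connection sets.

    A connection present in both parents is inherited with probability
    `p_both`; one present in only one parent with probability `p_one`.
    """
    rng = rng or random.Random()
    child = set()
    for conn in sorted(parent_a | parent_b):
        in_both = conn in parent_a and conn in parent_b
        if rng.random() < (p_both if in_both else p_one):
            child.add(conn)
    return child
```

Raising or lowering the two probabilities biases offspring towards bigger or smaller networks, as noted above.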

Fitness functions

Need a fitness function to evaluate robot’s performance: bit of a black art

Problematic as it is of central importance to evolutionary computation methods; if there is little chance of differentiating between good and bad solutions, the evolutionary process cannot hope to succeed. Basically defines the search space

Ideally, there should be smooth paths in the problem space leading to the optimal solutions but in reality may not be possible

Basically one should ensure that there is a gradient for evolution to follow and avoid having large local optima (though this process is generally post-hoc, ie make fitness function, population gets stuck, design new fitness function which avoids local optimum, population gets stuck again). Repeat ad nauseam till you regret ever criticising wonderful gradient-descent techniques

Noisy fitnesses

Also often have noisy fitness, often due to not being evaluated over exactly the same conditions (eg where sample sets for training are potentially huge, so fitness evaluated over some smaller set)

Eg robot fitnesses are often highly dependent on the initial conditions

Alternatively environment could be a source of noise

Typically this noise will obscure differences between the fitnesses of neighbouring solutions, reducing the performance of the evolutionary process (although sometimes the noise can be helpful, eg to allow populations to escape from local optima, or to smooth the search space)
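How badly noise obscures the difference between neighbouring solutions can be estimated with a toy simulation. The sketch assumes two solutions with nearly equal true fitness and independent Gaussian evaluation noise; all numbers are invented for illustration.

```python
import random


def misranking_rate(true_a=0.50, true_b=0.52, noise_sd=0.1,
                    trials=10000, seed=0):
    """Fraction of noisy evaluations in which the worse solution (a)
    scores higher than the better one (b)."""
    rng = random.Random(seed)
    wrong = 0
    for _ in range(trials):
        a = true_a + rng.gauss(0.0, noise_sd)
        b = true_b + rng.gauss(0.0, noise_sd)
        if a > b:
            wrong += 1
    return wrong / trials
```

With these values the worse solution wins a little under half of the comparisons, so a single evaluation gives selection almost no information about which of the two is better.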

Selection

Related to evaluation of fitness is the selection applied to solutions

If there is not sufficient selective pressure to drive the evolutionary process to better parts of the landscape, much time will be spent evaluating solutions of poor fitness.

By contrast, if the selection pressure is too strong the evolutionary process will halt at the first local optimum reached, with little chance of escaping.

Highlights the conflict between allowing exploration of the problem space, and exploitation of local regions of the space
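One way to make selection pressure an explicit knob is rank-based selection. The sketch below uses linear ranking, a standard scheme that the lecture does not itself specify; the `pressure` parameter (between 1 and 2) is an assumption introduced for illustration.

```python
def linear_rank_probabilities(fitnesses, pressure=1.5):
    """Selection probability of each individual under linear ranking.

    pressure = 1.0 gives uniform selection (no selective pressure);
    pressure = 2.0 gives the worst individual probability 0.
    Requires at least two individuals.
    """
    n = len(fitnesses)
    # Indices ordered from worst (rank 0) to best (rank n - 1)
    order = sorted(range(n), key=lambda i: fitnesses[i])
    probs = [0.0] * n
    for rank, i in enumerate(order):
        probs[i] = (2 - pressure) / n + 2 * rank * (pressure - 1) / (n * (n - 1))
    return probs
```

Low pressure favours exploration, since poor solutions still reproduce; high pressure favours exploitation of the current best region.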

Search spaces

As stated earlier, the fitness function defines the search space of the problem we are looking at

Search space is N-dimensional where N is (maximum!) length of bit string

How we move through the search space is defined by our genetic operators (recombination and mutation).

Search space can therefore be seen to be a connected graph where connected points are those that can be reached by crossover and mutation

Also depending on operators will be more likely to get to certain destinations

If operators are well designed in terms of heredity should be able to get to all nearby areas of space

Fitness landscapes

Often search space viewed as an N+1 dimensional landscape where extra dimension is the fitness, eg bit string of length 2 gives us nice landscape below

Can be a useful metaphor (despite Inman’s protestations) but ONLY if you reject all cosy notions of local maxima and minima

Eg GasNet search-space average of 200 dimensions: lots of places to go

Also standard mutation operator can in principle take us to any part of the space

Also have addition/deletion of nodes: difficult to view as movement

Also, noisy fitness: how to define maxima if fitness is a distribution???

Epistasis, ruggedness and local optimality

If fitness is dependent on a non-linear combination of the genotype loci, the genotype is said to be epistatically-linked.

Ie individual locus fitnesses are dependent on the context of other loci values and inter-locus interactions

This will generally be the case for ANN robot controllers

Epistatically-linked genotypes give rise to the two major properties of fitness landscapes thought to influence search dynamics, ruggedness and local optimality.

Ruggedness is regarded as similar to fitness noise, where direction to good solutions may be obscured by local noise

By contrast, local optimality is typically thought of in more global terms, with landscapes containing numbers of deceptive peaks

However there is no rigorous distinction between the two properties
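A two-locus toy example shows what epistatic linkage means in practice. Both fitness functions below are invented for illustration: in the additive one each locus contributes independently, in the epistatic one the effect of a locus depends on the value of the other.

```python
from itertools import product


def additive(g):
    """Each locus contributes independently to fitness."""
    return 1.0 * g[0] + 2.0 * g[1]


def epistatic(g):
    """Fitness depends on a non-linear combination of the loci (XOR)."""
    return 1.0 if g[0] != g[1] else 0.0


def flip_effects(fitness, locus, n_loci=2):
    """Fitness change from setting `locus` from 0 to 1, listed for every
    context of the remaining loci."""
    effects = []
    for g in product([0, 1], repeat=n_loci):
        if g[locus] == 0:
            flipped = list(g)
            flipped[locus] = 1
            effects.append(fitness(flipped) - fitness(g))
    return effects


additive_effects = flip_effects(additive, 0)    # [1.0, 1.0]
epistatic_effects = flip_effects(epistatic, 0)  # [1.0, -1.0]
```

In the additive case setting locus 0 always helps by the same amount; in the epistatic case the same change is good in one context and bad in the other, which is what makes the landscape harder to climb.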

Search space properties

Neutrality

Global vs local optima

Smooth vs epistatic

Other?

Neutrality

Recently much work has gone into analysing neutrality of landscapes (eg RNA, nkp, evolvable hardware and some robotics)

ie landscapes where one can move to points of equal fitness: moving along a neutral network

Evolution on fitness landscapes with high levels of neutrality is characterised by periods when fitness does not increase (fitness epochs) interspersed by short periods of rapid fitness increase (epochal evolution or punctuated equilibrium)

Work on adaptive evolution on neutral landscapes has shown that populations tend to move to areas of space which have more neutral neighbours, ie the neutral evolution of robustness

Neutrality may be of use in escaping from (nearly) locally-optimal solutions, but in practice in high dimensional spaces, quite hard to tell if one is moving neutrally or hovering around a local optimum

Neuron Models

Many types of neuron model used in robotics

CTRNNs: based on leaky integrator neuron model from computational neuroscience

Spiking models: similar to above but with a spike generated when activation reaches a threshold

GasNet models: incorporate an abstract notion of a diffusible neuromodulator into an ANN

Firing rate based models

Etc …..
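As an example of one of these models, a CTRNN can be simulated with a simple Euler step. The slides do not give the equations, so this sketch assumes the commonly used leaky-integrator form tau_i dy_i/dt = -y_i + sum_j w_ji sigma(y_j + theta_j) + I_i with a logistic sigma; the function name and argument layout are assumptions.

```python
import math


def ctrnn_step(y, weights, biases, taus, inputs, dt=0.01):
    """One forward-Euler step of a CTRNN.

    y       : current node states
    weights : weights[j][i] is the weight from node j to node i
    biases  : per-node bias terms (theta)
    taus    : per-node time constants
    inputs  : external input to each node
    """
    n = len(y)
    # Firing rate of each node: logistic function of state plus bias
    out = [1.0 / (1.0 + math.exp(-(y[j] + biases[j]))) for j in range(n)]
    new_y = []
    for i in range(n):
        net = sum(weights[j][i] * out[j] for j in range(n))
        dydt = (-y[i] + net + inputs[i]) / taus[i]
        new_y.append(y[i] + dt * dydt)
    return new_y
```

The leak term `-y[i]` pulls each state back towards zero at a rate set by its time constant, which is what gives the network memory over time.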

However, how can we decide what type of neuron model to use for a particular task?

Similarly, how do we know if we have good fitness functions/recombination operators to use in conjunction with our neuron model?

Can use intuition, or try several combinations. But will our results tell us what we want to know about the problem we were working on?

• Why did a particular neuron/GA combination work well?

• Were our intuitions correct?

• What are the implications for generating a more successful model?

An Example: GasNet evolution

GasNet evolves faster than NoGas over range of recombination schemes …

… and mutation rates … and connection architectures … and robotic problems …

??WHY??

GasNets Background

• Classically neurotransmission is viewed as occurring point-to-point at the synapse, i.e. locally

• Occurs over a short temporal scale

• Overriding metaphor is electrical nodes connected by wires

• Inspiration for standard connectionist ANN

Neuromodulation by nitric oxide (NO)

Recently neuromodulatory gases have been discovered (NO, CO, H2S). By far the most studied is NO

• Small and non-polar: freely diffusing

• Act over a large spatial scale: volume signalling

• Act over a wide range of temporal scales (ms to years)

• Modulatory effects

• Interaction between neurons not connected synaptically

• Loose coupling between the 2 signalling systems (electrical and chemical), i.e. neurons that are connected electrically are not necessarily affected by the gas and vice versa.

new style of ANN?

Inspiration for new form of ANN: GasNets

1. Node emits 1 of 2 ‘gases’ due to high electrical activity or high gas concentration

2. Computationally fast, crude diffusion method, but space and time crucial, local processes

3. Gases diffuse through the network and alter slope of transfer functions of other neurons in concentration-dependent manner.

4. Gas 1 increases gain, gas 2 decreases gain.

$$O_j^t = \tanh\left[ k_j^t \left( \sum_i w_{ij} O_i^{t-1} + I_j \right) + b_j \right]$$
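The node equation (output = tanh of gain times summed weighted input plus external input, plus bias) translates directly into code. How the gas concentrations set the gain k is not given on this slide, so in this sketch the gain is simply passed in; the function name and argument layout are assumptions.

```python
import math


def gasnet_output(j, prev_outputs, weights, gains, inputs, biases):
    """Output of node j at time t.

    prev_outputs : node outputs at time t-1
    weights      : weights[i][j] is the weight from node i to node j
    gains        : current transfer-function gain k of each node
                   (in a GasNet this would be modulated by gas concentration)
    inputs       : external input I to each node
    biases       : bias b of each node
    """
    net = sum(weights[i][j] * prev_outputs[i]
              for i in range(len(prev_outputs)))
    return math.tanh(gains[j] * (net + inputs[j]) + biases[j])


# Same network state, two different gains for node 1
prev = [0.5, 0.0]
w = [[0.0, 1.0],
     [0.0, 0.0]]
low_gain = gasnet_output(1, prev, w, [1.0, 0.5], [0.0, 0.0], [0.0, 0.0])
high_gain = gasnet_output(1, prev, w, [1.0, 2.0], [0.0, 0.0], [0.0, 0.0])
```

With identical electrical input, `high_gain` is larger than `low_gain`: a gas that raises the gain steepens the node's response without touching any synaptic weight.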

Analysis of search space properties

If networks evolve faster they must, in some sense, be making the space of solutions easier to search in (smoother? More densely packed with good solutions? Fewer optima? More neutral??)

Analysis will hopefully tell us what the search space properties are like and what features of our networks are ‘good’ for search

Also can help us to understand the dynamics of an EA search through high-dimensional space: not well understood

Hopeful approach since many of the intuitions we have about how EAs search spaces have come from such analyses

Eg work on nk, nkp and royal road landscapes etc have attempted to address the role of neutrality, crossover vs mutation and much more

Properties to examine:

Neutrality

Global vs local optima

Smooth vs rugged

Other?

However…

Abstract mathematical landscapes like nk and nkp are generally designed to have tunable ruggedness, neutrality, local modality etc.

Real-world problems have no direct link between solution architecture and landscape properties …

… And maybe no understandable link between landscape properties and evolutionary dynamics (how does adding virtual gas affect neutrality??? What is neutrality in a noisy space????)

We’ve found no explanation for GasNet evolvability in terms of fitness landscape properties (partly because 99.9% of real spaces evaluate to 0 fitness meaning standard measures see them as a homogeneous flat landscape)

Eg mutational robustness is the same

What about functional analysis?

• Oscillator sub-networks very commonly evolve

• Node 5 provides electrical stimulation to RMnode, this causes gas1 emission from RMnode, this causes gas2 emission from node 5, this reduces gain of RMnode which decreases activity which shuts off gas emission from RMnode and hence from node 5 … and cycle begins again

[Figure panels: RMnode when gain low; RMnode when gain high]

Interaction of chemical and electrical provides transition between 2 regimes with diffusion controlled transition timings

GasNet and NoGas “Timers” (1):

[Figure: ‘timer’ and ‘bright object finder’ sub-circuits, linked by an ‘inhibit’ connection]

Timer sub-circuits naturally become active before object finder circuits

GasNet and NoGas “Timers” (2):

GasNet timer = build-up of gas concentration. Simple architecture, mechanisms easily tuned?

NoGas timer = 3 fully connected nodes. Convoluted architecture, difficult to tune?

Re-evolution in environment with changed time scales (func =)

                     GasNet (x2)   NoGas         GasNet (x0.25)   NoGas
Num cases            20            20            20               20
Mean re-eval fitn.   0.17 (0.07)   0.15 (0.06)   0.36 (0.1)       0.21 (0.02)
Mean re-evol gens    10 (5)        409 (336)     30 (31)          591 (346)
Median re-evol       10            360           19               608

Re-evolution in environment with changed time scales (sample)

                     GasNet (x2)   NoGas         GasNet (x0.25)   NoGas
Num cases            20            20            20               20
Mean re-eval fitn.   0.27 (0.13)   0.26 (0.18)   0.35 (0.27)      0.29 (0.19)
Mean re-evol gens    107 (190)     240 (363)     108 (229)        116 (252)
Median re-evol       36            49            13               21

Conclusions

• It seems that easy temporal adaptivity is an important feature in GasNet evolvability

• Dynamics used in surprising ways (e.g. sensory noise filters), so important even in ‘reactive’ tasks

• Evolution and re-evolution of various kinds of rhythmic networks backs this up

But…

• … not whole story

• Have investigated several GasNet variants which improve on the performance of the original

• They have same temporal properties so what is going on there??

• Maybe need to look at more phenotypic properties (eg coupling of gaseous and electrical signalling mechanisms)

• To do this must introduce GasNets and variants

Diffusion in Original GasNets

1. Gas cloud centred on emitting node and builds up linearly with time (to a maximum) at a genetically specified rate

2. Gas varies spatially as an inverse exponential:

exp(-d2/r).

Based on the spatial distribution of NO produced by a single spherical neuron

• However, initial gas diffusion model was intentionally simplistic

• Cannot capture the rich range of spatio-temporal properties seen in real systems

• Therefore decided to develop 2 new versions incorporating aspects of NO signalling seen in nervous systems

• Will hopefully lead to more powerful/evolvable robotic systems

• Also tests the potential utility of the features in real nervous systems

Mammalian cortical plexus

• NO involved in mediating link between neural activity and blood flow

• Fibres are too small to generate an effective NO signal individually

• NO from many fibres summate: NO signal different to that from single neuron

• Fineness of fibres leads to a uniform signal

• Combined effect maintains high concentration levels over large volume

• Also delay until fibres interact serves to act as a noise filter: plexuses of fine fibres signal persistent neuronal activity to blood vessels

Plexus GasNet model

1. Gas cloud has uniform spatial distribution

2. Gas clouds centred in a genetically specified position in the network plane

Receptor GasNet

• In nervous systems, only neurons with one of the receptors for NO will be affected

• Each node may have quantities of receptors (none, medium, maximum)

• Receptor specific modulations

[Equation not recoverable from the source; its labelled terms were gas concentration and receptor concentration]

Receptor GasNet modulations

• Increase gain

• Decrease gain

• Activation includes a proportion of previous activation

• Transfer function switched

One that was particularly successful was using only one gas which increased gain

Example experiment

Task: evolution of visual object discrimination under noisy lighting with minimal vision

Experiments

Discrimination between: [pictures of the object shapes not recoverable from the source]

Compared: Original GasNet, Receptor GasNet and Plexus GasNet

Evolvability results (expt 1)

            Original      Plexus        Receptor
Num Runs    40            40            40
Mean (sd)   3042 (3681)   1579 (2609)   260 (161)
Median      1201          512           158
Best        136           101           46
Worst       >10000        >10000        840

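Evolvability results (expt 2)

            Original      Plexus        Receptor
Num Runs    40            40            40
Mean (sd)   7642 (3778)   1979 (987)    380 (246)
Median      5784          1312          298
Best        [digits fused in source as 68501831, in the order Receptor, Plexus, Original; split unclear]
Worst       >10000        >10000        3540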

Why do new versions improve evolvability?

In initial GasNet model genotype to phenotype mapping means that there’s quite a tight coupling between the electrical and chemical processes

Electrical connections depend on spatial organisation of nodes

Also, gas diffuses locally to all nearby neurons => if electrically coupled likely to be chemically coupled

New models allow for a more flexible coupling

Coupling (expt 1)

                Original       Plexus         Receptor
Num succ runs   33             37             40
Elec conn       1.89 (0.52)    1.72 (0.41)    1.61 (0.3)
Gas conn        2.27 (0.93)    2.78 (0.84)    2.1 (0.7)
Overlap         40.5% (13.2)   10.8% (8.1)    11.4% (5.5)

Evolvability and coupling

• Gardner and Ashby (70) showed that random systems of multiply connected weakly interacting components or sparsely connected strongly interacting components are more likely to be stable

• New GasNets incorporate flexibly coupled processes with distinct characteristics (electrical, chemical)

• Helps to satisfy the conflicting pressures for genotypic instability and phenotypic stability needed for successful evolution (Conrad, 90)

• Allows non-destructive “tuning” of functionality of one against the other (eg elec does bright object, chem does discrimination)

Discussion + future directions

• 2 new models presented which significantly improve network evolvability

• Both systems are flexibly coupled

• Clearly not the whole story

• A better measure of the degree of coupling is needed (allow a continuum of coupling)

For more detail (on any of past 2 lectures) see:

http://www.cogs.susx.ac.uk/ccnr
