Artificial Neural networks for Robot Control
Neural Networks 15/16
Why use ANNs for robotics?
All (!) we need for robot control is some method of transforming sensory input into motor output so why use ANNs?
1. Argument from existence proof: the most successful adaptive machines we know of have some form of neural network.
While our nets are impoverished imitations of nature, this supports the idea that networks of relatively simple units can generate adaptive behaviour over time.
If we want to reproduce similar types of adaptivity, it might seem sensible to start from similar types of system.
[Figure: Sensory data (input data) → ANN → Motor outputs (outputs)]
2. ANNs are extremely flexible, with many ways in which the network can be modified (from changing weights to changing the entire architecture).
3. ANNs are well suited to incorporating mechanisms such as lifetime learning, potentially enabling agents to adapt to changing environments
4. ANNs can take input from a variety of sources, including both continuous and discrete sensor readings, and similarly produce discrete or analog motor outputs
5. Memory can easily be incorporated into the network through retaining activity over time
6. ANNs are quite robust to noisy input data as would be expected from sensor data from non-trivial environments
Training procedures
Majority of the training methods we have seen are supervised learning methods: this implies the existence of a target output for each input pattern
This is true of some robot control tasks where desired behaviour is specified precisely eg Case Western cockroach: used supervised techniques to set parameters to produce a particular gait
However, consider a robot navigating to a target: don’t necessarily know ‘correct’ trajectory; trajectory will depend on starting conditions; environment could be dynamic; we need output over TIME: inherently difficult to do function approximation over time and also how do we know at which point we went wrong???
Also behaviour may be reliant on sensory data from previous time-steps: ANNs for robot control are dynamical systems changing over time.
Also gradient descent techniques require continuously differentiable functions, thus focus is on feedforward fully-connected nets (though limited recurrency is possible) and varying continuous variables (weights)
Not always possible to differentiate error term
Also consider the change in error from eg removing a node in a network: all or nothing procedure and could be vastly disruptive. VERY bad for gradient descent
Can use methods that need no target outputs, such as reinforcement learning, but these generally need short trials and arenas where the majority of possible inputs are available, eg robot football: shooting
Use Evolutionary Algorithms!
Eg GAs, simulated annealing, evolutionary strategies, genetic programming, evolutionary programming etc
Good points: Avoids many of the problems of gradient descent as we only need to know network’s overall performance. Also can incorporate lifetime learning, changes of architecture etc and can work with wide range of network models (as long as genetic operators can be defined): VERY useful as we do not know what types of network are good
Here we will focus on GAs (Sussex - and my - bias)
NOT a panacea for all search problem ills: introduce a whole host of other problems which we shall explore
Not the only approach eg hill climbing, net crawling etc
Basic GA
In a GA we have a population of genotypes which encode potential solutions to the problem
Every generation they are all tested on the problem and assigned a score based on their success known as their fitness
Offspring are created for the next generation of solutions via recombination between two parents, followed by application of a mutation operator
The probability of a genotype being picked as a parent is proportional to its fitness (or to its fitness rank within the population)
This is artificial natural selection, in which recombination has traditionally been emphasised as the driving force of evolution
However, recent work has focussed on the role of mutation
- Initialise population of N genotypes
- Evaluate initial genotype fitnesses
- Repeat until termination criteria met:
    - Repeat until N offspring placed in new population:
        - Two parents P selected probabilistically (proportional to fitness)
        - Two offspring O created through recombination of P
        - Offspring O mutated
        - Offspring O evaluated
        - Fittest offspring placed in new population
    - Replace current population with new population
There are an enormous number of variations on the canonical genetic algorithm above, many of which blur the distinction between GAs and other evolutionary algorithms
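The canonical GA steps above can be sketched as follows. This is only a minimal illustration on bit strings: the function name `run_ga`, the default parameter values, the one-point recombination and the fixed generation count used as the termination criterion are all assumptions made for the example, not details taken from the lecture.

```python
import random

def run_ga(fitness, genome_len=16, pop_size=20, generations=30,
           mutation_rate=0.05, seed=0):
    """Canonical generational GA on bit strings (illustrative sketch)."""
    rng = random.Random(seed)
    # Initialise population of N genotypes and evaluate them.
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    scores = [fitness(g) for g in population]
    for _ in range(generations):  # assumed termination criterion
        new_population, new_scores = [], []
        while len(new_population) < pop_size:
            # Two parents picked with probability proportional to fitness
            # (the small constant keeps the weights valid if all scores are 0).
            weights = [s + 1e-9 for s in scores]
            p1, p2 = rng.choices(population, weights=weights, k=2)
            # One-point recombination gives two offspring.
            cut = rng.randrange(1, genome_len)
            offspring = [p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]]
            # Bit-flip mutation, then evaluation.
            offspring = [[1 - b if rng.random() < mutation_rate else b
                          for b in child] for child in offspring]
            child_scores = [fitness(child) for child in offspring]
            # Only the fitter of the two offspring enters the new population.
            best = 0 if child_scores[0] >= child_scores[1] else 1
            new_population.append(offspring[best])
            new_scores.append(child_scores[best])
        # Replace current population with new population.
        population, scores = new_population, new_scores
    top = max(range(pop_size), key=lambda i: scores[i])
    return population[top], scores[top]
```

For example, `run_ga(sum)` evolves bit strings towards all ones, since `sum` counts the ones in a genotype. Fitness values must be non-negative for the proportional selection step to make sense.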
Genetic operators: mutation and crossover
The genetic operators used to create offspring must do 2 things:
There must be a significant amount of similarity between the parents and offspring, heredity, so as to allow exploitation of current solution
There must be variation in order for evolution to discover new solutions: exploration of nearby areas
Operators depend crucially on the solution representation used, cf binary bit-flips vs real-valued Gaussian mutations.
Also want to generate viable solutions (cf telecoms networks must be connected) so genetic operators must be matched to the problem at hand
[Figure: bit-string example in which two parent strings are recombined by crossover and the offspring is then mutated]
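To make the dependence of operators on representation concrete, here is a small sketch of a bit-flip mutation for binary genotypes, a Gaussian mutation for real-valued genotypes, and a one-point crossover. The function names and the `rate` and `sigma` parameters are illustrative assumptions.

```python
import random

def bit_flip_mutation(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]

def gaussian_mutation(values, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation `sigma` to each value."""
    return [v + rng.gauss(0.0, sigma) for v in values]

def one_point_crossover(a, b, rng):
    """Swap the tails of two equal-length genotypes at a random cut point."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]
```

A low `rate` or small `sigma` keeps offspring close to their parents (heredity), while larger values give more variation (exploration).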
Solution representation: Encoding schemes
Before we design our genetic operators, must first decide how to represent problem solutions
GAs use a string of values (binary, real, letters etc) which must be used to encode all the parameters of the network
Split into 2 styles (though really a continuum of types): direct and indirect
Direct Encoding
Direct schemes code all parameters directly. Eg take matrix of weight values (0 for no connection), and write out as one long string. Can work well. Grows with network size.
However, can be bad with respect to heredity
Eg what if circled bit is ‘good’ bit of network: how can we retain this bit without taking the other nodes?
Also can have problems if we want networks to grow and shrink …
[Figure: network with inputs x1 … xn, with one part circled]
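A direct encoding of this kind can be sketched as below: the weight matrix is written out row by row as one long string, and decoded by cutting the string back into rows. The names `encode` and `decode` are made up for the example, and a square matrix is assumed.

```python
def encode(weights):
    """Flatten a square weight matrix (0 for no connection) into one genotype."""
    return [w for row in weights for w in row]

def decode(genotype, n):
    """Rebuild the n x n weight matrix from a genotype of length n * n."""
    return [genotype[i * n:(i + 1) * n] for i in range(n)]
```

The genotype has n * n entries for n nodes, which is the sense in which the encoding grows with network size.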
Problems of variable length genotypes
Often want networks to have the capacity to grow and shrink thus we will have genotypes of different lengths in our search spaces
Can cause many problems eg if using direct encoding and basic crossover could get 5x5 matrix crossed with 2x3 matrix at position 19 …
Also get problems with connections: if above matrices are crossed at position 5, ie 1st 5 from 2x3 matrix, these weights are now all weights to neuron 1: this is NOT what they were in the 2x3 matrix which has no knowledge of extra nodes of the 5x5
While these are not insurmountable they illustrate the deeper problem that children are unlike parents
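The mismatch can be made visible by labelling each gene with the matrix position it came from. In this sketch (all names are illustrative, and genes are labels rather than weights) the first 5 genes of a 2x3 parent are spliced onto a 5x5 parent.

```python
def labelled_genotype(name, rows, cols):
    """Row-by-row genotype whose genes record (parent, row, column)."""
    return [(name, r, c) for r in range(rows) for c in range(cols)]

def cross_at(head_parent, tail_parent, cut):
    """Basic crossover: head of one parent followed by tail of the other."""
    return head_parent[:cut] + tail_parent[cut:]

big = labelled_genotype("A", 5, 5)    # 25 genes
small = labelled_genotype("B", 2, 3)  # 6 genes

child = cross_at(small, big, 5)
first_row = child[:5]  # read back as row 1 of a 5x5 matrix
```

Here `first_row` mixes genes from rows 0 and 1 of the 2x3 parent, so weights that belonged to two different neurons are now all read as weights of one neuron. Crossing at position 19 instead gives a child of 12 genes, which is not a square matrix at all.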
Indirect Encoding
Can avoid some of these problems by use of indirect encoding schemes
Various forms: developmental schemes (where genotype encodes growth process of phenotype), cellular encoding, tree structures and various wild and wacky ideas
Can be useful in eg getting heredity or in getting repeated self structure (cf Gruau cellular encoding: could replicate network features n-fold)
Often developed task-specifically
One problem is that while fitness is evaluated on phenotype, movement of population is in genotype space: extra layer of complexity in working out eg how crossover affects phenotype
Some Examples
GasNet encoding scheme: to allow for problems of variable length genotypes and to allow nodes to have similar properties across neurons, have used a spatial connection scheme. In this way, node x will always connect to nodes in the same region of the plane. Also node properties kept together so crossover can’t mess up a node
2 styles of encoding a telecoms network. Indirect: genotype encoded sets of 2x2 matrices which represented eigenvalues of dynamical systems, ie sets of attractors in 2d space. Connections sent out from attractors and attracted to others to form connectivity. Also had self similarity operator. Result?
Rubbish! Didn’t work very well at all. Chaotic dynamical system so smallest change of genotype could in principle change whole network. No heredity.
Other style semi-direct: want to ensure connected network so made basic genotype a minimum spanning tree (ensures there’s a path from every point to every other in least amount of connections). Then added in extra connections. Result? OK
Not sure about heredity with respect to spanning tree, but had nice crossover operator in general: add a connection to child with probability 0.9 if both parents have it, and 0.4 if only one has it. By varying probabilities can get smaller/bigger nets
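The connection-level crossover described above can be sketched like this. Networks are represented here as sets of edges, which is an assumption for the example, as are the function and parameter names; the spanning-tree part of the genotype is not modelled.

```python
import random

def connection_crossover(parent_a, parent_b, p_both=0.9, p_one=0.4, rng=None):
    """Child inherits an edge with probability p_both if both parents have it,
    and p_one if only one parent has it."""
    rng = rng or random.Random()
    child = set()
    # Sorted so that results are reproducible for a seeded rng.
    for edge in sorted(parent_a | parent_b):
        shared = edge in parent_a and edge in parent_b
        if rng.random() < (p_both if shared else p_one):
            child.add(edge)
    return child
```

Raising or lowering `p_both` and `p_one` biases children towards bigger or smaller networks, as noted above.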
Fitness functions
Need a fitness function to evaluate robot’s performance: bit of a black art
Problematic as it is of central importance to evolutionary computation methods; if there is little chance of differentiating between good and bad solutions, the evolutionary process cannot hope to succeed. Basically defines the search space
Ideally, there should be smooth paths in the problem space leading to the optimal solutions but in reality may not be possible
Basically one should ensure that there is a gradient for evolution to follow and avoid having large local optima (though this process is generally post-hoc, ie make fitness function, population gets stuck, design new fitness function which avoids local optimum, population gets stuck again). Repeat ad nauseam till you regret ever criticising wonderful gradient-descent techniques
Noisy fitnesses
Also often have noisy fitness, often due to not being evaluated over exactly the same conditions (eg where sample sets for training are potentially huge, so fitness evaluated over some smaller set)
Eg robot fitnesses are often highly dependent on the initial conditions
Alternatively environment could be a source of noise
Typically this noise will obscure differences between the fitnesses of neighbouring solutions, reducing the performance of the evolutionary process (although sometimes the noise can be helpful to eg allow populations to escape from local optima, or smooth search space)
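One common way to reduce the effect of fitness noise is to average over several trials. The sketch below is only an illustration: the noise is modelled as additive Gaussian noise on a known "true" fitness, which is an assumption, and all names are hypothetical.

```python
import random
import statistics

def noisy_fitness(true_fitness, noise_sd, rng):
    """One noisy evaluation: the true score plus zero-mean Gaussian noise."""
    return true_fitness + rng.gauss(0.0, noise_sd)

def averaged_fitness(true_fitness, noise_sd, n_trials, rng):
    """Mean of n_trials noisy evaluations of the same solution."""
    return statistics.mean(
        noisy_fitness(true_fitness, noise_sd, rng) for _ in range(n_trials)
    )
```

Averaging over more trials narrows the spread of the estimate, so differences between neighbouring solutions are easier to see, at the cost of more evaluations per genotype.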
Selection
Related to evaluation of fitness is the selection applied to solutions
If there is not sufficient selective pressure to drive the evolutionary process to better parts of the landscape, much time will be spent evaluating solutions of poor fitness.
By contrast, if the selection pressure is too strong the evolutionary process will halt at the first local optimum reached, with little chance of escaping.
Highlights the conflict between allowing exploration of the problem space, and exploitation of local regions of the space
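Selection pressure can be compared by looking at parent-selection probabilities under fitness-proportional and rank-based schemes. The sketch below uses a simple linear ranking (rank divided by the sum of ranks), which is one assumed form of rank-based selection among many; function names are illustrative.

```python
def proportional_probs(fitnesses):
    """Probability of selection proportional to raw fitness."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]

def rank_probs(fitnesses):
    """Probability of selection proportional to rank (worst = 1, best = n)."""
    n = len(fitnesses)
    order = sorted(range(n), key=lambda i: fitnesses[i])  # worst first
    ranks = [0] * n
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    total = n * (n + 1) / 2
    return [r / total for r in ranks]
```

With fitnesses `[1, 1, 1, 97]`, proportional selection picks the best genotype 97% of the time (very strong pressure, so the population is likely to converge on it), whereas this ranking picks it 40% of the time.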
Search spaces
As stated earlier, the fitness function defines the search space of the problem we are looking at
Search space is N-dimensional where N is (maximum!) length of bit string
How we move through the search space is defined by our recombination operators.
Search space can therefore be seen to be a connected graph where connected points are those that can be reached by crossover and mutation
Also depending on operators will be more likely to get to certain destinations
If operators are well designed in terms of heredity should be able to get to all nearby areas of space
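The view of the search space as a connected graph can be sketched for the simplest case, where the only operator is a single bit flip. Restricting to that operator, and the function names, are assumptions made for the example.

```python
from itertools import product

def one_bit_neighbours(genotype):
    """All genotypes reachable from `genotype` by flipping exactly one bit."""
    return [genotype[:i] + (1 - genotype[i],) + genotype[i + 1:]
            for i in range(len(genotype))]

def mutation_graph(n):
    """Map every bit string of length n to its one-bit-flip neighbours."""
    return {g: one_bit_neighbours(g) for g in product((0, 1), repeat=n)}
```

For length 3 this gives 8 points with 3 neighbours each. A different operator set (for example crossover, or mutation that can flip several bits) would connect the same points differently.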
Fitness landscapes
Often the search space is viewed as an N+1 dimensional landscape where the extra dimension is the fitness, eg a bit string of length 2 gives us a nice landscape
[Figure: fitness landscape for a bit string of length 2]
Can be a useful metaphor (despite Inman’s protestations) but ONLY if you reject all cosy notions of local maxima and minima
Eg GasNet search-space average of 200 dimensions: lots of places to go
Also standard mutation operator can in principle take us to any part of the space
Also have addition/deletion of nodes: difficult to view as movement
Also, noisy fitness: how to define maxima if fitness is a distribution???
Epistasis, ruggedness and local optimality
If fitness is dependent on a non-linear combination of the genotype loci, the genotype is said to be epistatically-linked.
Ie individual locus fitnesses are dependent on the context of other loci values and inter-locus interactions
This will generally be the case for ANN robot controllers
Epistatically-linked genotypes give rise to the two major properties of fitness landscapes thought to influence search dynamics, ruggedness and local optimality.
Ruggedness is regarded as similar to fitness noise, where direction to good solutions may be obscured by local noise
By contrast, local optimality is typically thought of in more global terms, with landscapes containing numbers of deceptive peaks
However there is no rigorous distinction between the two properties
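A two-locus toy example shows what epistatic linkage means in practice. Both fitness functions here are invented for illustration: one is additive, the other is an XOR of the two loci.

```python
def additive_fitness(g):
    """Each locus contributes independently."""
    return g[0] + g[1]

def epistatic_fitness(g):
    """Non-linear combination: fitness is the XOR of the two loci."""
    return g[0] ^ g[1]

def effect_of_setting_locus0(fitness, other_bit):
    """Fitness change from switching locus 0 from 0 to 1, given locus 1."""
    return fitness((1, other_bit)) - fitness((0, other_bit))
```

Under the additive function, setting locus 0 always helps by the same amount. Under the XOR function the same change helps when locus 1 is 0 and hurts when it is 1, so the value of a locus depends on its context.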
Search space properties:
Neutrality
Global vs local optima
Smooth vs epistatic
Other?
Neutrality
Recently much work has gone into analysing neutrality of landscapes (eg RNA, nkp, evolvable hardware and some robotics)
ie landscapes where one can move to points of equal fitness: moving along a neutral network
Evolution on fitness landscapes with high levels of neutrality is characterised by periods when fitness does not increase (fitness epochs) interspersed by short periods of rapid fitness increase (epochal evolution or punctuated equilibrium)
Adaptive evolution on neutral landscapes has shown that populations tend to move to areas of space which have more neutral neighbours ie the neutral evolution of robustness
Neutrality may be of use in escaping from (nearly) locally-optimal solutions, but in practice in high dimensional spaces, quite hard to tell if one is moving neutrally or hovering around a local optimum
Neuron Models
Many types of neuron model used in robotics
CTRNNs: based on leaky integrator neuron model from computational neuroscience
Spiking models: similar to above but with a spike generated when activation reaches a threshold
GasNet models: incorporate an abstract notion of a diffusible neuromodulator into an ANN
Firing rate based models
Etc …..
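As one concrete neuron model, a CTRNN can be simulated with a simple Euler step. The sketch assumes the commonly used leaky-integrator form tau_i * dy_i/dt = -y_i + sum_j w_ji * sigmoid(y_j + bias_j) + I_i; the lecture does not give the equation, so this form, the step size `dt` and all names are assumptions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def ctrnn_step(y, weights, taus, biases, inputs, dt=0.01):
    """One Euler step; weights[j][i] is the connection from node j to node i."""
    n = len(y)
    outputs = [sigmoid(y[j] + biases[j]) for j in range(n)]
    new_y = []
    for i in range(n):
        synaptic = sum(weights[j][i] * outputs[j] for j in range(n))
        dydt = (-y[i] + synaptic + inputs[i]) / taus[i]
        new_y.append(y[i] + dt * dydt)
    return new_y
```

With no connections and a constant input, a node's state leaks towards the input value at a rate set by its time constant, which is the "leaky integrator" behaviour.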
However, how can we decide what type of neuron model to use for a particular task?
Similarly, how do we know if we have good fitness functions/recombination operators to use in conjunction with our neuron model?
Can use intuition, or try several combinations. But will our results tell us what we want to know about the problem we were working on
• Why did a particular neuron/GA combination work well?
• Were our intuitions correct?
• What are the implications for generating a more successful model?
An Example: GasNet evolution
GasNet evolves faster than NoGas over range of recombination schemes …
… and mutation rates … and connection architectures … and robotic problems …
??WHY??
GasNets Background
• Classically neurotransmission is viewed as occurring point-to-point at the synapse, i.e. locally
• Occurs over a short temporal scale
• Overriding metaphor is electrical nodes connected by wires
• Inspiration for standard connectionist ANN
Neuromodulation by nitric oxide (NO)
Recently neuromodulatory gases have been discovered (NO, CO, H2S). By far the most studied is NO
• Small and non-polar: freely diffusing
• Act over a large spatial scale: volume signalling
• Act over a wide range of temporal scales (ms to years)
• Modulatory effects
• Interaction between neurons not connected synaptically
• Loose coupling between the 2 signalling systems (electrical and chemical) i.e. neurons that are connected electrically are not necessarily affected by the gas and vice versa.
new style of ANN?
Inspiration for new form of ANN: GasNets
1. Node emits 1 of 2 ‘gases’ due to high electrical activity or high gas concentration
2. Computationally fast, crude diffusion method, but space and time crucial, local processes
3. Gases diffuse through the network and alter slope of transfer functions of other neurons in concentration-dependent manner.
4. Gas 1 increases gain, gas 2 decreases gain.
$O_j^t = \tanh\left[ k_j^t \left( \sum_i w_{ij} O_i^{t-1} + I_j \right) + b_j \right]$
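The node update in the equation above can be sketched directly. The gains k are passed in as plain numbers here: how gas concentration is mapped to gain is not specified at this point in the lecture, so that mapping is deliberately left out, and the function name is hypothetical.

```python
import math

def gasnet_outputs(prev_outputs, weights, inputs, biases, gains):
    """O_j(t) = tanh(k_j * (sum_i w_ij * O_i(t-1) + I_j) + b_j) for every node j.

    weights[i][j] is the connection from node i to node j; gains[j] is the
    current (gas-modulated) gain k_j of node j.
    """
    n = len(prev_outputs)
    new_outputs = []
    for j in range(n):
        net = sum(weights[i][j] * prev_outputs[i] for i in range(n)) + inputs[j]
        new_outputs.append(math.tanh(gains[j] * net + biases[j]))
    return new_outputs
```

Raising a node's gain steepens its transfer function, so the same input gives a larger output; lowering it flattens the response.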
Analysis of search space properties
If networks evolve faster they must, in some sense, be making the space of solutions easier to search in (smoother? More densely packed with good solutions? Fewer optima? More neutral??)
Analysis will hopefully tell us what the search space properties are like and what features of our networks are ‘good’ for search
Also can help us to understand the dynamics of an EA search through high-dimensional space: not well understood
Hopeful approach since many of the intuitions we have about how EAs search spaces have come from such analyses
Eg work on nk, nkp and royal road landscapes etc have attempted to address the role of neutrality, crossover vs mutation and much more
Properties to examine:
Neutrality
Global vs local optima
Smooth vs rugged
Other?
However…
Abstract mathematical landscapes like nk and nkp are generally designed to have tunable ruggedness, neutrality, local modality etc.
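To show what "tunable ruggedness" means, here is a minimal sketch of an nk-style landscape: each of n loci contributes a random value that depends on itself and k other loci, and fitness is the mean contribution. Using the next k loci (wrapping round) as the neighbours, the seed, and the function names are assumptions for the example.

```python
import random
from itertools import product

def make_nk_landscape(n, k, seed=0):
    """Return a fitness function over bit tuples of length n."""
    rng = random.Random(seed)
    # One random lookup table per locus, indexed by the locus and its k neighbours.
    tables = [{bits: rng.random() for bits in product((0, 1), repeat=k + 1)}
              for _ in range(n)]

    def fitness(genotype):
        total = 0.0
        for i in range(n):
            key = tuple(genotype[(i + j) % n] for j in range(k + 1))
            total += tables[i][key]
        return total / n

    return fitness

def count_local_optima(fitness, n):
    """Count genotypes that no single bit flip can improve."""
    count = 0
    for g in product((0, 1), repeat=n):
        f = fitness(g)
        if all(f >= fitness(g[:i] + (1 - g[i],) + g[i + 1:]) for i in range(n)):
            count += 1
    return count
```

With k = 0 the loci are independent and there is a single optimum; increasing k adds epistatic interactions and, typically, many more local optima.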
Real-world problems have no direct link between solution architecture and landscape properties …
… And maybe no understandable link between landscape properties and evolutionary dynamics (how does adding virtual gas affect neutrality??? What is neutrality in a noisy space????)
We’ve found no explanation for GasNet evolvability in terms of fitness landscape properties (partly because 99.9% of real spaces evaluate to 0 fitness meaning standard measures see them as a homogeneous flat landscape)
EG Mutational robustness same
What about functional analysis?
• Oscillator sub-networks very commonly evolve
• Node 5 provides electrical stimulation to RMnode, this causes gas1 emission from RMnode, this causes gas2 emission from node 5, this reduces gain of RMnode which decreases activity which shuts off gas emission from RMnode and hence from node 5 … and the cycle begins again
[Figure: RMnode when gain is low; RMnode when gain is high]
Interaction of chemical and electrical provides transition between 2 regimes with diffusion controlled transition timings
GasNet and NoGas “Timers” (1):
[Figure: ‘timer’ and ‘bright object finder’ sub-circuits linked by an ‘inhibit’ connection]
Timer sub-circuits naturally become active before object finder circuits
GasNet and NoGas “Timers” (2):
GasNet timer = build-up of gas concentration. Simple architecture, mechanisms easily tuned?
NoGas timer = 3 fully connected nodes. Convoluted architecture, difficult to tune?
Re-evolution in environment with changed time scales (func =)
                     GasNet (x2)   NoGas        GasNet (x 0.25)   NoGas
Num cases            20            20           20                20
Mean re-eval fitn.   0.17(0.07)    0.15(0.06)   0.36(0.1)         0.21(0.02)
Mean re-evol gens    10(5)         409(336)     30(31)            591(346)
Median re-evol       10            360          19                608
Re-evolution in environment with changed time scales (sample)
                     GasNet (x2)   NoGas        GasNet (x 0.25)   NoGas
Num cases            20            20           20                20
Mean re-eval fitn.   0.27(0.13)    0.26(0.18)   0.35(0.27)        0.29(0.19)
Mean re-evol gens    107(190)      240(363)     108(229)          116(252)
Median re-evol       36            49           13                21
Conclusions
• It seems that easy temporal adaptivity is an important feature in GasNet evolvability
• Dynamics used in surprising ways (e.g. sensory noise filters), so important even in ‘reactive’ tasks
• Evolution and re-evolution of various kinds of rhythmic networks backs this up
But…
• … not the whole story
• Have investigated several GasNet variants which improve on the performance of the original
• They have the same temporal properties, so what is going on there??
• Maybe need to look at more phenotypic properties (eg coupling of gaseous and electrical signalling mechanisms)
• To do this must introduce GasNets and variants
Diffusion in Original GasNets
1. Gas cloud centred on emitting node and builds up linearly with time (to a maximum) at a genetically specified rate
2. Gas varies spatially as an inverse exponential: exp(-d^2/r).
Based on the spatial distribution of NO produced by a single spherical neuron
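The two points above can be combined into a rough sketch of the gas concentration at a node. This reads the spatial term as exp(-d^2/r), with d the distance from the emitter and r a radius-like parameter, which is an interpretation of the slide; the linear build-up capped at `c_max`, the parameter names, and the omission of any decay after emission stops are further assumptions.

```python
import math

def gas_concentration(d, r, t_emitting, build_rate, c_max=1.0):
    """Concentration at distance d from a node that has emitted for t_emitting steps."""
    # Linear build-up over time at a (genetically specified) rate, up to a maximum.
    temporal = min(c_max, build_rate * t_emitting)
    # Spatial fall-off, assumed to be exp(-d^2 / r).
    spatial = math.exp(-(d ** 2) / r)
    return temporal * spatial
```

For example, `gas_concentration(1.0, 2.0, 5.0, 0.1)` gives half of the maximum build-up scaled by exp(-0.5).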
• However, initial gas diffusion model was intentionally simplistic
• Cannot capture the rich range of spatio-temporal properties seen in real systems
• Therefore decided to develop 2 new versions incorporating aspects of NO signalling seen in nervous systems
• Will hopefully lead to more powerful/evolvable robotic systems
• Also tests the potential utility of the features in real nervous systems
Mammalian cortical plexus
• NO involved in mediating link between neural activity and blood flow
• Fibres are too small to generate an effective NO signal individually
• NO from many fibres summate: NO signal different to that from single neuron
• Fineness of fibres leads to a uniform signal
• Combined effect maintains high concentration levels over large volume
• Also delay until fibres interact serves to act as a noise filter: plexuses of fine fibres signal persistent neuronal activity to blood vessels
Plexus GasNet model
1. Gas cloud has uniform spatial distribution
2. Gas clouds centred in a genetically specified position in the network plane
Receptor GasNet
• In nervous systems, only neurons with one of the receptors for NO will be affected
• Each node may have quantities of receptors (none, medium, maximum)
• Receptor specific modulations
[Figure: axes labelled gas concentration and receptor concentration]
Receptor GasNet modulations
• Increase gain
• Decrease gain
• Activation includes a proportion of previous activation
• Transfer function switched
One variant that was particularly successful used only one gas, which increased gain
Example experiment
Task: Evolution of visual object discrimination under noisy lighting with minimal vision
Experiments
Discrimination between: [Figure: shapes to be discriminated in each experiment]
Compared: Original GasNet, Receptor GasNet and Plexus GasNet
Evolvability results (expt 1)
            Receptor    Plexus       Original
Num Runs    40          40           40
Mean (sd)   260(161)    1579(2609)   3042(3681)
Median      158         512          1201
Best        46          101          136
Worst       840         >10000       >10000
Evolvability results (expt 2)
            Receptor    Plexus       Original
Num Runs    40          40           40
Mean (sd)   380(246)    1979(987)    7642(3778)
Median      298         1312         5784
Best        68          501          831
Worst       3540        >10000       >10000
Why do new versions improve evolvability?
In the initial GasNet model the genotype to phenotype mapping means that there’s quite a tight coupling between the electrical and chemical processes
Electrical connections depend on spatial organisation of nodes
Also, gas diffuses locally to all nearby neurons => if electrically coupled, likely to be chemically coupled
New models allow for a more flexible coupling
Coupling (expt 1)
                Receptor     Plexus       Original
Num succ runs   40           37           33
Elec conn       1.61(0.3)    1.72(0.41)   1.89(0.52)
Gas conn        2.1(0.7)     2.78(0.84)   2.27(0.93)
Overlap         11.4%(5.5)   10.8%(8.1)   40.5%(13.2)
Evolvability and coupling
• Gardner and Ashby (70) showed that random systems of multiply connected weakly interacting components or sparsely connected strongly interacting components are more likely to be stable
• New GasNets incorporate flexibly coupled processes with distinct characteristics (electrical, chemical)
• Helps to satisfy the conflicting pressures for genotypic instability and phenotypic stability needed for successful evolution (Conrad, 90)
• Allows non-destructive “tuning” of functionality of one against the other (eg elec does bright object, chem does discrimination)
Discussion + future directions
• 2 new models presented which significantly improve network evolvability
• Both systems are flexibly coupled
• Clearly not the whole story
• A better measure of the degree of coupling is needed (allow a continuum of coupling)
For more detail (on any of past 2 lectures) see:
http://www.cogs.susx.ac.uk/ccnr