evolving a sigma-pi network as a network simulator by justin basilico
Posted on 29-Dec-2015
Problem description
To evolve a neural network that acts as a general network which can be “programmed” by its inputs to act as a variety of different networks.
Input: Another network and its input.
Output: The output of the given network on the input.
Use a sigma-pi network and evolve its connectivity using a genetic algorithm.
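The slides do not define the sigma-pi unit itself, so here is a minimal sketch of the standard definition (an assumption, not taken from the slides): each unit sums weighted *products* over subsets of its inputs, which is what lets one input multiplicatively gate another.

```python
import math

def sigma_pi_unit(inputs, terms):
    """One sigma-pi unit. `inputs` is a list of floats; `terms` is a list of
    (weight, index_list) pairs. Output is sigmoid(sum_j w_j * prod_{i in S_j} x_i)."""
    net = sum(w * math.prod(inputs[i] for i in idxs) for w, idxs in terms)
    return 1.0 / (1.0 + math.exp(-net))

# Hypothetical example: input 0 gates input 1 via a product term, plus an
# additive term on input 0 alone.
out = sigma_pi_unit([1.0, 0.5], [(1.0, [0, 1]), (0.5, [0])])
```

The product terms are the key property here: a weight value arriving on one input line can scale the signal on another, which is how the simulator can be “programmed” by the weights it is given as input.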
Problem motivation
If a network can be created to simulate other networks given as input, then perhaps we can build neural networks that act upon other neural networks. It would be interesting to see if one network could apply backpropagation to another network.
Previous work
This problem remains largely unexplored. Evolutionary techniques have been applied to networks similar to sigma-pi networks:
Janson & Frenzel, Training Product Unit Neural Networks with Genetic Algorithms (1993), which evolved product networks; these are “more powerful” than sigma-pi networks since they allow a variable exponent.
Previous work
Papers from class that provide some information and inspiration:
Plate, Randomly connected sigma-pi neurons can form associator networks (2000)
Belew, McInerney, & Schraudolph, Evolving Networks: Using the Genetic Algorithm with Connectionist Learning (1991)
Chalmers, The Evolution of Learning: An Experiment in Genetic Connectionism (1990)
Approach (Overview)
Generate a testing set of 100 random networks to simulate.
Generate initial population of chromosomes for sigma-pi network.
For each generation, decode each chromosome into a sigma-pi network and use fitness function to evaluate network’s fitness as a simulator using the testing set.
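The overall loop above might be sketched as follows. This is a skeleton only: `decode`, `fitness`, and the testing set are placeholders for the components described on the later slides, and the selection/crossover/mutation step is elided.

```python
import random

def evolve(pop_size, chrom_len, generations, decode, fitness, testing_set):
    """Hypothetical outer GA loop matching the slides: each generation, decode
    every chromosome into a sigma-pi network and score it as a simulator on
    the testing set. Lower fitness is better."""
    population = [[random.randint(0, 1) for _ in range(chrom_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Rank the population by fitness (best first).
        population = sorted(population,
                            key=lambda c: fitness(decode(c), testing_set))
        # ... rank-based selection, crossover, and mutation (see the
        # "Genetic algorithm" slide) would produce the next population here.
    return population[0]  # best chromosome found
```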
Approach (Overview)
First try to simulate single-layer networks:
2 input and 1 output units
2 input and 2 output units
Then try it on a multi-layer network:
2 input, 2 hidden, and 2 output units
Approach Input encoding
The simulation network is given the input to the simulated network along with the weight values for the network it is simulating.
Generate a fully-connected, feed-forward network with random weights along with a random input, then feed the input through the network to get the output.
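A sketch of how one such training case could be generated (the helper name, layer sizes, and the [-1, 1] weight range are assumptions; the slides only specify random weights and a random input):

```python
import random

def make_case(sizes=(2, 2, 1)):
    """Generate one training case for the simulator: a random fully-connected
    feed-forward network plus a random input. Returns the simulator's input
    vector (the network input followed by the flattened weights, biases
    included) and the target output."""
    x = [random.uniform(-1, 1) for _ in range(sizes[0])]
    weights = []  # one (n_out x (n_in + 1)) matrix per layer; bias weight last
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights.append([[random.uniform(-1, 1) for _ in range(n_in + 1)]
                        for _ in range(n_out)])
    # Forward pass with linear units (the evolved results so far use linear
    # activations, per the Results slides).
    act = x
    for layer in weights:
        act = [sum(w * a for w, a in zip(row[:-1], act)) + row[-1]
               for row in layer]
    sim_input = x + [w for layer in weights for row in layer for w in row]
    return sim_input, act
```

For a 2-2-1 simulated network this yields 2 inputs + 6 first-layer weights + 3 second-layer weights = 11 simulator inputs, consistent with the 11-unit input layer mentioned on the multi-layer Results slide.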
Approach Input encoding
Example: [diagram of a 2-input, 2-hidden, 1-output network with a bias unit. Weight layer 1: w30 w31 w32 w40 w41 w42 (into hidden 1 and hidden 2); weight layer 2: w50 w53 w54 (into the output). The input encoding concatenates input1 and input2 with weight layer 1 and then weight layer 2.]
Approach Target output encoding
The output that the randomly weighted network produces on its random input.
Approach Chromosome encoding
Each chromosome encodes the connectivity (architecture) of the sigma-pi network.
To simplify things, allow network weights to either be 1.0, signifying there is a connection there, or 0.0 signifying that there is not.
Initialize chromosome to random string of bits.
Approach Chromosome encoding:
To encode the connectivity of a layer with m units to a layer with n units, use a binary string of length (m + 1) · n, where the extra unit is the bias.
Example: 011010 110001
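A small sketch of decoding one layer’s segment of the chromosome under this scheme. The bit ordering within a segment (one (m + 1)-bit group per receiving unit, bias included) is an assumption; the slide’s example is read here as two 6-bit groups, i.e. m = 5 sending units plus bias and n = 2 receiving units.

```python
def decode_layer(bits, m, n):
    """Decode one layer's connectivity from a string of (m + 1) * n bits.
    Bit 1 -> weight 1.0 (connection present), bit 0 -> weight 0.0; the
    (m + 1)-th sending unit is the bias."""
    assert len(bits) == (m + 1) * n
    return [[1.0 if bits[j * (m + 1) + i] == "1" else 0.0
             for i in range(m + 1)]
            for j in range(n)]

# The slide's example string, read as two 6-bit segments (m = 5, n = 2).
layer = decode_layer("011010110001", 5, 2)
```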
Approach Genetic algorithm
Selection: Chromosomes ranked by fitness, probability of selection based on rank.
Crossover: Randomly select bits in chromosome for crossover. (I might add in some sort of functional unit here.)
Mutation: Each bit in every chromosome has a mutation rate of 0.01.
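One way these three operators could fit together (the rank-to-probability weighting and the 50/50 per-bit crossover choice are assumptions; the slides specify only rank-based selection, random bit-level crossover, and a 0.01 per-bit mutation rate):

```python
import random

def next_generation(scored, mutation_rate=0.01):
    """Produce the next population from `scored`, a list of bit-list
    chromosomes sorted best-first."""
    n = len(scored)
    ranks = [n - i for i in range(n)]  # best chromosome gets the largest weight

    def pick():
        # Selection: probability proportional to rank, not raw fitness.
        return random.choices(scored, weights=ranks, k=1)[0]

    children = []
    while len(children) < n:
        a, b = pick(), pick()
        # Crossover: each bit position independently taken from either parent.
        child = [x if random.random() < 0.5 else y for x, y in zip(a, b)]
        # Mutation: each bit flips with probability 0.01.
        child = [bit ^ 1 if random.random() < mutation_rate else bit
                 for bit in child]
        children.append(child)
    return children
```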
Approach Fitness function
Build a sigma-pi network from the chromosome.
Test the sigma-pi network on a testing set of 100 networks.
Better chromosomes have smaller fitness value.
Approach Fitness function
Attempt 1: Mean squared error. Problem: Evolved networks just always guessed 0.5, because a sigmoid activation function was used.
Attempt 2: Number of outputs not within a threshold of 0.05 of the target. Problem: We want an optimal solution with as few weights in the network as possible.
Attempt 3: Use the second function and also factor in the number of 1’s in the chromosome.
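Attempt 3 might look like the following sketch. The penalty coefficient 0.01 is an assumption; the slides say only that the count of 1 bits is factored in alongside the thresholded error count.

```python
def fitness(outputs, targets, chromosome, threshold=0.05, penalty=0.01):
    """Attempt-3 fitness: count outputs not within `threshold` of the target,
    plus a small penalty per 1 bit in the chromosome to favor sparse
    connectivity. Smaller is better."""
    errors = sum(1 for out, tgt in zip(outputs, targets)
                 if abs(out - tgt) > threshold)
    return errors + penalty * sum(chromosome)
```

In practice `outputs` would be gathered over the whole 100-network testing set; the penalty term breaks ties between equally accurate networks in favor of the one with fewer connections.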
Results So far:
Tried to train a backpropagation network to do simulation, but it did not work.
Managed to evolve sigma-pi network architectures to simulate simple, one layer networks with linear units.
Still working on simulating networks with two layers and with sigmoid units.
Results Network with 2 input, 1 output units and linear activation
Population: 100. Optimal solution after 24 generations.
[diagram of the evolved solution: bias, input1, input2 and weights w30 w31 w32]
Results Network with 2 input, 2 output units and linear activation
Population: 150. Optimal solution after 121 generations (stabilizes after 486).
[diagram of the evolved solution: bias, input1, input2 and weights w30 w31 w32 w40 w41 w42]
Results (so far) Network with 2 input, 2 hidden, and 1 output units and linear activation
Have not gotten it to work yet.
The sigma-pi network has 3 hidden layers (11-9-5-3-1).
Might be a problem to do with the sparse connectivity of the solution, where input weights need to be “saved” for later layers.
Potential solution: Fix more of the network architecture so the size of the chromosome is smaller.
Future work
Expand evolution parameters to allow wider variation in the evolved networks (network weights, activation functions).
Try to simulate larger networks.
Evolve a network that implements backpropagation:
Start small with just the delta rule for output and hidden units.
Work up to a network that does full backpropagation.