Implementation of Digital Circuits Using Neuro-Swarm Based on FPGA
1Assist. Prof. Dr. Hanan A. R. Akkar, 2M. Sc. Student Firas R. Mahdi
1,2Department of Electrical and Electronic Engineering, University of Technology, Baghdad, Iraq
Email: 1[email protected], 2[email protected]
International Journal of Advancements in Computing Technology, Volume 2, Number 2, June 2010. doi: 10.4156/ijact.vol2.issue2.6
Abstract
This paper presents a fully parallel hardware realization of an Artificial Neural Network (ANN), which depends on the efficient implementation of a single neuron. The reconfigurable computing architecture of the Field Programmable Gate Array (FPGA) is well suited to the hardware realization of ANNs. Numerous implementations of ANNs have been reported in the literature that try to reduce the hardware circuitry of Neural Networks (NNs). The fully parallel NN architecture proposed here reduces the neuron hardware needed to perform an efficient NN through two main parts: the first covers network training using modified MATLAB Particle Swarm Optimization (PSO) tools that exploit the advantages of PSO; the second is the hardware implementation of the trained network using the schematic entry design tools of the Xilinx high-performance Virtex FPGA.
Keywords: Artificial Neural Network (ANN), Particle Swarm Optimization (PSO),
Field Programmable Gate Array (FPGA).
1. Introduction
1.1 Artificial Neural Network (ANN)
An Artificial Neuron (AN) is a model of the biological neuron. Each AN receives signals from the environment or from other ANs, gathers these signals, applies an activation function to their sum and, when fired, transmits a signal to all connected ANs. Input signals are inhibited or excited through positive or negative numerical weights associated with each connection to the AN; the firing of the AN and the strength of the output signal are controlled by a function referred to as the activation function. The AN collects all incoming signals and computes a net input signal as a function of the respective weights. The net input is fed to the activation function, which calculates the output signal of the AN. An ANN is a layered network of ANs, and may consist of input, hidden and output layers. ANs in one layer are connected fully or partially to the ANs in the next layer [1].
Ni(x1, ..., xmi) = ai( Σj wji·xj + bi ) … (1)
where xj is the j-th input signal, wji the connection weight, ai the activation function and bi the bias.
There are several methods of training an ANN; Backpropagation is by far the most common. In this research, PSO is adopted as the training algorithm best suited to our application.
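As a sketch of equation (1), a single neuron can be modeled in a few lines of Python; the AND-gate weights below are illustrative assumptions, not values from the paper:

```python
def neuron(x, w, b, activation):
    """Equation (1): activation applied to the weighted input sum plus bias."""
    net = sum(wi * xi for wi, xi in zip(w, x)) + b
    return activation(net)

hardlim = lambda net: 1 if net >= 0 else 0   # hard-limit transfer function

# Illustrative weights (not from the paper): a 2-input AND-gate neuron
print(neuron([1, 1], [1, 1], -2, hardlim))   # prints 1: both inputs high
print(neuron([1, 0], [1, 1], -2, hardlim))   # prints 0
```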
1.2 Particle Swarm Optimization (PSO)
Particle swarm optimization is a population-based evolutionary optimization technique developed by J. Kennedy and R. Eberhart in 1995, motivated by the social behavior of bird flocking and fish schooling [2].
PSO is a kind of stochastic search algorithm that simulates a natural evolutionary process and performs well on some difficult optimization problems. The basic concept of PSO comes from a large number of birds flying randomly and looking for food together. Each bird is an individual, called a particle. As the birds look for food, the particles fly through a multidimensional search space looking for the optimal solution. All the particles form a family rather than isolated individuals: each particle can remember its own flying experience and share its
companions' flying experience [3]. The basic PSO model consists of a swarm of particles moving in a D-dimensional search space; the direction and distance of each particle in this hyper-dimensional space are determined by its fitness and velocity. In general, the fitness is primarily related to the optimization objective, and the velocity is updated according to an update rule [4].
In PSO, the population starts with a random initialization of individuals in the search space, and then the social behavior of the particles in the swarm is repeated until the best possible result is achieved by iterative searching. At each iteration the velocity (position change) is updated and the particle is moved towards a new position. The best position previously visited by the nth particle is denoted the personal best position (pbest), while the best position found by the whole swarm is denoted the global best position (gbest). In other words, the particle swarm optimization idea consists of, at each time step, changing the velocity and location of each particle towards its pbest and gbest locations according to equations 2 and 3:
Vid = W*Vid + C1*rand1*(Pid - Xid) + C2*rand2*(Pgd - Xid) … (2)
Xid = Xid + Vid … (3)
where W is the inertia weight, which brings a balance between global exploration and local exploitation,
C1 and C2 are two constants called learning factors [5], and rand1 and rand2 are two independent random numbers uniformly distributed in the range [0,1]. In equation 2, the first part represents the inertia of the previous velocity; the second part is the cognition part, which represents the particle's private thinking; the third part is the social part, which represents the cooperation among the particles [6].
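Equations 2 and 3 translate directly into code. The following is a minimal single-particle sketch, assuming the parameter values used later in the paper (W = 0.6, C1 = C2 = 1.7); a full trainer would loop over a swarm and track pbest and gbest:

```python
import random

def pso_step(x, v, pbest, gbest, w=0.6, c1=1.7, c2=1.7):
    """One PSO update for a single particle, per equations (2) and (3)."""
    r1, r2 = random.random(), random.random()          # rand1, rand2 in [0, 1]
    v = [w * vi + c1 * r1 * (p - xi) + c2 * r2 * (g - xi)
         for vi, xi, p, g in zip(v, x, pbest, gbest)]  # equation (2)
    x = [xi + vi for xi, vi in zip(x, v)]              # equation (3)
    return x, v
```

When a particle sits exactly at both pbest and gbest, only the inertia term survives, so its velocity simply decays by the factor W each step.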
During the past few years PSO has been shown to be successful in many applications [7-9], and several papers discuss how to apply PSO to training NNs and its advantages [10-13]. For the purpose of NN learning, the empirical error, referred to as the objective function to be optimized (minimized to 0), is given by:
M.S.E = (1/(n·m)) Σp Σk (Tpk - Ypk)² … (4)
where n is the number of training patterns, m is the number of outputs, T is the target and Y is the actual value [1].
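Equation 4 can be written as a short objective function. This sketch assumes targets and outputs are given as n lists of m values each:

```python
def mse(targets, outputs):
    """Empirical error of equation (4), averaged over n patterns and m outputs."""
    n, m = len(targets), len(targets[0])
    return sum((t - y) ** 2
               for tp, yp in zip(targets, outputs)
               for t, y in zip(tp, yp)) / (n * m)

# A perfect network reaches the error goal of exactly zero
print(mse([[1, 0], [0, 1]], [[1, 0], [0, 1]]))   # prints 0.0
```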
1.3 Field Programmable Gate Array (FPGA)
The FPGA is a specially designed IC that is often used for prototyping. Each FPGA has three main parts. The Configurable Logic Block (CLB) is the most significant part; it provides the physical support for the program downloaded to the FPGA. Another part is the Input Output Block (IOB), which provides inputs and outputs for the FPGA and makes communication with the outside world possible. The last part is the Programmable Interconnect (PI), which connects the different parts of the FPGA and allows them to communicate with each other [14].
FPGAs are an excellent technology for implementing NN hardware, and executing a NN on an FPGA is a relatively easy process. To lessen the design circuitry, the training is done independently, off-line from the FPGA; once training is completed and the correct network weights are obtained, these weights are hard-wired on the FPGA. The accuracy with which these weights can be coded depends upon the number of bits available to represent them. Parallelism and dynamic adaptation are two computational characteristics typically associated with ANNs, and FPGA-based reconfigurable computing architectures are well suited to implementing ANNs, as one can exploit concurrency and rapidly reconfigure to adapt the weights and topologies of an ANN [15]. FPGA realization of an ANN with a large number of neurons is still not an easy task, because the ANN algorithm is rich in multiplication operations, which are relatively expensive to realize. Work reported in this area includes new multiplication algorithms for ANNs, NNs with constraints that achieve higher processing speed at lower cost, and multichip realizations [16-18].
2. Theory Description
There has been a significant increase in research and development in the area of applying evolutionary computation techniques to evolving one or more aspects of ANNs. These evolutionary techniques have usually been used to develop the NN structure or the network learning algorithm [19]. Here, a new learning algorithm combined with an ANN is proposed to determine the optimal weights, which are adjusted by PSO.
PSO has its own advantages and disadvantages compared with other computational algorithms. Advantages such as its probabilistic mechanism and multiple starting points allow PSO to avoid getting trapped in a local optimal solution [20]. However, the property of PSO most exploited in this paper is that it is derivative-free, which means that we can train feed-forward NNs using PSO as the learning algorithm with only the Hardlimit activation function (shown in Figure 1) for all network layers. According to the properties of the Hardlimit activation function, the output is either one or zero; this property is very helpful in simplifying the network multiplication process.
Figure 1. Hard-limit transfer function.
Numerous studies have further explored the power of PSO as a training algorithm. These studies have shown that NNs trained by PSO give more accurate results than other training algorithms, but at the same time PSO is slower than algorithms such as Backpropagation. For the purpose of neural network implementation of digital logic circuits, we have modified the MATLAB PSO tools to suit our application. The modified PSO tools give exactly the weights needed for the network; these weights are integers only, which is helpful in executing the multiplication process using only AND gates.
3. The Proposed Design of PSO Neuron
The neuron is the main part of every ANN. Each neuron can have countless inputs; however, it has only one output value. Any digital design of the neuron must aim to minimize the neuron hardware circuitry. Since our application is to construct digital circuits using NNs, the inputs to the network are restricted to only two values, one or zero, while the weights are only integer values, for ease of representation and to reduce complexity. From these two facts, the multiplication process needs no more than AND gates, where each input is multiplied each time with a single bit of the weight using an AND gate. Suppose we have a single-input neuron and the weights are restricted to the integer range [-3, 3]; then each weight can be represented using only two magnitude bits plus a single bit for the weight sign, so three AND gates are sufficient for the multiplication with sign, where 0 stands for a positive sign and 1 stands for a negative sign, as shown in Figure 2.
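The idea behind Figure 2 can be sketched behaviorally: with a one-bit input and a sign-magnitude weight, each AND gate simply gates one weight bit through. This is a software sketch of the behavior, not the schematic itself:

```python
def and_multiply(x, sign, b1, b0):
    """Multiply a one-bit input x by a sign-magnitude weight (sign, b1 b0)
    using only AND operations; 0 = positive sign, 1 = negative sign."""
    p1 = x & b1          # AND gate for the high magnitude bit
    p0 = x & b0          # AND gate for the low magnitude bit
    s = x & sign         # AND gate passing the sign through only when x = 1
    magnitude = (p1 << 1) | p0
    return -magnitude if s else magnitude

# weight -3 is stored as (sign=1, b1=1, b0=1)
print(and_multiply(1, 1, 1, 1))   # input 1 x weight -3 -> prints -3
print(and_multiply(0, 1, 1, 1))   # input 0 x anything  -> prints 0
```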
Figure 2. Two bit with sign multiplication digital circuit.
This structure is repeated for each input. The products of the multiplication process are added or subtracted according to the weight signs using a specially designed adder/subtracter with sign, as illustrated in Figure 3.
Figure 3. Two Bit Adder/Subtracter circuit.
As noted, the Hardlimit transfer function is applied in all network layers; therefore the output of the neuron is 1 if the net (the final sum of the weighted-input products) is greater than or equal to 0, and 0 if the net is less than 0. The neuron structure is shown in Figure 4. The output of the neuron is simply the inverted sign bit of the final neuron net, as shown in Figure 5.
Figure 4. Two input neuron with two bit weight circuit.
Figure 5. Two input neuron with two bit weight circuit.
4. The Proposed Design of Digital Circuit
In this paper we implement PSO-trained ANNs that realize two digital circuits: the first is a one-bit full adder and the second a 4-bit Arithmetic Logic Unit (ALU).
4.1 Full Adder
To design the 1-bit full adder we must first obtain the right integer weights using the modified MATLAB PSO tools. The parameters used for this purpose are: W = 0.6, C1 = C2 = 1.7, number of particles 1000; 3 neurons in the input layer, 3 neurons in the hidden layer and 2 neurons in the output layer; weights restricted to the integer range [-3, 3]. Figure 6 shows the particle dynamics through the training process, Figure 7 shows gbest against epochs for the full adder until the error is minimized to zero, Figure 8 shows the ANN full adder circuit, and finally Figure 9 shows the logic simulator timing diagram for the ANN full adder circuit. The error goal was reached, with successful termination after 18 iterations. The resulting weights and biases are:
W{1,1} = [1 0 -3; -3 -3 1; -1 -2 -2], B{1} = [0; 2; 2]
W{2,1} = [-2 -1 2; 0 0 -3], B{2} = [0; 2]
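These reported weights can be checked against the full-adder truth table. Assuming hardlim(net) = 1 for net >= 0 and the outputs ordered (sum, carry-out), a short simulation confirms all eight input combinations:

```python
from itertools import product

hardlim = lambda net: 1 if net >= 0 else 0   # hard-limit activation

def layer(W, b, x):
    """Evaluate one hard-limit layer: hardlim of W.x + b for each neuron."""
    return [hardlim(sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

# Weights and biases reported above for the trained full-adder network
W1, B1 = [[1, 0, -3], [-3, -3, 1], [-1, -2, -2]], [0, 2, 2]
W2, B2 = [[-2, -1, 2], [0, 0, -3]], [0, 2]

for a, b, cin in product([0, 1], repeat=3):
    s, cout = layer(W2, B2, layer(W1, B1, [a, b, cin]))
    assert s == a ^ b ^ cin                      # sum bit
    assert cout == (a & b) | (cin & (a ^ b))     # carry-out bit
print("all 8 input combinations match the full-adder truth table")
```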
Figure 6. Particles dynamics.
Figure 7. Gbest vs. epochs for Full adder circuit.
Figure 8. The ANN Full Adder circuit.
Figure 9. Logic simulator timing diagram ANN Full Adder circuit.
4.2 4-Bit ALU
The Arithmetic Logic Unit (ALU) is a 4-bit high-speed parallel ALU controlled by the four Function Select inputs (S0–S3) and the Mode Control input (M). It can perform all 16 possible logic operations or 16 different arithmetic operations on active-HIGH or active-LOW operands; the function table lists these operations. When the Mode Control input (M) is HIGH, all internal carries are inhibited and the device performs logic operations on the individual bits as listed. When the Mode Control input is LOW, the carries are enabled and the device performs arithmetic operations on the two 4-bit words [21].
Table1. ALU operation [21].
Since we have a large data set of 16,384 patterns, due to the 14 inputs, we divide the network into 5 parts with a maximum of 8 inputs and outputs each to simplify the design. Figure 10 shows the first ALU part, Figure 11 shows gbest against epochs until the error is minimized to zero, and Figure 12 shows the trained part-1 neural network digital circuit. W = 0.6, C1 = C2 = 1.7, number of particles 1000.
Figure 10. Part 1 ALU digital circuit.
6 neurons in the input layer and 2 neurons in the output layer; weights restricted to the integer range [-7, 7]. The error goal was reached, with successful termination after 19 iterations.
W{1,1} = [-4 -3 0 0 -1 -2; -5 -7 2 -2 1 0], B{1} = [7; 5]
Figure 11. Gbest vs. epochs part 1 ALU.
Figure 12. Part 1 trained Neural Network ALU.
Figure 13 shows the part-2 ALU digital circuit, while Figure 14 shows gbest against epochs for the part-2 NN training until zero error is reached, and Figure 15 shows the trained part-2 neural network digital circuit.
Figure 13. Part 2 ALU digital circuit
Number of particles 10000; 4 neurons in the input layer, 3 neurons in the hidden layer and 1 neuron in the output layer; weights restricted to the integer range [-3, 3]. The error goal was reached, with successful termination after 31 iterations.
W{1,1} = [2 2 1 -1; 2 1 -1 0; -1 1 2 -1], B{1} = [-3; -1; -2]
W{2,1} = [2 -2 -2], B{2} = 1
Figure 14. Gbest vs. epochs Part 2 ALU.
Figure 15. Part 2 trained NN ALU.
Figure 16 shows the part-3 ALU digital circuit, while Figure 17 shows gbest vs. epochs for the part-3 ALU training, and Figure 18 shows the trained part-3 NN ALU digital circuit.
Figure 16. Part 3 ALU.
Number of particles 10000; 6 neurons in the input layer, 3 neurons in the hidden layer and 1 neuron in the output layer; weights restricted to the integer range [-7, 7]. The error goal was reached, with successful termination after 55 iterations.
W{1,1} = [4 0 7 -4 1 -4; -6 -1 -3 -3 -1 7; 2 1 -5 5 1 -3], B{1} = [-2; 4; -2]
W{2,1} = [-5 -5 -3], B{2} = 7
Figure 17. Part 3 Gbest vs. epochs ALU.
Figure 18. Part 3 trained NN ALU.
Figure 19 shows the part-4 ALU digital circuit, while Figure 20 shows gbest vs. epochs for the NN training, and Figure 21 shows the trained part-4 NN ALU digital circuit.
Figure 19. Part 4 ALU.
Number of particles 10000; 8 inputs, 3 hidden neurons and 1 output neuron; weights restricted to the integer range [-31, 31]. The error goal was reached, with successful termination after 64 iterations.
Figure 20. Part 4 Gbest vs. epochs ALU.
Figure 21. Part 4 trained NN ALU.
Figure 22 shows the part-5 ALU digital circuit, while Figure 23 shows the part-5 gbest vs. epochs, and Figure 24 shows the part-5 trained NN ALU digital circuit.
Figure 22. Part 5 ALU.
Number of particles 1000; 8 inputs and 1 output neuron; weights restricted to the integer range [-31, 31]. The error goal was reached, with successful termination after 27 iterations. Figure 25 shows the trained 4-bit ALU NN.
Figure 23. Part 5 Gbest vs. epochs ALU.
Figure 24. Part 5 trained NN ALU.
Figure 25. 4-bit ALU.
Figure 26. Logic simulator timing diagram for the trained NN 4-bit ALU, three random output readings.
5. Conclusions
This paper has presented the implementation on FPGA of ANNs trained by the PSO learning algorithm. The proposed learning algorithm was very useful for reducing the neuron circuitry by reducing the multiplication process to only AND gates. The PSO learning algorithm was superior to other learning algorithms in decreasing the number of neurons needed to minimize the mean square error to zero, which means 100% accuracy. The drawback of the PSO training algorithm is its slowness compared with other training algorithms, especially for a large number of particles; this is to be expected, because PSO is a multiple-starting-points algorithm, unlike algorithms such as Backpropagation. However, even this slowness is not a big issue because, as stated before, the training of the network is done entirely outside the FPGA and only once; once the network has been trained, the weights are fixed and hard-coded as constants 1 and 0, which correspond to VCC and Ground.
6. References
[1] A. P. Engelbrecht, Computational Intelligence: An Introduction, John Wiley & Sons Ltd, 2007.
[2] J. Kennedy and R. Eberhart, "Particle Swarm Optimization", IEEE Int. Conf. on Neural Networks, 1995, pp. 1942-1948.
[3] L. Wang, X. Wang, J. Fu and L. Zhen, "A Novel Probability Binary Particle Swarm Optimization Algorithm and its Application", Journal of Software, Vol. 3, No. 9, 2008.
[4] X. Feng Xie, W. Jun Zhang and Z. Lian Yang, "Adaptive Particle Swarm Optimization on Individual Level", IEEE 0-7803-7488-6, International Conference on Signal Processing (ICSP), 2002, pp. 1215-1218.
[5] R. Eberhart and Y. Shi, "Particle Swarm Optimization: Developments, Applications and Resources", IEEE Int. Conf. on Evolutionary Computation, 2001, pp. 81-86.
[6] J. Kennedy, "The Particle Swarm: Social Adaptation of Knowledge", IEEE Int. Conf. on Evolutionary Computation, 1997, pp. 303-308.
[7] V. G. Gudise and G. K. Venayagamoorthy, "FPGA Placement and Routing Using Particle Swarm Optimization", IEEE Computer Society Annual Symposium on VLSI Emerging Trends in VLSI Systems Design, 0-7695-2097-9/04, 2004.
[8] G. Kendall and Y. Su, "A Particle Swarm Optimization Approach in the Construction of Optimal Risky Portfolios", Proceedings of the 23rd IASTED International Multi-Conference on Artificial Intelligence and Applications, 2005.
[9] K. Chandramouli and E. Izquierdo, "Image Classification using Chaotic Particle Swarm Optimization", IEEE 1-4244-0481, ICIP, 2006.
[10] V. G. Gudise and G. K. Venayagamoorthy, "Comparison of Particle Swarm Optimization and Backpropagation as Training Algorithms for Neural Networks", IEEE Symposium on Swarm Intelligence, Indianapolis, IN, USA, 2003, pp. 110-117.
[11] F. Bergh and A. Engelbrecht, "Cooperative Learning in Neural Networks using Particle Swarm Optimization", SACJ/SART, 2000.
[12] W. Zha and G. K. Venayagamoorthy, "Neural Networks Based Non-Uniform Scalar Quantizer Design with Particle Swarm Optimization", IEEE 0-7803-8916, 2005.
[13] J. Ru Zhang, J. Zhang, T. Ming Lok and M. Lyu, "A Hybrid Particle Swarm Optimization Back-Propagation Algorithm for Feedforward Neural Network Training", Applied Mathematics and Computation 185, 2007, pp. 1026-1037.
[14] R. Raeisi and A. Kabir, "Implementation of Artificial Neural Network on FPGA", American Society for Engineering Education, Indiana and North Central Joint Section Conference, 2006.
[15] A. Muthuramalingam, S. Himavathi and E. Srinivasan, "Neural Network Implementation Using FPGA: Issues and Application", International Journal of Information Technology, Vol. 4, No. 2, pp. 86-92.
[16] R. H. Turner and R. F. Woods, "Highly Efficient Limited Range Multipliers for LUT-Based FPGA Architecture", IEEE Transactions on Very Large Scale Integration Systems, Vol. 15, No. 10, 2004, pp. 1113-1117.
[17] M. Marchesi, G. Orlandi, F. Piazza and A. Uncini, "Fast Neural Networks Without Multipliers", IEEE Transactions on Neural Networks, Vol. 4, No. 1, 1993.
[18] B. Noory and V. Groza, "A Reconfigurable Approach to Hardware Implementation of Neural Networks", Canadian Conference on Electrical and Computer Engineering, IEEE, 2003, pp. 1861-1863.
[19] G. K. Venayagamoorthy and R. G. Harley, "Swarm Intelligence for Transmission System Control", IEEE 1-4244-1298-6, 2007.
[20] R. Mendes, P. Cortez, M. Rocha and J. Neves, "Particle Swarm for Feedforward Neural Network Training", IEEE 0-7803-7278-6, 2002, pp. 1895-1899.
[21] Fairchild Semiconductor Corporation DS00982, "DM74LS181 4-Bit Arithmetic Logic Unit", 2000.