Implementation of Digital Circuits Using Neuro-Swarm Based on FPGA
1Assist. Prof. Dr. Hanan A. R. Akkar, 2M. Sc. Student Firas R. Mahdi
1,2Department of Electrical and Electronic Engineering, University of Technology, Baghdad, Iraq
Email: 1[email protected], 2[email protected]
International Journal of Advancements in Computing Technology, Volume 2, Number 2, June 2010. doi: 10.4156/ijact.vol2.issue2.6
Abstract
This paper presents a fully parallel hardware realization of an Artificial Neural Network (ANN), which depends on the efficient implementation of a single neuron. The reconfigurable computing architecture of the Field Programmable Gate Array (FPGA) is well suited to the hardware realization of ANNs. Numerous implementations of ANNs have been reported in the literature that try to reduce the hardware circuitry of Neural Networks (NNs). The fully parallel NN architecture proposed here reduces the neuron hardware needed to perform an efficient NN through two main parts: the first covers network training using modified MATLAB Particle Swarm Optimization (PSO) tools that exploit the advantages of PSO; the second is the hardware implementation of the trained network using the schematic entry design tools of the Xilinx high-performance Virtex FPGA.
Keywords: Artificial Neural Network (ANN), Particle Swarm Optimization (PSO),
Field Programmable Gate Array (FPGA).
1. Introduction
1.1 Artificial Neural Network (ANN)
An Artificial Neuron (AN) is a model of the biological neuron. Each AN receives signals from the environment or from other ANs, gathers these signals, applies an activation function to their sum and, when fired, transmits a signal to all connected ANs. Input signals are inhibited or excited through positive or negative numerical weights associated with each connection to the AN; the firing of the AN and the strength of the output signal are controlled by a function referred to as the activation function. The AN collects all incoming signals and computes a net input signal as a function of the respective weights. The net input is fed to the activation function, which calculates the output signal of the AN. An ANN is a layered network of ANs, and may consist of input, hidden and output layers. ANs in one layer are connected fully or partially to the ANs in the next layer [1].
Ni(x1, ..., xmi) = ai( Σj wji·xj + bi ) … (1)
where xj is the j-th input signal, wji the connection weight, ai the activation function and bi the bias.
There are several methods of training an ANN; Backpropagation is by far the most common. In this research, PSO is adopted as the training algorithm best suited to our application.
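As a sketch of equation (1), a single neuron can be modeled in a few lines of Python; the AND-gate weights below are illustrative assumptions, not values from the paper:

```python
def neuron(x, w, b, activation):
    """Equation (1): activation applied to the weighted input sum plus bias."""
    net = sum(wi * xi for wi, xi in zip(w, x)) + b
    return activation(net)

hardlim = lambda net: 1 if net >= 0 else 0   # hard-limit transfer function

# Illustrative weights (not from the paper): a 2-input AND-gate neuron
print(neuron([1, 1], [1, 1], -2, hardlim))   # prints 1: both inputs high
print(neuron([1, 0], [1, 1], -2, hardlim))   # prints 0
```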
1.2 Particle Swarm Optimization (PSO)
Particle swarm optimization is a population-based evolutionary optimization technique developed by J. Kennedy and R. Eberhart in 1995, motivated by the social behavior of bird flocking and fish schooling [2].
PSO is a kind of stochastic search algorithm that simulates a natural evolutionary process and performs well on some difficult optimization problems. The basic concept of PSO comes from a large number of birds flying randomly and looking for food together. Each bird is an individual, called a particle. As the birds look for food, the particles fly through a multidimensional search space looking for the optimal solution. All the particles form a family rather than isolated individuals: each particle can remember its own flying experience and share its
companions' flying experience [3]. The basic PSO model consists of a swarm of particles moving in a D-dimensional search space; the direction and distance of each particle in this hyper-dimensional space are determined by its fitness and velocity. In general, the fitness is primarily related to the optimization objective, and the velocity is updated according to an update rule [4].
In PSO, the population starts with a random initialization of individuals in the search space, and then the social behavior of the particles in the swarm is repeated until the best possible result is achieved by iterative searching. At each iteration the velocity (position change) is updated and the particle is moved towards a new position. The best position previously visited by the nth particle is denoted the personal best position (pbest), while the best position found by the whole swarm is denoted the global best position (gbest). In other words, the particle swarm optimization idea consists of, at each time step, changing the velocity and location of each particle towards its pbest and gbest locations according to equations 2 and 3:
Vid = W*Vid + C1*rand1*(Pid - Xid) + C2*rand2*(Pgd - Xid) … (2)
Xid = Xid + Vid … (3)
where W is the inertia weight, which brings a balance between global exploration and local exploitation,
C1 and C2 are two constants called learning factors [5], and rand1 and rand2 are two independent random numbers uniformly distributed in the range [0,1]. In equation 2, the first part represents the inertia of the previous velocity; the second part is the cognition part, which represents the particle's private thinking; the third part is the social part, which represents the cooperation among the particles [6].
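Equations 2 and 3 translate directly into code. The following is a minimal single-particle sketch, assuming the parameter values used later in the paper (W = 0.6, C1 = C2 = 1.7); a full trainer would loop over a swarm and track pbest and gbest:

```python
import random

def pso_step(x, v, pbest, gbest, w=0.6, c1=1.7, c2=1.7):
    """One PSO update for a single particle, per equations (2) and (3)."""
    r1, r2 = random.random(), random.random()          # rand1, rand2 in [0, 1]
    v = [w * vi + c1 * r1 * (p - xi) + c2 * r2 * (g - xi)
         for vi, xi, p, g in zip(v, x, pbest, gbest)]  # equation (2)
    x = [xi + vi for xi, vi in zip(x, v)]              # equation (3)
    return x, v
```

When a particle sits exactly at both pbest and gbest, only the inertia term survives, so its velocity simply decays by the factor W each step.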
During the past few years PSO has been shown to be successful in many applications [7-9], and several papers discuss how to apply PSO to training NNs and its advantages [10-13]. For the purpose of NN learning, the empirical error, referred to as the objective function to be optimized (minimized to 0), is given by:
M.S.E = (1/(n·m)) Σp Σk (Tpk - Ypk)² … (4)
where n is the number of training patterns, m is the number of outputs, T is the target and Y is the actual value [1].
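Equation 4 can be written as a short objective function. This sketch assumes targets and outputs are given as n lists of m values each:

```python
def mse(targets, outputs):
    """Empirical error of equation (4), averaged over n patterns and m outputs."""
    n, m = len(targets), len(targets[0])
    return sum((t - y) ** 2
               for tp, yp in zip(targets, outputs)
               for t, y in zip(tp, yp)) / (n * m)

# A perfect network reaches the error goal of exactly zero
print(mse([[1, 0], [0, 1]], [[1, 0], [0, 1]]))   # prints 0.0
```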
1.3 Field Programmable Gate Array (FPGA)
The FPGA is a specially designed IC that is often used for prototyping. Each FPGA has three main parts. The Configurable Logic Block (CLB) is the most significant part; it provides the physical support for the program downloaded to the FPGA. Another part is the Input Output Block (IOB), which provides inputs and outputs for the FPGA and makes communication with the outside world possible. The last part is the Programmable Interconnect (PI), which connects the different parts of the FPGA and allows them to communicate with each other [14].
FPGAs are an excellent technology for implementing NN hardware, and executing a NN on an FPGA is a relatively easy process. To lessen the design circuitry, the training is done independently, off-line from the FPGA; once training is completed and the correct network weights are obtained, these weights are hard-wired on the FPGA. The accuracy with which these weights can be coded depends upon the number of bits available to represent them. Parallelism and dynamic adaptation are two computational characteristics typically associated with ANNs, and FPGA-based reconfigurable computing architectures are well suited to implementing ANNs, as one can exploit concurrency and rapidly reconfigure to adapt the weights and topologies of an ANN [15]. FPGA realization of an ANN with a large number of neurons is still not an easy task, because the ANN algorithm is rich in multiplication operations, which are relatively expensive to realize. Work reported in this area includes new multiplication algorithms for ANNs, NNs with constraints that achieve higher processing speed at lower cost, and multichip realizations [16-18].
2. Theory Description
There has been a significant increase in research and development in the area of applying evolutionary computation techniques to evolving one or more aspects of ANNs. These evolutionary techniques have usually been used to develop the NN structure or the network learning algorithm [19]. Here, a new learning algorithm combined with an ANN is proposed to determine the optimal weights, which are adjusted by PSO.
PSO has its own advantages and disadvantages compared with other computational algorithms. Advantages such as its probabilistic mechanism and multiple starting points allow PSO to avoid getting trapped in a local optimal solution [20]. However, the property of PSO most exploited in this paper is that it is derivative-free, which means that we can train feed-forward NNs using PSO as the learning algorithm with only the Hardlimit activation function (shown in Figure 1) for all network layers. According to the properties of the Hardlimit activation function, the output is either one or zero; this property is very helpful in simplifying the network multiplication process.
Figure 1. Hard-limit transfer function.
Numerous studies have further explored the power of PSO as a training algorithm. These studies have shown that NNs trained by PSO give more accurate results than other training algorithms, but at the same time PSO is slower than algorithms such as Backpropagation. For the purpose of neural network implementation of digital logic circuits, we have modified the MATLAB PSO tools to suit our application. The modified PSO tools give exactly the weights needed for the network; these weights are integers only, which is helpful in executing the multiplication process using only AND gates.
3. The Proposed Design of PSO Neuron
The neuron is the main part of every ANN. Each neuron can have countless inputs; however, it has only one output value. Any digital design of the neuron must aim to minimize the neuron hardware circuitry. Since our application is to construct digital circuits using NNs, the inputs to the network are restricted to only two values, one or zero, while the weights are only integer values, for ease of representation and to reduce complexity. From these two facts, the multiplication process needs no more than AND gates, where each input is multiplied each time with a single bit of the weight using an AND gate. Suppose we have a single-input neuron and the weights are restricted to the integer range [-3, 3]; then each weight can be represented using only two magnitude bits plus a single bit for the weight sign, so three AND gates are sufficient for the multiplication with sign, where 0 stands for a positive sign and 1 stands for a negative sign, as shown in Figure 2.
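The idea behind Figure 2 can be sketched behaviorally: with a one-bit input and a sign-magnitude weight, each AND gate simply gates one weight bit through. This is a software sketch of the behavior, not the schematic itself:

```python
def and_multiply(x, sign, b1, b0):
    """Multiply a one-bit input x by a sign-magnitude weight (sign, b1 b0)
    using only AND operations; 0 = positive sign, 1 = negative sign."""
    p1 = x & b1          # AND gate for the high magnitude bit
    p0 = x & b0          # AND gate for the low magnitude bit
    s = x & sign         # AND gate passing the sign through only when x = 1
    magnitude = (p1 << 1) | p0
    return -magnitude if s else magnitude

# weight -3 is stored as (sign=1, b1=1, b0=1)
print(and_multiply(1, 1, 1, 1))   # input 1 x weight -3 -> prints -3
print(and_multiply(0, 1, 1, 1))   # input 0 x anything  -> prints 0
```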
Figure 2. Two bit with sign multiplication digital circuit.
This structure is repeated for each input. The products of the multiplication process are added or subtracted according to the weight signs using a specially designed adder/subtracter with sign, as illustrated in Figure 3.
Figure 3. Two Bit Adder/Subtracter circuit.
As noted, the Hardlimit transfer function is applied in all network layers; therefore the output of the neuron is 1 if the net (the final sum of the weighted-input products) is greater than or equal to 0, and 0 if the net is less than 0. The neuron structure is shown in Figure 4. The output of the neuron is simply the inverted sign bit of the final neuron net, as shown in Figure 5.
Figure 4. Two input neuron with two bit weight circuit.
Figure 5. Two input neuron with two bit weight circuit.
4. The Proposed Design of Digital Circuit
In this paper we implement PSO-trained ANNs that realize two digital circuits: the first is a one-bit full adder and the second a 4-bit Arithmetic Logic Unit (ALU).
4.1 Full Adder
To design the 1-bit full adder we must first obtain the right integer weights using the modified MATLAB PSO tools. The parameters used for this purpose are: W = 0.6, C1 = C2 = 1.7, number of particles 1000; 3 neurons in the input layer, 3 neurons in the hidden layer and 2 neurons in the output layer; weights restricted to the integer range [-3, 3]. Figure 6 shows the particle dynamics through the training process, Figure 7 shows gbest against epochs for the full adder until the error is minimized to zero, Figure 8 shows the ANN full adder circuit, and finally Figure 9 shows the logic simulator timing diagram for the ANN full adder circuit. The error goal was reached, with successful termination after 18 iterations. The resulting weights and biases are:
W{1,1} = [1 0 -3; -3 -3 1; -1 -2 -2], B{1} = [0; 2; 2]
W{2,1} = [-2 -1 2; 0 0 -3], B{2} = [0; 2]
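These reported weights can be checked against the full-adder truth table. Assuming hardlim(net) = 1 for net >= 0 and the outputs ordered (sum, carry-out), a short simulation confirms all eight input combinations:

```python
from itertools import product

hardlim = lambda net: 1 if net >= 0 else 0   # hard-limit activation

def layer(W, b, x):
    """Evaluate one hard-limit layer: hardlim of W.x + b for each neuron."""
    return [hardlim(sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

# Weights and biases reported above for the trained full-adder network
W1, B1 = [[1, 0, -3], [-3, -3, 1], [-1, -2, -2]], [0, 2, 2]
W2, B2 = [[-2, -1, 2], [0, 0, -3]], [0, 2]

for a, b, cin in product([0, 1], repeat=3):
    s, cout = layer(W2, B2, layer(W1, B1, [a, b, cin]))
    assert s == a ^ b ^ cin                      # sum bit
    assert cout == (a & b) | (cin & (a ^ b))     # carry-out bit
print("all 8 input combinations match the full-adder truth table")
```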
Figure 6. Particles dynamics.
Figure 7. Gbest vs. epochs for Full adder circuit.
Figure 8. The ANN Full Adder circuit.
Figure 9. Logic simulator timing diagram ANN Full Adder circuit.
4.2 4-Bit ALU
The Arithmetic Logic Unit (ALU) is a 4-bit high-speed parallel ALU controlled by the four Function Select inputs (S0–S3) and the Mode Control input (M). It can perform all 16 possible logic operations or 16 different arithmetic operations on active-HIGH or active-LOW operands; the function table lists these operations. When the Mode Control input (M) is HIGH, all internal carries are inhibited and the device performs logic operations on the individual bits as listed. When the Mode Control input is LOW, the carries are enabled and the device performs arithmetic operations on the two 4-bit words [21].
Table1. ALU operation [21].
Since we have a large data set of 16,384 patterns, due to the 14 inputs, we divide the network into 5 parts with a maximum of 8 inputs and outputs each to simplify the design. Figure 10 shows the first ALU part, Figure 11 shows gbest against epochs until the error is minimized to zero, and Figure 12 shows the trained part-1 neural network digital circuit. W = 0.6, C1 = C2 = 1.7, number of particles 1000.
Figure 10. Part 1 ALU digital circuit.
6 neurons in the input layer and 2 neurons in the output layer; weights restricted to the integer range [-7, 7]. The error goal was reached, with successful termination after 19 iterations.
W{1,1} = [-4 -3 0 0 -1 -2; -5 -7 2 -2 1 0], B{1} = [7; 5]
Figure 11. Gbest vs. epochs part 1 ALU.
Figure 12. Part 1 trained Neural Network ALU.
Figure 13 shows the part-2 ALU digital circuit, while Figure 14 shows gbest against epochs for the part-2 NN training until zero error is reached, and Figure 15 shows the trained part-2 neural network digital circuit.
Figure 13. Part 2 ALU digital circuit
Number of particles 10000; 4 neurons in the input layer, 3 neurons in the hidden layer and 1 neuron in the output layer; weights restricted to the integer range [-3, 3]. The error goal was reached, with successful termination after 31 iterations.
W{1,1} = [2 2 1 -1; 2 1 -1 0; -1 1 2 -1], B{1} = [-3; -1; -2]
W{2,1} = [2 -2 -2], B{2} = 1
Figure 14. Gbest vs. epochs Part 2 ALU.
Figure 15. Part 2 trained NN ALU.
Figure 16 shows the part-3 ALU digital circuit, while Figure 17 shows gbest vs. epochs for the part-3 ALU training, and Figure 18 shows the trained part-3 NN ALU digital circuit.
Figure 16. Part 3 ALU.
Number of particles 10000; 6 neurons in the input layer, 3 neurons in the hidden layer and 1 neuron in the output layer; weights restricted to the integer range [-7, 7]. The error goal was reached, with successful termination after 55 iterations.
W{1,1} = [4 0 7 -4 1 -4; -6 -1 -3 -3 -1 7; 2 1 -5 5 1 -3], B{1} = [-2; 4; -2]
W{2,1} = [-5 -5 -3], B{2} = 7
Figure 17. Part 3 Gbest vs. epochs ALU.
Figure 18. Part 3 trained NN ALU.
Figure 19 shows the part-4 ALU digital circuit, while Figure 20 shows gbest vs. epochs for the NN training, and Figure 21 shows the trained part-4 NN ALU digital circuit.
Figure 19. Part 4 ALU.
Number of particles 10000; 8 inputs, 3 hidden neurons and 1 output neuron; weights restricted to the integer range [-31, 31]. The error goal was reached, with successful termination after 64 iterations.
Figure 20. Part 4 Gbest vs. epochs ALU.
Figure 21. Part 4 trained NN ALU.
Figure 22 shows the part-5 ALU digital circuit, while Figure 23 shows the part-5 gbest vs. epochs, and Figure 24 shows the part-5 trained NN ALU digital circuit.
Figure 22. Part 5 ALU.
Number of particles 1000; 8 inputs and 1 output neuron; weights restricted to the integer range [-31, 31]. The error goal was reached, with successful termination after 27 iterations. Figure 25 shows the trained 4-bit ALU NN.
Figure 23. Part 5 Gbest vs. epochs ALU.
Figure 24. Part 5 trained NN ALU.
Figure 25. 4-bit ALU.
Figure 26. Logic simulator timing diagram for the trained NN 4-bit ALU, three random output readings.
5. Conclusions
This paper has presented the implementation on FPGA of ANNs trained by the PSO learning algorithm. The proposed learning algorithm was very useful for reducing the neuron circuitry by reducing the multiplication process to only AND gates. The PSO learning algorithm was superior to other learning algorithms in decreasing the number of neurons needed to minimize the mean square error to zero, which means 100% accuracy. The drawback of the PSO training algorithm is its slowness compared with other training algorithms, especially for a large number of particles; this is to be expected, because PSO is a multiple-starting-points algorithm, unlike algorithms such as Backpropagation. However, even this slowness is not a big issue because, as stated before, the training of the network is done entirely outside the FPGA and only once; once the network has been trained, the weights are fixed and hard-coded as constants 1 and 0, which correspond to VCC and Ground.
6. References
[1] A. P. Engelbrecht, Computational Intelligence: An Introduction, John Wiley & Sons Ltd, 2007.
[2] J. Kennedy and R. Eberhart, "Particle Swarm Optimization", IEEE Int. Conf. on Neural Networks, 1995, pp. 1942-1948.
[3] L. Wang, X. Wang, J. Fu and L. Zhen, "A Novel Probability Binary Particle Swarm Optimization Algorithm and its Application", Journal of Software, Vol. 3, No. 9, 2008.
[4] X. Feng Xie, W. Jun Zhang and Z. Lian Yang, "Adaptive Particle Swarm Optimization on Individual Level", IEEE 0-7803-7488-6, International Conference on Signal Processing (ICSP), 2002, pp. 1215-1218.
[5] R. Eberhart and Y. Shi, "Particle Swarm Optimization: Developments, Applications and Resources", IEEE Int. Conf. on Evolutionary Computation, 2001, pp. 81-86.
[6] J. Kennedy, "The Particle Swarm: Social Adaptation of Knowledge", IEEE Int. Conf. on Evolutionary Computation, 1997, pp. 303-308.
[7] V. G. Gudise and G. K. Venayagamoorthy, "FPGA Placement and Routing Using Particle Swarm Optimization", IEEE Computer Society Annual Symposium on VLSI Emerging Trends in VLSI Systems Design, 0-7695-2097-9/04, 2004.
[8] G. Kendall and Y. Su, "A Particle Swarm Optimization Approach in the Construction of Optimal Risky Portfolios", Proceedings of the 23rd IASTED International Multi-Conference on Artificial Intelligence and Applications, 2005.
[9] K. Chandramouli and E. Izquierdo, "Image Classification using Chaotic Particle Swarm Optimization", IEEE 1-4244-0481, ICIP, 2006.
[10] V. G. Gudise and G. K. Venayagamoorthy, "Comparison of Particle Swarm Optimization and Backpropagation as Training Algorithms for Neural Networks", IEEE Symposium on Swarm Intelligence, Indianapolis, IN, USA, 2003, pp. 110-117.
[11] F. Bergh and A. Engelbrecht, "Cooperative Learning in Neural Networks using Particle Swarm Optimization", SACJ/SART, 2000.
[12] W. Zha and G. K. Venayagamoorthy, "Neural Networks Based Non-Uniform Scalar Quantizer Design with Particle Swarm Optimization", IEEE 0-7803-8916, 2005.
[13] J. Ru Zhang, J. Zhang, T. Ming Lok and M. Lyu, "A Hybrid Particle Swarm Optimization Back-Propagation Algorithm for Feedforward Neural Network Training", Applied Mathematics and Computation 185, 2007, pp. 1026-1037.
[14] R. Raeisi and A. Kabir, "Implementation of Artificial Neural Network on FPGA", American Society for Engineering Education, Indiana and North Central Joint Section Conference, 2006.
[15] A. Muthuramalingam, S. Himavathi and E. Srinivasan, "Neural Network Implementation Using FPGA: Issues and Application", International Journal of Information Technology, Vol. 4, No. 2, pp. 86-92.
[16] R. H. Turner and R. F. Woods, "Highly Efficient Limited Range Multipliers for LUT-Based FPGA Architecture", IEEE Transactions on Very Large Scale Integration Systems, Vol. 15, No. 10, 2004, pp. 1113-1117.
[17] M. Marchesi, G. Orlandi, F. Piazza and A. Uncini, "Fast Neural Networks Without Multipliers", IEEE Transactions on Neural Networks, Vol. 4, No. 1, 1993.
[18] B. Noory and V. Groza, "A Reconfigurable Approach to Hardware Implementation of Neural Networks", Canadian Conference on Electrical and Computer Engineering, IEEE, 2003, pp. 1861-1863.
[19] G. K. Venayagamoorthy and R. G. Harley, "Swarm Intelligence for Transmission System Control", IEEE 1-4244-1298-6, 2007.
[20] R. Mendes, P. Cortez, M. Rocha and J. Neves, "Particle Swarm for Feedforward Neural Network Training", IEEE 0-7803-7278-6, 2002, pp. 1895-1899.
[21] Fairchild Semiconductor Corporation DS00982, "DM74LS181 4-Bit Arithmetic Logic Unit", 2000.