TRANSCRIPT
AN HYBRID ARCHITECTURE FOR MULTI-LAYER FEED-FORWARD NEURAL NETWORKS
by
Zulfiqar Ahmed
A Thesis Submitted to the College of Graduate Studies through the Department of Electrical Engineering in Partial Fulfillment
of the Requirements for the Degree of Master of Applied Science at the
University of Windsor
Windsor, Ontario, Canada
May 1999
National Library of Canada / Bibliothèque nationale du Canada
Acquisitions and Bibliographic Services / Acquisitions et services bibliographiques
395 Wellington Street, Ottawa ON K1A 0N4, Canada / 395, rue Wellington, Ottawa ON K1A 0N4, Canada
The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.
The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.
© Copyright Zulfiqar Ahmed 1999
ABSTRACT
Multi-layer feed-forward neural networks have the capability to classify and generalize, which is not achievable with many other methods. The complete exploitation of their potential requires efficient hardware implementation. The two main problems of hardware realization, easy long-term storage of synaptic weights and massive interconnections, are addressed and solved by a mixed-signal architecture for the implementation of feed-forward neural networks. The hybrid architecture is analyzed and implemented in 0.5 micron CMOS technology. The analog processing blocks have been designed in current-mode analog CMOS, and the synaptic weights and threshold values are stored in digital ROM.
ACKNOWLEDGEMENTS
I would like to express my deep gratitude and thanks to my supervisor, Dr. M. Ahmadi, for his support and guidance throughout the progress of this thesis. I would also like to thank Dr. G. A. Jullien for his very helpful suggestions and encouragement. I also appreciate the participation of Dr. A. Jaekel in my committee and her fruitful suggestions. Thanks must also go to my parents for their infallible support and encouragement.
TABLE OF CONTENTS
ABSTRACT
DEDICATION
ACKNOWLEDGEMENTS
LIST OF FIGURES
LIST OF TABLES

1. Introduction
1.1 Introduction
1.2 A brief history of research in neural models
1.3 Neural network models
1.3.1 Feed back or recurrent networks
1.3.2 Feed forward networks
1.4 The learning algorithm
1.4.1 Error back propagation algorithm
1.5 Implementation of neural networks
1.6 Goals, objectives and organization

2. VLSI implementation of neural networks
2.1 Introduction
2.2 Analog implementation
2.2.1 Floating gate (EEPROM) devices
2.2.2 Capacitive storage
2.3 Hybrid implementation

3. An architecture for multi-layer neural networks
3.1 A VLSI architecture
3.2 The multiplexed architecture
3.3 Operation
3.4 Input/Output requirements
3.5 Multiplexing
3.6 Pipelined architecture
3.7 Performance analysis on an example
3.7.1 Software implementation of XOR
3.7.2 Quantization
3.7.3 VLSI implementation of XOR
3.7.4 Testing of neural chip
3.8 Summary

4. VLSI circuitry
4.1 Neuron
4.1.1 Current mode neuron
4.1.2 Voltage mode neuron
4.2 Synapse
4.2.1 Multiplier
4.2.2 ROM
4.3 Trans-impedance amplifier
4.4 VLSI implementation of capacitor
4.5 Summary

5. Conclusion

Appendix A Mask Layout Diagrams
Appendix B Simulation Models
Appendix C Verilog Source Code
REFERENCES
VITA AUCTORIS
LIST OF FIGURES
Figure 1.1 Neuron model
Figure 1.2 The architecture of a feed back neural network
Figure 1.3 Multi-layer feed forward neural network
Figure 2.1 Floating gate structure for weight storage
Figure 3.1 A hybrid architecture by Djahanshahi
Figure 3.2 A unified synapse neuron
Figure 3.3 Layers and stages in the architecture
Figure 3.4 Internal structure of a stage
Figure 3.5 The adder
Figure 3.6 Conversion of the signal mode at input and output
Figure 3.7 Second solution to the I/O mode
Figure 3.8 Pipelined architecture
Figure 3.9 Modified structure of each stage
Figure 3.10 XOR network with hidden layer
Figure 3.11 Implemented network for the XOR function
Figure 3.12 Network with quantized values
Figure 3.13 System level VLSI layout for XOR
Figure 3.14 System level simulations
Figure 3.15 Fabrication results of the neural chip
Figure 4.1 Sigmoid function and its derivative
Figure 4.2 Differential transistor pair and its transfer characteristics
Figure 4.3 Current mode neuron
Figure 4.4 Transfer characteristics of current mode neuron
Figure 4.5 Results of Monte Carlo analysis for current mode neuron
Figure 4.6 Voltage mode neuron
Figure 4.7 Transfer characteristics of voltage mode neuron
Figure 4.8 Monte Carlo analysis for voltage mode neuron
Figure 4.9 Multiplying digital to analog converter
Figure 4.10 Transfer characteristics of MDAC
Figure 4.11 ROM
Figure 4.12 Simulation results of ROM
Figure 4.13 Trans-impedance amplifier
Figure 4.14 Transient response
Figure A.1 Mask layout of current mode neuron
Figure A.2 Mask layout of voltage mode neuron
Figure A.3 Mask layout of MDAC
Figure A.4 Mask layout of trans-impedance amplifier
Figure A.5 Mask layout of ROM with buffers
Figure A.6 Mask layout of capacitor
Figure A.7 Mask layout of neural chip
LIST OF TABLES
Table 3.1 Test vectors
Table 3.2 Simulation result
Table 3.3 Simulation result with quantized values
Table 4.1 Device sizes of current mode neuron
Table 4.2 Device sizes of voltage mode neuron
Table 4.3 Device sizes of trans-impedance amplifier
Table 4.4 Device sizes of capacitors
Chapter 1
Introduction
1.1 Introduction:
Massively parallel networks based on neural network models have been the subject of great interest for the last several years. Although neural networks were for many years considered basically as models for understanding biological information processing in the human brain, the recent resurgence of interest in neural network concepts as a new approach to computing is a cumulative effect of the work of several researchers [6, 10, 13]. The reasons behind it are the maturing of the technologies necessary to implement massively interconnected parallel networks and improved knowledge of the biological models upon which neural models are loosely based. These motivations are underscored by the general failure of symbolic processing to efficiently perform the tasks which are at the core of neural networks, such as image processing, pattern/speech recognition, robotics and process control.
Neural networks consist of a large number of simple node processors. These processors do not require programming in the normal way, and any control which is necessary can usually be exercised on at least a semi-global basis. The stored information (associative memory) in a neural network is represented in a distributed manner. Proper operation of the network is therefore not dependent upon the value at any specific storage location. Information processing within the network is also distributed. In the human brain biological neurons keep dying, yet the brain still functions correctly without any loss of performance. Likewise, in many neural network implementations a few defective neurons produce little or no degradation in the performance of the network. Neural networks are therefore fault tolerant [63]. This is in contrast to digital computers.
Digital computers are very useful in solving well-defined problems, since these can be represented by a sequence of instructions. However, they have great difficulty when addressing tasks which biological systems appear to perform with relative ease, such as pattern/speech recognition [7-9,11] and control. Real-time performance of these tasks, which is sometimes critical, presents even more difficulty for digital computers.
It is quite evident that the full power of the human brain is not needed to solve many perplexing problems such as associative memory and pattern recognition, but the characteristics and techniques employed by the brain are still required to handle these problems effectively. Massive interconnected parallelism is the most important feature of the brain. The human brain performs
some tasks very easily which are very difficult for digital computers; the reason for the efficiency of the brain lies in its massively parallel approach.
Analog circuits offer the best choice for providing this type of computational power. An analog synapse can be as simple as a single transistor and a capacitor. This equals the complexity of a single bit of dynamic memory in a digital computer, yet the analog circuit is capable of representing the equivalent of about 8 bits worth of information and of performing a crude multiplication, which would require a large amount of digital hardware to duplicate. This represents a large reduction in chip area in comparison to digital circuits of a comparable level of processing power. Analog circuitry is prone to process-dependent parameters such as offsets, restricted dynamic range, noise and temperature sensitivity; however, these can be controlled by special design techniques.
1.2 A brief history of research in neural models:
The history of research on the nervous system goes as far back as the discovery of the electrical nature of nervous transmission by Galvani, and the early experiments of Helmholtz [67]. Significant advances in the area of neuro-modeling, however, had to wait until the end of the nineteenth century. At this time, cognitive psychologists like James [2] outlined fundamental concepts of neural activity which are still in use today.
The famous paper of McCulloch and Pitts [3] in 1943 is generally considered the beginning of serious work on neural modeling. They showed that a network of linear threshold elements can compute any logical function. As a matter of fact, this paper had a more pronounced effect on computer science than on neural networks [4-5]. In 1949, Donald Hebb published his epoch-making work [12], stating the correlation update law. Frank Rosenblatt [13] proposed the first computationally oriented network and also gave the perceptron convergence procedure. Widrow and Hoff [14] with their ADALINE (1960) helped to bring adaptive systems and neural networks together.
At this time, one group, inspired by Rosenblatt, believed in the power of the perceptron and in general considered a connectionist architecture essential to the types of behavior observed in human learning and recall processes. The second group attached little importance to the internal mechanisms of the nervous system, and was mostly in favor of serial symbolic processing. The area of artificial intelligence is a reflection of the beliefs of this group.
In 1969, Minsky and Papert published their famous book, Perceptrons [15]. This book is a brilliant mathematical analysis of the limitations of networks consisting of a single layer of linear threshold units. The only mistake of the authors was that they went a step further and surmised that these limitations extend to the general multi-layer and non-linear case. This misconception resulted in a considerable slackening of the pace of research in neural networks for the next two decades.
Between this time and the late 1970's, a new interest in neural networks arose; the works of Kohonen on correlation matrix memories [16] and Grossberg on mathematical modeling [17] were the most prominent.
In the 1980's a re-awakening of neural networks was seen. The developments of this decade include studies in letter perception [18,20], Hopfield networks [21-22], self-organized networks [23-25], the Neocognitron [26-29], the Boltzmann machine [30], and the error back propagation learning algorithm [31-34]. The latter topic is of paramount interest and has attracted researchers from various fields of science and engineering.
1.3 Neural network models:
Generally, neural network models share a common underlying structure. In its most basic form, a neural network can be described as a collection of simple node processors, called neurons, which interact among themselves through a massively interconnected synaptic network. The function of a single processor is to derive a weighted sum of all the previous outputs connected to its input and to apply a non-linear function (usually a sigmoid) to the result. The neuron model in layer l+1 is shown in figure 1.1.
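The node processor just described, a weighted sum passed through a sigmoid, can be sketched in a few lines of Python. This is only an illustrative software sketch; the function name and the example values are not from the thesis:

```python
import math

def neuron(inputs, weights, threshold):
    """One node processor: a weighted sum of the previous layer's
    outputs plus a threshold, passed through a sigmoid."""
    net = sum(w * x for w, x in zip(weights, inputs)) + threshold
    return 1.0 / (1.0 + math.exp(-net))  # sigmoid activation in (0, 1)

# with zero net input the sigmoid sits at its midpoint, 0.5
print(neuron([0.0, 0.0], [1.0, 1.0], 0.0))
```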
Neural networks are usually classified by their general structure. Among the many hardware implementations of neural network models, feed back and feed forward neural networks are the most frequently used.
Figure 1.1: Neuron model
Another class of information processing systems, called Cellular Neural Networks, was presented by Chua and Yang [19]. Similar to neural networks, a CNN is a large scale non-linear analog circuit which processes signals in real time; like cellular automata, it consists of a massive aggregate of regularly spaced circuit clones, called cells, which communicate with each other directly only through their neighbors. Each cell is made of a linear capacitor, a non-linear voltage controlled current source and a few linear resistive circuit elements. Cellular neural networks are well suited for high speed parallel signal processing. CNNs, which combine some features of fully interconnected analog neural networks with the nearest-neighbor interactions found in cellular automata, are especially well suited for VLSI implementations [69].
1.3.1 Feed back or recurrent networks:
The general structure of a recurrent network is shown in figure 1.2. Each node has a non-decreasing sigmoid non-linearity, and its output is fed back to all other neurons via synaptic weights. This model has been applied to tasks such as associative memory and optimization, where one out of several competing solutions must be resolved.
Figure 1.2: The architecture of a feed back neural network
Associative memories, or content addressable memories, can be considered optimizing networks, because they attempt to minimize an objective function. Each neuron has an external input and output. The task assigned to the network is to reconstruct a vector when only a part of it is given
as input. The synaptic weights are programmed such that the minima of the objective function correspond to the stored vectors. When a partial input vector is applied, the network will generate the remainder of the vector whose stored pattern best matches the one presented. Computer aided machines were among the earliest hardware implementations of neural networks [70]. The most common model of feed back neural networks is the Hopfield model [21,22,71].
1.3.2 Feed forward networks:
The most commonly used model of neural network architecture is the feed forward network with two or more hidden layers. The classical perceptron had a single layer of node processors and is a feed forward network. Multi-layer feed forward neural networks have one or more layers of node processors between the input and output layers; they are also called multi-layer perceptrons [44]. The general form of a feed forward network is shown in figure 1.3; the outputs of any layer are weighted and added as an input to a neuron in the next layer.

An external input is applied to the first layer, i.e. the input layer; the processed output is then fed to the next layers until the last layer, i.e. the output layer. These layers are cascaded to form a multi-layer feed forward neural network. The governing equation for any neuron in any layer is given by:

y_j = S_j ( sum_i w_ij x_i + theta_j )   (1.1)

where y_j is the output of the jth neuron, S_j is a non-linear sigmoid function, w_ij is the connection weight from the ith output to the jth neuron input, x_i is the ith output of a neuron in the previous layer and theta_j is a threshold at the input of the jth neuron.
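The layer-by-layer cascade just described can be sketched in software; the layer sizes and weight values below are arbitrary illustrations, not taken from the thesis:

```python
import numpy as np

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))

def forward(x, layers):
    """Cascade of layers: each layer computes the weighted sum of the
    previous layer's outputs plus a threshold, then applies the sigmoid."""
    for W, theta in layers:
        x = sigmoid(W @ x + theta)
    return x

# an illustrative 2-3-1 network with arbitrary weights and thresholds
layers = [
    (np.array([[0.5, -0.2], [0.1, 0.8], [-0.4, 0.3]]), np.array([0.1, 0.0, -0.1])),
    (np.array([[1.0, -1.0, 0.5]]), np.array([0.2])),
]
print(forward(np.array([1.0, 0.0]), layers))
```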
Figure 1.3: Multi-layer feed forward neural network
Networks with this structure have a flow of information in the forward direction only and are therefore inherently stable. They are often used in classification tasks; common applications include pattern recognition and speech recognition [8,9,11].
1.4 The learning algorithm:
In order to realize a desired result with neural networks, a learning process must be used to find the correct weight matrix. There are two main classes of learning algorithms: supervised and unsupervised. In the case of unsupervised learning the network does not receive any feedback from the environment and no information about the correct result is given; the training relies on redundancies in order for the network to self-organize. Supervised learning is a process where feedback about the error is given and the weights are adjusted accordingly to minimize the error. One such supervised training algorithm used for multi-layer feed forward neural networks is the error back propagation algorithm [64].
1.4.1 Error back propagation algorithm:
Error back propagation is the most popular supervised learning rule for the implementation of multi-layer neural networks [64]. It is an extension of the gradient rule: a steepest-descent method which minimizes the total mean square error. The training phase consists of a forward pass for the output computation, calculation of the output error, and propagation of this error to each neuron using a backward pass. The weight update may then be performed by application of the neural activations and the associated errors. Let f(net_i) be the activation of the ith neuron, t_i and y_i be the desired and obtained values of the ith network output, and w_ij be the weight of the connection from the jth neuron to the ith. The delta_i is the error propagated to the ith neuron.

At the output layer:

delta_i = f'(net_i) (t_i - y_i)   (1.2)

At the hidden layer:

delta_i = f'(net_i) sum_k w_ki delta_k   (1.3)

The weight update can then be represented by:

delta_w_ij = epsilon delta_i x_j   (1.4)

where epsilon is the learning rate and x_j is the output of the jth neuron. The advantage of the error back propagation algorithm is its high parallelism: a single forward pass and a single backward pass are sufficient for the weight update calculations. The parallelism is achieved by assuming a known neural activation function along with its derivative for the backward computations [64]. This assumption is quite valid for software implementations but cannot be extended to the analog hardware domain [63].
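The forward pass, error propagation and weight update described above can be sketched as a plain software implementation, here training the XOR function that is used as the benchmark later in this thesis. The network size, seed and hyperparameters are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# XOR training set
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
T = np.array([[0.], [1.], [1.], [0.]])

# a 2-4-1 network with random initial weights
W1 = rng.normal(0.0, 1.0, (2, 4)); b1 = np.zeros(4)
W2 = rng.normal(0.0, 1.0, (4, 1)); b2 = np.zeros(1)
eps = 0.5  # learning rate

for _ in range(30000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    y = sigmoid(h @ W2 + b2)
    # backward pass: propagated error at the output layer, then at the
    # hidden layer; for the sigmoid, f'(net) = y * (1 - y)
    d_out = (T - y) * y * (1.0 - y)
    d_hid = (d_out @ W2.T) * h * (1.0 - h)
    # weight update: learning rate times propagated error times activation
    W2 += eps * h.T @ d_out; b2 += eps * d_out.sum(axis=0)
    W1 += eps * X.T @ d_hid; b1 += eps * d_hid.sum(axis=0)

print(np.round(y.ravel(), 3))  # should approach [0, 1, 1, 0]
```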
1.5 Implementation of neural networks:
Mathematical modeling and the work of Grossberg and Hebb were fundamental in the development of neural networks [12,17]. However, investigators of the nervous system and of behavioral psychology were only able to evaluate the performance of their models with any accuracy or precision after the advent of digital computers and the availability of numerical simulations. Farley and Clark [35] were the first to utilize the digital computer for modeling and software simulation of neural networks.
An early software implementation was the Neocognitron by Fukushima [26-29]. This was a multi-layer neural network with 9 layers. Four different types of neural units were used and the learning was supervised. The network was trained to recognize hand-written characters regardless of position.
Another example of a software implementation of neural networks was NETtalk, a network that learned how to read. This feed forward, 3-layer network was developed in 1986 by Sejnowski and Rosenberg [36]. The Boltzmann machine and error back propagation learning rules were applied with comparable results; the comparison showed that the error back propagation algorithm was faster in learning.
Software simulation on a digital computer is not difficult, but it is time consuming. This stems from modeling a highly interconnected parallel system with serial hardware. The hardware implementation of neural networks has been addressed by many researchers. The first learning machine was made by Marvin Minsky in 1951. This machine had a memory consisting of 40 control knobs, which were moved by a single motor through electric clutches. It had 300 thermionic valves and, in Minsky's own words [37], "was never thoroughly debugged, but worked nonetheless (robustness)".
In the early 1960's at Cornell University, Rosenblatt built the "Mark I" perceptron with 400 photoreceptive sensors in a 20 x 20 array. This perceptron had 152 associative units and 8 binary response units for the final classification. Each sensory unit had up to 40 random connections to the associator units [38]. The major obstacle in the realization of networks is the large amount of hardware required to implement even the simplest functions.
1.6 Goals, objectives and organization:
The objective of this thesis is to develop an architecture and circuits for feed forward multi-layer neural networks. An architecture for this type of network will be presented. It will be shown that this architecture results in a decrease in the number of physical interconnections on the chip without any loss of generality. The multiplexing scheme used will also make multi-chip systems possible without a large number of interconnections. The architecture and circuits presented can be applied to any feed forward neural network regardless of application. A complete set of cells for this architecture has been designed and integrated in a neural chip for the boolean function XOR, designed in a full custom 0.5 micron CMOS (3-metal, single polysilicon) process.
Chapter 2 will draw some comparisons between analog and digital implementations, survey analog synapse techniques, and discuss the solutions to the analog memory problem. In chapter 3 an implementation of a modular hybrid architecture using a 0.5 micron CMOS process will be presented. The building blocks of the hybrid architecture will be presented in chapter 4. Finally, chapter 5 is the conclusion. This chapter presents a summary of the work done and, as well, promising areas for future research.
Chapter 2
VLSI Implementation of Neural Networks
2.1 Introduction:
Artificial neural networks are a class of distributed processors, consisting of massively interconnected simple processing elements. Their capabilities of generalization, adaptation, learning from examples and tolerance to noise [31] have made them very attractive for many applications such as pattern recognition, image/speech processing, and control.
The massively interconnected distributed structure of neural networks typically requires millions of operations per second. Therefore, complete exploitation of the potential of neural networks is not achievable through software implementations. The large number of simple processors, the huge interconnectivity of the processors and the distributed information storage suggest VLSI as a suitable hardware implementation scheme.
As with any other VLSI architecture realization, there are three trends in neural network implementation: digital, analog and mixed. The advantages of the digital approach are:
a) Design techniques are advanced, automated and well understood.
b) Programming of weights can be managed easily.
c) Interchip communication and possible exchange of information with a host computer can be easily performed.
d) Digital memories are comparatively easy to build, and the weight storage problem, from which analog implementations suffer, does not appear in the digital case.
However, neural networks are analog in nature and, intuitively, it seems that an analog implementation would be more suitable and elegant. Neural networks use signals which have limited precision and range, and the environment can be very noisy. Therefore the precision and immunity to noise, the main advantages of a digital system, are not essential. Also, in a digital design the multipliers occupy a larger silicon area compared with their analog counterparts, and the two-level information representation on each line increases the interconnections [65-66].
The analog approach [39-46] provides a compact realization of the neuron non-linearity and the synaptic operation, which are crucial in the implementation of neural networks for real world applications. Also, the ability to transmit/process more than one bit per line provides a desirable reduction of interconnections. Some of its limitations are low precision, low noise immunity, temperature dependence and process parameter dependence. However, special design measures, which typically increase both the design complexity and size, may be taken to improve precision, temperature and process tolerances [47-48].
The major drawback of the analog implementation approach is the lack of a reliable non-volatile analog memory to store the connection weights.
The third design approach [1,53-56] combines the digital and analog methods. This scheme is called the mixed (hybrid) analog/digital method. Murray and Smith [53] have used pulse coded analog signals and digital synaptic storage. Arima et al [54] have reported a 400-neuron design which takes advantage of analog weight storage on capacitors and binary neuron activation. In the design of Boser et al [7,49] all the operations are performed in analog mode, but in order to simplify system integration, digital inputs/outputs (both weights and activations) are used in the design.
Since the implementation of non-volatile programmable memory is the major drawback of fully analog realization, some researchers have used complete analog processing blocks in conjunction with a digital weight memory [55-56]. In both of these designs synaptic multiplication is based on a multiplying D/A converter. This hybrid multiplier produces an analog output through multiplication of the digital weight by the analog input. The drawback of this approach is the limited number of bits in the dynamic range. This will have an adverse effect on the convergence of most learning algorithms. However, this method is suitable for applications where constant adaptation is not required.
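A behavioral sketch of such a hybrid synapse can make the dynamic range limitation concrete. This is not the circuit of [55-56]; the 5-bit weight width, current levels and function names are illustrative assumptions:

```python
def mdac_synapse(i_in, weight_code, bits=5):
    """Behavioral model of a multiplying D/A synapse: an analog input
    current is scaled by a signed digital weight held in memory.
    weight_code is a two's-complement code in [-2**(bits-1), 2**(bits-1)-1]."""
    lo, hi = -2 ** (bits - 1), 2 ** (bits - 1) - 1
    if not lo <= weight_code <= hi:
        raise ValueError("weight exceeds the digital dynamic range")
    # each LSB contributes i_in / 2**(bits-1) to the output current
    return i_in * weight_code / 2 ** (bits - 1)

def quantize(w, bits=5):
    """A real-valued weight must first be quantized to the nearest code;
    this rounding is the precision loss discussed above."""
    code = round(w * 2 ** (bits - 1))
    return max(-2 ** (bits - 1), min(2 ** (bits - 1) - 1, code))

# 0.37 quantizes to code 6, i.e. 6/16 = 0.375 of the input current
print(mdac_synapse(1e-6, quantize(0.37)))
```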
2.2 Analog implementation:
Analog VLSI is a promising platform for the implementation of these networks. Large scale integration makes it possible to put many electronic components on a single chip with better reliability and lower cost. However, another inherent property of neural networks, high interconnectivity, has proven to be one of the major obstacles in the way of hardware implementation of large networks. The interconnections are a major factor limiting the size of networks on a chip. Analog technology has the advantage that the maximum information capacity of a single line is limited only by noise and other uncertainties. When the signals are represented digitally and transmitted in parallel, each activation or node value requires several wires for interconnection, whereas in the analog case one wire suffices.
The most difficult issue in the design of analog neural networks is the storage of synaptic weights. Although some studies have been made in this area, a reliable, compact analog memory in CMOS technology that can preserve data with acceptable accuracy for long periods of time is not available. In the following sections the solutions to analog memory proposed by different researchers will be discussed.
2.2.1 Floating gate (EEPROM) devices:
Floating gate memory has been used as a method of analog weight storage in neural networks. The main reason for its usage is that its storage time is measured in years, in contrast to the milliseconds of capacitive storage. A cross section of a simple floating gate device is shown in figure 2.1.
Figure 2.1: Floating gate structure for weight storage
The name floating gate memory arises because the polysilicon gate of the transistor is left unconnected, and electrons are deposited on and removed from the gate by various methods. The n-channel device is constructed on a p-substrate with heavily doped n+ diffusions for the source and drain. A special layer of thin oxide is inserted between the substrate and the floating gate. On top of the floating gate is a layer of normal oxide and then the control gate, which corresponds to the gate of a regular MOS transistor. Electrically this structure acts like two capacitors connected in series between
the control gate terminal and the substrate. If a large voltage is applied to the control gate relative to the substrate, a high electric field will be induced between the control gate and the floating gate and between the floating gate and the substrate. If the electric field is strong enough, electrons can tunnel through the thin oxide layer between the floating gate and the substrate and become trapped on the floating gate. This phenomenon is known as Fowler-Nordheim tunneling. A negative voltage can be applied to the control gate to cause tunneling in the reverse direction. The trapped charge causes a shift in the threshold voltage of the device, because in the normal mode of operation a higher voltage will need to be applied to the control gate to overcome the effect of the trapped charge on the floating gate.
The threshold voltage of this device can therefore be used to represent a modifiable weight value. Programming is performed with short voltage pulses for efficient control of the amount of charge which is transferred. There are other methods for trapping the charge, but Fowler-Nordheim tunneling appears to be the best method for this application. Due to the high quality of VLSI insulators, charges on the floating gate will remain intact for a long duration; therefore these devices can be categorized as non-volatile with non-destructive read. A special fabrication process, which is not universally available, is required to produce the thin oxide layer for these devices.
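The read-out mechanism, trapped charge shifting the threshold, which in turn sets the drain current, can be illustrated with an idealized square-law MOS model. The parameter values (k, the capacitance and the charge) are illustrative assumptions, not measurements from any cited device:

```python
def program_threshold(v_t0, q_trapped, c_ox=1e-12):
    """Electrons trapped on the floating gate raise the effective
    threshold by q/C (q_trapped > 0 means trapped electrons)."""
    return v_t0 + q_trapped / c_ox

def drain_current(v_gs, v_t, k=1e-4):
    """Idealized square-law MOS model in saturation: the stored weight
    is read out as a drain current set by the shifted threshold."""
    v_ov = v_gs - v_t
    return k * v_ov * v_ov if v_ov > 0 else 0.0

v_t = program_threshold(0.7, 0.3e-12)  # threshold shifts from 0.7 V to 1.0 V
print(drain_current(3.3, v_t))         # read current encodes the weight
```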
Floating gate devices have been proposed as non-volatile weight storage by many researchers [50-52] and have been applied in many neural network implementations [40-42,45]. The most noticeable implementation of floating gate synapses is the ETANN (Electrically Trainable Artificial Neural Network) chip from Intel [40]. Weight modification is performed using Fowler-Nordheim tunneling; a weight resolution of 7 bits was demonstrated, and weight retention was estimated at 15 years at a resolution of 4 bits. The Fowler-Nordheim process in a standard CMOS process was examined by Thomson and Brooke [76]. Due to the thicker oxides associated with this process, long term weight storage is estimated to be up to 25 years with 10 bits.
Floating gate transistors offer a simple and efficient scalar product circuit implementation technique for neural networks. The technique is particularly suitable for large networks fabricated with modern VLSI processes. The most appealing feature of using floating gate MOS transistors as analog multipliers in neural networks is that weight storage and input multiplication are implemented concurrently by the same circuitry. The floating gate transistors provide an adjustable, non-volatile analog memory and simultaneously behave as elementary analog processors when appropriately connected [40].
Although the tremendous potential of floating gate devices for large, dense synaptic structures has been amply demonstrated [40], several technical difficulties remain with this technology. In these design methods [50-52] an extra voltage is required for memory programming, which is slow and relatively imprecise. Cauwenberghs et al [45] have applied ultraviolet illumination of the chip to perform weight updates, so the rest of the circuitry has to be shielded from the ultraviolet light. The attractiveness of floating gate memory as non-volatile analog memory therefore suffers from the necessity of a special fabrication process, difficulty in making small weight changes [50], the requirement of successive voltage pulses and measurements for accurate voltage adjustment [51], and drift after every weight change.
2.2.2 Capacitive storage:
Synaptic and bias weights can also be stored as voltages on chip capacitors. The input gate of a MOS transistor is actually a capacitor whose bottom plate is the channel of the transistor. When charge is accumulated on the gate, charge carriers are drawn into the channel, causing a current to flow. In general, the more the charge, the larger the flow of current through the channel. Therefore, the current flowing through the transistor can be used to represent the weight, and the value of the current can be programmed by the amount of charge on the gate, which is proportional to the size of the voltage applied to the gate. The gate is isolated from the channel, so ideally the charge can remain on the gate until it is forced to change. In practice there are various leakage currents in the circuit, and therefore much larger capacitors are needed. CMOS capacitors are very efficient and decay times of minutes are achievable [77]; the memories are nevertheless volatile. However, with a background training process it might be possible to maintain weights close to the desired target values, especially if the chip is cooled to cryogenic temperatures [39] to extend the decay time.
Since the leakage currents cannot be eliminated completely, in most designs the voltages on the capacitors are serially and invisibly refreshed using an external digital memory for weight storage and a D/A converter [43-44]. The rate at which the voltages are refreshed has a direct effect on how precisely a weight can be represented. The faster the refresh rate, the smaller the voltage drop between refresh cycles, and the more accurate the weight. Other effects also affect the precision, such as clock feedthrough in the switch transistors which sample the weight voltage onto the capacitors, thus setting an upper limit on the refresh rate. If the capacitors are large enough and reasonable control over the temperature is maintained, the leakage current can be minimized so that analog refresh from digital memory using D/A techniques is possible.
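The trade-off between capacitor size, leakage current and refresh rate can be sketched numerically. This is an illustrative back-of-the-envelope model, not taken from the thesis; the capacitance, leakage current and weight voltage range below are assumed values.

```python
# Illustrative sketch (assumed numbers): worst-case droop on a weight
# capacitor between refresh cycles, and the longest refresh period that
# keeps the droop below one LSB of the stored weight.

def max_refresh_period(c_farads, i_leak_amps, v_range, n_bits):
    """Longest allowed refresh period so droop stays under 1 LSB."""
    lsb = v_range / (2 ** n_bits)          # smallest resolvable weight step
    return c_farads * lsb / i_leak_amps    # t = C * dV / I_leak

# Assumed values: 1 pF storage capacitor, 10 fA leakage,
# 2 V weight range, 8-bit weight resolution.
t = max_refresh_period(1e-12, 10e-15, 2.0, 8)
print(f"refresh at least every {t * 1e3:.2f} ms")  # -> 781.25 ms
```

Larger capacitors or coarser weight resolution relax the refresh rate linearly, which is why the designs above trade die area for retention.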
A proposal for weights stored on the gates of MOS transistors by Akers et al. [79] provided a gross multiplication of an input voltage and the weight value. Since the voltages would be stored on the relatively small capacitance of the MOS transistor gates, it would be necessary to update them every 100 µs. One method for utilizing capacitors for long term storage was proposed by Brown et al. [80]; it uses a refresh cycle similar to that used in digital RAM. A similar method of capacitive storage was implemented by Mann and Gilbert [78]. A single capacitor holds the weight value for each synapse and can be read and written by standard addressing means; the weight resolution is 6-8 bits. Furman and Abidi have described an analog CMOS error back propagation VLSI implementation [39]. The design uses dynamic charge stored on 0.7 pF capacitors as the weight representation. For their chip, cooling has been suggested to preserve the valid weights after completion of the training phase. The charges are updated in the learning phase.
Single capacitor weight storage with a MOS capacitor was implemented by Satyanarayana et al. [43]. The 1024-synapse chip allows all of the weight capacitors to be refreshed in approximately 130 µs from off-chip SRAM using off-chip D/A converters. A weight resolution of t bits was mentioned, with the implication that clock feedthrough on the sampling switches was the limiting factor. Boser et al. [7,49] have also described a chip which uses a single capacitor to store weights. On-chip 6-bit D/A converters are used to refresh the weights from off-chip RAM; the inputs and the neuron outputs are represented as 3-bit quantities. A learning network utilizing single capacitor weight storage was developed by Arima et al. [54]. The main network is recurrent and implements a new learning algorithm. A second feed-forward network on the same chip provides a measure of how well a pattern has been learned, which controls the learning circuitry.
In circuits with differential inputs the weights can be stored differentially; if the circuitry is symmetrical the leakage currents will be very similar. Although the gate voltages will drop at the same rate, the differential voltage will be much more stable. This allows lower refresh rates. The drawback of this differential storage technique is that the capacitors consume a relatively large area.
In the implementation of the JPL group [81] with dual capacitors as weight values, a precision of 11 bits was reported. In this design the capacitor charges are refreshed using external digital memory. Each synapse cell consists of a pair of sample and hold circuits to store the differential weight value and a four quadrant multiplier to obtain the product of input and weight. Kub's group at the Naval Research Laboratories implemented two different chips utilizing capacitors to store a differential weight value [82]. The actual weight values are digitally stored off chip and the on-chip voltages are frequently refreshed by multiplexing the D/A converted values to the appropriate cell. Results showed that the weight retention was 50 times longer for differential storage in comparison to single ended storage. The researchers at AT&T Bell Laboratories have described a similar weight storage method [71,77]. They used a novel method of modifying the weight values: a weight is initialized to zero by charging both capacitors to the same value, and CCDs (charge coupled devices) are utilized to pump the charge bidirectionally between the capacitors to increment or decrement the weights. The resolution of the weight was empirically determined to be approximately 10 bits and the maximum weight update rate was estimated to be 2 X 10' updates per second.
The capacitor as analog memory has the following advantages: small size, a device structure fully compatible with the CMOS process, and information stored as charge that can be controlled easily. However, its biggest drawback is leakage of charge. Even so, it is a valuable memory device. It is also used in neural networks as short term memory along with other non-volatile memories.
2.3 Hybrid implementation:
All the methods described in the last section present problems in the design of a practical neural network. A realistic design should work at normal temperature and preserve the connection strengths during the operation of the system. This requires a certain degree of efficiency in the utilization of chip area. A design using dynamic charge storage uses a large chip area for the capacitive elements to ensure reliable preservation of the weights at different temperatures. Multi-level storage on capacitors involves a lot of overhead circuitry to keep the capacitors locked at certain voltages. Floating gate memories show large variations in transistor characteristics and need large voltages for programming.
The final approach to storing the weights is digital weights in digital memory. In this method an MDAC (multiplying digital to analog converter) performs the synaptic multiplications. The hybrid multiplier produces an analog output through multiplication of a digital weight by the analog input [85]. This also allows a faster interface to a host computer, and switching noise is only generated while the weights are updated. The hybrid MDACs are an attractive solution with the virtues of ease of design and simulation, but are limited to a few bits of dynamic range. This can have an adverse effect on the convergence of the learning algorithm [1].
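The behaviour of such a hybrid synapse can be modelled in a few lines. This is an illustrative sketch, not the circuit of [85]; the bit width, unit current and two's-complement weight coding are assumptions.

```python
# Sketch of a hybrid MDAC synapse (illustrative): a signed digital
# weight code scales an analog input current. With n weight bits the
# dynamic range is limited, which quantizes the synaptic values.

def mdac(weight_code, i_in, n_bits=5, i_unit=1e-6):
    """Output current = digital weight (two's-complement code) x analog input.

    weight_code: integer in [-2**(n_bits-1), 2**(n_bits-1) - 1]
    i_in:        analog input, here normalized to [0, 1]
    i_unit:      full-scale unit current (assumed 1 uA)
    """
    assert -(2 ** (n_bits - 1)) <= weight_code < 2 ** (n_bits - 1)
    return weight_code * i_in * i_unit / 2 ** (n_bits - 1)

# A 5-bit weight gives only 32 levels; e.g. code +12 with input 0.5:
print(mdac(12, 0.5))   # 12 * 0.5 * 1e-6 / 16 = 3.75e-07
```

The coarse weight grid (here 32 levels) is the "few bits of dynamic range" that can hinder the convergence of gradient-based training.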
The researchers at AT&T designed a configurable chip [83] with binary valued weights and inputs to perform image processing tasks such as edge detection. They used SRAM cells to store the data. The stored data is represented as an interconnection matrix, and only one output, the one with the largest inner product between the input vector and the stored vector, is chosen. The design does not generate spurious steady states since it physically stores the desired data as weights. However, it requires extra memory cells to store the desired data.
The JPL group implemented a chip with a 32 X 32 MDAC synaptic crossbar matrix [84]. The inputs are single ended, giving two quadrant multiplication. An NMOS transistor operating in the triode region provides voltage to current conversion. The synaptic precision is limited by the fabrication technology to 7 bits, so that there are only 128 monotonically increasing weight levels.
Hybrid architectures have been proposed by Djahanshahi, Nosratinia and Yazdi [85,1,63]. All of these use digital memories to store the synaptic and bias weights. The synapse is constructed as an MDAC, multiplying the input by the digital weight from memory. The outputs of the MDACs are added to generate the nonlinear sigmoid function. These architectures are discussed in the next chapter.
Chapter 3
An Architecture for Multi-Layer Neural Networks
Among the numerous network models [59-60] the multi-layer neural network is a major and widely applicable model. It is a feed-forward network with one or more layers of node processors, called hidden layers, between the input and output layers. All the node processors have nonlinear characteristics. The resultant multistage nonlinear characteristic of the network provides a great potential for input/output mapping in classification applications. The lack of feedback in multi-layer neural networks makes them inherently stable. This good dynamic behavior, as well as the availability of powerful training schemes such as error back propagation and genetic algorithms, makes them even more attractive from an engineering point of view.
All the methods used by the different researchers mentioned in chapter 2 present difficulties for practical applications of neural networks. A realistic design should work at normal temperature and preserve the connection strengths during the operation of the system. This requires a certain degree of efficiency in the utilization of chip area. A design using dynamic charge storage uses a large chip area for the capacitive elements to ensure reliable preservation of the weights at different temperatures. Multi-level storage on capacitors involves a sizeable overhead in circuitry to keep the capacitors locked at certain voltages. Floating gate memories show large variations in transistor characteristics and need large voltages for programming.
Digital weight storage has been addressed by Raffel [61] and others [1,63,85]. This method is chosen for the architecture presented in this thesis. The most general form of multi-layer neural network, without feedback, is implemented (Fig 1.3). Most of the functions are realized in analog current mode circuitry because of its wider bandwidth, independence from voltage supply restrictions and lack of any requirement for adder hardware.
3.1 A VLSI architecture:
A hybrid architecture for neural networks implemented by Djahanshahi is shown in figure 3.1 [85]. Each module consists of a 5-bit MDAC with a register to store the synaptic weight, and a unified synapse-neuron, shown in figure 3.2, which generates a partial nonlinearity. All of these partial nonlinearities, when connected in parallel, generate a scalable sigmoid function.
A sigmoidal nonlinearity sized for one or two synapses looks like a hard limiting function for a moderate or large number of input synapses (e.g. N > 5) because of its large saturating regions. Therefore a scaling scheme proportional to √N is desirable.
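The √N scaling can be motivated with a small numerical experiment: the sum of N independent ±1 synaptic contributions has a standard deviation of about √N, so an unscaled sigmoid is driven deep into saturation as N grows. This is an illustrative sketch with assumed random inputs, not a simulation of the actual circuit.

```python
# Why gain scaling ~ sqrt(N) helps (illustrative): the pre-activation of
# a neuron summing N random +/-1 synaptic contributions grows like
# sqrt(N), so a fixed-gain sigmoid saturates for large N. Dividing by
# sqrt(N) keeps the operating point near the linear region.
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

for n in (2, 5, 50):
    s = sum(random.choice((-1.0, 1.0)) for _ in range(n))
    # unscaled output vs sqrt(N)-scaled output
    print(n, round(sigmoid(s), 3), round(sigmoid(s / math.sqrt(n)), 3))
```

For large N the unscaled column sits near 0 or 1 (hard limiting), while the scaled column stays in the sigmoid's useful range.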
Figure 3.1: A hybrid architecture by Djahanshahi
Figure 3.2: A unified synapse-neuron with S-shaped load element
3.2 The multiplexed architecture:
The general form of the architecture [1] is shown in figure 3.3. The structure of each layer is similar, as shown in figure 3.4. Each stage is defined as a layer of neurons and their corresponding connection strengths. Each set of connection strengths is associated with the neurons of the next layer rather than the preceding layer. Thus each stage comprises the neurons present in that layer plus all the connection strengths from the previous layer. Therefore the number of physical connections between two stages i and i+1 is reduced to the number of neurons in stage i. This is apparent once the weights are fixed after training: the information passed to a layer does not exceed the information present at the output of the preceding layer, and the minimum number of lines needed to carry this information is no more than mi.
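The wiring saving can be counted directly. The helper below is an illustrative sketch; the layer sizes are assumed values, and mi denotes the number of neurons in layer i as above.

```python
# Sketch of the wiring argument: with m_i neurons per layer, fully wired
# connections between consecutive layers need m_i * m_{i+1} physical
# lines, while grouping each weight set with the *following* stage
# reduces the inter-stage wiring to m_i lines per stage boundary.

def wires(neurons_per_layer):
    full = sum(a * b for a, b in zip(neurons_per_layer, neurons_per_layer[1:]))
    staged = sum(neurons_per_layer[:-1])   # m_i lines into stage i+1
    return full, staged

print(wires([8, 8, 8, 1]))   # -> (136, 24)
```

The fully wired count grows quadratically with the layer width, the staged count only linearly, which is the basis of the O(N²) versus O(N) argument made in section 3.5.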
In this architecture the inter-stage wiring has been dramatically reduced, and much larger systems can be designed using a number of chips, each of which constitutes a one-stage network. This can also lead to a semi-custom approach to the design of neural systems, where only one generic chip containing one stage needs to be designed. This chip should then contain the maximum number of neurons of any stage. If the number of neurons in a stage is greater than needed, the excess neurons can be masked out by assigning zeros to the incoming weights.
Figure 3.3: Layers and stages in the architecture
Figure 3.4: Internal structure of a stage
The idea of multiplexing the connection strengths is depicted in figure 3.4. It is possible because the analog bandwidth can be made much smaller than the clocking frequencies available in digital CMOS. Since feed-forward networks are free of feedback, the delays introduced by multiplexing do not disrupt the network dynamics and only increase the latency. The constraint for this architecture is that the analog nodes should be refreshed in time so that the outputs of the neurons are kept valid at all times. The minimum clock speed depends on the leakage of the nodes, itself dependent upon temperature, and is in the kHz range. The weights are presently stored in static ROM and can be changed to RAM.
3.3 Operation:
Let mi be the number of neurons in the ith stage. The synaptic weights corresponding to the ith layer and its preceding layer are saved in an (mi-1 + 1) x mi two dimensional digital weight memory array. Each memory block forms the input of one of the stage multipliers. The threshold of a neuron can be considered as a negative signal coming from a source with a strength of unity and a weight equal to the threshold value. One row of the storage is allocated to the threshold value, and to avoid mismatching, the current source shown in figure 3.4 is actually another neuron with its signal and bias connected to VDD and ground respectively. The multipliers are basically current mode multiplying digital to analog converters. The output current of each multiplier is the product of its digital weight and the analog input current. The adder is actually a common output node of the multipliers together with a current to voltage converter.
A counter with a cycle of mi is the main stage address/control generator. At the kth (0 < k <= mi) time slot the decoded output of the counter addresses the kth location of each memory block. The multipliers and the adder perform the synaptic/threshold computations of the kth neuron activation. Meanwhile the stage demultiplexer connects the adder output to the branch going to the kth neuron input. A capacitor at the input is applied as a short term memory. This storage preserves a valid neuron input for the period during which the activations of all the stage neurons are being computed.
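The per-time-slot operation of one stage described above can be captured in a short behavioural model. This is an illustrative sketch with an ideal multiplier, adder and sigmoid neuron; the weights, biases and inputs below are made-up values, and the list standing in for the holding capacitors abstracts the analog short-term memory.

```python
# Behavioural sketch of one stage (section 3.3): at time slot k the
# counter addresses row k of every weight memory, the multipliers and
# adder compute neuron k's activation, and the demultiplexer routes it
# to neuron k's holding capacitor.
import math

def stage_forward(weights, biases, inputs):
    """weights: m_i x m_{i-1} nested lists; biases: length m_i."""
    capacitors = [0.0] * len(weights)          # short-term analog memory
    for k in range(len(weights)):              # one time slot per neuron
        acc = sum(w * x for w, x in zip(weights[k], inputs)) - biases[k]
        capacitors[k] = 1.0 / (1.0 + math.exp(-acc))   # sigmoid neuron
    return capacitors

print(stage_forward([[2.0, 2.0], [-2.0, -2.0]], [1.0, -3.0], [1.0, 0.0]))
```

In hardware the loop body is a single shared multiplier/adder path; the model simply serializes the neurons the same way the counter does.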
Figure 3.5: The adder (trans-impedance amplifier driving the demultiplexer)
An inhibit signal and a switching circuit are also added before the neurons. The signal is simply a delayed, inverted clock. Because of the finite settling time of the multipliers, and specifically of the trans-impedance amplifier inside the adder block shown in figure 3.5, the signals at the output of the demultiplexer are not stable at the time the address becomes available. If this inhibition circuitry were not provided, the capacitors at the input nodes of the neurons would receive erroneous charging voltages at the beginning of each refresh cycle. The corresponding voltage spikes in the signals effectively increase the settling time.
3.4 Input/Output requirements:
In the architecture discussed above the input and output signals are in current mode. For I/O communication it is better to use voltage mode, for two reasons. First, the current levels are in the microampere range and so would be severely affected by the noise that can be coupled to the relatively long connection wires to the chip package. Second, current mode circuits have high output impedance; the high impedance nodes, especially those outside the chip, are susceptible to voltage noise spikes of large amplitude, which will drive the current mode circuits into saturation if the node voltage goes beyond the power supply voltage.
Therefore, to obtain voltage mode input/output, a voltage to current converter at the input and a trans-impedance amplifier at the output may be used, as shown in figure 3.6. The combined transfer characteristic of the voltage to current converter and the trans-impedance amplifier should have a slope of unity. The magnitude of the current mode outputs should also be taken into account, so that none of them is driven into the nonlinear zone in the normal course of operation.
For the output, another variation is possible which can save chip area. Voltage mode neuron blocks, as shown in figure 3.7, can replace the current mode neurons and the trans-impedance amplifiers at the last stage. This has an additional advantage: feedback is used in the trans-impedance amplifiers to achieve a linear, stabilized I-V characteristic, but this feedback also degrades the network dynamics. Most of the propagation delay of the network is due to the settling time of the trans-impedance amplifiers, so by eliminating them the total network delay can be reduced.
The disadvantage of using voltage mode neurons in the network is that the transfer characteristic of the voltage mode neuron is slightly different from that of the current mode neurons and is almost linear.
This small amount of error may not always be acceptable, so the choice of solution will depend on the particular application.
Figure 3.6: Conversion of the signal mode at input and output
Figure 3.7: Second solution to the I/O mode
3.5 Multiplexing:
One of the limiting factors on the size of the realized networks is the large number of interconnections between two consecutive stages, i.e. if the number of neurons is N, O(N²) connections are required. The interconnection complexity necessitates squeezing the design into one chip. Even then, the massive interconnections occupy a large share of the die area. This can be solved by time multiplexing the interconnections [56-58]. However, this scheme reduces the forward pass speed of the network.
The total input/output propagation delay of a multi-layer neural network without any concurrency in the operation of its stages is linearly dependent on the delay of each stage as well as the number of stages. Let the input/output propagation delay of the ith stage of the network be Ti. The total input/output propagation delay of the massively interconnected network with M layers is [63]:

T = T1 + T2 + ... + TM

The total propagation delay for the same network with an Ni-base time multiplexing scheme in the ith stage is:

T = N1 T1 + N2 T2 + ... + NM TM

The latter relationship shows that the number of layers should also be chosen carefully to achieve an acceptable speed in any time multiplexing implementation.
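The two delay expressions above can be illustrated numerically; the per-stage delays and multiplexing factors below are assumed values, not measured ones.

```python
# Numerical illustration of the two delay formulas: without multiplexing
# the total latency is the sum of the stage delays T_i; with N_i-way
# time multiplexing in stage i it grows to the sum of N_i * T_i.

def total_delay(stage_delays, mux_factors=None):
    if mux_factors is None:
        mux_factors = [1] * len(stage_delays)
    return sum(n * t for n, t in zip(mux_factors, stage_delays))

stage_delays = [1e-6, 1e-6, 1e-6]        # T_i, assumed 1 us per stage
print(total_delay(stage_delays))                  # fully wired total
print(total_delay(stage_delays, [8, 8, 4]))       # multiplexed total
```

With these assumed numbers the multiplexed latency is 20 µs against 3 µs fully wired, showing why deep networks need the pipelining of section 3.6.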
3.6 Pipelined architecture:
Implementation of multi-layer neural networks with a large number of layers may not be possible using a time multiplexed approach without some scheme of speed improvement, since the speed trade-off of the multiplexing scheme could bring the performance below acceptable margins. Speed improvements may be achieved either by architectural measures or by faster building blocks.
Yazdi proposed a pipelined architecture for speed improvement, shown in figure 3.8 [63]. The operation is similar to the architecture of Nosratinia [1]. Each stage operates on its input set, passes the valid output to the input latch of its next stage and stops operating. The latched values are only altered when the output of the preceding stage is complete. The only required control signal is the stop signal Si. After all the outputs of the ith layer are computed, the stop signal is activated. This stop signal halts the operation of the ith layer, signals the following latch to latch the output of the ith layer, and signals the next layer to resume operation on the newly latched inputs.
Generation of the stop signal Si is quite simple: in an N-base time multiplexed stage there is an N-base counter and decoder. The output of the counter at the Nth time slot can be used to trigger a monostable which generates Si. The modified structure of each stage is shown in figure 3.9. In this structure the neurons of the stage have been merged into a single neuron. This neuron has a current input and a sigmoid voltage output, thus eliminating the trans-impedance amplifier. The demultiplexed output is passed to the appropriate analog latches.
Figure 3.8: Pipelined architecture
Figure 3.9: Modified structure of each stage
3.7 Performance analysis on an example:
The XOR problem has been selected for the VLSI implementation of the architecture of Aria Nosratinia [1] due to its disconnectedness. It cannot be solved by a simple perceptron, because the argument space is not linearly separable. A perceptron with one hidden layer can solve all problems where the argument space is divided into two convex open or closed regions of arbitrary shape [60].
The state of the network with a hidden layer, shown in figure 3.10, is governed by the following equations:
s_j = sgn( Σ_k W_jk x_k − θ_j )        (3.2)
s_i = sgn( Σ_j W_ij s_j − θ_i )        (3.3)
where W_jk and W_ij are the synaptic weights between the input and hidden layers and between the hidden and output layers, θ_j and θ_i are the threshold values for the hidden and output layer neurons, s_j and s_i are the outputs of the hidden and output layers, and x_k is the input.
The hidden neuron s1 plays the role of the logical AND function while s2 emulates the logical OR element. The combination of these two elements allows the generation of the boolean function XOR.
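The AND/OR decomposition can be checked directly. The following sketch uses a standard textbook choice of weights and thresholds, not the MATLAB-trained values reported in section 3.7.1.

```python
# Check of the XOR construction described above: an OR hidden neuron and
# an AND hidden neuron, combined at the output as OR AND-NOT -> XOR.
# Weights/thresholds are a common textbook choice (assumed, not trained).

def sgn(x):
    return 1 if x >= 0 else 0

def xor_net(x1, x2):
    h_or  = sgn(1 * x1 + 1 * x2 - 0.5)      # OR of the inputs
    h_and = sgn(1 * x1 + 1 * x2 - 1.5)      # AND of the inputs
    return sgn(1 * h_or - 1 * h_and - 0.5)  # OR minus AND -> XOR

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_net(a, b))   # 0 0 0 / 0 1 1 / 1 0 1 / 1 1 0
```

The single hidden layer creates the non-convex decision region that a bare perceptron cannot represent.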
Figure 3.10: XOR network with hidden layer
3.7.1 Software implementation of XOR:
The XOR function was implemented using a feedforward neural network. The network consists of one hidden layer in addition to the input and output layers. The six connection strengths and three threshold values were computed in the training process by MATLAB using the error back propagation algorithm with the following test vectors:
Table 3.1: Test vectors
The result of training is shown in figure 3.11. The simulation results of the network are given in table 3.2.
Table 3.2: Simulation results
Figure 3.11: Implemented network for the XOR function
3.7.2 Quantization:
The network shown in figure 3.11 has been trained by the error back propagation algorithm, which generates continuous valued connection strengths. In the architecture of Nosratinia [1] the weights are represented in a fixed precision format. This implies that the weights obtained from the training process have to be quantized and limited in range before they can be incorporated into the neural network architecture.
For the network and its connection strengths shown in figure 3.11, the dynamic range of the numbers is limited and the weights are concentrated around ±8. The maximum range is chosen as ±8, so no truncation will be necessary. The resolution is

Q = (Wmax − Wmin) / 2^n

where Q is the resolution, Wmax and Wmin are the maximum and minimum values of the synaptic weights and n is the number of bits. With eight bits of accuracy the resolution will be 0.0625.
Figure 3.12 shows the network with the quantized weights and threshold values. These are the values actually stored in the ROM; system level simulations show almost no change in the behavior of the circuits, and the outputs are the same as the original up to two least significant digits, as shown in table 3.3.
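The quantization step can be sketched as follows; the clipping convention at the range edge is an assumption, while the range and bit width are those given above.

```python
# Weight quantization per the resolution formula above: with range
# [-8, +8] and n = 8 bits, Q = (Wmax - Wmin) / 2**n = 16 / 256 = 0.0625.

def quantize(w, w_min=-8.0, w_max=8.0, n_bits=8):
    q = (w_max - w_min) / 2 ** n_bits           # resolution Q
    w = min(max(w, w_min), w_max - q)           # clip to representable range
    return round(w / q) * q                     # snap to nearest level

print(quantize(3.1416))   # -> 3.125
print(quantize(-8.9))     # clipped to -8.0
```

Because the trained weights already lie within ±8, clipping never triggers for this network and only the 0.0625 rounding step applies, which is why the simulated outputs barely change.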
Figure 3.12: Network with quantized values
Table 3.3: Simulation results with quantized values
3.7.3 VLSI Implementation of XOR:
The XOR function has been implemented using the architecture of Nosratinia [1]. The circuit consists of 2 layers: 2 input neurons, 2 hidden neurons and one output neuron. Two stages are integrated in one chip to realize the network, as shown in figure 3.13. The nine synaptic weights of the network are saved in ROMs. The architecture and operation of each stage have been explained in sections 3.3 and 3.4. Since there are only two memory addresses in each ROM section, the clock signal and its complement were used for selection and addressing of the weights; thus address decoders have been avoided. To obtain a voltage mode output, voltage mode neurons have been used in the second stage. At the input, instead of voltage to current converters, two current mode neurons have been used. This is permissible because the test vectors are digital and the behavior of this block is unimportant between 0 and 1.
Waveforms of the signals are given in figure 3.14. The output transition happens 2 clock cycles after each input transition. The data is presented to the circuits at a frequency of 2 MHz. The effect of the inhibit circuitry is very evident, as the input to the neurons is smooth. The waveforms presented in this section are the result of full scale SPECTRE simulations. The simulation results of the individual blocks will be presented in the next chapter.
The neural chip is implemented in a 0.5 micron triple metal single poly CMOS process provided by Hewlett-Packard through the Canadian Microelectronics Corporation, using Cadence DF-II design tools. It occupies 1500 X 1500 micrometers of die area including the bonding pads. The mask layout of the chip and the cells is illustrated in appendix A.
Figure 3.13: System level VLSI layout for XOR
Figure 3.14: System level simulations (continued): clock, input A, input B and inhibit input waveforms
Figure 3.14: System level simulations (continued): outputs of the stage 1 adder and demultiplexer A
Figure 3.14: System level simulations (continued): outputs of demultiplexer B, T-gates A and B, and stage 1 neuron A
Figure 3.14: System level simulations: outputs of stage 1 neuron B, the stage 2 adder, and the network output
3.7.4 Testing of neural chip:
This was a simple test chip which included two stages. The tests required an HP workstation with VeeTest software and CMC's TH 1000 testhead. To perform the tests, a program was written using HP's VeeTest software to control the testhead. This program controlled the testhead's current source and the timing of the measurements. The measurements were graphed and saved on a Tektronix oscilloscope. The data was transferred from the oscilloscope to a PC and plotted with MATLAB. Figure 3.15 shows the input/output waveforms of the chip at a clock speed of 2 MHz.
Figure 3.15: Fabrication results of the neural chip
3.8 Summary:
Analog CMOS has been used as the basis of the architecture presented in this chapter. The main aims of this design are permanent weight storage and a reduction in physical interconnections. The multiplexing scheme in this architecture allows a reduction of the number of synaptic multipliers and physical interconnections. This is justified by the fact that biological neurons are generally much slower than CMOS circuitry and even then achieve performance unparalleled by a digital computer.
The XOR problem has been used as an example to test the validity of the concept. The boolean function XOR has been implemented in software using the error back propagation algorithm, and the trained network is implemented in hardware using the architecture. Since the XOR function is universal, the architecture need not be trainable; ROM has been used to store the digital weights (quantized analog weights).
Chapter 4
VLSI Circuitry
The main building blocks of the multi-layer feed-forward neural network for the realization of the boolean function XOR are described in this chapter. The HSPICE level 3 and level 13 models have been used for the simulations and are shown in Appendix B.
4.1 Neuron:
In neural network models, the neuron has a nonlinear transfer characteristic in the form of a sigmoidal function, with different saturation levels for low and high inputs. It is continuously differentiable, which makes it a good approximation to the activation of biological neural cells. Feed-forward multi-layer neural networks are trained by steepest descent algorithms such as error back propagation; usually a sigmoidal characteristic is considered for the neuron:

s(x) = 1 / (1 + e^(-x))
The sigmoidal function has the advantage that its derivative can be expressed in terms of itself and its shifted version. The derivative is used in the weight update.
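This property can be verified numerically. The sketch below assumes the standard logistic sigmoid s(x) = 1/(1+e^(−x)) and checks the identity s'(x) = s(x)(1 − s(x)) against a central-difference derivative.

```python
# The property used in back propagation: for s(x) = 1/(1+exp(-x)) the
# derivative is s'(x) = s(x) * (1 - s(x)), so it can be computed from
# the already-available neuron output without re-evaluating exp().
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_prime(x):
    s = sigmoid(x)
    return s * (1.0 - s)

# Cross-check against a numerical derivative at x = 0.7:
h = 1e-6
num = (sigmoid(0.7 + h) - sigmoid(0.7 - h)) / (2 * h)
print(abs(sigmoid_prime(0.7) - num) < 1e-9)   # True
```

This is what makes the sigmoid cheap in a back propagation loop: the derivative comes for free from the forward-pass output.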
Figure 4.1: Sigmoid function and its derivative
The MOS transistor differential pair shown in figure 4.2 has been used to approximate these characteristics. The large signal behavior of the differential pair is given by [72]:
M1 and M2 are assumed to be in saturation. The solution for I1 and I2 can be obtained by substituting (4.4) in (4.3). The four regions of operation are:
Region 1:
The transfer characteristic of the differential pair when Iss = 1 and β = 1 is given in figure 4.2. The shape of this curve is a good approximation to that of figure 4.1.
Figure 4.2: Differential transistor pair and its transfer characteristics
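As a cross-check on the shape of figure 4.2, the standard square-law large-signal result for a saturated MOS differential pair can be evaluated numerically. This is the textbook expression (with β the device transconductance parameter), not a reproduction of equations (4.3)-(4.4), which are omitted in this transcript.

```python
# Sketch of the square-law large-signal model behind figure 4.2.
# For |vid| <= sqrt(2*Iss/beta) both devices conduct; beyond that the
# pair is fully switched and the output clips at +/- Iss, giving the
# sigmoid-like characteristic used to approximate figure 4.1.
import math

def diff_pair_delta_i(vid, i_ss=1.0, beta=1.0):
    """Differential output current I1 - I2 of a saturated MOS pair."""
    v_lim = math.sqrt(2.0 * i_ss / beta)
    if vid >= v_lim:
        return i_ss                     # M1 carries all of Iss
    if vid <= -v_lim:
        return -i_ss                    # M2 carries all of Iss
    return 0.5 * beta * vid * math.sqrt(4.0 * i_ss / beta - vid ** 2)

for v in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(v, round(diff_pair_delta_i(v), 4))
```

The curve is linear near zero (slope equal to the pair's transconductance) and saturates smoothly at ±Iss, which is exactly the behavior a neuron nonlinearity needs.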
4.1.1 Current mode neuron:
The current mode neuron, shown in figure 4.3, is a neuron with a fixed threshold voltage [63].
The threshold value can also be expressed as a bias weight which, when multiplied by negative
unity, yields the threshold value. The effect of the threshold is therefore included in the
post-synaptic neuron activation.
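The bias-weight view of the threshold can be sketched as follows (plain Python; the weight and input values are arbitrary illustrative numbers). Folding the threshold into the weight vector with a constant -1 input yields the same pre-activation, which is why the synapse array can treat the threshold like any other stored weight:

```python
def neuron_preactivation(inputs, weights, threshold):
    # Explicit threshold: activation = sum(w_i * x_i) - threshold.
    return sum(w * x for w, x in zip(weights, inputs)) - threshold

def neuron_preactivation_bias(inputs, weights, threshold):
    # Threshold folded in as a bias weight multiplied by a constant -1 input,
    # as described above: the synapse array then handles it like any weight.
    x = list(inputs) + [-1.0]
    w = list(weights) + [threshold]
    return sum(wi * xi for wi, xi in zip(w, x))

# Both formulations give the same pre-activation.
assert abs(neuron_preactivation([1.0, 0.5], [0.3, 0.8], 0.2)
           - neuron_preactivation_bias([1.0, 0.5], [0.3, 0.8], 0.2)) < 1e-12
```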
The circuit core consists of differential pair transistors M5 and M6. M7 is an ideal current source
and the bias current is reflected to it. The gate of M6 is the neuron input and the gate of M5 is
connected to the drains of M1 and M2. The W/L ratios of M1 and M2 are adjusted such that the
drains of these transistors are at 0 V and provide the bias current through M7 to the differential
pair. The current output of the neuron from M6 is reflected through an output driver consisting
of current mirrors M4-M8 and M9-M10. M3 is used to make the circuit symmetric. The capacitor
connected at the input node serves to reduce the voltage droop between two refresh cycles,
i.e. it acts as a short-term memory.
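The role of the input capacitor as short-term memory can be quantified with the droop relation dV = I_leak * dt / C; the 0.5 pF value is from table 4.1, while the leakage current and refresh period below are made-up example numbers:

```python
def droop_between_refreshes(i_leak_a, t_refresh_s, c_farad):
    # Voltage lost on the storage capacitor between two refresh cycles:
    # dV = I_leak * dt / C.  The leakage value used below is hypothetical.
    return i_leak_a * t_refresh_s / c_farad

# 1 pA of leakage on the 0.5 pF capacitor with a 1 ms refresh period:
dv = droop_between_refreshes(1e-12, 1e-3, 0.5e-12)
assert abs(dv - 2e-3) < 1e-9          # about 2 mV of droop per cycle
```

Keeping this droop small relative to the analog signal swing sets the trade-off between the capacitor size and the refresh rate.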
Transistor W/L (µm) / Capacitor size (pF):
M5-M6: 1/2.6    Capacitor: 0.5
Table 4.1: Device sizes of current mode neuron
Figure 4.3: Current mode neuron
Figure 4.4: Transfer characteristic of current mode neuron
The circuit is implemented in an N-well process and is thus prone to body effect; therefore
Monte Carlo analyses were performed. In these tests a variation of 10% (uniformly distributed)
around the nominal value is allowed in the threshold voltage VT of the transistors in the
design [74]. Figure 4.5 shows the result of the analysis over thirty iterations. The output was
found to be within 3% of the desired value, i.e. 22 microamperes.
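The Monte Carlo setup can be mimicked in a few lines (a sketch of the sampling step only; the circuit response itself comes from the HSPICE simulator, not from this code). The nominal VT0 of 0.6566 V is taken from the NMOS model card in Appendix B:

```python
import random

def monte_carlo_vt(vt_nominal, spread, iterations, seed=1):
    # Sample a uniformly distributed +/- spread threshold-voltage variation
    # around the nominal value, one sample per Monte Carlo iteration.
    rng = random.Random(seed)
    return [vt_nominal * (1.0 + rng.uniform(-spread, spread))
            for _ in range(iterations)]

# Thirty iterations with a 10% uniform spread, as in the analysis above.
samples = monte_carlo_vt(0.6566, 0.10, 30)
assert len(samples) == 30
assert all(0.9 * 0.6566 <= v <= 1.1 * 0.6566 for v in samples)
```

Each sampled VT would be written into the transistor models for one simulation run, and the spread of the 30 output currents is then compared against the nominal 22 µA.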
Figure 4.5: Results of Monte Carlo analysis for current mode neuron

The circuit is implemented in 0.5 micron CMOS single-poly triple-metal technology provided by
Hewlett-Packard. It has an area of 57.4 × 23.5 µm and a peak power dissipation of 0.6 milliwatts.
The device sizes are shown in table 4.1. The mask layout of the cell is presented in Appendix A.
4.1.2 Voltage Mode Neuron:
The voltage mode neuron is designed by adding a current-to-voltage converter to the current
mode neuron [1]. The constant current from the drain of M6 is subtracted by M9 from the input
node of M10 and M11, which form the current-to-voltage converter. The schematic and transfer
characteristics are shown in figures 4.6 and 4.7. The device sizes are shown in table 4.2. It has
an area of 61.1 × 25.6 µm and a peak power dissipation of 1.1 milliwatts.
The voltage mode neuron has positive and negative saturation levels. Since this neuron is only
used at the output stage, the saturation levels will not affect the network operation
adversely [56].
Transistor W/L (µm) / Capacitor size (pF):
M5-M6: 1/2.6    M7: 1/4    M11: 1/5.4    Capacitor: 0.5
Table 4.2: Device sizes of voltage mode neuron

The results of Monte Carlo analysis for the voltage mode neuron are shown in figure 4.8; the
results are within 1% of the desired value, i.e. 4.5 volts.

The cell is implemented in 0.5 micron CMOS single-poly triple-metal technology provided by
Hewlett-Packard. The mask layout of the cell is presented in Appendix A.
Figure 4.6: Voltage mode neuron
Figure 4.7: Transfer characteristics of voltage mode neuron
Figure 4.8: Monte Carlo analysis for voltage mode neuron
4.2 Synapse:
The synapse is a connection between two neurons; it is realized by a multiplier and a
connection strength. In this architecture the synapse is realized by a current mode digital to
analog multiplier and converter, and the connection strengths are stored digitally in ROM.
4.2.1 Multiplier:
The multiplier shown in figure 4.9 is a current mode digital to analog converter and multiplier.
The multiplying elements are the digital connection weights and the analog input; the product
is the analog current output. The traditional method of building a converter in a linear bipolar
process is to generate binary-weighted current mirrors [87]. The same technique can be employed
with MOS transistors. The difficulty experienced with this method results from the variation of
currents due to drain-source voltage mismatch; this can be considerably reduced by using
cascode or Wilson current mirrors [72].
The circuit consists of a series of cascode current mirrors, each of which divides the input
current by half. The current outputs of these mirrors are summed through NMOS transistors
acting as switches controlled by the digital input. The choice of cascode current mirrors is
based on their high output resistance and graceful degradation of the linear transfer
characteristic [1]. The cell has an area of 101.7 × 46.7 µm and a peak power dissipation of
6 milliwatts.
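A behavioral sketch of this multiplying DAC (Python; an idealized model of the successive current halving and digital bit switching, not a circuit-level simulation — the MSB-first bit order is my assumption):

```python
def mdac_output(i_in, bits):
    """Current-mode multiplying DAC sketch: each cascode mirror stage halves
    the running current, and a digital weight bit switches that stage's
    contribution onto the summing node.  bits[0] is taken as the MSB."""
    i_out, i_stage = 0.0, i_in
    for b in bits:
        i_stage /= 2.0          # this stage's mirror divides the current by two
        if b:                   # NMOS switch controlled by the digital weight bit
            i_out += i_stage
    return i_out

# With all 8 digital inputs set to 1 the output approaches full scale,
# matching the transfer characteristic of figure 4.10:
full = mdac_output(1.0, [1] * 8)
assert abs(full - 255.0 / 256.0) < 1e-12
# The MSB alone contributes half the input current:
assert mdac_output(1.0, [1] + [0] * 7) == 0.5
```

The output is thus the analog input current scaled by the digital weight, which is exactly the synaptic multiplication the architecture needs.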
A fixed device size has been used for all the NMOS and PMOS transistors in all the stages.
An 8-bit digital to analog converter and multiplier is implemented in 0.5 micron single-poly
triple-metal CMOS technology provided by Hewlett-Packard. The mask layout of the cell is
presented in Appendix A. The transfer characteristic of the cell with all the digital inputs set
to 1 is shown in figure 4.10.
Figure 4.9: Multiplier digital to analog converter
Figure 4.10: Transfer characteristics of MDAC (DC response)
4.2.2 ROM:
One factor in the selection of a digital memory device is the large number of weights: for N
neurons, O(N²) weights must be stored. The NAND configuration results in a considerable loss
in performance and is only useful for small memory arrays [5]. Therefore the NOR ROM shown
in figure 4.11 has been selected for the implementation. The combination of p-channel pull-up
and n-channel pull-down transistors constitutes a pseudo-NMOS NOR gate with the word lines
as inputs. The low resistance of the pull-up transistors reduces the noise margins, which is
controlled by feeding the bit lines to complementary inverters. Since the delay introduced in
the circuit by the analog blocks is much higher than that of the ROM, a static ROM is
sufficient for the system. The results for the cell are shown in figure 4.12.
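The read path just described — the selected word line pulls the bit lines of cells storing a 1 low, and the bit-line inverters restore the value — can be modeled behaviorally (a Python sketch using one common encoding convention and made-up contents; the actual weight words live in the ROM netlist of Appendix C):

```python
def nor_rom_read(rom_rows, word_select):
    """Behavioral sketch of a pseudo-NMOS NOR ROM read (convention assumed
    here: a pull-down transistor is placed where a 1 is stored).  The
    selected word line turns those pull-downs on, forcing their bit lines
    low against the p-channel pull-ups; the complementary inverter on each
    bit line then restores the stored value."""
    row = rom_rows[word_select]
    bit_lines = [0 if cell == 1 else 1 for cell in row]   # pull-down wins where cell = 1
    return [1 - bl for bl in bit_lines]                   # bit-line inverter output

# Two 4-bit words of made-up contents:
rom = [[1, 0, 1, 1],
       [0, 1, 0, 0]]
assert nor_rom_read(rom, 0) == [1, 0, 1, 1]
assert nor_rom_read(rom, 1) == [0, 1, 0, 0]
```

For N neurons with 8-bit quantized weights the array must hold on the order of 8·N² bits, which is why the NOR organization's density matters even though its speed is not critical here.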
Figure 4.11: ROM
Figure 4.12: Simulation results of ROM
The cell has been implemented in 0.5 micron single-poly triple-metal CMOS technology provided
by Hewlett-Packard. It has an area of 106.3 × 74.3 µm and a peak power dissipation of
17.5 milliwatts. The mask layout is presented in Appendix A. The cell has also been simulated
with the Verilog-XL simulator; the Verilog source code is given in Appendix C.
4.3 Trans-impedance amplifier:
The trans-impedance amplifier shown in figure 4.13 is basically a current-to-voltage converter.
This circuit is part of the adder block shown in figure 3.5; it converts the summed current
output of the MDACs to a voltage. A prime requirement of the trans-impedance amplifier is to
desensitize the input from capacitive loading. In bipolar designs this is achieved with the
Miller capacitance of the input transistor stage [88]. In MOS circuits it can be implemented
by stabilizing the circuit with the load impedance or by adding a compensation capacitor across
the feedback resistor [1].
It consists of two stages: a differential amplifier (M3, M4, M5, M6, M7, M8, M9) and a second,
gain-boosting stage comprising M1, M2, M10 and M11. Voltage-parallel feedback is applied
through active resistors M12 and M13, which results in a predictable I-V characteristic. The
drain current of M6 is reflected to M10 through M4 and is available for sourcing to the load.
M11 provides the ability to sink current from the output load. A capacitor is added to stabilize
the circuit as well as to decrease the settling time. The transient response of the cell is shown
in figure 4.14. The device sizes are shown in table 4.3. The cell is implemented in 0.5 micron
single-poly triple-metal CMOS technology provided by Hewlett-Packard. It has an area of
55.5 × 29 µm and a peak power dissipation of 3.3 milliwatts. The mask layout of the cell is
presented in Appendix A.
Table 4.3: Device sizes of trans-impedance amplifier
Figure 4.13: Trans-impedance amplifier
Figure 4.14: Transient response
4.4 VLSI implementation of Capacitor:
Theoretically, the capacitance is given by

C = ε A / d          (4.11)

where A is the area of the electric plate, d is the distance between the two plates and ε is the
dielectric constant of the insulator.
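Equation 4.11 can be evaluated directly. As an illustration, the sketch below uses the 96-Ångström oxide thickness from the model cards in Appendix B and a typical SiO2 dielectric constant (the passive capacitor in this design uses a different dielectric stack, so the numbers are purely indicative):

```python
E0 = 8.854e-12            # vacuum permittivity, F/m
K_SIO2 = 3.9              # typical relative dielectric constant of SiO2

def plate_capacitance(area_m2, d_m, k_rel=K_SIO2):
    # Equation 4.11: C = epsilon * A / d.
    return k_rel * E0 * area_m2 / d_m

# Capacitance per square micron for a 96-Angstrom (9.6 nm) dielectric:
c_per_um2 = plate_capacitance(1e-12, 9.6e-9)
assert 3.4e-15 < c_per_um2 < 3.8e-15        # roughly 3.6 fF per square micron

# Plate area a 0.5 pF capacitor would need at this specific capacitance:
area_um2 = 0.5e-12 / c_per_um2
assert 130 < area_um2 < 150                  # on the order of 140 square microns
```

The inverse trade-off is visible here: a thinner dielectric gives more capacitance per unit area, which is why the choice of dielectric layer dominates the capacitor's chip area.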
A capacitor can be designed using a standard MOS transistor. This approach suffers from two
drawbacks: first, the capacitance of a MOS structure is nonlinear and, secondly, it uses a large
chip area. For a MOS transistor used as a capacitor the area is given by

A = (Tox / εox) · C

where Tox is the thickness of the gate oxide, εox is the dielectric constant of the gate oxide,
C is the required capacitance and A is the area (A = WL).
There are two methods for the passive realization of capacitors. One method uses double
polysilicon separated by silicon dioxide; it requires a double-polysilicon process, as the upper
and lower plates of the capacitor are formed with polysilicon and the dielectric is formed by a
thin silicon dioxide layer. The other method uses a conducting layer on top of crystalline
silicon separated by a dielectric of silicon dioxide. In order to achieve a low voltage
coefficient, the bottom plate must be a heavily doped diffusion [72].
Our design is implemented in a 0.5 micron CMOS process, which is a single-poly process;
therefore the latter method described above has been used to implement the capacitor. The
capacitance in this case is given by

C = C0 (1 + C1·V + C2·V²)

where C0 is the nominal capacitance and C1 and C2 are voltage-dependent coefficients, which
are negligible [73]. This method showed more linear behavior and much less area in comparison
to the active capacitor. Table 4.4 shows the sizes versus capacitances.
Table 4.4: Device sizes of capacitors
The mask layout of capacitor is shown in Appendix A.
4.5 Summary:
The main building blocks for the VLSI implementation of the Boolean function XOR by
multi-layer feed-forward neural networks are described in this chapter. These blocks are
redesigned for implementation in 0.5 micron technology [1,63].
A symmetric differential pair, which has the characteristic of a sigmoid function, is used as
the main element in the current and voltage mode neurons. The hybrid synapse has been
constructed from a current mode digital to analog multiplier and a static ROM; the multiplier
uses cascode current mirrors. The trans-impedance amplifier for the conversion of current to
voltage uses voltage-parallel feedback for a predictable I-V characteristic, and the capacitor
ensures stability.
Chapter 5
Conclusion
Among neural networks, multi-layer feed-forward neural networks are the major and most widely
applicable model. The multistage nonlinear characteristics, lack of feedback and powerful
training schemes provide great potential for image/speech recognition, control, etc. Although
neural networks can be implemented in software, simulation takes a long time, because a highly
parallel system is modeled by a serial processor. Complete exploitation of their potential
therefore requires a large number of operations at high speed, which necessitates hardware
realization.
Analog VLSI offers area-efficient implementation of the functions required in neural networks,
such as multiplication, summation and the sigmoid transfer function. However, analog circuits
are sensitive to process variation, device matching and cascadability problems; for this reason
attention must be given to the limitations of MOS transistors and to the design techniques. As
a result, the high accuracy and linearity found in digital implementations can be traded off for
the simplicity, speed, silicon area and interconnectivity found in analog circuits. One major
drawback of analog implementation, however, is the non-availability of a reliable non-volatile
memory to store synaptic weights. Neural networks also have a large amount of connectivity,
which is a major problem in hardware realization. The former problem can be solved by using
a mixed signal architecture in which the synaptic weights can be easily stored in digital
memories. The reduction in physical interconnections can be achieved by implementing
time-division multiplexing.
The mixed signal architecture [1], based on analog CMOS units and a digital synaptic weight
memory, has been analyzed. The architecture is well suited for descent-based learning
algorithms such as error back-propagation and weight perturbation [63]. As an example for the
architecture, the Boolean function XOR has been implemented. The building blocks and the
network have been implemented in a 0.5 micron CMOS process.
The architecture was analyzed in chapter 3. In this architecture the number of physical
interconnections and multipliers is reduced considerably from the full implementation. The I/O
mode of the network was also discussed in section 3.4. The Boolean function XOR is
implemented as an example to test the validity of the concept. Since the architecture is hybrid,
quantized analog values are stored in digital memory; the network with quantized values is
discussed in section 3.8. The VLSI implementation of the XOR network is discussed in
section 3.9.
The analog and digital CMOS blocks [1] have been redesigned and are presented in chapter 4.
The neuron is presented in two modes, current and voltage, with a fixed threshold. The MDAC
and ROM constitute the synapse: the ROM stores the synaptic weight and the MDAC performs the
synaptic multiplication. The trans-impedance amplifier is the linear current-to-voltage
converter; it is a differential amplifier with voltage-parallel feedback, and stability is
ensured by connecting a compensation capacitor across the feedback. The robustness of the
neuron circuits has been tested by Monte Carlo analysis.
Speed improvements in the area of circuit design are an important direction for future research.
New integration technologies with smaller channel lengths and lower power-supply ranges will
add constraints to circuit design, and the current building blocks will have to be modified
before they can be used in advanced integration technologies. Lower-power and faster building
blocks will increase the speed of the network. This is an uphill task which, if successfully
done, will enhance the speed of the architecture and decrease its power consumption.
APPENDIX A
Mask Layout Diagrams
This appendix contains the layout diagrams of the building blocks of the network. The layout
diagram of a chip for the XOR function is also presented. The blocks and chip are implemented
in 0.5 micron single-polysilicon triple-metal CMOS technology. The mask layouts of the
cells/chip are:
1. Current mode neuron.
2. Voltage mode neuron.
3. Multiplier and digital to analog converter (MDAC).
4. Trans-impedance amplifier.
5. ROM with buffers.
6. Capacitor.
7. WRZAC (the neural chip for XOR).
Figure A.1: Mask layout of the current mode neuron
Figure A.2: Mask layout of voltage mode neuron
Figure A.3: Mask layout of multiplier and digital to analog converter
Figure A.4: Mask layout of trans-impedance amplifier
Figure A.5: Mask layout of ROM with static buffers
Figure A.6: Mask layout of capacitor
Figure A.7: Mask layout of neural chip
APPENDIX B
Simulation Models
B.1 HSPICE Level 3 model parameters:
.MODEL CMOSN NMOS LEVEL=3
+ PHI=0.700000 TOX=9.6000E-09 XJ=0.200000U TPG= 1 + VTO=0.6566 DELTA=6.9 100E-O 1 LW.7290E-O8 KP= 1.9647E-04 + UO=546.2 THETA=2.684ûE-0 1 RSH=3.5 12OEi-O 1 GAMMA=0.5976 + NSUB= 1.39îOE+ 17 WS=5.9090E+11 VMAX=2.008OE+05 ETA=3.7 180E-02 + KAPPA=2.8980E-02 CGDO=3.05 ISE- 10 CGSO=3.05 ISE- 10 + CGBO4.0239E- 10 CJz5.62E-04 MJ=0.559 CJSW=S.OOE- 1 1 + MJSW-0.52 1 PB4.99 + X W 4 . IOSE-07 * Weff = Wdrawn - Delta-W * The suggested Delta-W is 4.108OE-07
.MODEL CMOSP PMOS LEVEL=3 + PHI=0.700000 TOX-9.6000E-09 XJ4.200000U TPG=- 1 + VTO=-0.92 13 DELTA=2.8750E-0 1 LLk3.5070E-O8 KP4.8740E-05 + UO= 135.5 THETA= 1.8070E-0 1 RSH= 1.1000E-0 1 GAMMA=0.4673
+ NSUB=8.5 1 20E+ 16 NFS=6SûûûE+ 1 1 VMAX=2.5420E+05 ETA=2.45OOE-02 + KAPPA=7.958OE+OO CGDCk2.3922E- IO CGSO=2.3922E- 10 + CGBOz3.7579E- 10 CJz9.35E-04 MJz0.468 CJS W S 8 9 E - 10 + MJSW=0.505 PBd.99 + XW~3.622E-07 * Weff = Wdrawn - Delta-W * The suggested Delta-W is 3.6220E-07
B.2 HSPICE Level 13 (Berkeley Level 4) model parameters:
NMOS PARAMETERS
.MODEL CMOSN nmos levek 13 +vfbO=-7.05628E-0 1 ,Ivfb=-3 -86432E-02, wvfb4.98790E-02 +phi0=8.4 1 845E-0 l,lphi=0.00000E+00, wphi=û.OOûûûE+OO +k l=7.7657OE-O 1 ,Ik 1=-7.65089E-O4,wk l=-4.83494E-O2 +k2=2.66993E-02,lk24.57480E-02,~k2=-2.589 17E-02 +e tao=- 1 -94480E-03, leta= 1.7435 1 E-02, weta=-5 -089 1 &-O3 +muz=5.75297E+02,dlO= 1 -70587E-00 1 ,dwW.75746E-00 1 +u00=3.305 13E-0 1,1~0=9.75 1 lOE-O2,~~0=-8.58678E-02 +U 1 =3 -26384E-02, IU 1 =2.94349E-O2, WU 1 =- 1.38002E-02 +x2rn=9.73293E+00,1~2m=-5.62944E+OO, wxSrn=6.55955E+00 +x2e=4.37 180E-04,1~2e=-3.070 IOE-03, wx2e=8.94355E-04 +x3e=-5.050 12E-O5,lx3e=- 1 -68530E-03,wx3e=- I -4270 1E-03 + ~ 2 ~ 0 = - 1.1 1542E-02,1~2~0=-9.58423E-04, ~ ~ 2 ~ 0 4 . 6 1645E-03 + X ~ U 1 =- 1.04401E-03,lx2u 1= 1.2900 I E - O ~ , W X ~ U 1=-7.1009SE-04 +mus=6.927 16E+02,lms=-5.2 1760E+O 1, wms=7.009 12E+00 +x3ms=-6.4 1307E-02, Wms= 1.37809E+00, wx2ms-d. 1 S45SE+OO +x3ms=8.86387E+OO, lx3ms=2.0602 lE+ûO,wx3ms=-6.198 17E+00 +x3u 1 =9.02467E-03,lx3u 1 =2.06380E-û4,~~3u 1=-5.202 18E-03 +toxrn=9.60000E-003, tempm=2.70000E+O 1, vddm=S.OûûûûE+ûû
*N+ diffusion:: * +rshm=2.1, cjm=3 SOOOOûe-04, cjw=2.900000e- 10, +ijs= l e-08, p j d . 8 +pjw=0.8, mj0=0.44, mjw=0.26, wdf=O, ds=O
Gate Oxide Thickness is 96 Angstroms
PMOS PARAMETERS .MODEL CMOSP pmos level= 13 +vfbO=-2.026 lOE-Ol,l~fb=3.59493E-O2,~~fb=- 1.1065 1E-0 1 +phi0=8.25364E-0 l,lphi=û.OûûE+ûû, wphi=û.O0000E+OO +k 1~3.54 162E-01 ,lkl=-6.88 193E-02, wk1=1 S2476E-0 1 -tu=-4.5 1065E-02, Ikh9.4 1324E-03, ~k2=3.52243E-02 +etaO=- 1 -07507E-02, leta= 1 -96344E-02,weta=-3.5 1067E-04 +muz= l.37992€+02,dlO= 1 -92 169E-00 1 ,dwW.68470E-00 1 +uOO= 1.8933 1 E-O l,Iu0=6.30898E-02,~~0=-6.38388E-02 +U 1 = 1.3 17 10E-02, lu 1 = 1 -44096E-02, WU 1=6.92372E-04 +x2m=6.57709E+Qû,lx2m=- 1.56096E+00, wx2m= l.I3564E+OO +x2e=4.68478E-O5,1~2e=- 1 -09352E-03,wx2e=-1.53 I 1 1E-04 +x3e=7.76679E-O4,1~3e=- 1.972 13E-M,wx3e=- 1.12034E-03 +x3u0=8.7 1439E-03,1~2~0=- 1 -92306E-03, WX~UO= 1 -86243E-03 +x2u 1 =5 -9894 1 E-04,1~2~ 1 =4.54922E-O4, W X ~ U 1 =3.11794E-04 +mus= 1.49460E+02,lms= 1.36 l52E+û 1, wms=3.55246E+ûû +x2ms=6.37235E+00,1x2ms=-6.63305E-01, wx2ms=2.25929E+00 +x3ms=- 1.2 1 135E-O2,1~3ms= 1.92973E+00, wx3rns=1.00 182E+00 +x3u 1 =- 1.1 6599E-O3,1~3u 1 =-5.08278E-04, W X ~ U 1 S.5679 1 E-04 +toxm=9.60000E-003, tempm=2.70000E+Ol, vddm=5.00000E+00 +cgdom=4.18427E-0 IO,cgsom=4.18427E-0 lO,cgbornd.33943E-010 +xpart= l.000OûE+ûûO,durn 1 =0.00000E+000,dumS=O.O0000E+000 +nO= 1.00000E~,ln0=0.00000E+000,wnW.OOOOOE+000 +nb0=0.00000E~,Inb=O.00000E+000,wnb=0.00000E+000 +ndO=O.00000E+000,lnd=O.O0000E+000, wnd=û.O0000E+Oûû
*P+ diffusion::
APPENDIX C
Verilog Source Code
C.1 The Verilog behavioral source code:
// Verilog HDL for "thesis", "ROM_B", "behavioral"

module ROM_B (a1, a2, a3, a4, a5, a6, a7, a8,
              b1, b2, b3, b4, b5, b6, b7, b8,
              c1, c2, c3, c4, c5, c6, c7, c8,
              input1, input2);
output a1; output a2; output a3; output a4;
output a5; output a6; output a7; output a8;
output b1; output b2; output b3; output b4;
output b5; output b6; output b7; output b8;
output c1; output c2; output c3; output c4;
output c5; output c6; output c7; output c8;
input input1;
input input2;
// internal wiring
wire x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12,
     x13, x14, x15, x16, x17, x18, x19, x20, x21, x22, x23, x24;
wire z1, z2, z3, z4, z5, z6, z7, z8, z9, z10, z11, z12,
     z13, z14, z15, z16, z17, z18, z19, z20, z21, z22, z23, z24;
// power supplies
supply1 vdd;
supply0 gnd;

// transistors pmos
pmos p1(x1, vdd, gnd);   pmos p2(x2, vdd, gnd);   pmos p3(x3, vdd, gnd);
pmos p4(x4, vdd, gnd);   pmos p5(x5, vdd, gnd);   pmos p6(x6, vdd, gnd);
pmos p7(x7, vdd, gnd);   pmos p8(x8, vdd, gnd);   pmos p9(x9, vdd, gnd);
pmos p10(x10, vdd, gnd); pmos p11(x11, vdd, gnd); pmos p12(x12, vdd, gnd);
pmos p13(x13, vdd, gnd); pmos p14(x14, vdd, gnd); pmos p15(x15, vdd, gnd);
pmos p16(x16, vdd, gnd); pmos p17(x17, vdd, gnd); pmos p18(x18, vdd, gnd);
pmos p19(x19, vdd, gnd); pmos p20(x20, vdd, gnd); pmos p21(x21, vdd, gnd);
pmos p22(x22, vdd, gnd); pmos p23(x23, vdd, gnd); pmos p24(x24, vdd, gnd);
// transistors nmos

// inverters
not n1(z1, x1);    not n2(a1, z1);
not n3(z2, x2);    not n4(a2, z2);
not n5(z3, x3);    not n6(a3, z3);
not n7(z4, x4);    not n8(a4, z4);
not n9(z5, x5);    not n10(a5, z5);
not n11(z6, x6);   not n12(a6, z6);
not n13(z7, x7);   not n14(a7, z7);
not n15(z8, x8);   not n16(a8, z8);
not n17(z9, x9);   not n18(b1, z9);
not n19(z10, x10); not n20(b2, z10);
not n21(z11, x11); not n22(b3, z11);
not n23(z12, x12); not n24(b4, z12);
not n25(z13, x13); not n26(b5, z13);
not n27(z14, x14); not n28(b6, z14);
not n29(z15, x15); not n30(b7, z15);
not n31(z16, x16); not n32(b8, z16);
not n33(z17, x17); not n34(c1, z17);
not n35(z18, x18); not n36(c2, z18);
not n37(z19, x19); not n38(c3, z19);
not n39(z20, x20); not n40(c4, z20);
not n41(z21, x21); not n42(c5, z21);
not n43(z22, x22); not n44(c6, z22);
not n45(z23, x23); not n46(c7, z23);
not n47(z24, x24); not n48(c8, z24);
reg a1, a2, a3, a4, a5, a6, a7, a8, b1, b2, b3, b4, b5, b6, b7, b8, c1, c2, c3, c4, c5, c6, c7, c8;
initial begin

a1 = 1'b0; a2 = 1'b0; a3 = 1'b0; a4 = 1'b0; a5 = 1'b0; a6 = 1'b0; a7 = 1'b0; a8 = 1'b0;
b1 = 1'b0; b2 = 1'b0; b3 = 1'b0; b4 = 1'b0; b5 = 1'b0; b6 = 1'b0; b7 = 1'b0;

#1 $display("a1=%b, a2=%b, a3=%b, a4=%b, a5=%b, a6=%b, a7=%b, a8=%b, b1=%b, b2=%b, b3=%b, b4=%b, b5=%b, b6=%b, b7=%b, b8=%b, c1=%b, c2=%b, c3=%b, c4=%b, c5=%b, c6=%b, c7=%b, c8=%b\n", a1, a2, a3, a4, a5, a6, a7, a8, b1, b2, b3, b4, b5, b6, b7, b8, c1, c2, c3, c4, c5, c6, c7, c8);

a1 = 1'b0; a2 = 1'b1; a3 = 1'b0; a4 = 1'b1; a5 = 1'b0; a6 = 1'b0; a7 = 1'b0; a8 = 1'b1;
b1 = 1'b0; b2 = 1'b1; b3 = 1'b0; b4 = 1'b1; b5 = 1'b0; b6 = 1'b0; b7 = 1'b0; b8 = 1'b1;
c1 = 1'b1; c2 = 1'b1; c3 = 1'b1; c4 = 1'b1; c5 = 1'b1; c6 = 1'b1; c7 = 1'b1; c8 = 1'b0;

#50 $display("a1=%b, a2=%b, a3=%b, a4=%b, a5=%b, a6=%b, a7=%b, a8=%b, b1=%b, b2=%b, b3=%b, b4=%b, b5=%b, b6=%b, b7=%b, b8=%b, c1=%b, c2=%b, c3=%b, c4=%b, c5=%b, c6=%b, c7=%b, c8=%b\n", a1, a2, a3, a4, a5, a6, a7, a8, b1, b2, b3, b4, b5, b6, b7, b8, c1, c2, c3, c4, c5, c6, c7, c8);
a1 = 1'b1; a2 = 1'b1; a3 = 1'b1; a4 = 1'b1; a5 = 1'b0; a6 = 1'b0; a7 = 1'b0; a8 = 1'b0;
b1 = 1'b1; b2 = 1'b1; b3 = 1'b1; b4 = 1'b1; b5 = 1'b0; b6 = 1'b0; b7 = 1'b0; b8 = 1'b0;
c1 = 1'b0; c2 = 1'b0; c3 = 1'b1; c4 = 1'b0; c5 = 1'b1; c6 = 1'b1; c7 = 1'b1; c8 = 1'b0;

#50 $display("a1=%b, a2=%b, a3=%b, a4=%b, a5=%b, a6=%b, a7=%b, a8=%b, b1=%b, b2=%b, b3=%b, b4=%b, b5=%b, b6=%b, b7=%b, b8=%b, c1=%b, c2=%b, c3=%b, c4=%b, c5=%b, c6=%b, c7=%b, c8=%b\n", a1, a2, a3, a4, a5, a6, a7, a8, b1, b2, b3, b4, b5, b6, b7, b8, c1, c2, c3, c4, c5, c6, c7, c8);
end
endmodule
C.2 The Verilog stimulus source code:
// Verilog HDL for "thesis", "ROM_t", "behavioral"

module ROM_t;

reg input1, input2;
wire a1, a2, a3, a4, a5, a6, a7, a8, b1, b2, b3, b4, b5, b6, b7, b8, c1, c2, c3, c4, c5, c6, c7, c8;
ROM_B s1(a1, a2, a3, a4, a5, a6, a7, a8, b1, b2, b3, b4, b5, b6, b7, b8, c1, c2, c3, c4, c5, c6, c7, c8, input1, input2);
initial begin
input1 = 1'b0; input2 = 1'b0;
#1 $display("a1 = %b, a2 = %b, a3 = %b, a4 = %b, a5 = %b, a6 = %b, a7 = %b, a8 = %b, b1 = %b, b2 = %b, b3 = %b, b4 = %b, b5 = %b, b6 = %b, b7 = %b, b8 = %b, c1 = %b, c2 = %b, c3 = %b, c4 = %b, c5 = %b, c6 = %b, c7 = %b, c8 = %b, input1 = %b, input2 = %b\n", a1, a2, a3, a4, a5, a6, a7, a8, b1, b2, b3, b4, b5, b6, b7, b8, c1, c2, c3, c4, c5, c6, c7, c8, input1, input2);

input1 = 1'b1; input2 = 1'b0;
#50 $display("a1 = %b, a2 = %b, a3 = %b, a4 = %b, a5 = %b, a6 = %b, a7 = %b, a8 = %b, b1 = %b, b2 = %b, b3 = %b, b4 = %b, b5 = %b, b6 = %b, b7 = %b, b8 = %b, c1 = %b, c2 = %b, c3 = %b, c4 = %b, c5 = %b, c6 = %b, c7 = %b, c8 = %b, input1 = %b, input2 = %b\n", a1, a2, a3, a4, a5, a6, a7, a8, b1, b2, b3, b4, b5, b6, b7, b8, c1, c2, c3, c4, c5, c6, c7, c8, input1, input2);

input1 = 1'b0; input2 = 1'b1;
#50 $display("a1 = %b, a2 = %b, a3 = %b, a4 = %b, a5 = %b, a6 = %b, a7 = %b, a8 = %b, b1 = %b, b2 = %b, b3 = %b, b4 = %b, b5 = %b, b6 = %b, b7 = %b, b8 = %b, c1 = %b, c2 = %b, c3 = %b, c4 = %b, c5 = %b, c6 = %b, c7 = %b, c8 = %b, input1 = %b, input2 = %b\n", a1, a2, a3, a4, a5, a6, a7, a8, b1, b2, b3, b4, b5, b6, b7, b8, c1, c2, c3, c4, c5, c6, c7, c8, input1, input2);
end

endmodule
REFERENCES
[1] A. Nosratinia, "An architecture for multi-layer feedforward neural networks", M.A.Sc. thesis, University of Windsor, 1991.
[2] W. James, Psychology (Briefer Course). New York: Holt, Chapter XVI, "Association", 1890, pp. 253-279.
[3] W. S. McCulloch and W. Pitts, "A logical calculus of the ideas immanent in nervous activity", Bulletin of Mathematical Biophysics, Vol. 5, pp. 115-133, 1943.
[4] J. von Neumann, The Computer and the Brain. New Haven: Yale University Press, 1958, pp. 66-82.
[5] Jan M. Rabaey, Digital Integrated Circuits: A Design Perspective. Prentice Hall, December 1995.
[6] B. Widrow and M. A. Lehr, "30 years of adaptive neural networks: Perceptron, Madaline, and Backpropagation", Proceedings of the IEEE, Vol. 78, No. 9, pp. 1415-1442, Sept. 1990.
[7] B. Boser, E. Säckinger, J. Bromley, Y. LeCun and L. Jackel, "Hardware requirements for neural network pattern classifiers", IEEE Micro, Vol. 12, No. 1, pp. 32-39, Feb. 1992.
[8] E. Säckinger, B. Boser, J. Bromley, Y. LeCun and L. Jackel, "Application of the ANNA neural network chip to high speed character recognition", IEEE Trans. Neural Networks, Vol. 3, No. 3, pp. 498-505, May 1992.
[9] T. X. Brown, M. D. Tran, T. Daud and A. P. Thakoor, "Cascaded VLSI neural network chips: Hardware learning for pattern recognition and classification", Simulation, pp. 340-346, May 1992.
[10] J. von Neumann, "First draft of a report on the EDVAC", in The Origins of Digital Computers: Selected Papers, B. Randell (Editor), Berlin: Springer Verlag, 1945/1982.
[11] A. Sankar and R. Mammone, "Speaker independent vowel recognition using neural trees", Int. Joint Conf. on Neural Networks, Vol. 2, Seattle, WA, pp. 809-814, 1991.
[12] D. O. Hebb, The Organization of Behavior. New York: Wiley, 1949.
[13] F. Rosenblatt, "The perceptron: a probabilistic model for information storage and organization in the brain", Psychological Review, Vol. 65, pp. 386-408, 1958.
[14] B. Widrow and M. E. Hoff, "Adaptive switching circuits", 1960 IRE WESCON Convention Record, IRE, New York, pp. 96-104.
[15] M. Minsky and S. Papert, Perceptrons. Cambridge, Massachusetts: MIT Press, 1969.
[16] T. Kohonen, "Correlation matrix memories", IEEE Transactions on Computers, Vol. C-21, pp. 353-359, 1972.
[17] S. Grossberg, "Adaptive pattern classification and universal recoding: I. Parallel development and coding of neural feature detectors", Biological Cybernetics, Vol. 23, No. 4, pp. 121-134, July 1976.
[18] J. L. McClelland and D. E. Rumelhart, "An interactive activation model of context effects in letter perception: Part 1. An account of basic findings", Psychological Review, Vol. 88, pp. 375-407, 1981.
[19] L. O. Chua and L. Yang, "Cellular neural networks: Theory", IEEE Trans. Circuits & Systems, Vol. 35, No. 10, pp. 1257-1272, Oct. 1988.
[20] D. E. Rumelhart and J. L. McClelland, "An interactive activation model of context effects in letter perception: Part II. The contextual enhancement effect and some tests and extensions of the model", Psychological Review, Vol. 89, pp. 75-112, 1982.
[21] J. J. Hopfield, "Neural networks and physical systems with emergent collective computational abilities", Proceedings of the National Academy of Sciences, Vol. 79, pp. 2554-2558, April 1982.
[22] J. J. Hopfield, "Neurons with graded response have collective computational properties like those of two-state neurons", Proceedings of the National Academy of Sciences, Vol. 81, pp. 3088-3092, May 1984.
[23] T. Kohonen, "Self-organized formation of topologically correct feature maps", Biological Cybernetics, Vol. 43, pp. 59-69, 1982.
[24] T. Kohonen, "Adaptive, associative, and self-organizing functions in neural computing", Applied Optics, Vol. 26, No. 23, pp. 4910-4918, December 1987.
[25] T. Kohonen, Self-Organization and Associative Memory, Berlin: Springer Verlag, 1984.
[26] K. Fukushima, "Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position", Biological Cybernetics, Vol. 36, pp. 193-202, April 1980.
[27] K. Fukushima, S. Miyake and T. Ito, "Neocognitron: a neural network model for a mechanism of visual pattern recognition", IEEE Transactions on Systems, Man and Cybernetics, Vol. SMC-13, No. 5, pp. 826-834, Sept./Oct. 1983.
[28] K. Fukushima, "Neocognitron: a hierarchical neural network capable of visual pattern recognition", Neural Networks, Vol. 1, No. 2, pp. 109-130, 1988.
[29] K. Fukushima, "A neural network for visual pattern recognition", IEEE Computer Magazine, pp. 65-75, March 1988.
[30] D. H. Ackley, G. E. Hinton and T. J. Sejnowski, "A learning algorithm for Boltzmann machines", Cognitive Science, Vol. 9, pp. 147-169, 1985.
[31] D. E. Rumelhart, G. E. Hinton and R. J. Williams, "Learning representations by back-propagating errors", Nature, Vol. 323, No. 6088, pp. 533-536, October 1986.
[32] Y. Le Cun, "Learning processes in an asymmetric threshold network", Disordered Systems and Biological Organization, E. Bienenstock, F. Fogelman Soulié and G. Weisbuch (Editors), Berlin: Springer Verlag, 1986.
[33] D. Parker, "Learning Logic", Technical Report TR-47, Center for Computational Research in Economics and Management Science, MIT, Cambridge, MA, 1985.
[34] P. J. Werbos, "Beyond regression: new tools for prediction and analysis in the behavioral sciences", Ph.D. thesis, Harvard University, Cambridge, MA, 1974.
[35] B. G. Farley and W. A. Clark, "Simulation of self-organizing systems by digital computer", IRE Transactions on Information Theory, Vol. 4, pp. 76-84, 1954.
[36] T. J. Sejnowski and C. R. Rosenberg, "NETtalk: a parallel network that learns to read aloud", The Johns Hopkins University Electrical Engineering and Computer Science Technical Report JHU/EECS-86/01, 1986.
[37] J. Bernstein, "Profiles: AI, Marvin Minsky", The New Yorker, pp. 50-126, 14 December 1981.
References
[38] H. D. Block, "The perceptron:a mode1 for brain functioning", Reviews of Modern Physics
, VOL 34, pp. 123-135, 1962.
[39] B. Furman and A. A. Abidi, "An analog CMOS backward error propagation LSI", Proc.
2nd Asilomar Conf. on Signals, Systems & Computers, pp. 645-648, 1988.
[40] M. Holler et al., "An Electrically Trainable Artificial Neural Network (ETANN) with 10240
Floating Gate Synapses", Proc. 1989 Int'l. Joint Conf. on Neural Networks, Vol. 2,
pp. 191-196, June 1989.
[41] C. Schneider and H. Card, "Analog CMOS Hebbian synapses", Electronics Letters,
Vol. 27, No. 9, pp. 585-586, 25 April 1991.
[42] T. H. Borgstrom, M. Ismail and S. B. Bibyk, "Programmable current-mode network for
implementation in analog VLSI", IEE Proceedings, Part G, Vol. 137, No. 2, pp. 75-84,
April 1990.
[43] S. Satyanarayana, Y. Tsividis and A. F. Graf, "A Reconfigurable VLSI Neural Network",
IEEE Journal of Solid-State Circuits, Vol. 27, No. 1, pp. 67-81, Jan. 1992.
[44] J. B. Lont and W. Guggenbuhl, "Analog CMOS implementation of a multilayer perceptron
with nonlinear synapses", IEEE Trans. on Neural Networks, Vol. 3, pp. 457-465, May 1992.
[45] G. Cauwenberghs, C. F. Neugebauer and A. Yariv, "Analysis and verification of an analog
VLSI Incremental Outer-Product Learning System", IEEE Trans. on Neural Networks,
Vol. 3, No. 3, pp. 488-497, May 1992.
[46] B. Hochet et al., "Implementation of a learning Kohonen neuron based on a new multilevel
storage technique", IEEE Journal of Solid-State Circuits, Vol. 26, No. 3, pp. 262-267,
March 1991.
[47] D. J. Weller and R. R. Spencer, "A process-invariant analog neural network IC
with dynamically refreshed weights", Proc. 33rd Midwest Symp. on Circuits & Systems,
pp. 273-276, Alberta, Canada, 1990.
[48] M. Verleysen and D. Jespers, "Precision of computations in analog neural networks", in VLSI
Design of Neural Networks, U. Ramacher and U. Ruckert (eds.), Kluwer Academic Publishers,
pp. 65-81, 1991.
[49] B. Boser, E. Säckinger, J. Bromley, Y. LeCun and L. D. Jackel, "An analog neural network
processor with programmable topology", IEEE Journal of Solid-State Circuits, Vol. 26,
pp. 2017-2025, December 1991.
[50] V. Hu, A. Kramer and P. K. Ko, "EEPROM's as analog storage devices for neural nets",
Neural Networks, Suppl., Vol. 1, p. 387, 1988.
[51] E. Vittoz et al., "Analog storage of adjustable synaptic weights", in VLSI Design of Neural
Networks, U. Ramacher and U. Ruckert (Editors), Kluwer Academic Publishers, pp. 47-63,
1991.
[52] E. Pasero, "Floating gates as adaptive weights for artificial neural networks", in Silicon
Architectures for Neural Nets, Proceedings of IFIP (WG 10.5) Workshop on Silicon
Architectures for Neural Nets, pp. 125-135, Saint Paul de Vence, France, 1990.
[53] A. F. Murray and A. V. Smith, "Asynchronous VLSI neural networks using pulse-stream
arithmetic", IEEE Journal of Solid-State Circuits, Vol. 23, No. 3, pp. 688-697, June 1988.
[54] Y. Arima et al., "A refreshable analog VLSI neural network chip with 400 neurons and 40k
synapses", IEEE Journal of Solid-State Circuits, Vol. 27, No. 12, pp. 1854-1861,
Dec. 1992.
[55] A. Moopen, T. Duong and A. P. Thakoor, "Digital-Analog Hybrid Synapse Chips for
Electronic Neural Networks", Advances in Neural Information Processing Systems 2
(NIPS 89), D. S. Touretzky (editor), Morgan Kaufmann Publishers, 1990.
[56] A. Nosratinia, M. Ahmadi, M. Shridhar and G. A. Jullien, "A Hybrid Architecture for Feed-
Forward Multi-layer Neural Networks", Proc. 1992 IEEE International Symposium on
Circuits and Systems, Vol. 3, pp. 1541-1544, San Diego, USA.
[57] F. Distante, M. G. Sami, R. Stefanelli and G. Storti-Gajani, "A compact and fast silicon
implementation for layered neural networks", Proc. Int'l. Workshop on VLSI for AI and
Neural Nets, Oxford, UK, Sep. 1990.
[58] R. R. Spencer, "Analog implementation of artificial neural networks", Proc. IEEE
International Symposium on Circuits and Systems, pp. 1271-1274, Singapore, 1991.
[59] J. L. McClelland and D. E. Rumelhart, Explorations in Parallel Distributed Processing: A
Handbook of Models, Programs and Exercises, Cambridge, Massachusetts: MIT Press,
1988.
[60] R. P. Lippmann, "An introduction to computing with neural nets", IEEE ASSP Magazine,
Vol. 4, pp. 4-22, April 1987.
[61] J. Raffel et al., "A generic architecture for wafer-scale Neuromorphic Systems", Proc.
IEEE 1st International Conference on Neural Networks, Vol. III, p. 501, 1987.
[62] J. Reinhardt and B. Muller, An Introduction to Neural Networks, Springer-Verlag, 1990.
[63] N. Yazdi, "Pipelined and Trainable Architectures for Multi-layer Neural Networks",
M.A.Sc. Thesis, University of Windsor, 1992.
[64] D. E. Rumelhart, G. E. Hinton and R. J. Williams, "Learning internal representations by
error back propagation", in Parallel Distributed Processing, D. E. Rumelhart and J. L.
McClelland, eds., Ch. 8, Cambridge, MA: MIT Press, 1988.
[65] L. E. Atlas and Y. Suzuki, "Digital systems for artificial neural networks", IEEE Circuits
and Devices Magazine, Vol. 5, No. 6, pp. 20-24, November 1989.
[66] M. Yasunaga et al., "A wafer scale integration neural network utilizing completely digital
circuits", Proc. IJCNN 1989, pp. II-213-217, 1989.
[67] Blasius, Boylan and Kramer, Founders of Experimental Physiology, J. F. Lehmanns, 1971.
[68] J. J. Paulos and W. Hollis, "Artificial Neural Networks Using MOS Analog Multipliers",
IEEE Journal of Solid-State Circuits, Vol. 25, No. 3, pp. 849-855, June 1990.
[69] S. Sadeghi Emamchaie, "Cellular Neural Networks", Ph.D. Thesis, University of Windsor,
1999.
[70] J. Hertz, A. Krogh and R. G. Palmer, Introduction to the Theory of Neural Computation,
Addison-Wesley Publishing Company, 1992.
[71] B. W. Lee and B. J. Sheu, "Design of a Neural-Based A/D Converter Using Modified
Hopfield Network", IEEE Journal of Solid-State Circuits, Vol. 24, No. 4, pp. 1129-1135,
August 1989.
[72] Phillip Allen and Douglas R. Holberg, CMOS Analog Circuit Design, Oxford University
Press, New York, 1987.
[73] Design rule manual for HP-14TB process.
[74] Meta-Software, HSPICE User's Manual: Simulation and Analysis, Version 96.1 for HSPICE
Release 96.1, California, Feb. 1996.
[75] G. G. Lorentz, "The 13th Problem of Hilbert", Mathematical Developments Arising from
Hilbert Problems, American Mathematical Society, Providence, R.I., 1976.
[76] Thomson and Brooke, "A floating-gate MOSFET with tunneling injector fabricated using
a standard double-polysilicon CMOS process", IEEE Electron Device Letters, Vol. 12,
No. 3, pp. 111-113, 1991.
[77] D. B. Schwartz, R. E. Howard and W. Hubbard, "A Programmable Analog Neural Network
Chip", IEEE Journal of Solid-State Circuits, Vol. 24, No. 2, pp. 313-319, April 1989.
[78] J. Mann and S. Gilbert, "An analog self-organizing neural network chip", in Advances in
Neural Information Processing Systems 1, pp. 739-747, Morgan Kaufmann, 1989.
[79] M. Walker, S. Haghighi, L. Akers, R. O. Grondin and D. K. Ferry, "Parallel Hardware
Architecture for Neuromorphic Computation", Proc. 3rd Annual Parallel Processing
Symposium, pp. 540-548, March 1989.
[80] P. Brown, R. Millecchia and M. Stinley, "Analog memory for continuous-voltage, discrete-
time implementation of Neural Networks", IEEE Trans. Neural Networks, Vol. 3, pp. 523-
530, San Diego, CA, 1987.
[81] S. Eberhardt, T. Daud and A. Thakoor, "A VLSI analog synapse 'building-block' chip for
hardware neural network implementations", Proc. 3rd Annual Parallel Processing
Symposium, pp. 257-267, March 1987.
[82] F. J. Kub, K. K. Moon, I. A. Mack and F. M. Long, "Programmable analog vector-matrix
multipliers", IEEE Journal of Solid-State Circuits, Vol. 25, No. 1, pp. 645-648, Dec. 1990.
[83] H. P. Graf and D. Henderson, "A reconfigurable CMOS neural network", Proc. IEEE Int'l.
Solid-State Circuits Conference (ISSCC), pp. 144-146, 1990.
[84] T. Daud, S. Eberhardt, M. D. Tran and A. Thakoor, "Learning and optimization with
cascaded VLSI Neural Network building-block chips", Proc. IJCNN, pp. 184-192, 1992.
[85] H. Djahanshahi, "A robust hybrid VLSI neural network architecture for a smart optical
sensor", Ph.D. Thesis, University of Windsor, 1998.
[86] B. Maclean, "A VLSI implementation of an intelligent sensor", M.A.Sc. Thesis, University
of Windsor, 1998.
[87] P. R. Gray and R. G. Meyer, Analysis and Design of Analog Integrated Circuits, New York:
John Wiley and Sons, 1977.
[88] Robert G. Meyer and Robert Alan Blauschild, "A wide-band low-noise monolithic trans-
impedance amplifier", IEEE Journal of Solid-State Circuits, Vol. SC-21, No. 4, August
1986.
VITA AUCTORIS
Zulfiqar Ahmed was born in Pakistan in 1967. He graduated from Pakistan Naval Junior Cadet
College in 1984 and obtained his B.E. in Electrical Engineering from N.E.D. University in 1988.
From 1988 to 1993, he served on board Pakistan Navy ships and shore establishments as Weapon
Electrical Officer. From 1994 to 1995, he was Deputy Manager of the Design, Configuration
Management and R&D Department at the Pakistan Navy Electronic Research Centre. In 1996, he
joined the graduate program at the University of Windsor and obtained his M.A.Sc. degree in
Electrical Engineering in 1999.