
Image identification system based on an optical broadcast neural network processor

Marta Ruiz-Llata and Horacio Lamela-Rivera

We describe the implementation of a vision system based on a hardware neural processor. The architecture of the neural network processor has been designed to exploit the computational characteristics of electronics and the communication characteristics of optics in an optimal manner; thus it is based on an optical broadcast of input signals to a dense array of processing elements. The vision system has been built by use of a prototype implementation of a neural network processor with discrete optic and optoelectronic devices. It has been adapted to work as a Hamming classifier of the images taken with a 128 × 128 complementary metal-oxide semiconductor image sensor. Results, performance characteristics of the image classification system, and an analysis of its scalability in size and speed with the improvement of the optoelectronic neural processor are presented. © 2005 Optical Society of America

OCIS codes: 100.0100, 200.4700.

1. Introduction

Neural networks can be widely used in industrial vision systems as they can deal with many image classification and pattern recognition tasks.1,2 A neural network consists of a set of simple processing elements with a high degree of interconnectivity between them. The processing elements or neurons are connected by weights that can be adapted to improve performance. The computation strength of a single processing element is small, but when huge numbers of them work together, the result is a powerful machine that could be suitable for tasks such as pattern recognition and classification.3

In most cases neural networks are emulated by software, but software implementations are often insufficient to meet the real-time requirements of many industrial vision applications. To alleviate this problem, hardware platforms are used that are capable of increasing the speed compared with conventional digital processors based on the von Neumann architecture.2 Most of the commercial neurocomputing hardware, embedded neurosystems for special applications, PC accelerator cards, and neurocomputers are built by use of neural network chips. These chips have been designed to carry out the basic operations performed by neural network algorithms in parallel; in other words, broadcast the input, multiply, and add.3

The operation speed increases as the number of neural operations that can be done in parallel increases, along with the number of processing elements that make up the neural processor architecture. In a neuroprocessing hardware system, the number of possible interconnections increases with the square of the number of processing elements within the neural system. It is desirable to build architectures composed of hundreds or thousands of processing elements with massive interconnections between them. It can be inferred that one of the key issues for the construction of hardware architectures for neural networks is interconnection and parallel processing.4

The main drawback of wired microelectronic neural network chips is that they have a low interconnection capacity, so they are difficult to scale up to a large number of processing elements. The use of optical techniques to implement interconnections in neural network hardware architectures appears to be an attractive alternative.5 The advantages it promises over electronic interconnections are massive parallelism, speed, and cross-talk-free interconnections.6

Although there are potential benefits of optical interconnections for hardware neural networks,5,6 there appear to be only a few optoelectronic neural processors that are shown to have applications, and none are in use. Most of them are based on an optical vector–matrix multiplier, such as the first optoelectronic Hopfield neural network proposed by Farhat et al.7 In these systems, input is introduced into the optical processor by a modulated one-dimensional or two-dimensional source of light. The input beam intensities are individually multiplied by the weight-matrix mask [usually accomplished by use of a spatial light modulator (SLM)], and the resulting optical signals are distributed to the output plane, in which an array of optical detectors adds their contributions to form the output.5,7,8 Some problems encountered with these systems are optical alignment and interconnection-weight reconfiguration and assignment with a SLM. Although SLM technology has advanced over the past few years,9 many demonstration systems perform associative memories in which the weight mask is fixed.

M. Ruiz-Llata ([email protected]) and H. Lamela-Rivera are with the Grupo de Optoelectrónica y Tecnología Láser, Universidad Carlos III de Madrid, C/Butarque 15, 28911 Leganés, Madrid, Spain.

Received 6 May 2004; revised manuscript received 15 October 2004; accepted 22 October 2004.

0003-6935/05/122366-11$15.00/0
© 2005 Optical Society of America

2366 APPLIED OPTICS, Vol. 44, No. 12, 20 April 2005

As mentioned above, the basic operations performed by neural network algorithms are to broadcast the input, multiply, and add.3 Multiplication and addition can be carried out much more easily by electronics than with an optical system, as all present-day computing systems demonstrate. On the other hand, optics performs better at massive communication at high speed, as demonstrated by present-day communication systems. To our knowledge, only one recent application demonstration system of an optoelectronic neuroprocessor has managed this, using optics to communicate and electronics to process: the optoelectronic neural-network scheduler described in Ref. 10. The optoelectronic architecture basically consists of a set of winner-take-all (WTA) networks with a regular interconnection pattern. The particular characteristic of the system is that there is a constant weight associated with each interconnection, so the presence of the SLM is not necessary. The authors have improved the system by modifying the electronic part,11 its main disadvantage being that the optical system has been designed particularly for optimization tasks.

Lamela et al.12 described a novel optical broadcast architecture for neural networks. It is a hybrid optoelectronic architecture that does not attempt to compute in the optical domain. Our intention is to exploit the communication strength of optics and the computational strength of electronics in an optimal manner. Interconnections are carried out in the optical domain, whereas interconnection-weight assignment is done in the electronic domain. Thus, with this approach, reconfigurable optical interconnections are not required, and the high fan-out and massive parallelism of optical beams are exploited.

Here we focus on the description of a vision system that uses an optical broadcast neural network as a processing core. In Section 2 we describe the conception of the optical broadcast hardware. In Section 3 we give a description of the design of the optoelectronic neurons that make up the architecture. In Section 4 we describe the neural network model (a Hamming classifier) used in the vision system and its implementation based on an optical broadcast neural network architecture. In Section 5 we focus on the whole image classification system, composed of an image sensor, a memory to store the sample patterns, a controller, and a core optoelectronic neural processor. Results are presented in Section 6. In Section 7 we present an evaluation of the projected performance of the system, compared with pure electronic implementations. Conclusions and further research are presented in Section 8.

2. Optical Broadcast Architecture for Neural Networks

The basic operation of one neuron in a neural network is to provide one output that is a nonlinear function of weighted inputs.3 Our proposed optoelectronic hardware architecture12 is composed of a set of weight up and accumulate neurons that implement that operation in a time-multiplexing scheme. The neurons are grouped in cells, and all the neurons in a cell share the same time-distributed input. The whole architecture (Fig. 1) is made up of K cells with M neurons in each.

The main feature of our architecture, compared with electronic architectures, is the global interconnection performed by use of a special holographic diffuser that efficiently broadcasts the input to all the neurons in a cell. By attaching these cells one on top of another into noninterfering planes, parallel processing of the input data is possible.

The main difference between this system and other optoelectronic neural network hardware implementations is the time multiplexing of interconnection weights, as opposed to spatial multiplexing; i.e., there is one input in one time slot that is broadcast to all the neurons in a cell.

Figure 2 is a block diagram of one optical broadcast cell. The cell works as follows. First, all the neurons are cleared. The operation cycle of a cell is divided into time slots; in the first time slot the first input is introduced and optically distributed to all the neurons. Each neuron executes the product of the first input with the corresponding interconnection weight and stores the result. In the next time slot the second input is broadcast, multiplied by the interconnection weight, and added to the previous result. At the end of the operation cycle all the inputs have been introduced, and the output of the cell is the product of the input vector and the interconnection weight matrix. These outputs can be connected to different hardware blocks, such as threshold electronic circuits or WTA circuits, which suppress all outputs other than the one whose initial input was the maximum.

Fig. 1. Optical broadcast neural network architecture.
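The per-cell operation described above amounts to a time-multiplexed multiply-accumulate. As a rough illustration, a pure software sketch (our own model, not the paper's hardware; all names are illustrative):

```python
def broadcast_cell(inputs, weights):
    """Simulate one optical broadcast cell: in each time slot one input
    element is broadcast to every neuron in the cell, multiplied by that
    neuron's weight for the slot, and accumulated."""
    accumulators = [0.0] * len(weights)       # CLR: clear all neurons
    for t, x in enumerate(inputs):            # one time slot per input
        for m, w_row in enumerate(weights):   # optical broadcast of x
            accumulators[m] += x * w_row[t]   # weight up and accumulate
    return accumulators                       # weight matrix times input
```

After the N time slots, each accumulator holds one component of the weight-matrix/input-vector product, which is what the cell delivers to the threshold or WTA stage.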

This architecture is essentially a hybrid (optical–electronic) vector–matrix multiplier in which communication fan-out is done optically and the interconnection-weight assignment and other neuron operations are done electronically. The optical time-multiplexing scheme minimizes the number of detectors (only one per neuron is needed, i.e., M detectors in each of the K cells), minimizes the number of optical emitters (only one per cell, K in total), and simplifies the design of the optical interconnection device, which facilitates optical alignment. Neural-system synchronization could also be carried out by optical broadcast of the clock signal.

3. Optoelectronic Neuron Designs

The input vector is distributed sequentially to all the neurons in the cell by means of an optical signal. The optoelectronic neuron must multiply each input with the corresponding interconnection weight and accumulate the result until the entire input vector has been introduced at the end of the operation cycle. The basic neuron scheme that implements that operation12,13 is presented in Fig. 3. The light pulses that reach each neuron are converted into current pulses by the photodetector. An analog multiplexer, controlled by the corresponding interconnection weight, connects the detector or not (binary weights) to a capacitor that acts as the storage element. In this way, the product operation is implemented by the analog multiplexer and the accumulation function of the neuron is implemented by the capacitor [Eq. (1)]. Additionally, there is a switch controlled by the clear (CLR) signal that resets the capacitor at the beginning of an operation cycle.

Equation (1) summarizes how a neuron works. It represents the increase in the voltage of the capacitor in a time slot, which is proportional to the product of the input (I) and interconnection weight (W) and to the design parameters of the system. These are the value of the storage capacitor (C), the optical power that reaches the detector (P), the responsivity of the detector (R), and the time slot (Δt). For binary unipolar inputs and interconnection weights, if input I is 0 the voltage in the capacitor does not increase because no light hits the detector and no photocurrent is generated; if input I is 1, the voltage in the capacitor increases only if the interconnection weight is 1, because in this case the detector is connected to the capacitor:

ΔVC = I W (1/C) P R Δt.    (1)
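As a numerical check of Eq. (1), the sketch below evaluates the capacitor step in one time slot; the component values are illustrative assumptions, not figures from the paper:

```python
C = 100e-12   # storage capacitance in farads (assumed value)
P = 1e-3      # optical power reaching the detector in watts (assumed)
R = 0.5       # detector responsivity in A/W (assumed)
DT = 1e-8     # time-slot duration in seconds (assumed)

def delta_vc(i, w):
    """Voltage increase of the capacitor in one time slot, per Eq. (1),
    for binary unipolar input i and interconnection weight w."""
    return i * w * (1.0 / C) * P * R * DT
```

The capacitor voltage rises only when the input is 1 (light reaches the detector) and the weight is 1 (the multiplexer connects the detector to the capacitor).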

Figure 4 shows the temporal processing of a neuron in the architecture. The input vector is introduced within an operation cycle, which is divided into time slots; each element of the input is introduced in one time slot. The figure shows an example of the evolution of the signals for binary inputs and binary interconnection weights; the input and the interconnection weights of one processing element are shown. The result of the accumulation can be read at the end of the operation cycle.

The neuron design described above allows only binary input patterns and binary interconnection weights with values {0, 1}. Higher resolution of inputs and interconnection weights can be obtained by proper modulation inside a time slot; inputs can be codified as pulse-frequency modulated signals and interconnection weights as pulse-width modulated signals.14 In this way we can exploit the high bandwidth of the optical emitters and interconnections.

Fig. 2. Block diagram of the optoelectronic neural network cell.

Fig. 3. Basic weight up & accumulate neuron design.

Fig. 4. Temporal processing of a neuron.

Figure 5 shows the waveforms for a high-resolution operation of the optoelectronic weight up and accumulate neurons. There are six oscillograms that represent the behavior of one neuron with different inputs and interconnection weights. Inputs and weights are in the range (0, 1). The maximum resolution of the inputs is limited by the ratio between the width of the pulses of the optical emitter (a laser diode) and the time slot associated with one input; this is 6 bits in the example given. The resolution of the interconnection weights is limited by the weight storage memory, which is a digital memory with an 8-bit word length in this first prototype. The oscillograms at left represent the evolution of the voltage (VC) in the accumulation capacitor in an operation cycle with four inputs (four time slots); the oscillograms at right span the duration of a time slot. We observed that the final voltage in the capacitor is proportional to the accumulated product of the inputs and their corresponding interconnection weights (see Fig. 5); the increase of the voltage in a time slot is

ΔVC = (1/C) P R TX·W,    (2)

where C is the capacitance of the storage capacitor, P is the optical power that reaches the neuron detector, R is the responsivity of the photodetector, and TX·W is the time during which both the input (I) and weight (W) signals are at a high level, which is proportional to the product of the input and the interconnection weight.

For the basic neuron design described in Fig. 3, with the input and weight data represented as pulse-frequency modulated and pulse-width modulated signals, respectively, we obtain high- and variable-resolution multiplication without the use of electronic multipliers.
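The multiplication-by-timing idea can be sketched as follows: treat the input as a number of fixed-width pulses in the slot (pulse-frequency modulation) and the weight as the fraction of each pulse that is gated through (pulse-width modulation); the accumulated high-time TX·W is then proportional to the product. This is our simplified model of the scheme, with the 6-bit input resolution mentioned in the text assumed:

```python
def overlap_time(inp, weight, slot=1.0, n_pulses=64):
    """Approximate time both input and weight signals are high within
    one time slot. inp and weight lie in [0, 1]; n_pulses = 2**6 models
    the 6-bit input resolution."""
    pulse_width = slot / n_pulses
    active = round(inp * n_pulses)        # PFM: how many pulses fire
    return active * pulse_width * weight  # PWM gate passes this share
```

Plugging this time into Eq. (2) gives a capacitor step proportional to input × weight, with no electronic multiplier.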

It is also possible to design the neuron circuitry to allow bipolar weights. This particular question is of interest compared with traditional optical vector–matrix multipliers based on the modulation of input beams by means of a SLM. Because of the unipolar nature of light, it is necessary to duplicate such a system for excitatory and inhibitory interconnections7 or to resort to more complicated techniques such as polarization control.15 In our optical broadcast architecture, as interconnection-weight assignment is implemented in the electronic domain, we can modify the neuron circuitry to allow for bipolar interconnection weights. One example is the scheme presented in Fig. 6. The XOR function computes the sign for excitatory or inhibitory contributions, and the capacitor, as a storage element, increases or decreases its voltage according to the result of the multiplication operation. Oscillograms showing how this scheme works are given in Section 4.

4. Description of the Optoelectronic Hamming Network

A. Neural Network Model

The neural network hardware model we chose to implement the image identification system is a Hamming classifier. It has been demonstrated to be the most efficient classifier for binary patterns compared with other content-addressable memories such as the Hopfield network.16

It comprises two layers,3 as represented in Fig. 7. The first layer consists of as many nodes or neurons as the number of different classes that can be classified (P); one pattern is assigned to each processing node. The inputs are the input image pixels (N). The sample-pattern pixel values are stored in the interconnection weights for each processing node. The output of each processing node is the distance between the input image and the stored pattern; the strongest response corresponds to the neuron whose stored pattern is closest to the input. The second layer of the Hamming classifier receives the matching scores from the first layer. Its function is to suppress the values at the output nodes except for the output node of the first layer that was initially the maximum.

Fig. 5. Output of one weight up & accumulate neuron for inputs of (a) 0.5, (c) 0.5, (e) 0.25; weights of (a) 0.9, (c) 0.25, (e) 0.9; (b), (d), (f) time-slot spans.

Fig. 6. Neuron scheme for bipolar weights.
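In software terms, the two layers reduce to a dot-product stage followed by an argmax. A minimal model (our own sketch, using bipolar {−1, +1} patterns as in the hardware implementation):

```python
def matching_scores(image, references):
    """First layer: scalar product of the input image with each of the
    P stored reference patterns (N bipolar pixels each); the score is
    maximal for the reference closest in Hamming distance."""
    return [sum(x * r for x, r in zip(image, ref)) for ref in references]

def classify(image, references):
    """Second layer (MAXNET/WTA): keep only the node whose matching
    score is the maximum; here we simply return its index."""
    scores = matching_scores(image, references)
    return max(range(len(scores)), key=scores.__getitem__)
```
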

B. Optoelectronic Hamming Classifier

The block diagram of the first prototype implementation of the optoelectronic Hamming classifier is presented in Fig. 8. The calculate-matching-scores layer was implemented based on our optoelectronic architecture. The first prototype is composed of four processing elements, which allows a choice to be made among four different classes (P = 4). The MAXNET layer has been implemented by use of an electronic circuit based on the circuit proposed in Ref. 17.

The neural network model we chose to implement the image classification system works with binary patterns for which the input and interconnection-weight values are {−1, +1}. To calculate the matching score between the input and each of the sample patterns, we must determine the scalar product of the input pattern and the reference pattern for each processing element. This operation can be carried out by use of the neuron circuitry presented in Fig. 6.

The system works as follows: the input image pixels are multiplexed in time and optically distributed to all neurons. Each neuron receives each input as an optical signal; the photodetector provides a photocurrent proportional to the optical power, which is converted into a voltage level by a transimpedance amplifier and then thresholded, so that the input to the neuron is binary. The multiplication of input and interconnection weight is carried out by a logical XOR that gives a result of 0 if the input and interconnection weight (the corresponding pixel value of the sample) are equal and 1 if they are different. The storage element is a capacitor that increases its voltage if the product is 0 and decreases it if the product is 1. The amount of charge that increases the voltage in the capacitor is controlled by the current source Ichar (Fig. 6), which is connected when the product is 0. The amount of charge that decreases the voltage in the capacitor is controlled by the current source Idis, which is connected when the product is 1. This neuron scheme provides excitatory and inhibitory connections and also controls the value of the interconnection weights by controlling the current sources. It allows the product of an input pattern and a reference pattern to be found. At the end of an operation cycle, when the whole input vector has been sequentially presented (N time slots), the voltage in the capacitor is proportional to the number of matching elements between the input and the corresponding reference pattern.
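For binary {0, 1} pixel levels, the XOR-plus-charge/discharge behavior just described can be modeled in a few lines; the step size dv stands in for the Ichar/Idis charge packets and is an assumed value:

```python
def neuron_voltage(inputs, weights, dv=0.01):
    """Behavioral model of the Fig. 6 neuron: an XOR of input and weight
    selects discharge (pixels differ) or charge (pixels match), so the
    final voltage counts matching pixels minus mismatching ones."""
    v = 0.0
    for i, w in zip(inputs, weights):
        if i ^ w:      # XOR = 1: mismatch, Idis discharges the capacitor
            v -= dv
        else:          # XOR = 0: match, Ichar charges the capacitor
            v += dv
    return v
```
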

To observe the waveforms of the neurons, we use the patterns presented in Fig. 9 as examples (N = 8 × 8). Each of the four diagrams presented in Fig. 10 corresponds to a different neuron. Waveform 1 is the CLR signal, common to all neurons. Waveform 2 is the input pattern, common to all neurons. Input pixel values for the optoelectronic neural network are presented sequentially, so input images are read from left to right and from the top down. In the four oscillograms it is obvious that the input pattern is A. Waveform 3 is the reference pattern: A for the oscillogram in Fig. 10(a), E for the oscillogram in Fig. 10(b), C for the oscillogram in Fig. 10(c), and the negative of A for the oscillogram in Fig. 10(d). Waveform 4 is the output product. Waveform 5 is the voltage in the storage capacitor, which increases when the input and reference patterns match and decreases when they do not. We observed that at the end of the operation cycle the voltage in the capacitor is proportional to the number of pixels in the input pattern that are equal to those in the reference pattern.

Fig. 7. Hamming classifier.

Fig. 8. Optoelectronic Hamming classifier.

Fig. 9. 8 × 8 sample patterns.

The WTA electronic circuit is presented in Fig. 11. It is similar to the circuit proposed in Ref. 17, but off-the-shelf bipolar discrete transistors are used instead of a complementary metal-oxide semiconductor (CMOS) integrated circuit. The circuit has four inputs Mi (analog voltage levels) and provides four digital outputs Oi: 0 for the highest input voltage and 1 for the others. The circuit is made up of four identical sections, one per processing node. Each section i, which is basically a voltage buffer, consists of a differential amplifier (Qi1 and Qi2) with a current-mirror active load (Qi3 and Qi4) and a voltage follower (Qi5). As all the voltage followers share the same output, the voltage Vmax always follows the maximum input voltage at the input nodes Mi. Transistor Qi5 of the winner section remains ON, while the other three are OFF. Each comparator Ai provides a level of 0 at its output Oi if its section is the winner; if not, the output level is 1. In the diagrams presented in Fig. 10, we observed that at the end of the operation cycle node 1 is the winner.
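At the logic level, the circuit implements a simple rule, sketched below with the active-low outputs described in the text (a software model of the behavior only, not of the transistors):

```python
def wta_outputs(m_inputs):
    """Model of the Fig. 11 WTA: Vmax follows the largest analog input,
    and each comparator outputs 0 for the winning section and 1 for the
    others."""
    v_max = max(m_inputs)
    return [0 if v == v_max else 1 for v in m_inputs]
```
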

5. Vision System Description

The vision system has been designed to capture an image with a 128 × 128 pixel CMOS image sensor, to compare it with a set of stored patterns, and to provide an output that indicates which pattern best matches the input. The block diagram of the system is presented in Fig. 12.

Fig. 10. Waveforms for (a) node 1, (b) node 2, (c) node 3, (d) node 4.

Fig. 11. WTA circuit.


A. Optoelectronic Hamming Classifier

The core of the vision system is the optoelectronic neural network described in detail in Section 4. It can be deduced that the number of different classes we can classify depends on the number of optoelectronic processing elements that are implemented in the system, which is four in this prototype (P = 4). The size of the input vector can be configured; we only need to change the number of time slots within an operation cycle. We have tested input images of up to N = 64 × 64 pixels, limited by the size of the address bus in the control system.

B. Memory Block

The reference patterns that are to be compared with the input image are stored in a static random-access memory. The reference patterns that define each class are presented in Fig. 13(a). These gray-level patterns are thresholded, and the resolution is reduced to 64 × 64 pixels. The resulting images [Fig. 13(b)] are stored in the memory of the vision system.

C. Image Sensor

The input image is captured by a ULL128 CMOS image sensor.18 The sensor specifications are a resolution of 128 × 128 pixels, an array size of 4 mm × 4 mm, a supply voltage of 5 V, an output swing of 3.5 V, and a responsivity of 17 V/(lx s). The CMOS sensor was designed with the photosensors separate from the storage elements, so it also works as a random-readout analog memory with a storage time of the order of tens of seconds.

The sensor array has three operation modes, controlled by the controller. The reset mode must be activated before taking a new image. The discharge mode must be activated to take a new image; controlling the time spent in this mode controls the exposure time. In the readout mode, the CMOS sensor stores the last captured image and the image pixels can be read randomly; the sensor works as a random-access analog memory of the image pixel values.

Because the optoelectronic Hamming network classifies binary patterns, the analog output voltage from the CMOS sensor is thresholded by use of a comparator. The binary output is the input to the optoelectronic Hamming classifier.

An example of a 64 × 64 pixel input image captured by the camera is presented in Fig. 14(a). The resolution of the image has been set to 64 × 64 because we have only 12 pins available in the microcontroller (8051F226) to provide the pixel address. The camera has been connected so that we read the pixels in the even rows and columns of the sensor array. Figure 14(b) shows the thresholded image that is the input to the optoelectronic Hamming classifier.
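This read-out path amounts to "subsample even rows and columns, then threshold"; a sketch under assumed values (the actual comparator level is set in hardware, not in software):

```python
def binarize_frame(frame, threshold=1.75):
    """Take the even rows and columns of a 128x128 frame of analog pixel
    voltages (0 to 3.5 V output swing) to obtain 64x64 pixels, then
    threshold each value to give a binary classifier input. The
    mid-scale threshold default is our assumption."""
    return [[1 if frame[r][c] > threshold else 0
             for c in range(0, 128, 2)]
            for r in range(0, 128, 2)]
```
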

D. Control and User Interface

The system works in two modes. In the configuration mode the controller communicates with a PC by means of the RS-232 standard. The user can select the resolution of the input image and the reference patterns; the reference patterns can be images captured by the CMOS sensor or any binary image stored in the computer. Once the patterns are selected, they are stored in the memory block. In the operation mode, first the image is captured, then its pixel values are addressed sequentially along with the content of the memories. At the end of the operation cycle, when all the image pixels have been presented, the controller reads the output provided by the optoelectronic classifier. Note that the control signals and the information managed by the controller do not depend on the size of the Hamming classifier.

6. Vision System Results

Here we describe the results obtained with the vision system described in Section 5. A picture of the whole system is shown in Fig. 15. Figure 16 shows the results obtained by the vision system when the reference patterns were the ones presented in Fig. 13 and the input image was the one presented in Fig. 14(b). The first waveform shows the beginning of the operation cycle and the second waveform its end. The next four waveforms represent the evolution of the neurons' activation throughout an operation cycle for the input image presented in Fig. 14(b). The operation cycle is divided into N = 64 × 64 (4096) time slots, one slot per input image pixel. The last four waveforms correspond to the output of the WTA circuit. It is obvious that, at the end of the operation cycle, the highest activation corresponds to the pedestrian-crossing traffic signal, which means that the image captured by the camera has been correctly recognized.

Fig. 12. Vision system block diagram.

Fig. 13. (a) Reference patterns and (b) stored patterns.

7. Projected Performance of the System Versus Electronics

Here we present an evaluation of the projected performance advantages of our system compared with purely electronic neural systems. Electronics is a mature technology that has provided many implementations of specific neural network processors. Many commercial neural chips emerged in the early 1990s19; most of them have been discontinued19–21 but new chips, evolved from previous implementations, have appeared recently.22,23 Table 1 summarizes the characteristics of these chips. The parameters shown are the number of neurons (NN) or processing elements (PE) that compose the hardware neural network architecture, the number of connections allowed per processing element (NW per PE), and the speed, measured in connections per second (CPS), where a connection means a multiplication and an addition. These commercially available neural chips have been

Fig. 14. (a) Sensor image and (b) input image.

Fig. 15. Picture of the vision system.

Fig. 16. Vision system results.


classified according to their architecture as analog vector–matrix multipliers (VMM), analog bioinspired (bio), or digital type.

Comparing these systems, we can determine how electronic neural system implementations have evolved in the past decade. This evolution is not surprising if we compare it with the evolution of conventional digital processors.25 Their operation speed has increased by approximately 2 orders of magnitude, following the same path as conventional digital processors. Other important parameters for neural networks, such as the number of neurons or processing elements and the number of weighted interconnections per neuron, have not increased significantly over the past decade. Purely electronic neural products have been designed to perform specific tasks faster and at lower power than neural network algorithms running on general-purpose hardware. Digital chips, such as the VindAx processor,23 are good candidates for the implementation of small neural network classifiers for which portability and real-time operation are application specifications. An analog neural network, such as the ACE16K chip,22 is a good solution for real-time, low-level vision applications, but it is not useful for high-level vision applications because of its limited interconnection architecture. It seems that purely electronic neural hardware cannot cope with large networks composed of a large number of neurons and a large number of interconnections per neuron.

It is also important to mention that, as CMOS technology improves, so do transistor speeds and densities, but this is not the case for chip interconnections.26 As neural network hardware architectures need to be composed of a large number of interconnections, purely electronic neural network implementations will be more difficult to scale up than other electronic processing architectures. Optical interconnections would help electronic neural systems to scale up the number of processing elements, the number of interconnections per processing element, and the speed. In particular, optical interconnections have been shown to be the leading approach for multigigahertz, long-distance (chip-scale) signal distribution, such as clock distribution networks.6,27 The main advantages

are the reduction of jitter, skew, and power consumption.6 The optoelectronic architecture we propose for neural networks is based on these principles; each cell in our system is composed of a signal source that is broadcast to many different processing nodes, so communication will be faster and use less power than in a wired network.

Our system can readily be scaled up to a large number of neurons by simply increasing the number of detectors and replicating their attached electronics. A foreseeable system would comprise 100 cells with 100 processing elements per cell. In a scaled integrated system of this nature, the memory of interconnection weights should be physically distributed among the neurons. Thus, local connections would be electronic, as their performance appears to be sufficient,27 whereas global interconnections would be optical. Preliminary work28 centered on the integration of electronic circuits in a standard CMOS process, and the fabrication of an efficient optical interconnection element with volume holograms, suggests that we would be able to package such a system, comprising 100 cells with 100 neurons each, into a cube with sides measuring 15 mm.28

Besides this, the processing speed is currently limited by the response time of the large-area detector used in the prototype and by the discrete-component electronic pixels. Faster operation has been achieved in a recent optoelectronic neuron design with a bandwidth of up to 150 MHz.29 Higher bandwidths, in the gigahertz range, are currently being tested. Table 2 shows a summary of performance parameters of existing commercial neural systems and our optoelectronic system. The parameters in bold type are those that have already been demonstrated.

8. Conclusions and Further Studies

We have built a vision system based on a CMOS image sensor and an optoelectronic broadcast neural network configured as a Hamming classifier. We have shown the ability of the system to capture an image, 64 × 64 pixels in size, compare it with four reference patterns, and provide an output that indicates which of the reference patterns best matches the input image. The number of classes into which the input pattern can be classified depends merely on

Table 1. Performance Parameters of Electronic Neural Network Chips

Name/Author         Year    Type        NN/PE      NW per PE  Speed (CPS)
ETANN20a            1989    Analog VMM  2 × 64     128        2 × 10⁹
Schemmel et al.24b  2004    Analog VMM  4 × 64     128        800 × 10⁹
Synaptic OCR19a     ∼1993   Analog bio  NAd        NAd        ∼10⁹
ACE16K22c           2004    Analog bio  128 × 128  8          ∼300 × 10⁹
CNAPS21a            1990    Digital     64         4K         1.6 × 10⁹
VindAx23c           2003    Digital     256        16         11 × 10⁹

aDiscontinued product.
bExperimental prototype, not commercially available.
cCommercial prototype.
dNot available.


the size of the optoelectronic neural network, as the control signals will remain the same if the system is scaled up.

As shown by the evaluation of the projected performance, our system could be used for large neural network implementations that cannot be carried out by purely electronic systems. The main advantage of optical interconnections in an optoelectronic classifier is that our system is potentially scalable to a large number of neurons and interconnections, as discussed in Section 7. In Table 2 we have compared the performance parameters of our system with recent electronic neural network implementations. Although our prototype implementation is composed of four processing elements, we have shown a number of interconnections per neuron equal to 4096 (for a 64 × 64 pixel pattern). This number is limited by the memory size and not by the hardware architecture, as it is in the ACE16K chip.22 The operation speed of our optoelectronic classifier for a 150-MHz bandwidth29 is 9 × 10⁶ classifications per second for a 16-element pattern; we used this pattern size for comparison as it is the maximum size of the patterns that can be managed by the VindAx processor in its current implementation.23 If the size of the patterns is increased, the classification speed is reduced proportionally. For patterns of 128 × 128 elements (the maximum image size for the ACE16K chip), our vision system would achieve a classification rate of 8800 frames/s.

The classification speed of our system is faster than that of both electronic processors in Table 2, but the main advantages are that (1) it implements a higher number of neurons than the digital neural processor, so the classification is made between a higher number of classes, and (2) it implements a higher number of connections than the analog neural processor, so high-level vision applications can be implemented. Assuming a scaled system based on the optical broadcast architecture with 100 cells and 100 neurons per cell28 and the operation speed for a 150-MHz bandwidth,29 measured in connections per second, we obtain 1.5 × 10¹² CPS, much faster than the electronic neural processors presented in Table 1. Our recent effort is focused on the implementation of a scaled and compact prototype28 and on the improvement of the vision system's capacity.
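The quoted rates follow from simple arithmetic: one connection (pixel) is processed per clock period, so the classification rate is the neuron bandwidth divided by the pattern size, and the scaled-system throughput is the bandwidth times the number of neurons. A quick check of the figures in the text (the quoted 8800 frames/s sits slightly below the ideal 150 MHz / 16 384 ≈ 9150, presumably allowing for control overhead; names here are ours):

```python
BANDWIDTH_HZ = 150e6   # optoelectronic neuron bandwidth (Ref. 29)

def classifications_per_second(pattern_size, bandwidth=BANDWIDTH_HZ):
    """One connection (pixel) is processed per clock period, so the
    classification rate falls proportionally as the pattern grows."""
    return bandwidth / pattern_size

# 16-element patterns (VindAx comparison), quoted as 9 x 10^6 /s
rate_16 = classifications_per_second(16)

# 128 x 128 patterns (ACE16K comparison), quoted as 8800 frames/s
rate_16k = classifications_per_second(128 * 128)

# scaled system: 100 cells x 100 neurons, one connection each per period
cps = 100 * 100 * BANDWIDTH_HZ   # 1.5 x 10^12 CPS
```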

References

1. M. R. G. Meireles, P. E. M. Almeida, and M. G. Simoes, "A comprehensive review for industrial applicability of artificial neural networks," IEEE Trans. Ind. Electron. 50, 585–601 (2003).

2. E. N. Malamas, E. G. M. Petrakis, M. Zervakis, L. Petit, and J.-D. Legat, "A survey on industrial vision systems, applications and tools," Image Vision Comput. 21, 171–188 (2003).

3. R. P. Lippmann, "An introduction to computing with neural nets," IEEE Trans. Acoust. Speech Signal Process. ASSP-4, 4–22 (1987).

4. M. W. Roth, "Survey of neural network technology for automatic target recognition," IEEE Trans. Neural Netw. 1, 28–43 (1990).

5. H. J. Caulfield, J. Kinser, and S. K. Rogers, "Optical neural networks," Proc. IEEE 77, 1574–1583 (1989).

6. D. A. B. Miller, "Rationale and challenges for optical interconnects to electronic chips," Proc. IEEE 88, 728–749 (2000).

7. N. H. Farhat, D. Psaltis, A. Prata, and E. Paek, "Optical implementation of the Hopfield model," Appl. Opt. 24, 1469–1475 (1985).

8. S. Jutamulia and F. T. S. Yu, "Overview of hybrid optical neural networks," Opt. Laser Technol. 28, 59–72 (1996).

9. G. L. Li and P. K. L. Yu, "Optical intensity modulators for digital and analog applications," J. Lightwave Technol. 21, 2010–2030 (2003).

10. R. P. Webb, A. J. Waddie, K. J. Symington, M. R. Taghizadeh, and J. F. Snowdon, "Optoelectronic neural-network scheduler for packet switches," Appl. Opt. 39, 788–795 (2000).

11. K. J. Symington, Y. Randle, A. J. Waddie, M. R. Taghizadeh, and J. F. Snowdon, "Programmable optoelectronic neural network for optimization," Appl. Opt. 43, 866–876 (2004).

12. H. Lamela, M. Ruiz-Llata, and C. Warde, "Optical broadcast interconnection neural network," Opt. Eng. 42, 2487–2488 (2003).

13. H. Lamela, M. Ruiz-Llata, and C. Warde, "Prototype optoelectronic neural network for artificial vision systems," in Proceedings of the IEEE Annual Conference of the Industrial Electronics Society (Institute of Electrical and Electronics Engineers, Piscataway, N.J., 2002).

14. L. M. Reyneri, "On the performance of pulse and spiking neurons," Analog Integr. Circuits Signal Process. 30, 101–119 (2002).

15. S. Abramson, D. Saad, E. Marom, and N. Konforti, "Four-quadrant optical matrix–vector multiplication machine as a neural-network processor," Appl. Opt. 32, 1330–1337 (1993).

16. L. Tarassenko, J. N. Tombs, and J. H. Reynolds, "Neural network architectures for content-addressable memory," IEE Proc. F 138, 33–39 (1991).

17. R. G. Carvajal, J. Ramirez-Angulo, and J. Martinez-Heredia, "High-speed high-precision min/max circuits in CMOS technology," Electron. Lett. 36, 697–699 (2000).

18. G. Chapinal, S. A. Bota, M. Moreno, J. Palacin, and A. Herms, "A 128 × 128 CMOS image sensor with analog memory for synchronous image capture," IEEE Sensors J. 2, 120–127 (2002).

19. D. Hammerstrom, "Neurocomputing hardware: present and future," Artif. Intell. Rev. 7, 285–300 (1993).

Table 2. Comparative Performance Parameters

Name/Author        Type            NN/PE      NW per PE  Speed
ACE16K22           Analog          128 × 128  8          ∼1000 frames/s
VindAx23           Digital         256        16         2 × 10⁶ classifications/s
Optical broadcast  Optoelectronic  100 × 100  >4K        9 × 10⁶ classifications/sa (for 16-element patterns)

aFinal results with electronic circuitry at 150-MHz bandwidth (from Ref. 29).

20. M. Holler, S. Tam, H. Castro, and R. Benson, "An electrically trainable artificial neural network (ETANN) with 10240 'floating gate' synapses," in 1989 IEEE International Joint Conference on Neural Networks, Vol. 2 (Institute of Electrical and Electronics Engineers, Piscataway, N.J., 1989), pp. 191–196.

21. D. Hammerstrom, "A VLSI architecture for high-performance, low-cost, on-chip learning," in 1990 IEEE International Joint Conference on Neural Networks, Vol. 2 (Institute of Electrical and Electronics Engineers, Piscataway, N.J., 1990), pp. 537–544.

22. A. Rodriguez-Vazquez, G. Liñán-Cembrano, L. Carranza, E. Roca-Moreno, R. Carmona-Galán, F. Jiménez-Garrido, R. Domínguez-Castro, and S. Espejo-Meana, "ACE16k: the third generation of mixed-signal SIMD-CNN ACE chips toward VSoCs," IEEE Trans. Circuits Syst. I 51, 851–863 (2004).

23. D. C. Hendry, A. A. Duncan, and N. Lightowler, "IP core implementation of a self-organizing neural network," IEEE Trans. Neural Netw. 14, 1085–1096 (2003).

24. J. Schemmel, S. Hohmann, K. Meier, and F. Schürmann, "A mixed-mode analog neural network using current-steering synapses," Analog Integr. Circuits Signal Process. 38, 233–244 (2004).

25. Intel Corporation, "Microprocessor quick reference guide," http://www.intel.com/pressroom/kits/quickrefyr.htm.

26. J. D. Meindl, "Beyond Moore's law: the interconnect era," Comput. Sci. Eng. 5, 20–24 (January/February 2003).

27. J. H. Collet, F. Caignet, F. Sellaye, and D. Litaize, "Performance constraints for onchip optical interconnects," IEEE J. Sel. Top. Quantum Electron. 9, 425–432 (2003).

28. M. Ruiz-Llata, H. Lamela, M. Moreno, S. Bota, A. Herms, and C. Warde, "Progress on the development of a compact neuroprocessor with optical interconnections," in Proceedings of the Conference on Design of Circuits and Integrated Systems (Ciudad Real, Spain, 2003), pp. 718–722.

29. H. Lamela, M. Ruiz-Llata, D. M. Cambre, and C. Warde, "Fast prototype of the optical broadcast interconnection neural network architecture," in Optical Information Systems II, B. Javidi and D. Psaltis, eds., Proc. SPIE 5557, 247–254 (2004).
