
Fisherman learning algorithm of the SOM realized in the CMOS technology

Rafał Długosz1, Marta Kolasa1 and Witold Pedrycz2

1- Faculty of Telecommunication and Electrical Engineering, University of Technology and Life Sciences, ul. Kaliskiego 7, 85-796 Bydgoszcz, Poland

2- Department of Electrical and Computer Engineering, University of Alberta

Edmonton, AB T6G 2V4, Canada

Abstract. This study presents an idea of a transistor-level realization of the fisherman learning algorithm of Self-Organizing Maps (SOMs), which is described in [4]. The realization of this algorithm in hardware calls for a solution of several specific problems not present in a software implementation. The main problem is related to the iterative nature of the adaptation process of the neighboring neurons positioned in particular rings surrounding the winning neuron. This makes the circuit structure of the SOM very complex. To come up with a feasible realization, we introduce some modifications to the original fisherman algorithm. Detailed simulations of the software model of the SOM show that these modifications do not have a negative impact on the learning process, while they bring a significant reduction of the circuit complexity. In consequence, a fully parallel adaptation of all neurons is possible, which makes the SOM very fast.

1 Introduction

In the competitive learning of SOMs, training patterns X(l), which are vectors in an n-dimensional space R^n drawn from a given learning set, are presented to the neural network (NN) in a random fashion. At each learning cycle (l) the network computes a distance between a given pattern and the weight vectors Wj(l) of all neurons of the map. The neuron whose weights resemble a given input pattern to the highest extent becomes the winner. Various learning algorithms of SOMs have been proposed. One of the basic algorithms is the Winner Takes Most (WTM) method, which is often referred to as the classic Kohonen's rule. The adjustment of the weights is in this case realized as follows:

Wj(l + 1) = Wj(l) + η(k) G(i, j, R, d) [X(l) − Wj(l)]    (1)

where η(k) is the learning rate in the kth training epoch, d is the distance in the neuron space, R is the range of the neighborhood, Wj is the weight vector of a particular neuron in the map, and X is a given input pattern. The neighboring neurons are adjusted with different intensities that depend on the neighborhood function (NF) G() and on the distance in the neuron space, d, between the winning, ith, neuron and the corresponding neighbors.
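To make the notation above more concrete, the following minimal NumPy sketch performs one learning cycle of rule (1) on a two-dimensional grid of neurons. The array layout, the city-block grid distance and the helper names (wtm_update, rnf, gnf) are illustrative assumptions, not the hardware implementation discussed later in the paper.

```python
# Minimal sketch of one WTM learning cycle, rule (1); layout and names assumed.
import numpy as np

def wtm_update(weights, x, eta, R, nf):
    """weights: (rows, cols, n) map; x: (n,) pattern; nf(d, R): neighborhood value."""
    rows, cols, _ = weights.shape
    # 1. Find the winner: the neuron whose weights are closest to x.
    dist = np.linalg.norm(weights - x, axis=2)
    wi, wj = np.unravel_index(np.argmin(dist), (rows, cols))
    # 2. Distance d in neuron (grid) space from the winner, here city-block.
    ii, jj = np.indices((rows, cols))
    d = np.abs(ii - wi) + np.abs(jj - wj)
    # 3. Update every neuron inside the neighborhood range R according to (1).
    g = np.where(d <= R, nf(d, R), 0.0)          # G(i, j, R, d)
    weights += eta * g[..., None] * (x - weights)
    return (wi, wj)

# Example neighborhood functions mentioned in the text (RNF and GNF).
rnf = lambda d, R: 1.0
gnf = lambda d, R: np.exp(-(d.astype(float) ** 2) / (2.0 * max(R, 1) ** 2))

# Usage example on a small random map.
som = np.random.rand(8, 8, 2)
wtm_update(som, np.array([0.3, 0.7]), eta=0.5, R=3, nf=gnf)
```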


In the classical approach the rectangular neighborhood function (RNF) is used [3]. As stressed in the literature, better results are achieved by using the Gaussian neighborhood function (GNF). A realization of the GNF in software-based NNs is simple, but the hardware realization is very complex. For this reason, we recently proposed an efficient hardware implementation of the triangular neighborhood function (TNF), which requires only a single multiplication and bit shifting [1].
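As a rough illustration of why the TNF is hardware friendly, the sketch below updates a single integer weight using one multiplication and one bit shift. The exact fixed-point formula used in [1] may differ, so this should be read only as an interpretation of the idea.

```python
# Hedged sketch of a hardware-friendly triangular neighborhood update for a
# single integer weight; the exact fixed-point form used in [1] may differ.
def tnf_update(w, x, d, R, shift=4):
    """w, x: integer weight and input; d: grid distance to the winner;
    the right shift stands in for division by a power of two."""
    if d > R:
        return w                              # outside the neighborhood: no change
    coeff = R + 1 - d                         # triangular profile, largest at the winner
    return w + ((coeff * (x - w)) >> shift)   # one multiplication + one bit shift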

In this paper, we consider a hardware realization of another algorithm, proposed by Lee and Verleysen in [4], called the fisherman rule. Let us recall that in the classical Kohonen's algorithm all neurons that belong to the winner's neighborhood are adapted so that their weights move towards the input pattern X, as shown in (1). In the algorithm proposed in [4], in the first iteration the winning ith neuron is adapted in the same way as in the classic update rule:

Wi(l + 1) = Wi(l) + α0 [X(l) − Wi(l)]    (2)

For a distance d = 0 the α0 parameter is equal to the η0(k)·G() term present in (1). On the other hand, in the fisherman rule the neighboring neurons (d = 1, ..., R) are trained in an iterative fashion according to the formula:

Wd(l + 1) = Wd(l) + αd [Wd−1(l + 1) − Wd(l)]    (3)

In the second iteration, for d = 1, all neurons from the first ring surrounding the winner are adapted in such a way that their weights move toward the weights of the winning neuron calculated in the first iteration. The neurons from the second ring, i.e., for d = 2, are in the next iteration adapted towards the updated weights of the neurons of the first ring, and so forth. A detailed comparison between different learning rules, presented in [4], shows that the fisherman algorithm usually leads to better results than the classic one.
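Expressed as a software model, the ring-by-ring adaptation of (2)-(3) can be sketched as below. The grid layout, the choice of which neuron of the preceding ring a neighbor follows, and the alpha schedule are assumptions of this sketch rather than details taken from [4].

```python
# Illustrative sketch of the original fisherman rule (2)-(3): rings are
# processed sequentially, and ring d moves towards the already updated
# weights of ring d-1 within the same learning cycle.
import numpy as np

def fisherman_update(weights, x, alphas, R, winner):
    """alphas: list of alpha_d for d = 0..R; winner: (row, col) of the winning neuron."""
    rows, cols, _ = weights.shape
    wi, wj = winner
    ii, jj = np.indices((rows, cols))
    d = np.abs(ii - wi) + np.abs(jj - wj)          # ring index of every neuron
    # d = 0: the winner moves towards the input pattern X, as in (2).
    weights[wi, wj] += alphas[0] * (x - weights[wi, wj])
    # d = 1..R: each ring moves towards weights of ring d-1 updated in this
    # same cycle, hence the sequential loop, as in (3).
    for ring in range(1, R + 1):
        for (r, c) in zip(*np.where(d == ring)):
            # nearest neuron of the previous ring (a simple choice for the sketch)
            prev = np.argwhere(d == ring - 1)
            pr, pc = min(prev, key=lambda p: abs(p[0] - r) + abs(p[1] - c))
            weights[r, c] += alphas[ring] * (weights[pr, pc] - weights[r, c])
    return weights
```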

In the case of a software realization, in which the weights of particular neurons are calculated sequentially, both algorithms come with a comparable computational complexity, so there is a sound rationale behind using the fisherman rule in many cases. On the other hand, the fisherman concept is significantly more complex in a hardware realization, as the described iterative adaptation sequence has to be controlled by an additional multiphase clock. Furthermore, the control clock signals have to be distributed on the chip using additional paths. The iterative nature of the second algorithm is a source of another disadvantage. The adaptation of the neurons in each following ring can be undertaken only after the adaptation in the preceding ring has been completed. This is the source of a delay that significantly slows down the adaptation phase. For comparison, in our hardware realization of the classic algorithm the adaptation of all neurons is performed fully in parallel [1, 2]. This is essentially the reason why all reported transistor-level realizations of SOMs use the classic algorithm. To overcome the described problems and to make the realization of the fisherman algorithm more efficient, we modified the update formula as follows:

Wd(l + 1) = Wd(l) + αd [Wd−1(l) − Wd(l)]    (4)


In comparison with the original algorithm, in which each successive ring of neighbors uses the weights Wd−1(l + 1), i.e., those already adapted in the current learning cycle, in the modified algorithm we use the weights Wd−1(l) from the previous cycle. As a result, the neurons in each following ring do not need to wait until the adaptation in the preceding ring has been completed. This makes the overall adaptation process approximately R times faster, where R is the neighborhood range in a given epoch. For small values of the η0(k)·G() term (which is a typical case for larger values of l) and when the GNF or the TNF is being used, this modification does not have a negative impact on the learning algorithm.
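For comparison, a sketch of the modified rule (4) is given below. The only functional change with respect to the previous sketch is that every ring reads the weights stored at the beginning of the cycle, so the loop order no longer matters and, in hardware, all rings can be updated in one parallel step. The helper names follow the sketch above and are not the authors' implementation.

```python
# Sketch of the modified fisherman rule (4): all rings read the previous-cycle
# weights of ring d-1, so the whole map could be updated in parallel.
import numpy as np

def fisherman_update_modified(weights, x, alphas, R, winner):
    rows, cols, _ = weights.shape
    wi, wj = winner
    ii, jj = np.indices((rows, cols))
    d = np.abs(ii - wi) + np.abs(jj - wj)
    old = weights.copy()                           # W(l): weights before this cycle
    new = weights.copy()
    new[wi, wj] += alphas[0] * (x - old[wi, wj])   # winner: rule (2)
    for ring in range(1, R + 1):                   # order no longer matters:
        for (r, c) in zip(*np.where(d == ring)):   # every ring only reads `old`
            prev = np.argwhere(d == ring - 1)
            pr, pc = min(prev, key=lambda p: abs(p[0] - r) + abs(p[1] - c))
            new[r, c] = old[r, c] + alphas[ring] * (old[pr, pc] - old[r, c])
    return new
```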

2 System level investigations

To determine the influence of the training algorithm as well as of the NF on the learning process, we completed a series of simulations using the software model of the NN. The comprehensive simulations have been carried out for map sizes varying between 8x8 and 32x32 neurons and for different values of the initial neighborhood size, Rmax. In a hardware implementation, the parameter Rmax has the main influence on the circuit complexity [1]. In what follows, we report on some selected results which can be regarded as representative of the overall suite of experiments.

The learning process has been assessed using five criteria described in [4]. They allow for a thorough evaluation of the quality of the vector quantization and of the topographic mapping. The quantization quality is assessed using two measures. One of them is the quantization error defined as follows:

Qerr = (1/m) Σ_{j=1..m} √( Σ_{l=1..n} (xj,l − wi,l)^2 )    (5)

In this formula, m is the total number of the input patterns, xj,l denote particular elements of a given learning pattern X, and i stands for the winning neuron. A second measure used to assess the quantization quality is the percentage of dead neurons (PDN), which is the ratio of inactive (dead) neurons to the total number of neurons.
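A compact way to express both quantization measures, assuming the map weights are flattened to a (P, n) array and a neuron is counted as dead when it never wins for the given data set (the precise definition follows [4]), is sketched below.

```python
# Hedged sketch of the quantization measures: Qerr from (5) and PDN.
import numpy as np

def quantization_measures(patterns, weights):
    """patterns: (m, n) learning set; weights: (P, n) flattened map."""
    # Distance of every pattern to every neuron.
    dist = np.linalg.norm(patterns[:, None, :] - weights[None, :, :], axis=2)
    winners = np.argmin(dist, axis=1)                          # winner i for each X_j
    q_err = np.mean(dist[np.arange(len(patterns)), winners])   # equation (5)
    active = np.unique(winners).size
    pdn = 100.0 * (weights.shape[0] - active) / weights.shape[0]
    return q_err, pdn
```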

The quality of the topographic mapping is assessed using three measures. The first one is the Topographic Error defined as follows:

ET1 = 1 − (1/m) Σ_{h=1..m} l(Xh)    (6)

This is one of the performance measures proposed by Kohonen [3]. The value of l(Xh) equals 1 when, for a given pattern Xh, the two neurons whose weight vectors resemble this pattern to the highest extent are also direct neighbors in the map. Otherwise l(Xh) equals 0. In the best case ET1 = 0.
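A sketch of ET1 is given below, under the assumption that direct neighborhood in the map means a city-block grid distance of 1 (a rectangular topology); `grid` is assumed to hold the (row, column) position of each of the P neurons.

```python
# Sketch of the topographic error E_T1 from (6); rectangular grid assumed.
import numpy as np

def topographic_error_et1(patterns, weights, grid):
    """patterns: (m, n); weights: (P, n); grid: (P, 2) integer map positions."""
    dist = np.linalg.norm(patterns[:, None, :] - weights[None, :, :], axis=2)
    best2 = np.argsort(dist, axis=1)[:, :2]        # two best-matching neurons per X_h
    pos = grid[best2]                               # their map positions, shape (m, 2, 2)
    adjacent = np.abs(pos[:, 0] - pos[:, 1]).sum(axis=1) == 1   # l(X_h)
    return 1.0 - adjacent.mean()                    # E_T1 = 1 - (1/m) sum l(X_h)
```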

The remaining two measures of the quality of the topographic mapping do not require the knowledge of the input data.


In the second criterion, in the first step, we calculate the Euclidean distances between the weights of the rth neuron and the weights of all other neurons. In the second step, we check if all p direct neighbors of neuron r are also the nearest to it in the sense of the Euclidean distance computed in the feature space. To express this requirement in a formal manner, let us assume that neuron r has p = |N(r)| direct neighbors, where p depends on the topology of the map; for neurons at the map border, which have fewer direct neighbors, the denominator in (7) is decreased accordingly. Let us also assume that the function g(r) returns the number of the direct neighbors that are also the closest to neuron r in the feature space. As a result, the ET2 criterion for P neurons in the map can be defined as follows:

ET2 = (1/P) Σ_{r=1..P} g(r) / |N(r)|    (7)

The optimal value of ET2 is equal to 1. In the third criterion, around each neuron r we construct a neighborhood in the feature space (a Euclidean neighborhood) defined as a sphere with the radius:

R(r) = max_{s ∈ N(r)} ||Wr − Ws||    (8)

where Wr are the weights of a given neuron r and Ws are the weights of its sth direct neighbor. Then we count the number of neurons which are not direct neighbors of neuron r, but are located inside R(r). The ET3 criterion, with the optimal value equal to 0, is defined as follows:

ET3 = (1/P) Σ_{r=1..P} | { s : s ≠ r, s ∉ N(r), ||Wr − Ws|| < R(r) } |    (9)
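The two grid-only criteria can be sketched together as follows. The rectangular-grid definition of N(r) and the reading of "closest in the feature space" as "among the p nearest neurons of r" are assumptions of this sketch.

```python
# Combined sketch of E_T2 (7) and E_T3 (9); N(r) = direct grid neighbors,
# so boundary neurons simply have fewer of them (reduced denominator in (7)).
import numpy as np

def topology_measures(weights, grid):
    """weights: (P, n) flattened map; grid: (P, 2) integer map positions."""
    P = weights.shape[0]
    wdist = np.linalg.norm(weights[:, None, :] - weights[None, :, :], axis=2)
    gdist = np.abs(grid[:, None, :] - grid[None, :, :]).sum(axis=2)
    et2_sum, et3_sum = 0.0, 0.0
    for r in range(P):
        nbrs = np.where(gdist[r] == 1)[0]           # N(r), direct neighbors
        p = len(nbrs)
        # E_T2: direct neighbors that are also among the p nearest neurons
        # of r in the feature space (excluding r itself).
        order = np.argsort(wdist[r])
        nearest = set(order[1:p + 1])
        et2_sum += len(nearest.intersection(nbrs)) / p     # g(r) / |N(r)|
        # E_T3: neurons that are not direct neighbors of r but fall inside
        # the sphere of radius R(r) from (8).
        radius = wdist[r, nbrs].max()                      # R(r)
        inside = wdist[r] < radius
        inside[r] = False
        inside[nbrs] = False
        et3_sum += inside.sum()
    return et2_sum / P, et3_sum / P
```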

The simulation results that illustrate the quality of the learning process evaluated on the basis of the five criteria described above are shown in Figure 1. Data are divided into P classes (centers), where P is equal to the number of neurons in the map. Each center is represented by an equal number of learning patterns. The centers are placed regularly in the input data space. To achieve comparable results for different map sizes, the input space is fitted to the input data. For example, for the 8x8 map the inputs are in the range of 0 to 1, while for the map of size 16x16 they are in the range of 0 to 2, and so forth. As a result, in all cases the smallest achievable value of Qerr equals 16.2e-3, while the optimal values of the remaining parameters (PDN / ET1 / ET2 / ET3) equal 0/0/1/0, respectively. The results reported in Figure 1 show that for particular map sizes and particular NFs the best results are achieved for different values of Rmax, usually small even for large maps. Note that the results for the modified algorithm are comparable to or in some cases even better than those obtained for the original algorithm. This is visible in Fig. 1 (a, f), i.e., for small values of the learning rate η, as expected. This conclusion is important from the hardware realization point of view.
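A sketch of a training set generated according to the description above is given below; the amount of noise added around each center is an assumption, since the exact spread is not stated here.

```python
# Sketch of the training set: P regularly spaced centers (P = number of
# neurons), an equal number of patterns per center, and an input range that
# grows with the map size (8x8 -> [0,1], 16x16 -> [0,2], ...).
import numpy as np

def make_training_set(map_side, patterns_per_center, noise=0.02, seed=0):
    rng = np.random.default_rng(seed)
    span = map_side / 8.0                                   # overall input range
    axis = (np.arange(map_side) + 0.5) * span / map_side    # regular center grid
    cx, cy = np.meshgrid(axis, axis)
    centers = np.stack([cx.ravel(), cy.ravel()], axis=1)    # P = map_side**2 centers
    data = np.repeat(centers, patterns_per_center, axis=0)
    data += rng.normal(scale=noise, size=data.shape)        # assumed spread per class
    return data
```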


[Figure 1 plots: in each of the panels (a)-(f), the quantization error Qerr [10E-3] is plotted versus Rmax for four cases: the original and the modified fisherman rule, each with the RNF and with the TNF; selected points are annotated with the corresponding PDN, ET1, ET2 and ET3 values.]

Fig. 1: Simulation results for: (a, b) 8x8, (c, d) 16x16, (e, f) 32x32 neurons in the map, for (a, c, e) η = 0.9 and (b, d, f) η = 0.3 at the beginning of the learning process.

3 Hardware implementation

In this section, we only briefly present the idea of the proposed hardware realization of the fisherman algorithm. The new solution is based on a fully parallel SOM, described earlier in [1, 2]. In the proposed circuit, particular weight vectors Wj(l) are transferred from one ring of neighbors to the next in the same way as the signal R in our previous circuit [2]. The block diagrams of the original as well as the modified fisherman algorithms realized in hardware are shown in Figure 2. The ADF and the ADFM blocks are responsible for the adaptation in both algorithms, according to (3) and (4), respectively. In the first case (top diagram) the multiphase clock {ck0, ck1, ck2, ...} is used to control the adaptation sequence as described earlier.


[Figure 2 diagrams: (top) a chain of ADF blocks for d = 0, 1, 2, ..., driven by the multiphase clock ck0, ck1, ck2, ..., in which the input pattern X(l) and the updated weights W(l+1) are passed from one ring to the next; (bottom) a chain of ADFM blocks for d = 0, 1, 2, ..., driven by a single clock ck, in which each ring reads the previous-cycle weights W(l) of the preceding ring.]

Fig. 2: Block diagram of the SOM realized in hardware with: (top) the original fisherman algorithm and (bottom) the modified algorithm.

The duration of a single clock phase should be sufficient to enable the adaptation of all weights of particular neurons in a given ring. In the CMOS 0.18 µm technology, this time equals approximately n · 5 ns, where n is the number of the network inputs. As a result, the overall adaptation process in a single learning cycle takes R · n · 5 ns. In the second (bottom) case all neighbors are adapted at the same time (during the single ck clock phase), and the overall adaptation of all neurons takes only n · 5 ns.
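The timing difference can be illustrated with a small back-of-the-envelope calculation using the approximately 5 ns per input figure quoted above; the chosen values of n and R are only an example.

```python
# Back-of-the-envelope comparison of the adaptation time per learning cycle,
# based on the ~n * 5 ns single clock phase quoted for the 0.18 um process.
PHASE_NS_PER_INPUT = 5

def adaptation_time_ns(n_inputs, R, modified):
    per_ring = n_inputs * PHASE_NS_PER_INPUT
    return per_ring if modified else R * per_ring

n, R = 3, 8                                         # example values only
print(adaptation_time_ns(n, R, modified=False))     # original:  8 * 15 = 120 ns
print(adaptation_time_ns(n, R, modified=True))      # modified:           15 ns
```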

4 Conclusions

A very fast hardware realization of the fisherman learning algorithm of the SOM has been proposed. To make such a realization feasible, we have introduced some modifications to the original algorithm. The simulations performed with the software model of the SOM show that the learning quality of the original and the modified algorithms is comparable. The modified algorithm allows for a significant reduction of the circuit complexity, as well as for a fully parallel adaptation of all neighboring neurons. As a result, a single learning cycle takes only 100 ns in the CMOS 0.18 µm technology, so the SOM with the fisherman learning rule is able to achieve a throughput of up to 10 million patterns per second, independently of the size of the map. In a nutshell, this result implies that such a NN can be even a thousand times faster than an SOM implemented on a standard PC, while consuming 50-100 times less energy.

References

[1] R. Długosz, M. Kolasa, "CMOS, Programmable, Asynchronous Neighborhood Mechanism for WTM Kohonen Neural Network", 15th International Conference Mixed Design of Integrated Circuits and Systems (MIXDES), Poznan, Poland, June 2008, pp. 197-201.

[2] R. Długosz, M. Kolasa, W. Pedrycz, "Programmable Triangular Neighborhood Functions of Kohonen Self-Organizing Maps Realized in CMOS Technology", European Symposium on Artificial Neural Networks (ESANN), Bruges, Belgium, 2010, pp. 529-534.

[3] T. Kohonen, Self-Organizing Maps, third ed., Springer, Berlin, 2001.

[4] J. A. Lee, M. Verleysen, "Self-Organizing Maps with Recursive Neighborhood Adaptation", Neural Networks, Vol. 15, Issues 8-9, October-November 2002, pp. 993-1003.
