
Fractal-clustering analysis of video information

I. K. Kokhanenko a)

Rostov Military Institute of Rocket Forces, Rostov-on-Don
(Submitted February 11, 2010)
Opticheskiĭ Zhurnal 77, 47–53 (August 2010)

A basis is provided for an image-analysis algorithm for automatic recognition, considered in this article from the viewpoint of formalizing the process of distinguishing clusters of either constant or similar signal intensity, their number and size, and two-level decision-making. Such an approach is based on the fractal hypothesis, which consists of assuming that the number of constant-intensity pixels in a cluster has a power dependence on its rank and that the pixels in the cluster itself have a fractal spatial distribution. The rationale of the hypothesis is associated with harmony of the variability and stability of the image as a system. The features of a two-level algorithm for automatic recognition by comparison with a standard are considered. An example of image recognition is presented, using an observed optical pattern. The features of such an approach when video information is analyzed on the basis of Kohonen patterns are studied. © 2010 Optical Society of America.

INTRODUCTION

It is shown in experimental psychology that it is not the brightnesses of objects that are most informative in image recognition, but the characteristics of their boundaries, the contours. This assumption is used in many algorithms for analyzing video information: in morphological analysis, in automatic segmentation of textured images, and in self-organized Kohonen cluster patterns. In this case, symmetries are found in the measurement information that are expressed in the invariance of the shape of an object against many transformations typical of the measurement conditions and that result in clusters of invariants, i.e., regions that are homogeneous in the characteristic attributes. These are the main external marks of an object; they are associated with its constant properties and determine its shape. Changes of the optical properties (color and texture) and of the illumination are typical of many transformations. In situations where the optical properties and illumination are locally homogeneous, clusters of constant brightness can be regarded as invariant relative to the indicated changes and can serve as the basis for recognition.

Assuming that the resulting images have such invariants, it is possible to represent a model of an image in the form of a two-dimensional array of $R = N_x \times N_y$ pixels, each line of which is given by a sequence of pairs of numbers, the intensity I of the received signal and the number n of elements of the line of the array having intensity I: $\{(I_1, n_1), (I_2, n_2), \ldots, (I_m, n_m)\}$, $\sum_i n_i = N_x$. Such a model of the measured image coincides with the following piecewise-constant description of it:

$$f(x,y) = \sum_i I_i \chi_i(x,y), \qquad i = 1,2,\ldots,m,$$
$$(x,y) \in A_i \to \chi_i(x,y) = 1, \qquad (x,y) \notin A_i \to \chi_i(x,y) = 0, \tag{1}$$

where m is the number of clusters A_i with constant intensity; here, one of the clusters is the background: $I_m \chi_m(x,y)$. A fairly complete description of the model is given in Refs. 1 and 2.
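As an illustration of this run-length model, the following minimal Python sketch (not from the paper; the function name and sample values are invented) encodes one line of a grey-scale array as the pairs (I_1, n_1), (I_2, n_2), ..., whose counts sum to N_x:

def run_length_pairs(row):
    """Encode one image line as (intensity, count) pairs; the counts sum to N_x."""
    pairs = []
    start = 0
    for k in range(1, len(row) + 1):
        # close the current run when the line ends or the intensity changes
        if k == len(row) or row[k] != row[start]:
            pairs.append((row[start], k - start))
            start = k
    return pairs

print(run_length_pairs([10, 10, 10, 200, 200, 200]))   # [(10, 3), (200, 3)]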


Next, if a morphological approach is followed, the shape of the object is constructed in the form of the projector $P_f g$, the projection of image g onto a set of shapes V(f), or the rule by which image g from the Hilbert function space $L_2(X)$ is put into correspondence with some image $\hat{g} = P_f g$. According to Refs. 1 and 2, by the shape of the image is meant a set of images with ordered intensity

$$V(f) = \Big\{ f(x,y) = \sum_i I_i \chi_i(x,y) + I_m \chi_m,\; (x,y) \in X,\; -\infty < I_m \le I_i < \infty,\; i = 1,2,\ldots,m-1 \Big\}. \tag{2}$$

The desired recognition rule must reveal the image in the set V(f) that is closest to image g. In this case, if

$$\hat{g} = P_f g = g, \tag{3}$$

it is assumed that the object is recognized; i.e., recording conditions can be chosen under which the object generates image g. Set V(f) is assumed to be convex and closed; i.e., projector $P_f$ is defined from the condition of minimum distance from the image to the shape, and this results in a procedure by which $P_f$ is constructed by averaging the image in each cluster of constant intensity (Ref. 1). Thus, the mean intensity, which is considered constant in each cluster, is found in it. But in this case, in the context of the image-recognition problem, it is important to know how this constant intensity is distributed in each cluster and how many pixels lie in different subregions of it. Obviously, such a pixel distribution in a cluster is a characteristic of the image. A recognition device that does not learn such characteristics therefore loses information.
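The cluster-averaging construction of the projection can be stated in a few lines of code. The following Python sketch (an illustration only, not the algorithm of Refs. 1 and 2 verbatim; the names are invented) averages an observed image g over each constant-intensity cluster of a reference labeling and reports the L2 distance used in the minimum-distance condition:

import numpy as np

def project_onto_shape(g, ref_labels):
    """Projection P_f g: average the observed image g over each cluster A_i of the reference shape."""
    g = np.asarray(g, dtype=float)
    ref_labels = np.asarray(ref_labels)
    proj = np.zeros_like(g)
    for lab in np.unique(ref_labels):
        mask = ref_labels == lab        # indicator chi_i(x, y) of cluster A_i
        proj[mask] = g[mask].mean()     # constant mean intensity on the cluster
    return proj

def shape_distance(g, ref_labels):
    """L2 distance from g to its projection; zero means the rule of Eq. (3) holds exactly."""
    return np.linalg.norm(np.asarray(g, dtype=float) - project_onto_shape(g, ref_labels))

If the observed image is itself constant on every reference cluster, shape_distance returns zero and the recognition condition of Eq. (3) is met.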

Such an algorithm still contains one indeterminacy: it is not clear how to determine the clusters of constant intensity A_i, how many there should be, and what their sizes are. Such questions are usually answered as follows: assuming that the image regions of identical intensity are formed by modules of the object with identical physical (optical, acoustic, etc.) and geometrical properties, one can specify clusters


A_i from the placement of homogeneously reflecting faces or boundaries relative to the observer. However, such a solution cannot always be physically implemented. Moreover, the constancy of the intensity is often fairly conventional; as a rule, the desired constancy approximately represents some set of intensities in a comparatively small region.

Similar questions characterize the algorithms implemented in neural networks, particularly in patterns in layers of Kohonen neurons. There the clusters are not regions of constant intensity, but regions with a similar (close-lying) spatial distribution of the intensities (attributes). Questions of the initial discrimination of clusters on the image (on what grounds and how many) remain open in the recognition problem in this case. Grounds are needed for designating an approximate number of clusters, without which the structure of the Kohonen layer cannot be synthesized. It is important in recognition to know, for each cluster, the fractal dimension of the intensity versus intensity-rank space or of the number-of-input-vectors versus cluster-rank space. Without such an estimate, a neural-network recognition device loses information.

I. THE FRACTAL HYPOTHESIS AND THE IMAGE SHAPE

To overcome the contradictions noted above, it is expedient to supplement the I(x, y) dependence with a connection between the intensity and the number of pixels in region A_i. Actually, for many real objects, each cluster A_i will have a number of pixels n_i of constant intensity I_i that characterizes it. It is thus necessary to find the dependence n_i(I_i). Taking into account that the image shape is associated with it, it is natural to seek the dependence among systems that possess the properties of variability and stability, since variability of the shape is a necessary condition for the completeness of adequate verification according to the rule given by Eq. (3), while stability is a necessary condition for the conservation of its characteristics and observability. The concept of variability is usually associated with entropy and information. In the equilibrium state (the extremum of entropy), with effort as the only resource, the constrained-extremum problem, depending on the character of the limitations of the shape by the resources, leads to various rank distributions: Gibbs, Zipf–Pareto, "broken rod," and MacArthur (Ref. 3).

This article gives an example of only the Zipf–Pareto distribution $p_I = \beta I^{-1/(d-1)}$, where it is assumed that the requirements of the clusters for the resource are proportional to ln I. For example, in accordance with Fechner's physiological law, the effort required to reproduce a word of rank I in a text is proportional to ln I, and therefore the frequency of appearance in a text of a word of rank I is fractally associated with its rank. Here $\beta = p_1$, $p_1 \ge p_2 \ge \ldots \ge p_m$, m is the number of clusters (forms) in the system (the message), d in thermostatics is treated as a quantity inverse to the temperature, and d in the Zipf–Pareto law is the fractal dimension.

It is usually assumed that the greater the variability of the system, the larger are the clusters in it and the greater the uniformity $E = (-\sum_i p_i \ln p_i)/\ln m$, which is the ratio of the observable variability to its maximum value, i.e., the uniformity of the distribution of clusters in the system (E = 1 when the distribution of the clusters is even). Here the quantity $-\sum_i p_i \ln p_i$ is known as the Shannon index, and the typical range of its variation is 1.5–4.5.
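As a small numerical illustration (using the standard definitions of the Shannon index and evenness that the text appears to rely on; the proportions are invented), both quantities can be computed from the cluster proportions p_i:

import numpy as np

p = np.array([0.5, 0.25, 0.15, 0.07, 0.03])   # cluster proportions, summing to 1
H = -np.sum(p * np.log(p))                    # Shannon index of the cluster system
E = H / np.log(len(p))                        # uniformity (evenness); E = 1 for an even distribution
print(round(H, 2), round(E, 2))               # about 1.27 and 0.79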

An evaluation of the properties of these laws (density, entropy, uniformity, Shannon index) shows that the greatest variability, and consequently perfection and stability, is possessed by systems whose structure is associated with the Zipf–Pareto and MacArthur distributions. Moreover, systems characterized by the latter distribution have the greatest variability. While the former law has a known direct relationship to fractal structures, this is not obvious for the MacArthur law. By studying the mechanism by which the MacArthur distribution is formed, it can be shown that it reduces to the Zipf–Pareto rank distribution $x(r) \propto r^{-\beta}$, $\beta = 1/(d-1)$; i.e., the MacArthur law also belongs to the class of fractal distributions.

The resulting conclusions about variability are confirmed when the ideas of information theory are used. The character of the perception of the signals of the external medium by the system obviously depends on the bandwidth of the signal frequencies to which the system and its elements are sensitive. In this case, the wider the frequency band, the greater not only the variability, but also the errors in perceiving the signals of the external medium, and the greater the probability that the system will be unstable. Thus, not only the variability of the system, i.e., the number of clusters and of elements in them, but also the inadequacy of its behavior, and consequently the probability of its degradation, depend on the bandwidth of the signal frequencies perceived by it and by its clusters and elements. In the context of the matter being studied, it is interesting to compare the transmission bandwidth of the system "rank of a cluster versus number of pixels of identical intensity in it" under the various distribution laws. To do this, one must clarify the concept of the transmission bandwidth as applied to the cluster model of the image studied here. The frequency associated with the fractal dimension d and rank r of the cluster depends on the Zipf–Pareto rank distribution as follows (Ref. 3): $\nu_r = C[\exp(1/(d-1))]^{-\ln r}$, C = const. It is assumed in this case that the dependence between the number of pixels in a cluster and the frequency is linear. From this, the Zipf–Pareto rank distribution can be written in the form $n(r) = n(1)\,r^{-1/(d-1)}$, $1/(d-1) = \ln\gamma$, and $\gamma = (\nu_r/\nu_{r+1})^{-\ln(r+1)}$. The frequency ratio of clusters i and (i+1) for the Zipf–Pareto law equals $\nu_2 = \gamma^{\ln(r+1)}$, $\gamma = \exp(1/(d-1))$. Similarly, for the Gibbs law (ratio $\nu_1$) and the MacArthur law (ratio $\nu_3$): $\nu_1 = \gamma$ and $\nu_3 = \ln(r+1)$. It follows from these dependences that, for ranks $r \ge 2$, the width of the frequency band is greatest for the Zipf–Pareto law and smallest for the MacArthur law. This agrees with the character of the variability of the systems characterized by the distribution laws studied here: the fact that a system is fractal promotes its stability because it ensures a high degree of variability and an acceptable noise level at a reasonable transmission width.

Thus, systems that obey the Zipf–Pareto law have second rank with respect to variability for the largest transmission band. Systems that obey the MacArthur law have first rank in the transmission band and the greatest variability.


Systems that obey the Gibbs law have a moderate frequency band and the smallest variability. It is shown above that systems that obey the MacArthur and Zipf–Pareto laws belong to the class of fractal systems. Consequently, the fractal nature of a system promotes its stability because it provides a high degree of variability and acceptable noise for a reasonable transmission width. Therefore, it is natural for a stably evolving system to have a fractal structure. Consequently, it is reasonable to choose a fractal as the dependence between the size of cluster A_j (the number n_j of pixels of constant intensity) and the corresponding constant intensities I_j; i.e.,

$$n_j(I_j) = A/I_j^{\,d}, \qquad d = \beta + 1, \tag{4}$$

where the parameters are determined from the relationships in Ref. 4: for dimension d, this is

$$d_1 = \ln(G/A), \tag{5}$$

where G is the geometric mean of the intensity observations, excluding the background; for A, this is $A_1 = \min(I_1, I_2, \ldots, I_{m-1})$.

Such is the meaning of the fractal hypothesis that determines the choice of the number of clusters and their size. Now Eq. (2) for the set of image shapes takes the form

$$V(f) = \Big\{ f(x,y) = \sum_j (A/n_j)^{-d}\,\chi_j(x,y) + I_m\chi_m,\; (x,y) \in X,\; d > 0,\; j = 1,2,\ldots,m-1 \Big\}, \tag{6}$$

$\beta = d - 1$. Again we obtain the shape of a piecewise-constant image, except now in the form of a fractal. Consequently, to determine the projection $P_f g$, it is necessary to average the observable image g(x, y) in each of the clusters A_i of constant intensity of image f (Refs. 1 and 2). It can be seen from Eqs. (5) and (6) that the set of image shapes has a single parameter, the fractal dimension d. It is necessary to estimate d for the resulting projection $P_f g$, i.e., to estimate the fractal dimension of the system formed by the mean intensities and the corresponding numbers of pixels in the clusters. Equality of the resulting fractal dimension to this shape parameter will be equivalent to following the rule of Eq. (3), i.e., equivalent to bringing the observed image into coincidence with the standard. Actually, $I_1 = I_1(A, d, n_1)$, $I_2 = I_2(A, d, n_2)$, ..., $I_{m-1} = I_{m-1}(A, d, n_{m-1})$ are (m−1) equations in two unknowns (A, d); if it is recalled that $A = \min(I_1, I_2, \ldots, I_{m-1})$, we have (m−1) equations in one unknown (d). Obviously, for identical intensities in the clusters of the reference and observed images, their fractal dimensions can be equal only when the numbers of pixels n_i are equal.
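As a minimal sketch of the parameter estimates just described (taking Eq. (5) as printed; the cluster intensities are invented example values), A, G, and d_1 can be computed as follows:

import numpy as np

I = np.array([210.0, 140.0, 95.0, 60.0])   # mean intensities of the non-background clusters
A = I.min()                                # A = min(I_1, ..., I_{m-1})
G = np.exp(np.mean(np.log(I)))             # geometric mean of the intensity observations
d1 = np.log(G / A)                         # dimension estimate of Eq. (5)
print(A, round(G, 1), round(d1, 2))        # 60.0, about 113.7 and 0.64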

II. ALGORITHMIC FEATURES OF THE CLUSTERING PROCEDURES

According to the fractal hypothesis, it is more convenient to solve the recognition problem in the following sequence. When an image is represented by an array with grey-scale elements, the scale is partitioned into intervals (clusters) of intensity values in the range 1–256, on each of which the intensity is taken to be equal to the corresponding mean value; for example, in the interval 180–256, the mean intensity equals 218. By developing the fractal hypothesis, it is possible to define a regular procedure for designating the ranges of the indicated clusters. Actually, when studying how the greatest intensity I_m in a cluster depends on the rank r of the cluster, the considerations of Section I concerning the transmission bandwidth of systems with different distribution laws, in terms of the greatest information content, make it expedient to use a hyperbolic dependence, i.e., the Zipf–Pareto law. When a grey scale is used, this dependence has the form $I_m = 256/r^d$; i.e., the greater the intensity, the more significant the size of the cluster. Clusters of constant intensity, for example with d = 1, can have the boundaries 256–128, 128–85, 85–64, and 64–51. The fractal dimension d is determined starting from the physically based value of the signal intensity of the background I_f and the number m of clusters, i.e., $d = (\ln 256 - \ln I_f)/\ln m$. The fractal dimension of the system of clusters with the corresponding constant intensities is then calculated. As shown earlier, the coincidence of the clusters and of the fractal dimensions of the shape and the projection is evidence that Eq. (3) is satisfied, i.e., that the observed image coincides with the standard.

However, the model of Eq. (6), based on the fractal hypothesis, can be extended by making a more detailed analysis. Actually, it is expedient to structure the connection I(n) between the intensity and the number of pixels in a cluster, described by Eq. (6), so as to take into account the natural nonuniformity of the placement of the pixels of constant intensity in each cluster, as pointed out in the Introduction. If separate nonintersecting subranges are distinguished in the cluster, the number of pixels of constant intensity in each of them is ranked, i.e., n(r). After this, Eq. (6) is transformed into a more detailed model, a second-level model, including (m−1) models (forms) of the clusters,

$$n_j(I_j) = \{ n_j(I_j(x,y)) \} = \Big\{ r_{jk}^{-1/(d_j-1)} \big(A/(d_j-1)\big)^{1/(d_j-1)},\; (x,y) \in X,\; j = 1,2,\ldots,m-1,\; k = 1,2,\ldots,m_j \Big\}, \tag{7}$$

where m_j is the number of subranges in a cluster.

To refine the general recognition result, the projections $P_f g_j$ are determined for each of the (m−1) clusters, and their fractal dimensions d_j are estimated. The projections of the clusters are found as follows. Intervals Δx and Δy are distinguished in a cluster along the x and y axes. In each interval of the jth cluster (j = 1, 2, ..., m−1), the number of pixels n_{jk}, k = 1, 2, ..., m_j, of a given intensity that appear in it is found, where m_j is the number of subranges in cluster j. The subranges are ranked in order of decreasing number of pixels of a given intensity. Next, the fractal dimension of the cluster is estimated from the known rank relationship for the Zipf law

$$n(r) = B/r^{1/(d-1)}, \qquad B = \big(A/(d-1)\big)^{1/(d-1)},$$

from which it is easy to find $d = d_j$ by constructing the straight line

$$\ln n = \ln B - \frac{1}{d-1}\,\ln r. \tag{8}$$

Here r is the rank of the subregion in the cluster, and A is the lowest intensity.
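To make the procedure concrete, here is a compact Python sketch (an illustration under the assumptions just stated, not the author's code; the function names and sample counts are invented). The first level builds the grey-scale cluster boundaries from $I_r = 256/r^d$ with $d = (\ln 256 - \ln I_f)/\ln m$; the second level estimates a cluster dimension d_j from the slope of the straight line of Eq. (8) fitted to the ranked subrange counts:

import numpy as np

def cluster_boundaries(I_f=20.0, m=5):
    """First level: boundaries I_r = 256 / r**d with d = (ln 256 - ln I_f) / ln m."""
    d = (np.log(256.0) - np.log(I_f)) / np.log(m)
    edges = [256.0 / r**d for r in range(1, m + 1)]
    return d, edges                       # edges run from 256 down to about I_f

def rank_dimension(counts):
    """Second level: fit ln n = ln B - (1/(d-1)) ln r (Eq. (8)) over the ranked counts."""
    n = np.sort(np.asarray(counts, dtype=float))[::-1]   # rank subranges by decreasing count
    n = n[n > 0]
    if len(n) < 2:
        return None
    r = np.arange(1, len(n) + 1)
    slope, _ = np.polyfit(np.log(r), np.log(n), 1)       # slope estimates -1/(d - 1)
    return 1.0 - 1.0 / slope                             # hence d = 1 - 1/slope

# grey-scale boundaries for background intensity I_f = 20 and m = 5 clusters
d, edges = cluster_boundaries(I_f=20.0, m=5)
print(round(d, 2), [round(e, 1) for e in edges])   # about 1.58 and [256.0, 85.4, 44.9, 28.5, 20.0]

# dimension of one cluster from its ranked subrange pixel counts (invented data)
print(round(rank_dimension([400, 180, 110, 80, 62, 50]), 2))   # about 1.86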


Using the fractal dimensions d and d_j (j = 1, 2, ..., m−1) thus found, a decision is made on the basis of the theoretical results of dynamic fractal geometry (Refs. 4 and 5). The meaning attached to the term dynamic is as follows. It is assumed in the theory of fractals that fractal objects display spatial self-similarity, while dynamic fractals (fractal time series) have statistical self-similarity in time. The generalized study of self-similarity is based on replacing temporal by spatial dynamics, in the kinetic equations of which the derivatives with respect to time are replaced by derivatives with respect to the spatial coordinates. It is shown in Ref. 5 that the anomalous diffusion equations of subdiffusion and superdiffusion type in divergent form, modified by replacing time by the x coordinate, have an isocoordinate ($\partial f/\partial x = 0$) solution of the form $f \propto h^{-d}$, corresponding to the maximum entropy, where d is the fractal dimension and h is a characteristic of the surface (the image). Now, following Ref. 4, a sequence of equations of the form $\partial f_1/\partial x = a f_1$, $\partial f_2/\partial x = b f_2$ can be written, from which the fractal relation $f_1 \propto f_2^{-d}$ is obtained, where $f_2 = h$ and $d = a_1/a_2$. Such a sequence naturally applies in the case studied here, when the number $n_i$ of pixels in the ith cluster of constant brightness $I_i$ satisfies $n_i \propto I_i^{-d}$, while, on the other hand, the number of pixels is the sum $n_i \propto \sum_j B_i/r_j^{1/(d_i-1)}$ (the summation over index j runs from j = 1 to j = k, where k is the number of intervals in the cluster). We therefore get that

$$I_i^{-d} \propto \sum_j B_i/r_j^{1/(d_i-1)}, \tag{9}$$

where i = 1, ..., m−1, and r_j is the rank in the jth interval of the ith cluster.
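The step from the pair of first-order equations to the power-law relation is a standard elimination of the coordinate, written out here for completeness with a and b denoting the two coefficients:

$$\frac{\partial f_1}{\partial x} = a f_1 \;\Rightarrow\; f_1 = C_1 e^{ax}, \qquad \frac{\partial f_2}{\partial x} = b f_2 \;\Rightarrow\; f_2 = C_2 e^{bx}, \qquad\text{so that}\quad f_1 = C_1\left(\frac{f_2}{C_2}\right)^{a/b} \propto f_2^{\,a/b}.$$

The exponent is thus the ratio of the two coefficients; with the sign convention of the text this gives $f_1 \propto f_2^{-d}$, the ratio being written in Ref. 4 as $d = a_1/a_2$.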

The fact that the fractal dimensions of the clusters of the standard and of the observed image are equal is evidence that Eq. (3) is valid, i.e., that the observed image coincides with the standard. An additional substantive analysis is required if some or all of them are to be distinguished. Such distinctions can be useful in recognition, in the visualization of anomalies and of defects in microelectronics, in stored systems, and in other applications. Equation (9) is evidence of a functional dependence between the fractal dimension d of the image and the fractal dimensions d_i of the clusters of constant intensity. This indicates that the conclusion that the standard and the observed images coincide cannot rest on their overall fractal dimension alone, Eq. (3). To obtain a more reliable decision, it is necessary to estimate the fractal dimensions of the clusters.

A morphological analysis of an image on the basis of the fractal hypothesis thus includes two levels. The first is based on the basic Eq. (6) and a comparison of the fractal dimensions of the shape and of the observed image; ultimately, the second level assumes the comparison of the rank fractal dimensions of clusters of constant intensity, Eq. (7). On each level, a decision can be made concerning recognition; the reliability of the solution obviously increases as the level increases.

As shown above, the number of clusters depends on the background intensity I_f and the fractal dimension. For d in the range 1–3 and I_f in the range 10–50, the number of clusters varies approximately from 2 to 25. A quantitative study of a set of images shows that one or two clusters are sufficient to reflect the specifics of the image, and as a rule no more than five. The fifth cluster in the grey scale corresponds to a constant intensity equal to about 20. If the classical Shannon formula is used, it is found that 0.75 of the information concerning an image that is contained in five clusters is contained in three clusters, and 0.15 is found in one cluster. This is evidence that the image contains a nucleus: a one-, two-, or five-brightness nucleus. Relative to variations of the optical properties, the nucleus is invariant, describing the visible pattern. The indicated invariant is the main one, the most substantial in the image, and is a compressed form of the image. Such an assertion is most adequate for homogeneous images, i.e., images that have small essential elements (for example, a face: the nose, eyes, and ears). This is confirmed by Fig. 1, which shows for comparison two versions of the image of the same object. If there are several essential elements in the image, significant information is lost when the number of clusters is decreased.
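The quoted range of cluster numbers can be checked directly from the expression for d given in Section II: inverting $d = (\ln 256 - \ln I_f)/\ln m$ gives $m = (256/I_f)^{1/d}$, so that

$$m = \left(\frac{256}{I_f}\right)^{1/d}: \qquad I_f = 10,\; d = 1 \;\Rightarrow\; m \approx 25.6; \qquad I_f = 50,\; d = 3 \;\Rightarrow\; m \approx 1.7,$$

in agreement with the range of approximately 2 to 25 clusters stated above.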

III. FEATURES OF NEURAL-NETWORK ANALYSIS

Video information for which it is hard to use references is often interpreted by using various clustering algorithms. The problem of neural-network analysis of video information is considered here as applied to Kohonen layers (Ref. 6), where the clusters are not regions of constant intensity but regions with similar (close-lying) spatial distributions of intensities (attributes). In this case, as before, questions of the initial discrimination of clusters on the image remain open in the analysis of the images. What is more, in the context of image recognition, it is important to know the fractal dimension of the intensity versus intensity-rank space in each cluster. A neural-network recognition device loses information if it does not estimate the indicated fractal dimension.

The presence of a nucleus of the image, noted in Section II, makes it possible to compress the information by fractal clustering. Therefore, before forming the structure of a Kohonen layer, it is preferable to carry out fractal clustering of the image, using the morphological analysis technique described in Section I. In this case, because of the information compression characteristic of fractal clustering, the input vectors of the Kohonen layer will subsequently contain a smaller number of intensity gradations, and this simplifies the neural clustering procedure that follows the fractal procedure, reducing the number of training cycles.

FIG. 1. Image (a) and a copy of it (b) with three clusters.

The architecture of the Kohonen network, synthesized in accordance with the training vectors and the fractal clustering, contains a certain number of clusters, in each of which, after the neural analysis, winner neurons occur that are composed of a subset of input vectors of the image. In traditional algorithms, the problem is completed at this point. However, it is obvious that such a conclusion usually yields no result for automatic image recognition, because there is no teacher in Kohonen networks. A teacher, i.e., a standard, is needed to recognize an image. Therefore, the topology of the Kohonen pattern thus obtained needs to be analyzed further in the image-recognition problem. To do this, the fractal dimension $d_{ni}$ of the number-of-input-vectors versus cluster-rank space is estimated in each cluster. Equation (8) is used in this case, up to a change of notation. The same problem is solved for the reference, and the fact that the fractal dimensions of the reference and the image coincide is evidence of recognition.
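The same rank fit as in the sketch of Section II can be applied to the trained Kohonen layer; here the ranked quantity is the number of input vectors captured by each winner neuron rather than the subrange pixel counts. A minimal Python illustration (the winner counts are invented; in practice they come from the trained layer):

import numpy as np

def kohonen_rank_dimension(winner_counts):
    """Fractal dimension of the number-of-input-vectors vs. cluster-rank space,
    estimated from the counts of input vectors per winner neuron (Eq. (8), up to notation)."""
    n = np.sort(np.asarray(winner_counts, dtype=float))[::-1]
    n = n[n > 0]
    r = np.arange(1, len(n) + 1)
    slope, _ = np.polyfit(np.log(r), np.log(n), 1)   # slope estimates -1/(d - 1)
    return 1.0 - 1.0 / slope

d_image = kohonen_rank_dimension([520, 230, 150, 90, 60])   # observed image
d_ref = kohonen_rank_dimension([500, 260, 140, 95, 55])     # reference
print(round(d_image, 2), round(d_ref, 2))   # closeness of the two values argues for recognition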

IV. AN EXAMPLE

The video information to be analyzed is represented in the form of static digital black-and-white images (the observed image in Fig. 1 and the reference image in Fig. 2) with dimensions of 300 × 110 elements (pixels) and with N = 256 intensity gradations (grey gradations). The images are transformed into the corresponding arrays.

A comparison of the reference image with the full-scale image of the object in Fig. 2 is not carried out in the three-cluster version, but it is largely similar to Fig. 1. The fractal dimension of the reference image in the three-cluster version, calculated from Eq. (8), equals 3.1. Likewise, the fractal dimension determined for the image to be recognized (Fig. 1) equals 2.4. It follows from this that the indicated images are not identical. This is confirmed by the result of the second level of analysis: estimates of the fractal dimensions of the clusters of the two images to be compared. They were found to be 2.5, 1.5, and 1.2 for the reference and 2.0, 1.7, and 1.6 for the image to be recognized, respectively. It is obvious that they do not coincide.

FIG. 2. Reference image in a three-cluster version.

The example illustrates that the algorithm is workable and that the fractal dimension of the spaces of the generated clusters of invariants has high information content as a recognition criterion. It does not, however, make it possible to judge such an important characteristic of the algorithm as the probability of correct recognition on image samples. The author has not obtained analytic relationships for estimating such a probability. However, experience has been accumulated in the computer recognition of diverse images using the proposed algorithm. This shows that the reliability of the solution in one-level analysis is about 60–80%. Experiments on recognition were not carried out for conditions of strong distortions, where the aspect was unsuitable or the illumination was poor. When one goes to two-level analysis, the reliability increases by about 15–20%, depending on the number of clusters and of pixels in them. Moreover, a 10% difference in the fractal dimensions of two images produces a relative distance of 45–15% between the images in the metric of Hilbert functional space as the number of pixels varies from 3 to 100. These results were obtained from thirty experiments for each type of image (terrain, technical objects, people).

CONCLUSION

The proposed procedures for automatic two-level cluster image analysis are based on the well-substantiated hypothesis that the number of constant-intensity pixels in the clusters of invariants exhibits a power dependence on the rank and that the pixels in each cluster have a fractal spatial distribution. In this case, unlike traditional approaches, it becomes possible to formalize the estimate of both the number of clusters and their size, while the study of the spatial distribution of constant intensity in a cluster and the estimate of the fractal dimension of such a distribution increase the information content of the analysis. The same parameter is used in the recognition criterion accompanying the analysis: the fractal frequency or rank dimension (depending on the available data).

Since the proposed procedures can be used not only in morphological analysis but also in neural-network image recognition, their features in self-organizing Kohonen patterns are discussed. The possibility of formalizing the estimates both of the number of clusters and of their dimension is also used here, and the content of the fractal spaces to be analyzed changes by comparison with the morphological analysis.

The procedures developed here thus make it possible to use clusters of invariants for automatic image recognition, using the corresponding fractal dimensions as a measure of similarity.

a) Email: [email protected]

1. Yu. P. Pyt'ev, "Morphological image analysis," Dokl. Akad. Nauk SSSR 269, No. 5, 1061 (1983).

2. Yu. P. Pyt'ev and A. I. Chulichko, "Morphological image analysis: comparison in shape, recognition, classification, and estimation of parameters," in Reports of the Eleventh All-Russia Conference on Mathematical Methods of Pattern Recognition (VTs RAN, 2003), pp. 415–418.

3. Geography and the Monitoring of Biodiversity (Izd. Nauchn. Ucheb. Metod. Tsentra MGU, Moscow, 2002).

4. I. K. Kokhanenko, "Fractals in the evaluation of the evolution of complex systems," AiT No. 8, 54 (2002).

5. I. K. Kokhanenko, "Fractal nonparametric recognition," Obozr. Prikl. Promysh. Matematiki 15, 704 (2008).

6. P. D. Wasserman, Advanced Methods in Neural Computing (Van Nostrand Reinhold, New York, 1993; Mir, Moscow, 1992).
