ICONICS — THE SCIENCE OF IMAGES

Method of automatically analyzing color images

T. Yu. Fisenko and V. T. Fisenko

Scientific-Design Center of Optoelectronic Observation Systems, Branch of TsNII Kometa, St. Petersburg

(Submitted December 24, 2002)

Opticheskiĭ Zhurnal 70, 18–23 (September 2003)

This paper presents a method of automatically analyzing color television images. The method is based on the unification of segmentation and clusterization. It makes it possible to take into account both the spatial and the color characteristics of the image. The number of clusters is not predetermined but is computed during the processing in accordance with the information contained in the image. The boundary of a segment is determined to within a raster element, unlike fragmenting methods, in which the accuracy with which the boundary is determined depends on the size of the fragment. The algorithm reduces the feature space from 16 million to several tens of clusters. The algorithm is implemented as part of a software package for a television computational system for analyzing color images. © 2003 Optical Society of America


INTRODUCTION

The problem of analyzing color images includes such aspects as the description of color images, their classification, i.e., the formation of clusters (in this case, by a cluster we usually mean a group of objects that form a region in description space that is compact in some sense), and segmentation, i.e., partitioning the image into regions that are homogeneous relative to one or several characteristics or that belong to a certain cluster. The shape and distribution of the clusters when color images are segmented depend on the color-coordinate space. The authors have analyzed color images in various color-coordinate spaces. The choice of color-coordinate space is just as important as the segmentation method proper.

For segmenting color images, Ohlander et al.¹ selected the quantization thresholds in terms of multi-dimensional histograms of the distribution of the signal components. The images were analyzed in terms of nine components obtained for the three color-coordinate spaces RGB, YIQ, and HSI (H is hue, S is saturation, and I is intensity). The test subject is a room with furniture for resting. The authors first segmented the textured part of the scene and then, for the parts of the scene in which there is no texture, segmented it in terms of color on the basis of multi-dimensional threshold limitation. The model experiment showed that Ohlander's segmentation method is extremely efficient. The hue component is recognized by the authors as being more informative.

Reference 2 proposed a clusterization method for the segmentation of color images in cylindrical coordinates ($L^* H^\circ C^*$), obtained from the 1976 CIE $L^* a^* b^*$ space, in which $L^*$ determines the luminance of the light, $a^*$ determines the ratio of red and green, $b^*$ determines the ratio of blue and green, H is the hue, and C is the saturation. In this description, an element of a cluster is a circular cylindrical segment the surfaces of whose elementary volume are formed by given ranges of variation of the luminance and chromaticity. Horizontal cuts are formed for constant luminance; vertical cuts passing through the achromatic axis are obtained at constant hue, while the parts of the cylindrical surfaces concentric relative to the achromatic axis are formed with constant saturation. The clusters are estimated with no assumptions regarding the distribution laws of the clusters. The procedure of determining the clusters is as follows: One-dimensional histograms of the distributions of the $L^* H^\circ C^*$ coordinates are constructed from the image. The mode with the larger depth is selected from these distributions. Two clusters are selected from this distribution. One-dimensional histograms of the distributions over each of the remaining coordinates are then constructed for each of the estimated clusters. The use of one-dimensional histograms to determine three-dimensional clusters reduces the number of necessary computations. The method of the Fisher linear discriminant is used to separate the clusters. Such a procedure ensures successful separation in terms of a one-dimensional distribution, and this is an advantage of this method over that presented in Ref. 1. However, when segmentation is done by the method of quantization of the histograms of the distribution, the clusters have a rectangular, less flexible shape.

Uchiyama and Arbib³ followed the recommendations of the 1973 CIE on the use of the equicontrast color space $L^* u^* v^*$, which differs from $L^* a^* b^*$ space in the chromaticity coordinates $u^* v^*$ (the luminances coincide in these spaces). While the ($a^* b^*$) are a nonlinear transformation of the CIE ($x, y$), the ($u^* v^*$) are connected with ($x, y$) by a linear transformation. When they segment a color image, the authors use an unsupervised learning method (training without a teacher), proving that it reduces to the clusterization method using the criterion of the minimum of the sum of squares of the error. With such an approach, the shape of the cluster changes: with clusterization using the method of least squares of the errors, the entire space is broken up into regions belonging to the closest weight vector, and a more flexible shape of the cluster is formed, as a Voronoi mosaic. Vector quantization assumes that the set of input vectors is partitioned into a certain number of weight vectors $W_i$. It is proven in Ref. 3 that the clusterization problem using the criterion of the least sum of squares of the errors reduces to the problem of vector quantization. The clusterization algorithm is carried out for a specified cluster size and number of iterations and includes competitive training for forming the quantization vectors and determining the clusters. The competitive training method requires many iterations.

FIG. 1. Color-coordinate space HLS. The locus $r + g + b = 1$ determines Maxwell's triangle. $P$ is a color element, $W$ is gray with coordinates $r = g = b = 1/3$, $P'$ is the point at which $OP$ intersects the plane of Maxwell's triangle. $H = \varphi$ ($0 \le \varphi \le 2\pi$), $S = WP'/WA$ ($0 \le S \le 1$).

J. Opt. Technol. 70 (9), September 2003. 1070-9762/2003/090637-05$20.00 © 2003 The Optical Society of America

By combining the histogram-quantization method and the clusterization method in terms of K-intragroup means, we obtain a new method of analysis that makes it possible to use the advantages of each of these methods while minimizing computer time.

CHOOSING THE COLOR-COORDINATE SPACE

Let us choose color-coordinate space HLS,⁴ in which luminance component L corresponds to the luminance of a black-and-white image. The chromaticity is determined by hue H and saturation S (Fig. 1), computed in terms of the normalized tristimulus values:

$$ r = \frac{R}{R + G + B}, \qquad g = \frac{G}{R + G + B}, \qquad b = \frac{B}{R + G + B}. $$
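As a minimal sketch of the normalization above, the following Python function maps an (R, G, B) triple to the normalized values (r, g, b). The function name `chromaticity` and the handling of a black pixel (R + G + B = 0), which the text does not discuss, are assumptions made only for illustration.

```python
def chromaticity(R, G, B):
    """Return the normalized tristimulus values (r, g, b), with r + g + b = 1."""
    total = R + G + B
    if total == 0:
        # Assumption (not in the source): a black pixel has no defined
        # chromaticity, so it is mapped to the gray point W = (1/3, 1/3, 1/3).
        return (1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
    return (R / total, G / total, B / total)
```

The three outputs always sum to one, so every color lands on the plane of Maxwell's triangle.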

Strickland et al.⁴ gave expressions for estimating HLS by directly transforming from RGB space to HLS space, but did not write formulas for the inverse transformation. Based on the methods of analytic geometry, the following inverse transformation formulas were obtained from HLS space to RGB space:

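$$ x = -\frac{S}{3} + \frac{1}{3}, \qquad z = 1 - (x + y), $$

$$ y = \frac{S}{3} \times \frac{\cos\varphi}{\cos\varphi + \cos(120^\circ - \varphi)} + \frac{1}{3} \quad \text{for } 0^\circ \le \varphi \le 90^\circ; $$

$$ y = \frac{2S}{3} \times \frac{\cos\varphi}{\cos(120^\circ - \varphi)} + \frac{1}{3} \quad \text{for } 90^\circ \le \varphi \le 120^\circ; $$

$$ \varphi = H, \quad r = y, \quad g = z, \quad b = x \quad \text{for } H \le 120^\circ; $$

$$ \varphi = H - 120^\circ, \quad r = x, \quad g = y, \quad b = z \quad \text{for } 120^\circ < H \le 240^\circ; $$

$$ \varphi = H - 240^\circ, \quad r = z, \quad g = x, \quad b = y \quad \text{for } H > 240^\circ. $$

With eight-bit representation of the components, the values of the resulting components are in the range [0, 255].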


CLASSIFICATION CRITERION

A method of automatically analyzing color images has been developed for the chosen color-coordinate space.

Let $x_j$ ($j = 1, \dots, m$) be a finite set of input vectors ($x$), $C_i$ ($i = 1, \dots, n$) be a finite set of clusters ($C$), and $W_i$ ($i = 1, \dots, n$) be a finite set of weight vectors ($W$).

The center of mass of cluster $C_i$, which contains $m_i$ vectors, under conditions of equiprobable input vectors is given as

$$ \bar{x}_i = \frac{1}{m_i} \sum_{x_j \in C_i} x_j. \tag{1} $$

The sum of squares of the distances of the elements from the center of the cluster is determined from

$$ E_i = \sum_{x_j \in C_i} |x_j - \bar{x}_i|^2 = \sum_{x_j \in C_i} |x_j|^2 - m_i |\bar{x}_i|^2. \tag{2} $$
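Equations (1) and (2) can be checked numerically with a short Python sketch. The function names and the three sample RGB vectors are illustrative assumptions; the two functions `sse_direct` and `sse_shortcut` evaluate the left-hand and right-hand forms of Eq. (2) and should agree up to rounding.

```python
def centroid(vectors):
    """Center of mass of a cluster, Eq. (1)."""
    m = len(vectors)
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / m for d in range(dim)]


def sq_norm(v):
    """Squared Euclidean norm |v|^2."""
    return sum(c * c for c in v)


def sq_dist(a, b):
    """Squared Euclidean distance |a - b|^2."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def sse_direct(vectors):
    """First form of Eq. (2): sum of squared distances to the center."""
    c = centroid(vectors)
    return sum(sq_dist(v, c) for v in vectors)


def sse_shortcut(vectors):
    """Second form of Eq. (2): sum of |x_j|^2 minus m_i |center|^2."""
    c = centroid(vectors)
    return sum(sq_norm(v) for v in vectors) - len(vectors) * sq_norm(c)


# Hypothetical cluster of three reddish RGB vectors.
cluster = [(250, 10, 12), (240, 20, 8), (245, 15, 10)]
print(centroid(cluster), sse_direct(cluster), sse_shortcut(cluster))
```

The second form needs only running sums of $x_j$ and $|x_j|^2$, which is what makes the incremental updates later in the paper inexpensive.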

Let us define the partitioning function as $x_j \in C_i$ if, for any $k \ne i$, $|x_j - W_i|^2 < |x_j - W_k|^2$, and let us define the weight vector function as the center of mass of cluster $C_i$:

$$ W(C) = \{ \bar{x}_i \mid i = 1, \dots, n \}. $$

The sum of squares of the deviations from the center of mass of the set of input vectors can be written as

$$ E_0 = \sum_{i=1}^{n} \sum_{x_j \in C_i} |x_j - \bar{x}_0|^2, $$

where $\bar{x}_0$ is the center of mass of the set of input vectors,

$$ \bar{x}_0 = \frac{1}{m} \sum_{j=1}^{m} x_j. $$

The sum of squares of the intercluster deviations is given by the following equation:

$$ E_M = \sum_{i=1}^{n} |\bar{x}_i - \bar{x}_0|^2. \tag{3} $$

When the criterion of the minimum of the sum of squares of the errors is used, the quality index for clusterization has the form

$$ E_W = \sum_{i=1}^{n} \sum_{x_j \in C_i} |x_j - \bar{x}_i|^2, $$

$$ E_0 = E_M + E_W. \tag{4} $$


Since $E_0$ is independent of clusterization, minimization of the sum of squares of errors, $E_W \to \min$, maximizes the sum of squares of the intercluster deviations, thereby ensuring the best separability of the clusters.
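The decomposition of the total scatter into intercluster and intracluster parts can be illustrated with a small Python sketch. One assumption should be stated loudly: the code weights each intercluster term by the cluster size $m_i$, because that is the weighting under which the identity $E_0 = E_M + E_W$ holds exactly for clusters of unequal size, whereas Eq. (3) as printed shows no such weight. The function names and sample data are hypothetical.

```python
def mean(vectors):
    """Center of mass of a list of vectors."""
    m = len(vectors)
    return [sum(v[d] for v in vectors) / m for d in range(len(vectors[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance |a - b|^2."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def scatter_decomposition(clusters):
    """Return (E0, EM, EW) for a list of clusters (each a list of vectors)."""
    all_vectors = [v for c in clusters for v in c]
    x0 = mean(all_vectors)
    e0 = sum(sq_dist(v, x0) for v in all_vectors)
    ew = sum(sq_dist(v, mean(c)) for c in clusters for v in c)
    # Assumption: each intercluster term is weighted by the cluster size m_i.
    em = sum(len(c) * sq_dist(mean(c), x0) for c in clusters)
    return e0, em, ew


# Two hypothetical clusters of unequal size.
clusters = [[(0, 0, 0), (2, 0, 0)], [(10, 0, 0), (12, 0, 0), (14, 0, 0)]]
e0, em, ew = scatter_decomposition(clusters)
print(e0, em, ew)
```

Because `e0` does not depend on how the vectors are grouped, any regrouping that lowers `ew` raises `em` by the same amount, which is the separability argument made in the text.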

METHOD OF K-INTRAGROUP MEANS

The method of K-intragroup means is as follows.⁵ Let $C_j$ be a cluster with a number of elements $m_j \ge 2$. Let $C_k$ be a nonempty proper subset of $C_j$, i.e., $C_k \ne \varnothing$, $C_k \ne C_j$, $C_k \subset C_j$. Let $C_p = C_j \setminus C_k$ be the (also nonempty) set difference of $C_j$ and $C_k$. The center of mass of cluster $C_p$ is defined as

$$ \bar{x}_p = \frac{m_j \bar{x}_j - m_k \bar{x}_k}{m_j - m_k}. \tag{5} $$

The sum of squares of the distances of the elements from the center of cluster $C_p$ can be determined in accordance with Eq. (2):

$$ E_p = \sum_{x_i \in C_p} |x_i|^2 - m_p |\bar{x}_p|^2 = E_j - E_k - \frac{m_j m_k}{m_j - m_k} |\bar{x}_j - \bar{x}_k|^2. \tag{6} $$

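In our case, when a decision needs to be made for each raster element, the estimate is made in accordance with Eqs. (5) and (6) for the number of vectors in cluster $C_k$ equal to $m_k = 1$, so that $C_k$ consists of the single vector $x_k$:

$$ \bar{x}_p = \frac{m_j \bar{x}_j - x_k}{m_j - 1}, \tag{7} $$

$$ E_p = E_j - \frac{m_j}{m_j - 1} |\bar{x}_j - x_k|^2. \tag{8} $$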

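For cluster $C_p$ formed by uniting the two clusters $C_j$ and $C_k$, $C_p = C_j \cup C_k$, provided that $C_j \cap C_k = \varnothing$, the corresponding values are given by the following equations:

$$ \bar{x}_p = \frac{m_j \bar{x}_j + m_k \bar{x}_k}{m_j + m_k}, $$

$$ E_p = E_j + E_k + \frac{m_j m_k}{m_j + m_k} |\bar{x}_j - \bar{x}_k|^2. $$

For our case, when cluster $C_k$ consists of one vector, these equations take the form

$$ \bar{x}_p = \frac{m_j \bar{x}_j + x_k}{m_j + 1}, \tag{9} $$

$$ E_p = E_j + \frac{m_j}{m_j + 1} |\bar{x}_j - x_k|^2. \tag{10} $$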
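The single-vector updates of Eqs. (7)–(10) can be sketched in Python as two small functions that take the current statistics of a cluster (center, size, sum of squares) and return the updated ones. The function names and the tuple layout of the return value are assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance |a - b|^2."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def remove_vector(center, m, e, x):
    """Eqs. (7) and (8): statistics of a cluster after removing vector x.

    Requires m >= 2 so that the remaining cluster is nonempty.
    """
    d2 = sq_dist(center, x)
    new_center = [(m * c - xi) / (m - 1) for c, xi in zip(center, x)]
    return new_center, m - 1, e - m / (m - 1) * d2


def add_vector(center, m, e, x):
    """Eqs. (9) and (10): statistics of a cluster after adding vector x."""
    d2 = sq_dist(center, x)
    new_center = [(m * c + xi) / (m + 1) for c, xi in zip(center, x)]
    return new_center, m + 1, e + m / (m + 1) * d2


# Hypothetical cluster {(0, 0), (2, 0), (4, 0)}: center (2, 0), m = 3, E = 8.
state = remove_vector([2.0, 0.0], 3, 8.0, (4, 0))
restored = add_vector(*state, (4, 0))
print(state, restored)
```

Removing a vector and adding it back restores the original statistics, which is a convenient sanity check on the signs in the four equations.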

The algorithm of the K-intragroup means is constructed as follows: Using the initial partitioning, estimates are computed of the centers of mass of the clusters and the sum of the squares of the deviations of the vectors belonging to the cluster from the centers of mass of these clusters in accordance with Eqs. (1) and (2), and the sum of the squares of the deviations from the centers of mass of the clusters is determined over all the clusters. Then, for each vector $x_i$ belonging to cluster $C_r$, that cluster $C_j$ with $j \ne r$ is sought for which the following condition is satisfied:

$$ \frac{m_r}{m_r - 1} |\bar{x}_r - x_i|^2 > \frac{m_j}{m_j + 1} |\bar{x}_j - x_i|^2. $$

If such a cluster turns out to be cluster $C_\nu$, the sum of squares of the deviations of the vectors from the centers of mass of their clusters is reduced:

$$ E_r' + E_\nu' = E_r - \frac{m_r}{m_r - 1} |\bar{x}_r - x_i|^2 + E_\nu + \frac{m_\nu}{m_\nu + 1} |\bar{x}_\nu - x_i|^2, $$

where the primes denote the values after $x_i$ has been moved from $C_r$ to $C_\nu$.

New values of the centers of mass and of the sum of squares of the deviations are computed from Eqs. (7) and (8) for cluster $C_r$ and from Eqs. (9) and (10) for cluster $C_\nu$. Such a transposition reduces the overall sum of squares of the deviations of the vectors from the centers of mass of the clusters to which they belong. The classical algorithm of K-intragroup means assumes that this process is iterated as many times as is required in order that the sum of squares of the deviations does not change for two successive iterations.
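A single pass of the transfer procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: when several clusters satisfy the transfer condition, the code picks the one with the smallest increase; clusters with fewer than two vectors are never emptied; and empty clusters are skipped so that no new cluster appears. The name `one_pass` and the label-list representation are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance |a - b|^2."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def one_pass(vectors, labels, n_clusters):
    """Run one transfer pass over all vectors and return the updated labels."""
    labels = list(labels)
    dim = len(vectors[0])
    count = [0] * n_clusters
    center = [[0.0] * dim for _ in range(n_clusters)]
    # Initial centers of mass, Eq. (1).
    for x, r in zip(vectors, labels):
        count[r] += 1
        for d in range(dim):
            center[r][d] += x[d]
    for k in range(n_clusters):
        if count[k] > 0:
            center[k] = [s / count[k] for s in center[k]]

    for i, x in enumerate(vectors):
        r = labels[i]
        if count[r] < 2:  # Eq. (7) needs m_r >= 2
            continue
        decrease = count[r] / (count[r] - 1) * sq_dist(center[r], x)
        target, increase = None, decrease
        for j in range(n_clusters):
            if j == r or count[j] == 0:  # assumption: never fill an empty cluster
                continue
            cost = count[j] / (count[j] + 1) * sq_dist(center[j], x)
            if cost < increase:  # transfer condition; keep the smallest increase
                target, increase = j, cost
        if target is None:
            continue
        mr, mv = count[r], count[target]
        center[r] = [(mr * c - xi) / (mr - 1) for c, xi in zip(center[r], x)]  # Eq. (7)
        center[target] = [(mv * c + xi) / (mv + 1) for c, xi in zip(center[target], x)]  # Eq. (9)
        count[r], count[target] = mr - 1, mv + 1
        labels[i] = target
    return labels


# Hypothetical 1-D data in which the vector (2,) starts in the wrong cluster.
print(one_pass([(0,), (1,), (2,), (10,), (11,)], [0, 0, 1, 1, 1], 2))
```

Because the centers are updated immediately after each transfer, later vectors in the same pass are compared against the refreshed statistics.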

Studies were carried out for each of fifty different color images. Figure 2 shows how the sum of squares of the deviations of the vectors from the centers of the clusters, normalized to the maximum value, $E/E_{\max}$, depends on the number of iterations $n_t$. The reduction factor of $E$ changes from iteration to iteration, but the character of the dependence corresponds to that shown in the figure. It can be seen from the figure that increasing the number of iterations does not substantially reduce $E$.

The method of K-intragroup means converges locally. The clusterization efficiency depends on the initial partitioning. Increasing the number of iterations improves the accuracy of the partitioning mainly at the boundaries of the clusters, but does not substantially reduce the partitioning errors, since it produces no new clusters.

Based on the results of the study, we limit the number of iterations to $n_t = 1$.

NEW METHOD OF ANALYSIS

The new method of analysis is based on a unification of segmentation using threshold limitation and clusterization. The algorithm based on this method possesses the following advantages: It makes it possible to take into account both the spatial and the color characteristics of the image. The number of clusters is not predetermined but is computed during processing in accordance with the information contained in the image being processed. The color components of the image are transformed from RGB space to HLS space. A histogram of the distribution of the hue component $p(H)$ is estimated. An initial cluster formation is produced by quantizing the $p(H)$ histogram. The algorithm of the K-intragroup means is used with $n_t = 1$. To decrease the clusterization errors, additional partitioning of the clusters is carried out by segmentation over the histograms of the R, G, and B components obtained for each cluster by the method of threshold limitation. The algorithm of K-intragroup means is used for the refined set of clusters. To reduce the redundancy of the clusterization, hierarchical unification of the clusters is used.⁵

FIG. 2. Normalized intracluster errors vs the number of iterations.

FIG. 3. Automatic analysis algorithm. RGB→HLS is the unit for transforming color spaces; P(H), TL is the unit for estimating the histogram of the hue component and the threshold limitation in hue; K is the unit for carrying out the algorithm of K-intragroup means, where the number of iterations is unity; SCR is a unit for selecting connected regions, HE is a unit for estimating the distribution histograms, TL is a threshold-limitation and histogram-quantization unit; $M_d$, CU is the unit for estimating the Mahalanobis measure and for cluster unification.
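The text does not spell out how the hue histogram $p(H)$ is quantized, so the following Python sketch is only one plausible reading of "threshold limitation": bins whose count reaches a threshold are kept, and each run of consecutive kept bins becomes one initial cluster. The function name, the bin count `n_bins`, the threshold `min_count`, and the decision to ignore the wrap-around of hue at 360° are all assumptions.

```python
def initial_hue_clusters(hues, n_bins=36, min_count=1):
    """Label each hue (degrees in [0, 360)) by the run of populated bins it falls in.

    Returns (labels, number_of_clusters); pixels in bins below the
    threshold receive the label -1 (unassigned).
    """
    width = 360.0 / n_bins
    bins = [min(int(h / width), n_bins - 1) for h in hues]
    hist = [0] * n_bins
    for b in bins:
        hist[b] += 1

    # Threshold limitation: a new cluster starts wherever a run of
    # above-threshold bins begins. Assumption: no circular wrap-around.
    bin_label = [-1] * n_bins
    label = -1
    previous_on = False
    for b in range(n_bins):
        on = hist[b] >= min_count
        if on and not previous_on:
            label += 1
        if on:
            bin_label[b] = label
        previous_on = on
    return [bin_label[b] for b in bins], label + 1


# Hypothetical hue samples grouped around 0, 120, and 240 degrees.
print(initial_hue_clusters([0, 5, 8, 120, 125, 240]))
```

In this reading the number of initial clusters follows from the histogram itself rather than from a preset K, which matches the claim that the cluster count is not predetermined.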

To increase the efficiency of the algorithm at the stage of estimating the histograms of the distribution of the signal components, the spatial characteristic of the image is used along with the color characteristics. Namely, to reduce the influence of errors of the initial partitioning over the histogram of the hue component, the connected components are selected for each cluster, all the connected regions that do not exceed a certain given size are excluded from consideration, and the histogram is estimated only for the connected regions of the cluster exceeding this given size. When using the algorithm of K-intragroup means, we estimated the sum of the intracluster errors in RGB space in accordance with Eq. (2). We estimate the distance between the clusters as the square root of the sum of squares of the intercluster distances of the centers of mass of the components, given by Eq. (3):

$$ d_{ij} = \sqrt{(R_i - R_j)^2 + (G_i - G_j)^2 + (B_i - B_j)^2}, \tag{11} $$

where $R_i$, $G_i$, and $B_i$ are the RGB components of the center of mass of cluster $C_i$, and $R_j$, $G_j$, and $B_j$ are the RGB components of the center of mass of cluster $C_j$.
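The selection of connected regions described above can be sketched with a breadth-first flood fill over a label image. The function name `large_region_mask`, the use of 4-connectivity, and the list-of-lists image representation are assumptions; the source does not state which connectivity is used.

```python
from collections import deque


def large_region_mask(labels, cluster, min_size):
    """Return a boolean mask of the pixels of `cluster` that lie in
    4-connected regions containing more than `min_size` pixels."""
    rows, cols = len(labels), len(labels[0])
    seen = [[False] * cols for _ in range(rows)]
    keep = [[False] * cols for _ in range(rows)]
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0] or labels[r0][c0] != cluster:
                continue
            # Flood-fill one connected region starting at (r0, c0).
            region = []
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                region.append((r, c))
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < rows and 0 <= cc < cols
                            and not seen[rr][cc] and labels[rr][cc] == cluster):
                        seen[rr][cc] = True
                        queue.append((rr, cc))
            # Small regions are excluded from the histogram estimate.
            if len(region) > min_size:
                for r, c in region:
                    keep[r][c] = True
    return keep


# Hypothetical 3x4 label image; cluster 1 has regions of sizes 3, 1, and 2.
image = [[1, 1, 0, 1],
         [1, 0, 0, 0],
         [0, 0, 1, 1]]
print(large_region_mask(image, 1, 2))
```

Only pixels where the mask is true would then contribute to the per-cluster R, G, and B histograms.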

TABLE I. Number of clusters determined from the automatic-classification algorithm.

When we combine the clusters using the hierarchical unification algorithm, we shall also take into account the variance of the distribution density of the RGB components of the cluster. This can be done by using the Mahalanobis measure,⁶ described by

$$ \mathit{md}_{ij} = \frac{d_{ij}}{\sqrt{\sigma_i^2 + \sigma_j^2}}, $$

where $d_{ij}$ is determined in accordance with Eq. (11), and $\sigma_i^2$ and $\sigma_j^2$ are the variances of the distribution densities of the RGB components of clusters $C_i$ and $C_j$, respectively.

Figure 3 shows a diagram of the automatic analysis algorithm. Such an algorithm makes it possible to break up the image into clusters corresponding to visual perception as a consequence of the refinement of the clusters.
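The cluster distance of Eq. (11) and the Mahalanobis measure defined above can be written as two short Python functions. The function names are hypothetical, the variances are taken as given inputs because the text does not say how they are estimated, and the treatment of two zero-variance clusters is an assumption added to avoid division by zero.

```python
import math


def center_distance(ci, cj):
    """Eq. (11): Euclidean distance between two RGB centers of mass."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(ci, cj)))


def mahalanobis_measure(ci, var_i, cj, var_j):
    """Distance between centers divided by sqrt(var_i + var_j)."""
    d = center_distance(ci, cj)
    spread = math.sqrt(var_i + var_j)
    if spread == 0.0:
        # Assumption: noise-free clusters are infinitely separated
        # unless their centers coincide.
        return 0.0 if d == 0.0 else math.inf
    return d / spread


# Hypothetical clusters: centers 5 levels apart, variances 9 and 16.
print(mahalanobis_measure((200, 50, 50), 9.0, (203, 54, 50), 16.0))
```

A value near or below one means the centers are no farther apart than the combined spread of the two clusters, which is the situation in which unification is considered.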

RESULTS OF THE EXPERIMENTAL STUDIES

To estimate the efficiency of automatic image classification, we choose as a test image a synthesized image of eight color bands for two saturation values of 100% and 75%.

With automatic classification using the method developed here, the number of determined clusters equals eight, the error probability equals zero, and the variance of the intracluster errors equals zero.

Let us estimate the stability of the algorithm in the presence of additive white noise. To do this, we combine each RGB component of the original image with the values of a realization of a normally distributed random process with mathematical expectation equal to zero and an rms deviation $\sigma$ that varies in the range from 10 to 50 quantization levels (with eight-bit quantization of the components of the color image). Such a procedure superimposes color noise on the image. Let us estimate fifty noise realizations. The results of automatic classification are presented in Table I.
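The noise experiment can be reproduced in outline with the Python standard library. This is a sketch under assumptions: the function name `add_gaussian_noise`, the rounding to integer levels, and the clipping of results to the eight-bit range [0, 255] are not stated in the text.

```python
import random


def add_gaussian_noise(pixels, sigma, seed=None):
    """Add independent zero-mean Gaussian noise with rms deviation `sigma`
    (in quantization levels) to every component of every RGB pixel."""
    rng = random.Random(seed)
    noisy = []
    for pixel in pixels:
        noisy.append(tuple(
            # Assumption: round to integer levels and clip to [0, 255].
            min(255, max(0, int(round(component + rng.gauss(0.0, sigma)))))
            for component in pixel
        ))
    return noisy


# Hypothetical band colors; one realization per seed.
bands = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]
realizations = [add_gaussian_noise(bands, 10.0, seed=k) for k in range(50)]
print(realizations[0])
```

Using a separate seed for each realization makes a fifty-realization experiment repeatable.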



On the basis of the data presented in the table, it can be concluded that the number of clusters increases as the noise power increases. To reduce the classification error caused by the presence of noise, we use the algorithm of hierarchical unification of the clusters. Let us estimate the probability of classification error, defined as the sum of the false-alarm probability $P_a$ and the missing-the-target probability $P_d$, for a threshold value of the Mahalanobis measure of $M_d = 1.1$ (in this case, the number of clusters determined by using the classification algorithm equals eight). The data of the experimental studies are shown in Fig. 4.

Based on these data, it can be concluded that, to reduce the errors of automatic classification, it is necessary to estimate the Mahalanobis measure and to supplement the classification algorithm with a cluster-unification algorithm. For an image of color bands in a wide range of rms noise values for S1 and S2, the threshold value of the Mahalanobis measure can be assumed to be 1.1. With this threshold, classification yields eight clusters.
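Cluster unification with a Mahalanobis threshold can be sketched as a greedy pairwise merge. Several points here are assumptions rather than details from the text: the variance of a cluster is taken as $E/m$, the closest pair is merged first, merging stops once the smallest measure reaches the threshold, and the merged statistics follow the union formulas given earlier for disjoint clusters. The function name and the dictionary layout are hypothetical.

```python
import math


def merge_clusters(clusters, threshold=1.1):
    """Greedily merge clusters whose Mahalanobis measure is below `threshold`.

    Each cluster is a dict with keys 'center' (list), 'm' (size), 'E' (sum of squares).
    """
    clusters = [dict(c) for c in clusters]

    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    def measure(a, b):
        d2 = sq_dist(a['center'], b['center'])
        var = a['E'] / a['m'] + b['E'] / b['m']  # assumption: variance = E / m
        if var == 0.0:
            return 0.0 if d2 == 0.0 else math.inf
        return math.sqrt(d2 / var)

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                md = measure(clusters[i], clusters[j])
                if best is None or md < best[0]:
                    best = (md, i, j)
        md, i, j = best
        if md >= threshold:
            break
        a, b = clusters[i], clusters[j]
        m = a['m'] + b['m']
        merged = {
            # Union formulas for two disjoint clusters.
            'center': [(a['m'] * p + b['m'] * q) / m
                       for p, q in zip(a['center'], b['center'])],
            'm': m,
            'E': a['E'] + b['E'] + a['m'] * b['m'] / m * sq_dist(a['center'], b['center']),
        }
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters


# Hypothetical clusters: two close together and one far away.
result = merge_clusters([
    {'center': [100.0, 100.0, 100.0], 'm': 100, 'E': 10000.0},
    {'center': [105.0, 100.0, 100.0], 'm': 100, 'E': 10000.0},
    {'center': [200.0, 100.0, 100.0], 'm': 100, 'E': 10000.0},
])
print(len(result))
```

With noise-free clusters (zero variance) every measure is infinite, so nothing is merged, which is consistent with the eight bands of the synthesized test image remaining distinct.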

The smaller the color contrast of the objects of interest to us or of the object and the background, the smaller must be the threshold in terms of the Mahalanobis measure. Otherwise, the clusters that are to be recognized as different become combined. When it is difficult to obtain a preliminary estimate of the threshold, it is preferable to use automatic classification without using a cluster-unification unit. When necessary, the cluster-unification algorithm can be carried out in the interactive regime.

FIG. 4. Estimate of the efficiency of the analysis algorithm. (a) classification-error probability $P_{\mathrm{err}}$ vs rms deviation of the noise for S1; (b) classification-error probability $P_{\mathrm{err}}$ vs rms deviation of the noise for S2. Curves: 1, blue; 2, red; 3, purple; 4, white.

Based on this method, computer technologies have been developed for recognizing color images for nondestructive methods for the appraisal and restoration of paintings, processes for analyzing colored microscopic objects in medicine and engineering have been automated, and hardware and software have been developed for solving target-recognition problems. The main results of this work have been implemented as part of the MAGISOFT hardware–software package and have been introduced for solving practical problems.

CONCLUSIONS

A unified approach to the solution of the problem of analyzing color images has been formulated, consisting of the description of the color images in a color-coordinate space that makes it possible to use algorithms developed for the analysis of black-and-white images, with segmentation of the color images, as well as the use of the chromaticity characteristics as an additional feature for increasing the classification efficiency.

A method has been proposed, developed, and studied for automatically analyzing color television images. The method involves combining the histogram-quantization method and the clusterization method using K-intragroup means. The method makes it possible to use the information contained in the image to obtain a more flexible shape of the clusters without specifying their distribution laws a priori. It allows both the spatial and the color characteristics of the image to be taken into account. The number of clusters is not predetermined but is computed during the processing in accordance with the information contained in the image being processed. The boundaries of a segment are determined to within a raster element, unlike fragmenting methods, in which the accuracy with which the boundary is determined depends on the size of the fragment. The algorithm allows the feature space to be reduced to several tens of clusters.

¹R. Ohlander, K. Price, and D. R. Reddy, "Picture segmentation using a recursive region splitting method," Comp. Vis. Graph. Image Proc. 8, 313 (1978).

²M. Celenk, "A color clustering technique for image segmentation," Comp. Vis. Graph. Image Proc. 52, 145 (1990).

³T. Uchiyama and M. A. Arbib, "Color image segmentation using competitive learning," IEEE T-PAMI 16, 1197 (1994).

⁴R. N. Strickland, C.-S. Kim, and W. F. McDonnell, "Luminance, hue, and saturation processing of digital color images," Proc. SPIE 697, No. 9, 286 (1986).

⁵H. Späth, Cluster Analysis Algorithms for Data Reduction and Classification of Objects (Chichester, 1980).

⁶J. T. Tou and R. C. Gonzalez, Pattern Recognition Principles (Addison-Wesley, Reading, Mass., 1974; Mir, Moscow, 1972).

