
A NEW TECHNIQUE FOR COLOR QUANTIZATION

BASED ON HISTOGRAM ANALYSIS AND CLUSTERING

GIULIANA RAMELLA* and GABRIELLA SANNITI DI BAJA†

Institute of Cybernetics "E. Caianiello"

CNR, Via Campi Flegrei 34, 80078 Pozzuoli

Naples, Italy

*[email protected]

[email protected]

Received 5 July 2011

Accepted 8 April 2013

Published 22 May 2013

A technique for color quantization is described, which consists of two processes. The first process is based on the analysis of the histograms of the three color components of the RGB input image. The second process performs clustering of the colors quantized by the first process, based on their Euclidean distance. At the end of the second process, the output image is obtained by replacing the color of each pixel of the input image with the closest representative color. The obtained results are satisfactory from both the qualitative and the quantitative point of view.

Keywords: Color quantization; RGB color space; histogram analysis; clustering.

*Corresponding author.

International Journal of Pattern Recognition and Artificial Intelligence, Vol. 27, No. 3 (2013) 1360006 (17 pages)

© World Scientific Publishing Company

DOI: 10.1142/S0218001413600069

1. Introduction

Regions of a digital image whose pixels are characterized by color homogeneity can be interpreted as constituting (parts of) the objects present in the image. However, if a 24-bit true color image is considered, where the number of possible different colors may reach 16 million, the number of perfectly homogeneous regions in the image would most likely be noticeably larger than the number of (parts of) objects perceived in the image by a human observer. In fact, though the human visual system is able to distinguish a reasonably large number of colors, it generally groups colors with similar tonality, since even a few colors are often enough for image understanding. Thus, when working with digital images, color quantization is of interest. This is a process that, starting from the colors present in an input image, identifies and uses a reduced number of distinct colors in such a way as to produce a new version of the image that is still visually similar to the original image. This process is particularly useful for storage and transmission of multimedia data where millions of distinct colors are present, e.g. see Refs. 13, 22, 27 and 32.

As pointed out in Refs. 5 and 6, color quantization methods can be roughly divided into two categories: image independent and image dependent methods. The former methods [18,20] are generally efficient from the computational point of view, but the results are likely to be poor since the color palette is built without taking into account the distribution of colors in the input image. In turn, most of the image dependent methods are likely to provide rather good results and are generally preferred to the image independent methods even if they are more expensive. The most common approach in the framework of image dependent methods is based on clustering. In fact, RGB color images can be interpreted as constituted by pixels whose colors are a mixture of the primitive colors red, green and blue. Thus, color quantization can be seen as a clustering problem in the 3D space, where the coordinate axes are the color components and each point represents one of the colors in the image. By means of a clustering technique, points can be grouped into an a priori fixed number of clusters, each of which is associated with a representative color, generally obtained as the average of the points in the cluster [3,4,6,7,14–16,30,35]. The representative color associated with any cluster replaces all the colors of the input image that have been grouped in that cluster. Other color quantization methods are based on a subdivision of the color space. In this framework, the best known methods to build a colormap with an a priori fixed number of colors are the median cut algorithm [13], the octree color quantization algorithm [12], and Wu's algorithm [34]. Other quantization methods in the literature are based on histogram analysis [9,10,17,23,24], fuzzy logic [19,29], neural networks [2,11,21,25], and multi-resolution analysis [17,23,24,26,31].

Image dependent methods can be further divided into two categories: pre-clustering methods [9,12–14,16,30,34] and post-clustering methods [2,3,6,7,17,19,21,23,24]. The pre-clustering methods fix the desired maximum number of colors and determine the color palette only once by using features derived from the image at hand. The post-clustering methods define an initial palette of representative colors and subsequently improve it by resorting to an iterative process.

In this paper, we present a new image dependent technique for color quantization, QHC, which can be framed among the post-clustering methods. QHC consists of two processes, respectively dealing with the detection of an initial set of representative colors based on histogram analysis, and with an iterated grouping of representative colors sufficiently similar to each other. This paper is the follow-up to the work presented at the 15th Iberoamerican Congress on Pattern Recognition, CIARP 2010 [24], where the main focus was on the use of multi-resolution histogram analysis. In Ref. 24, starting from the full resolution input RGB image, a number of lower resolution representations of the image were computed by means of a scaling down interpolation method. The histograms of the three components (red, green and blue) of the full resolution input image, as well as the histograms of the three components


obtained at each lower resolution representation of the input image were computed.

For each color component, peaks and pits present in all the available histograms

were then detected. The histogram of each color component at full resolution was

simplified by retaining only peaks and pits that were present also in all the remaining

histograms at lower resolutions and that could be classified as dominating in the

histogram at full resolution. Peaks and pits remaining after histogram simplification

were used to establish a reduced number of representative values for each color

component. The image with quantized colors was then obtained by combining the

representative values of the three color components.

One of the advantages offered in Ref. 24 by the use of multi-resolution was the ability of the method to generate automatically a transformed image with a number of colors in the range established by the user. In fact, the number of resolution representations used during the process (and, hence, the number of histograms used to define the representative values) influences the number of final colors in the quantized image. In particular, the larger the number of resolution levels, the smaller the number of final colors. Thus, if the number of colors obtained by using a given number of resolution levels is not included in the range selected by the user, the process is automatically repeated by suitably increasing/decreasing the number of resolution levels to be taken into account, until the number of final colors is in the desired range.

As in Ref. 24, in this work we also start by analyzing and simplifying the structure of the histograms of the three color components of an input RGB image so as to identify a number of representative values for each component. Differently from Ref. 24, we do not resort to the use of multi-resolution image representation to generate an output image with a number of final colors in the desired range. This goal is instead reached by working on the list of colors generated after the three histograms have been processed. In practice, the method QHC described in this paper consists of two processes. The first process is based on the analysis of histograms of the three color components of the input RGB image. The number of colors detected during this step is generally remarkably smaller than the number of colors in the input image, but is likely to be larger than the number of colors desired by the user. The second process performs clustering of the colors quantized by the first process, based on their Euclidean distance. To this aim, colors resulting after the first process are recorded in decreasing occurrence order and are sequentially examined in the same order. Let $C_i$ be the color at hand. $C_i$ is taken as a representative color. Any successive color $C_j$, $j > i$, sufficiently close to $C_i$ is grouped with $C_i$ and will not be considered as a representative color itself. The second process is automatically repeated, with increasing tolerance on the distance within which colors can be grouped in the same cluster, as long as the number of obtained representative colors exceeds the maximum number of final colors desired by the user. At the end of the second process, the output image is obtained by replacing the color of each pixel of the input image with the closest representative color.


Our post-clustering method is computationally convenient. In fact, the number of initial colors that are determined by the first process is not a priori fixed and the process does not require iterations. The first process significantly reduces the number of colors with respect to those in the input image, so that the second process will generally require a small number of iterations to obtain the desired number of colors.

Our method is able to generate automatically a transformed image with a number of colors that does not exceed a given limit. Thus, different transformed images, each characterized by a different number of colors, can be obtained starting from the same input image by selecting different limits for the number of final colors.

QHC has been tested on a large number of RGB images of different size and color distribution. The obtained results indicate that QHC performs better than our previous method [24], both in terms of processing time and as regards the quality of the results. We have also evaluated the performance of QHC by comparing the results with those obtained by using other methods available in the literature [11–13,34], in terms of quantitative measures such as Peak Signal to Noise Ratio PSNR, Structural Similarity SSIM, Colorloss CL, and Compression Ratio CR [8,25,28].

The paper is organized as follows. Some preliminaries are given in Sec. 2; the

method is described in Sec. 3 and experimental results are discussed in Sec. 4.

Concluding remarks are finally given in Sec. 5.

2. Preliminaries

We work with RGB images and interpret colors as three-dimensional (3D) vectors,

with each vector element having an 8-bit dynamic range. The RGB color space can

be geometrically represented as a 3D cube, where the three coordinates of each point

are the red, green and blue components of that point in the color space, see Fig. 1.

Each of the three edges of the cube has length 256, since each color component may

assume any value in the range [0, 255].

Given a color image I, the histogram of colors could be represented by using the above 3D cube. In this case, the value in position (x, y, z) would account for the number of pixels in I whose color components have values x, y, and z, respectively. Of course, using this 3D histogram of colors would be computationally very expensive, since it generally consists of a large (sparse) set of points. For example, consider a color image I with size 1024 × 1024. For such an image, at most 1,048,576 different colors would be possible (of course, only if each color in I has occurrence equal to one), which is anyway only a small fraction (6.25%) of the total number of colors possible when each color component may assume 256 different values. Thus, most of the color quantization methods based on color distribution actually work on the histograms of the three individual color components of I. We also use this kind of independent analysis of the three histograms during the first process, though we are aware that some important information contained in the dependence among the channels may be lost.
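The arithmetic behind the 6.25% figure, and the idea of working on three independent 256-bin histograms instead of one 3D histogram, can be sketched as follows. This is only an illustration: the function name `channel_histograms` and the toy pixel list are assumptions made here and do not come from the paper.

```python
def channel_histograms(pixels):
    """Return three 256-bin histograms (R, G, B) for an iterable of (r, g, b) tuples."""
    hists = [[0] * 256 for _ in range(3)]
    for pixel in pixels:
        for k, value in enumerate(pixel):
            hists[k][value] += 1
    return hists


# Sparsity of the 3D histogram for a 1024 x 1024 image.
max_colors_in_image = 1024 * 1024                    # at most one distinct color per pixel
colors_in_rgb_cube = 256 ** 3                        # 16,777,216 possible colors
fraction = max_colors_in_image / colors_in_rgb_cube  # 0.0625, i.e. 6.25%

# Hypothetical toy image: four pixels given as (r, g, b) tuples.
toy_pixels = [(255, 0, 0), (255, 0, 0), (0, 255, 0), (0, 0, 255)]
h_r, h_g, h_b = channel_histograms(toy_pixels)
```

The three lists together hold 3 × 256 counters, whereas a dense 3D histogram would need one counter per point of the cube.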


For a color image I, let K stand for any of the three components of I (the red component R, the green component G, or the blue component B) and let us denote by $H_K$ the histogram of the values in the color component K. For each value p of K, with p in the range [0, 255], the height of the corresponding bin in $H_K$ shows the occurrence of p, i.e. accounts for the number of pixels of K with value p.

Peaks and pits of $H_K$ are detected as the values for which the height of the corresponding bin is locally maximal and locally minimal, respectively. Formally, if $p-1$, $p$ and $p+1$ are three consecutive values, and $h(p)$ is the height of the bin corresponding to p in the color component K, then:

p is a peak of $H_K$ if

$$h(p-1) \le h(p) \text{ and } h(p+1) < h(p), \quad \text{or} \quad h(p-1) < h(p) \text{ and } h(p+1) \le h(p);$$

p is a pit of $H_K$ if

$$h(p) \le h(p-1) \text{ and } h(p) < h(p+1), \quad \text{or} \quad h(p) < h(p-1) \text{ and } h(p) \le h(p+1).$$
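The two conditions above translate almost literally into code. The sketch below is a minimal illustration, assuming the histogram is given as a list of bin heights; the function name `find_peaks_and_pits` is hypothetical, and the choice to examine only interior values (so that both neighbors exist) is an assumption, since the text does not say how the end values 0 and 255 are treated.

```python
def find_peaks_and_pits(h):
    """Classify interior histogram values as peaks or pits.

    h is a list of bin heights; only values with both neighbors are examined
    (an assumption, the handling of the two end bins is not specified).
    """
    peaks, pits = [], []
    for p in range(1, len(h) - 1):
        left, mid, right = h[p - 1], h[p], h[p + 1]
        if (left <= mid and right < mid) or (left < mid and right <= mid):
            peaks.append(p)
        elif (mid <= left and mid < right) or (mid < left and mid <= right):
            pits.append(p)
    return peaks, pits


# Hypothetical six-bin histogram used only to exercise the definitions.
example_peaks, example_pits = find_peaks_and_pits([0, 3, 1, 1, 4, 2])
```

Note that a flat run of equal heights bounded by higher bins yields a pit at each end of the run, which is one reason why peaks and pits need not alternate.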

Peaks and pits can be interpreted as vertices of a polygonal approximation of $H_K$ and can be used to identify the representative values for the color component K by replacing all values from a pit to the successive pit of $H_K$ with a single representative value. Actually, to further reduce the number of representative values for K, we perform a process aimed at disregarding a number of peaks and pits of $H_K$, while still providing a reasonably good polygonal approximation of $H_K$. To this aim, all initially detected peaks and pits of $H_K$ are taken as candidate vertices among which to

Fig. 1. A 3D cube geometrically representing the RGB color space, with axes R, G and B and vertices Black (0, 0, 0), Red (255, 0, 0), Green (0, 255, 0), Blue (0, 0, 255), Yellow (255, 255, 0), Magenta (255, 0, 255), Cyan (0, 255, 255) and White (255, 255, 255).


select the final vertices. Three parameters, $a_i$, $c_i$, and $d_i$, are associated with each candidate vertex $v_i$, where $a_i$ is the area of the triangle with vertices $v_{i-1}$, $v_i$ and $v_{i+1}$; $c_i$ is the cosine of the angle formed by the two line segments respectively delimited by the two vertices $v_i$ and $v_{i-1}$ and by the two vertices $v_i$ and $v_{i+1}$; $d_i$ is the distance of $v_i$ from the straight line joining $v_{i-1}$ and $v_{i+1}$.
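As a rough illustration of these three quantities, the sketch below computes them for one vertex and its two neighbors, each given as a point (value, bin height). The function name `vertex_parameters` and the point representation are assumptions made for this example; the formulas are the standard ones for triangle area, angle cosine and point-to-line distance.

```python
import math


def vertex_parameters(prev, cur, nxt):
    """Return (area, cosine, distance) for vertex cur with neighbors prev and nxt.

    Each argument is an (x, y) point, e.g. (value, bin height). The three points
    are assumed to have distinct x, so no segment has zero length.
    """
    ux, uy = prev[0] - cur[0], prev[1] - cur[1]   # vector from cur to prev
    vx, vy = nxt[0] - cur[0], nxt[1] - cur[1]     # vector from cur to nxt
    cross = ux * vy - uy * vx
    area = abs(cross) / 2.0                       # area of the triangle prev, cur, nxt
    cosine = (ux * vx + uy * vy) / (math.hypot(ux, uy) * math.hypot(vx, vy))
    base = math.hypot(nxt[0] - prev[0], nxt[1] - prev[1])
    distance = abs(cross) / base                  # distance of cur from line prev-nxt
    return area, cosine, distance


# Hypothetical vertex: a peak of height 2 between two bins of height 0.
a_i, c_i, d_i = vertex_parameters((0, 0), (1, 2), (2, 0))
```

For this toy triangle the base has length 2 and the height is 2, so the area and the distance are both 2 and the cosine is 3/5.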

The above three parameters, shown in Fig. 2(a), were introduced in Ref. 1 in the framework of two-dimensional (2D) object contour analysis to define the dominance of a vertex by taking into account factors influencing human perception, such as the size of the region viewed by a vertex $v_i$ (roughly related to the area $a_i$ of a protrusion or an intrusion of the object) and the "cornerity" (roughly related to $c_i$) of the contour arc surrounding $v_i$. In turn, $d_i$ is related to a frequently used measure of perceptual significance, defined as the point-wise error between a contour arc and the corresponding side in the approximating polygon.

Let $v_1, v_2, \ldots, v_n$ be the candidate vertices, i.e. all the peaks and pits detected in $H_K$, and denote by A the arithmetic mean of the areas $a_i$ of the triangles associated with all vertices $v_i$, $i = 1, \ldots, n$. Then, candidate vertices are sequentially inspected. Let $\alpha$ be a weight, whose value is fixed depending on the tolerance for polygonal approximation. Any two consecutive candidate vertices of $H_K$, say $v_i$ and $v_{i+1}$, are not retained as final vertices if: (i) both $a_i < \alpha A$ and $a_{i+1} < \alpha A$, and (ii) $a_i = a_{i+1}$, or $c_i = c_{i+1}$, or $d_i = d_{i+1}$. Moreover, any candidate vertex $v_i$ of $H_K$ which is no longer a relative maximum or a relative minimum of $H_K$ due to removal of neighboring candidate vertices is not retained as a vertex.

As an example of the performance of polygonal approximation, see Fig. 2(b) right.

To evaluate the performance of color quantization algorithms, the most commonly used measures are the Peak Signal to Noise Ratio PSNR, the Structural SIMilarity SSIM, the Colorloss CL, and the Compression Ratio CR.

For gray-level images, PSNR is computed as follows:

$$\mathrm{PSNR} = 20 \log_{10}\left(\frac{255}{\sqrt{\mathrm{MSE}}}\right),$$

Fig. 2. (a) The three parameters $a_i$, $c_i$, $d_i$ associated with the vertex $v_i$; (b) a histogram, left, and its polygonal approximation, right (color online).


where

$$\mathrm{MSE} = \frac{1}{H \times K} \sum_{i=1}^{H} \sum_{j=1}^{K} (v_{i,j} - w_{i,j})^2$$

and $v_{i,j}$ and $w_{i,j}$ respectively belong to the input image and to the output image of size $H \times K$.

For RGB images, where there are three values per pixel, the definition of PSNR is still the same, but MSE is the sum of all squared value differences divided by the image size and by three.

The more similar the original image and the quantized image are, the smaller MSE is. As a consequence, PSNR increases when the similarity between input and output images increases, reaching infinity if the two images are identical.
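A minimal sketch of the RGB variant of these formulas is given below. The function names `mse_rgb` and `psnr`, and the representation of an image as a flat list of (r, g, b) tuples, are assumptions chosen for brevity.

```python
import math


def mse_rgb(original, quantized):
    """Mean squared error over all three channels of two equally sized RGB images.

    Both images are flat lists of (r, g, b) tuples; the sum of squared value
    differences is divided by the image size and by three.
    """
    total = 0
    for (r0, g0, b0), (r1, g1, b1) in zip(original, quantized):
        total += (r0 - r1) ** 2 + (g0 - g1) ** 2 + (b0 - b1) ** 2
    return total / (3 * len(original))


def psnr(mse):
    """PSNR in dB for 8-bit data; infinite when the two images are identical."""
    if mse == 0:
        return math.inf
    return 20 * math.log10(255 / math.sqrt(mse))


# Hypothetical two-pixel images differing by 3 in a single blue value.
toy_original = [(10, 20, 30), (40, 50, 60)]
toy_quantized = [(10, 20, 33), (40, 50, 60)]
toy_mse = mse_rgb(toy_original, toy_quantized)   # 9 / (3 * 2) = 1.5
toy_psnr = psnr(toy_mse)
```

The explicit check for a zero MSE avoids a division by zero and returns the infinite PSNR mentioned above for identical images.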

The SSIM index for two gray-level images v and w is computed as follows:

$$\mathrm{SSIM}(v,w) = \frac{(2\mu_v\mu_w + c_1)(2\,\mathrm{cov}_{vw} + c_2)}{(\mu_v^2 + \mu_w^2 + c_1)(\sigma_v^2 + \sigma_w^2 + c_2)},$$

where

$\mu_v$ is the average of v;

$\mu_w$ is the average of w;

$\sigma_v^2$ is the variance of v;

$\sigma_w^2$ is the variance of w;

$\mathrm{cov}_{vw}$ is the covariance of v and w;

$c_1 = (k_1 L)^2$, $c_2 = (k_2 L)^2$ are two variables to stabilize the division with weak denominator;

L is the dynamic range of the pixel values (255 for 8-bit images);

$k_1 = 0.01$ and $k_2 = 0.03$ are default values.

The SSIM index is computed by using a sliding window approach. The window size is fixed to 8 × 8, as suggested in Ref. 33. The sliding window moves pixel-by-pixel from the top-left corner to the bottom-right corner of the image, and the SSIM index is computed within the sliding window. As a result, an SSIM index map of the image is obtained, and the overall quality value is defined as the average of the SSIM index map, i.e. the mean SSIM index. The SSIM value is in the range [0,1], where higher values denote better structural similarity. For RGB images, the SSIM index is computed for the three channel components independently and the quality value is obtained as the average of the three indexes.
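The SSIM formula for a single window can be sketched as below. This is an illustration only: the function name `ssim_window` is hypothetical, the window is passed as a flat list of gray levels, and population statistics (division by the number of samples, no Gaussian weighting) are an assumption, since the text does not fix the estimator.

```python
def ssim_window(v, w, L=255, k1=0.01, k2=0.03):
    """SSIM index of two equally sized windows given as flat lists of gray levels.

    Means, variances and covariance are plain population statistics
    (an assumption; weighted estimates are also used in practice).
    """
    n = len(v)
    mu_v = sum(v) / n
    mu_w = sum(w) / n
    var_v = sum((x - mu_v) ** 2 for x in v) / n
    var_w = sum((y - mu_w) ** 2 for y in w) / n
    cov = sum((x - mu_v) * (y - mu_w) for x, y in zip(v, w)) / n
    c1 = (k1 * L) ** 2
    c2 = (k2 * L) ** 2
    numerator = (2 * mu_v * mu_w + c1) * (2 * cov + c2)
    denominator = (mu_v ** 2 + mu_w ** 2 + c1) * (var_v + var_w + c2)
    return numerator / denominator


# Hypothetical 2 x 2 windows flattened row by row.
window_a = [10, 20, 30, 40]
window_b = [12, 18, 33, 41]
same = ssim_window(window_a, window_a)
close = ssim_window(window_a, window_b)
```

A mean SSIM index would then be obtained by evaluating this function on every window position and averaging the results, and for RGB images by averaging the three per-channel values.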

The computation of SSIM has been done in this paper by using the code developed for OpenCV in C++ by Rabah Mehdi [36], where an 11 × 11 Gaussian weighting function is used to compute average, variance and covariance of the images.

The Colorloss CL is a quantitative measurement of the loss of color information caused by quantization. CL is defined as the Euclidean color distance between a pixel in the original image and the corresponding pixel in the quantized image. The larger the


colorloss, the greater is the loss in color information. Let I consist of N pixels, and let the RGB values of a pixel p be $(r_p, g_p, b_p)$. Let $I'$ be the quantized image and let $(r'_p, g'_p, b'_p)$ be the color components of the pixel corresponding to p in $I'$. The colorloss between the images I and $I'$ is defined as follows:

$$\mathrm{CL}(I, I') = \frac{\sum_{p=1}^{N} \sqrt{(r_p - r'_p)^2 + (g_p - g'_p)^2 + (b_p - b'_p)^2}}{N}.$$
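The colorloss formula is a per-pixel Euclidean distance averaged over the image, which can be sketched as follows. The function name `colorloss` and the flat-list image representation are assumptions; `math.dist` requires Python 3.8 or later.

```python
import math


def colorloss(original, quantized):
    """Average Euclidean RGB distance between corresponding pixels.

    Both images are flat lists of (r, g, b) tuples of the same length N.
    """
    total = sum(math.dist(p, q) for p, q in zip(original, quantized))
    return total / len(original)


# Hypothetical two-pixel example: the first pixel moves by distance 5, the second is unchanged.
toy_cl = colorloss([(0, 0, 0), (10, 10, 10)], [(3, 4, 0), (10, 10, 10)])
```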

The compression ratio CR denotes the percentage of the original size of the image that results after compression. CR is computed as the ratio between the size of the output stream and that of the input stream, expressed in bits per pixel (bpp) [28].

3. The QHC Algorithm

QHC is a post-clustering method and as such consists of two processes, respectively aimed at the detection of an initial set of representative colors, and at the refinement of this set in order to achieve the desired number of final representative colors. The first process is based on histogram analysis. In principle, one could resort to a simple quantization of the color histogram by dividing the 3D cube into a number of equally sized smaller cubes and by suitably selecting representative colors for each of them. However, a subdivision of the color space into equal volumes does not generally produce good results. On the other hand, working with the 3D histogram of the input image, which is generally characterized by a large number of sparse colors, is computationally expensive. Thus, we prefer to resort to the analysis of the histograms of the three color components of I. The aim of this process is to reduce significantly the number of colors, while maintaining a high degree of similarity with the input image.

Starting from the input color image I, the three color components R, G and B are considered. For each of them, say K, the histogram $H_K$ is generated and the corresponding polygonal approximation is computed as described in Sec. 2.

Once the structure of $H_K$ has been simplified, the representative values for the color component K are computed as follows. All values from a pit to the successive pit of $H_K$ are replaced by a single representative value. In general, only one peak is expected between two successive pits. However, due to the fact that relative local maxima and relative local minima are detected and due to the histogram simplification process, peaks and pits do not always regularly alternate. Thus, more than one peak, or no peak at all, could exist between two successive pits. Therefore, three cases are possible and the following three criteria are used to select the representative value:

(1) If exactly one peak exists between two successive pits, the representative value is

the value of the peak.

(2) If more than one peak exists in between two successive pits, the representative

value is the value of the leftmost peak.

(3) Otherwise, the representative value is the value of the leftmost pit.
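The three criteria can be sketched in a few lines, since cases (1) and (2) both amount to taking the leftmost peak lying between the two pits. The function name `representative_values` and its return format are assumptions made for this example. The sketch covers only intervals between two successive pits; how values below the first pit or above the last pit are handled is not stated in the text and is left out.

```python
def representative_values(pits, peaks):
    """Return a list of (left_pit, right_pit, representative) triples.

    pits and peaks are the values retained after histogram simplification.
    For each pair of successive pits the representative is the leftmost peak
    strictly between them, or the leftmost pit if there is no such peak.
    """
    pits = sorted(pits)
    peaks = sorted(peaks)
    result = []
    for lo, hi in zip(pits, pits[1:]):
        inside = [p for p in peaks if lo < p < hi]
        rep = inside[0] if inside else lo
        result.append((lo, hi, rep))
    return result


# Hypothetical pits and peaks covering the three cases: one peak, no peak, two peaks.
toy_reps = representative_values(pits=[10, 40, 60, 90], peaks=[25, 70, 80])
```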


After the histogram of each color component has been simplified by retaining only a subset of the initially detected peaks and pits and the representative values have been computed for each color component, the quantized image $I'$ is obtained by combining the representative values of the three color components. If the number of colors of $I'$ is not larger than the maximum number of final colors desired by the user, the process terminates. Otherwise, the second process is performed.

We note that the number of colors resulting at the end of the first process depends on the combination of the representative values for the three color components as well as on the number of vertices remaining in each of the three histograms after their simplification. In turn, the number of vertices remaining after histogram simplification depends on the value of the multiplicative weight $\alpha$ used to compare the area $a_j$ of the triangle associated with the current candidate vertex $v_j$ with the arithmetic mean A of the areas of the triangles associated with all candidate vertices $v_i$, $i = 1, \ldots, n$. We have used for $\alpha$ different values satisfying the condition $0 < \alpha \le 1$. The closer $\alpha$ is to 1, the smaller the number of vertices is. However, for $\alpha$ very close to 1, polygonal approximation provides a less faithful representation of the histogram. We have experimentally found that, on average, the best compromise between number of vertices and faithfulness of the polygonal approximation is obtained by taking $\alpha = 0.5$. Accordingly, we recommend using $\alpha = 0.5$ as the default value.

The second process reduces the number of colors obtained at the end of the

first process to a number smaller than or equal to the maximum number of colors

desired by the user. The goal is reached by clustering colors based on their Euclidean

distance.

Representative colors determined by the first process and recorded in a list are sequentially analyzed to decide whether they should be taken as final representative colors or as members of clusters associated with other already selected final representative colors. Once the proper decision is taken for a color, that color is suitably marked in the list so as to avoid considering it again either as a new final representative color or as a member of a cluster associated with another final representative color. More in detail, colors resulting from the first process are initially sorted in decreasing occurrence order, and the minimal distance $\delta$ among colors is computed. At this stage, all colors are still unmarked, so that each of them can be processed. The currently processed color $C_i$ is taken as a new final representative color and is marked. Each successive unmarked color $C_j$, $j > i$, whose distance from $C_i$ is smaller than $\delta$, is taken as a member of the cluster associated with $C_i$ and is marked.

The number of clusters obtained after all representative colors have been processed is compared with the maximum number of final colors desired by the user. If the obtained final representative colors are still too many, the second process is repeated, starting from the colors provided by the first process. To reduce the number of final representative colors, a larger tolerance on distance is obviously


necessary to group more colors in clusters. Thus, at each repetition of the second process, the value of $\delta$ used during the previous application of the process is increased by 1. The second process is repeated until the obtained number of final representative colors is not larger than the desired maximum number of colors. The number of repetitions of the second process depends on the number of colors provided by the first process and on the desired number of final colors.

At the end of the second process, the output image is obtained by replacing the

color of each pixel of I with the closest representative color.
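The second process, as described in this section, can be sketched as a greedy marking loop that is restarted with a tolerance increased by 1 until the number of representatives is small enough. The code below is a toy illustration under stated assumptions, not the authors' implementation: the function names `cluster_colors` and `closest_representative`, the dictionary input mapping each color to its occurrence, and the brute-force computation of the minimal distance are all choices made here.

```python
import math


def cluster_colors(color_counts, max_colors):
    """Greedy clustering of colors by Euclidean distance.

    color_counts maps (r, g, b) tuples to their occurrence; max_colors >= 1.
    Returns the list of final representative colors.
    """
    if max_colors < 1:
        raise ValueError("max_colors must be at least 1")
    # Colors are examined in decreasing occurrence order.
    colors = sorted(color_counts, key=color_counts.get, reverse=True)
    if len(colors) <= max_colors:
        return colors
    # Initial tolerance: minimal distance among colors (brute force, toy sizes only).
    delta = min(math.dist(a, b) for i, a in enumerate(colors) for b in colors[i + 1:])
    while True:
        marked = [False] * len(colors)
        representatives = []
        for i, ci in enumerate(colors):
            if marked[i]:
                continue
            marked[i] = True
            representatives.append(ci)
            for j in range(i + 1, len(colors)):
                if not marked[j] and math.dist(ci, colors[j]) < delta:
                    marked[j] = True      # colors[j] joins the cluster of ci
        if len(representatives) <= max_colors:
            return representatives
        delta += 1                        # larger tolerance, restart from all colors


def closest_representative(color, representatives):
    """Return the representative color nearest to the given color."""
    return min(representatives, key=lambda r: math.dist(color, r))


# Hypothetical list of colors with their occurrences.
toy_counts = {(0, 0, 0): 10, (1, 0, 0): 5, (100, 100, 100): 8, (103, 100, 100): 2}
toy_reps = cluster_colors(toy_counts, max_colors=2)
```

In this toy run the tolerance grows from 1 to 4 before the two pairs of nearby colors collapse into two clusters. The output image would then be built by calling `closest_representative` on the color of each input pixel.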

4. Experimental Results

We have applied the color quantization algorithm QHC to a collection of images with different size and color distribution, taken from available databases [37–39]. The small dataset of sixteen images in Fig. 3 is used to show the performance of our method in terms of the obtained number of representative colors RC, and of the quantitative measures PSNR, SSIM, CL and CR.

Table 1 summarizes the results obtained at the end of the first process, performed with $\alpha = 0.5$ for all test images. For each test image, its size, the number of colors of the input image IC, the number of representative colors resulting after the first process RC1, the Peak Signal to Noise Ratio PSNR, the Structural SIMilarity SSIM, the Colorloss CL, and the Compression Ratio CR are reported. As already pointed out, a smaller number of colors RC1 could be obtained at the end of the first process by selecting a larger value for $\alpha$. However, such a choice would produce resulting images with lower similarity to the original images. Taking also into consideration that the computational cost of the second process is rather small, we prefer to use $\alpha = 0.5$ during the first process so as to have better quality results.

Table 2 summarizes the results at the end of the second process and compares the performance of QHC with that of other methods in the literature, namely with the Median Cut MC [13], the Octree OT [12], the method W by Wu [34], and the method NQ by Dekker [11]. In particular, as concerns NQ, we have set the sampling factor to 10.

Fig. 3. Test images: baboon, barbara, bike, cablecar, colors, cornfield, flower1, flower2, fruits, housed, kodim14, kodim15, lake, monarch, soccer, yacht.


This choice is motivated by the fact that with such a sampling factor the speed of

the algorithm is reasonably increased and the quality of the results is only slightly

reduced.

The best values (i.e. the maximal values for PSNR and SSIM, and the minimal values for RC, CL and CR) are in bold. For the comparison, we have set the maximum number of desired final colors to 256. The arithmetic means computed on the sixteen test images for the values of PSNR, SSIM and CL are respectively 32.114, 0.897 and 9.338 for MC; 33.187, 0.908 and 8.914 for OT; 33.957, 0.936 and 7.846 for W; 33.626, 0.938 and 7.650 for NQ; and 33.868, 0.921 and 7.974 for QHC. We point out that the above average values for QHC computed on the sixteen test images do not significantly differ from the arithmetic means computed on the whole set of images that we have used.

It can be observed that QHC always compares favorably with respect to MC and

OT, while in some cases it performs worse than W or NQ. However, we remark that

when the performance of W or NQ is better than the performance of QHC in terms of

PSNR or SSIM, the number of representative colors RC characterizing our method is

generally smaller than the number of colors quantized by W or NQ.

An interesting feature of QHC is the possibility for the user to set a different limit on the maximum number of final colors for the quantized image. As an example, refer to Fig. 4, where for the test images barbara, cablecar, flower1, fruits and kodim15 four different limits have been used for the maximum number of final representative colors, namely 128, 256, 384 and 512.

For the test images for which QHC generates the results shown in Fig. 4, the number of final representative colors in the corresponding output images and the resulting PSNR and SSIM are given in Table 3, where the limits on the maximal number of final representative colors are indicated between brackets.

Table 1. Results after the application of the first process of QHC.

Image      Size       IC      RC1    PSNR    SSIM   CL     CR
baboon     512 × 512  171045  26289  36.622  0.976  4.924  0.845
barbara    480 × 384  156234  4028   35.417  0.964  6.587  0.694
bike       780 × 916  206155  28048  36.936  0.955  4.697  0.837
cablecar   512 × 480  130416  17945  40.931  0.985  3.169  0.832
colors     432 × 288  97465   12509  33.347  0.909  7.177  0.821
cornfield  512 × 480  134514  17836  39.163  0.987  3.370  0.829
flower1    512 × 480  111841  10310  39.287  0.974  3.620  0.795
flower2    512 × 512  178778  10787  35.501  0.960  5.380  0.768
fruits     512 × 480  160476  21724  38.526  0.978  3.153  0.833
housed     512 × 512  154605  2369   30.749  0.932  9.496  0.650
kodim14    768 × 512  55117   13239  37.960  0.981  3.821  0.869
kodim15    768 × 512  44576   10107  39.733  0.976  3.461  0.861
lake       512 × 512  168459  8315   34.625  0.955  6.473  0.750
monarch    480 × 320  90403   2579   34.976  0.953  6.449  0.688
soccer     512 × 480  139156  19177  40.155  0.987  3.293  0.833
yacht      512 × 480  150053  25398  41.261  0.984  2.996  0.851


Table 2. Comparisons of the performances of QHC, MC, OT, W and NQ.

      QHC     MC      OT      W       NQ                QHC     MC      OT      W       NQ
baboon                                            fruits
RC    249     256     256     256     256         RC    234     256     256     256     255
SSIM  0.926   0.908   0.918   0.937   0.938       SSIM  0.899   0.859   0.870   0.907   0.910
PSNR  31.679  30.118  31.024  31.677  31.559      PSNR  34.346  32.579  32.872  34.205  34.038
CL    10.852  12.593  11.788  10.468  10.362      CL    7.505   9.501   9.208   7.723   7.518
CR    0.458   0.460   0.460   0.460   0.460       CR    0.455   0.463   0.463   0.463   0.462
barbara                                           housed
RC    243     256     252     256     256         RC    253     256     256     256     255
SSIM  0.927   0.889   0.897   0.932   0.922       SSIM  0.926   0.907   0.925   0.943   0.946
PSNR  33.060  30.609  31.488  32.743  32.207      PSNR  32.521  31.862  33.463  34.066  33.972
CL    8.987   11.982  11.110  9.166   9.394       CL    8.529   10.157  8.396   7.713   7.416
CR    0.459   0.464   0.462   0.464   0.464       CR    0.463   0.464   0.464   0.464   0.464
bike                                              kodim14
RC    244     256     256     256     250         RC    231     256     251     256     248
SSIM  0.883   0.853   0.860   0.912   0.919       SSIM  0.950   0.932   0.952   0.964   0.967
PSNR  32.008  30.021  31.198  32.920  32.490      PSNR  35.006  33.614  35.399  35.202  34.883
CL    9.154   11.894  10.736  8.114   7.797       CL    6.626   8.103   6.626   6.590   6.207
CR    0.449   0.453   0.453   0.453   0.451       CR    0.499   0.508   0.506   0.508   0.505
cablecar                                          kodim15
RC    247     256     256     256     238         RC    225     256     252     256     246
SSIM  0.931   0.913   0.917   0.947   0.950       SSIM  0.938   0.907   0.920   0.949   0.958
PSNR  34.376  32.466  33.049  34.152  33.529      PSNR  36.153  33.414  35.205  35.821  35.705
CL    7.309   9.089   8.543   7.408   7.189       CL    5.713   8.278   6.750   6.188   5.570
CR    0.468   0.471   0.471   0.471   0.465       CR    0.506   0.518   0.517   0.518   0.514
colors                                            lake
RC    240     256     256     256     256         RC    224     256     256     256     254
SSIM  0.862   0.859   0.842   0.895   0.891       SSIM  0.914   0.875   0.900   0.934   0.934
PSNR  31.402  29.866  30.450  32.030  30.650      PSNR  33.465  31.769  32.924  33.589  32.986
CL    10.682  12.200  11.899  9.474   10.367      CL    8.389   10.374  9.167   8.224   8.046
CR    0.477   0.483   0.483   0.483   0.483       CR    0.450   0.461   0.461   0.460   0.460
cornfield                                         monarch
RC    222     256     254     256     242         RC    222     256     256     256     246
SSIM  0.949   0.932   0.932   0.961   0.965       SSIM  0.936   0.919   0.935   0.950   0.953
PSNR  34.618  32.753  33.118  34.465  34.146      PSNR  34.207  33.943  35.235  35.120  34.877
CL    7.070   9.062   8.800   7.232   6.961       CL    7.222   8.085   6.984   6.897   6.663
CR    0.457   0.470   0.469   0.470   0.465       CR    0.473   0.486   0.486   0.486   0.482
flower1                                           soccer
RC    247     256     255     256     245         RC    226     256     256     256     250
SSIM  0.932   0.896   0.917   0.939   0.943       SSIM  0.946   0.930   0.938   0.960   0.964
PSNR  36.715  34.646  35.916  35.639  35.809      PSNR  34.537  32.531  33.584  34.361  34.095
CL    5.675   7.405   6.430   6.594   6.054       CL    7.535   9.270   8.315   7.378   7.135
CR    0.474   0.477   0.477   0.477   0.473       CR    0.458   0.468   0.468   0.468   0.466
flower2                                           yacht
RC    243     256     252     256     256         RC    237     256     256     256     253
SSIM  0.896   0.861   0.880   0.902   0.898       SSIM  0.918   0.908   0.917   0.937   0.944
PSNR  33.760  30.897  32.691  33.488  33.396      PSNR  34.029  32.735  33.368  33.839  33.667
CL    8.303   11.397  9.435   8.472   8.162       CL    8.038   9.072   8.435   7.892   7.565
CR    0.454   0.459   0.457   0.459   0.459       CR    0.459   0.465   0.465   0.465   0.464


Fig. 4. Each row shows, from left to right, the four quantized images obtained by limiting to 128, 256, 384 and 512 the number of final representative colors.

Table 3. Performance of QHC by setting to 128, 256, 384 and 512 the maximum number of representative colors.

Image     RC(128)  PSNR    SSIM   RC(256)  PSNR    SSIM   RC(384)  PSNR    SSIM   RC(512)  PSNR    SSIM
barbara   115      31.001  0.888  243      36.060  0.927  346      33.871  0.939  393      34.174  0.944
cablecar  121      32.171  0.903  247      34.376  0.931  364      35.412  0.945  510      36.336  0.957
flower1   123      38.844  0.899  247      36.715  0.932  330      37.327  0.940  419      37.713  0.947
fruits    120      32.491  0.860  234      34.346  0.899  340      35.126  0.913  469      35.956  0.929
kodim15   121      34.354  0.910  225      36.153  0.938  330      37.086  0.950  470      37.847  0.961

The qualitative performance of QHC on a few more images with different size and color distribution can be appreciated with reference to Fig. 5, which shows the input images (odd rows) and the resulting quantized versions (even rows), obtained by setting to 128 the limit on the maximal number of final representative colors. For each image, the number of colors in the initial and in the quantized image (bold) and the size (in brackets) are also indicated.

Fig. 5. Results of the application of QHC by setting to 128 the maximal number of representative colors. First group of four images, from left to right: colors in the input images 47819, 96395, 18764, 73580; sizes (512 × 512), (720 × 576), (481 × 321), (512 × 512); colors in the quantized images 105, 123, 100, 113. Second group of four images: colors in the input images 77426, 36175, 197410, 43365; sizes (480 × 320), (340 × 256), (787 × 576), (344 × 340); colors in the quantized images 118, 113, 126, 124.

5. Concluding Remarks

The color quantization algorithm QHC has been described. Starting from an input color image, it generates a quantized image with a smaller number of colors that still satisfactorily maintains the visual aspect of the input image. The algorithm consists of two successive processes. The first process is based on the analysis of the histograms of the three color components of the RGB input image. The second process further reduces the number of representative colors by clustering colors based on their Euclidean distance. At the end of the second process, the output image is obtained by replacing the color of each pixel of the input image with the closest representative color. QHC is able to produce different quantized images with a different number of representative colors, limited to a maximum fixed by the user.
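The last step recalled above, replacing each pixel color with the closest representative color in terms of Euclidean distance, can be illustrated with a minimal sketch. This is not the QHC implementation: the palette below is a hypothetical hand-picked one rather than the output of the two processes, and the function names are illustrative.

```python
def squared_distance(c1, c2):
    """Squared Euclidean distance between two (R, G, B) tuples."""
    return sum((a - b) ** 2 for a, b in zip(c1, c2))


def closest_representative(color, palette):
    """Return the palette entry nearest to `color`.

    Comparing squared distances gives the same ordering as comparing
    Euclidean distances, so the square root is not needed.
    """
    return min(palette, key=lambda rep: squared_distance(color, rep))


def map_to_palette(pixels, palette):
    """Replace every pixel color with its closest representative color."""
    cache = {}  # each distinct input color is resolved only once
    output = []
    for color in pixels:
        if color not in cache:
            cache[color] = closest_representative(color, palette)
        output.append(cache[color])
    return output


# Hypothetical palette and pixels, for illustration only.
palette = [(0, 0, 0), (255, 0, 0), (255, 255, 255)]
pixels = [(10, 5, 0), (250, 10, 10), (200, 200, 210), (10, 5, 0)]
quantized = map_to_palette(pixels, palette)
# quantized == [(0, 0, 0), (255, 0, 0), (255, 255, 255), (0, 0, 0)]
```

Caching the result per distinct color means the palette is searched once for each color present in the image rather than once per pixel.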

The computational cost of the algorithm is O(N), where N is the number of pixels

in the image at hand. QHC has been implemented on a Pentium 4 (3.39 GHz, 2GB

RAM) personal computer and has been applied to a large set of images, producing

satisfactory results in terms of PSNR, SSIM, CL and CR.

References

1. C. Arcelli and G. Ramella, Finding contour-based abstractions of planar patterns, Pattern Recogn. 26(10) (1993) 1563–1577.

2. A. Atsalakis and N. Papamarkos, Color reduction and estimation of the number of dominant colors by using a self-growing and self-organized neural gas, Eng. Appl. Artif. Intell. 19 (2006) 769–786.

3. Z. Bing, S. Junyi and P. Qinke, An adjustable algorithm for color quantization, Pattern Recogn. Lett. 25 (2004) 1787–1797.

4. J. P. Braquelaire and L. Brun, Comparison and optimization of methods of color image quantization, IEEE Trans. Image Process. 6(7) (1997) 1048–1052.

5. L. Brun and A. Trémeau, Color quantization, in Digital Color Imaging Handbook, Electrical and Applied Signal Processing (CRC Press, 2002), pp. 589–638.

6. M. E. Celebi, Improving the performance of k-means for color quantization, Image Vision Comput. 29 (2011) 260–271.

7. T. W. Chen, Y. L. Chen and S. Y. Chien, Fast image segmentation based on K-means clustering with histograms in HSV color space, in Proc. IEEE 10th Workshop on Multimedia Signal Processing (2008), pp. 322–325.

8. H. C. Chan, Perceived image similarity and quantization resolution, Displays 29 (2008) 451–457.

9. S. C. Cheng and C. K. Yang, A fast and novel technique for color quantization using reduction of color space dimensionality, Pattern Recogn. Lett. 22 (2001) 845–856.

10. J. Delon, A. Desolneux, J. L. Lisani and A. B. Petro, A nonparametric approach for histogram segmentation, IEEE Trans. Image Process. 16(1) (2007) 253–261.

11. A. Dekker, Kohonen neural networks for optimal colour quantization, Network-Comp. Neural Syst. 5(3) (1994) 351–367.

12. M. Gervautz and W. Purgathofer, A simple method for color quantization: Octree quantization, in Graphics Gems, ed. A. S. Glassner (Academic Press, 1990), pp. 287–293.

13. P. S. Heckbert, Color image quantization for frame buffer display, Proc. ACM SIGGRAPH'82 16(3) (1982) 297–307.

14. I. S. Hsieh and K. C. Fan, An adaptive clustering algorithm for color quantization, Pattern Recogn. Lett. 21 (2000) 337–346.

15. Y. Jiang, Y. Wang, L. Jin, H. Gao and K. Zhang, Investigation on color quantization algorithm of color image, ECWAC 2011, Part II, eds. G. Shen and X. Huang, CCIS 144 (Springer, Berlin, 2011), pp. 181–187.

16. K. Kanjanawanishkul and B. Uyyanonvara, Novel fast color reduction algorithm for time-constrained applications, J. Visual Commun. Image Represent. 16 (2005) 311–332.


17. N. Kim and N. Kehtarnavaz, DWT-based scene-adaptive color quantization, Real-Time Imag. 11 (2005) 443–453.

18. A. Mojsilovic and E. Soljanin, Color quantization and processing by Fibonacci lattices, IEEE Trans. Image Process. 10(11) (2001) 1712–1725.

19. D. Ozdemir and L. Akarun, A fuzzy algorithm for color quantization of images, Pattern Recogn. 35 (2002) 1785–1791.

20. A. W. Paeth, Mapping RGB triples onto four bits, in Graphics Gems, ed. A. S. Glassner (Academic Press, Cambridge, MA, 1990), pp. 233–245.

21. N. Papamarkos, A. E. Atsalakis and C. P. Strouthopoulos, Adaptive color reduction, IEEE Trans. Syst. Man Cyber. 32(1) (2002) 44–56.

22. K. N. Plataniotis and A. N. Venetsanopoulos, Color Image Processing and Applications (Springer, Berlin, 2000).

23. G. Ramella and G. Sanniti di Baja, Color quantization by multiresolution analysis, in Computer Analysis of Images and Patterns, eds. X. Jiang and N. Petkov, LNCS 5702 (Springer, Berlin, 2009), pp. 525–532.

24. G. Ramella and G. Sanniti di Baja, Multiresolution histogram analysis for color reduction, Proc. 15th Iberoamerican Congress on Pattern Recognition, eds. I. Bloch and M. R. Cesar-Jr., LNCS 6419 (Springer, Berlin, 2010), pp. 22–29.

25. J. Rasti, A. Monadjemi and A. Vafaei, Color reduction using a multi-stage Kohonen self-organizing map with redundant features, Expert Syst. Appl. 38 (2011) 13188–13197.

26. J. A. Robinson, Adaptive prediction trees for image compression, IEEE Trans. Image Process. 15(8) (2006) 2131–2145.

27. Y. Rui and T. S. Huang, Image retrieval: Current techniques, promising directions, and open issues, J. Visual Commun. Image Represent. 10 (1999) 39–62.

28. D. Salomon, Data Compression: The Complete Reference (Springer-Verlag, London, 2007).

29. N. Shorter and T. Kasparis, Fuzzy ART for relatively fast unsupervised image color quantization, Proc. of 19th Int. Conf. Pattern Recognition ISBN/ISSN: 978-1-4244-2175-6 (IEEE CS Press, 2008).

30. Y. Sirisathitkul, S. Auwatanamongkol and B. Uyyanonvara, Color image quantization using distances between adjacent colors along the color axis with highest color variance, Pattern Recogn. Lett. 25 (2004) 1025–1043.

31. G. Sreelekha and P. S. Sathidevi, An HVS based adaptive quantization scheme for the compression of color images, Digit. Sig. Process. 20 (2010) 1129–1149.

32. J. Wang, W. J. Yang and R. Acharya, Color space quantization for color-content-based query systems, Multi. Tools Appl. 13 (2001) 73–91.

33. Z. Wang, L. Lu and A. C. Bovik, Video quality assessment based on structural distortion measurement, Sig. Process.: Image Commun. 19(2) (2004) 121–132.

34. X. Wu, Color quantization by dynamic programming and principal analysis, ACM Trans. Graph. 11(4) (1992) 349–372.

35. X. Zhang, Z. Song, Y. Wang and H. Wang, Color quantization of digital images, PCM 2005, Part II, eds. Y. S. Ho and H. J. Kim, LNCS 3768 (Springer, Berlin, 2005), pp. 653–664.

36. http://mehdi.rabah.free.fr/SSIM.

37. http://www.hlevkin.com/TestImages/.

38. http://r0k.us/graphics/kodak/.

39. http://sipi.usc.edu/database/.


Giuliana Ramella received her doctoral degree in Physics from the University of Naples Federico II, Naples, Italy, in 1990. In the period 1990–1997, she was granted a research fellowship at the Institute of Cybernetics "E. Caianiello" of the Italian National Research Council, where in 1997 she got the permanent position of researcher. Her main interests are digital geometry and topology, multiresolution, shape representation and analysis, and image compression. Since the year 2000, she has had a number of teaching contracts with three universities of Naples (Federico II, Second University, Parthenope).

Gabriella Sanniti di Baja received her doctoral degree "cum laude" in Physics from the University of Naples, Italy, in 1973. In 2002, she received her Ph.D. Honoris Causa from the Uppsala University, Sweden. Since 1973, she has been working in the field of image processing and pattern recognition at the Institute of Cybernetics "E. Caianiello" of the National Research Council of Italy, Naples, where she is currently the director of research. Her main research activities concern 2D and 3D shape representation, decomposition and description. She has published more than 130 papers in international journals and conference proceedings and is an editor-in-chief of Pattern Recognition Letters (special issues). She has been a member of the Executive Committee of the International Association for Pattern Recognition (IAPR) for ten years, being IAPR President from 2000 to 2002. She is an IAPR fellow and Foreign Member of the Royal Society of Sciences at Uppsala, Sweden.
