A NEW TECHNIQUE FOR COLOR QUANTIZATION
BASED ON HISTOGRAM ANALYSIS AND CLUSTERING
GIULIANA RAMELLA* and GABRIELLA SANNITI DI BAJA†
Institute of Cybernetics "E. Caianiello"
CNR, Via Campi Flegrei 34, 80078 Pozzuoli
Naples, Italy*[email protected]
Received 5 July 2011
Accepted 8 April 2013
Published 22 May 2013
A technique for color quantization is described, which consists of two processes. The first process is based on the analysis of the histograms of the three color components of the RGB input image. The second process performs clustering of the colors quantized by the first process, based on their Euclidean distance. At the end of the second process, the output image is obtained by replacing the color of each pixel of the input image with the closest representative color. The obtained results are satisfactory from both the qualitative and the quantitative point of view.
Keywords: Color quantization; RGB color space; histogram analysis; clustering.
1. Introduction
Regions of a digital image whose pixels are characterized by color homogeneity can be interpreted as constituting (parts of) the objects present in the image. However, if a 24-bit true color image is considered, where the number of possible different colors may reach 16 million, the number of perfectly homogeneous regions in the image would most likely be noticeably larger than the number of (parts of) objects perceived in the image by a human observer. In fact, though the human visual system is able to distinguish a reasonably large number of colors, it generally groups colors with similar tonality, since even a few colors are often enough for image understanding. Thus, when working with digital images, color quantization is of interest. This is a process that, starting from the colors present in an input image, identifies and uses a reduced number of distinct colors in such a way as to produce a new version of the image that is still visually similar to the original image. This process is
*Corresponding author.
International Journal of Pattern Recognition and Artificial Intelligence
Vol. 27, No. 3 (2013) 1360006 (17 pages)
© World Scientific Publishing Company
DOI: 10.1142/S0218001413600069
particularly useful for storage and transmission of multimedia data where millions of distinct colors are present, e.g. see Refs. 13, 22, 27 and 32.
As pointed out in Refs. 5 and 6, color quantization methods can be roughly divided into two categories, respectively including image independent and image dependent methods. The former methods18,20 are generally efficient from the computational point of view, but the results are likely to be poor since the color palette is built without taking into account the distribution of colors in the input image. In turn, most of the image dependent methods are likely to provide rather good results and are generally preferred to the image independent methods even if they are more expensive. The most common approach in the framework of image dependent methods is based on clustering. In fact, RGB color images can be interpreted as constituted by pixels whose colors are a mixture of the primitive colors red, green and blue. Thus, color quantization can be seen as a clustering problem in the 3D space, where the coordinate axes are the color components and each point represents one of the colors in the image. By means of a clustering technique, points can be grouped into an a priori fixed number of clusters, each of which is associated with a representative color, generally obtained as the average of the points in the cluster.3,4,6,7,14–16,30,35 The representative color associated with any cluster replaces all the colors of the input image that have been grouped in that cluster. Other color quantization methods are based on a subdivision of the color space. In this framework, the best known methods to build a colormap with an a priori fixed number of colors are the median cut algorithm,13 the octree color quantization algorithm,12 and Wu's algorithm.34 Other quantization methods in the literature are based on histogram analysis,9,10,17,23,24 fuzzy logic,19,29 neural networks,2,11,21,25 and multi-resolution analysis.17,23,24,26,31
Image dependent methods can be further divided into two categories: pre-clustering methods,9,12–14,16,30,34 and post-clustering methods.2,3,6,7,17,19,21,23,24 The pre-clustering methods fix the desired maximum number of colors and determine the color palette only once by using features derived from the image at hand. The post-clustering methods define an initial palette of representative colors and successively improve it by resorting to an iterative process.
In this paper, we present a new image dependent technique for color quantization, QHC, which can be framed among the post-clustering methods. QHC consists of two processes, respectively dealing with the detection of an initial set of representative colors based on histogram analysis, and with an iterated grouping of representative colors sufficiently similar to each other. This paper is the follow-up to the work presented at the 15th Iberoamerican Congress on Pattern Recognition CIARP 2010,24 where the main focus was on the use of multi-resolution histogram analysis. In Ref. 24, starting from the full resolution input RGB image, a number of lower resolution representations of the image were computed by means of a scaling down interpolation method. The histograms of the three components (red, green and blue) of the full resolution input image, as well as the histograms of the three components
obtained at each lower resolution representation of the input image were computed. For each color component, peaks and pits present in all the available histograms were then detected. The histogram of each color component at full resolution was simplified by retaining only peaks and pits that were present also in all the remaining histograms at lower resolutions and that could be classified as dominating in the histogram at full resolution. Peaks and pits remaining after histogram simplification were used to establish a reduced number of representative values for each color component. The image with quantized colors was then obtained by combining the representative values of the three color components.
One of the advantages offered in Ref. 24 by the use of multi-resolution was the ability of the method to generate automatically a transformed image with a number of colors in the range established by the user. In fact, the number of resolution representations used during the process (and, hence, the number of histograms used to define the representative values) influences the number of final colors in the quantized image. In particular, the larger the number of resolution levels is, the smaller the number of final colors is. Thus, if the number of colors obtained by using a given number of resolution levels is not included in the range selected by the user, the process is automatically repeated by suitably increasing/decreasing the number of resolution levels to be taken into account, until the number of final colors is in the desired range.
As in Ref. 24, in this work we also start by analyzing and simplifying the structure of the histograms of the three color components of an input RGB image so as to identify a number of representative values for each component. Differently from Ref. 24, we do not resort to the use of multi-resolution image representation to generate an output image with a number of final colors in the desired range. This goal is instead reached by working on the list of colors generated after the three histograms have been processed. In practice, the method QHC described in this paper consists of two processes. The first process is based on the analysis of histograms of the three color components of the input RGB image. The number of colors detected during this step is generally remarkably smaller than the number of colors in the input image, but is likely to be larger than the number of colors desired by the user. The second process performs clustering of the colors quantized by the first process, based on their Euclidean distance. To this aim, colors resulting after the first process are recorded in decreasing occurrence order and are sequentially examined in the same order. Let $C_i$ be the color at hand. $C_i$ is taken as a representative color. Any successive color $C_j$, $j > i$, sufficiently close to $C_i$ is grouped with $C_i$ and will not be considered as a representative color itself. The second process is automatically repeated, with increasing tolerance on the distance within which colors can be grouped in the same cluster, as long as the number of obtained representative colors exceeds the maximum number of final colors desired by the user. At the end of the second process, the output image is obtained by replacing the color of each pixel of the input image with the closest representative color.
Our post-clustering method is computationally convenient. In fact, the number of initial colors that are determined by the first process is not a priori fixed and the process does not require iterations. The first process significantly reduces the number of colors with respect to those in the input image, so that the second process will require a generally small number of iterations to obtain the desired number of colors.
Our method is able to generate automatically a transformed image with a number of colors that does not exceed a given limit. Thus, different transformed images, each characterized by a different number of colors, can be obtained starting from the same input image by selecting different limits for the number of final colors.
QHC has been tested on a large number of RGB images of different size and color distribution. The obtained results indicate that QHC performs better than our previous method24 both in terms of processing time and as regards the quality of the results. We have also evaluated the performance of QHC by comparing the results with those obtained by using other methods available in the literature,11–13,34 in terms of quantitative measures such as Peak Signal to Noise Ratio PSNR, Structural Similarity SSIM, Colorloss CL, and Compression Ratio CR.8,25,28
The paper is organized as follows. Some preliminaries are given in Sec. 2; the method is described in Sec. 3 and experimental results are discussed in Sec. 4. Concluding remarks are finally given in Sec. 5.
2. Preliminaries
We work with RGB images and interpret colors as three-dimensional (3D) vectors, with each vector element having an 8-bit dynamic range. The RGB color space can be geometrically represented as a 3D cube, where the three coordinates of each point are the red, green and blue components of that point in the color space, see Fig. 1. Each of the three edges of the cube has length 256, since each color component may assume any value in the range [0, 255].
Given a color image I, the histogram of colors could be represented by using the above 3D cube. In this case, the value in position (x, y, z) would account for the number of pixels in I whose color components have values x, y, and z, respectively. Of course, using this 3D histogram of colors would be computationally very expensive, since it generally consists of a large (sparse) set of points. For example, consider a color image I with size 1024 × 1024. For such an image, at most 1,048,576 different colors would be possible (of course, only if each color in I has occurrence equal to one), which is anyway only a small fraction (6.25%) of the total number of colors possible when each color component may assume 256 different values. Thus, most of the color quantization methods based on color distribution actually work on the histograms of the three individual color components of I. We also use this kind of independent analysis of the three histograms, during the first process, though we are aware that some important information contained in the dependence among the channels may be lost.
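The arithmetic in the example above, and the idea of working on three separate 256-bin histograms, can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code; the function name `channel_histograms` and the four-pixel sample image are assumptions made for the example.

```python
# Minimal sketch: per-channel histograms and the 6.25% figure from the text.
# The function name and the sample pixels are illustrative assumptions.

def channel_histograms(pixels):
    """Return three 256-bin histograms (R, G, B) for a list of (r, g, b) tuples."""
    hists = [[0] * 256 for _ in range(3)]
    for color in pixels:
        for k in range(3):
            hists[k][color[k]] += 1
    return hists

# A 1024 x 1024 image can hold at most one distinct color per pixel.
max_colors_in_image = 1024 * 1024          # 1,048,576
colors_in_rgb_cube = 256 ** 3              # 16,777,216 possible 24-bit colors
fraction = max_colors_in_image / colors_in_rgb_cube   # 0.0625, i.e. 6.25%

# Tiny made-up image with four pixels.
pixels = [(255, 0, 0), (255, 0, 0), (0, 128, 255), (10, 128, 255)]
h_r, h_g, h_b = channel_histograms(pixels)
```

The three histograms together need only 3 × 256 = 768 bins, against the 16,777,216 cells of the full 3D histogram, which is the saving the paragraph refers to.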
For a color image I, let K stand for any of the three components of I (the red component R, the green component G, or the blue component B) and let us denote by $H_K$ the histogram of the values in the color component K. For each value p of K, with p in the range [0, 255], the height of the corresponding bin in $H_K$ shows the occurrence of p, i.e. accounts for the number of pixels of K with value p.
Peaks and pits of $H_K$ are detected as the values for which the height of the corresponding bin is locally maximal and locally minimal, respectively. Formally, if $p - 1$, $p$ and $p + 1$ are three consecutive values, and $h(p)$ is the height of the bin corresponding to p in the color component K, then:
p is a peak of $H_K$ if
$$h(p-1) \le h(p) \text{ and } h(p+1) < h(p), \quad \text{or} \quad h(p-1) < h(p) \text{ and } h(p+1) \le h(p);$$
p is a pit of $H_K$ if
$$h(p) \le h(p-1) \text{ and } h(p) < h(p+1), \quad \text{or} \quad h(p) < h(p-1) \text{ and } h(p) \le h(p+1).$$
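The peak and pit conditions can be checked bin by bin. The sketch below is one possible reading in plain Python; the function name `find_peaks_and_pits` is hypothetical, and skipping the first and last bins (which lack one neighbor) is an assumption, since the text does not say how the ends of the histogram are treated.

```python
# Minimal sketch of peak/pit detection on a histogram given as a list of bin heights.
# Assumption: the two end bins are skipped because they lack a neighbor.

def find_peaks_and_pits(h):
    """Return (peaks, pits) as lists of bin indices of histogram h."""
    peaks, pits = [], []
    for p in range(1, len(h) - 1):
        left, mid, right = h[p - 1], h[p], h[p + 1]
        # Peak: not lower than either neighbor, strictly higher than at least one.
        if (left <= mid and right < mid) or (left < mid and right <= mid):
            peaks.append(p)
        # Pit: not higher than either neighbor, strictly lower than at least one.
        elif (mid <= left and mid < right) or (mid < left and mid <= right):
            pits.append(p)
    return peaks, pits

# Made-up 8-bin histogram with two plateaus.
h = [0, 2, 5, 5, 1, 1, 4, 3]
peaks, pits = find_peaks_and_pits(h)
```

On this sample both ends of each plateau are reported (two adjacent peaks at bins 2 and 3, two adjacent pits at bins 4 and 5), which illustrates why peaks and pits need not alternate regularly.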
Fig. 1. A 3D cube geometrically representing the RGB color space. Axes: R, G, B. Vertices: Black (0, 0, 0), Red (255, 0, 0), Green (0, 255, 0), Blue (0, 0, 255), Yellow (255, 255, 0), Magenta (255, 0, 255), Cyan (0, 255, 255), White (255, 255, 255).
Peaks and pits can be interpreted as vertices of a polygonal approximation of $H_K$ and can be used to identify the representative values for the color component K by replacing all values from a pit to the successive pit of $H_K$ with a single representative value. Actually, to further reduce the number of representative values for K, we perform a process aimed at disregarding a number of peaks and pits of $H_K$, while still providing a reasonably good polygonal approximation of $H_K$. To this aim, all initially detected peaks and pits of $H_K$ are taken as candidate vertices among which to select the final vertices. Three parameters, $a_i$, $c_i$, and $d_i$, are associated with each candidate vertex $v_i$, where $a_i$ is the area of the triangle with vertices $v_{i-1}$, $v_i$ and $v_{i+1}$; $c_i$ is the cosine of the angle formed by the two line segments respectively delimited by the two vertices $v_i$ and $v_{i-1}$ and by the two vertices $v_i$ and $v_{i+1}$; $d_i$ is the distance of $v_i$ from the straight line joining $v_{i-1}$ and $v_{i+1}$.
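The three parameters are plain planar geometry and can be computed from the coordinates of three consecutive vertices. The sketch below assumes each vertex is the point (value, bin height) in the histogram plane; that coordinate choice and the function name `vertex_parameters` are assumptions for illustration.

```python
import math

# Minimal sketch: area, cosine and distance for a candidate vertex.
# Assumption: vertices are (x, y) points with x = component value, y = bin height.

def vertex_parameters(prev, cur, nxt):
    """Return (area, cosine, distance) for vertex cur with neighbors prev and nxt."""
    (x0, y0), (x1, y1), (x2, y2) = prev, cur, nxt
    # Twice the signed triangle area, from the cross product of the two edges at cur.
    cross = (x0 - x1) * (y2 - y1) - (y0 - y1) * (x2 - x1)
    area = abs(cross) / 2.0
    # Cosine of the angle at cur between the segments cur-prev and cur-nxt.
    len_a = math.hypot(x0 - x1, y0 - y1)
    len_b = math.hypot(x2 - x1, y2 - y1)
    cosine = ((x0 - x1) * (x2 - x1) + (y0 - y1) * (y2 - y1)) / (len_a * len_b)
    # Distance of cur from the line through prev and nxt: height = 2 * area / base.
    base = math.hypot(x2 - x0, y2 - y0)
    distance = 2.0 * area / base
    return area, cosine, distance

# Made-up example: a peak of height 3 at value 2 between two pits of height 0.
a, c, d = vertex_parameters((0, 0), (2, 3), (4, 0))
```

For this triangle the area is 6, the cosine is 5/13 and the distance from the base line is 3.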
The above three parameters, shown in Fig. 2(a), were introduced in Ref. 1 in the framework of two-dimensional (2D) object contour analysis to define the dominance of a vertex by taking into account factors influencing human perception, such as the size of the region viewed by a vertex $v_i$ (roughly related to the area $a_i$ of a protrusion or an intrusion of the object) and the "cornerity" (roughly related to $c_i$) of the contour arc surrounding $v_i$. In turn, $d_i$ is related to an often used measure of perceptual significance, defined as the point-wise error between a contour arc and the corresponding side in the approximating polygon.
Let $v_1, v_2, \ldots, v_n$ be the candidate vertices, i.e. all the peaks and pits detected in $H_K$, and let $A$ be the arithmetic mean of the areas $a_i$ of the triangles associated with all vertices $v_i$, $i = 1, \ldots, n$. Then, candidate vertices are sequentially inspected. Let $\alpha$ be a weight, whose value is fixed depending on the tolerance for polygonal approximation. Any two consecutive candidate vertices of $H_K$, say $v_i$ and $v_{i+1}$, are not retained as final vertices if: (i) both $a_i < \alpha A$ and $a_{i+1} < \alpha A$, and (ii) $a_i = a_{i+1}$, or $c_i = c_{i+1}$, or $d_i = d_{i+1}$. Moreover, any candidate vertex $v_i$ of $H_K$ which is no longer a relative maximum or a relative minimum of $H_K$ due to removal of neighboring candidate vertices is not retained as a vertex.
As an example of the performance of polygonal approximation, see Fig. 2(b) right.
To evaluate the performance of color quantization algorithms, the most com-
monly used measures are the Peak Signal to Noise Ratio PSNR, the Structural
SIMilarity SSIM, the Colorloss CL, and the Compression Ratio CR.
Fig. 2. (a) The three parameters $a_i$, $c_i$, $d_i$ associated with the vertex $v_i$; (b) a histogram, left, and its polygonal approximation, right (color online).
For gray-level images, PSNR is computed as follows:
$$\mathrm{PSNR} = 20 \times \log_{10}\left(\frac{255}{\sqrt{\mathrm{MSE}}}\right),$$
where
$$\mathrm{MSE} = \frac{1}{H \times K} \sum_{i=1}^{H} \sum_{j=1}^{K} (v_{i,j} - w_{i,j})^2$$
and $v_{i,j}$ and $w_{i,j}$ respectively belong to the input image and to the output image of size $H \times K$.
For RGB images, where there are three values per pixel, the definition of PSNR is still the same, but MSE is the sum over all squared value differences divided by image size and by three.
The more similar the original image and the quantized image are, the smaller MSE is. As a consequence, PSNR increases when similarity between input and output images increases, reaching infinity if the two images are identical.
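The PSNR definition can be sketched in a few lines. The code below is an illustration in plain Python, not the evaluation code used for the reported results; images are assumed to be flat sequences of 8-bit sample values, and the function names are hypothetical.

```python
import math

# Minimal sketch of MSE and PSNR for 8-bit images stored as flat sequences.
# For an RGB image, pass all 3 * H * K channel samples: dividing by their count
# is the same as dividing by image size and by three.

def mse(original, quantized):
    """Mean squared error between two equally sized sequences of sample values."""
    total = sum((v - w) ** 2 for v, w in zip(original, quantized))
    return total / len(original)

def psnr(original, quantized):
    """Peak signal to noise ratio in dB; infinite when the two images are identical."""
    error = mse(original, quantized)
    if error == 0:
        return math.inf
    return 20.0 * math.log10(255.0 / math.sqrt(error))

# Made-up 2 x 2 gray-level image and a slightly altered copy.
gray_in = [10, 20, 30, 40]
gray_out = [12, 18, 30, 44]
value = psnr(gray_in, gray_out)
```

Here the squared differences are 4, 4, 0 and 16, so MSE is 6 and PSNR is about 40.35 dB.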
The SSIM index for two gray-level images v and w is computed as follows:
$$\mathrm{SSIM}(v, w) = \frac{(2\mu_v\mu_w + c_1)(2\,\mathrm{cov}_{vw} + c_2)}{(\mu_v^2 + \mu_w^2 + c_1)(\sigma_v^2 + \sigma_w^2 + c_2)},$$
where
$\mu_v$ is the average of v;
$\mu_w$ is the average of w;
$\sigma_v^2$ is the variance of v;
$\sigma_w^2$ is the variance of w;
$\mathrm{cov}_{vw}$ is the covariance of v and w;
$c_1 = (k_1 L)^2$, $c_2 = (k_2 L)^2$ are two variables to stabilize the division with weak denominator;
$L$ is the dynamic range of the pixel values (255 for 8-bit image);
$k_1 = 0.01$ and $k_2 = 0.03$ are default values.
The SSIM index is computed by using a sliding window approach. The window size is fixed to 8 × 8, as suggested in Ref. 33. The sliding window moves pixel-by-pixel from the top-left corner to the bottom-right corner of the image, and the SSIM index is computed within the sliding window. As a result, an SSIM index map of the image is obtained, and the overall quality value is defined as the average of the SSIM index map, i.e. the mean SSIM index. The SSIM value is in the range [0, 1], where higher values denote better structural similarity. For RGB images, the SSIM index is computed for the three channel components independently and the quality value is obtained by the average of the three indexes.
The computation of SSIM has been done in this paper by using the code developed for OpenCV in C++ by Rabah Mehdi,36 where an 11 × 11 Gaussian weighting function is used to compute average, variance and covariance of the images.
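The SSIM formula for a single window can be written down directly. The sketch below is plain Python and is not the OpenCV implementation mentioned above: it assumes uniform weights inside the window and population statistics (division by the number of samples), whereas the cited code uses Gaussian weighting. The function name `ssim_window` is hypothetical.

```python
# Minimal sketch of the SSIM index for one window, given as two flat sequences.
# Assumptions: uniform weights and population variance/covariance (divide by n).

def ssim_window(v, w, dynamic_range=255, k1=0.01, k2=0.03):
    """Return the SSIM index of two equally sized windows v and w."""
    n = len(v)
    mu_v = sum(v) / n
    mu_w = sum(w) / n
    var_v = sum((x - mu_v) ** 2 for x in v) / n
    var_w = sum((y - mu_w) ** 2 for y in w) / n
    cov = sum((x - mu_v) * (y - mu_w) for x, y in zip(v, w)) / n
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    numerator = (2 * mu_v * mu_w + c1) * (2 * cov + c2)
    denominator = (mu_v ** 2 + mu_w ** 2 + c1) * (var_v + var_w + c2)
    return numerator / denominator

# Made-up windows: identical content, and the same content brightened by 10.
window = [10, 20, 30, 40]
same = ssim_window(window, window)
shifted = ssim_window(window, [x + 10 for x in window])
```

A mean SSIM value would then be obtained by averaging this index over all window positions, and over the three channels for RGB images, as described in the text.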
The Colorloss CL is a quantitative measurement of the loss of color information caused by quantization. CL is defined as the Euclidean color distance between a pixel in the original image and the corresponding pixel in the quantized image. The larger the colorloss, the greater is the loss in color information. Let $I$ consist of $N$ pixels, and let the RGB values of a pixel p be $(r_p, g_p, b_p)$. Let $I'$ be the quantized image and let $(r'_p, g'_p, b'_p)$ be the color components of the pixel corresponding to p in $I'$. The colorloss between the images $I$ and $I'$ is defined as follows:
$$\mathrm{CL}(I, I') = \frac{\sum_{p=1}^{N} \sqrt{(r_p - r'_p)^2 + (g_p - g'_p)^2 + (b_p - b'_p)^2}}{N}.$$
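The colorloss is the mean per-pixel Euclidean distance in RGB space, which can be sketched as follows. The function name `colorloss` and the two-pixel sample images are assumptions for illustration.

```python
import math

# Minimal sketch of the colorloss CL between an image and its quantized version.
# Images are lists of (r, g, b) tuples in the same pixel order.

def colorloss(original, quantized):
    """Average Euclidean RGB distance between corresponding pixels."""
    total = 0.0
    for (r, g, b), (rq, gq, bq) in zip(original, quantized):
        total += math.sqrt((r - rq) ** 2 + (g - gq) ** 2 + (b - bq) ** 2)
    return total / len(original)

# Made-up two-pixel images: the first pixel moves by distance 5, the second is unchanged.
image = [(0, 0, 0), (10, 20, 30)]
image_q = [(3, 4, 0), (10, 20, 30)]
cl = colorloss(image, image_q)
```

The per-pixel distances are 5 and 0, so the colorloss of this example is 2.5.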
The compression ratio CR denotes the percentage of the original image size that remains after compression. CR is computed as the ratio between the size of the output stream and the size of the input stream, both expressed in bits per pixel (bpp).28
3. The QHC Algorithm
QHC is a post-clustering method and as such consists of two processes, respectively aimed at the detection of an initial set of representative colors, and at the refinement of this set in order to achieve the desired number of final representative colors. The first process is based on histogram analysis. In principle, one could resort to a simple quantization of the color histogram by dividing the 3D cube into a number of equally sized smaller cubes and by suitably selecting representative colors for each of them. However, a subdivision of the color space into equal volumes does not generally produce good results. On the other hand, working with the 3D histogram of the input image, which is generally characterized by a large number of sparse colors, is computationally expensive. Thus, we prefer to resort to the analysis of the histograms of the three color components of I. The aim of this process is to reduce significantly the number of colors, while maintaining a high degree of similarity with the input image.
Starting from the input color image I, the three color components R, G and B are considered. For each of them, say K, the histogram $H_K$ is generated and the corresponding polygonal approximation is computed as described in Sec. 2.
Once the structure of $H_K$ has been simplified, the representative values for the color component K are computed as follows. All values from a pit to the successive pit of $H_K$ are replaced by a single representative value. In general, one peak only is expected in between two successive pits. However, due to the fact that relative local maxima and relative local minima are detected and due to the histogram simplification process, peaks and pits do not always regularly alternate. Thus, more than one peak, or no peak at all, could exist in between two successive pits. Therefore, three cases are possible and the following three criteria are used to select the representative value:
(1) If exactly one peak exists between two successive pits, the representative value is the value of the peak.
(2) If more than one peak exists in between two successive pits, the representative value is the value of the leftmost peak.
(3) Otherwise, the representative value is the value of the leftmost pit.
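The three criteria can be sketched as a single pass over pairs of successive pits. The code below is an illustrative reading in plain Python; the function name `representative_values` and the sample peak and pit positions are hypothetical, and "leftmost pit" is taken to mean the left pit of the pair.

```python
# Minimal sketch of the three selection criteria for one color component.
# Inputs are the retained peak and pit positions (values in 0..255).

def representative_values(peaks, pits):
    """Return a list of (left_pit, right_pit, representative) triples."""
    peaks = sorted(peaks)
    pits = sorted(pits)
    result = []
    for left, right in zip(pits, pits[1:]):
        between = [p for p in peaks if left < p < right]
        if between:
            # Criteria (1) and (2): the only peak, or the leftmost of several peaks.
            rep = between[0]
        else:
            # Criterion (3): no peak in between, so the left pit is used.
            rep = left
        result.append((left, right, rep))
    return result

# Made-up positions: one peak, then two peaks, then no peak between successive pits.
intervals = representative_values(peaks=[40, 120, 130], pits=[10, 90, 200, 230])
```

The three intervals of the example exercise the three criteria in order and yield the representatives 40, 120 and 200.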
After the histogram of each color component has been simplified by retaining only a subset of the initially detected peaks and pits and the representative values have been computed for each color component, the quantized image $I'$ is obtained by combining the representative values of the three color components. If the number of colors of $I'$ is not larger than the maximum number of final colors desired by the user, the process terminates. Otherwise, the second process is performed.
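Combining the per-component representative values amounts to a table lookup per channel followed by a count of distinct colors. The sketch below is a hedged illustration in plain Python: the interval lists are invented, the half-open interval convention [left, right) and leaving uncovered values unchanged are assumptions, and all names are hypothetical.

```python
# Minimal sketch: build one lookup table per channel and quantize the pixels.
# Assumptions: each interval covers values left..right-1, and values not covered
# by any interval keep their original value.

def build_lookup(intervals):
    """Map each value 0..255 to its representative value."""
    lut = list(range(256))
    for left, right, rep in intervals:
        for value in range(left, right):
            lut[value] = rep
    return lut

def combine_channels(pixels, lut_r, lut_g, lut_b):
    """Replace each component of each pixel with its representative value."""
    return [(lut_r[r], lut_g[g], lut_b[b]) for r, g, b in pixels]

# Invented intervals: two representatives for R, one for G, two for B.
lut_r = build_lookup([(0, 128, 60), (128, 256, 200)])
lut_g = build_lookup([(0, 256, 100)])
lut_b = build_lookup([(0, 100, 30), (100, 256, 180)])

pixels = [(10, 5, 20), (50, 250, 90), (130, 40, 101), (255, 0, 50)]
quantized = combine_channels(pixels, lut_r, lut_g, lut_b)
num_colors = len(set(quantized))

max_colors = 2                       # hypothetical user limit
needs_second_process = num_colors > max_colors
```

With 2, 1 and 2 representative values per channel, at most 2 × 1 × 2 = 4 colors can occur; the sample image uses 3 of them, which exceeds the hypothetical limit of 2 and would trigger the second process.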
We note that the number of colors resulting at the end of the first process depends on the combination of the representative values for the three color components as well as on the number of vertices remaining in each of the three histograms after their simplification. In turn, the number of vertices remaining after histogram simplification depends on the value of the multiplicative weight $\alpha$ used to compare the area $a_j$ of the triangle associated with the current candidate vertex $v_j$ with the arithmetic mean $A$ of the areas of the triangles associated with all candidate vertices $v_i$, $i = 1, \ldots, n$. We have used for $\alpha$ different values satisfying the condition $0 < \alpha \le 1$. The closer $\alpha$ is to 1, the smaller the number of vertices is. However, for $\alpha$ very close to 1 polygonal approximation provides a less faithful representation of the histogram. We have experimentally found that, on average, the best compromise between number of vertices and faithfulness of the polygonal approximation is obtained by taking $\alpha = 0.5$. Accordingly, we recommend using $\alpha = 0.5$ as the default value.
The second process reduces the number of colors obtained at the end of the first process to a number smaller than or equal to the maximum number of colors desired by the user. The goal is reached by clustering colors based on their Euclidean distance.
Representative colors determined by the first process and recorded in a list are sequentially analyzed to decide whether they should be taken as final representative colors or as members of clusters associated with other already selected final representative colors. Once the proper decision is taken for a color, that color is suitably marked in the list so as to avoid considering it again either as a new final representative color or as a member of a cluster associated with another final representative color. In more detail, colors resulting from the first process are initially sorted in decreasing occurrence order, and the minimal distance $\delta$ among colors is computed. At this stage, all colors are still unmarked, so that each of them can be processed. The currently processed color $C_i$ is taken as a new final representative color and is marked. Each successive unmarked color $C_j$, $j > i$, whose distance from $C_i$ is smaller than $\delta$, is taken as a member of the cluster associated with $C_i$ and is marked.
The number of clusters obtained after all representative colors have been processed is compared with the maximum number of final colors desired by the user. If the obtained final representative colors are still too many, the second process is repeated, starting from the colors provided by the first process. To reduce the number of final representative colors, a larger tolerance on distance is obviously necessary to group more colors in clusters. Thus, at each repetition of the second process, the value of $\delta$ used during the previous application of the process is increased by 1. The second process is repeated until the obtained number of final representative colors is not larger than the desired maximum number of colors. The number of repetitions of the second process depends on the number of colors provided by the first process and on the desired number of final colors.
At the end of the second process, the output image is obtained by replacing the color of each pixel of I with the closest representative color.
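The second process can be sketched as a greedy pass repeated with a growing distance threshold. The code below is a minimal reading of the description in plain Python, not the authors' implementation; the threshold is called `delta` here, the input is assumed to be a dictionary mapping each first-process color to its occurrence count, and all function names and sample colors are hypothetical.

```python
import math

# Minimal sketch of the second process: greedy clustering by Euclidean distance,
# repeated with a threshold increased by 1 until few enough representatives remain.

def distance(c1, c2):
    """Euclidean distance between two RGB colors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))

def cluster_once(colors, delta):
    """One pass over colors sorted by decreasing occurrence; returns the representatives."""
    marked = [False] * len(colors)
    representatives = []
    for i, ci in enumerate(colors):
        if marked[i]:
            continue
        marked[i] = True
        representatives.append(ci)            # C_i becomes a representative
        for j in range(i + 1, len(colors)):
            if not marked[j] and distance(ci, colors[j]) < delta:
                marked[j] = True              # C_j joins the cluster of C_i
    return representatives

def second_process(color_counts, max_colors):
    """Return at most max_colors representative colors (max_colors >= 1 assumed)."""
    colors = sorted(color_counts, key=color_counts.get, reverse=True)
    if len(colors) <= max_colors:
        return colors
    delta = min(distance(a, b) for i, a in enumerate(colors) for b in colors[i + 1:])
    representatives = cluster_once(colors, delta)
    while len(representatives) > max_colors:
        delta += 1                            # larger tolerance, restart from the same list
        representatives = cluster_once(colors, delta)
    return representatives

def closest(color, representatives):
    """Representative color nearest to the given color."""
    return min(representatives, key=lambda rep: distance(color, rep))

# Made-up first-process output: color -> number of pixels.
color_counts = {(0, 0, 0): 50, (2, 0, 0): 30, (100, 100, 100): 20,
                (103, 100, 100): 10, (250, 250, 250): 5}
representatives = second_process(color_counts, max_colors=3)
```

In this example the minimal distance is 2, so the first pass groups nothing (no pair is strictly closer than 2); the threshold then grows to 3 and to 4, at which point three representatives remain. Each pixel of the input image would finally be mapped with `closest`.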
4. Experimental Results
We have applied the color quantization algorithm QHC to a collection of images with different size and color distribution, taken from available databases.37–39 The small dataset of sixteen images in Fig. 3 is used to show the performance of our method in terms of the obtained number of representative colors RC, and of the quantitative measures PSNR, SSIM, CL and CR.
Table 1 summarizes the results obtained at the end of the first process, performed with $\alpha = 0.5$ for all test images. For each test image, its size, the number of colors of the input image IC, the number of representative colors resulting after the first process RC1, the Peak Signal to Noise Ratio PSNR, the Structural SIMilarity SSIM, the Colorloss CL, and the Compression Ratio CR are reported. As already pointed out, a smaller number of colors RC1 could be obtained at the end of the first process by selecting a larger value for $\alpha$. However, such a choice would produce resulting images with lower similarity to the original images. Taking also into consideration that the computational cost of the second process is rather small, we prefer to use $\alpha = 0.5$ during the first process so as to have better quality results.
Table 2 summarizes the results at the end of the second process and compares the performance of QHC with that of other methods in the literature, namely with the Median Cut MC,13 the Octree OT,12 the method W by Wu,34 and the method NQ by Dekker.11 In particular, as concerns NQ we have set the sampling factor to 10. This choice is motivated by the fact that with such a sampling factor the speed of the algorithm is reasonably increased and the quality of the results is only slightly reduced.
Fig. 3. Test images: baboon, barbara, bike, cablecar, colors, cornfield, flower1, flower2, fruits, housed, kodim14, kodim15, lake, monarch, soccer, yacht.
The best values (i.e. the maximal values for PSNR and SSIM, and the minimal values for RC, CL and CR) are in bold. For the comparison, we have set to 256 the maximum number of desired final colors. The arithmetic means computed on the sixteen test images for the values of PSNR, SSIM and CL are respectively 32.114, 0.897 and 9.338 for MC, 33.187, 0.908 and 8.914 for OT, 33.957, 0.936 and 7.846 for W, 33.626, 0.938 and 7.650 for NQ, and 33.868, 0.921 and 7.974 for QHC. We point out that the above average values for QHC computed on the sixteen test images do not significantly differ from the arithmetic means computed on the whole set of images that we have used.
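As a quick consistency check, the QHC mean PSNR quoted above can be recomputed from the sixteen QHC PSNR entries of Table 2. The snippet below simply averages those values; the variable names are illustrative.

```python
# QHC PSNR values from Table 2, in the order baboon, barbara, bike, cablecar,
# colors, cornfield, flower1, flower2, fruits, housed, kodim14, kodim15,
# lake, monarch, soccer, yacht.
qhc_psnr = [31.679, 33.060, 32.008, 34.376, 31.402, 34.618, 36.715, 33.760,
            34.346, 32.521, 35.006, 36.153, 33.465, 34.207, 34.537, 34.029]

# Arithmetic mean over the sixteen test images.
mean_psnr = sum(qhc_psnr) / len(qhc_psnr)
```

The values sum to 541.882, giving a mean of about 33.868, in agreement with the figure reported for QHC.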
It can be observed that QHC always compares favorably with respect to MC and OT, while in some cases it performs worse than W or NQ. However, we remark that when the performance of W or NQ is better than the performance of QHC in terms of PSNR or SSIM, the number of representative colors RC characterizing our method is generally smaller than the number of colors quantized by W or NQ.
An interesting feature of QHC is the possibility for the user to set a different limit to the maximum number of final colors for the quantized image. As an example, refer to Fig. 4, where for the test images barbara, cablecar, flower1, fruits and kodim15 four different limits have been used for the maximum number of final representative colors, namely 128, 256, 384 and 512.
For the test images for which QHC originates the results shown in Fig. 4, the number of final representative colors in the corresponding output images and the resulting PSNR and SSIM are given in Table 3, where the limits to the maximal number of final representative colors are indicated between brackets.
Table 1. Results after the application of the first process of QHC.
Image      Size       IC      RC1    PSNR    SSIM   CL     CR
baboon     512 × 512  171045  26289  36.622  0.976  4.924  0.845
barbara    480 × 384  156234   4028  35.417  0.964  6.587  0.694
bike       780 × 916  206155  28048  36.936  0.955  4.697  0.837
cablecar   512 × 480  130416  17945  40.931  0.985  3.169  0.832
colors     432 × 288   97465  12509  33.347  0.909  7.177  0.821
cornfield  512 × 480  134514  17836  39.163  0.987  3.370  0.829
flower1    512 × 480  111841  10310  39.287  0.974  3.620  0.795
flower2    512 × 512  178778  10787  35.501  0.960  5.380  0.768
fruits     512 × 480  160476  21724  38.526  0.978  3.153  0.833
housed     512 × 512  154605   2369  30.749  0.932  9.496  0.650
kodim14    768 × 512   55117  13239  37.960  0.981  3.821  0.869
kodim15    768 × 512   44576  10107  39.733  0.976  3.461  0.861
lake       512 × 512  168459   8315  34.625  0.955  6.473  0.750
monarch    480 × 320   90403   2579  34.976  0.953  6.449  0.688
soccer     512 × 480  139156  19177  40.155  0.987  3.293  0.833
yacht      512 × 480  150053  25398  41.261  0.984  2.996  0.851
Table 2. Comparisons of the performances of QHC, MC, OT, W and NQ.
QHC MC OT W NQ QHC MC OT W NQ
baboon fruits
RC 249 256 256 256 256 RC 234 256 256 256 255
SSIM 0.926 0.908 0.918 0.937 0.938 SSIM 0.899 0.859 0.870 0.907 0.910
PSNR 31.679 30.118 31.024 31.677 31.559 PSNR 34.346 32.579 32.872 34.205 34.038
CL 10.852 12.593 11.788 10.468 10.362 CL 7.505 9.501 9.208 7.723 7.518
CR 0.458 0.460 0.460 0.460 0.460 CR 0.455 0.463 0.463 0.463 0.462
barbara housed
RC 243 256 252 256 256 RC 253 256 256 256 255
SSIM 0.927 0.889 0.897 0.932 0.922 SSIM 0.926 0.907 0.925 0.943 0.946
PSNR 33.060 30.609 31.488 32.743 32.207 PSNR 32.521 31.862 33.463 34.066 33.972
CL 8.987 11.982 11.110 9.166 9.394 CL 8.529 10.157 8.396 7.713 7.416
CR 0.459 0.464 0.462 0.464 0.464 CR 0.463 0.464 0.464 0.464 0.464
bike kodim14
RC 244 256 256 256 250 RC 231 256 251 256 248
SSIM 0.883 0.853 0.860 0.912 0.919 SSIM 0,950 0.932 0.952 0.964 0.967
PSNR 32.008 30.021 31.198 32.920 32.490 PSNR 35.006 33.614 35.399 35.202 34.883
CL 9.154 11.894 10.736 8.114 7.797 CL 6.626 8.103 6.626 6.590 6.207
CR 0.449 0.453 0.453 0.453 0.451 CR 0.499 0.508 0.506 0.508 0.505
cablecar kodim15
RC 247 256 256 256 238 RC 225 256 252 256 246
SSIM 0.931 0.913 0.917 0.947 0.950 SSIM 0.938 0.907 0.920 0.949 0.958
PSNR 34.376 32.466 33.049 34.152 33.529 PSNR 36.153 33.414 35.205 35.821 35.705
CL 7.309 9.089 8.543 7.408 7.189 CL 5.713 8.278 6.750 6.188 5.570
CR 0.468 0,471 0,471 0,471 0.465 CR 0.506 0.518 0.517 0.518 0.514
colors lake
RC 240 256 256 256 256 RC 224 256 256 256 254
SSIM 0.862 0.859 0.842 0.895 0.891 SSIM 0.914 0.875 0.900 0.934 0.934
PSNR 31.402 29.866 30.450 32.030 30.650 PSNR 33.465 31.769 32.924 33.589 32.986
CL 10.682 12.200 11.899 9.474 10.367 CL 8.389 10.374 9.167 8.224 8.046
CR 0.477 0.483 0.483 0.483 0.483 CR 0.450 0.461 0.461 0.460 0.460
cornfield monarch
RC 222 256 254 256 242 RC 222 256 256 256 246
SSIM 0.949 0.932 0.932 0.961 0.965 SSIM 0.936 0.919 0.935 0.950 0.953
PSNR 34.618 32.753 33.118 34.465 34.146 PSNR 34.207 33.943 35.235 35.120 34.877
CL 7.070 9.062 8.800 7.232 6.961 CL 7.222 8.085 6.984 6.897 6.663
CR 0.457 0.470 0.469 0.470 0.465 CR 0.473 0.486 0.486 0.486 0.482
flower1 soccer
RC 247 256 255 256 245 RC 226 256 256 256 250
SSIM 0.932 0.896 0.917 0.939 0.943 SSIM 0.946 0.930 0.938 0.960 0.964
PSNR 36.715 34.646 35.916 35.639 35.809 PSNR 34.537 32.531 33.584 34.361 34.095
CL 5.675 7.405 6.430 6.594 6.054 CL 7.535 9.270 8.315 7.378 7.135
CR 0.474 0.477 0.477 0.477 0.473 CR 0.458 0.468 0.468 0.468 0.466
flower2 yacht
RC 243 256 252 256 256 RC 237 256 256 256 253
SSIM 0.896 0.861 0.880 0.902 0.898 SSIM 0.918 0.908 0.917 0.937 0.944
PSNR 33.760 30.897 32.691 33.488 33.396 PSNR 34.029 32.735 33.368 33.839 33.667
CL 8.303 11.397 9.435 8.472 8.162 CL 8.038 9.072 8.435 7.892 7.565
CR 0.454 0.459 0.457 0.459 0.459 CR 0.459 0.465 0.465 0.465 0.464
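PSNR is one of the quality measures reported in Table 2. As a rough illustration, the sketch below computes PSNR with the common definition for 8-bit RGB images (mean squared error over all pixels and channels, peak value 255). Whether this matches the exact variant used to produce the table is an assumption; the function name `psnr` and the toy arrays are hypothetical.

```python
import numpy as np


def psnr(original: np.ndarray, quantized: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two images of identical shape.

    Assumes the common 8-bit definition: 10 * log10(peak^2 / MSE), with the
    MSE averaged over every pixel and every color channel.
    """
    diff = original.astype(np.float64) - quantized.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))


# Toy 2 x 2 RGB image; the "quantized" copy differs by 10 in a single channel value.
original = np.zeros((2, 2, 3), dtype=np.uint8)
quantized = original.copy()
quantized[0, 0, 0] = 10

print(f"PSNR: {psnr(original, quantized):.3f} dB")
```

Higher values indicate a quantized image closer to the input, which is how the PSNR rows of Table 2 are read.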
Fig. 4. Each row shows, from left to right, the four quantized images obtained by limiting to 128, 256, 384 and 512 the number of final representative colors.

Table 3. Performance of QHC when the maximum number of representative colors is set to 128, 256, 384 and 512.
          RC (128)  PSNR    SSIM   RC (256)  PSNR    SSIM   RC (384)  PSNR    SSIM   RC (512)  PSNR    SSIM
barbara   115       31.001  0.888  243       36.060  0.927  346       33.871  0.939  393       34.174  0.944
cablecar  121       32.171  0.903  247       34.376  0.931  364       35.412  0.945  510       36.336  0.957
flower1   123       38.844  0.899  247       36.715  0.932  330       37.327  0.940  419       37.713  0.947
fruits    120       32.491  0.860  234       34.346  0.899  340       35.126  0.913  469       35.956  0.929
kodim15   121       34.354  0.910  225       36.153  0.938  330       37.086  0.950  470       37.847  0.961

The qualitative performance of QHC on a few more images with different sizes and color distributions can be appreciated with reference to Fig. 5. The figure shows the input images (odd rows) and the quantized versions (even rows) obtained by setting to 128 the limit on the maximal number of final representative colors. For each image, the number of colors in the initial and in the quantized image (bold) and the size (in brackets) are also indicated.
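The color counts quoted for each image are numbers of distinct RGB triples. A minimal sketch of such a count follows, assuming the image is held as an H × W × 3 `uint8` NumPy array; the function name `count_colors` and the toy image are hypothetical.

```python
import numpy as np


def count_colors(image: np.ndarray) -> int:
    """Return the number of distinct RGB triples in an H x W x 3 image."""
    pixels = image.reshape(-1, 3)               # one row per pixel
    distinct = np.unique(pixels, axis=0)        # unique rows = unique colors
    return int(distinct.shape[0])


# Toy 2 x 2 image containing three distinct colors (red appears twice).
image = np.array(
    [[[255, 0, 0], [0, 255, 0]],
     [[255, 0, 0], [0, 0, 255]]],
    dtype=np.uint8,
)

print(count_colors(image))
```

Applying the same function to an input image and to its quantized version gives the pair of counts reported per image.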
Fig. 5. Results of the application of QHC by setting to 128 the maximal number of representative colors. For each image: number of colors in the input image, size (in brackets), number of colors in the quantized image.
First pair of rows:   47819 (512 × 512) 105;  96395 (720 × 576) 123;  18764 (481 × 321) 100;  73580 (512 × 512) 113
Second pair of rows:  77426 (480 × 320) 118;  36175 (340 × 256) 113;  197410 (787 × 576) 126;  43365 (344 × 340) 124

5. Concluding Remarks
This paper has described the color quantization algorithm QHC which, starting from an input color image, generates a quantized image with a smaller number of colors while still satisfactorily preserving the visual aspect of the input image. The algorithm consists of two successive processes. The first process is based on the analysis of the histograms of the three color components of the RGB input image. The second process further reduces the number of representative colors by clustering colors based on their Euclidean distance. At the end of the second process, the output image is obtained by replacing the color of each pixel of the input image with the closest representative color. QHC can produce quantized images with different numbers of representative colors, up to a maximum fixed by the user.
The computational cost of the algorithm is O(N), where N is the number of pixels in the image at hand. QHC has been implemented on a Pentium 4 (3.39 GHz, 2 GB RAM) personal computer and has been applied to a large set of images, producing satisfactory results in terms of PSNR, SSIM, CL and CR.
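The final step summarized above, replacing each pixel color with the closest representative color under Euclidean distance, can be sketched as follows. This is only an illustration under stated assumptions: the palette is taken as given (the histogram analysis and clustering that produce it are not reproduced here), the function name `map_to_palette` is hypothetical, and the brute-force distance computation favors clarity over speed and memory.

```python
import numpy as np


def map_to_palette(image: np.ndarray, palette: np.ndarray) -> np.ndarray:
    """Replace every pixel of an H x W x 3 image with its nearest palette color.

    `palette` is a K x 3 array of representative colors (assumed given).
    Nearness is measured by Euclidean distance in RGB space.
    """
    pixels = image.reshape(-1, 3).astype(np.int64)   # (N, 3), avoids uint8 overflow
    pal = palette.astype(np.int64)                   # (K, 3)
    # Squared Euclidean distance from every pixel to every palette color: (N, K).
    d2 = ((pixels[:, None, :] - pal[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)                      # index of closest color per pixel
    return palette[nearest].reshape(image.shape)


# Toy example: four pixels, four representative colors.
palette = np.array(
    [[255, 0, 0], [0, 0, 255], [0, 0, 0], [255, 255, 255]], dtype=np.uint8
)
image = np.array(
    [[[250, 5, 5], [10, 10, 240]],
     [[0, 0, 0], [200, 200, 200]]],
    dtype=np.uint8,
)

quantized = map_to_palette(image, palette)
print(quantized.tolist())
```

The squared distance is used because it has the same minimizer as the Euclidean distance and avoids a square root per pair. For large images or palettes, the N × K distance array may need to be computed in chunks.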
References
1. C. Arcelli and G. Ramella, Finding contour-based abstractions of planar patterns, Pattern Recogn. 26(10) (1993) 1563–1577.
2. A. Atsalakis and N. Papamarkos, Color reduction and estimation of the number of dominant colors by using a self-growing and self-organized neural gas, Eng. Appl. Artif. Intell. 19 (2006) 769–786.
3. Z. Bing, S. Junyi and P. Qinke, An adjustable algorithm for color quantization, Pattern Recogn. Lett. 25 (2004) 1787–1797.
4. J. P. Braquelaire and L. Brun, Comparison and optimization of methods of color image quantization, IEEE Trans. Image Process. 6(7) (1997) 1048–1052.
5. L. Brun and A. Trémeau, Color quantization, in Digital Color Imaging Handbook, Electrical and Applied Signal Processing (CRC Press, 2002), pp. 589–638.
6. M. E. Celebi, Improving the performance of k-means for color quantization, Image Vision Comput. 29 (2011) 260–271.
7. T. W. Chen, Y. L. Chen and S. Y. Chien, Fast image segmentation based on K-means clustering with histograms in HSV color space, in Proc. IEEE 10th Workshop on Multimedia Signal Processing (2008), pp. 322–325.
8. H. C. Chan, Perceived image similarity and quantization resolution, Displays 29 (2008) 451–457.
9. S. C. Cheng and C. K. Yang, A fast and novel technique for color quantization using reduction of color space dimensionality, Pattern Recogn. Lett. 22 (2001) 845–856.
10. J. Delon, A. Desolneux, J. L. Lisani and A. B. Petro, A nonparametric approach for histogram segmentation, IEEE Trans. Image Process. 16(1) (2007) 253–261.
11. A. Dekker, Kohonen neural networks for optimal colour quantization, Network-Comp. Neural Syst. 5(3) (1994) 351–367.
12. M. Gervautz and W. Purgathofer, A simple method for color quantization: Octree quantization, in Graphics Gems, ed. A. S. Glassner (Academic Press, 1990), pp. 287–293.
13. P. S. Heckbert, Color image quantization for frame buffer display, Proc. ACM SIGGRAPH'82 16(3) (1982) 297–307.
14. I. S. Hsieh and K. C. Fan, An adaptive clustering algorithm for color quantization, Pattern Recogn. Lett. 21 (2000) 337–346.
15. Y. Jiang, Y. Wang, L. Jin, H. Gao and K. Zhang, Investigation on color quantization algorithm of color image, ECWAC 2011, Part II, eds. G. Shen and X. Huang, CCIS 144 (Springer, Berlin, 2011), pp. 181–187.
16. K. Kanjanawanishkul and B. Uyyanonvara, Novel fast color reduction algorithm for time-constrained applications, J. Visual Commun. Image Represent. 16 (2005) 311–332.
17. N. Kim and N. Kehtarnavaz, DWT-based scene-adaptive color quantization, Real-Time Imag. 11 (2005) 443–453.
18. A. Mojsilovic and E. Soljanin, Color quantization and processing by Fibonacci lattices, IEEE Trans. Image Process. 10(11) (2001) 1712–1725.
19. D. Ozdemir and L. Akarun, A fuzzy algorithm for color quantization of images, Pattern Recogn. 35 (2002) 1785–1791.
20. A. W. Paeth, Mapping RGB triples onto four bits, in Graphics Gems, ed. A. S. Glassner (Academic Press, Cambridge, MA, 1990), pp. 233–245.
21. N. Papamarkos, A. E. Atsalakis and C. P. Strouthopoulos, Adaptive color reduction, IEEE Trans. Syst. Man Cyber. 32(1) (2002) 44–56.
22. K. N. Plataniotis and A. N. Venetsanopoulos, Color Image Processing and Applications (Springer, Berlin, 2000).
23. G. Ramella and G. Sanniti di Baja, Color quantization by multiresolution analysis, in Computer Analysis of Images and Patterns, eds. X. Jiang and N. Petkov, LNCS 5702 (Springer, Berlin, 2009), pp. 525–532.
24. G. Ramella and G. Sanniti di Baja, Multiresolution histogram analysis for color reduction, Proc. 15th Iberoamerican Congress on Pattern Recognition, eds. I. Bloch and M. R. Cesar-Jr., LNCS 6419 (Springer, Berlin, 2010), pp. 22–29.
25. J. Rasti, A. Monadjemi and A. Vafaei, Color reduction using a multi-stage Kohonen self-organizing map with redundant features, Expert Syst. Appl. 38 (2011) 13188–13197.
26. J. A. Robinson, Adaptive prediction trees for image compression, IEEE Trans. Image Process. 15(8) (2006) 2131–2145.
27. Y. Rui and T. S. Huang, Image retrieval: Current techniques, promising directions, and open issues, J. Visual Commun. Image Represent. 10 (1999) 39–62.
28. D. Salomon, Data Compression: The Complete Reference (Springer-Verlag, London, 2007).
29. N. Shorter and T. Kasparis, Fuzzy ART for relatively fast unsupervised image color quantization, Proc. of 19th Int. Conf. Pattern Recognition ISBN/ISSN: 978-1-4244-2175-6 (IEEE CS Press, 2008).
30. Y. Sirisathitkul, S. Auwatanamongkol and B. Uyyanonvara, Color image quantization using distances between adjacent colors along the color axis with highest color variance, Pattern Recogn. Lett. 25 (2004) 1025–1043.
31. G. Sreelekha and P. S. Sathidevi, An HVS based adaptive quantization scheme for the compression of color images, Digit. Sig. Process. 20 (2010) 1129–1149.
32. J. Wang, W. J. Yang and R. Acharya, Color space quantization for color-content-based query systems, Multi. Tools Appl. 13 (2001) 73–91.
33. Z. Wang, L. Lu and A. C. Bovik, Video quality assessment based on structural distortion measurement, Sig. Process.: Image Commun. 19(2) (2004) 121–132.
34. X. Wu, Color quantization by dynamic programming and principal analysis, ACM Trans. Graph. 11(4) (1992) 349–372.
35. X. Zhang, Z. Song, Y. Wang and H. Wang, Color quantization of digital images, PCM 2005, Part II, eds. Y. S. Ho and H. J. Kim, LNCS 3768 (Springer, Berlin, 2005), pp. 653–664.
36. http://mehdi.rabah.free.fr/SSIM.
37. http://www.hlevkin.com/TestImages/.
38. http://r0k.us/graphics/kodak/.
39. http://sipi.usc.edu/database/.
Giuliana Ramella received her doctoral degree in Physics from the University of Naples Federico II, Naples, Italy, in 1990. In the period 1990–1997, she was granted a research fellowship at the Institute of Cybernetics "E. Caianiello" of the Italian National Research Council, where in 1997 she obtained a permanent position as researcher. Her main interests are digital geometry and topology, multiresolution, shape representation and analysis, and image compression. Since 2000, she has had a number of teaching contracts with three universities of Naples (Federico II, Second University, Parthenope).

Gabriella Sanniti di Baja received her doctoral degree "cum laude" in Physics from the University of Naples, Italy, in 1973. In 2002, she received her Ph.D. Honoris Causa from Uppsala University, Sweden. Since 1973, she has been working in the field of image processing and pattern recognition at the Institute of Cybernetics "E. Caianiello" of the National Research Council of Italy, Naples, where she is currently the director of research. Her main research activities concern 2D and 3D shape representation, decomposition and description. She has published more than 130 papers in international journals and conference proceedings and is an editor-in-chief of Pattern Recognition Letters (special issues). She has been a member of the Executive Committee of the International Association for Pattern Recognition (IAPR) for ten years, being IAPR President from 2000 to 2002. She is an IAPR fellow and Foreign Member of the Royal Society of Sciences at Uppsala, Sweden.