distance measures for color image retrieval

Upload: al-ghamdi-noura

Post on 05-Apr-2018

225 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/2/2019 Distance Measures for Color Image Retrieval

    1/5

    DISTANCE MEASURES FOR COLOR IMAGE RETRIEVALD . Androutsost, K. N . Plataniotist and A . N . Venetsanopoulost

    t Digital Signal & Image Processing LabDepartment of Electrical & Computer EngineeringUniversity of Toronto10 Kings College Road, Toronto, Ontario, M5S 3G4, CANADAhttp: /www.dsp. toronto.edu{zeus,anv}@dsp. oronto.edu

    SRyerson Polytechnic University, Department of Math, Physics & Computer Science350 Victoria Street, Toronto, Ontario, M5B 2K3, CANADAkplat [email protected]

    ABSTRACTIn this paper we address the issue of image database

    retrieval based on color using various vector distancemetrics. Our system is based on color segmentationwhere only a few representative color vectors are ex-tracted from each image and used as image indices.These vectors are then used with vector distance mea-sures to determine similarity between a query color anda database image. We test numerous popular vectordistance measures in our system and find that direc-tional measures provide the most accurate and percep-tually relevant retrievals.

    1. INTRODUCTIONEfficient access to digital da ta has become an issue ofutmost importance recently. In particular, the amountof digital image and video data available is staggeringand the challenging issues of cataloging and retrievalhas gained increasing importance. Without a doubt,efficient access to relevant data directly determines itsvalue.As digital acquisition and storage grow, a numberof industrial fields, such as medical imagery, graphicarts, textile and paint, satellite imagery, criminologyand film, require efficient access to their data . Content-Based Image Retrieval (CBIR) is a relatively new re-search area which is dedicated to the image retrievalproblem [I] and a number of image database systemhave been developed [2 , 31.A key aspect of image databases is the creationof robust and efficient indices which are used for re-trieval. Color remains the most important low-level

    feature which is used to index database images [4]. Thisis not surprising since visual recognition and human re-call are highly dependant t o color.

    However, the majority of retrieval techniques imple-ment color histograms for image retrieval using color.These histogram-based techniques provide good results,however, issues such as histogram dimensionality, andthe lack of good perceptually-based similarity measurescall for new methods. Furthermore, the granularitywhich histogram techniques provide is not essential forefficient retrieval since it is our perception of color thatis of utmost importance and, as humans, we cannotdiscern the difference between very close color values.In this paper we study various vector distance mea-sures which we implement in our image database sys-tem for image retrieval by color. We use color segmen-tation to extract regions of prominent color and userepresentative vectors from these extracted regions inthe image indices. Distance measures are then used inthe query process to determine image similarity. Sec-tion 2 describes our database system and the indexingprocess. Section 3 discusses the distance measures im-plemented and Section 4 presents some results. Finally,Section 5 concludes the paper with a final discussion.

    2. COLOR INDEXINGTo build indices into our image database we take intoconsideration factors such as human color perceptionand recall. For example, as humans it is very diffi-cult, if not impossible, for us to visually discern thedifference between two very close (R ,G, B ) values, e.g.,(255,48,32) and (254 ,48 ,32) . Furthermore, if we were

    7700-8186-8821-1/98$10.000 998 IEEE

    http://toronto.edu/http://toronto.edu/http://toronto.edu/http://toronto.edu/
  • 8/2/2019 Distance Measures for Color Image Retrieval

    2/5

    to describe the color content of an image, we would useterms such as red, dark yellow or deep green, not RGBvalues. Essentially, we build a low level model of theimage in question in our mind and compare candidateimages to this model. The color granularity providedby histogram indexing is, in most cases, not necessary,especially when the final observer is a human.

    2.1. HSV SegmentationOur method of color indexing implements segmenta-tion t o extract regions within the image which containperceptually similar color. We do this by threshold-ing the HSV histogra.ms of the image. Specifically, itis the hue histogram which contains most of the colorinformation. The saituration and value are examinedand used to determine which regions of the image areachromatic. We have found, in the literature and ex-perimentally [5, 61, that colors with value< 25 can beclassified as black and that colors with saturation< 20and value> 60 can be classified as white.

    The remaining pixels all fall in the chromatic regionof the HSV cone, as shown in Figure 1. We build thehue histogram of these remaining pixels and thresholdthe resulting peaks t80segment the image into n re-gions. Finally, we calculate the average color of each of

    VAL

    Green Yellow

    Figure 1: HSV cone depicting BLACK and WHITEregions.the n regions and use that RGB value as each regionsrepresentative vector. Figure 2 shows a typical imageand its corresponding hu e histogram. Figure 3 showsthe segmented result after histogram thresholding. Ascan be seen, the result is an accurate low-level repre-sentation of the color content in the image using only n

    Figure 2: Typical image and it s HUE histogram.

    color vectors. In addition, we also have at our disposalspatial color information which can also be indexed.

    The above segmentation technique was performedon 2000 24-bit natural images of 512 x 512 resolutionand each n representative vector for each image, wereused to build each index.

    3. DIST.ANCE MEASURESTo perform the actual image retrieval we investigated anumber of vector distance measures to discover whichgave the most accurate and perceptually correct result:

    1. The generalized Mink ow sk i metric (L Mnorm):

    771

  • 8/2/2019 Distance Measures for Color Image Retrieval

    3/5

    Figure 3: Segmented image.

    1P M-Md M ( i , j )=(c($ -$ ) I , ( l)k= l

    where p is the dimension of the vector & and xfis the kth element of & . Three special cases ofthe LM metric are of particular interest, namely,L = 1,2,00.The Canberra distance defined as follows:

    where p is the dimension of the vector Zi and xfis the kt h element of & . The Canberra metricapplies only to non-negative multivariate data,which is the case when color vectors described inthe RGB reference system are considered.Another distance measure applicable only to vec-tors with non-negative components, such as colorsignals is the Czekanowski coe f ic ient , defined asfollows:

    The angular d istance between two vectors can beused to quantify similarity:

    Since, similar colors have almost parallel orien-tations, significantly different colors point in dif-ferent overall directions in the RGB color space.Thus, the angular distance which quantifies theorientat ion difference between two color signals isa meaningful measure of their similarity.

    5 . We have also developed a new distance measurewhich combines the angle between two vectorsand their magnitude difference. When two vec-tors under consideration are collinear, only mag-nitude difference is used to quantify intensity dif-ferences:

    angle magni tudeSince we deal with RGB vectors, we are con-

    strained to one quadrant of the Cartesian space.Thus, the normalization factor of $ in the angleportion is attributed to the fact that the maxi-mum angle which can possibly be attained is ;Also, the d m ormalization factor, in themagn i tude part of (6), is due to the fact thatthe maximum difference vector which can existis (255,255,255) and its magnitude is d w .This measure has the added advantage that themagnitude and angle portions can be assignedweights to stress one over the other.

    4. RESULTSThe query was performed by providing a que ry colorfrom a color picker and the system applied the givendistance measure to each n representative vectors ofeach database image. In addition we chose to lookfor images which contain over 50% of the query color.The results were then numerically sor ted and the best23 images were displayed along with the query color.Quantitat ive performance was evaluated by calculatingthe retrieval rate, defined as [7]:

    where Ni is the to tal images in a given query set Q, (i.e.,all images in the database which match the query),

    772

  • 8/2/2019 Distance Measures for Color Image Retrieval

    4/5

    and N j are the number of images which appear in thetop Ni retrieval positions which are part of Q. Theset Q was obtained by asking a number of volunteersto manually search through our 2000 image databaseand list the images which were considered to containat least 50% of a given query color with RGB valuesof 130 ,164 ,53 . Table 1 lists the retrieval rates for theabove mentioned distance measures. Clearly, it can beseen that the angular-based measures perform muchbetter in terms of retrieval rate. For qualitative anal-

    Table 1: Retrieval rate for 7 different vector distancemeasures

    measureL1---LmCanberra

    czelcimowslcianglecombo

    ysis, figures 4-10 depict the top 23 retrieval results forthe 7 discussed measures. In addition, the query coloris included in the top-left corner of each figure. As indi-cated from Table 1 , the Czekanowski measure gives thepoorest result, where the majority of retrieved imagesare visually erroneous.

    The L-norms prcivide improved results (Figs. 4-6).The L1 and La, in particular, provide similar retrievalresults, especially in the first 6 rankings.The Canberra measure (Fig. 7, provided good rank-ing for the first 6 positions but then included manyerroneous images in tlhe remaining results.

    The angular distance-based measures (Figs. 9-10,perceptually provided the best retrieval results of theinvestigated measures. All the retrieved images, withthe exception of 5 , exhibited a high color content sim-ilar to the query color.

    5. CONCLUSIONSIn this paper we have investigated various vector dis-tance measures for use in color image retrieval. In oursystem, we color segment each database image and gen-erate representative BtGB vectors for each of the n col-ors extracted from each image. These n vectors arethen used as database: indices for each color image. Wetest a number of well-known distance measures usingthese indices and a query color and we have found tha t

    Figure 4: Retrieval result using L1 norm. Top leftimage is the query color. Decreasing similarity fromleft to right, top to bottom.

    Figure 5: Retrieval result using Lz norm. Top leftimage is the query color. Decreasing similarity fromleft t o right, top to bottom.

    angle-based distance measures provide perceptually ac-curate results with a high retrieval rate.

    6. REFERENCES[l ] V.N. Gudivada and V.V. Raghavan, Content-

    Computer 28 ,ased image retrieval systems,September 1995.

    [2] W. Niblack, et. al., The QBIC project: Query-ing images by content using color, texture andshape, Storage and Retrieval for Image and VideoDatabases,M.H. Loew, ed., Proc. SPIE 1908 , 1993 .

    [3] J.R. Smith and Shih-Fu Chang, VisualSEEK:a fully automated content-based image query sys-tem, ACM Multimedia Conference, November1996.

    773

  • 8/2/2019 Distance Measures for Color Image Retrieval

    5/5

    Figure 6: Retrieval result using L , norm. Top leftimage is the query color. ~ ~ ~ ~ ~ ~ ~ i ~imilarity fromleft to right, top to bottom.

    Figure 8: Retrieval result using Czekanowski distance.Top left image is the query color. Decreasing similarityfrom left to right, top t o bottom.

    Figure 7: Retrieval result using Canberra distance. Topleft image is th e query color. Decreasing similarity fromleft to right, top to bottom. Figure 9: Retrieval result using angular distance. Top

    left image is the query color. Decreasing similarity fromleft t o right, top to bottom.M. Stricker and M. Orengo, Similarity of colorimages, Storage and Retrieval for Image and VideoDatabases III, Proc. SPIE 2420,pp. 381-392, 1995.Y. Gong, M. Sakauchi, Detection of regions

    matching specified chromatic features, ComputerVision and Image Understanding, 61(2), March,1995.

    N. Herodotou, K.N. Plataniotis, A.N. Venet-sanopoulos, A Content-Based Storage and Re-trieval Scheme for Image and Video Databases,SPIE, Visual Communications and Image Process-ing, Proc SPIE3309, pp. 697-708 San Jose, Jan-uary, 1998.H. Zhang, Y. Gong, S.W. Smoliar, Image retrievalbased on color features: an evaluation study, Dig-ita1 Image Storage and Archiving Systems, procSPIE 2606, pp. 212-220, 1995.

    Figure 10: Retrieval result using the new combinationdistance. Top left image is the query color. Decreasingsimilarity from left to right, top to bottom.

    774