
International Journal of Emerging Trends & Technology in Computer Science (IJETTCS) Web Site: www.ijettcs.org

Volume 1, Issue 4, November – December 2012 ISSN 2278-6856


Comparison of CBIR Techniques using DCT and FFT for Feature Vector Generation

Vibha Bhandari1, Sandeep B. Patil2

1M.E. student at SSCET Bhilai (C.G.) INDIA

2Associate Professor ETC department, SSCET Bhilai (C.G.) INDIA

Abstract: In recent years, the accelerated growth of digital media collections has established the need for tools for the efficient access and retrieval of visual information. This paper presents innovative content based image retrieval (CBIR) techniques whose feature vectors are the DC coefficients of images transformed using DCT and FFT. The feature vector size per image is greatly reduced by taking only the DC coefficient of each R, G, and B component of the transformed image. The proposed CBIR techniques are implemented on a database of 1000 images spread across 10 categories. For each proposed CBIR technique, 50 queries (5 per category) are fired on the database, and the net average precision and recall are computed for all feature sets per transform. FFT surpasses DCT in performance, with the highest precision and recall values.

Keywords: Feature vector, DCT, FFT, DC Coefficient, precision, recall

1. INTRODUCTION

Image processing is a field that is changing rapidly and gaining users day by day. One of its widely used applications is content based image retrieval, which aims to retrieve images similar to a query image from a database. Due to the rapid development of the World Wide Web (WWW) and imaging technology, more and more images are available on the Internet and stored in databases. Searching for images related to a query image is becoming tedious and difficult. Most of the existing CBIR based search engines are keyword-dependent. If the keywords are not relevant to the images, the purpose of retrieving similar kinds of images is lost. To overcome this problem, CBIR based on the semantics of the images came into existence. Content Based Image Retrieval (CBIR) is a set of techniques for retrieving semantically relevant images from an image database based on automatically derived image features [1]. Typical CBIR involves two phases. In the first phase, features characterizing each image in the database are computed and stored as feature vectors. In the second phase, the same set of features is calculated for the user-given query image and compared with all the stored feature vectors using a distance measure such as the Euclidean distance, as shown in Figure 1.

The success of image retrieval depends on the performance and speed of the retrieval technique used. Many techniques have already been proposed for content based image retrieval; however, the demand for better performance and faster retrieval remains. The retrieval accuracy, computational complexity and retrieval time depend on the dimension of the feature vector [7]. The higher the dimension of the feature vectors, the better the retrieval accuracy, but the memory for storage, the retrieval time and the computational complexity all increase. Thus it is important to find a balance between the dimension of the feature vectors and the accuracy, with satisfactory storage and computational requirements. This paper proposes novel content based image retrieval techniques that minimize the size of the feature vector in order to reduce the system complexity and obtain faster retrieval. It is proposed to extract the feature vector using frequency domain analysis of the image [10]. We obtain the coefficients of the applied transform and use them as image features. Frequency domain calculations were made using DCT and FFT.

Figure 1: Architecture of CBIR

2. FOURIER TRANSFORM

The Fourier Transform is an important image processing tool which is used to decompose an image into its sine and cosine components. The output of the transformation represents the image in the Fourier or frequency domain, while the input image is the spatial domain equivalent. In the Fourier domain image, each point represents a particular frequency contained in the spatial domain image. The practical implementation of the discrete Fourier transform (DFT) on a computer nearly always uses the Fast Fourier Transform (FFT). The FFT is simply an algorithm (i.e., a particular method of performing a series of computations) that can compute the discrete Fourier transform much more rapidly than other available algorithms. The FFT is surely the most widely used signal processing algorithm and is the basic building block for a large percentage of algorithms in current usage. The 2D FFT pair is given by:

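$$F(k,l) = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} f(m,n)\, e^{-j 2\pi \left( \frac{mk}{M} + \frac{nl}{N} \right)} \tag{1}$$

$$f(m,n) = \frac{1}{MN} \sum_{k=0}^{M-1} \sum_{l=0}^{N-1} F(k,l)\, e^{j 2\pi \left( \frac{mk}{M} + \frac{nl}{N} \right)} \tag{2}$$

where $0 \le m, k \le M-1$ and $0 \le n, l \le N-1$.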
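As a rough illustration of equation (1), the following sketch evaluates the forward 2D transform directly from its definition and reads off the coefficient at the origin. It assumes the common convention in which the forward transform carries no scale factor; the function name `dft2` and the 2x3 test plane are hypothetical and chosen only for this example. A direct evaluation like this is far slower than an FFT and is meant only to make the indices concrete.

```python
import cmath

def dft2(f):
    """Naive 2D DFT of an M x N image given as a list of rows.

    Assumes an unscaled forward transform (scale factor only on the inverse).
    """
    M, N = len(f), len(f[0])
    F = [[0j] * N for _ in range(M)]
    for k in range(M):          # frequency index along rows
        for l in range(N):      # frequency index along columns
            total = 0j
            for m in range(M):
                for n in range(N):
                    angle = -2 * cmath.pi * (m * k / M + n * l / N)
                    total += f[m][n] * cmath.exp(1j * angle)
            F[k][l] = total
    return F

# Hypothetical 2 x 3 colour plane used only for illustration.
plane = [[10, 20, 30],
         [40, 50, 60]]
F = dft2(plane)
dc = F[0][0]                                  # coefficient at the origin
pixel_sum = sum(sum(row) for row in plane)    # 210
```

At k = l = 0 every complex exponential equals 1, so under this convention the coefficient at the origin is simply the sum of the pixel values.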

3. DISCRETE COSINE TRANSFORM

Like any Fourier-related transform, the discrete cosine transform (DCT) expresses a function or a signal as a sum of sinusoids with different frequencies and amplitudes. Like the discrete Fourier transform (DFT), a DCT operates on a function at a finite number of discrete data points. The obvious distinction between the two is that the DCT uses only cosine functions, while the DFT uses both cosines and sines (in the form of complex exponentials). The DCT is a separable linear transformation; that is, the two-dimensional transform is equivalent to a one-dimensional DCT performed along one dimension followed by a one-dimensional DCT along the other dimension.
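The separability property described above can be checked numerically. The sketch below is a minimal illustration, not the authors' implementation: it uses an unnormalised DCT-II (scale factors omitted for brevity), and the function names and the 3x3 test image are hypothetical.

```python
import math

def dct1(v):
    """Unnormalised 1D DCT-II of a sequence v (scale factors omitted)."""
    n = len(v)
    return [sum(v[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                for x in range(n))
            for u in range(n)]

def dct2_separable(img):
    """2D DCT computed as 1D DCTs along rows, then along columns."""
    rows = [dct1(r) for r in img]                 # transform each row
    cols = [dct1(list(c)) for c in zip(*rows)]    # transform each column
    return [list(r) for r in zip(*cols)]          # transpose back

def dct2_direct(img):
    """2D DCT evaluated directly from the double sum."""
    M, N = len(img), len(img[0])
    return [[sum(img[x][y]
                 * math.cos((2 * x + 1) * u * math.pi / (2 * M))
                 * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                 for x in range(M) for y in range(N))
             for v in range(N)]
            for u in range(M)]

# Hypothetical 3 x 3 image used only for illustration.
img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 10]]
a = dct2_separable(img)
b = dct2_direct(img)
max_diff = max(abs(a[u][v] - b[u][v]) for u in range(3) for v in range(3))
```

Both routes give the same coefficients up to floating-point rounding, and the top-left coefficient of the unnormalised transform equals the sum of the pixel values.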

The two-dimensional DCT of an N×N image $f(x,y)$ is given by:

$$F(m,n) = \frac{2}{N}\, C(m)\, C(n) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y) \cos\left[\frac{(2x+1)\, m \pi}{2N}\right] \cos\left[\frac{(2y+1)\, n \pi}{2N}\right] \tag{3}$$

where $C(m), C(n) = 1/\sqrt{2}$ for $m, n = 0$ and $C(m), C(n) = 1$ otherwise. In equation (3), $x$ and $y$ index pixel position and $m$ and $n$ index frequency. For a full 2-dimensional DCT of an N×N image, the number of multiplications required is $N^2(2N)$ and the number of additions required is $N^2(2N-2)$.

4. ALGORITHM

The method of image search and retrieval proposed here focuses on generating the feature vector from the image transform produced by the DCT or FFT. The steps of the algorithm are given below.

Step 1: Extract the Red, Green and Blue components of the color image.

Step 2: Apply the transform (DCT or FFT) to the individual color planes of the image.

Step 3: Take the top-left coefficient (DC term) of the transform of each color component, i.e. R, G and B, as an image feature and store it in the feature vector.

Step 4: Calculate the Euclidean distances between the feature vector of the query image and the feature vectors of the images in the database.

Step 5: Measure the algorithm performance based on the average precision and average recall of each class of images and their average across the classes.

5. PROPOSED METHODOLOGY

5.1 Development Environment

The proposed algorithm is implemented in MATLAB 7.0 on a computer with an Intel Core2 Duo Processor E4500 (2.20 GHz) and 2 GB RAM.

5.2 Database Generation

The database consists of different categories, namely Africans and villages, Beaches, Buses, Dinosaurs, Elephants, Flowers, Horses, Mountains and glaciers, Food and Natural scenes, collected from the open source [8]. The images are stored in JPEG format with size 384×256 or 256×384, and each image is represented in the RGB color space.

5.3 Generation of Feature Vector

When a query image is submitted by a user, its feature vector is computed in the same way as for the database images and matched against the precomputed feature vectors in the database. The proposed algorithm makes use of the well-known Discrete Cosine Transform (DCT) and Fast Fourier Transform (FFT) to generate the feature vectors for the search and retrieval of database images.
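Steps 1 to 3 of the algorithm can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' MATLAB code: because only the DC term is kept, it evaluates that term directly from the transform definitions instead of computing the full transform; it assumes an unscaled forward DFT and an orthonormal DCT scaling of 1/sqrt(rows × columns); and all function names and the 2x2 test image are hypothetical.

```python
import math

def plane_sum(plane):
    """Sum of all pixel values in one colour plane."""
    return sum(sum(row) for row in plane)

def dc_fft(plane):
    """DC term of an unscaled 2D DFT: every exponential is 1 at the origin."""
    return float(plane_sum(plane))

def dc_dct(plane):
    """DC term of a 2D DCT, assuming orthonormal scaling 1/sqrt(rows*cols)."""
    rows, cols = len(plane), len(plane[0])
    return plane_sum(plane) / math.sqrt(rows * cols)

def split_rgb(image):
    """Step 1: split an image (rows of (R, G, B) tuples) into three planes."""
    return [[[pixel[c] for pixel in row] for row in image] for c in range(3)]

def feature_vector(image, dc_term):
    """Steps 2 and 3: one DC coefficient per colour plane -> 3-element vector."""
    return [dc_term(plane) for plane in split_rgb(image)]

# Hypothetical 2 x 2 RGB image: top row pure red, bottom row pure blue.
image = [[(255, 0, 0), (255, 0, 0)],
         [(0, 0, 255), (0, 0, 255)]]
fft_features = feature_vector(image, dc_fft)   # [510.0, 0.0, 510.0]
dct_features = feature_vector(image, dc_dct)   # [255.0, 0.0, 255.0]
```

Each image is thus reduced to three numbers regardless of its size, which is what keeps the feature database small.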

Figure 2: Flow diagram of feature vector generation

Figure 3: Flow diagram of proposed method




In this method, the color image is first divided into its R, G and B components, and the transform is then applied to each component to obtain the transform coefficients. The value of the transform at the origin of the frequency domain, i.e. F(0,0), is called the DC component of the transform [11]. This first coefficient represents the energy information. The DC component of each transformed color component, i.e. R, G and B, is used for feature vector generation, so each color component contributes one feature value. The feature vector of one RGB image therefore consists of three feature values, one for each color component. Once the feature vectors have been generated for all images in the database, they are stored in a feature database.

5.4 Similarity measure

For similarity comparison, we use the Euclidean distance d, given by

$$d = \sqrt{\sum_{i=1}^{n} \left( F_q[i] - F_{db}[i] \right)^2} \tag{4}$$

where $F_q[i]$ is the i-th query image feature, $F_{db}[i]$ is the corresponding feature in the feature vector database, and $n$ is the number of features per image (three in this method).

6. EXPERIMENTAL RESULTS

A database of 1000 images from 10 different classes is used to check the performance of the algorithm developed. Precision and recall values are used to measure the retrieval effectiveness of the image retrieval system. Five different images of each category are used as query images; some representative query images are shown in Figure 4.
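The similarity measure of equation (4) and the ranking of database images by ascending distance can be sketched as below. The image identifiers and feature values are hypothetical and serve only to illustrate the computation.

```python
import math

def euclidean(fq, fdb):
    """Euclidean distance between a query and a database feature vector."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(fq, fdb)))

def rank(query, database):
    """Return image ids sorted by ascending distance to the query."""
    return sorted(database, key=lambda image_id: euclidean(query, database[image_id]))

# Hypothetical feature database: image id -> [DC_R, DC_G, DC_B].
database = {
    "a": [0.0, 0.0, 0.0],
    "b": [3.0, 4.0, 0.0],
    "c": [1.0, 1.0, 1.0],
}
query = [0.0, 0.0, 0.0]
ranking = rank(query, database)             # ['a', 'c', 'b']
d_b = euclidean(query, database["b"])       # 5.0
```

With three features per image, each comparison costs only three subtractions, three multiplications and one square root.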

Figure 4: Sample query Images

To test the performance of each proposed CBIR technique, 50 queries (5 from each category) are fired on the database of 1000 variable-size generic images spread across 10 categories. Query and database images are matched using the Euclidean distance. The feature vector of each query image is calculated and used to search the feature database. The Euclidean distances between the feature vectors of the query image and the database images, sorted in ascending order, are used to calculate the precision and recall that measure the retrieval performance of the algorithm, as shown in equations (5) and (6).

$$\text{Precision} = \frac{\text{Number of relevant images retrieved}}{\text{Total number of images retrieved}} \tag{5}$$

$$\text{Recall} = \frac{\text{Number of relevant images retrieved}}{\text{Total number of relevant images in the database}} \tag{6}$$

The average precision and average recall are computed by grouping the number of retrieved images, sorted according to ascending Euclidean distance from the query image. The crossover point of precision and recall is the point on the graph where the precision and recall curves meet. The crossover point can be used as a measure of how correct the algorithm is: the higher the crossover point, the better the performance of the method [5]. The following figures show the graphs of precision and recall values plotted against the number of retrieved images for the FFT and DCT based image retrieval techniques. The graphs for the system using DCT for feature vector generation are given in Figures 5 to 14.
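Equations (5) and (6) and the crossover point can be illustrated with a short sketch. It assumes the class labels of the retrieved images are available in ranked order; the function names and the example ranking are hypothetical. Since precision and recall share the same numerator, the two curves coincide when the number of retrieved images equals the number of relevant images, and the sketch reads the crossover value at that point. How the authors located the crossover is not stated, so this is only one plausible reading.

```python
def precision_recall(ranked_labels, query_label, total_relevant):
    """Precision and recall after each retrieved image, in ranked order."""
    precisions, recalls = [], []
    hits = 0
    for retrieved, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
        precisions.append(hits / retrieved)       # equation (5)
        recalls.append(hits / total_relevant)     # equation (6)
    return precisions, recalls

def crossover(precisions, total_relevant):
    """Value where the curves meet: retrieved count == relevant count."""
    return precisions[total_relevant - 1]

# Hypothetical ranking for a "dino" query with 3 relevant images in the database.
ranked_labels = ["dino", "dino", "bus", "dino", "bus", "bus"]
p, r = precision_recall(ranked_labels, "dino", total_relevant=3)
cross = crossover(p, total_relevant=3)    # 2 of the first 3 results are relevant
```

Averaging the crossover value over the five queries of a class gives a single figure per class that can be compared across transforms.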

Figure 5 Average Precision and Average Recall Performance for Dinosaur class using DCT-CBIR

Figure 6 Average Precision and Average Recall Performance for Flower class using DCT-CBIR

Figure 7 Average Precision and Average Recall Performance for Horse class using DCT-CBIR

Figure 8 Average Precision and Average Recall Performance for Elephant class using DCT-CBIR

Figure 9 Average Precision and Average Recall Performance for Sunset class using DCT-CBIR

Figure 10 Average Precision and Average Recall Performance for Mountain class using DCT-CBIR

Figure 11 Average Precision and Average Recall Performance for Beach class using DCT-CBIR

Figure 12 Average Precision and Average Recall Performance for Food class using DCT-CBIR

Figure 13 Average Precision and Average Recall Performance for African class using DCT-CBIR

Figure 14 Average Precision and Average Recall Performance for Bus class using DCT-CBIR

Graphs for the system using FFT for feature vector generation:




Figure 15 Average Precision and Average Recall Performance for Dinosaur class using FFT-CBIR

Figure 16 Average Precision and Average Recall Performance for Flower class using FFT-CBIR

Figure 17 Average Precision and Average Recall Performance for Horse class using FFT-CBIR

Figure 18 Average Precision and Average Recall Performance for Elephant class using FFT-CBIR

Figure 19 Average Precision and Average Recall Performance for Sunset class using FFT-CBIR

Figure 20 Average Precision and Average Recall Performance for Mountain class using FFT-CBIR

Figure 21 Average Precision and Average Recall Performance for Beach class using FFT-CBIR

Figure 22 Average Precision and Average Recall Performance for Food class using FFT-CBIR

Figure 23 Average Precision and Average Recall Performance for African class using FFT-CBIR

Figure 24 Average Precision and Average Recall Performance for Bus class using FFT-CBIR

Figure 26 shows the performance comparison of both transforms for the proposed CBIR techniques. Figure 25 indicates the crossover points of DCT-CBIR and FFT-CBIR for the different classes of images in the database. FFT performs better than DCT for almost all classes; for the African class, DCT outperforms FFT.

Figure 25 Comparison of average cross over point for different categories

Figure 26 shows the plot of the average crossover point over all categories for both transformation techniques. It is clear from the graph that FFT surpasses DCT in terms of the average precision-recall crossover point.

Figure 26 Comparison of average cross over point

7. CONCLUSIONS

In this paper, we presented a system for efficient retrieval of images in response to a query expressed as an example image. The system can be categorized as a content-based image retrieval system, as it is efficient for similar-image searching. We used transformation techniques to generate the feature vectors. In all, three components, i.e. the DC components of the transformed R, G and B planes, are considered for feature vector generation. Thus the algorithm is very fast compared to algorithms using full transforms. The two transformation techniques, DCT and FFT, are compared in terms of the average precision and recall crossover point. FFT performs better than DCT, with the highest precision and recall values.

REFERENCES

[1] F. Long, H. Zhang, H. Dagan, and D. Feng, “Fundamentals of content based image retrieval,” in Multimedia Information Retrieval and Management: Technological Fundamentals and Applications, D. Feng, W. Siu, and H. Zhang, Eds., Berlin Heidelberg New York: Springer-Verlag, 2003, ch. 1, pp. 1-26.

[2] Wai-Pak Choi, Kin-Man Lam, and Wan-Chi Siu, “An efficient algorithm for the extraction of Euclidean skeleton,” IEEE Transactions on Image Processing, 2002.

[3] Y. Rui, T. S. Huang, and S. F. Chang, “Image retrieval: current techniques, promising directions and open issues,” Journal of Visual Communication and Image Representation, 10(1), 1999, pp. 39-62.

[4] P. S. Suhasini, K. Sri Rama Krishna, and I. V. Murali Krishna, “CBIR using color histogram processing,” Journal of Theoretical and Applied Information Technology, 2009.

[5] H. B. Kekre and Dhirendra Mishra, “CBIR using Upper Six FFT Sectors of Color Images for Feature Vector Generation,” International Journal of Engineering and Technology, Vol. 2(2), 2010, pp. 49-54.




[6] C. W. Ngo, T. C. Pong, and R. T. Chin, “Exploiting Image Indexing Techniques in DCT Domain,” Pattern Recognition, 34, 2001, pp. 1841-1851.

[7] Mustafa Ozden and Ediz Polat, “A color image segmentation approach for content-based image retrieval,” Pattern Recognition Society, published by Elsevier Ltd., 2006.

[8] Young Deok Chun, Nam Chul Kim, and Ick Hoon Jang, “Content-Based Image Retrieval Using Multiresolution Color and Texture Features,” IEEE Transactions on Multimedia, vol. 10, no. 6, pp. 1073-1084, Oct. 2008.

[9] http://wang.ist.psu.edu/docs/related/Image.orig

[10] B. G. Prasad, K. K. Biswas, and S. K. Gupta, “Region-based image retrieval using integrated color, shape, and location index,” International Journal on Computer Vision and Image Understanding, Special Issue: Colour for Image Indexing and Retrieval, Volume 94, Issues 1-3, April-June 2004, pp. 193-233.

[11] Rafael C. Gonzalez, Richard E. Woods, and Steven L. Eddins, “Digital Image Processing Using MATLAB.”

[12] M. J. Swain and D. H. Ballard, “Color Indexing,” International Journal of Computer Vision, 7(1), 1991, pp. 11-32.

[13] P. Gnanasivam and S. Muttan, “Gender Identification Using Fingerprint through Frequency Domain Analysis,” European Journal of Scientific Research, ISSN 1450-216X, Vol. 59, No. 2, 2011, pp. 191-199, Euro Journals Publishing, Inc.

AUTHOR

Vibha Bhandari is currently pursuing a Master's degree in Communication Systems at Chhattisgarh Swami Vivekananda Technical University, Bhilai, India.

Mr. Sandeep B. Patil received his B.E. (E&Tc) and M.E. (Elex) from Dr. B.A.M.U. Aurangabad (M.S.). He is a research scholar pursuing his Ph.D. in the field of image processing at CSVTU Bhilai. He is working as Associate Professor in the department of Electronics & Telecommunication of Shri Shankaracharya Group of Institutions, Bhilai. He has published 16 research papers in various national and international journals and conferences. His research interests include biometrics and its applications, Devnagri character recognition using neural networks, PCA, HMM, pattern recognition and biomedical imaging.
