Research Article
A Novel Image Retrieval Based on a Combination of Local and Global Histograms of Visual Words

Zahid Mehmood,1,2 Syed Muhammad Anwar,2 Nouman Ali,1 Hafiz Adnan Habib,3

and Muhammad Rashid4

1 Department of Computer Engineering, University of Engineering and Technology, Taxila 47050, Pakistan
2 Department of Software Engineering, University of Engineering and Technology, Taxila 47050, Pakistan
3 Department of Computer Science, University of Engineering and Technology, Taxila 47050, Pakistan
4 Department of Computer Engineering, Umm Al-Qura University, Makkah 21421, Saudi Arabia

Correspondence should be addressed to Zahid Mehmood; zahid.mehmood@uettaxila.edu.pk

Received 19 April 2016; Revised 14 June 2016; Accepted 19 June 2016

Academic Editor: Jinyang Liang

Copyright © 2016 Zahid Mehmood et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Content-based image retrieval (CBIR) provides a sustainable solution to retrieve similar images from an image archive. In the last few years, the Bag-of-Visual-Words (BoVW) model has gained attention and significantly improved the performance of image retrieval. In the standard BoVW model, an image is represented as an orderless global histogram of visual words, ignoring the spatial layout. The spatial layout of an image carries significant information that can enhance the performance of CBIR. In this paper, we present a novel image representation that is based on a combination of local and global histograms of visual words. The global histogram of visual words is constructed over the whole image, while the local histogram of visual words is constructed over a local rectangular region of the image. The local histogram contains the spatial information about the salient objects. Extensive experiments and comparisons conducted on the Corel-A, Caltech-256, and Ground Truth image datasets demonstrate that the proposed image representation increases the performance of image retrieval.

1. Introduction

CBIR is used to search for images in an image archive that are in a semantic relationship with the query image [1-3]. Occlusion, overlapping objects, spatial layout, image resolution, variations in illumination, the semantic gap, and the exponential growth in multimedia content make CBIR a challenging problem [1-3]. In CBIR, a feature vector is used to represent an image in the form of low-level visual features [1, 2]. The feature vector of a query image is computed and compared with the feature vectors of the images placed in the image archive [4]. The closeness of the feature vector values determines the output. The appearance of a similar view in images belonging to different classes results in close feature vector values, which decreases the performance of image retrieval [4]. The main focus of research in CBIR is to retrieve images that are in a semantic relationship with the query image [3, 5].

In the standard BoVW model [6], an image is represented as an orderless global histogram of visual words, ignoring the spatial layout of the 2D image space. The spatial attributes of an image carry information that enhances the performance of image retrieval [7]. Approaches based on query expansion [8], large vocabulary size [7], and soft quantization [9] have been applied to enhance the performance of CBIR. All of these approaches ignore the spatial layout that provides discriminating details [7]. According to Anwar et al. [10], two approaches add spatial information to the inverted index of the BoVW-based image representation. The first approach deals with visual word cooccurrence and incurs a high computational cost with a larger vocabulary [11]. The second approach divides an image into subregions and constructs a histogram from each of the subregions [12]. Various types of image representations have been proposed by selecting different semantic regions from the images [12-16].



Figure 1: Images from four different classes of Corel-A image benchmark.

Keeping in view the robust performance of the second approach [12, 14], the proposed research constructs two histograms from a single image. The global histogram of visual words is constructed over the whole image, while the local histogram of visual words is constructed over a local rectangular region of the image. The details about the selection of the local rectangular area for the construction of the local histogram are given in Section 3. Figure 1 presents images from four different classes (Africa, Beach, Elephants, and Horses) of the Corel-A image benchmark. Grass, sky, and trees appear in the global view of all these images, while the objects of interest or salient objects (such as people, beach, elephant, and horses) lie in the central region of the images. There is no semantic relationship between these images, as they belong to different classes. The discriminating information is likely to be available in the central region of an image. The second row of Figure 1 shows the extracted local rectangular regions of these images (the area of the image that is selected for the construction of the local histogram of visual words). The extracted local rectangular regions of all of these images are discriminating, as they contain the information about the salient objects of the image. Figure 2 presents an image from the semantic class Elephant of the Corel-A image benchmark. The global appearance of the image contains sky, trees, and grass, while the main object of interest in the image is the elephant. The construction of a local histogram from the image central area adds the spatial attributes of the salient object to the inverted index of the BoVW representation.

Keeping these facts in view, dense features are extracted from a set of training and test images, and the feature space is quantized to construct a visual vocabulary. Images are represented as histograms of visual words (using a combination of local and global histograms of visual words). The local histogram, which is constructed over the local rectangular region of an image, contains the spatial information about the salient objects. The global and local histograms of visual words are concatenated, and this information is added to the inverted index of the BoVW representation. The main contributions of this paper are:

(1) image representation as a combination of local and global histograms of visual words;

(2) the addition of spatial information from the central area of the image to the inverted index of the BoVW representation;

(3) reduction of the semantic gap between the low-level features of an image and high-level semantic concepts.

The rest of the paper is organized as follows. Section 2 is about the related research. Section 3 describes the proposed research methodology. Section 4 presents the experimental details and results obtained on three image benchmarks. Section 5 concludes our research work and points towards future directions.

2. Related Work

Query by Image Content (QBIC) is the first system launched by IBM for image search [2]. After that, a variety of image retrieval techniques were proposed that are based on color, texture, and shape [1-3, 5]. Interest point detectors and descriptors like Histogram of Oriented Gradients (HOG) [22], Scale-Invariant Feature Transform (SIFT) [23], Maximally Stable Extremal Regions (MSER) [24], Speeded-Up Robust Features (SURF) [25], and BRISK features [26] are used for robust content-based image matching [27]. Ashraf et al. [28] proposed an image retrieval approach using a combination of color features and the bandlet transform. The bandlet transform-based representation was selected to extract the salient objects from the images. Artificial Neural Networks (ANN) with an inverted index were selected for efficient image retrieval. Zeng et al. [29] proposed an image representation based on the spatiogram, which is a generalized histogram of quantized colors. In the first step, the color space was quantized by applying Gaussian Mixture Models (GMM) and the Expectation-Maximization (EM) algorithm. The number of quantized color bins was determined on the basis of the Bayesian Information Criterion (BIC). A spatiogram is based on a histogram with a spatial distribution of color, and each bin represents the weighted distribution of pixels. The similarity between two images was determined on the basis of the similarity between the respective spatiograms. Walia and Pal [30] proposed a framework for color image retrieval using a combination of low-level features. The Color Difference Histogram (CDH) was used to extract the color and texture features.


Figure 2: (a) presents the procedure for the computation of the global histogram, while (b) presents the procedure for the extraction of the local histogram.

The shape features were extracted by applying an Angular Radial Transform (ART). The retrieval effectiveness of the proposed framework was enhanced by modifying the CDH algorithm. Irtaza and Jaffar [31] proposed a combination of Genetic Algorithm (GA) and Support Vector Machines (SVM) to reduce the semantic gap. Images were represented as a combination of curvelet, wavelet, and Gabor features. Relevance Feedback (RF) was applied to enhance the accuracy of the proposed work. Tian et al. [19] proposed a rotation-invariant and scale-invariant Edge Orientation Difference Histogram (EODH) descriptor. A steerable filter and vector sum were applied to obtain the main orientation of the pixels. The effectiveness of the feature space was improved by selecting a combination of Color SIFT and EODH. The codebook was constructed by applying the weighted average of Color SIFT and EODH. Yu et al. [32] proposed an image retrieval approach using combinations of Local Binary Patterns (LBP) with HOG and SIFT. The midlevel feature combinations were selected to achieve high performance on complex backgrounds. Visual features were separately extracted from the images by using SIFT and LBP. Weighted average clustering was applied for the codebook construction to obtain the integration of the two midlevel features. According to the experimental results [32], the best performance of image retrieval was obtained by applying the feature integration of SIFT and LBP. Wang et al. [15] proposed the Spatial Weighting Bag-of-Features (SWBOF) framework and extracted spatial information from the subblocks of the images. Local entropy, local variance, and adjacent block distance were selected to calculate the spatial information. According to the experimental results [15], SWBOF with spatial information performs better than the traditional BoF representation. Yildizer et al. [33] proposed CBIR using a multiple Support Vector Machines ensemble; SVM is used for regression, and a Support Vector Regression (SVR) ensemble is selected for classification. According to the experimental results, the proposed technique improves the accuracy of image retrieval. Cardoso et al. [20] proposed an iterative technique for CBIR that is based on multiple SVM ensembles; the Discrete Cosine Transform (DCT) is selected for feature extraction, and multi-SVM is used for classification. Yildizer et al. [21] integrated wavelets for effective content-based image retrieval; k-means with a B+-tree data structure is used for clustering, together with the Daubechies wavelet transform, whose excellent spatial and spectral locality properties make it very useful for CBIR; it is applied to partition the images into different levels.

Youssef [17] proposed a novel approach Integrating Curvelet Transform with Enhanced Dominant Colors extraction and Texture (ICTEDCT) analysis for efficient CBIR. Curvelet multiscale ridgelets were integrated with region-based vector codebook subband clustering to enhance dominant colors extraction and texture analysis.

3. Proposed Methodology

The lack of spatial information is the main problem in the standard BoVW representation [7, 15]. Visual words are represented in a histogram without considering their locations in the 2D image plane. The spatial information carries discriminating details that enhance the performance of CBIR [7, 15]. The block diagram of the proposed research methodology is presented in Figure 3.

Figure 3: Block diagram of the proposed research based on a combination of local and global histograms of visual words.

(1) In the BoVW representation, a raw image I is represented as

I = (a_{ij}),  (1)

where a_{ij} is the pixel at position (i, j).

(2) Dense SIFT features are extracted from the image, and an image I is represented as

I = \{d_1, d_2, \ldots, d_m\},  (2)

where d_1 to d_m are the image descriptors.

(3) A quantization algorithm such as k-means clustering is applied to construct a visual vocabulary (codebook) consisting of n visual words, represented as

\text{voc} = \{w_1, w_2, \ldots, w_n\},  (3)

where w_1 to w_n are the visual words.

(4) For the construction of the global histogram of visual words, the mapping of each visual word is done over the whole image. For the construction of the local histogram of visual words, the mapping of each visual word is done over the image central area extracted by using (5). The nearest visual word is assigned to each quantized descriptor by using the following equation:

w(d_k) = \arg\min_{w \in \text{voc}} \operatorname{Dist}(w, d_k),  (4)

where w(d_k) is the visual word assigned to the kth descriptor d_k, while Dist(w, d_k) is the distance between the descriptor d_k and the visual word w. Each image is represented as a collection of patches, and each patch is represented by visual words.

(5) Two histograms of N visual words are extracted from a single image. The computed local and global histograms are concatenated, and this information is added to the inverted index of the BoVW representation. The visual words of the local histogram are mapped to a rectangular area that is defined in terms of the image width (W) and height (H). Four points are defined to extract the optimized local rectangular area as follows (see the illustrative sketch after this list):

p_1 = W \times x,
p_2 = (W \times y) + p_1 = (W \times y) + (W \times x) = W \times (x + y),
p_3 = H \times x,
p_4 = (H \times y) + p_3 = (H \times y) + (H \times x) = H \times (x + y),  (5)

where x is the normalized starting point and y is the ending point of the local rectangular area.

(6) Consider N as the number of visual words of the vocabulary. Let D_i be the set of descriptors that are mapped to the visual word w_i; then the ith bin of the histogram of visual words, b_i, is the cardinality of the set D_i:

b_i = \operatorname{Card}(D_i),
D_i = \{d_k, \; k \in (1, \ldots, N) \mid w(d_k) = w_i\}.  (6)
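As an illustration of steps (1)-(6), a minimal Python sketch of the vocabulary construction and of the combined local and global histogram is given below. The paper's implementation is in MATLAB; this sketch is an assumption-laden reconstruction in which the helper names are ours and dense descriptors together with their (x, y) grid positions are assumed to be already extracted.

```python
import numpy as np
from scipy.cluster.vq import kmeans2, vq


def build_vocabulary(training_descriptors, n_words, seed=0):
    """Quantize the descriptor space with k-means into n_words visual words (eq. (3))."""
    data = np.vstack(training_descriptors).astype(np.float64)
    vocabulary, _ = kmeans2(data, n_words, minit='++', seed=seed)
    return vocabulary


def local_rectangle(width, height, x=0.22, y=0.60):
    """Corner points of the local rectangular region, following eq. (5)."""
    p1, p2 = width * x, width * (x + y)    # horizontal start / end
    p3, p4 = height * x, height * (x + y)  # vertical start / end
    return p1, p2, p3, p4


def bovw_histogram(descriptors, vocabulary):
    """Histogram of visual words: each descriptor is assigned to its nearest word (eqs. (4) and (6))."""
    words, _ = vq(descriptors, vocabulary)
    return np.bincount(words, minlength=len(vocabulary)).astype(np.float64)


def combined_histogram(descriptors, positions, image_shape, vocabulary, x=0.22, y=0.60):
    """Concatenation of the global histogram and the local histogram over the central rectangle."""
    descriptors = np.asarray(descriptors, dtype=np.float64)
    positions = np.asarray(positions, dtype=np.float64)   # one (x, y) position per descriptor
    height, width = image_shape[:2]
    p1, p2, p3, p4 = local_rectangle(width, height, x, y)
    cols, rows = positions[:, 0], positions[:, 1]
    inside = (cols >= p1) & (cols <= p2) & (rows >= p3) & (rows <= p4)

    global_hist = bovw_histogram(descriptors, vocabulary)
    local_hist = (bovw_histogram(descriptors[inside], vocabulary)
                  if np.any(inside) else np.zeros(len(vocabulary)))
    return np.concatenate([global_hist, local_hist])
```

The concatenated vector of length 2N is what is added to the inverted index; with the values of x and y reported in Section 4, the local part covers the central region between 22% and 82% of the image width and height.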

3.1. Image Classification. SVM is a state-of-the-art supervised learning classification algorithm [5]. The linear SVM separates two classes by using a hyperplane. A dataset with two classes is represented as

\{(x_i, y_i)\}_{i=1}^{N}, \quad y_i \in \{+1, -1\},  (7)

where x_i are the input samples and +1 and -1 are the corresponding class labels. The separating hyperplane is obtained by finding the values of the coefficients in

w^{T} \cdot x + b = 0,  (8)

where w is the weight vector and b is the bias. The maximum margin between the two supporting hyperplanes is 2/‖w‖, and the two classes are separable from each other according to the following equations:

w^{T} \cdot x_i + b \ge +1 \quad \text{for } y_i = +1,
w^{T} \cdot x_i + b \le -1 \quad \text{for } y_i = -1.  (9)

This can be expressed equivalently as

y_i (w^{T} \cdot x_i + b) \ge 1.  (10)

The kernel method [34] is used in SVM to compute the dot product in a high-dimensional feature space, which provides the ability to generate nonlinear decision boundaries. The kernel function also permits using data with no obvious fixed-dimensional representation. The histograms of visual words constructed over the local and global areas of the image are normalized, and the Hellinger kernel [35] is applied to the normalized histograms by using the following equation:

K(h, h') = \sum_{i} \sqrt{h(i)\, h'(i)},  (11)

where h and h' are the normalized histograms. The Hellinger kernel is selected because of its low computational cost: instead of computing the kernel values, the feature map is computed explicitly and the classifier remains linear. The one-versus-one rule is applied for k classes; k(k-1)/2 classifiers are constructed, and each classifier is trained on the data of two classes.
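Since the Hellinger kernel in (11) is the ordinary dot product of the element-wise square roots of L1-normalized histograms, the explicit feature map mentioned above reduces to normalizing each histogram and taking its square root, after which a linear SVM is trained. A minimal sketch using scikit-learn (our assumption; the paper's implementation is in MATLAB) is given below. SVC follows the one-versus-one rule for multiclass problems, matching the k(k-1)/2 classifiers described above.

```python
import numpy as np
from sklearn.svm import SVC


def hellinger_map(histograms, eps=1e-12):
    """Explicit Hellinger feature map: L1-normalize each histogram and take the square root,
    so that a linear kernel on the mapped vectors equals the kernel of eq. (11)."""
    h = np.asarray(histograms, dtype=np.float64)
    h = h / (h.sum(axis=1, keepdims=True) + eps)
    return np.sqrt(h)


def train_and_classify(train_histograms, train_labels, test_histograms):
    """Train a linear SVM on the mapped histograms (one-versus-one for more than two classes)."""
    classifier = SVC(kernel='linear', C=1.0)
    classifier.fit(hellinger_map(train_histograms), train_labels)
    return classifier.predict(hellinger_map(test_histograms))
```

Because the mapping is applied once to the data, the classifier itself remains linear and no kernel matrix has to be computed, which is the low-cost property noted above.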

4. Experiments and Results

This section presents the details of the experiments conducted for the evaluation of the proposed image representation based on a combination of local and global histograms of visual words. The proposed research is evaluated on the Corel-A, Caltech-256, and Ground Truth image benchmarks. The images are randomly divided into training and test datasets. The visual vocabulary (codebook) is constructed from the training images, and the retrieval precision is calculated using the test dataset. Keeping in view the unsupervised nature of k-means clustering, each experiment is repeated 10 times and the average precision values are reported. In order to evaluate the performance of the proposed research, we determine the relevant images retrieved in response to a query image. A computer simulation is used to select images randomly from the test dataset and use them as query images. The response to a query image is evaluated on the basis of the relevant images retrieved. Precision determines the number of relevant images retrieved in response to a query image, and it shows the specificity of the image retrieval system:

Precision = (Number of relevant images retrieved) / (Total number of images retrieved).  (12)

Recall measures the sensitivity of the image retrieval system. Recall is calculated as the ratio of correctly retrieved images to the total number of images of that class in the image benchmark:

Recall = (Number of relevant images retrieved) / (Total number of relevant images).  (13)
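As a concrete illustration of (12) and (13), the following short sketch (a hypothetical helper, not part of the paper's code) computes precision and recall for a single query from the class labels of the top-k retrieved images:

```python
def precision_recall_at_k(retrieved_labels, query_label, class_size):
    """Precision (12) and recall (13) for one query: retrieved_labels are the class labels of
    the top-k retrieved images, class_size is the number of images of the query class."""
    relevant_retrieved = sum(1 for label in retrieved_labels if label == query_label)
    precision = relevant_retrieved / len(retrieved_labels)
    recall = relevant_retrieved / class_size
    return precision, recall


# Example: 17 of the top-20 retrievals share the query class in a 100-image class,
# giving precision 0.85 (85%) and recall 0.17 (17%), the same 5:1 ratio visible in Tables 2 and 3.
print(precision_recall_at_k(['Horses'] * 17 + ['Beach'] * 3, 'Horses', 100))
```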

The details of the experimental parameters are given below.

(1) Vocabulary Size. The size of the visual vocabulary is a major parameter that affects the performance of content-based image matching [36, 37]: increasing the size of the visual vocabulary up to a certain level increases the performance, while a very large visual vocabulary tends to overfit. Different sizes of visual vocabulary are constructed for the Corel-A (20, 50, 100, 200, and 400), Caltech-256 (20, 50, 100, 200, 400, and 600), and Ground Truth (10, 20, 30, 40, and 50) image benchmarks in order to evaluate the best performance of the proposed image representation.

(2) Dense Pixel Stride. The dense SIFT features are extracted from the training and test images. For precise content-based image matching, we extracted dense features using three different scales. The dense pixel stride, or step size, is used to control the spatial resolution of the dense grid: a smaller pixel stride results in a finer grid, while a larger pixel stride makes the grid coarser. With pixel strides of 5, 15, and 25, we considered every 5th, 15th, and 25th pixel, respectively, as a feature point at which the SIFT descriptor is calculated from the local and global regions of each image (a dense-grid example is sketched after this parameter list).

(3) Feature Percentage for the Vocabulary Construction. The percentage of features used to construct the visual vocabulary from the training dataset is another parameter that affects the performance of image retrieval. According to [36], increasing the percentage of features used for visual vocabulary construction increases the performance of content-based image matching, and vice versa. In the experiments, different percentages (10%, 25%, 50%, 75%, and 100%) of the dense features per image are used to construct the visual vocabulary.

(4) Values of x and y. x and y are used for the extraction of the local rectangular area. After a number of experiments, the optimized normalized starting value x = 0.22 and ending value y = 0.60 are selected for the training and test images. According to (5), this corresponds to a local rectangle spanning from 22% to 82% of the image width and height.
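For illustration, the dense grid described in parameter (2) can be obtained by computing standard SIFT descriptors at regularly spaced keypoints; a possible OpenCV-based sketch (our assumption — the paper's MATLAB implementation may differ) is:

```python
import cv2
import numpy as np


def dense_sift(gray_image, pixel_stride=5, patch_size=16):
    """Compute SIFT descriptors on a regular grid with the given pixel stride (step size)
    and return them together with their (x, y) grid positions."""
    sift = cv2.SIFT_create()
    height, width = gray_image.shape[:2]
    keypoints = [cv2.KeyPoint(float(x), float(y), patch_size)
                 for y in range(0, height, pixel_stride)
                 for x in range(0, width, pixel_stride)]
    keypoints, descriptors = sift.compute(gray_image, keypoints)
    positions = np.array([kp.pt for kp in keypoints])
    return descriptors, positions
```

A stride of 5 keeps every 5th pixel of the grid as a feature point, while strides of 15 and 25 give the coarser grids used in the experiments; the normalized values x = 0.22 and y = 0.60 of parameter (4) then select which grid points fall inside the local rectangle.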

4.1. Retrieval Performance on Corel-A Image Benchmark. The Corel-A image benchmark (http://wang.ist.psu.edu/docs/related) is selected for the evaluation of the proposed image representation, and the results are compared with the standard BoVW representation [6] and the state-of-the-art CBIR research [4, 15, 17-19]. The Corel-A image benchmark contains 1000 images that are divided into ten semantic categories, namely, Africa, Buildings, Beach, Dinosaurs, Buses, Elephants, Horses, Flowers, Mountains, and Food. Each semantic category consists of 100 images with a resolution of 256 × 384 or 384 × 256 pixels. Figure 4 presents images from all the semantic classes of the Corel-A image benchmark.


Figure 4: Samples of images from each category of Corel-A image benchmark.

Table 1: MAP of the proposed image representation on a vocabulary size of 200 visual words and pixel stride of 5.

% features used \ Vocabulary size | 20 | 50 | 100 | 200 | 400
10% | 77.40 | 80.27 | 81.98 | 84.01 | 82.57
25% | 77.74 | 80.92 | 82.02 | 84.12 | 83.16
50% | 77.85 | 81.06 | 82.53 | 84.25 | 83.01
75% | 78.08 | 81.16 | 83.07 | 84.33 | 83.24
100% | 78.36 | 81.32 | 83.27 | 84.38 | 83.29
MAP | 77.88 | 80.95 | 82.57 | 84.21 | 83.05
Standard deviation | 0.3609 | 0.4050 | 0.5899 | 0.1522 | 0.2905
Confidence interval | 77.43-78.33 | 80.44-81.44 | 81.84-83.30 | 84.02-84.40 | 82.69-83.41
Standard error | 0.1614 | 0.1811 | 0.2638 | 0.0680 | 0.1299

In order to maintain a balance, 50% of the images are used for training and the remaining 50% are used for testing. Different sizes of visual vocabulary (20, 50, 100, 200, and 400) are constructed from the set of training images. Mean Average Precision (MAP) is calculated over a random selection of 500 images from the test dataset. The MAP for top-20 image retrievals, as a function of the vocabulary size and of the percentage of dense features per image used in the vocabulary construction, is presented in Table 1.

The additional statistical investigation in Table 1 reinforces the described results. We calculated the standard error as a descriptive statistic and estimated the 95% confidence interval as an inferential result. From these results we can conclude that both the lower and the upper bounds at the 5% level of significance exceed 84% MAP, and the small value of the standard error further supports the observation that the visual vocabulary of 200 visual words gives precise and consistent results as compared to the other vocabulary sizes across the different feature percentages (10%, 25%, 50%, 75%, and 100%).
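The descriptive and inferential quantities of Table 1 can be reproduced directly from the five MAP values of a column; the short sketch below (using SciPy's Student's t quantile, which is consistent with the reported bounds) illustrates this for the 200-word column:

```python
import numpy as np
from scipy import stats


def map_statistics(map_values, confidence=0.95):
    """Mean, sample standard deviation, standard error, and t-based confidence interval."""
    values = np.asarray(map_values, dtype=np.float64)
    mean = values.mean()
    std = values.std(ddof=1)                 # sample standard deviation
    se = std / np.sqrt(len(values))          # standard error of the mean
    t = stats.t.ppf(0.5 + confidence / 2, df=len(values) - 1)
    return mean, std, se, (mean - t * se, mean + t * se)


# MAP values of the 200-word column of Table 1 (feature percentages 10% ... 100%):
print(map_statistics([84.01, 84.12, 84.25, 84.33, 84.38]))
# -> mean ~84.22, std ~0.152, standard error ~0.068, 95% CI ~(84.03, 84.41),
#    matching the 84.21, 0.1522, 0.0680, and 84.02-84.40 reported in Table 1 up to rounding.
```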

The MAP obtained by using the proposed image representation based on a combination of local and global histograms of visual words (with a vocabulary size of 200 visual words) at pixel strides of 5, 15, and 25 is 84.21%, 82.12%, and 78.29%, respectively. The MAP of the proposed image representation is compared with the standard BoVW representation (without the spatial information of the local histogram). The MAP comparison of the proposed research and the standard BoVW, as a function of vocabulary size at pixel strides of 5, 15, and 25, is graphically presented in Figure 5. The proposed image representation outperforms the standard BoVW representation for all vocabulary sizes (pixel strides of 5, 15, and 25). According to the experimental results, the MAP for all vocabulary sizes at a pixel stride of 5 is greater than at pixel strides of 15 and 25.


Table 2: Comparison of MAP for top-20 image retrievals on Corel-A image benchmark.

Class \ Method | Proposed research | Youssef [17] | Irtaza et al. [4] | Poursistani et al. [18] | Tian et al. [19] | Wang et al. [15]
Africa | 73.03 | 63.5 | 65 | 70.24 | 74.6 | 64
Beach | 74.58 | 64.2 | 60 | 44.44 | 37.8 | 54
Buildings | 80.24 | 69.8 | 62 | 70.8 | 53.9 | 53
Buses | 95.84 | 91.5 | 85 | 76.3 | 96.7 | 94
Dinosaurs | 97.95 | 99.2 | 93 | 100 | 99 | 98
Elephants | 87.64 | 78.1 | 65 | 63.8 | 66 | 78
Flowers | 85.13 | 94.8 | 94 | 92.4 | 92 | 71
Horses | 86.29 | 95.2 | 77 | 94.7 | 87 | 93
Mountains | 82.43 | 73.8 | 73 | 56.2 | 58.5 | 42
Food | 78.96 | 80.6 | 81 | 74.5 | 62.2 | 50

Table 3: Comparison of recall for top-20 image retrievals on Corel-A image benchmark.

Class \ Method | Proposed research | Youssef [17] | Irtaza et al. [4] | Poursistani et al. [18] | Tian et al. [19] | Wang et al. [15]
Africa | 14.61 | 12.70 | 13.00 | 14.05 | 14.92 | 12.80
Beach | 14.92 | 12.84 | 12.00 | 8.89 | 7.56 | 10.80
Buildings | 16.05 | 13.96 | 12.40 | 14.16 | 10.78 | 10.60
Buses | 19.17 | 18.30 | 17.00 | 15.26 | 19.34 | 18.80
Dinosaurs | 19.59 | 19.84 | 18.60 | 20.00 | 19.80 | 19.60
Elephants | 17.53 | 15.62 | 13.00 | 12.76 | 13.20 | 15.60
Flowers | 17.03 | 18.96 | 18.80 | 18.48 | 18.40 | 14.20
Horses | 17.26 | 19.04 | 15.40 | 18.94 | 17.40 | 18.60
Mountains | 16.49 | 14.76 | 14.60 | 11.24 | 11.70 | 8.40
Food | 15.79 | 16.12 | 16.20 | 14.90 | 12.44 | 10.00
Mean | 16.84 | 16.21 | 15.10 | 14.57 | 14.55 | 13.94

Figure 5: MAP as a function of vocabulary size (proposed research and standard BoVW, each at pixel strides of 5, 15, and 25).

In order to present a sustainable performance of the proposed research, the image retrieval precision and recall for the top-20 image retrievals are compared with the state-of-the-art CBIR research [4, 15, 17-19]. Tables 2 and 3 present the classwise comparison of the average precision and recall of the proposed research (with a vocabulary size of 200 words, a pixel stride of 5, and 100% features per image) with the existing state-of-the-art CBIR techniques. The MAP comparison is shown in Figure 6, while the precision-recall curve is shown in Figure 7.

The experimental results conducted on the Corel-A image benchmark prove the robustness of the proposed image representation. The MAP of the proposed research is higher than that of the existing state-of-the-art research [4, 15, 17-19]. The MAP obtained from the proposed image representation is 84.21% (with a vocabulary size of 200 visual words, a pixel stride of 5, and 100% features per image).

The image retrieval results obtained by using the proposed image representation for the semantic classes "Buses" and "Beach" are shown in Figures 8 and 9, respectively, in response to the query images; they show the reduction of the semantic gap in terms of the classifier decision value (score). The classifier decision label determines the class of the image, while the classifier decision value (score) is used to find the similar images.

Figure 6: Comparison of MAP with the state-of-the-art methods on Corel-A image benchmark (proposed research 84.21%, Youssef et al. 81.07%, Irtaza et al. 75.50%, Poursistani et al. 74.34%, Tian et al. 72.77%, and Wang et al. 69.70%).

Figure 7: Precision-recall curve for Corel-A image benchmark (proposed research with a pixel stride of 5 versus SIFT-LBP [32]).

The real values shown at the top of each image are the classifier decision values (scores) of the respective images, computed by applying the Euclidean distance between the score of the query image and the scores of the retrieved images. Among the top-20 retrieved images, those whose score is close to the score of the query image are more similar to the query image, and vice versa (due to limited space, the retrieval results in response to the query images are shown for two semantic classes only).

4.2. Performance on Caltech-256 Image Benchmark. The Caltech-256 image benchmark (http://www.vision.caltech.edu) consists of 256 semantic classes, and each semantic class contains 80 to 827 images. For simplicity, we randomly selected 10 classes from the Caltech-256 image benchmark. The selected classes are Motorbike, Face-easy, Fireworks, Bonsai, Butterfly, Leopard, Airplanes, Ketch, and Hibiscus, as shown in Figure 10.

Different sizes of visual vocabulary (20, 50, 100, 200, 400, and 600) are constructed from the set of training images; 50% of the images are used for training and the remaining 50% are used for testing.

Figure 8: Image retrieval result shows reduction of semantic gap for the semantic class "Buses".

Figure 9: Image retrieval result shows reduction of semantic gap for the semantic class "Beach".

The MAP of the proposed research is compared with the standard BoVW representation (without the spatial information) for different pixel strides (5, 15, and 25), vocabulary sizes (20, 50, 100, 200, 400, and 600), and feature percentages (10%, 25%, 50%, 75%, and 100%) used for vocabulary construction. The MAP obtained from the proposed image representation and the standard BoVW representation under these different parameters is graphically presented in Figure 11.


Figure 10: Samples of images from 10 semantic classes of Caltech-256 image benchmark.

Figure 11: MAP as a function of vocabulary size for the Caltech-256 image benchmark (proposed research and standard BoVW, each at pixel strides of 5, 15, and 25).

The experimental results conducted on 10 semantic classes of the Caltech-256 image benchmark prove the robustness of the proposed image representation. The MAP obtained by using the proposed image representation (with a vocabulary size of 400 visual words and 100% features per image) at pixel strides of 5, 15, and 25 is 86.50%, 83.98%, and 81.71%, respectively, while the MAP obtained by using the same experimental parameters for the standard BoVW representation is 85.45%, 83.05%, and 80.95%, respectively. According to the experimental results, the MAP for all vocabulary sizes decreases with increasing pixel stride, as shown in Figure 11, while the precision-recall curve is shown in Figure 12.

Figure 12: Precision-recall curve for Caltech-256 image benchmark (proposed research versus BoVW, both with a pixel stride of 5).

4.3. Performance on Ground Truth Image Benchmark. The Ground Truth image benchmark (http://imagedatabase.cs.washington.edu/groundtruth) is a publicly available image benchmark and is used for the evaluation of CBIR research [20, 21, 33]. There are a total of 1109 images that are divided into 22 semantic classes. We manually selected all the images from 5 different semantic classes of the Ground Truth image benchmark. The selected classes for the evaluation of the proposed image representation are Abro green, Cherries, Football, Green Lake, and Swiss Mountains, as shown in Figure 13. These 5 classes are selected in order to compare the performance of the proposed image representation with the existing state-of-the-art research [20, 21, 33], because these researchers also reported their results on the same classes of the Ground Truth image benchmark.

50% of the images are used for training and the remaining 50% are used for testing. Different sizes (10, 20, 30, 40, and 50) of visual vocabulary are constructed, and the MAP is calculated for each of these vocabulary sizes.


Figure 13: Samples of images from 5 semantic classes of Ground Truth image benchmark.

Figure 14: Comparison of MAP with the state-of-the-art methods on Ground Truth image benchmark (bar values as extracted: proposed research 82.84%, Cardoso et al. 81.33%, Yildizer et al. 62.80% and 59.09%).

Table 4: Classwise comparison of average precision.

Class \ Method | Proposed method | [20] | [21]
Abro green | 75.25 | 80 | 66.67
Cherries | 71.94 | 80 | 50
Football | 90.18 | 100 | 75
Green Lake | 84.23 | 80 | 50
Swiss Mountains | 92.63 | 60.67 | 50

According to the experimental results, a MAP of 82.84% is obtained by using the proposed image representation (with a vocabulary size of 40 visual words, a pixel stride of 5, and 100% features per image). The classwise average precision obtained from the proposed image representation is presented in Table 4, while the MAP is graphically presented in Figure 14.

Table 5: Computational cost (in seconds) of the proposed algorithm.

Number of images retrieved | Proposed research
Top-5 | 0.2872
Top-10 | 0.3972
Top-15 | 0.6470
Top-20 | 0.7837
Top-25 | 0.9184
Top-30 | 1.1103

Experimental results and comparisons conducted on the Ground Truth image benchmark prove the robustness of the proposed research based on a combination of local and global histograms of visual words. The MAP obtained from the proposed image representation is higher than that of the existing state-of-the-art research [20, 21, 33].

4.4. Complexity Performance. The computational cost of the proposed algorithm is calculated using an Intel(R) Core i3 (fourth generation) 1.7 GHz CPU with 4 GB RAM (DDR3), 3 MB L3 cache, and the Windows 7 operating system. The proposed algorithm is implemented in MATLAB; the visual vocabulary is constructed offline using the training dataset and tested at run time using the test dataset. The average CPU time (in seconds) required from feature extraction to image retrieval is presented in Table 5.

5. Conclusion and Future Directions

In this paper, we proposed a novel image representation based on a combination of local and global histograms of visual words. The local histogram of visual words adds the spatial information of the central image area to the inverted index of the BoVW representation. The combination of local and global histograms of visual words is a possible solution to capture the semantic information of the image. The performance of the proposed image representation is evaluated on three challenging image datasets. The proposed image representation outperforms the state-of-the-art research, including the standard BoVW representation. For future work, we plan to replace the BoVW model with either the Fisher kernel framework or the Vector of Locally Aggregated Descriptors (VLAD) model to evaluate large-scale image retrieval.

Competing Interests

The authors declare that they have no competing interests.

References

[1] A.-M. Tousch, S. Herbin, and J.-Y. Audibert, "Semantic hierarchies for image annotation: a survey," Pattern Recognition, vol. 45, no. 1, pp. 333-345, 2012.
[2] R. Datta, D. Joshi, J. Li, and J. Z. Wang, "Image retrieval: ideas, influences, and trends of the new age," ACM Computing Surveys, vol. 40, no. 2, article 5, 2008.
[3] Y. Liu, D. Zhang, G. Lu, and W.-Y. Ma, "A survey of content-based image retrieval with high-level semantics," Pattern Recognition, vol. 40, no. 1, pp. 262-282, 2007.
[4] A. Irtaza, M. A. Jaffar, E. Aleisa, and T.-S. Choi, "Embedding neural networks for semantic association in content based image retrieval," Multimedia Tools and Applications, vol. 72, no. 2, pp. 1911-1931, 2014.
[5] D. Zhang, M. M. Islam, and G. Lu, "A review on automatic image annotation techniques," Pattern Recognition, vol. 45, no. 1, pp. 346-362, 2012.
[6] J. Sivic and A. Zisserman, "Video Google: a text retrieval approach to object matching in videos," in Proceedings of the 9th IEEE International Conference on Computer Vision, pp. 1470-1477, IEEE, 2003.
[7] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, "Object retrieval with large vocabularies and fast spatial matching," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '07), pp. 1-8, IEEE, Minneapolis, Minn, USA, June 2007.
[8] O. Chum, J. Philbin, J. Sivic, M. Isard, and A. Zisserman, "Total recall: automatic query expansion with a generative feature model for object retrieval," in Proceedings of the 2007 IEEE 11th International Conference on Computer Vision (ICCV '07), pp. 1-8, IEEE, Rio de Janeiro, Brazil, October 2007.
[9] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, "Lost in quantization: improving particular object retrieval in large scale image databases," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1-8, IEEE, Anchorage, Alaska, USA, June 2008.
[10] H. Anwar, S. Zambanini, M. Kampel, and K. Vondrovec, "Ancient coin classification using reverse motif recognition: image-based classification of Roman Republican coins," IEEE Signal Processing Magazine, vol. 32, no. 4, pp. 64-74, 2015.
[11] S. Zhang, Q. Tian, G. Hua, Q. Huang, and W. Gao, "Generating descriptive visual words and visual phrases for large-scale image applications," IEEE Transactions on Image Processing, vol. 20, no. 9, pp. 2664-2677, 2011.
[12] Y. Su and F. Jurie, "Improving image classification using semantic attributes," International Journal of Computer Vision, vol. 100, no. 1, pp. 59-77, 2012.
[13] Z. Mehmood, S. M. Anwar, M. Altaf, and N. Ali, "A novel image retrieval based on rectangular spatial histograms of visual words," Kuwait Journal of Science, in press.
[14] S. Lazebnik, C. Schmid, and J. Ponce, "Beyond bags of features: spatial pyramid matching for recognizing natural scene categories," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '06), vol. 2, pp. 2169-2178, New York, NY, USA, June 2006.
[15] C. Wang, B. Zhang, Z. Qin, and J. Xiong, "Spatial weighting for bag-of-features based image retrieval," in Integrated Uncertainty in Knowledge Modelling and Decision Making, Z. Qin and V.-N. Huynh, Eds., vol. 8032 of Lecture Notes in Computer Science, pp. 91-100, Springer, Berlin, Germany, 2013.
[16] N. Ali, K. B. Bajwa, R. Sablatnig, and Z. Mehmood, "Image retrieval by addition of spatial information based on histograms of triangular regions," Computers & Electrical Engineering, 2016.
[17] S. M. Youssef, "ICTEDCT-CBIR: integrating curvelet transform with enhanced dominant colors extraction and texture analysis for efficient content-based image retrieval," Computers and Electrical Engineering, vol. 38, no. 5, pp. 1358-1376, 2012.
[18] P. Poursistani, H. Nezamabadi-pour, R. Askari Moghadam, and M. Saeed, "Image indexing and retrieval in JPEG compressed domain based on vector quantization," Mathematical and Computer Modelling, vol. 57, no. 5-6, pp. 1005-1017, 2013.
[19] X. Tian, L. Jiao, X. Liu, and X. Zhang, "Feature integration of EODH and Color-SIFT: application to image retrieval based on codebook," Signal Processing: Image Communication, vol. 29, no. 4, pp. 530-545, 2014.
[20] D. N. M. Cardoso, J. Muller, F. Alexandre, L. A. P. Neves, P. M. G. Trevisani, and G. A. Giraldi, "Iterative technique for content-based image retrieval using multiple SVM ensembles," in A Treatise on Electricity and Magnetism, J. Clerk Maxwell, Ed., vol. 2, pp. 68-73, Cambridge University Press, 1873.
[21] E. Yildizer, A. M. Balci, M. Hassan, and R. Alhajj, "Efficient content-based image retrieval using multiple support vector machines ensemble," Expert Systems with Applications, vol. 39, no. 3, pp. 2385-2396, 2012.
[22] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '05), vol. 1, pp. 886-893, IEEE, San Diego, Calif, USA, June 2005.
[23] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, vol. 60, no. 2, pp. 91-110, 2004.
[24] J. Matas, O. Chum, M. Urban, and T. Pajdla, "Robust wide-baseline stereo from maximally stable extremal regions," Image and Vision Computing, vol. 22, no. 10, pp. 761-767, 2004.
[25] H. Bay, T. Tuytelaars, and L. Van Gool, "SURF: speeded up robust features," in Computer Vision—ECCV 2006, pp. 404-417, Springer, 2006.
[26] D. J. Mankowitz and S. Ramamoorthy, "BRISK-based visual feature extraction for resource constrained robots," in RoboCup 2013: Robot World Cup XVII, S. Behnke, M. Veloso, A. Visser, and R. Xiong, Eds., vol. 8371 of Lecture Notes in Computer Science, pp. 195-206, Springer, Berlin, Germany, 2014.
[27] S. Krig, "Interest point detector and feature descriptor survey," in Computer Vision Metrics, pp. 217-282, Springer, Berlin, Germany, 2014.
[28] R. Ashraf, K. Bashir, A. Irtaza, and M. T. Mahmood, "Content based image retrieval using embedded neural networks with bandletized regions," Entropy, vol. 17, no. 6, pp. 3552-3580, 2015.
[29] S. Zeng, R. Huang, H. Wang, and Z. Kang, "Image retrieval using spatiograms of colors quantized by Gaussian Mixture Models," Neurocomputing, vol. 171, pp. 673-684, 2016.
[30] E. Walia and A. Pal, "Fusion framework for effective color image retrieval," Journal of Visual Communication and Image Representation, vol. 25, no. 6, pp. 1335-1348, 2014.
[31] A. Irtaza and M. A. Jaffar, "Categorical image retrieval through genetically optimized support vector machines (GOSVM) and hybrid texture features," Signal, Image and Video Processing, vol. 9, no. 7, pp. 1503-1519, 2014.
[32] J. Yu, Z. Qin, T. Wan, and X. Zhang, "Feature integration analysis of bag-of-features model for image retrieval," Neurocomputing, vol. 120, pp. 355-364, 2013.
[33] E. Yildizer, A. M. Balci, T. N. Jarada, and R. Alhajj, "Integrating wavelets with clustering and indexing for effective content-based image retrieval," Knowledge-Based Systems, vol. 31, pp. 55-66, 2012.
[34] J. Shawe-Taylor and N. Cristianini, Kernel Methods for Pattern Analysis, Cambridge University Press, New York, NY, USA, 2004.
[35] A. Vedaldi and A. Zisserman, "Sparse kernel approximations for efficient classification and detection," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '12), pp. 2320-2327, June 2012.
[36] E. Nowak, F. Jurie, and B. Triggs, "Sampling strategies for bag-of-features image classification," in Computer Vision—ECCV 2006, pp. 490-503, Springer, 2006.
[37] G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray, "Visual categorization with bags of keypoints," in Proceedings of the Workshop on Statistical Learning in Computer Vision (ECCV '04), vol. 1, pp. 1-2, Prague, Czech Republic, 2004.


Page 2: Research Article A Novel Image Retrieval Based on a Combination …downloads.hindawi.com/journals/mpe/2016/8217250.pdf · 2019-07-30 · image retrieval by using a combination of

2 Mathematical Problems in Engineering

Figure 1 Images from four different classes of Corel-A image benchmark

Keeping in view the robust performance of the secondapproach [12 14] the proposed research is constructing twohistograms from a single image The global histogram ofvisual words is constructed over the whole image whilethe local histogram of visual words is constructed over thelocal rectangular region of the image The details about theselection of the local rectangular area for the constructionof local histogram are mentioned in Section 3 Figure 1 isrepresenting the images from four different classes (AfricaBeach Elephants and Horses) of Corel-A image benchmarkGrass sky and trees are in the global appearance of allthese images while the objects of interest or salient objectslie in the central region of the images (like people beachelephant and horses) There is no semantic relationshipbetween these images as they belong to the different classesThe discriminating information is likely to be available inthe central region of an image The second row of Figure 1 isrepresenting the extracted local rectangular regions of theseimages (area of the image that is selected for the constructionof local histogram of visual words) The extracted localrectangular regions of all of these images are discriminatingas they contain the information about the salient objectsof the image Figure 2 is representing an image from thesemantic class Elephant of the Corel-A image benchmarkThe global appearance of the image contains sky trees andgrass while the main object of interest in the image iselephant The construction of a local histogram from theimage central area adds the spatial attributes of the salientobject to the inverted index of the BoVW representation

Keeping these facts in view dense features are extractedfrom a set of training and test images the feature spaceis quantized to construct a visual vocabulary Images arerepresented as histograms of visual words (by using a com-bination of local and global histograms of visual words) Thelocal histogram that is constructed over the local rectangularregion of an image contains the spatial information aboutthe salient objects The global and local histograms of visualwords are concatenated and this information is added tothe inverted index of the BoVW representation The maincontributions of this paper are

(1) image representation as a combination of local andglobal histograms of visual words

(2) the addition of spatial information from the centralarea of the image to the inverted index of BoVWrepresentation

(3) reduction of semantic gap between the low-level fea-tures of an image and high-level semantic concepts

The rest of the paper is organized as follows Section 2 isabout the related research Section 3 describes the proposedresearch methodology Section 4 is about the experimentaldetails and results conducted on three image benchmarksSection 5 concludes our researchwork and points towards thefuture directions

2 Related Work

Query by Image Content (QBIC) is the first system launchedby IBM for image search [2] After that a variety of imageretrieval techniques are proposed that are based on colortexture and shape [1ndash3 5] Interest points detectors like His-togram of Oriented Gradients (HOG) [22] Scale-InvariantFeature Transform (SIFT) [23] Maximally Stable ExtremalRegions (MSER) [24] Speeded-Up Robust Features (SURF)[25] and BRISK features [26] are used for robust content-based image matching [27] Ashraf et al [28] proposed animage retrieval by using a combination of color features andbandlet transformation The bandlet transformation-basedrepresentation was selected to extract the salient objectsfrom the images Artificial Neural Networks (ANN) with aninverted index was selected for an efficient image retrievalZeng et al [29] proposed an image representation basedon spatiogram that was generalized histogram of quantizedcolors In the first step the color space was quantizedby applying the Gaussian Mixture Models (GMM) andExpectation-Maximization (EM) algorithm The number ofquantized color bins was determined on the basis of BayesianInformation Criterion (BIC) A spatiogram was based on ahistogram with a spatial distribution of color and each binrepresents the weighted distribution of pixels The similaritybetween the two images was determined on the basis ofthe similarity between the respective spatiograms Walia andPal [30] proposed a framework for color image retrievalby using a combination of low-level features The ColorDifference Histogram (CDH) was used to extract the color

Mathematical Problems in Engineering 3

(a) (b)

Figure 2 (a) is presenting the procedure for computation of global histogram while (b) is presenting the procedure for the extraction oflocal histogram

and textureThe shape features were extracted by applying anAngular Radial Transform (ART) The retrieval effectivenessof the proposed framework was enhanced by modifyingthe CDH algorithm Irtaza and Jaffar [31] proposed a com-bination of Genetic Algorithm (GA) and Support VectorMachines (SVM) to reduce the semantic gap Images wererepresented as a combination of curvelet wavelet and Gaborfeatures Relevance Feedback (RF) was applied to enhancethe accuracy of the proposed work Tian et al [19] proposeda rotation-invariant and scale-invariant Edge OrientationDifferenceHistogram (EODH) descriptor Steerable filter andvector sumwere applied to obtain themain orientation of thepixelsThe effectiveness of the feature space was improved byselecting a combination of Color SIFT and EODHCodebookwas constructed by applying the weighted average of ColorSIFT and EODH Yu et al [32] proposed an image retrieval byusing a combination of Local Binary Pattern (LBP)withHOGand SIFT The midlevel featuresrsquo combination was selectedto achieve high performance for the complex backgroundVisual features were separately extracted from the images byusing SIFT and LBPWeighted average clustering was appliedfor the codebook construction to obtain the integrationof two midlevel features According to the experimentalresults [32] the best performance of image retrieval wasobtained by applying the feature integration of SIFT and LBPWang et al [15] proposed Spatial Weighting Bag-of-Features(SWBOF) framework and extracted spatial information fromthe subblocks of the images Local entropy local varianceand adjacent blocks distance were selected to calculate thespatial information According to experimental results [15]SWBOF with spatial information performs better than thetraditional BoF representation Yildizer et al [33] proposedCBIR by using multiple Support Vector Machines ensembleSVM is used for regression and Support Vector Regression(SVR) ensemble is selected for classification According tothe experimental results the proposed technique improvesthe accuracy of image retrieval Cardoso et al [20] proposediterative technique for CBIR that is based on Multiple SVMEnsembles Discrete Cosine Transform (DCT) is selected forfeature extraction and multi-SVM is used for classificationYildizer et al [21] integrated wavelets for an effective content-based image retrieval k-means with B+-tree data structure isused for clustering with Daubechies wavelet transform thathas excellent spatial and spectral locality properties whichmake it very useful for CBIR it is applied to partitioning theimages into different levels Youssef [17] proposed a novel

Integrating Curvelet Transform with Enhanced DominantColors extraction and Texture (ICTEDCT) analysis for effi-cientCBIRCurveletmultiscale ridgeletswere integratedwithregion-based vector codebook subband clustering to enhancedominant colors extraction and texture analysis

3 Proposed Methodology

The lack of spatial information is the main problem inthe standard BoVW representation [7 15] Visual wordsare represented in a histogram without considering theirlocations in the 2D image plane The spatial informationcarries discriminating details that enhance the performanceof CBIR [7 15] The block diagram of proposed researchmethodology is presented in Figure 3

(1) In the BoVW representation, a raw image I is represented as

I = (a_ij), (1)

where a_ij is the pixel at position (i, j).

(2) Dense SIFT features are extracted from the image, and an image I is represented as

I = {d_1, d_2, ..., d_m}, (2)

where d_1 to d_m are image descriptors.

(3) A quantization algorithm such as k-means clustering is applied to construct a visual vocabulary (codebook) consisting of n visual words, represented as

voc = {w_1, w_2, ..., w_n}, (3)

where w_1 to w_n are visual words.

(4) For the construction of the global histogram of visual words, the mapping of each visual word is done over the whole image. For the construction of the local histogram of visual words, the mapping of each visual word is done over the image central area extracted by using (5). The nearest visual word is assigned to each quantized descriptor by using the following equation:

w(d_k) = argmin_{w ∈ voc} Dist(w, d_k), (4)

where w(d_k) represents the visual word assigned to the kth descriptor d_k, while Dist(w, d_k) is the distance between the descriptor d_k and the visual word w. Each image is represented as a collection of patches, and each patch is represented by visual words.

Figure 3: Block diagram of the proposed research based on a combination of local and global histograms of visual words (training path: dense feature extraction, clustering, vocabulary construction, representation of training images by a combination of local and global histograms of visual words, and learning of a classifier on the training histograms; query path: dense feature extraction, classification, similarity measure, and retrieved images).

(5) Two histograms of N visual words are extracted from a single image. The computed local and global histograms are concatenated, and this information is added to the inverted index of the BoVW representation. The visual words of the local histogram are mapped to a rectangular area that is defined by the image width (W) and height (H). Four points are defined to extract the optimized local rectangular area as follows:

p_1 = W × x,
p_2 = (W × y) + p_1 = (W × y) + (W × x) = W × (x + y),
p_3 = H × x,
p_4 = (H × y) + p_3 = (H × y) + (H × x) = H × (x + y), (5)

where x is the normalized starting point and y is the normalized ending point of the local rectangular area.

(6) Consider N as the number of visual words of the vocabulary. Let D_i be the set of the descriptors that are mapped to the visual word w_i; then the ith bin of the histogram of visual words, b_i, is the cardinality of the set D_i:

b_i = Card(D_i),
D_i = {d_k, k ∈ {1, ..., m} | w(d_k) = w_i}. (6)

A minimal code sketch of steps (2)-(6) is given after this list.
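To make the above steps concrete, the following Python sketch outlines steps (2)-(6) under stated assumptions: OpenCV's SIFT and scikit-learn's k-means stand in for whatever dense-SIFT and clustering implementations the authors used (the paper reports a MATLAB implementation), the helper names are illustrative, and the crop ratios x = 0.22 and y = 0.60 are the values reported in Section 4.

```python
# Illustrative sketch only; not the authors' code.
import cv2
import numpy as np
from sklearn.cluster import KMeans

def dense_sift(gray, stride=5, size=8):
    """Compute SIFT descriptors on a dense grid with the given pixel stride."""
    h, w = gray.shape
    keypoints = [cv2.KeyPoint(float(c), float(r), float(size))
                 for r in range(0, h, stride) for c in range(0, w, stride)]
    sift = cv2.SIFT_create()
    _, descriptors = sift.compute(gray, keypoints)
    return descriptors                        # shape (m, 128), eq. (2)

def build_vocabulary(all_descriptors, n_words=200):
    """Step (3): k-means clustering of pooled training descriptors."""
    kmeans = KMeans(n_clusters=n_words, n_init=10, random_state=0)
    kmeans.fit(all_descriptors)
    return kmeans                             # cluster centres act as visual words, eq. (3)

def histogram(descriptors, kmeans, n_words):
    """Steps (4) and (6): nearest-centre assignment, then per-bin counts."""
    words = kmeans.predict(descriptors)       # eq. (4)
    return np.bincount(words, minlength=n_words).astype(float)

def local_global_histogram(gray, kmeans, n_words, x=0.22, y=0.60, stride=5):
    """Step (5): concatenate the global histogram with the histogram of the
    central rectangle defined by the normalized points of eq. (5)."""
    h, w = gray.shape
    p1, p2 = int(w * x), int(w * (x + y))     # horizontal bounds
    p3, p4 = int(h * x), int(h * (x + y))     # vertical bounds
    global_hist = histogram(dense_sift(gray, stride), kmeans, n_words)
    local_hist = histogram(dense_sift(gray[p3:p4, p1:p2], stride), kmeans, n_words)
    return np.concatenate([global_hist, local_hist])
```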

3.1. Image Classification. SVM is a state-of-the-art supervised learning classification algorithm [5]. The linear SVM separates two classes by using a hyperplane. A dataset with two classes is represented as

{(x_i, y_i)}_{i=1}^{N}, y_i ∈ {+1, −1}, (7)

where x_i is the input data and +1 and −1 are the corresponding labels of the two classes, respectively. The hyperplane is generated by finding the values of the coefficients in

w^T · x + b = 0, (8)

where w is the weight vector and b is the bias. The maximum margin between the two supporting hyperplanes is 2/‖w‖, and the two classes


are separable from each other according to the following equations:

w^T · x_i + b ≥ +1 for y_i = +1,
w^T · x_i + b ≤ −1 for y_i = −1. (9)

This can be expressed equivalently as

y_i (w^T · x_i + b) ≥ 1. (10)

The kernel method [34] is used in SVM to compute the dot product in a high-dimensional feature space, which provides the ability to generate nonlinear decision boundaries. The kernel function permits using data with no obvious fixed dimensions. The histograms of visual words constructed over the local and global areas of the image are normalized, and the SVM Hellinger kernel [35] is applied on the normalized histograms by using the following equation:

K(h, h') = Σ_i √(h(i) · h'(i)), (11)

where h and h' are the normalized histograms. The SVM Hellinger kernel is selected because of its low computational cost: instead of computing the kernel values, it explicitly computes the feature map, and the classifier remains linear. The one-versus-one rule is applied for k classes; k · (k − 1)/2 classifiers are constructed, and each classifier trains on the data of two classes.
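As an illustration of this choice, the following sketch (assuming scikit-learn; the variable names are placeholders) applies the explicit Hellinger feature map, the element-wise square root of the L1-normalized histogram, so that a linear SVM on the mapped vectors is equivalent to using the kernel of (11).

```python
# Hellinger kernel via an explicit feature map: a minimal sketch, not the authors' code.
import numpy as np
from sklearn.svm import SVC

def hellinger_map(histograms):
    """L1-normalize each histogram and take the element-wise square root, so that
    a linear kernel on the mapped vectors equals the Hellinger kernel of eq. (11)."""
    h = np.asarray(histograms, dtype=float)
    h = h / np.clip(h.sum(axis=1, keepdims=True), 1e-12, None)
    return np.sqrt(h)

# A linear SVC on the mapped features keeps the classifier linear; scikit-learn's
# SVC applies the one-versus-one rule, building k(k-1)/2 binary classifiers.
classifier = SVC(kernel='linear', C=1.0)
# classifier.fit(hellinger_map(train_histograms), train_labels)
# predicted = classifier.predict(hellinger_map(test_histograms))
```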

4. Experiments and Results

This section provides the details of the experiments conducted for the evaluation of the proposed image representation based on a combination of local and global histograms of visual words. The proposed research is evaluated on the Corel-A, Caltech-256, and Ground Truth image datasets. The images are randomly divided into training and test datasets. The visual vocabulary (codebook) is constructed from the training images, and retrieval precision is calculated by using the test dataset. Keeping in view the unsupervised nature of clustering by k-means, each experiment is repeated 10 times and average values of precision are reported. In order to evaluate the performance of the proposed research, we determined the relevant images retrieved in response to a query image. A computer simulation is used to select images randomly from the test dataset and use them as query images. The response to a query image is evaluated on the basis of the relevant images retrieved. Precision determines the number of relevant images retrieved in response to a query image, and it shows the specificity of the image retrieval system:

Precision = (Number of relevant images retrieved) / (Total number of images retrieved). (12)

Recall measures the sensitivity of the image retrieval system. Recall is calculated as the ratio of the relevant images retrieved to the total number of images of that class in the image benchmark:

Recall = (Number of relevant images retrieved) / (Total number of relevant images). (13)
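A minimal sketch of how (12) and (13) can be evaluated for a single query over the top-n retrieved images is given below; the labels and class size are illustrative placeholders.

```python
# Hedged sketch of the per-query evaluation of eqs. (12)-(13).
def precision_recall(retrieved_labels, query_label, class_size):
    """Precision and recall for one query over the top-n retrieved images."""
    relevant_retrieved = sum(1 for label in retrieved_labels if label == query_label)
    precision = relevant_retrieved / len(retrieved_labels)
    recall = relevant_retrieved / class_size
    return precision, recall

# Example: 17 of the top-20 results share the query's class of 100 images.
# precision_recall(['Buses'] * 17 + ['Beach'] * 3, 'Buses', 100) -> (0.85, 0.17)
```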

The details about the experimental parameters are given below.

(1) Vocabulary Size. The size of the visual vocabulary is a major parameter that affects the performance of content-based image matching [36, 37]: increasing the size of the visual vocabulary up to a certain level increases the performance, while a larger visual vocabulary tends to overfit. Different sizes of visual vocabulary are constructed for the Corel-A (20, 50, 100, 200, and 400), Caltech-256 (20, 50, 100, 200, 400, and 600), and Ground Truth (10, 20, 30, 40, and 50) image benchmarks in order to evaluate the best performance of the proposed image representation.

(2) Dense Pixel Stride. The dense SIFT features are extracted from the training and test images. For precise content-based image matching, we extracted dense features using three different scales. The dense pixel stride, or step size, is used to control the spatial resolution of the dense grid: a smaller pixel stride results in a finer grid, while a larger pixel stride makes the grid coarser. With pixel strides of 5, 15, and 25, we considered every 5th, 15th, and 25th pixel, respectively, as a feature location for calculating the SIFT descriptor from the local and global regions of each image.

(3) Feature Percentage for the Vocabulary Construction. The percentage of features used to construct the visual vocabulary from the training dataset is one of the parameters that affect the performance of image retrieval. According to [36], increasing the percentage of features used for visual vocabulary construction increases the performance of content-based image matching and vice versa. In the experiments, different percentages (10%, 25%, 50%, 75%, and 100%) of dense features per image are used to construct the visual vocabulary (a sketch of this parameter sweep is given after this list).

(4) Values of x and y. x and y are used for the extraction of the local rectangular area. After a number of experiments, the optimized normalized starting value of x = 0.22 and ending value of y = 0.60 are selected for the training and test images.
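The following sketch illustrates how the stride and feature-percentage sweep of items (1)-(3) could be organized; it reuses the dense_sift and build_vocabulary helpers sketched in Section 3, and training_images is a placeholder list of grayscale images. It is only one plausible reading of the protocol, not the authors' implementation.

```python
# Parameter-sweep sketch: pool a random percentage of each image's dense descriptors
# per pixel stride, then cluster the pool into a vocabulary of the desired size.
import numpy as np

def pooled_descriptors(training_images, stride, feature_percentage, seed=0):
    """Pool a random subset of dense descriptors from every training image."""
    rng = np.random.default_rng(seed)
    pooled = []
    for gray in training_images:
        descriptors = dense_sift(gray, stride=stride)
        keep = max(1, int(len(descriptors) * feature_percentage / 100.0))
        pooled.append(descriptors[rng.choice(len(descriptors), keep, replace=False)])
    return np.vstack(pooled)

# for stride in (5, 15, 25):
#     for percentage in (10, 25, 50, 75, 100):
#         vocabulary = build_vocabulary(
#             pooled_descriptors(training_images, stride, percentage), n_words=200)
```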

4.1. Retrieval Performance on Corel-A Image Benchmark. The Corel-A image benchmark (http://wang.ist.psu.edu/docs/related/) is selected for the evaluation of the proposed image representation, and the results are compared with the standard BoVW representation [6] and the state-of-the-art CBIR research [4, 15, 17-19]. The Corel-A image benchmark contains 1000 images that are divided into ten semantic categories, namely, Africa, Buildings, Beach, Dinosaurs, Buses, Elephants, Horses, Flowers, Mountains, and Food. Each semantic category consists of 100 images with a resolution of 256 × 384 or 384 × 256 pixels. Figure 4 presents sample images from all the semantic classes of the Corel-A image benchmark.


Figure 4 Samples of images from each category of Corel-A image benchmark

Table 1: MAP (%) of the proposed image representation for different vocabulary sizes and percentages of features used per image, on a pixel stride of 5.

% features used \ Vocabulary size | 20 | 50 | 100 | 200 | 400
10% | 77.40 | 80.27 | 81.98 | 84.01 | 82.57
25% | 77.74 | 80.92 | 82.02 | 84.12 | 83.16
50% | 77.85 | 81.06 | 82.53 | 84.25 | 83.01
75% | 78.08 | 81.16 | 83.07 | 84.33 | 83.24
100% | 78.36 | 81.32 | 83.27 | 84.38 | 83.29
MAP | 77.88 | 80.95 | 82.57 | 84.21 | 83.05
Standard deviation | 0.3609 | 0.4050 | 0.5899 | 0.1522 | 0.2905
Confidence interval | 77.43-78.33 | 80.44-81.44 | 81.84-83.30 | 84.02-84.40 | 82.69-83.41
Standard error | 0.1614 | 0.1811 | 0.2638 | 0.0680 | 0.1299

In order to maintain a balance, 50% of the images are used for the training and the remaining 50% are used for the testing. Different sizes of visual vocabulary (20, 50, 100, 200, and 400) are constructed from the set of training images. Mean Average Precision (MAP) is calculated by a random selection of 500 images from the test dataset. The MAP for top-20 image retrievals, as a function of vocabulary size and percentage of dense features per image used in vocabulary construction, is presented in Table 1.
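A hedged sketch of this MAP protocol, averaging the top-20 precision of (12) over randomly selected query images, is shown below; test_labels and the retrieve callback are illustrative placeholders for the test-set class labels and whatever ranking function is used.

```python
# Mean Average Precision over random queries, as the paper uses the term:
# the top-n precision of eq. (12) averaged over a set of query images.
import numpy as np

def mean_average_precision(test_labels, retrieve, n_queries=500, top_n=20, seed=0):
    rng = np.random.default_rng(seed)
    queries = rng.choice(len(test_labels), size=n_queries, replace=False)
    precisions = []
    for q in queries:
        retrieved = retrieve(q, top_n)            # indices of the top_n ranked images
        hits = sum(test_labels[i] == test_labels[q] for i in retrieved)
        precisions.append(hits / top_n)
    return 100.0 * float(np.mean(precisions))     # MAP in percent
```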

The additional statistical investigation in Table 1 reinforces the described results. We have calculated the standard error as a descriptive statistic and estimated the 95% confidence interval as an inferential result. From these results, we can conclude that both the lower and the upper bounds at the 5% level of significance exceed 84% MAP, and the small value of the standard error further supports the fact that the visual vocabulary of 200 visual words gives a precise and consistent result as compared to the other vocabulary sizes on the different feature percentages (10%, 25%, 50%, 75%, and 100%).
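The standard deviation, standard error, and 95% confidence interval of Table 1 can be reproduced as follows; the sketch assumes the five per-percentage MAP values of the 200-word column, and small differences from the reported bounds come from rounding.

```python
# Descriptive statistics for the 200-word vocabulary column of Table 1.
import numpy as np
from scipy import stats

map_values = np.array([84.01, 84.12, 84.25, 84.33, 84.38])  # 10% .. 100% features

mean = map_values.mean()
std = map_values.std(ddof=1)                  # sample standard deviation
se = std / np.sqrt(len(map_values))           # standard error of the mean
t_crit = stats.t.ppf(0.975, df=len(map_values) - 1)
ci = (mean - t_crit * se, mean + t_crit * se)
print(f"MAP {mean:.2f}, SD {std:.4f}, SE {se:.4f}, 95% CI {ci[0]:.2f}-{ci[1]:.2f}")
# -> roughly MAP 84.22, SD 0.1522, SE 0.0681, 95% CI 84.03-84.41
#    (Table 1 reports 84.21, 0.1522, 0.0680, and 84.02-84.40)
```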

The MAP obtained by using the proposed image representation based on a combination of local and global histograms of visual words (with a vocabulary size of 200 visual words) on the pixel strides of 5, 15, and 25 is 84.21%, 82.12%, and 78.29%, respectively. The MAP of the proposed image representation is compared with the standard BoVW representation (without the spatial information of the local histogram). The MAP comparison of the proposed research and the standard BoVW as a function of vocabulary size on the pixel strides of 5, 15, and 25 is graphically represented in Figure 5. The proposed image representation outperforms the standard BoVW representation on all vocabulary sizes (pixel strides of 5, 15, and 25). According to the experimental results, the MAP for all vocabulary sizes on a pixel stride of 5 is greater than on pixel strides of 15 and 25.


Table 2: Comparison of MAP (%) for top-20 image retrievals on the Corel-A image benchmark.

Class | Proposed research | Youssef [17] | Irtaza et al. [4] | Poursistani et al. [18] | Tian et al. [19] | Wang et al. [15]
Africa | 73.03 | 63.5 | 65 | 70.24 | 74.6 | 64
Beach | 74.58 | 64.2 | 60 | 44.44 | 37.8 | 54
Buildings | 80.24 | 69.8 | 62 | 70.8 | 53.9 | 53
Buses | 95.84 | 91.5 | 85 | 76.3 | 96.7 | 94
Dinosaurs | 97.95 | 99.2 | 93 | 100 | 99 | 98
Elephants | 87.64 | 78.1 | 65 | 63.8 | 66 | 78
Flowers | 85.13 | 94.8 | 94 | 92.4 | 92 | 71
Horses | 86.29 | 95.2 | 77 | 94.7 | 87 | 93
Mountains | 82.43 | 73.8 | 73 | 56.2 | 58.5 | 42
Food | 78.96 | 80.6 | 81 | 74.5 | 62.2 | 50

Table 3: Comparison of recall (%) for top-20 image retrievals on the Corel-A image benchmark.

Class | Proposed research | Youssef [17] | Irtaza et al. [4] | Poursistani et al. [18] | Tian et al. [19] | Wang et al. [15]
Africa | 14.61 | 12.70 | 13.00 | 14.05 | 14.92 | 12.80
Beach | 14.92 | 12.84 | 12.00 | 8.89 | 7.56 | 10.80
Buildings | 16.05 | 13.96 | 12.40 | 14.16 | 10.78 | 10.60
Buses | 19.17 | 18.30 | 17.00 | 15.26 | 19.34 | 18.80
Dinosaurs | 19.59 | 19.84 | 18.60 | 20.00 | 19.80 | 19.60
Elephants | 17.53 | 15.62 | 13.00 | 12.76 | 13.20 | 15.60
Flowers | 17.03 | 18.96 | 18.80 | 18.48 | 18.40 | 14.20
Horses | 17.26 | 19.04 | 15.40 | 18.94 | 17.40 | 18.60
Mountains | 16.49 | 14.76 | 14.60 | 11.24 | 11.70 | 8.40
Food | 15.79 | 16.12 | 16.20 | 14.90 | 12.44 | 10.00
Mean | 16.84 | 16.21 | 15.10 | 14.57 | 14.55 | 13.94

Figure 5: MAP as a function of vocabulary size on the Corel-A image benchmark (proposed research versus standard BoVW at pixel strides of 5, 15, and 25).

In order to present the sustainable performance of the proposed research, the image retrieval precision and recall for the top-20 image retrievals are compared with the state-of-the-art CBIR research [4, 15, 17-19]. Tables 2 and 3 present the classwise comparison of the average precision and recall of the proposed research (with a vocabulary size of 200 words, on a pixel stride of 5, and by using 100% features per image) with the existing state-of-the-art CBIR techniques. The MAP comparison is shown in Figure 6, while the precision-recall curve is shown in Figure 7.

The experimental results on the Corel-A image benchmark demonstrate the robustness of the proposed image representation. The MAP of the proposed research is higher than that of the existing state-of-the-art research [4, 15, 17-19]. The MAP obtained from the proposed image representation is 84.21% (with a vocabulary size of 200 visual words, on a pixel stride of 5, and by using 100% features per image).

The image retrieval results obtained by using the proposed image representation for the semantic classes "Buses" and "Beach" are shown in Figures 8 and 9, respectively, in response to query images; these results show the reduction of the semantic gap in terms of the classifier decision value (score). The classifier decision label determines the class of the image, while the classifier decision value (score) is used to find the similar


Figure 6: Comparison of MAP with the state-of-the-art methods on the Corel-A image benchmark (proposed research 84.21%, Youssef [17] 81.07%, Irtaza et al. [4] 75.50%, Poursistani et al. [18] 74.34%, Tian et al. [19] 72.77%, and Wang et al. [15] 69.70%).

Figure 7: Precision-recall curve for the Corel-A image benchmark (proposed research with a pixel stride of 5 versus SIFT-LBP [32]).

images. The real values shown at the top of each image are the classifier decision values (scores) of the respective images; the ranking is computed by applying the Euclidean distance between the score of the query image and the scores of the retrieved images. Among the top-20 retrieved images, those whose score is close to the score of the query image are more similar to the query image, and vice versa (due to limited space, the retrieval results in response to the query images are shown for two semantic classes only). A small sketch of this score-based ranking follows.
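A minimal sketch of this score-based ranking is given below; since the decision value of each image is a scalar, the Euclidean distance reduces to the absolute difference, and the names used are illustrative placeholders.

```python
# Rank database images by closeness of their classifier decision values to the query's.
import numpy as np

def rank_by_score(query_score, database_scores, top_n=20):
    """Return the indices of the top_n images whose decision values lie closest
    to the query's decision value."""
    distances = np.abs(np.asarray(database_scores) - query_score)
    return np.argsort(distances)[:top_n]

# Example: rank_by_score(1.90, [1.92, 2.07, 1.68, 1.89, 2.13], top_n=3) -> array([3, 0, 1])
```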

4.2. Performance on Caltech-256 Image Benchmark. The Caltech-256 image benchmark (http://www.vision.caltech.edu) consists of 256 semantic classes, and each semantic class contains 80 to 827 images. For simplicity, we randomly selected 10 classes from the Caltech-256 image benchmark. The selected classes include Motorbike, Face-easy, Fireworks, Bonsai, Butterfly, Leopard, Airplanes, Ketch, and Hibiscus, as shown in Figure 10.

Different sizes of visual vocabulary (20, 50, 100, 200, 400, and 600) are constructed from the set of training images.

Figure 8: Image retrieval result showing the reduction of the semantic gap for the semantic class "Buses" (the classifier decision values shown above each retrieved image in the original figure are omitted here).

Figure 9: Image retrieval result showing the reduction of the semantic gap for the semantic class "Beach" (the classifier decision values shown above each retrieved image in the original figure are omitted here).

50% of the images are used for the training and the remaining 50% are used for the testing. The MAP of the proposed research is compared with the standard BoVW representation (without considering the spatial information) on different pixel strides (5, 15, and 25), vocabulary sizes (20, 50, 100, 200, 400, and 600), and feature percentages (10%, 25%, 50%, 75%, and 100%) for vocabulary construction. The MAP obtained from the proposed image representation and the standard BoVW representation by using these different parameters is graphically presented in Figure 11.


Figure 10 Samples of images from 10 semantic classes of Caltech-256 image benchmark

Figure 11: MAP as a function of vocabulary size for the Caltech-256 image benchmark (proposed research versus standard BoVW at pixel strides of 5, 15, and 25).

The experimental results on the 10 semantic classes of the Caltech-256 image benchmark demonstrate the robustness of the proposed image representation. The MAP obtained by using the proposed image representation (with a vocabulary size of 400 visual words and by using 100% features per image) on the pixel strides of 5, 15, and 25 is 86.50%, 83.98%, and 81.71%, respectively, while the MAP obtained by using the same experimental parameters for the standard BoVW representation is 85.45%, 83.05%, and 80.95%, respectively. According to the experimental results, the MAP for all vocabulary sizes decreases with increasing pixel stride, as shown in Figure 11, while the precision-recall curve is shown in Figure 12.

Figure 12: Precision-recall curve for the Caltech-256 image benchmark (proposed research versus standard BoVW, both with a pixel stride of 5).

4.3. Performance on Ground Truth Image Benchmark. The Ground Truth image benchmark (http://imagedatabase.cs.washington.edu/groundtruth/) is a publicly available image benchmark and is used for the evaluation of CBIR research [20, 21, 33]. There are a total of 1109 images that are divided into 22 semantic classes. We manually selected all the images from 5 different semantic classes of the Ground Truth image benchmark. The selected classes for the evaluation of the proposed image representation are Abro green, Cherries, Football, Green Lake, and Swiss Mountains, as shown in Figure 13. These 5 classes are selected in order to compare the performance of the proposed image representation with the existing state-of-the-art research [20, 21, 33], because these researchers also reported their results on the same number of classes of the Ground Truth image benchmark.

50% of the images are used for the training and the remaining 50% are used for the testing.


Figure 13 Samples of images from 5 semantic classes of Ground Truth image benchmark

Figure 14: Comparison of MAP with the state-of-the-art methods on the Ground Truth image benchmark (bar values as read from the figure: 82.84% for the proposed research, 81.33% for Cardoso et al. [20], and 62.80% and 59.09% for the two Yildizer et al. approaches [21, 33]).

Table 4: Classwise comparison of average precision (%).

Class | Proposed method | Cardoso et al. [20] | Yildizer et al. [21]
Abro green | 75.25 | 80 | 66.67
Cherries | 71.94 | 80 | 50
Football | 90.18 | 100 | 75
Green Lake | 84.23 | 80 | 50
Swiss Mountains | 92.63 | 60.67 | 50

Different sizes (10, 20, 30, 40, and 50) of visual vocabulary are constructed, and the MAP on these different vocabulary sizes is calculated. According to the experimental results, a MAP of 82.84% is obtained by using the proposed image representation (with a vocabulary size of 40 visual words, on a pixel stride of 5, and by using 100% features per image). The classwise average precision obtained from the proposed image representation is presented in Table 4, while the MAP is graphically presented in Figure 14.

Table 5: Computational cost (in seconds) of the proposed algorithm.

Number of images retrieved | Proposed research
Top-5 | 0.2872
Top-10 | 0.3972
Top-15 | 0.6470
Top-20 | 0.7837
Top-25 | 0.9184
Top-30 | 1.1103

The experimental results and comparisons conducted on the Ground Truth image benchmark demonstrate the robustness of the proposed research based on a combination of local and global histograms of visual words. The MAP obtained from the proposed image representation is higher than that of the existing state-of-the-art research [20, 21, 33].

4.4. Complexity Performance. The computational cost of the proposed algorithm is measured using an Intel(R) Core i3 (fourth generation) 1.7 GHz CPU with 4 GB DDR3 RAM, 3 MB L3 cache, and the Windows 7 operating system. The proposed algorithm is implemented in MATLAB; the visual vocabulary is constructed offline using the training dataset and tested at run time using the test dataset. The average CPU time (in seconds) required from feature extraction to image retrieval is presented in Table 5.

5. Conclusion and Future Directions

In this paper, we proposed a novel image representation based on a combination of local and global histograms of visual words. The local histogram of visual words adds the spatial information of the central image area to the inverted index of the BoVW representation. The combination of local and global histograms of visual words is a possible solution to capture the semantic information of the image. The performance


of the proposed image representation is evaluated on three challenging image datasets. The proposed image representation outperforms the state-of-the-art research, including the standard BoVW representation. For future work, we plan to replace the BoVW model with either the Fisher kernel framework or the Vector of Locally Aggregated Descriptors (VLAD) model to evaluate large-scale image retrieval.

Competing Interests

The authors declare that they have no competing interests

References

[1] A.-M. Tousch, S. Herbin, and J.-Y. Audibert, "Semantic hierarchies for image annotation: a survey," Pattern Recognition, vol. 45, no. 1, pp. 333-345, 2012.

[2] R. Datta, D. Joshi, J. Li, and J. Z. Wang, "Image retrieval: ideas, influences, and trends of the new age," ACM Computing Surveys, vol. 40, no. 2, article 5, 2008.

[3] Y. Liu, D. Zhang, G. Lu, and W.-Y. Ma, "A survey of content-based image retrieval with high-level semantics," Pattern Recognition, vol. 40, no. 1, pp. 262-282, 2007.

[4] A. Irtaza, M. A. Jaffar, E. Aleisa, and T.-S. Choi, "Embedding neural networks for semantic association in content based image retrieval," Multimedia Tools and Applications, vol. 72, no. 2, pp. 1911-1931, 2014.

[5] D. Zhang, M. M. Islam, and G. Lu, "A review on automatic image annotation techniques," Pattern Recognition, vol. 45, no. 1, pp. 346-362, 2012.

[6] J. Sivic and A. Zisserman, "Video Google: a text retrieval approach to object matching in videos," in Proceedings of the 9th IEEE International Conference on Computer Vision, pp. 1470-1477, IEEE, 2003.

[7] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, "Object retrieval with large vocabularies and fast spatial matching," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '07), pp. 1-8, IEEE, Minneapolis, Minn, USA, June 2007.

[8] O. Chum, J. Philbin, J. Sivic, M. Isard, and A. Zisserman, "Total recall: automatic query expansion with a generative feature model for object retrieval," in Proceedings of the 2007 IEEE 11th International Conference on Computer Vision (ICCV '07), pp. 1-8, IEEE, Rio de Janeiro, Brazil, October 2007.

[9] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, "Lost in quantization: improving particular object retrieval in large scale image databases," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1-8, IEEE, Anchorage, Alaska, USA, June 2008.

[10] H. Anwar, S. Zambanini, M. Kampel, and K. Vondrovec, "Ancient coin classification using reverse motif recognition: image-based classification of Roman Republican coins," IEEE Signal Processing Magazine, vol. 32, no. 4, pp. 64-74, 2015.

[11] S. Zhang, Q. Tian, G. Hua, Q. Huang, and W. Gao, "Generating descriptive visual words and visual phrases for large-scale image applications," IEEE Transactions on Image Processing, vol. 20, no. 9, pp. 2664-2677, 2011.

[12] Y. Su and F. Jurie, "Improving image classification using semantic attributes," International Journal of Computer Vision, vol. 100, no. 1, pp. 59-77, 2012.

[13] Z. Mehmood, S. M. Anwar, M. Altaf, and N. Ali, "A novel image retrieval based on rectangular spatial histograms of visual words," Kuwait Journal of Science, in press.

[14] S. Lazebnik, C. Schmid, and J. Ponce, "Beyond bags of features: spatial pyramid matching for recognizing natural scene categories," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '06), vol. 2, pp. 2169-2178, New York, NY, USA, June 2006.

[15] C. Wang, B. Zhang, Z. Qin, and J. Xiong, "Spatial weighting for bag-of-features based image retrieval," in Integrated Uncertainty in Knowledge Modelling and Decision Making, Z. Qin and V.-N. Huynh, Eds., vol. 8032 of Lecture Notes in Computer Science, pp. 91-100, Springer, Berlin, Germany, 2013.

[16] N. Ali, K. B. Bajwa, R. Sablatnig, and Z. Mehmood, "Image retrieval by addition of spatial information based on histograms of triangular regions," Computers & Electrical Engineering, 2016.

[17] S. M. Youssef, "ICTEDCT-CBIR: integrating curvelet transform with enhanced dominant colors extraction and texture analysis for efficient content-based image retrieval," Computers and Electrical Engineering, vol. 38, no. 5, pp. 1358-1376, 2012.

[18] P. Poursistani, H. Nezamabadi-pour, R. Askari Moghadam, and M. Saeed, "Image indexing and retrieval in JPEG compressed domain based on vector quantization," Mathematical and Computer Modelling, vol. 57, no. 5-6, pp. 1005-1017, 2013.

[19] X. Tian, L. Jiao, X. Liu, and X. Zhang, "Feature integration of EODH and Color-SIFT: application to image retrieval based on codebook," Signal Processing: Image Communication, vol. 29, no. 4, pp. 530-545, 2014.

[20] D. N. M. Cardoso, J. Muller, F. Alexandre, L. A. P. Neves, P. M. G. Trevisani, and G. A. Giraldi, "Iterative technique for content-based image retrieval using multiple SVM ensembles," in A Treatise on Electricity and Magnetism, J. Clerk Maxwell, Ed., vol. 2, pp. 68-73, Cambridge University Press, 1873.

[21] E. Yildizer, A. M. Balci, M. Hassan, and R. Alhajj, "Efficient content-based image retrieval using multiple support vector machines ensemble," Expert Systems with Applications, vol. 39, no. 3, pp. 2385-2396, 2012.

[22] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '05), vol. 1, pp. 886-893, IEEE, San Diego, Calif, USA, June 2005.

[23] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, vol. 60, no. 2, pp. 91-110, 2004.

[24] J. Matas, O. Chum, M. Urban, and T. Pajdla, "Robust wide-baseline stereo from maximally stable extremal regions," Image and Vision Computing, vol. 22, no. 10, pp. 761-767, 2004.

[25] H. Bay, T. Tuytelaars, and L. Van Gool, "SURF: speeded up robust features," in Computer Vision – ECCV 2006, pp. 404-417, Springer, 2006.

[26] D. J. Mankowitz and S. Ramamoorthy, "BRISK-based visual feature extraction for resource constrained robots," in RoboCup 2013: Robot World Cup XVII, S. Behnke, M. Veloso, A. Visser, and R. Xiong, Eds., vol. 8371 of Lecture Notes in Computer Science, pp. 195-206, Springer, Berlin, Germany, 2014.

[27] S. Krig, "Interest point detector and feature descriptor survey," in Computer Vision Metrics, pp. 217-282, Springer, Berlin, Germany, 2014.

[28] R. Ashraf, K. Bashir, A. Irtaza, and M. T. Mahmood, "Content based image retrieval using embedded neural networks with bandletized regions," Entropy, vol. 17, no. 6, pp. 3552-3580, 2015.


[29] S. Zeng, R. Huang, H. Wang, and Z. Kang, "Image retrieval using spatiograms of colors quantized by Gaussian mixture models," Neurocomputing, vol. 171, pp. 673-684, 2016.

[30] E. Walia and A. Pal, "Fusion framework for effective color image retrieval," Journal of Visual Communication and Image Representation, vol. 25, no. 6, pp. 1335-1348, 2014.

[31] A. Irtaza and M. A. Jaffar, "Categorical image retrieval through genetically optimized support vector machines (GOSVM) and hybrid texture features," Signal, Image and Video Processing, vol. 9, no. 7, pp. 1503-1519, 2014.

[32] J. Yu, Z. Qin, T. Wan, and X. Zhang, "Feature integration analysis of bag-of-features model for image retrieval," Neurocomputing, vol. 120, pp. 355-364, 2013.

[33] E. Yildizer, A. M. Balci, T. N. Jarada, and R. Alhajj, "Integrating wavelets with clustering and indexing for effective content-based image retrieval," Knowledge-Based Systems, vol. 31, pp. 55-66, 2012.

[34] J. Shawe-Taylor and N. Cristianini, Kernel Methods for Pattern Analysis, Cambridge University Press, New York, NY, USA, 2004.

[35] A. Vedaldi and A. Zisserman, "Sparse kernel approximations for efficient classification and detection," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '12), pp. 2320-2327, June 2012.

[36] E. Nowak, F. Jurie, and B. Triggs, "Sampling strategies for bag-of-features image classification," in Computer Vision – ECCV 2006, pp. 490-503, Springer, 2006.

[37] G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray, "Visual categorization with bags of keypoints," in Proceedings of the Workshop on Statistical Learning in Computer Vision (ECCV '04), vol. 1, pp. 1-2, Prague, Czech Republic, 2004.


Page 3: Research Article A Novel Image Retrieval Based on a Combination …downloads.hindawi.com/journals/mpe/2016/8217250.pdf · 2019-07-30 · image retrieval by using a combination of

Mathematical Problems in Engineering 3

(a) (b)

Figure 2 (a) is presenting the procedure for computation of global histogram while (b) is presenting the procedure for the extraction oflocal histogram

and textureThe shape features were extracted by applying anAngular Radial Transform (ART) The retrieval effectivenessof the proposed framework was enhanced by modifyingthe CDH algorithm Irtaza and Jaffar [31] proposed a com-bination of Genetic Algorithm (GA) and Support VectorMachines (SVM) to reduce the semantic gap Images wererepresented as a combination of curvelet wavelet and Gaborfeatures Relevance Feedback (RF) was applied to enhancethe accuracy of the proposed work Tian et al [19] proposeda rotation-invariant and scale-invariant Edge OrientationDifferenceHistogram (EODH) descriptor Steerable filter andvector sumwere applied to obtain themain orientation of thepixelsThe effectiveness of the feature space was improved byselecting a combination of Color SIFT and EODHCodebookwas constructed by applying the weighted average of ColorSIFT and EODH Yu et al [32] proposed an image retrieval byusing a combination of Local Binary Pattern (LBP)withHOGand SIFT The midlevel featuresrsquo combination was selectedto achieve high performance for the complex backgroundVisual features were separately extracted from the images byusing SIFT and LBPWeighted average clustering was appliedfor the codebook construction to obtain the integrationof two midlevel features According to the experimentalresults [32] the best performance of image retrieval wasobtained by applying the feature integration of SIFT and LBPWang et al [15] proposed Spatial Weighting Bag-of-Features(SWBOF) framework and extracted spatial information fromthe subblocks of the images Local entropy local varianceand adjacent blocks distance were selected to calculate thespatial information According to experimental results [15]SWBOF with spatial information performs better than thetraditional BoF representation Yildizer et al [33] proposedCBIR by using multiple Support Vector Machines ensembleSVM is used for regression and Support Vector Regression(SVR) ensemble is selected for classification According tothe experimental results the proposed technique improvesthe accuracy of image retrieval Cardoso et al [20] proposediterative technique for CBIR that is based on Multiple SVMEnsembles Discrete Cosine Transform (DCT) is selected forfeature extraction and multi-SVM is used for classificationYildizer et al [21] integrated wavelets for an effective content-based image retrieval k-means with B+-tree data structure isused for clustering with Daubechies wavelet transform thathas excellent spatial and spectral locality properties whichmake it very useful for CBIR it is applied to partitioning theimages into different levels Youssef [17] proposed a novel

Integrating Curvelet Transform with Enhanced DominantColors extraction and Texture (ICTEDCT) analysis for effi-cientCBIRCurveletmultiscale ridgeletswere integratedwithregion-based vector codebook subband clustering to enhancedominant colors extraction and texture analysis

3 Proposed Methodology

The lack of spatial information is the main problem inthe standard BoVW representation [7 15] Visual wordsare represented in a histogram without considering theirlocations in the 2D image plane The spatial informationcarries discriminating details that enhance the performanceof CBIR [7 15] The block diagram of proposed researchmethodology is presented in Figure 3

(1) In the BoVW representation a raw image 119868 is repre-sented as

119868 = (119886119894119895) (1)

where 119886119894119895is the pixel at the position (119894 119895)

(2) Dense SIFT features are extracted from the image andan image 119868 is represented as

119868 = 1198891 1198892 119889

119898 (2)

where 1198891to 119889119898are image descriptors

(3) Quantization algorithms like k-means clustering isapplied to construct a visual vocabulary (codebook)consisting of 119899 visual words represented as

voc = 1199081 1199082 119908

119899 (3)

where 1199081to 119908119899are visual words

(4) For the construction of global histogram of visualwords mapping of each visual word is done over thewhole image For the construction of local histogramof visual words mapping of each visual word is doneby extracting the image central area by using (5) Thenearest visual words are assigned to the quantizeddescriptors by using the following equation

119908 (119889119896) = argmin119908isinvoc

Dist (119908 119889119896) (4)

where 119908(119889119896) is representing the visual word assigned

to the kth descriptor 119889119896 while Dist(119908 119889

119896) is the

4 Mathematical Problems in Engineering

Query image

Learning ofclassifier

on histogramsof training

images

Retrieved images

Dense features extraction

Clustering

Dense features extraction

Classification

Similarity measure

Clustering

Vocabulary construction

Training images representations by using

a combination of local and global histograms

of visual words

Class-1 Class-N

Figure 3 Block diagram of proposed research based on a combination of local and global histograms of visual words

distance between the descriptor 119889119896and visual word

119908 Each image is represented as a collection of patchesand each patch is represented by visual words

(5) Two histograms of 119873 visual words are extractedfrom a single image The computed local and globalhistograms are concatenated and this information isadded to the inverted index of BoVW representationThe visual words of the local histogram aremapped toa rectangular area that is defined by image width (W)and height (H) Four points are defined to extract theoptimized local rectangular area as follows

1199011= 119882 times 119909

1199012= (119882 times 119910) + 119901

1= (119882 times 119910) + (119882 times 119909)

= 119882 times (119909 + 119910)

1199013= 119867 times 119909

1199014= (119867 times 119910) + 119901

3= (119867 times 119910) + (119867 times 119909)

= 119867 times (119909 + 119910)

(5)

where 119909 is the normalized starting point and 119910 is theending point of the local rectangular area

(6) Consider 119873 as the number of visual words of thevocabulary Let 119863

119894be the set of the descriptors that

are mapped to the visual word 119908119894 then the 119894th bin of

the histogram of visual words 119887119894is the cardinality of

the set119863119894

119887119894= Card (119863

119894)

119863119894= 119889119896 119896 isin (1 119873) | 119908 (119889119896) = 119908119894

(6)

31 Image Classification SVM is a state-of-the-art supervisedlearning classification algorithm [5] The linear SVM sepa-rates two classes by using a hyperplane The dataset with twoclasses is represented as

(119909119894 119910119894)119873

119894=1119910119894= +1 minus1 (7)

where 119909119894and 119910119894are input datasets and +1 and minus1 are the corre-

spondence labels of the classes respectively The hyperplanesare generated by finding the values of the coefficients

119908119879sdot 119909 + 119887 = 0 (8)

where119908 is weight vector and 119887 is biasThemaximummarginis determined by 2119908 hyperplanes and the two classes

Mathematical Problems in Engineering 5

are separable from each other according to the followingequations

119908119879sdot 119909119894+ 119887 = 1

119908119879sdot 119909119894+ 119887 = minus1

(9)

This can be expressed equivalently as

119910119894(119908119879sdot 119909119894+ 119887) ge 1 (10)

The kernel method [34] is used in SVM to compute thedot product in the high-dimensional feature space that pro-vides the ability to generate nonlinear decision boundariesThe kernel function permits using the data with no obviousfixed dimensionsThehistograms of visual words constructedover the local and global areas of the image are normalizedand SVM Hellinger kernel [35] is applied on the normalizedhistograms by using following equation

119870(ℎ ℎ1015840) = sum

119894

radicℎ (119894) ℎ1015840(119894) (11)

where ℎ and ℎ1015840 are the normalized histogramsThe SVM Hellinger kernel is selected because of its low

computational cost and instead of computing the kernelvalues it explicitly computes the featuremap and the classifierremains linear The one-versus-one rule is applied for 119896number of classes 119896 sdot (119896 minus 1)2 classifiers are constructed andeach classifier trains the data by using two classes

4 Experiments and Results

This section is about the details of experiments conducted forthe evaluation of the proposed image representation basedon a combination of local and global histograms of visualwords The proposed research is evaluated on Corel-A imagebenchmark Caltech-256 and Ground Truth image datasetsThe images are randomly divided into training and testdatasets The visual vocabulary (codebook) is constructedfrom the training images and retrieval precision is calculatedby using the test dataset Keeping in view the unsupervisednature of clustering by applying k-means each experimentis repeated 10 times and average values of precision arereported In order to evaluate the performance of proposedresearch we determined the relevant images retrieved inresponse to a query image A computer simulation is used toselect images randomly from test dataset and use them as aquery imageThe response to the query image is evaluated onthe basis of relevant images retrieved Precision determinesthe number of relevant images retrieved in response to aquery image and it shows the specificity of the image retrievalsystem

Precision =Number of relevant images retrievedTotal number of images retrieved

(12)

Recall measures the sensitivity of the image retrievalsystem Recall is calculated by the ratio of correct images

retrieved to the total number of images of that class in theimage benchmark

Recall =Number of relevant images retrievedTotal number of relevant images

(13)

The details about experimental parameters are givenbelow

(1) Vocabulary Size The size of visual vocabulary is amajor parameter that affects the performance of content-based image matching [36 37] increasing the size of visualvocabulary at certain level and increases the performanceand larger size visual vocabulary tends to overfit Differentsizes of visual vocabulary are constructed for the Corel-A (2050 100 200 and 400) Caltech-256 (20 50 100 200 400and 600) and Ground Truth (10 20 30 40 and 50) imagebenchmarks in order to evaluate the best performance of theproposed image representation

(2) Dense Pixel Stride The dense SIFT features are extractedfrom the training and test images For a precise content-based image matching we extracted dense features usingthree different scales The dense pixel stride or the step sizeis used to control the spatial resolution of the dense grid Asmaller pixel stride results in a finer grid while a larger pixelstride makes the grid coarser With a pixel stride of 5 15 and25 we considered every 5th 15th and 25th pixel respectivelyas a features to calculate the SIFT descriptor from local andglobal regions of each image

(3) Feature Percentage for the Vocabulary Construction Thefeature percentage to construct a visual vocabulary froma training dataset is one of the parameters that affect theperformance of image retrieval According to [36] increasingthe percentage of features for visual vocabulary constructionincreases the performance of content-based image matchingand vice versa In experiments different percentages (1025 50 75 and 100) of dense features per image areused to construct visual vocabulary

(4) Value of 119909 and y x and 119910 are used for the extraction oflocal rectangular area After a number of experiments theoptimized normalized starting value of 119909 = 022 and endingvalue of 119910 = 060 are selected for training and test images

41 Retrieval Performance on Corel-A Image Benchmark TheCorel-A image benchmark (httpwangistpsuedudocsrelated) is selected for the evaluation of the proposedimage representation and results are compared with thestandard BoVW representation [6] and the state-of-the-artCBIR research [4 15 17ndash19] The Corel-A image benchmarkcontains 1000 images that are divided into ten semantic cat-egories namely Africa Buildings Beach Dinosaurs BusesElephants Horses Flowers Mountains and Food Eachsemantic category consists of 100 images with a resolution of256 times 384 pixels or 384 times 256 pixels Figure 4 is representingthe images from all the semantic classes of Corel-A imagebenchmark

6 Mathematical Problems in Engineering

Figure 4 Samples of images from each category of Corel-A image benchmark

Table 1 MAP of the proposed image representation on a vocabulary size of 200 visual words and pixel stride of 5

Vocabulary size ampfeatures used 20 50 100 200 400

10 7740 8027 8198 8401 825725 7774 8092 8202 8412 831650 7785 8106 8253 8425 830175 7808 8116 8307 8433 8324100 7836 8132 8327 8438 8329MAP 7788 8095 8257 8421 8305Standard deviation 03609 04050 05899 01522 02905Confidence interval 7743ndash7833 8044ndash8144 8184ndash8330 8402ndash8440 8269ndash8341Standard error 01614 01811 02638 00680 01299

In order to maintain a balance 50 images are used forthe training and the remaining 50 images are used for thetesting Different sizes of visual vocabulary (20 50 100 200and 400) are constructed from a set of training images MeanAverage Precision (MAP) is calculated by a random selectionof 500 images from the test dataset MAP for top-20 imageretrievals as a function of vocabulary size and percentage ofdense features per image used in vocabulary construction arepresented in Table 1

The additional statistical investigation in Table 1 rein-forces our described results We have calculated standarderror as description and estimated 95 confidence intervalas inferential results From these results we can easilyconclude that both the lower and the upper bounds at 5level of significance exceed 84 of MAP and the smallervalue of standard error further enhances the fact that thevisual vocabulary of size 200 visual words shows preciseand consistent result as compared to other sizes of visual

vocabulary on different features percentages (10 25 5075 and 100)

The MAP obtained by using proposed image representa-tion based on a combination of local and global histogramsof visual words (by using a vocabulary size of 200 visualwords) on the pixel strides of 5 15 and 25 is 84218212 and 7829 respectively The MAP of the proposedimage representation is compared with the standard BoVWrepresentation (without considering the spatial informationof local histogram) The MAP comparison of the proposedresearch and standard BoVWas a function of vocabulary sizeon the pixel strides of 5 15 and 25 is graphically representedin Figure 5 The proposed image representation outperformsthe standard BoVW representation on all vocabulary sizes(pixel strides of 5 15 and 25) According to the experimentalresults the MAP for all vocabulary sizes on a pixel stride of 5is greater than pixel strides of 15 and 25

Mathematical Problems in Engineering 7

Table 2 Comparison of MAP for top-20 image retrievals on Corel-A image benchmark

Class amp method Proposed research Youssef [17] Irtaza et al [4] Poursistani et al [18] Tian et al [19] Wang et al [15]Africa 7303 635 65 7024 746 64Beach 7458 642 60 4444 378 54Buildings 8024 698 62 708 539 53Buses 9584 915 85 763 967 94Dinosaurs 9795 992 93 100 99 98Elephants 8764 781 65 638 66 78Flowers 8513 948 94 924 92 71Horses 8629 952 77 947 87 93Mountains 8243 738 73 562 585 42Food 7896 806 81 745 622 50

Table 3 Comparison of recall for top-20 image retrievals on Corel-A image benchmark

Class amp method Proposed research Youssef [17] Irtaza et al [4] Poursistani et al [18] Tian et al [19] Wang et al [15]Africa 1461 1270 1300 1405 1492 1280Beach 1492 1284 1200 889 756 1080Buildings 1605 1396 1240 1416 1078 1060Buses 1917 1830 1700 1526 1934 1880Dinosaurs 1959 1984 1860 2000 1980 1960Elephants 1753 1562 1300 1276 1320 1560Flowers 1703 1896 1880 1848 1840 1420Horses 1726 1904 1540 1894 1740 1860Mountains 1649 1476 1460 1124 1170 840Food 1579 1612 1620 1490 1244 1000Mean 1684 1621 1510 1457 1455 1394

100 200 300 4000Size of vocabulary

75

80

85

MA

P

Proposed research with a pixel stride of 5BoVW with a pixel stride of 5Proposed research with a pixel stride of 15BoVW with a pixel stride of 15Proposed work with a pixel stride of 25BoVW with a pixel stride of 25

Figure 5 MAP as a function of vocabulary size

In order to present a sustainable performance of theproposed research the image retrieval precision and recallfor the top-20 image retrievals are compared with the state-of-the-art research of CBIR [4 15 17ndash19] Tables 2 and 3 arepresenting the classwise comparison of average precision andrecall of the proposed research (with a vocabulary size of 200words on a pixel stride of 5 and by using 100 features perimage) with the existing state-of-the-art techniques of CBIRThe MAP comparison is shown in Figure 6 while precision-recall curve is shown in Figure 7

The experimental results conducted on Corel-A imagebenchmark prove the robustness of the proposed imagerepresentation The MAP of the proposed research is higherthan the existing state-of-the-art research [4 15 17ndash19] TheMAP obtained from the proposed image representation is8421 (with a vocabulary size of 200 visual words on a pixelstride of 5 and by using 100 features per image)

The image retrieval results obtained by using proposedimage representation for the semantic class ldquoBusesrdquo andldquoBeachrdquo are shown in Figures 8 and 9 respectively in res-ponse to the query images that shows reduction of semanticgap in terms of classifier decision value (score) The classifierdecision label determines the class of the image whileclassifier decision value (score) is used to find the similar

8 Mathematical Problems in Engineering

Proposed researchYoussef et alIrtaza et al

Poursistani et al

Wang et alTian et al

84218107

7550 7434 72776970

60

65

70

75

80

85

90

Mea

n av

erag

e pre

cisio

n (M

AP)

()

Figure 6 Comparison of MAP with the state-of-the-art methodson Corel-A image benchmark

Proposed research with pixel stride of 5SIFT-LBP [32]

20 40 60 80 1000Recall ()

40

60

80

100

Prec

ision

()

Figure 7 Precision-recall curve for Corel-A image benchmark

images The real values shown at the top of each image arethe classifier decision value (score) of the respective imagecomputed by applying the Euclidean distance between scoreof the query image and scores of the retrieved images Top-20retrieved images whose score is close to the score of the queryimage aremore similar to the query image and vice versa (dueto limited space the retrieval results in response to the queryimages are shown for two semantic classes only)

42 Performance on Caltech-256 Image Benchmark Caltech-256 image benchmark (httpwwwvisioncaltechedu) con-sists of 256 semantic classes and each semantic class contains80 to 827 images For the simplicity we randomly selected10 classes from the Caltech-256 image benchmark Theselected classes are Motorbike Faceeasy Fireworks BonsaiButterfly Leopard Airplanes Ketch and Hibiscus as shownin Figure 10

Different sizes of visual vocabulary (20 50 100 200 400and 600) are constructed from a set of training images 50

190792

192121

181437

207091

168814

1897

184608

20671

212677

191177

196715

204835

212566

18671

177793

170756

214148

188865

18053

173708

168421

Figure 8 Image retrieval result shows reduction of semantic gap forthe semantic class ldquoBusesrdquo

209122

206148

200182

194074

182063

211198

202284

194183

186

210937

203655

221345

190409

214339

220837

19114

181279

205414

197978

225197

181495

Figure 9 Image retrieval result shows reduction of semantic gap forthe semantic class ldquoBeachrdquo

images are used for the training and the remaining 50images are used for the testing The MAP of the proposedresearch is compared with standard BoVW representation(without considering the spatial information) on differentpixel strides (5 15 and 25) vocabulary sizes (20 50 100200 400 and 600) and feature percentages (10 2550 75 and 100) for vocabulary construction The MAPobtained from the proposed image representation and thestandard BoVW representation by using different parametersis graphically presented in Figure 11

Mathematical Problems in Engineering 9

Figure 10 Samples of images from 10 semantic classes of Caltech-256 image benchmark

100 200 300 400 500 6000Size of vocabulary

75

80

85

MA

P

Proposed research with pixel stride of 5BoVW with pixel stride of 5Proposed research with pixel stride of 15BoVW with pixel stride of 15Proposed research with pixel stride of 25BoVW with pixel stride of 25

Figure 11 MAP as a function of vocabulary size for the Caltech-256image benchmark

The experimental results conducted on 10 semanticclasses of Caltech-256 image benchmark prove the robustnessof the proposed image representation The MAP obtained byusing the proposed image representation (with a vocabularysize of 400 visualwords andby using 100 features per image)on the pixel strides of 5 15 and 25 is 8650 8398 and8171 respectively while MAP obtained by using the sameexperimental parameters for standard BoVW representationis 8545 8305 and 8095 respectively According tothe experimental results the MAP for all vocabulary sizesdecreases by increasing the pixel stride as shown in Figure 11while precision-recall curve is shown in Figure 12

Proposed research with pixel stride of 5BoVW with pixel stride of 5

50

60

70

80

90Pr

ecisi

on (

)

20 40 60 80 1000Recall ()

Figure 12 Precision-recall curve for Caltech-256 image benchmark

43 Performance on Ground Truth Image BenchmarkGround Truth image benchmark (httpimagedatabasecswashingtonedugroundtruth) is a publicly available imagebenchmark and is used for the evaluation of CBIR research[20 21 33] There are a total of 1109 images that are dividedinto 22 semantic classes We manually selected all the imagesfrom 5 different semantic classes of Ground Truth imagebenchmark The selected classes for the evaluation of theproposed image representation are Abro green CherriesFootball Green Lake and Swiss Mountains as shown inFigure 13 The 5 classes are selected in order to comparethe performance of the proposed image representation withexisting state-of-the-art research [20 21 33] because theseresearchers also reported their results on the same numberof classes of Ground Truth image benchmark

50 images are used for the training and the remaining50 images are used for the testing Different sizes (10 20

10 Mathematical Problems in Engineering

Figure 13 Samples of images from 5 semantic classes of Ground Truth image benchmark

8284 8133

62805909

Proposed researchCardoso et al

Yildizer et alYildizer et al

45

50

55

60

65

70

75

80

85

MA

P (

)

Figure 14 Comparison of MAP with the state-of-the-art methodson Ground Truth image benchmark

Table 4 Classwise comparison of average precision

Class amp method Proposed method [20] [21]Abro green 7525 80 6667Cherries 7194 80 50Football 9018 100 75Green Lake 8423 80 50Swiss Mountains 9263 6067 50

30 40 and 50) of visual vocabulary are constructed andMAP on the vocabulary of these different sizes is calculatedAccording to the experimental results the MAP of 8284 isobtained by using the proposed image representation (with avocabulary size of 40 visual words on pixel stride of 5 andby using 100 features per image) The classwise averageprecision obtained from the proposed image representationis presented in Table 4 while MAP is graphically presentedin Figure 14

Table 5 Computational cost (in seconds) of the proposed algo-rithm

Number of images retrieved Proposed researchTop-5 02872Top-10 03972Top-15 06470Top-20 07837Top-25 09184Top-30 11103

Experimental results and comparisons conducted on theGround Truth image benchmark prove the robustness of theproposed research based on a combination of local and globalhistograms of visual words The MAP obtained from theproposed image representation is higher than the existingstate-of-the-art research [20 21 33]

44 Complexity Performance The computational cost of theproposed algorithm is calculated using Intel(R) Core i3(fourth generation) 17 Ghz CPU with 4GB RAM (DDR3)3MB L3 cache and Windows 7 operating system Theproposed algorithm is implemented in MATLAB and visualvocabulary is constructed offline using a training dataset andtested at run time using a test dataset The average CPUtime (in seconds) required from features extraction to imageretrieval is presented in Table 5

5 Conclusion and Future Directions

In this paper, we proposed a novel image representation based on a combination of local and global histograms of visual words. The local histogram of visual words adds the spatial information of the central image area to the inverted index of the BoVW representation. The combination of local and global histograms of visual words is a possible solution to capture the semantic information of the image. The performance of the proposed image representation is evaluated on three challenging image datasets. The proposed image representation outperforms the state-of-the-art research, including the standard BoVW representation. For future work, we plan to replace the BoVW model with either the Fisher kernel framework or the Vector of Locally Aggregated Descriptors (VLAD) model to evaluate large-scale image retrieval.

Competing Interests

The authors declare that they have no competing interests.

References

[1] A.-M. Tousch, S. Herbin, and J.-Y. Audibert, "Semantic hierarchies for image annotation: a survey," Pattern Recognition, vol. 45, no. 1, pp. 333–345, 2012.
[2] R. Datta, D. Joshi, J. Li, and J. Z. Wang, "Image retrieval: ideas, influences, and trends of the new age," ACM Computing Surveys, vol. 40, no. 2, article 5, 2008.
[3] Y. Liu, D. Zhang, G. Lu, and W.-Y. Ma, "A survey of content-based image retrieval with high-level semantics," Pattern Recognition, vol. 40, no. 1, pp. 262–282, 2007.
[4] A. Irtaza, M. A. Jaffar, E. Aleisa, and T.-S. Choi, "Embedding neural networks for semantic association in content based image retrieval," Multimedia Tools and Applications, vol. 72, no. 2, pp. 1911–1931, 2014.
[5] D. Zhang, M. M. Islam, and G. Lu, "A review on automatic image annotation techniques," Pattern Recognition, vol. 45, no. 1, pp. 346–362, 2012.
[6] J. Sivic and A. Zisserman, "Video Google: a text retrieval approach to object matching in videos," in Proceedings of the 9th IEEE International Conference on Computer Vision, pp. 1470–1477, IEEE, 2003.
[7] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, "Object retrieval with large vocabularies and fast spatial matching," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '07), pp. 1–8, IEEE, Minneapolis, Minn, USA, June 2007.
[8] O. Chum, J. Philbin, J. Sivic, M. Isard, and A. Zisserman, "Total recall: automatic query expansion with a generative feature model for object retrieval," in Proceedings of the 2007 IEEE 11th International Conference on Computer Vision (ICCV '07), pp. 1–8, IEEE, Rio de Janeiro, Brazil, October 2007.
[9] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, "Lost in quantization: improving particular object retrieval in large scale image databases," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1–8, IEEE, Anchorage, Alaska, USA, June 2008.
[10] H. Anwar, S. Zambanini, M. Kampel, and K. Vondrovec, "Ancient coin classification using reverse motif recognition: image-based classification of Roman Republican coins," IEEE Signal Processing Magazine, vol. 32, no. 4, pp. 64–74, 2015.
[11] S. Zhang, Q. Tian, G. Hua, Q. Huang, and W. Gao, "Generating descriptive visual words and visual phrases for large-scale image applications," IEEE Transactions on Image Processing, vol. 20, no. 9, pp. 2664–2677, 2011.
[12] Y. Su and F. Jurie, "Improving image classification using semantic attributes," International Journal of Computer Vision, vol. 100, no. 1, pp. 59–77, 2012.
[13] Z. Mehmood, S. M. Anwar, M. Altaf, and N. Ali, "A novel image retrieval based on rectangular spatial histograms of visual words," Kuwait Journal of Science, in press.
[14] S. Lazebnik, C. Schmid, and J. Ponce, "Beyond bags of features: spatial pyramid matching for recognizing natural scene categories," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '06), vol. 2, pp. 2169–2178, New York, NY, USA, June 2006.
[15] C. Wang, B. Zhang, Z. Qin, and J. Xiong, "Spatial weighting for bag-of-features based image retrieval," in Integrated Uncertainty in Knowledge Modelling and Decision Making, Z. Qin and V.-N. Huynh, Eds., vol. 8032 of Lecture Notes in Computer Science, pp. 91–100, Springer, Berlin, Germany, 2013.
[16] N. Ali, K. B. Bajwa, R. Sablatnig, and Z. Mehmood, "Image retrieval by addition of spatial information based on histograms of triangular regions," Computers & Electrical Engineering, 2016.
[17] S. M. Youssef, "ICTEDCT-CBIR: integrating curvelet transform with enhanced dominant colors extraction and texture analysis for efficient content-based image retrieval," Computers and Electrical Engineering, vol. 38, no. 5, pp. 1358–1376, 2012.
[18] P. Poursistani, H. Nezamabadi-pour, R. Askari Moghadam, and M. Saeed, "Image indexing and retrieval in JPEG compressed domain based on vector quantization," Mathematical and Computer Modelling, vol. 57, no. 5-6, pp. 1005–1017, 2013.
[19] X. Tian, L. Jiao, X. Liu, and X. Zhang, "Feature integration of EODH and Color-SIFT: application to image retrieval based on codebook," Signal Processing: Image Communication, vol. 29, no. 4, pp. 530–545, 2014.
[20] D. N. M. Cardoso, J. Muller, F. Alexandre, L. A. P. Neves, P. M. G. Trevisani, and G. A. Giraldi, "Iterative technique for content-based image retrieval using multiple SVM ensembles," in A Treatise on Electricity and Magnetism, J. Clerk Maxwell, Ed., vol. 2, pp. 68–73, Cambridge University Press, 1873.
[21] E. Yildizer, A. M. Balci, M. Hassan, and R. Alhajj, "Efficient content-based image retrieval using multiple support vector machines ensemble," Expert Systems with Applications, vol. 39, no. 3, pp. 2385–2396, 2012.
[22] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '05), vol. 1, pp. 886–893, IEEE, San Diego, Calif, USA, June 2005.
[23] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, vol. 60, no. 2, pp. 91–110, 2004.
[24] J. Matas, O. Chum, M. Urban, and T. Pajdla, "Robust wide-baseline stereo from maximally stable extremal regions," Image and Vision Computing, vol. 22, no. 10, pp. 761–767, 2004.
[25] H. Bay, T. Tuytelaars, and L. Van Gool, "SURF: speeded up robust features," in Computer Vision – ECCV 2006, pp. 404–417, Springer, 2006.
[26] D. J. Mankowitz and S. Ramamoorthy, "BRISK-based visual feature extraction for resource constrained robots," in RoboCup 2013: Robot World Cup XVII, S. Behnke, M. Veloso, A. Visser, and R. Xiong, Eds., vol. 8371 of Lecture Notes in Computer Science, pp. 195–206, Springer, Berlin, Germany, 2014.
[27] S. Krig, "Interest point detector and feature descriptor survey," in Computer Vision Metrics, pp. 217–282, Springer, Berlin, Germany, 2014.
[28] R. Ashraf, K. Bashir, A. Irtaza, and M. T. Mahmood, "Content based image retrieval using embedded neural networks with bandletized regions," Entropy, vol. 17, no. 6, pp. 3552–3580, 2015.
[29] S. Zeng, R. Huang, H. Wang, and Z. Kang, "Image retrieval using spatiograms of colors quantized by Gaussian Mixture Models," Neurocomputing, vol. 171, pp. 673–684, 2016.
[30] E. Walia and A. Pal, "Fusion framework for effective color image retrieval," Journal of Visual Communication and Image Representation, vol. 25, no. 6, pp. 1335–1348, 2014.
[31] A. Irtaza and M. A. Jaffar, "Categorical image retrieval through genetically optimized support vector machines (GOSVM) and hybrid texture features," Signal, Image and Video Processing, vol. 9, no. 7, pp. 1503–1519, 2014.
[32] J. Yu, Z. Qin, T. Wan, and X. Zhang, "Feature integration analysis of bag-of-features model for image retrieval," Neurocomputing, vol. 120, pp. 355–364, 2013.
[33] E. Yildizer, A. M. Balci, T. N. Jarada, and R. Alhajj, "Integrating wavelets with clustering and indexing for effective content-based image retrieval," Knowledge-Based Systems, vol. 31, pp. 55–66, 2012.
[34] J. Shawe-Taylor and N. Cristianini, Kernel Methods for Pattern Analysis, Cambridge University Press, New York, NY, USA, 2004.
[35] A. Vedaldi and A. Zisserman, "Sparse kernel approximations for efficient classification and detection," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '12), pp. 2320–2327, June 2012.
[36] E. Nowak, F. Jurie, and B. Triggs, "Sampling strategies for bag-of-features image classification," in Computer Vision – ECCV 2006, pp. 490–503, Springer, 2006.
[37] G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray, "Visual categorization with bags of keypoints," in Proceedings of the Workshop on Statistical Learning in Computer Vision (ECCV '04), vol. 1, pp. 1–2, Prague, Czech Republic, 2004.



Page 5: Research Article A Novel Image Retrieval Based on a Combination …downloads.hindawi.com/journals/mpe/2016/8217250.pdf · 2019-07-30 · image retrieval by using a combination of

Mathematical Problems in Engineering 5

are separable from each other according to the followingequations

119908119879sdot 119909119894+ 119887 = 1

119908119879sdot 119909119894+ 119887 = minus1

(9)

This can be expressed equivalently as

119910119894(119908119879sdot 119909119894+ 119887) ge 1 (10)

The kernel method [34] is used in SVM to compute thedot product in the high-dimensional feature space that pro-vides the ability to generate nonlinear decision boundariesThe kernel function permits using the data with no obviousfixed dimensionsThehistograms of visual words constructedover the local and global areas of the image are normalizedand SVM Hellinger kernel [35] is applied on the normalizedhistograms by using following equation

119870(ℎ ℎ1015840) = sum

119894

radicℎ (119894) ℎ1015840(119894) (11)

where ℎ and ℎ1015840 are the normalized histogramsThe SVM Hellinger kernel is selected because of its low

computational cost and instead of computing the kernelvalues it explicitly computes the featuremap and the classifierremains linear The one-versus-one rule is applied for 119896number of classes 119896 sdot (119896 minus 1)2 classifiers are constructed andeach classifier trains the data by using two classes

4 Experiments and Results

This section is about the details of experiments conducted forthe evaluation of the proposed image representation basedon a combination of local and global histograms of visualwords The proposed research is evaluated on Corel-A imagebenchmark Caltech-256 and Ground Truth image datasetsThe images are randomly divided into training and testdatasets The visual vocabulary (codebook) is constructedfrom the training images and retrieval precision is calculatedby using the test dataset Keeping in view the unsupervisednature of clustering by applying k-means each experimentis repeated 10 times and average values of precision arereported In order to evaluate the performance of proposedresearch we determined the relevant images retrieved inresponse to a query image A computer simulation is used toselect images randomly from test dataset and use them as aquery imageThe response to the query image is evaluated onthe basis of relevant images retrieved Precision determinesthe number of relevant images retrieved in response to aquery image and it shows the specificity of the image retrievalsystem

Precision =Number of relevant images retrievedTotal number of images retrieved

(12)

Recall measures the sensitivity of the image retrievalsystem Recall is calculated by the ratio of correct images

retrieved to the total number of images of that class in theimage benchmark

Recall =Number of relevant images retrievedTotal number of relevant images

(13)

The details about experimental parameters are givenbelow

(1) Vocabulary Size The size of visual vocabulary is amajor parameter that affects the performance of content-based image matching [36 37] increasing the size of visualvocabulary at certain level and increases the performanceand larger size visual vocabulary tends to overfit Differentsizes of visual vocabulary are constructed for the Corel-A (2050 100 200 and 400) Caltech-256 (20 50 100 200 400and 600) and Ground Truth (10 20 30 40 and 50) imagebenchmarks in order to evaluate the best performance of theproposed image representation

(2) Dense Pixel Stride The dense SIFT features are extractedfrom the training and test images For a precise content-based image matching we extracted dense features usingthree different scales The dense pixel stride or the step sizeis used to control the spatial resolution of the dense grid Asmaller pixel stride results in a finer grid while a larger pixelstride makes the grid coarser With a pixel stride of 5 15 and25 we considered every 5th 15th and 25th pixel respectivelyas a features to calculate the SIFT descriptor from local andglobal regions of each image

(3) Feature Percentage for the Vocabulary Construction Thefeature percentage to construct a visual vocabulary froma training dataset is one of the parameters that affect theperformance of image retrieval According to [36] increasingthe percentage of features for visual vocabulary constructionincreases the performance of content-based image matchingand vice versa In experiments different percentages (1025 50 75 and 100) of dense features per image areused to construct visual vocabulary

(4) Value of 119909 and y x and 119910 are used for the extraction oflocal rectangular area After a number of experiments theoptimized normalized starting value of 119909 = 022 and endingvalue of 119910 = 060 are selected for training and test images

41 Retrieval Performance on Corel-A Image Benchmark TheCorel-A image benchmark (httpwangistpsuedudocsrelated) is selected for the evaluation of the proposedimage representation and results are compared with thestandard BoVW representation [6] and the state-of-the-artCBIR research [4 15 17ndash19] The Corel-A image benchmarkcontains 1000 images that are divided into ten semantic cat-egories namely Africa Buildings Beach Dinosaurs BusesElephants Horses Flowers Mountains and Food Eachsemantic category consists of 100 images with a resolution of256 times 384 pixels or 384 times 256 pixels Figure 4 is representingthe images from all the semantic classes of Corel-A imagebenchmark

6 Mathematical Problems in Engineering

Figure 4 Samples of images from each category of Corel-A image benchmark

Table 1 MAP of the proposed image representation on a vocabulary size of 200 visual words and pixel stride of 5

Vocabulary size ampfeatures used 20 50 100 200 400

10 7740 8027 8198 8401 825725 7774 8092 8202 8412 831650 7785 8106 8253 8425 830175 7808 8116 8307 8433 8324100 7836 8132 8327 8438 8329MAP 7788 8095 8257 8421 8305Standard deviation 03609 04050 05899 01522 02905Confidence interval 7743ndash7833 8044ndash8144 8184ndash8330 8402ndash8440 8269ndash8341Standard error 01614 01811 02638 00680 01299

In order to maintain a balance 50 images are used forthe training and the remaining 50 images are used for thetesting Different sizes of visual vocabulary (20 50 100 200and 400) are constructed from a set of training images MeanAverage Precision (MAP) is calculated by a random selectionof 500 images from the test dataset MAP for top-20 imageretrievals as a function of vocabulary size and percentage ofdense features per image used in vocabulary construction arepresented in Table 1

The additional statistical investigation in Table 1 rein-forces our described results We have calculated standarderror as description and estimated 95 confidence intervalas inferential results From these results we can easilyconclude that both the lower and the upper bounds at 5level of significance exceed 84 of MAP and the smallervalue of standard error further enhances the fact that thevisual vocabulary of size 200 visual words shows preciseand consistent result as compared to other sizes of visual

vocabulary on different features percentages (10 25 5075 and 100)

The MAP obtained by using proposed image representa-tion based on a combination of local and global histogramsof visual words (by using a vocabulary size of 200 visualwords) on the pixel strides of 5 15 and 25 is 84218212 and 7829 respectively The MAP of the proposedimage representation is compared with the standard BoVWrepresentation (without considering the spatial informationof local histogram) The MAP comparison of the proposedresearch and standard BoVWas a function of vocabulary sizeon the pixel strides of 5 15 and 25 is graphically representedin Figure 5 The proposed image representation outperformsthe standard BoVW representation on all vocabulary sizes(pixel strides of 5 15 and 25) According to the experimentalresults the MAP for all vocabulary sizes on a pixel stride of 5is greater than pixel strides of 15 and 25

Mathematical Problems in Engineering 7

Table 2 Comparison of MAP for top-20 image retrievals on Corel-A image benchmark

Class amp method Proposed research Youssef [17] Irtaza et al [4] Poursistani et al [18] Tian et al [19] Wang et al [15]Africa 7303 635 65 7024 746 64Beach 7458 642 60 4444 378 54Buildings 8024 698 62 708 539 53Buses 9584 915 85 763 967 94Dinosaurs 9795 992 93 100 99 98Elephants 8764 781 65 638 66 78Flowers 8513 948 94 924 92 71Horses 8629 952 77 947 87 93Mountains 8243 738 73 562 585 42Food 7896 806 81 745 622 50

Table 3 Comparison of recall for top-20 image retrievals on Corel-A image benchmark

Class amp method Proposed research Youssef [17] Irtaza et al [4] Poursistani et al [18] Tian et al [19] Wang et al [15]Africa 1461 1270 1300 1405 1492 1280Beach 1492 1284 1200 889 756 1080Buildings 1605 1396 1240 1416 1078 1060Buses 1917 1830 1700 1526 1934 1880Dinosaurs 1959 1984 1860 2000 1980 1960Elephants 1753 1562 1300 1276 1320 1560Flowers 1703 1896 1880 1848 1840 1420Horses 1726 1904 1540 1894 1740 1860Mountains 1649 1476 1460 1124 1170 840Food 1579 1612 1620 1490 1244 1000Mean 1684 1621 1510 1457 1455 1394

100 200 300 4000Size of vocabulary

75

80

85

MA

P

Proposed research with a pixel stride of 5BoVW with a pixel stride of 5Proposed research with a pixel stride of 15BoVW with a pixel stride of 15Proposed work with a pixel stride of 25BoVW with a pixel stride of 25

Figure 5 MAP as a function of vocabulary size

In order to present a sustainable performance of theproposed research the image retrieval precision and recallfor the top-20 image retrievals are compared with the state-of-the-art research of CBIR [4 15 17ndash19] Tables 2 and 3 arepresenting the classwise comparison of average precision andrecall of the proposed research (with a vocabulary size of 200words on a pixel stride of 5 and by using 100 features perimage) with the existing state-of-the-art techniques of CBIRThe MAP comparison is shown in Figure 6 while precision-recall curve is shown in Figure 7

The experimental results conducted on Corel-A imagebenchmark prove the robustness of the proposed imagerepresentation The MAP of the proposed research is higherthan the existing state-of-the-art research [4 15 17ndash19] TheMAP obtained from the proposed image representation is8421 (with a vocabulary size of 200 visual words on a pixelstride of 5 and by using 100 features per image)

The image retrieval results obtained by using proposedimage representation for the semantic class ldquoBusesrdquo andldquoBeachrdquo are shown in Figures 8 and 9 respectively in res-ponse to the query images that shows reduction of semanticgap in terms of classifier decision value (score) The classifierdecision label determines the class of the image whileclassifier decision value (score) is used to find the similar

8 Mathematical Problems in Engineering

Proposed researchYoussef et alIrtaza et al

Poursistani et al

Wang et alTian et al

84218107

7550 7434 72776970

60

65

70

75

80

85

90

Mea

n av

erag

e pre

cisio

n (M

AP)

()

Figure 6 Comparison of MAP with the state-of-the-art methodson Corel-A image benchmark

Proposed research with pixel stride of 5SIFT-LBP [32]

20 40 60 80 1000Recall ()

40

60

80

100

Prec

ision

()

Figure 7 Precision-recall curve for Corel-A image benchmark

images The real values shown at the top of each image arethe classifier decision value (score) of the respective imagecomputed by applying the Euclidean distance between scoreof the query image and scores of the retrieved images Top-20retrieved images whose score is close to the score of the queryimage aremore similar to the query image and vice versa (dueto limited space the retrieval results in response to the queryimages are shown for two semantic classes only)

42 Performance on Caltech-256 Image Benchmark Caltech-256 image benchmark (httpwwwvisioncaltechedu) con-sists of 256 semantic classes and each semantic class contains80 to 827 images For the simplicity we randomly selected10 classes from the Caltech-256 image benchmark Theselected classes are Motorbike Faceeasy Fireworks BonsaiButterfly Leopard Airplanes Ketch and Hibiscus as shownin Figure 10

Different sizes of visual vocabulary (20 50 100 200 400and 600) are constructed from a set of training images 50

190792

192121

181437

207091

168814

1897

184608

20671

212677

191177

196715

204835

212566

18671

177793

170756

214148

188865

18053

173708

168421

Figure 8 Image retrieval result shows reduction of semantic gap forthe semantic class ldquoBusesrdquo

209122

206148

200182

194074

182063

211198

202284

194183

186

210937

203655

221345

190409

214339

220837

19114

181279

205414

197978

225197

181495

Figure 9 Image retrieval result shows reduction of semantic gap forthe semantic class ldquoBeachrdquo

images are used for the training and the remaining 50images are used for the testing The MAP of the proposedresearch is compared with standard BoVW representation(without considering the spatial information) on differentpixel strides (5 15 and 25) vocabulary sizes (20 50 100200 400 and 600) and feature percentages (10 2550 75 and 100) for vocabulary construction The MAPobtained from the proposed image representation and thestandard BoVW representation by using different parametersis graphically presented in Figure 11

Mathematical Problems in Engineering 9

Figure 10 Samples of images from 10 semantic classes of Caltech-256 image benchmark

100 200 300 400 500 6000Size of vocabulary

75

80

85

MA

P

Proposed research with pixel stride of 5BoVW with pixel stride of 5Proposed research with pixel stride of 15BoVW with pixel stride of 15Proposed research with pixel stride of 25BoVW with pixel stride of 25

Figure 11 MAP as a function of vocabulary size for the Caltech-256image benchmark

The experimental results conducted on 10 semanticclasses of Caltech-256 image benchmark prove the robustnessof the proposed image representation The MAP obtained byusing the proposed image representation (with a vocabularysize of 400 visualwords andby using 100 features per image)on the pixel strides of 5 15 and 25 is 8650 8398 and8171 respectively while MAP obtained by using the sameexperimental parameters for standard BoVW representationis 8545 8305 and 8095 respectively According tothe experimental results the MAP for all vocabulary sizesdecreases by increasing the pixel stride as shown in Figure 11while precision-recall curve is shown in Figure 12

Proposed research with pixel stride of 5BoVW with pixel stride of 5

50

60

70

80

90Pr

ecisi

on (

)

20 40 60 80 1000Recall ()

Figure 12 Precision-recall curve for Caltech-256 image benchmark

43 Performance on Ground Truth Image BenchmarkGround Truth image benchmark (httpimagedatabasecswashingtonedugroundtruth) is a publicly available imagebenchmark and is used for the evaluation of CBIR research[20 21 33] There are a total of 1109 images that are dividedinto 22 semantic classes We manually selected all the imagesfrom 5 different semantic classes of Ground Truth imagebenchmark The selected classes for the evaluation of theproposed image representation are Abro green CherriesFootball Green Lake and Swiss Mountains as shown inFigure 13 The 5 classes are selected in order to comparethe performance of the proposed image representation withexisting state-of-the-art research [20 21 33] because theseresearchers also reported their results on the same numberof classes of Ground Truth image benchmark

50 images are used for the training and the remaining50 images are used for the testing Different sizes (10 20

10 Mathematical Problems in Engineering

Figure 13 Samples of images from 5 semantic classes of Ground Truth image benchmark

8284 8133

62805909

Proposed researchCardoso et al

Yildizer et alYildizer et al

45

50

55

60

65

70

75

80

85

MA

P (

)

Figure 14 Comparison of MAP with the state-of-the-art methodson Ground Truth image benchmark

Table 4 Classwise comparison of average precision

Class amp method Proposed method [20] [21]Abro green 7525 80 6667Cherries 7194 80 50Football 9018 100 75Green Lake 8423 80 50Swiss Mountains 9263 6067 50

30, 40, and 50) of visual vocabulary are constructed and the MAP on the vocabulary of these different sizes is calculated. According to the experimental results, a MAP of 82.84% is obtained by using the proposed image representation (with a vocabulary size of 40 visual words, on a pixel stride of 5, and by using 100% of the features per image). The classwise average precision obtained from the proposed image representation is presented in Table 4, while the MAP is graphically presented in Figure 14.

Table 5: Computational cost (in seconds) of the proposed algorithm.

Number of images retrieved    Proposed research
Top-5                         0.2872
Top-10                        0.3972
Top-15                        0.6470
Top-20                        0.7837
Top-25                        0.9184
Top-30                        1.1103

Experimental results and comparisons conducted on the Ground Truth image benchmark demonstrate the robustness of the proposed research based on a combination of local and global histograms of visual words. The MAP obtained from the proposed image representation is higher than that of the existing state-of-the-art research [20, 21, 33].
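As a reading aid, the sketch below illustrates one way to form the combined representation evaluated here: a global histogram of visual words computed over all descriptors of an image, concatenated with a local histogram computed only over the descriptors falling inside a central rectangular region. The 50% central-region size, the L1 normalization, and the function names are illustrative assumptions and are not taken from the paper.

```python
# Hedged sketch of a combined local + global histogram of visual words.
# descriptors: (N, D) array of local descriptors; positions: (N, 2) array of
# (y, x) patch centers; kmeans: a fitted sklearn KMeans vocabulary.
import numpy as np

def combined_histogram(descriptors, positions, image_shape, kmeans, central_frac=0.5):
    vocab_size = kmeans.n_clusters
    words = kmeans.predict(descriptors)                         # quantize descriptors to visual words
    global_hist = np.bincount(words, minlength=vocab_size).astype(np.float64)

    h, w = image_shape
    y0, y1 = h * (1 - central_frac) / 2, h * (1 + central_frac) / 2
    x0, x1 = w * (1 - central_frac) / 2, w * (1 + central_frac) / 2
    inside = ((positions[:, 0] >= y0) & (positions[:, 0] < y1) &
              (positions[:, 1] >= x0) & (positions[:, 1] < x1))
    local_hist = np.bincount(words[inside], minlength=vocab_size).astype(np.float64)

    # L1-normalize each histogram before concatenation (normalization choice assumed).
    global_hist /= max(global_hist.sum(), 1.0)
    local_hist /= max(local_hist.sum(), 1.0)
    return np.concatenate([global_hist, local_hist])
```

The concatenated vector can then be indexed or passed to a classifier in place of the plain global BoVW histogram.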

4.4. Complexity Performance. The computational cost of the proposed algorithm is calculated using an Intel(R) Core i3 (fourth generation) 1.7 GHz CPU with 4 GB RAM (DDR3), 3 MB L3 cache, and the Windows 7 operating system. The proposed algorithm is implemented in MATLAB; the visual vocabulary is constructed offline using a training dataset and tested at run time using a test dataset. The average CPU time (in seconds) required from feature extraction to image retrieval is presented in Table 5.
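A simple way to reproduce this kind of measurement is sketched below in Python (the authors used MATLAB). It times each query from feature extraction through ranking and reports the mean. `represent` is an assumed callable that maps an image to its feature histogram, and the Euclidean ranking over histograms is a simplified stand-in rather than the paper's exact retrieval step.

```python
# Illustrative timing harness, in the spirit of Table 5: average wall-clock time
# per query, covering representation (feature extraction + quantization) and
# ranking against a database of precomputed histograms.
import time
import numpy as np

def average_retrieval_time(query_images, db_histograms, represent, top_k=20):
    """represent(img) -> feature vector; db_histograms is an (N, D) array."""
    times = []
    for img in query_images:
        start = time.perf_counter()
        q_hist = represent(img)                                 # feature extraction + quantization
        dists = np.linalg.norm(db_histograms - q_hist, axis=1)  # distance to every database image
        np.argsort(dists)[:top_k]                               # rank and take the top-k (result discarded here)
        times.append(time.perf_counter() - start)
    return float(np.mean(times))
```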

5. Conclusion and Future Directions

In this paper, we proposed a novel image representation based on a combination of local and global histograms of visual words. The local histogram of visual words adds the spatial information of the central area of the image to the inverted index of the BoVW representation. The combination of local and global histograms of visual words is a possible solution to capture the semantic information of the image. The performance of the proposed image representation is evaluated on three challenging image datasets. The proposed image representation outperforms the state-of-the-art research, including the standard BoVW representation. For future work, we plan to replace the BoVW model with either the Fisher kernel framework or the Vector of Locally Aggregated Descriptors (VLAD) model to evaluate large-scale image retrieval.

Competing Interests

The authors declare that they have no competing interests.

References

[1] A.-M. Tousch, S. Herbin, and J.-Y. Audibert, "Semantic hierarchies for image annotation: a survey," Pattern Recognition, vol. 45, no. 1, pp. 333–345, 2012.

[2] R. Datta, D. Joshi, J. Li, and J. Z. Wang, "Image retrieval: ideas, influences, and trends of the new age," ACM Computing Surveys, vol. 40, no. 2, article 5, 2008.

[3] Y. Liu, D. Zhang, G. Lu, and W.-Y. Ma, "A survey of content-based image retrieval with high-level semantics," Pattern Recognition, vol. 40, no. 1, pp. 262–282, 2007.

[4] A. Irtaza, M. A. Jaffar, E. Aleisa, and T.-S. Choi, "Embedding neural networks for semantic association in content based image retrieval," Multimedia Tools and Applications, vol. 72, no. 2, pp. 1911–1931, 2014.

[5] D. Zhang, M. M. Islam, and G. Lu, "A review on automatic image annotation techniques," Pattern Recognition, vol. 45, no. 1, pp. 346–362, 2012.

[6] J. Sivic and A. Zisserman, "Video Google: a text retrieval approach to object matching in videos," in Proceedings of the 9th IEEE International Conference on Computer Vision, pp. 1470–1477, IEEE, 2003.

[7] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, "Object retrieval with large vocabularies and fast spatial matching," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '07), pp. 1–8, IEEE, Minneapolis, Minn, USA, June 2007.

[8] O. Chum, J. Philbin, J. Sivic, M. Isard, and A. Zisserman, "Total recall: automatic query expansion with a generative feature model for object retrieval," in Proceedings of the 2007 IEEE 11th International Conference on Computer Vision (ICCV '07), pp. 1–8, IEEE, Rio de Janeiro, Brazil, October 2007.

[9] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, "Lost in quantization: improving particular object retrieval in large scale image databases," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1–8, IEEE, Anchorage, Alaska, USA, June 2008.

[10] H. Anwar, S. Zambanini, M. Kampel, and K. Vondrovec, "Ancient coin classification using reverse motif recognition: image-based classification of Roman Republican coins," IEEE Signal Processing Magazine, vol. 32, no. 4, pp. 64–74, 2015.

[11] S. Zhang, Q. Tian, G. Hua, Q. Huang, and W. Gao, "Generating descriptive visual words and visual phrases for large-scale image applications," IEEE Transactions on Image Processing, vol. 20, no. 9, pp. 2664–2677, 2011.

[12] Y. Su and F. Jurie, "Improving image classification using semantic attributes," International Journal of Computer Vision, vol. 100, no. 1, pp. 59–77, 2012.

[13] Z. Mehmood, S. M. Anwar, M. Altaf, and N. Ali, "A novel image retrieval based on rectangular spatial histograms of visual words," Kuwait Journal of Science, in press.

[14] S. Lazebnik, C. Schmid, and J. Ponce, "Beyond bags of features: spatial pyramid matching for recognizing natural scene categories," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '06), vol. 2, pp. 2169–2178, New York, NY, USA, June 2006.

[15] C. Wang, B. Zhang, Z. Qin, and J. Xiong, "Spatial weighting for bag-of-features based image retrieval," in Integrated Uncertainty in Knowledge Modelling and Decision Making, Z. Qin and V.-N. Huynh, Eds., vol. 8032 of Lecture Notes in Computer Science, pp. 91–100, Springer, Berlin, Germany, 2013.

[16] N. Ali, K. B. Bajwa, R. Sablatnig, and Z. Mehmood, "Image retrieval by addition of spatial information based on histograms of triangular regions," Computers & Electrical Engineering, 2016.

[17] S. M. Youssef, "ICTEDCT-CBIR: integrating curvelet transform with enhanced dominant colors extraction and texture analysis for efficient content-based image retrieval," Computers and Electrical Engineering, vol. 38, no. 5, pp. 1358–1376, 2012.

[18] P. Poursistani, H. Nezamabadi-pour, R. Askari Moghadam, and M. Saeed, "Image indexing and retrieval in JPEG compressed domain based on vector quantization," Mathematical and Computer Modelling, vol. 57, no. 5-6, pp. 1005–1017, 2013.

[19] X. Tian, L. Jiao, X. Liu, and X. Zhang, "Feature integration of EODH and Color-SIFT: application to image retrieval based on codebook," Signal Processing: Image Communication, vol. 29, no. 4, pp. 530–545, 2014.

[20] D. N. M. Cardoso, J. Muller, F. Alexandre, L. A. P. Neves, P. M. G. Trevisani, and G. A. Giraldi, "Iterative technique for content-based image retrieval using multiple SVM ensembles," in A Treatise on Electricity and Magnetism, J. Clerk Maxwell, Ed., vol. 2, pp. 68–73, Cambridge University Press, 1873.

[21] E. Yildizer, A. M. Balci, M. Hassan, and R. Alhajj, "Efficient content-based image retrieval using multiple support vector machines ensemble," Expert Systems with Applications, vol. 39, no. 3, pp. 2385–2396, 2012.

[22] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '05), vol. 1, pp. 886–893, IEEE, San Diego, Calif, USA, June 2005.

[23] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, vol. 60, no. 2, pp. 91–110, 2004.

[24] J. Matas, O. Chum, M. Urban, and T. Pajdla, "Robust wide-baseline stereo from maximally stable extremal regions," Image and Vision Computing, vol. 22, no. 10, pp. 761–767, 2004.

[25] H. Bay, T. Tuytelaars, and L. Van Gool, "SURF: speeded up robust features," in Computer Vision–ECCV 2006, pp. 404–417, Springer, 2006.

[26] D. J. Mankowitz and S. Ramamoorthy, "BRISK-based visual feature extraction for resource constrained robots," in RoboCup 2013: Robot World Cup XVII, S. Behnke, M. Veloso, A. Visser, and R. Xiong, Eds., vol. 8371 of Lecture Notes in Computer Science, pp. 195–206, Springer, Berlin, Germany, 2014.

[27] S. Krig, "Interest point detector and feature descriptor survey," in Computer Vision Metrics, pp. 217–282, Springer, Berlin, Germany, 2014.

[28] R. Ashraf, K. Bashir, A. Irtaza, and M. T. Mahmood, "Content based image retrieval using embedded neural networks with bandletized regions," Entropy, vol. 17, no. 6, pp. 3552–3580, 2015.


[29] S. Zeng, R. Huang, H. Wang, and Z. Kang, "Image retrieval using spatiograms of colors quantized by Gaussian mixture models," Neurocomputing, vol. 171, pp. 673–684, 2016.

[30] E. Walia and A. Pal, "Fusion framework for effective color image retrieval," Journal of Visual Communication and Image Representation, vol. 25, no. 6, pp. 1335–1348, 2014.

[31] A. Irtaza and M. A. Jaffar, "Categorical image retrieval through genetically optimized support vector machines (GOSVM) and hybrid texture features," Signal, Image and Video Processing, vol. 9, no. 7, pp. 1503–1519, 2014.

[32] J. Yu, Z. Qin, T. Wan, and X. Zhang, "Feature integration analysis of bag-of-features model for image retrieval," Neurocomputing, vol. 120, pp. 355–364, 2013.

[33] E. Yildizer, A. M. Balci, T. N. Jarada, and R. Alhajj, "Integrating wavelets with clustering and indexing for effective content-based image retrieval," Knowledge-Based Systems, vol. 31, pp. 55–66, 2012.

[34] J. Shawe-Taylor and N. Cristianini, Kernel Methods for Pattern Analysis, Cambridge University Press, New York, NY, USA, 2004.

[35] A. Vedaldi and A. Zisserman, "Sparse kernel approximations for efficient classification and detection," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '12), pp. 2320–2327, June 2012.

[36] E. Nowak, F. Jurie, and B. Triggs, "Sampling strategies for bag-of-features image classification," in Computer Vision–ECCV 2006, pp. 490–503, Springer, 2006.

[37] G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray, "Visual categorization with bags of keypoints," in Proceedings of the Workshop on Statistical Learning in Computer Vision (ECCV '04), vol. 1, pp. 1–2, Prague, Czech Republic, 2004.


Page 6: Research Article A Novel Image Retrieval Based on a Combination …downloads.hindawi.com/journals/mpe/2016/8217250.pdf · 2019-07-30 · image retrieval by using a combination of

6 Mathematical Problems in Engineering

Figure 4 Samples of images from each category of Corel-A image benchmark

Table 1 MAP of the proposed image representation on a vocabulary size of 200 visual words and pixel stride of 5

Vocabulary size ampfeatures used 20 50 100 200 400

10 7740 8027 8198 8401 825725 7774 8092 8202 8412 831650 7785 8106 8253 8425 830175 7808 8116 8307 8433 8324100 7836 8132 8327 8438 8329MAP 7788 8095 8257 8421 8305Standard deviation 03609 04050 05899 01522 02905Confidence interval 7743ndash7833 8044ndash8144 8184ndash8330 8402ndash8440 8269ndash8341Standard error 01614 01811 02638 00680 01299

In order to maintain a balance 50 images are used forthe training and the remaining 50 images are used for thetesting Different sizes of visual vocabulary (20 50 100 200and 400) are constructed from a set of training images MeanAverage Precision (MAP) is calculated by a random selectionof 500 images from the test dataset MAP for top-20 imageretrievals as a function of vocabulary size and percentage ofdense features per image used in vocabulary construction arepresented in Table 1

The additional statistical investigation in Table 1 rein-forces our described results We have calculated standarderror as description and estimated 95 confidence intervalas inferential results From these results we can easilyconclude that both the lower and the upper bounds at 5level of significance exceed 84 of MAP and the smallervalue of standard error further enhances the fact that thevisual vocabulary of size 200 visual words shows preciseand consistent result as compared to other sizes of visual

vocabulary on different features percentages (10 25 5075 and 100)

The MAP obtained by using proposed image representa-tion based on a combination of local and global histogramsof visual words (by using a vocabulary size of 200 visualwords) on the pixel strides of 5 15 and 25 is 84218212 and 7829 respectively The MAP of the proposedimage representation is compared with the standard BoVWrepresentation (without considering the spatial informationof local histogram) The MAP comparison of the proposedresearch and standard BoVWas a function of vocabulary sizeon the pixel strides of 5 15 and 25 is graphically representedin Figure 5 The proposed image representation outperformsthe standard BoVW representation on all vocabulary sizes(pixel strides of 5 15 and 25) According to the experimentalresults the MAP for all vocabulary sizes on a pixel stride of 5is greater than pixel strides of 15 and 25

Mathematical Problems in Engineering 7

Table 2 Comparison of MAP for top-20 image retrievals on Corel-A image benchmark

Class amp method Proposed research Youssef [17] Irtaza et al [4] Poursistani et al [18] Tian et al [19] Wang et al [15]Africa 7303 635 65 7024 746 64Beach 7458 642 60 4444 378 54Buildings 8024 698 62 708 539 53Buses 9584 915 85 763 967 94Dinosaurs 9795 992 93 100 99 98Elephants 8764 781 65 638 66 78Flowers 8513 948 94 924 92 71Horses 8629 952 77 947 87 93Mountains 8243 738 73 562 585 42Food 7896 806 81 745 622 50

Table 3 Comparison of recall for top-20 image retrievals on Corel-A image benchmark

Class amp method Proposed research Youssef [17] Irtaza et al [4] Poursistani et al [18] Tian et al [19] Wang et al [15]Africa 1461 1270 1300 1405 1492 1280Beach 1492 1284 1200 889 756 1080Buildings 1605 1396 1240 1416 1078 1060Buses 1917 1830 1700 1526 1934 1880Dinosaurs 1959 1984 1860 2000 1980 1960Elephants 1753 1562 1300 1276 1320 1560Flowers 1703 1896 1880 1848 1840 1420Horses 1726 1904 1540 1894 1740 1860Mountains 1649 1476 1460 1124 1170 840Food 1579 1612 1620 1490 1244 1000Mean 1684 1621 1510 1457 1455 1394

100 200 300 4000Size of vocabulary

75

80

85

MA

P

Proposed research with a pixel stride of 5BoVW with a pixel stride of 5Proposed research with a pixel stride of 15BoVW with a pixel stride of 15Proposed work with a pixel stride of 25BoVW with a pixel stride of 25

Figure 5 MAP as a function of vocabulary size

In order to present a sustainable performance of theproposed research the image retrieval precision and recallfor the top-20 image retrievals are compared with the state-of-the-art research of CBIR [4 15 17ndash19] Tables 2 and 3 arepresenting the classwise comparison of average precision andrecall of the proposed research (with a vocabulary size of 200words on a pixel stride of 5 and by using 100 features perimage) with the existing state-of-the-art techniques of CBIRThe MAP comparison is shown in Figure 6 while precision-recall curve is shown in Figure 7

The experimental results conducted on Corel-A imagebenchmark prove the robustness of the proposed imagerepresentation The MAP of the proposed research is higherthan the existing state-of-the-art research [4 15 17ndash19] TheMAP obtained from the proposed image representation is8421 (with a vocabulary size of 200 visual words on a pixelstride of 5 and by using 100 features per image)

The image retrieval results obtained by using proposedimage representation for the semantic class ldquoBusesrdquo andldquoBeachrdquo are shown in Figures 8 and 9 respectively in res-ponse to the query images that shows reduction of semanticgap in terms of classifier decision value (score) The classifierdecision label determines the class of the image whileclassifier decision value (score) is used to find the similar

8 Mathematical Problems in Engineering

Proposed researchYoussef et alIrtaza et al

Poursistani et al

Wang et alTian et al

84218107

7550 7434 72776970

60

65

70

75

80

85

90

Mea

n av

erag

e pre

cisio

n (M

AP)

()

Figure 6 Comparison of MAP with the state-of-the-art methodson Corel-A image benchmark

Proposed research with pixel stride of 5SIFT-LBP [32]

20 40 60 80 1000Recall ()

40

60

80

100

Prec

ision

()

Figure 7 Precision-recall curve for Corel-A image benchmark

images The real values shown at the top of each image arethe classifier decision value (score) of the respective imagecomputed by applying the Euclidean distance between scoreof the query image and scores of the retrieved images Top-20retrieved images whose score is close to the score of the queryimage aremore similar to the query image and vice versa (dueto limited space the retrieval results in response to the queryimages are shown for two semantic classes only)

42 Performance on Caltech-256 Image Benchmark Caltech-256 image benchmark (httpwwwvisioncaltechedu) con-sists of 256 semantic classes and each semantic class contains80 to 827 images For the simplicity we randomly selected10 classes from the Caltech-256 image benchmark Theselected classes are Motorbike Faceeasy Fireworks BonsaiButterfly Leopard Airplanes Ketch and Hibiscus as shownin Figure 10

Different sizes of visual vocabulary (20 50 100 200 400and 600) are constructed from a set of training images 50

190792

192121

181437

207091

168814

1897

184608

20671

212677

191177

196715

204835

212566

18671

177793

170756

214148

188865

18053

173708

168421

Figure 8 Image retrieval result shows reduction of semantic gap forthe semantic class ldquoBusesrdquo

209122

206148

200182

194074

182063

211198

202284

194183

186

210937

203655

221345

190409

214339

220837

19114

181279

205414

197978

225197

181495

Figure 9 Image retrieval result shows reduction of semantic gap forthe semantic class ldquoBeachrdquo

images are used for the training and the remaining 50images are used for the testing The MAP of the proposedresearch is compared with standard BoVW representation(without considering the spatial information) on differentpixel strides (5 15 and 25) vocabulary sizes (20 50 100200 400 and 600) and feature percentages (10 2550 75 and 100) for vocabulary construction The MAPobtained from the proposed image representation and thestandard BoVW representation by using different parametersis graphically presented in Figure 11

Mathematical Problems in Engineering 9

Figure 10 Samples of images from 10 semantic classes of Caltech-256 image benchmark

100 200 300 400 500 6000Size of vocabulary

75

80

85

MA

P

Proposed research with pixel stride of 5BoVW with pixel stride of 5Proposed research with pixel stride of 15BoVW with pixel stride of 15Proposed research with pixel stride of 25BoVW with pixel stride of 25

Figure 11 MAP as a function of vocabulary size for the Caltech-256image benchmark

The experimental results conducted on 10 semanticclasses of Caltech-256 image benchmark prove the robustnessof the proposed image representation The MAP obtained byusing the proposed image representation (with a vocabularysize of 400 visualwords andby using 100 features per image)on the pixel strides of 5 15 and 25 is 8650 8398 and8171 respectively while MAP obtained by using the sameexperimental parameters for standard BoVW representationis 8545 8305 and 8095 respectively According tothe experimental results the MAP for all vocabulary sizesdecreases by increasing the pixel stride as shown in Figure 11while precision-recall curve is shown in Figure 12

Proposed research with pixel stride of 5BoVW with pixel stride of 5

50

60

70

80

90Pr

ecisi

on (

)

20 40 60 80 1000Recall ()

Figure 12 Precision-recall curve for Caltech-256 image benchmark

43 Performance on Ground Truth Image BenchmarkGround Truth image benchmark (httpimagedatabasecswashingtonedugroundtruth) is a publicly available imagebenchmark and is used for the evaluation of CBIR research[20 21 33] There are a total of 1109 images that are dividedinto 22 semantic classes We manually selected all the imagesfrom 5 different semantic classes of Ground Truth imagebenchmark The selected classes for the evaluation of theproposed image representation are Abro green CherriesFootball Green Lake and Swiss Mountains as shown inFigure 13 The 5 classes are selected in order to comparethe performance of the proposed image representation withexisting state-of-the-art research [20 21 33] because theseresearchers also reported their results on the same numberof classes of Ground Truth image benchmark

50 images are used for the training and the remaining50 images are used for the testing Different sizes (10 20

10 Mathematical Problems in Engineering

Figure 13 Samples of images from 5 semantic classes of Ground Truth image benchmark

8284 8133

62805909

Proposed researchCardoso et al

Yildizer et alYildizer et al

45

50

55

60

65

70

75

80

85

MA

P (

)

Figure 14 Comparison of MAP with the state-of-the-art methodson Ground Truth image benchmark

Table 4 Classwise comparison of average precision

Class amp method Proposed method [20] [21]Abro green 7525 80 6667Cherries 7194 80 50Football 9018 100 75Green Lake 8423 80 50Swiss Mountains 9263 6067 50

30 40 and 50) of visual vocabulary are constructed andMAP on the vocabulary of these different sizes is calculatedAccording to the experimental results the MAP of 8284 isobtained by using the proposed image representation (with avocabulary size of 40 visual words on pixel stride of 5 andby using 100 features per image) The classwise averageprecision obtained from the proposed image representationis presented in Table 4 while MAP is graphically presentedin Figure 14

Table 5 Computational cost (in seconds) of the proposed algo-rithm

Number of images retrieved Proposed researchTop-5 02872Top-10 03972Top-15 06470Top-20 07837Top-25 09184Top-30 11103

Experimental results and comparisons conducted on theGround Truth image benchmark prove the robustness of theproposed research based on a combination of local and globalhistograms of visual words The MAP obtained from theproposed image representation is higher than the existingstate-of-the-art research [20 21 33]

44 Complexity Performance The computational cost of theproposed algorithm is calculated using Intel(R) Core i3(fourth generation) 17 Ghz CPU with 4GB RAM (DDR3)3MB L3 cache and Windows 7 operating system Theproposed algorithm is implemented in MATLAB and visualvocabulary is constructed offline using a training dataset andtested at run time using a test dataset The average CPUtime (in seconds) required from features extraction to imageretrieval is presented in Table 5

5 Conclusion and Future Directions

In this paper we proposed a novel image representationbased on a combination of local and global histograms ofvisual words The local histogram of visual words adds thespatial information of the central area to the inverted index ofBoVW representation The combination of local and globalhistograms of visual words is a possible solution to capturethe semantic information from the image The performance

Mathematical Problems in Engineering 11

of the proposed image representation is evaluated on threechallenging image datasets The proposed image represen-tation outperforms the state-of-the-art research includingthe standard BoVW representation For the future work weplan to replace the BoVW model with either Fisher kernelframework or the Vector of Locally Aggregated Descriptors(VLAD) model to evaluate a large-scale image retrieval

Competing Interests

The authors declare that they have no competing interests

References

[1] A-M Tousch S Herbin and J-Y Audibert ldquoSemantic hierar-chies for image annotation a surveyrdquo Pattern Recognition vol45 no 1 pp 333ndash345 2012

[2] R Datta D Joshi J Li and J Z Wang ldquoImage retrieval ideasinfluences and trends of the new agerdquoACMComputing Surveysvol 40 no 2 article 5 2008

[3] Y Liu D Zhang G Lu and W-Y Ma ldquoA survey of content-based image retrieval with high-level semanticsrdquo Pattern Recog-nition vol 40 no 1 pp 262ndash282 2007

[4] A Irtaza M A Jaffar E Aleisa and T-S Choi ldquoEmbeddingneural networks for semantic association in content basedimage retrievalrdquoMultimedia Tools and Applications vol 72 no2 pp 1911ndash1931 2014

[5] D ZhangMM Islam andG Lu ldquoA reviewon automatic imageannotation techniquesrdquo Pattern Recognition vol 45 no 1 pp346ndash362 2012

[6] J Sivic and A Zisserman ldquoVideo google a text retrievalapproach to object matching in videosrdquo in Proceedings of the 9thIEEE International Conference on Computer Vision pp 1470ndash1477 IEEE 2003

[7] J PhilbinO ChumM Isard J Sivic andA Zisserman ldquoObjectretrieval with large vocabularies and fast spatial matchingrdquoin Proceedings of the IEEE Computer Society Conference onComputer Vision and Pattern Recognition (CVPR rsquo07) pp 1ndash8IEEE Minneapolis Minn USA June 2007

[8] O Chum J Philbin J Sivic M Isard and A Zisserman ldquoTotalrecall automatic query expansion with a generative featuremodel for object retrievalrdquo in Proceedings of the 2007 IEEE 11thInternational Conference on Computer Vision (ICCV rsquo07) pp 1ndash8 IEEE Rio de Janeiro Brazil October 2007

[9] J PhilbinOChumM Isard J Sivic andA Zisserman ldquoLost inquantization improving particular object retrieval in large scaleimage databasesrdquo in Proceedings of the 26th IEEE Conference onComputer Vision and Pattern Recognition (CVPR rsquo08) pp 1ndash8IEEE Anchorage Alaska USA June 2008

[10] H Anwar S Zambanini M Kampel and K VondrovecldquoAncient coin classification using reverse motif recognitionimage-based classification of roman republican coinsrdquo IEEESignal Processing Magazine vol 32 no 4 pp 64ndash74 2015

[11] S Zhang Q Tian G Hua Q Huang and W Gao ldquoGeneratingdescriptive visual words and visual phrases for large-scale imageapplicationsrdquo IEEETransactions on Image Processing vol 20 no9 pp 2664ndash2677 2011

[12] Y Su and F Jurie ldquoImproving image classification using seman-tic attributesrdquo International Journal of Computer Vision vol 100no 1 pp 59ndash77 2012

[13] Z Mehmood S M Anwar M Altaf and N Ali ldquoA novelimage retrieval based on rectangular spatial histograms of visualwordsrdquo Kuwait Journal of Science In press

[14] S Lazebnik C Schmid and J Ponce ldquoBeyond bags of featuresspatial pyramid matching for recognizing natural scene cate-goriesrdquo in Proceedings of the IEEE Computer Society Conferenceon Computer Vision and Pattern Recognition (CVPR rsquo06) vol 2pp 2169ndash2178 New York NY USA June 2006

[15] C Wang B Zhang Z Qin and J Xiong ldquoSpatial weighting forbag-of-features based image retrievalrdquo in Integrated Uncertaintyin Knowledge Modelling and Decision Making Z Qin and V-NHuynh Eds vol 8032 of Lecture Notes in Computer Science pp91ndash100 Springer Berlin Germany 2013

[16] N Ali K B Bajwa R Sablatnig and Z Mehmood ldquoImageretrieval by addition of spatial information based on histogramsof triangular regionsrdquoComputers amp Electrical Engineering 2016

[17] SM Youssef ldquoICTEDCT-CBIR integrating curvelet transformwith enhanced dominant colors extraction and texture analysisfor efficient content-based image retrievalrdquo Computers andElectrical Engineering vol 38 no 5 pp 1358ndash1376 2012

[18] P Poursistani H Nezamabadi-pour R Askari Moghadam andM Saeed ldquoImage indexing and retrieval in JPEG compresseddomain based on vector quantizationrdquoMathematical and Com-puter Modelling vol 57 no 5-6 pp 1005ndash1017 2013

[19] X Tian L Jiao X Liu and X Zhang ldquoFeature integration ofEODH and Color-SIFT application to image retrieval based oncodebookrdquo Signal Processing Image Communication vol 29 no4 pp 530ndash545 2014

[20] D N M Cardoso J Muller F Alexandre L A P Neves P MG Trevisani andG A Giraldi ldquoIterative technique for content-based image retrieval using multiple SVM ensemblesrdquo in ATreatise on Electricity andMagnetism J ClerkMaxwell Ed vol2 pp 68ndash73 Cambridge University Press 1873

[21] E Yildizer A M Balci M Hassan and R Alhajj ldquoEfficientcontent-based image retrieval using multiple support vectormachines ensemblerdquo Expert Systems with Applications vol 39no 3 pp 2385ndash2396 2012

[22] N Dalal and B Triggs ldquoHistograms of oriented gradients forhuman detectionrdquo in Proceedings of the IEEE Computer SocietyConference on Computer Vision and Pattern Recognition (CVPRrsquo05) vol 1 pp 886ndash893 IEEE SanDiego Calif USA June 2005

[23] D G Lowe ldquoDistinctive image features from scale-invariantkeypointsrdquo International Journal of Computer Vision vol 60 no2 pp 91ndash110 2004

[24] J Matas O Chum M Urban and T Pajdla ldquoRobust wide-baseline stereo from maximally stable extremal regionsrdquo Imageand Vision Computing vol 22 no 10 pp 761ndash767 2004

[25] H Bay T Tuytelaars and L Van Gool ldquoSurf speeded uprobust featuresrdquo in Computer VisionmdashECCV 2006 pp 404ndash417Springer 2006

[26] D J Mankowitz and S Ramamoorthy ldquoBRISK-based visualfeature extraction for resource constrained robotsrdquo in RoboCup2013 Robot World Cup XVII S Behnke M Veloso A Visserand R Xiong Eds vol 8371 of Lecture Notes in ComputerScience pp 195ndash206 Springer Berlin Germany 2014

[27] S Krig ldquoInterest point detector and feature descriptor surveyrdquoin Computer Vision Metrics pp 217ndash282 Springer BerlinGermany 2014

[28] R Ashraf K Bashir A Irtaza and M T Mahmood ldquoContentbased image retrieval using embedded neural networks withbandletized regionsrdquo Entropy vol 17 no 6 pp 3552ndash3580 2015

12 Mathematical Problems in Engineering

[29] S Zeng R Huang H Wang and Z Kang ldquoImage retrievalusing spatiograms of colors quantized by Gaussian MixtureModelsrdquo Neurocomputing vol 171 pp 673ndash684 2016

[30] E Walia and A Pal ldquoFusion framework for effective colorimage retrievalrdquo Journal of Visual Communication and ImageRepresentation vol 25 no 6 pp 1335ndash1348 2014

[31] A Irtaza and M A Jaffar ldquoCategorical image retrieval throughgenetically optimized support vector machines (GOSVM) andhybrid texture featuresrdquo Signal Image and Video Processing vol9 no 7 pp 1503ndash1519 2014

[32] J Yu Z Qin T Wan and X Zhang ldquoFeature integrationanalysis of bag-of-features model for image retrievalrdquo Neuro-computing vol 120 pp 355ndash364 2013

[33] E Yildizer A M Balci T N Jarada and R Alhajj ldquoIntegratingwavelets with clustering and indexing for effective content-based image retrievalrdquoKnowledge-Based Systems vol 31 pp 55ndash66 2012

[34] J Shawe-Taylor and N Cristianini Kernel Methods for PatternAnalysis Cambridge University Press New York NY USA2004

[35] AVedaldi andA Zisserman ldquoSparse kernel approximations forefficient classification and detectionrdquo in Proceedings of the IEEEConference on Computer Vision and Pattern Recognition (CVPRrsquo12) pp 2320ndash2327 June 2012

[36] E Nowak F Jurie and B Triggs ldquoSampling strategies for bag-of-features image classifiationrdquo in Computer VisionmdashECCV2006 pp 490ndash503 Springer 2006

[37] G Csurka C Dance L Fan J Willamowski and C BrayldquoVisual categorization with bags of keypointsrdquo in Proceedings oftheWorkshop on Statistical Learning in Computer Vision (ECCVrsquo04) vol 1 pp 1ndash2 Prague Czech Republic 2004

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 7: Research Article A Novel Image Retrieval Based on a Combination …downloads.hindawi.com/journals/mpe/2016/8217250.pdf · 2019-07-30 · image retrieval by using a combination of

Mathematical Problems in Engineering 7

Table 2 Comparison of MAP for top-20 image retrievals on Corel-A image benchmark

Class amp method Proposed research Youssef [17] Irtaza et al [4] Poursistani et al [18] Tian et al [19] Wang et al [15]Africa 7303 635 65 7024 746 64Beach 7458 642 60 4444 378 54Buildings 8024 698 62 708 539 53Buses 9584 915 85 763 967 94Dinosaurs 9795 992 93 100 99 98Elephants 8764 781 65 638 66 78Flowers 8513 948 94 924 92 71Horses 8629 952 77 947 87 93Mountains 8243 738 73 562 585 42Food 7896 806 81 745 622 50

Table 3 Comparison of recall for top-20 image retrievals on Corel-A image benchmark

Class amp method Proposed research Youssef [17] Irtaza et al [4] Poursistani et al [18] Tian et al [19] Wang et al [15]Africa 1461 1270 1300 1405 1492 1280Beach 1492 1284 1200 889 756 1080Buildings 1605 1396 1240 1416 1078 1060Buses 1917 1830 1700 1526 1934 1880Dinosaurs 1959 1984 1860 2000 1980 1960Elephants 1753 1562 1300 1276 1320 1560Flowers 1703 1896 1880 1848 1840 1420Horses 1726 1904 1540 1894 1740 1860Mountains 1649 1476 1460 1124 1170 840Food 1579 1612 1620 1490 1244 1000Mean 1684 1621 1510 1457 1455 1394

100 200 300 4000Size of vocabulary

75

80

85

MA

P

Proposed research with a pixel stride of 5BoVW with a pixel stride of 5Proposed research with a pixel stride of 15BoVW with a pixel stride of 15Proposed work with a pixel stride of 25BoVW with a pixel stride of 25

Figure 5 MAP as a function of vocabulary size

In order to present a sustainable performance of theproposed research the image retrieval precision and recallfor the top-20 image retrievals are compared with the state-of-the-art research of CBIR [4 15 17ndash19] Tables 2 and 3 arepresenting the classwise comparison of average precision andrecall of the proposed research (with a vocabulary size of 200words on a pixel stride of 5 and by using 100 features perimage) with the existing state-of-the-art techniques of CBIRThe MAP comparison is shown in Figure 6 while precision-recall curve is shown in Figure 7

The experimental results conducted on Corel-A imagebenchmark prove the robustness of the proposed imagerepresentation The MAP of the proposed research is higherthan the existing state-of-the-art research [4 15 17ndash19] TheMAP obtained from the proposed image representation is8421 (with a vocabulary size of 200 visual words on a pixelstride of 5 and by using 100 features per image)

The image retrieval results obtained by using proposedimage representation for the semantic class ldquoBusesrdquo andldquoBeachrdquo are shown in Figures 8 and 9 respectively in res-ponse to the query images that shows reduction of semanticgap in terms of classifier decision value (score) The classifierdecision label determines the class of the image whileclassifier decision value (score) is used to find the similar

8 Mathematical Problems in Engineering

Proposed researchYoussef et alIrtaza et al

Poursistani et al

Wang et alTian et al

84218107

7550 7434 72776970

60

65

70

75

80

85

90

Mea

n av

erag

e pre

cisio

n (M

AP)

()

Figure 6 Comparison of MAP with the state-of-the-art methodson Corel-A image benchmark

Proposed research with pixel stride of 5SIFT-LBP [32]

20 40 60 80 1000Recall ()

40

60

80

100

Prec

ision

()

Figure 7 Precision-recall curve for Corel-A image benchmark

images The real values shown at the top of each image arethe classifier decision value (score) of the respective imagecomputed by applying the Euclidean distance between scoreof the query image and scores of the retrieved images Top-20retrieved images whose score is close to the score of the queryimage aremore similar to the query image and vice versa (dueto limited space the retrieval results in response to the queryimages are shown for two semantic classes only)

42 Performance on Caltech-256 Image Benchmark Caltech-256 image benchmark (httpwwwvisioncaltechedu) con-sists of 256 semantic classes and each semantic class contains80 to 827 images For the simplicity we randomly selected10 classes from the Caltech-256 image benchmark Theselected classes are Motorbike Faceeasy Fireworks BonsaiButterfly Leopard Airplanes Ketch and Hibiscus as shownin Figure 10

Different sizes of visual vocabulary (20 50 100 200 400and 600) are constructed from a set of training images 50

190792

192121

181437

207091

168814

1897

184608

20671

212677

191177

196715

204835

212566

18671

177793

170756

214148

188865

18053

173708

168421

Figure 8 Image retrieval result shows reduction of semantic gap forthe semantic class ldquoBusesrdquo

209122

206148

200182

194074

182063

211198

202284

194183

186

210937

203655

221345

190409

214339

220837

19114

181279

205414

197978

225197

181495

Figure 9 Image retrieval result shows reduction of semantic gap forthe semantic class ldquoBeachrdquo

images are used for the training and the remaining 50images are used for the testing The MAP of the proposedresearch is compared with standard BoVW representation(without considering the spatial information) on differentpixel strides (5 15 and 25) vocabulary sizes (20 50 100200 400 and 600) and feature percentages (10 2550 75 and 100) for vocabulary construction The MAPobtained from the proposed image representation and thestandard BoVW representation by using different parametersis graphically presented in Figure 11

Mathematical Problems in Engineering 9

Figure 10 Samples of images from 10 semantic classes of Caltech-256 image benchmark

100 200 300 400 500 6000Size of vocabulary

75

80

85

MA

P

Proposed research with pixel stride of 5BoVW with pixel stride of 5Proposed research with pixel stride of 15BoVW with pixel stride of 15Proposed research with pixel stride of 25BoVW with pixel stride of 25

Figure 11 MAP as a function of vocabulary size for the Caltech-256image benchmark

The experimental results conducted on 10 semanticclasses of Caltech-256 image benchmark prove the robustnessof the proposed image representation The MAP obtained byusing the proposed image representation (with a vocabularysize of 400 visualwords andby using 100 features per image)on the pixel strides of 5 15 and 25 is 8650 8398 and8171 respectively while MAP obtained by using the sameexperimental parameters for standard BoVW representationis 8545 8305 and 8095 respectively According tothe experimental results the MAP for all vocabulary sizesdecreases by increasing the pixel stride as shown in Figure 11while precision-recall curve is shown in Figure 12

Proposed research with pixel stride of 5BoVW with pixel stride of 5

50

60

70

80

90Pr

ecisi

on (

)

20 40 60 80 1000Recall ()

Figure 12 Precision-recall curve for Caltech-256 image benchmark

43 Performance on Ground Truth Image BenchmarkGround Truth image benchmark (httpimagedatabasecswashingtonedugroundtruth) is a publicly available imagebenchmark and is used for the evaluation of CBIR research[20 21 33] There are a total of 1109 images that are dividedinto 22 semantic classes We manually selected all the imagesfrom 5 different semantic classes of Ground Truth imagebenchmark The selected classes for the evaluation of theproposed image representation are Abro green CherriesFootball Green Lake and Swiss Mountains as shown inFigure 13 The 5 classes are selected in order to comparethe performance of the proposed image representation withexisting state-of-the-art research [20 21 33] because theseresearchers also reported their results on the same numberof classes of Ground Truth image benchmark

50 images are used for the training and the remaining50 images are used for the testing Different sizes (10 20

10 Mathematical Problems in Engineering

Figure 13 Samples of images from 5 semantic classes of Ground Truth image benchmark

8284 8133

62805909

Proposed researchCardoso et al

Yildizer et alYildizer et al

45

50

55

60

65

70

75

80

85

MA

P (

)

Figure 14 Comparison of MAP with the state-of-the-art methodson Ground Truth image benchmark

Table 4 Classwise comparison of average precision

Class amp method Proposed method [20] [21]Abro green 7525 80 6667Cherries 7194 80 50Football 9018 100 75Green Lake 8423 80 50Swiss Mountains 9263 6067 50

30 40 and 50) of visual vocabulary are constructed andMAP on the vocabulary of these different sizes is calculatedAccording to the experimental results the MAP of 8284 isobtained by using the proposed image representation (with avocabulary size of 40 visual words on pixel stride of 5 andby using 100 features per image) The classwise averageprecision obtained from the proposed image representationis presented in Table 4 while MAP is graphically presentedin Figure 14

Table 5 Computational cost (in seconds) of the proposed algo-rithm

Number of images retrieved Proposed researchTop-5 02872Top-10 03972Top-15 06470Top-20 07837Top-25 09184Top-30 11103

Experimental results and comparisons conducted on theGround Truth image benchmark prove the robustness of theproposed research based on a combination of local and globalhistograms of visual words The MAP obtained from theproposed image representation is higher than the existingstate-of-the-art research [20 21 33]

44 Complexity Performance The computational cost of theproposed algorithm is calculated using Intel(R) Core i3(fourth generation) 17 Ghz CPU with 4GB RAM (DDR3)3MB L3 cache and Windows 7 operating system Theproposed algorithm is implemented in MATLAB and visualvocabulary is constructed offline using a training dataset andtested at run time using a test dataset The average CPUtime (in seconds) required from features extraction to imageretrieval is presented in Table 5

5 Conclusion and Future Directions

In this paper we proposed a novel image representationbased on a combination of local and global histograms ofvisual words The local histogram of visual words adds thespatial information of the central area to the inverted index ofBoVW representation The combination of local and globalhistograms of visual words is a possible solution to capturethe semantic information from the image The performance

Mathematical Problems in Engineering 11

of the proposed image representation is evaluated on threechallenging image datasets The proposed image represen-tation outperforms the state-of-the-art research includingthe standard BoVW representation For the future work weplan to replace the BoVW model with either Fisher kernelframework or the Vector of Locally Aggregated Descriptors(VLAD) model to evaluate a large-scale image retrieval

Competing Interests

The authors declare that they have no competing interests

References

[1] A-M Tousch S Herbin and J-Y Audibert ldquoSemantic hierar-chies for image annotation a surveyrdquo Pattern Recognition vol45 no 1 pp 333ndash345 2012

[2] R Datta D Joshi J Li and J Z Wang ldquoImage retrieval ideasinfluences and trends of the new agerdquoACMComputing Surveysvol 40 no 2 article 5 2008

[3] Y Liu D Zhang G Lu and W-Y Ma ldquoA survey of content-based image retrieval with high-level semanticsrdquo Pattern Recog-nition vol 40 no 1 pp 262ndash282 2007

[4] A Irtaza M A Jaffar E Aleisa and T-S Choi ldquoEmbeddingneural networks for semantic association in content basedimage retrievalrdquoMultimedia Tools and Applications vol 72 no2 pp 1911ndash1931 2014

[5] D ZhangMM Islam andG Lu ldquoA reviewon automatic imageannotation techniquesrdquo Pattern Recognition vol 45 no 1 pp346ndash362 2012

[6] J Sivic and A Zisserman ldquoVideo google a text retrievalapproach to object matching in videosrdquo in Proceedings of the 9thIEEE International Conference on Computer Vision pp 1470ndash1477 IEEE 2003

[7] J PhilbinO ChumM Isard J Sivic andA Zisserman ldquoObjectretrieval with large vocabularies and fast spatial matchingrdquoin Proceedings of the IEEE Computer Society Conference onComputer Vision and Pattern Recognition (CVPR rsquo07) pp 1ndash8IEEE Minneapolis Minn USA June 2007

[8] O Chum J Philbin J Sivic M Isard and A Zisserman ldquoTotalrecall automatic query expansion with a generative featuremodel for object retrievalrdquo in Proceedings of the 2007 IEEE 11thInternational Conference on Computer Vision (ICCV rsquo07) pp 1ndash8 IEEE Rio de Janeiro Brazil October 2007

[9] J PhilbinOChumM Isard J Sivic andA Zisserman ldquoLost inquantization improving particular object retrieval in large scaleimage databasesrdquo in Proceedings of the 26th IEEE Conference onComputer Vision and Pattern Recognition (CVPR rsquo08) pp 1ndash8IEEE Anchorage Alaska USA June 2008

[10] H Anwar S Zambanini M Kampel and K VondrovecldquoAncient coin classification using reverse motif recognitionimage-based classification of roman republican coinsrdquo IEEESignal Processing Magazine vol 32 no 4 pp 64ndash74 2015

[11] S Zhang Q Tian G Hua Q Huang and W Gao ldquoGeneratingdescriptive visual words and visual phrases for large-scale imageapplicationsrdquo IEEETransactions on Image Processing vol 20 no9 pp 2664ndash2677 2011

[12] Y Su and F Jurie ldquoImproving image classification using seman-tic attributesrdquo International Journal of Computer Vision vol 100no 1 pp 59ndash77 2012

[13] Z Mehmood S M Anwar M Altaf and N Ali ldquoA novelimage retrieval based on rectangular spatial histograms of visualwordsrdquo Kuwait Journal of Science In press

[14] S Lazebnik C Schmid and J Ponce ldquoBeyond bags of featuresspatial pyramid matching for recognizing natural scene cate-goriesrdquo in Proceedings of the IEEE Computer Society Conferenceon Computer Vision and Pattern Recognition (CVPR rsquo06) vol 2pp 2169ndash2178 New York NY USA June 2006

[15] C Wang B Zhang Z Qin and J Xiong ldquoSpatial weighting forbag-of-features based image retrievalrdquo in Integrated Uncertaintyin Knowledge Modelling and Decision Making Z Qin and V-NHuynh Eds vol 8032 of Lecture Notes in Computer Science pp91ndash100 Springer Berlin Germany 2013

[16] N Ali K B Bajwa R Sablatnig and Z Mehmood ldquoImageretrieval by addition of spatial information based on histogramsof triangular regionsrdquoComputers amp Electrical Engineering 2016

[17] SM Youssef ldquoICTEDCT-CBIR integrating curvelet transformwith enhanced dominant colors extraction and texture analysisfor efficient content-based image retrievalrdquo Computers andElectrical Engineering vol 38 no 5 pp 1358ndash1376 2012

[18] P Poursistani H Nezamabadi-pour R Askari Moghadam andM Saeed ldquoImage indexing and retrieval in JPEG compresseddomain based on vector quantizationrdquoMathematical and Com-puter Modelling vol 57 no 5-6 pp 1005ndash1017 2013

[19] X Tian L Jiao X Liu and X Zhang ldquoFeature integration ofEODH and Color-SIFT application to image retrieval based oncodebookrdquo Signal Processing Image Communication vol 29 no4 pp 530ndash545 2014

[20] D N M Cardoso J Muller F Alexandre L A P Neves P MG Trevisani andG A Giraldi ldquoIterative technique for content-based image retrieval using multiple SVM ensemblesrdquo in ATreatise on Electricity andMagnetism J ClerkMaxwell Ed vol2 pp 68ndash73 Cambridge University Press 1873

[21] E Yildizer A M Balci M Hassan and R Alhajj ldquoEfficientcontent-based image retrieval using multiple support vectormachines ensemblerdquo Expert Systems with Applications vol 39no 3 pp 2385ndash2396 2012

[22] N Dalal and B Triggs ldquoHistograms of oriented gradients forhuman detectionrdquo in Proceedings of the IEEE Computer SocietyConference on Computer Vision and Pattern Recognition (CVPRrsquo05) vol 1 pp 886ndash893 IEEE SanDiego Calif USA June 2005

[23] D G Lowe ldquoDistinctive image features from scale-invariantkeypointsrdquo International Journal of Computer Vision vol 60 no2 pp 91ndash110 2004

[24] J Matas O Chum M Urban and T Pajdla ldquoRobust wide-baseline stereo from maximally stable extremal regionsrdquo Imageand Vision Computing vol 22 no 10 pp 761ndash767 2004

[25] H Bay T Tuytelaars and L Van Gool ldquoSurf speeded uprobust featuresrdquo in Computer VisionmdashECCV 2006 pp 404ndash417Springer 2006

[26] D J Mankowitz and S Ramamoorthy ldquoBRISK-based visualfeature extraction for resource constrained robotsrdquo in RoboCup2013 Robot World Cup XVII S Behnke M Veloso A Visserand R Xiong Eds vol 8371 of Lecture Notes in ComputerScience pp 195ndash206 Springer Berlin Germany 2014

[27] S Krig ldquoInterest point detector and feature descriptor surveyrdquoin Computer Vision Metrics pp 217ndash282 Springer BerlinGermany 2014

[28] R Ashraf K Bashir A Irtaza and M T Mahmood ldquoContentbased image retrieval using embedded neural networks withbandletized regionsrdquo Entropy vol 17 no 6 pp 3552ndash3580 2015

12 Mathematical Problems in Engineering

[29] S Zeng R Huang H Wang and Z Kang ldquoImage retrievalusing spatiograms of colors quantized by Gaussian MixtureModelsrdquo Neurocomputing vol 171 pp 673ndash684 2016

[30] E Walia and A Pal ldquoFusion framework for effective colorimage retrievalrdquo Journal of Visual Communication and ImageRepresentation vol 25 no 6 pp 1335ndash1348 2014

[31] A Irtaza and M A Jaffar ldquoCategorical image retrieval throughgenetically optimized support vector machines (GOSVM) andhybrid texture featuresrdquo Signal Image and Video Processing vol9 no 7 pp 1503ndash1519 2014

[32] J Yu Z Qin T Wan and X Zhang ldquoFeature integrationanalysis of bag-of-features model for image retrievalrdquo Neuro-computing vol 120 pp 355ndash364 2013

[33] E Yildizer A M Balci T N Jarada and R Alhajj ldquoIntegratingwavelets with clustering and indexing for effective content-based image retrievalrdquoKnowledge-Based Systems vol 31 pp 55ndash66 2012

[34] J Shawe-Taylor and N Cristianini Kernel Methods for PatternAnalysis Cambridge University Press New York NY USA2004

[35] AVedaldi andA Zisserman ldquoSparse kernel approximations forefficient classification and detectionrdquo in Proceedings of the IEEEConference on Computer Vision and Pattern Recognition (CVPRrsquo12) pp 2320ndash2327 June 2012

[36] E Nowak F Jurie and B Triggs ldquoSampling strategies for bag-of-features image classifiationrdquo in Computer VisionmdashECCV2006 pp 490ndash503 Springer 2006

[37] G Csurka C Dance L Fan J Willamowski and C BrayldquoVisual categorization with bags of keypointsrdquo in Proceedings oftheWorkshop on Statistical Learning in Computer Vision (ECCVrsquo04) vol 1 pp 1ndash2 Prague Czech Republic 2004

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 8: Research Article A Novel Image Retrieval Based on a Combination …downloads.hindawi.com/journals/mpe/2016/8217250.pdf · 2019-07-30 · image retrieval by using a combination of

8 Mathematical Problems in Engineering

Proposed researchYoussef et alIrtaza et al

Poursistani et al

Wang et alTian et al

84218107

7550 7434 72776970

60

65

70

75

80

85

90

Mea

n av

erag

e pre

cisio

n (M

AP)

()

Figure 6 Comparison of MAP with the state-of-the-art methodson Corel-A image benchmark

Proposed research with pixel stride of 5SIFT-LBP [32]

20 40 60 80 1000Recall ()

40

60

80

100

Prec

ision

()

Figure 7 Precision-recall curve for Corel-A image benchmark

images The real values shown at the top of each image arethe classifier decision value (score) of the respective imagecomputed by applying the Euclidean distance between scoreof the query image and scores of the retrieved images Top-20retrieved images whose score is close to the score of the queryimage aremore similar to the query image and vice versa (dueto limited space the retrieval results in response to the queryimages are shown for two semantic classes only)

42 Performance on Caltech-256 Image Benchmark Caltech-256 image benchmark (httpwwwvisioncaltechedu) con-sists of 256 semantic classes and each semantic class contains80 to 827 images For the simplicity we randomly selected10 classes from the Caltech-256 image benchmark Theselected classes are Motorbike Faceeasy Fireworks BonsaiButterfly Leopard Airplanes Ketch and Hibiscus as shownin Figure 10

Different sizes of visual vocabulary (20 50 100 200 400and 600) are constructed from a set of training images 50

Figure 8: Image retrieval result showing the reduction of the semantic gap for the semantic class "Buses". [Top-20 retrieved images, with the classifier decision value (score) of each image displayed above it.]

Figure 9: Image retrieval result showing the reduction of the semantic gap for the semantic class "Beach". [Top-20 retrieved images, with the classifier decision value (score) of each image displayed above it.]

50% of the images are used for training and the remaining 50% for testing. The MAP of the proposed research is compared with the standard BoVW representation (without considering the spatial information) over different pixel strides (5, 15, and 25), vocabulary sizes (20, 50, 100, 200, 400, and 600), and feature percentages (10%, 25%, 50%, 75%, and 100%) used for vocabulary construction. The MAP obtained from the proposed image representation and the standard BoVW representation under these parameters is presented graphically in Figure 11.
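How the pixel stride, feature percentage, and vocabulary size interact can be sketched as follows. This is a minimal illustration rather than the authors' MATLAB code: it assumes dense descriptors have already been extracted on a grid spaced by the chosen pixel stride, keeps the stated fraction of them, and clusters the sample into visual words with scikit-learn's KMeans as a stand-in for the clustering step; the names build_vocabulary and bovw_histogram are hypothetical.

```python
# Sketch of vocabulary construction and BoVW quantization under the stated parameters.
import numpy as np
from sklearn.cluster import KMeans

def build_vocabulary(descriptor_sets, vocab_size=400, feature_fraction=1.0, seed=0):
    """descriptor_sets: list of (n_i x d) descriptor arrays, one per training image."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack(descriptor_sets)
    keep = rng.choice(len(pooled), size=max(1, int(feature_fraction * len(pooled))),
                      replace=False)  # feature percentage used for clustering
    kmeans = KMeans(n_clusters=vocab_size, n_init=10, random_state=seed)
    kmeans.fit(pooled[keep])
    return kmeans  # cluster centres act as the visual vocabulary

def bovw_histogram(descriptors, kmeans):
    """Quantize one image's descriptors and return an L1-normalised histogram."""
    words = kmeans.predict(descriptors)
    hist = np.bincount(words, minlength=kmeans.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)
```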


Figure 10: Samples of images from 10 semantic classes of the Caltech-256 image benchmark.

Figure 11: MAP as a function of vocabulary size for the Caltech-256 image benchmark. [Plot of MAP (%) versus size of vocabulary (0–600); curves: proposed research and standard BoVW, each at pixel strides of 5, 15, and 25.]

The experimental results on the 10 semantic classes of the Caltech-256 image benchmark demonstrate the robustness of the proposed image representation. The MAP obtained by using the proposed image representation (with a vocabulary size of 400 visual words and 100% of the features per image) at pixel strides of 5, 15, and 25 is 86.50%, 83.98%, and 81.71%, respectively, while the MAP obtained with the same experimental parameters for the standard BoVW representation is 85.45%, 83.05%, and 80.95%, respectively. According to the experimental results, the MAP decreases for all vocabulary sizes as the pixel stride increases, as shown in Figure 11, while the precision-recall curve is shown in Figure 12.
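For clarity, the MAP figures reported here can be read as the mean, over all query images, of the per-query average precision. A small sketch of that metric, assuming binary relevance defined by class membership, is shown below; average_precision and mean_average_precision are illustrative names and not the authors' evaluation code.

```python
# Sketch of mean average precision (MAP) with binary, class-based relevance.
def average_precision(ranked_labels, query_label):
    """ranked_labels: class labels of the retrieved images in ranked order."""
    hits, precisions = 0, []
    for rank, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            precisions.append(hits / rank)  # precision at each relevant position
    return sum(precisions) / len(precisions) if precisions else 0.0

def mean_average_precision(per_query_results):
    """per_query_results: list of (ranked_labels, query_label) pairs, one per query."""
    return sum(average_precision(r, q) for r, q in per_query_results) / len(per_query_results)
```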

Figure 12: Precision-recall curve for the Caltech-256 image benchmark. [Plot of precision (%) versus recall (%), comparing the proposed research and standard BoVW, both with a pixel stride of 5.]

4.3. Performance on Ground Truth Image Benchmark. The Ground Truth image benchmark (http://imagedatabase.cs.washington.edu/groundtruth) is a publicly available image benchmark that is used for the evaluation of CBIR research [20, 21, 33]. There are a total of 1109 images, divided into 22 semantic classes. We manually selected all the images from 5 different semantic classes of the Ground Truth image benchmark. The classes selected for the evaluation of the proposed image representation are Abro green, Cherries, Football, Green Lake, and Swiss Mountains, as shown in Figure 13. These 5 classes are selected in order to compare the performance of the proposed image representation with the existing state-of-the-art research [20, 21, 33], because these studies also report their results on the same number of classes of the Ground Truth image benchmark.

50% of the images are used for training and the remaining 50% for testing.


Figure 13: Samples of images from 5 semantic classes of the Ground Truth image benchmark.

Figure 14: Comparison of MAP with the state-of-the-art methods on the Ground Truth image benchmark. [Bar chart; y-axis: MAP (%), range 45–85; methods compared: the proposed research (82.84), Cardoso et al., Yildizer et al., and Yildizer et al.; remaining plotted values: 81.33, 62.80, and 59.09.]

Table 4: Classwise comparison of average precision (%).

Class              Proposed method    [20]      [21]
Abro green         75.25              80        66.67
Cherries           71.94              80        50
Football           90.18              100       75
Green Lake         84.23              80        50
Swiss Mountains    92.63              60.67     50

Visual vocabularies of different sizes (10, 20, 30, 40, and 50) are constructed, and the MAP is calculated for each vocabulary size. According to the experimental results, a MAP of 82.84% is obtained by using the proposed image representation (with a vocabulary size of 40 visual words, a pixel stride of 5, and 100% of the features per image). The classwise average precision obtained from the proposed image representation is presented in Table 4, while the MAP is presented graphically in Figure 14.
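As a quick consistency check, averaging the classwise precisions of the proposed method listed in Table 4 reproduces the reported MAP up to rounding:

```python
# Classwise average precision (%) of the proposed method, as listed in Table 4.
classwise_ap = {"Abro green": 75.25, "Cherries": 71.94, "Football": 90.18,
                "Green Lake": 84.23, "Swiss Mountains": 92.63}
map_percent = sum(classwise_ap.values()) / len(classwise_ap)
print(round(map_percent, 2))  # 82.85, consistent with the reported MAP of 82.84%
```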

Table 5: Computational cost (in seconds) of the proposed algorithm.

Number of images retrieved    Proposed research (s)
Top-5                         0.2872
Top-10                        0.3972
Top-15                        0.6470
Top-20                        0.7837
Top-25                        0.9184
Top-30                        1.1103

Experimental results and comparisons conducted on the Ground Truth image benchmark demonstrate the robustness of the proposed research based on a combination of local and global histograms of visual words. The MAP obtained from the proposed image representation is higher than that of the existing state-of-the-art research [20, 21, 33].

4.4. Complexity Performance. The computational cost of the proposed algorithm is measured using an Intel(R) Core i3 (fourth generation) 1.7 GHz CPU with 4 GB RAM (DDR3), 3 MB L3 cache, and the Windows 7 operating system. The proposed algorithm is implemented in MATLAB; the visual vocabulary is constructed offline using a training dataset and tested at run time using a test dataset. The average CPU time (in seconds) required from feature extraction to image retrieval is presented in Table 5.
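A minimal sketch of this timing protocol is given below; it simply averages wall-clock time over a set of query images and assumes a user-supplied extract_and_retrieve function standing in for the full pipeline (feature extraction, quantization, histogram comparison, ranking), so it is an illustration rather than the authors' measurement code.

```python
# Sketch: average end-to-end retrieval time over a set of query images.
import time

def average_retrieval_time(query_images, extract_and_retrieve, top_k=20):
    elapsed = []
    for image in query_images:
        start = time.perf_counter()
        extract_and_retrieve(image, top_k=top_k)  # placeholder for the full pipeline
        elapsed.append(time.perf_counter() - start)
    return sum(elapsed) / len(elapsed)
```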

5. Conclusion and Future Directions

In this paper, we proposed a novel image representation based on a combination of local and global histograms of visual words. The local histogram of visual words adds the spatial information of the central area of the image to the inverted index of the BoVW representation. The combination of local and global histograms of visual words is a possible solution for capturing the semantic information of the image. The performance


of the proposed image representation is evaluated on three challenging image datasets. The proposed image representation outperforms the state-of-the-art research, including the standard BoVW representation. For future work, we plan to replace the BoVW model with either the Fisher kernel framework or the Vector of Locally Aggregated Descriptors (VLAD) model to evaluate large-scale image retrieval.

Competing Interests

The authors declare that they have no competing interests.

References

[1] A.-M. Tousch, S. Herbin, and J.-Y. Audibert, "Semantic hierarchies for image annotation: a survey," Pattern Recognition, vol. 45, no. 1, pp. 333–345, 2012.

[2] R. Datta, D. Joshi, J. Li, and J. Z. Wang, "Image retrieval: ideas, influences, and trends of the new age," ACM Computing Surveys, vol. 40, no. 2, article 5, 2008.

[3] Y. Liu, D. Zhang, G. Lu, and W.-Y. Ma, "A survey of content-based image retrieval with high-level semantics," Pattern Recognition, vol. 40, no. 1, pp. 262–282, 2007.

[4] A. Irtaza, M. A. Jaffar, E. Aleisa, and T.-S. Choi, "Embedding neural networks for semantic association in content based image retrieval," Multimedia Tools and Applications, vol. 72, no. 2, pp. 1911–1931, 2014.

[5] D. Zhang, M. M. Islam, and G. Lu, "A review on automatic image annotation techniques," Pattern Recognition, vol. 45, no. 1, pp. 346–362, 2012.

[6] J. Sivic and A. Zisserman, "Video Google: a text retrieval approach to object matching in videos," in Proceedings of the 9th IEEE International Conference on Computer Vision, pp. 1470–1477, IEEE, 2003.

[7] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, "Object retrieval with large vocabularies and fast spatial matching," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '07), pp. 1–8, IEEE, Minneapolis, Minn, USA, June 2007.

[8] O. Chum, J. Philbin, J. Sivic, M. Isard, and A. Zisserman, "Total recall: automatic query expansion with a generative feature model for object retrieval," in Proceedings of the 2007 IEEE 11th International Conference on Computer Vision (ICCV '07), pp. 1–8, IEEE, Rio de Janeiro, Brazil, October 2007.

[9] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, "Lost in quantization: improving particular object retrieval in large scale image databases," in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1–8, IEEE, Anchorage, Alaska, USA, June 2008.

[10] H. Anwar, S. Zambanini, M. Kampel, and K. Vondrovec, "Ancient coin classification using reverse motif recognition: image-based classification of Roman Republican coins," IEEE Signal Processing Magazine, vol. 32, no. 4, pp. 64–74, 2015.

[11] S. Zhang, Q. Tian, G. Hua, Q. Huang, and W. Gao, "Generating descriptive visual words and visual phrases for large-scale image applications," IEEE Transactions on Image Processing, vol. 20, no. 9, pp. 2664–2677, 2011.

[12] Y. Su and F. Jurie, "Improving image classification using semantic attributes," International Journal of Computer Vision, vol. 100, no. 1, pp. 59–77, 2012.

[13] Z. Mehmood, S. M. Anwar, M. Altaf, and N. Ali, "A novel image retrieval based on rectangular spatial histograms of visual words," Kuwait Journal of Science, in press.

[14] S. Lazebnik, C. Schmid, and J. Ponce, "Beyond bags of features: spatial pyramid matching for recognizing natural scene categories," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '06), vol. 2, pp. 2169–2178, New York, NY, USA, June 2006.

[15] C. Wang, B. Zhang, Z. Qin, and J. Xiong, "Spatial weighting for bag-of-features based image retrieval," in Integrated Uncertainty in Knowledge Modelling and Decision Making, Z. Qin and V.-N. Huynh, Eds., vol. 8032 of Lecture Notes in Computer Science, pp. 91–100, Springer, Berlin, Germany, 2013.

[16] N. Ali, K. B. Bajwa, R. Sablatnig, and Z. Mehmood, "Image retrieval by addition of spatial information based on histograms of triangular regions," Computers & Electrical Engineering, 2016.

[17] S. M. Youssef, "ICTEDCT-CBIR: integrating curvelet transform with enhanced dominant colors extraction and texture analysis for efficient content-based image retrieval," Computers and Electrical Engineering, vol. 38, no. 5, pp. 1358–1376, 2012.

[18] P. Poursistani, H. Nezamabadi-pour, R. Askari Moghadam, and M. Saeed, "Image indexing and retrieval in JPEG compressed domain based on vector quantization," Mathematical and Computer Modelling, vol. 57, no. 5-6, pp. 1005–1017, 2013.

[19] X. Tian, L. Jiao, X. Liu, and X. Zhang, "Feature integration of EODH and Color-SIFT: application to image retrieval based on codebook," Signal Processing: Image Communication, vol. 29, no. 4, pp. 530–545, 2014.

[20] D. N. M. Cardoso, J. Muller, F. Alexandre, L. A. P. Neves, P. M. G. Trevisani, and G. A. Giraldi, "Iterative technique for content-based image retrieval using multiple SVM ensembles," in A Treatise on Electricity and Magnetism, J. Clerk Maxwell, Ed., vol. 2, pp. 68–73, Cambridge University Press, 1873.

[21] E. Yildizer, A. M. Balci, M. Hassan, and R. Alhajj, "Efficient content-based image retrieval using multiple support vector machines ensemble," Expert Systems with Applications, vol. 39, no. 3, pp. 2385–2396, 2012.

[22] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '05), vol. 1, pp. 886–893, IEEE, San Diego, Calif, USA, June 2005.

[23] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, vol. 60, no. 2, pp. 91–110, 2004.

[24] J. Matas, O. Chum, M. Urban, and T. Pajdla, "Robust wide-baseline stereo from maximally stable extremal regions," Image and Vision Computing, vol. 22, no. 10, pp. 761–767, 2004.

[25] H. Bay, T. Tuytelaars, and L. Van Gool, "SURF: speeded up robust features," in Computer Vision–ECCV 2006, pp. 404–417, Springer, 2006.

[26] D. J. Mankowitz and S. Ramamoorthy, "BRISK-based visual feature extraction for resource constrained robots," in RoboCup 2013: Robot World Cup XVII, S. Behnke, M. Veloso, A. Visser, and R. Xiong, Eds., vol. 8371 of Lecture Notes in Computer Science, pp. 195–206, Springer, Berlin, Germany, 2014.

[27] S. Krig, "Interest point detector and feature descriptor survey," in Computer Vision Metrics, pp. 217–282, Springer, Berlin, Germany, 2014.

[28] R. Ashraf, K. Bashir, A. Irtaza, and M. T. Mahmood, "Content based image retrieval using embedded neural networks with bandletized regions," Entropy, vol. 17, no. 6, pp. 3552–3580, 2015.

[29] S. Zeng, R. Huang, H. Wang, and Z. Kang, "Image retrieval using spatiograms of colors quantized by Gaussian Mixture Models," Neurocomputing, vol. 171, pp. 673–684, 2016.

[30] E. Walia and A. Pal, "Fusion framework for effective color image retrieval," Journal of Visual Communication and Image Representation, vol. 25, no. 6, pp. 1335–1348, 2014.

[31] A. Irtaza and M. A. Jaffar, "Categorical image retrieval through genetically optimized support vector machines (GOSVM) and hybrid texture features," Signal, Image and Video Processing, vol. 9, no. 7, pp. 1503–1519, 2014.

[32] J. Yu, Z. Qin, T. Wan, and X. Zhang, "Feature integration analysis of bag-of-features model for image retrieval," Neurocomputing, vol. 120, pp. 355–364, 2013.

[33] E. Yildizer, A. M. Balci, T. N. Jarada, and R. Alhajj, "Integrating wavelets with clustering and indexing for effective content-based image retrieval," Knowledge-Based Systems, vol. 31, pp. 55–66, 2012.

[34] J. Shawe-Taylor and N. Cristianini, Kernel Methods for Pattern Analysis, Cambridge University Press, New York, NY, USA, 2004.

[35] A. Vedaldi and A. Zisserman, "Sparse kernel approximations for efficient classification and detection," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '12), pp. 2320–2327, June 2012.

[36] E. Nowak, F. Jurie, and B. Triggs, "Sampling strategies for bag-of-features image classification," in Computer Vision–ECCV 2006, pp. 490–503, Springer, 2006.

[37] G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray, "Visual categorization with bags of keypoints," in Proceedings of the Workshop on Statistical Learning in Computer Vision (ECCV '04), vol. 1, pp. 1–2, Prague, Czech Republic, 2004.
