Content-based image retrieval (CBIR) has long been an active area of research within the multimedia information retrieval community. Over the years, CBIR systems have evolved in complexity and performance. Among the open research issues in CBIR, the combined use of multiple image features is the most important. Studies reveal that it is difficult to find a single feature that performs well across all possible scenarios, whereas using multiple features in conjunction has proven to be a good strategy. The main concern when using multiple features is how to weight them. One obvious method is to give all features equal weights. In tasks such as image classification and object recognition, there is usually a training phase in which feature weights can be ascertained. In image retrieval, however, we face the practical problem of not knowing the relevance class of an image, because no relevant training set is available. This limitation has driven the development of various relevance feedback techniques, which have their own rich literature.
1.1 Previous works on CBIR
The literature contains a plethora of work on CBIR; in this paper, we highlight only the works that are relevant to ours.
A probabilistic framework for efficient retrieval and indexing of image collections was proposed in . This framework uncovers the hierarchical structure underlying the collection from image features, based on a hybrid model that combines generative and discriminative learning. The generalized Dirichlet mixture and maximum likelihood model was explored in for the generative learning in order to estimate the statistical model of the data accurately. In , a CBIR model is proposed that depends only on extracting the most relevant features according to a feature selection technique. The suggested feature selection technique aims at selecting the optimal features that not only maximize the detection rate but also simplify the computation of the image retrieval process. In , a novel CBIR scheme was proposed that exploits statistical features computed using the Multi-scale Geometric Analysis (MGA) of the Non-subsampled Contourlet Transform (NSCT).
A comparative study of feature extraction using texture and shape for CBIR is given in . The significance of the Local Binary Pattern (LBP) feature was discussed in . In , a novel image feature representation method using color difference histograms (CDH) was proposed for image retrieval. A novel CBIR approach was proposed in , which uses the well-known k-means clustering algorithm and the B+ tree database indexing structure to facilitate efficient retrieval of relevant images. Cluster validity analysis indexes combined with majority voting are employed to verify the appropriate number of clusters. A minimum distance criterion was used to identify the image cluster(s) closest to the query image. The Daubechies wavelet transform was used for extracting feature vectors from the images. The works proposed in and also assign weights to features and improve the accuracy of the CBIR system. The work proposed in presents a hybrid approach to reduce the semantic gap between low-level visual features and high-level semantics through simultaneous feature adaptation and feature selection. A mixed gravitational search algorithm (MGSA) is proposed by the authors of , in which feature extraction parameters are optimized to reach a maximum precision of the CBIR system. An extremely fast CBIR system that uses a multiple support vector machines ensemble was proposed in .
Relevance feedback (RF) is an effective approach to bridge the gap between low-level visual features and high-level semantic meanings in CBIR. To reduce the computational complexity of traditional SVM-based RF, an active SVM-based RF scheme using a multiple classifiers ensemble and feature re-weighting is proposed in . First, the most informative images are selected using an active learning method; then the feature space is modified dynamically by weighting the descriptive features according to a set of statistical characteristics. Finally, the weight vectors of the component SVM classifiers are computed dynamically using the parameters for positive and negative samples. A comparative study of the major challenges in RF for CBIR is given in . The authors of have proposed a long-term learning scheme in relevance feedback for CBIR.
A novel perspective on retrieving partial-duplicate images with Content-based Saliency Regions (CSR) is proposed in . The content of a CSR is represented with the BOV model, while saliency analysis is employed to ensure the high visual attention of the CSR. The authors also propose a relative saliency ordering constraint that captures a weak relative saliency layout among interest points in the saliency regions.
The primary contributions of this paper are: (i) the design of a multi-feature CBIR system, (ii) automatic weight selection for the individual features, and (iii) extensive experiments on four publicly available datasets and a comparison with state-of-the-art techniques.
1.3 Organization of the paper
The paper is organized as follows. Section 2 briefly describes our proposed CBIR framework. In Sec. 3, we describe the four feature descriptors used in this paper, along with the appropriate distance metrics. Our proposed indexing technique is discussed in Sec. 4. Next, in Sec. 5, we describe the two automatic feature weight selection methods. The experimental results are discussed in Sec. 6. The strengths and drawbacks of the proposed framework are given in Sec. 7, followed by concluding remarks in Sec. 8.
2 Brief description of work
Figure 1 depicts the flow diagram of our proposed CBIR framework. We extract features from the images in the dataset and create a database of feature descriptors. We then use the features from this database to train a classifier model, with part of the features used as the training set and the rest as the testing set. Once the classifier is trained, we select a query image and predict its top categories (the number of categories is determined empirically). This reduces the search space for similar images. To search for similar images, we select either one or all of the feature descriptors. If all feature descriptors are selected, we assign them initial weights and search for similar images using this initial combination of weights. We then iteratively obtain the optimal weights for the individual feature descriptors. Finally, we retrieve similar images with the optimally assigned weights and compute the precision-recall (PR) curve to analyse the performance.
3 Feature Descriptors
A brief description of the feature descriptors used here is given in the following sub-sections.
3.1 Color Difference Histogram (CDH)
CDH counts the perceptually uniform color difference between two points under different backgrounds with regard to the colors and edge orientations in the L*a*b* color space. Implementing CDH involves conversion from RGB to the CIE L*a*b* color space, detection of edge orientation, and color quantization in the CIE L*a*b* color space. The method pays particular attention to color, edge orientation and perceptually uniform color differences, and encodes them via a feature representation in a manner similar to the human visual system.
3.2 Local Binary Pattern (LBP)
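The generic local binary pattern operator is derived from a joint distribution of pixel gray levels in a local neighbourhood. As in the case of basic LBP, it is obtained by summing the thresholded differences weighted by powers of two. The operator is defined as

$$\mathrm{LBP}_{P,R}(x_c, y_c) = \sum_{p=0}^{P-1} s(g_p - g_c)\, 2^p, \qquad s(z) = \begin{cases} 1, & z \ge 0 \\ 0, & z < 0 \end{cases}$$

where $g_c$ is the gray value of the center pixel $(x_c, y_c)$ and $g_p$, $p = 0, \dots, P-1$, are the gray values of its $P$ neighbours at radius $R$. In this equation, the signs of the differences in a neighbourhood are interpreted as a $P$-bit binary number, resulting in $2^P$ distinct values for the LBP code. The local texture can thus be described approximately by the $2^P$-bin distribution of LBP codes, which is computed for a given image sample.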
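As a rough illustration of the operator described above, the following sketch computes the basic LBP code with 8 neighbours at radius 1 for each interior pixel, together with the resulting 256-bin histogram. It is a minimal pure-Python sketch and not the implementation used in the experiments; the names `lbp_code` and `lbp_histogram`, the neighbour ordering, and the choice of 8 neighbours are assumptions made for the example.

```python
# Offsets (row, col) of the 8 neighbours, visited in a fixed order.
# The ordering is an assumption; any fixed ordering yields a valid LBP code.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, r, c):
    """Return the basic LBP code of the interior pixel at (r, c)."""
    centre = image[r][c]
    code = 0
    for p, (dr, dc) in enumerate(NEIGHBOURS):
        # Threshold the neighbour against the centre and weight it by 2**p.
        if image[r + dr][c + dc] >= centre:
            code += 1 << p
    return code


def lbp_histogram(image):
    """Return the normalised 256-bin histogram of LBP codes (interior pixels)."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Toy 3x3 gray-level patch (made-up values) with a single interior pixel.
patch = [[5, 9, 1],
         [4, 6, 7],
         [2, 8, 3]]
print(lbp_code(patch, 1, 1))  # 42: neighbours 9, 7 and 8 set bits 1, 3 and 5
```

The histogram returned by `lbp_histogram` plays the role of the LBP feature vector that is later compared with a distance metric.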
3.3 Color Layout Descriptor (CLD)
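CLD is designed to capture the spatial distribution of color in an image. It is a very compact and resolution-invariant representation of color intended for high-speed image retrieval. The extraction procedure of CLD consists of four stages: image partitioning, representative color selection, DCT transformation, and zigzag scanning. Finally, we obtain three zigzag-scanned coefficient matrices, one per color channel; following the usual MPEG-7 formulation, we denote them $DY$, $DCb$ and $DCr$. The distance between two CLD descriptors $\{DY, DCb, DCr\}$ and $\{DY', DCb', DCr'\}$ is computed as:

$$D = \sqrt{\sum_i w_{yi}\,(DY_i - DY'_i)^2} + \sqrt{\sum_i w_{bi}\,(DCb_i - DCb'_i)^2} + \sqrt{\sum_i w_{ri}\,(DCr_i - DCr'_i)^2}$$

where $w_{yi}$, $w_{bi}$ and $w_{ri}$ are the elements of the weight matrices and $DY_i$, $DCb_i$ and $DCr_i$ are the $i$-th elements of the respective coefficient matrices.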
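The last extraction stage and the descriptor distance can be sketched as follows. This is a hedged illustration that assumes the common MPEG-7 style formulation, in which each descriptor holds one zigzag-scanned coefficient list per color channel and the distance is a sum of weighted Euclidean terms. The channel keys `"Y"`, `"Cb"`, `"Cr"` and the function names `zigzag` and `cld_distance` are hypothetical.

```python
import math


def zigzag(matrix):
    """Scan a square matrix in zigzag order (as used for DCT coefficients)."""
    n = len(matrix)
    out = []
    for s in range(2 * n - 1):               # s indexes the anti-diagonals
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:                        # even diagonals run bottom-left to top-right
            rows = reversed(rows)
        for r in rows:
            out.append(matrix[r][s - r])
    return out


def cld_distance(d1, d2, weights):
    """Sum over channels of the weighted Euclidean distance between coefficients."""
    total = 0.0
    for channel in ("Y", "Cb", "Cr"):
        squared = sum(w * (a - b) ** 2
                      for w, a, b in zip(weights[channel], d1[channel], d2[channel]))
        total += math.sqrt(squared)
    return total


print(zigzag([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]]))  # [1, 2, 4, 7, 5, 3, 6, 8, 9]
```

In practice the weights usually emphasise the low-frequency coefficients at the start of each zigzag-scanned list, but the specific weight values are not given in the text and are left to the caller here.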
3.4 Edge Oriented Histogram (EOH)
The primary objective of EOH is to first detect the edges present in an image and then construct a histogram of the directions of the gradients of those edges. First, edges are detected with Sobel operator masks, one for each of five orientations. The image is convolved with each mask, which yields a set of matrices indicating the edge strength at every pixel for the corresponding orientation. A histogram is then built from these matrices by counting, for each pixel, the direction with the maximum gradient among the five directions.
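The procedure above can be sketched in a few lines. The five 3x3 masks below (horizontal, vertical, two diagonals and one non-directional mask) are an assumption, since the text does not list the masks actually used; the threshold parameter and all function names are likewise hypothetical. Because only the absolute response is used, correlation and convolution give the same edge strength for these masks.

```python
# Assumed Sobel-style masks for five edge types; the paper's masks may differ.
MASKS = {
    "horizontal":      [[1, 2, 1], [0, 0, 0], [-1, -2, -1]],
    "vertical":        [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]],
    "diagonal_45":     [[0, 1, 2], [-1, 0, 1], [-2, -1, 0]],
    "diagonal_135":    [[-2, -1, 0], [-1, 0, 1], [0, 1, 2]],
    "non_directional": [[-1, 0, 1], [0, 0, 0], [1, 0, -1]],
}


def edge_strength(image, r, c, mask):
    """Absolute response of a 3x3 mask centred on the interior pixel (r, c)."""
    return abs(sum(mask[i][j] * image[r - 1 + i][c - 1 + j]
                   for i in range(3) for j in range(3)))


def edge_orientation_histogram(image, threshold=0):
    """Count, per edge type, the interior pixels where that type responds most."""
    hist = {name: 0 for name in MASKS}
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            strengths = {name: edge_strength(image, r, c, mask)
                         for name, mask in MASKS.items()}
            best = max(strengths, key=strengths.get)
            if strengths[best] > threshold:   # ignore flat (edge-free) pixels
                hist[best] += 1
    return hist


# Toy image with a single horizontal edge between rows 1 and 2.
image = [[0] * 4, [0] * 4, [10] * 4, [10] * 4]
print(edge_orientation_histogram(image)["horizontal"])  # 4
```

Ties between edge types are resolved here by dictionary order, which is an arbitrary choice for the sketch.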
Image indexing reduces the search space for images similar to a given query image. Given $N$ image categories, our proposed technique reduces the search space to $k$ categories, where $k < N$. We use multiclass support vector machines (SVM) to identify the top $k$ categories that contain images similar to the query image (see Alg. 2). For each category, we create a binary SVM that predicts whether or not a test image belongs to that category. The SVM classification score for classifying an observation $x$ is the signed distance from $x$ to the decision boundary, ranging from $-\infty$ to $+\infty$. A positive score for a class indicates that $x$ is predicted to be in that class; a negative score indicates otherwise. We compute the SVM classification scores of the query image using all the binary SVMs and retain the categories with the highest scores. We then search for similar images in this reduced search space.
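The category selection step can be illustrated with a small sketch. It assumes that the per-category SVM scores of the query image are already available as a dictionary; the scores below are made up, and the function names are hypothetical. Training the binary SVMs themselves is not shown.

```python
def top_k_categories(scores, k):
    """Return the k category labels with the highest SVM classification scores."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


def search_space_reduction(num_categories, k):
    """Fraction of the categories that no longer has to be searched."""
    return 1.0 - k / num_categories


# Made-up signed distances of one query image to four decision boundaries.
scores = {"beach": -0.8, "bus": 1.3, "flower": 0.2, "horse": -1.5}
print(top_k_categories(scores, 2))     # ['bus', 'flower']
print(search_space_reduction(16, 3))   # 0.8125
```

The second call corresponds to keeping 3 of 16 categories, which matches the 81.25% reduction reported later in the paper.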
5 Automatic Weight Selection
The importance of a feature descriptor differs from image to image. Therefore, we use a combination of feature descriptors to find the images similar to a query. We use CDH, LBP, CLD and EOH as feature descriptors (see Sec. 3 for details). The task is to assign an appropriate weight to each descriptor. In this paper, we propose two methods for automatic weight selection, which are detailed in the following sub-sections.
5.1 Method 1: Relevant Ratio Technique
Initial weights are assigned to the descriptors by unit normalising the set of areas under the PR curves obtained with each descriptor for a particular query image. These weights are then updated in each iteration. The update depends on the iteration number, the number of relevant images among the top similar images when a single feature descriptor is used alone, the number of relevant images among the top similar images when the combination of feature descriptors is used with the most recently updated weights, and an increment factor (a positive value). Algorithm 3 outlines the steps of the proposed automatic feature weight update method using the relevance ratio.
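The initialisation step can be sketched as follows. The sketch assumes that "unit normalising" means scaling the per-descriptor areas under the PR curve so that they sum to one, and that the combined dissimilarity is a weighted sum of per-descriptor distances that have already been brought to a comparable scale; both are assumptions, as are the function names and the example values. The iterative update rule is not reproduced here.

```python
def initial_weights(auc_values):
    """Scale per-descriptor PR-curve areas so that the weights sum to one."""
    total = sum(auc_values.values())
    return {name: auc / total for name, auc in auc_values.items()}


def combined_distance(distances, weights):
    """Weighted sum of per-descriptor distances between a query and one image."""
    return sum(weights[name] * distances[name] for name in weights)


# Made-up areas under the PR curve for one query image.
auc = {"CDH": 0.40, "LBP": 0.30, "CLD": 0.20, "EOH": 0.10}
weights = initial_weights(auc)

# Made-up per-descriptor distances between the query and one database image.
distances = {"CDH": 0.2, "LBP": 0.5, "CLD": 0.1, "EOH": 0.9}
print(combined_distance(distances, weights))
```

With this initialisation, a descriptor that retrieves the query's class well on its own starts with a proportionally larger say in the combined ranking.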
5.2 Method 2: Mean Difference Method
Algorithm 4 gives the steps of the proposed mean difference technique. The initial weight of each descriptor is its individual weight divided by the total weight of all the descriptors. The technique then increases the contribution of the feature descriptors that received larger initial weights. It iterates until the best feature descriptor (the one with the maximum initial weight) has been allocated its final weight. Meanwhile, the weights allocated at each iteration are stored in a matrix, and we finally select the set of weights that gives the maximum PR value among all iterations.
|Database Name|Image dimension|No. of Classes|Samples/Class|Total|
|---|---|---|---|---|
|Oxford Flower (D1)||17|80|1360|
|Natural Images (D2)||16|40|640|
|Corel 1 (D3)||16|40|640|
|Corel 2 (D4)||16|40|640|
6.1 Experimental setup
We have used three image data sets, namely the natural image data set and two subsets of the Corel image data set. Each data set (see Tab. 1) comprises images from several categories. The images are resized to a common dimension for feature extraction. Retrieval accuracy depends not only on a strong feature representation but also on good similarity measures or distance metrics, with which we compute the content similarity of images. For each image in the dataset, a feature vector is extracted and stored in the database, and a feature vector of the same dimension is extracted from the query image. Table 2 lists the distance metrics used to calculate the distance between a pair of images.
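For concreteness, the three distance metrics named in the quantitative results (Canberra, chi-square and Euclidean) can be sketched as below. These are the textbook forms and may differ in detail from the definitions in Table 2; in particular, the chi-square distance has several variants (some include a factor of 1/2), so the form used here is an assumption. The helper `rank_images` and the toy database are hypothetical.

```python
import math


def euclidean(x, y):
    """Euclidean (L2) distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def canberra(x, y):
    """Canberra distance; terms whose denominator is zero are skipped."""
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom:
            total += abs(a - b) / denom
    return total


def chi_square(x, y):
    """One common chi-square distance for non-negative histograms."""
    total = 0.0
    for a, b in zip(x, y):
        denom = a + b
        if denom:
            total += (a - b) ** 2 / denom
    return total


def rank_images(query, database, metric):
    """Return image ids ordered from most to least similar to the query."""
    return sorted(database, key=lambda image_id: metric(query, database[image_id]))


# Toy database of 2-dimensional feature vectors (made-up values).
database = {"a": [0.0, 0.0], "b": [3.0, 4.0], "c": [1.0, 1.0]}
print(rank_images([0.0, 0.0], database, euclidean))  # ['a', 'c', 'b']
```

Passing the metric as a function makes it easy to repeat the same retrieval run once per distance metric.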
6.2 Performance metrics
The precision-recall (PR) graph is used to measure the accuracy of our proposed CBIR system. Precision and recall are both based on the notion of relevance. Given a set of retrieved images, precision is the percentage of the retrieved samples that are relevant. Recall, on the other hand, is the fraction of the total number of relevant samples for a given class that the system retrieves. An ideal system maintains high precision even at high values of recall. In our experiments, we calculated the precision and recall values for randomly selected query samples from the test set. The recall values are then normalized to a common scale and the corresponding precision values are interpolated. This is repeated for all the distance metrics used in our system.
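The two measures and the area under the PR curve can be sketched as follows. The sketch takes a ranked list of relevance flags for one query and uses a plain trapezoidal rule; it does not reproduce the recall normalisation and precision interpolation used in the experiments, and the function names are hypothetical.

```python
def precision_recall(ranked_relevance, total_relevant):
    """Precision and recall after each rank, given relevance flags in rank order."""
    precisions, recalls = [], []
    hits = 0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
        precisions.append(hits / rank)
        recalls.append(hits / total_relevant)
    return precisions, recalls


def area_under_pr(precisions, recalls):
    """Trapezoidal area under the PR curve, starting from recall 0."""
    if not precisions:
        return 0.0
    area = 0.0
    prev_recall, prev_precision = 0.0, precisions[0]
    for p, r in zip(precisions, recalls):
        area += (r - prev_recall) * (p + prev_precision) / 2.0
        prev_recall, prev_precision = r, p
    return area


# A perfect ranking: both relevant images are retrieved first.
p, r = precision_recall([True, True, False], total_relevant=2)
print(area_under_pr(p, r))  # 1.0
```

Averaging such curves over many queries gives the average PR curve whose area is used as the accuracy measure.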
6.3 Qualitative Results
The figures below show the top 8 similar images retrieved for the query image shown on the left. Results obtained with the different distance metrics are shown one after the other. The images are arranged according to their similarity ranking. Correctly retrieved images are outlined in green, whereas incorrectly retrieved images are outlined in red.
|Feature Metric||Dataset 1||Dataset 2||Dataset 3|
6.4 Quantitative Results
We took random images from each image dataset as queries. We plotted the average precision-recall curve for each individual feature descriptor and then for the combined feature descriptors, using the Canberra (M1), chi-square (M2) and Euclidean (M3) distance metrics. Table 4 shows the area under the precision-recall curve for the three datasets that we used. The area under the precision-recall curve is a measure of the accuracy of a retrieval system. The highest accuracy was achieved when we combined the feature descriptors and assigned optimal weights using the relevant ratio technique.
For each dataset, the average precision-recall curves for the individual feature descriptors and for the combined feature descriptors, using the three distance metrics, are shown in Fig. 6. The horizontal axis shows recall and the vertical axis shows precision. Individually, the color difference histogram is the best feature descriptor. The curve corresponding to the combined feature descriptors has the highest area under the precision-recall curve in each plot.
We have implemented image indexing and automated weight selection. Using image indexing, we predict the top 3 categories that contain images similar to the query image. For the data sets with 16 categories, this reduces the search space by 81.25% (1 − 3/16). This prediction has an impact when we search for the top 10 similar images. In automated weight selection, we initially assign weights to the feature descriptors by unit normalizing the areas under the precision-recall curves of the corresponding feature descriptors. We have used two techniques to update these weights automatically.
With automatically assigned weights, we obtain more accurate results than with individual feature descriptors: in the precision-recall curves above, the area under the curve is larger for the automatically weighted combination than for any individual feature descriptor. We are also able to find the most important of the four feature descriptors for a particular query image, as can be seen in the tables in the Qualitative Results section, where the most important feature descriptors have been assigned the maximum weights.
With the advent of various search engines, image searching has become an easier task. However, search engines mostly use text-based retrieval techniques. Although CBIR is an active research topic, we cannot expect it to replace the existing techniques entirely. CBIR can certainly be used to complement the existing machinery to provide better results. The CBIR method presented herein uses a combination of local and global features. The purpose of this paper was to improve the accuracy of a CBIR application by allowing the system to retrieve more images similar to the query image. The proposed methodology first uses image indexing to reduce the search space for images similar to the query image, and then assigns optimal weights to the individual features so that they can be used in combination. It also predicts the most important feature (out of those used) for a query image.
-  Liu, Guang-Hai, and Jing-Yu Yang (2013) Content-based image retrieval using color difference histogram. In: Pattern Recognition 46.1:188-198.
-  Dataset: Nilsback, Maria-Elena, and Andrew Zisserman (2006) A visual vocabulary for flower classification. In: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on. Vol. 2. IEEE.
-  Pietikäinen, Matti (2011) Local binary patterns for still images. In: Computer Vision Using Local Binary Patterns. Springer London, 13-47.
-  L. Li, Z. Wu, Z. J. Zha, S. Jiang and Q. Huang (2011) Matching Content-based Saliency Regions for partial-duplicate image retrieval. In: IEEE International Conference on Multimedia and Expo, Barcelona, Spain.
-  Ziou, Djemel, Touati Hamri, and Sabri Boutemedjet (2009) A hybrid probabilistic framework for content-based image retrieval with feature weighting. In: Pattern Recognition 42.7:1511-1519.
-  ElAlami, M. Esmel (2011) A novel image retrieval model based on the most relevant features. In: Knowledge-Based Systems 24.1:23-32.
-  Yildizer, Ela (2012) Integrating wavelets with clustering and indexing for effective content-based image retrieval. In: Knowledge-Based Systems 31:55-66.
-  Rashedi, Esmat, Hossein Nezamabadi-Pour, and Saeid Saryazdi (2013) A simultaneous feature adaptation and feature selection method for content-based image retrieval systems. In: Knowledge-Based Systems 39:85-94.
-  Kundu, Malay Kumar, Manish Chowdhury, and Samuel Rota Bulò (2015) A graph-based relevance feedback mechanism in content-based image retrieval. In: Knowledge-Based Systems 73:254-264.
-  Wang, Xiang-Yang, Bei-Bei Zhang, and Hong-Ying Yang (2013) Active SVM-based relevance feedback using multiple classifiers ensemble and features reweighting. In: Engineering Applications of Artificial Intelligence 26.1:368-381.
-  Yildizer, Ela (2012) Efficient content-based image retrieval using multiple support vector machines ensemble. In: Expert Systems with Applications 39.3:2385-2396.
-  Wang, Dong Yue, and Taeg Keun Whangbo (2014) A study on Content-based Image Retrieval System Using Relevance Feedback. In: Advanced Science and Technology Letters, 67: 5-8.
-  Bagri, Neelima, and Punit Kumar Johari (2015) A Comparative Study on Feature Extraction using Texture and Shape for Content Based Image Retrieval. In: International Journal of Advanced Science and Technology 80: 41-52.
-  Belattar, Khadidja, and Sihem Mostefai (2013) CBIR using Relevance Feedback: Comparative analysis and major challenges. In: Computer Science and Information Technology (CSIT), 2013 5th International Conference on. IEEE.
-  Lakshmi, A., Malay Nema, and Subrata Rakshit (2015) Long Term Relevance Feedback: A Probabilistic Axis Re-Weighting Update Scheme. In: Signal Processing Letters, IEEE 22.7:852-856.
-  Li, Haojie (2013) Combining global and local matching of multiple features for precise item image retrieval. In: Multimedia systems 19.1:37-49.
-  Sami, Tasnim, Nabeel Mohammed, and Sifat Momen (2016) Learning “initial feature weights” for CBIR using query augmentation. In: International Journal of Multimedia Information Retrieval 1-8.
-  Nilsback, Maria-Elena, and Andrew Zisserman (2006) A visual vocabulary for flower classification. In: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on. Vol. 2.
-  J. Z. Wang, J. Li, and G. Wiederhold (2001) SIMPLIcity: Semantics-Sensitive Integrated Matching for Picture Libraries. In: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), vol. 23, no. 9, pp. 947-963.