1 Introduction
Nearest neighbor (NN) search, a fundamental component of image retrieval [2], has attracted increasing interest due to the ever-growing large-scale data on the web. Recently, similarity-preserving hashing methods that encode images into binary codes have been widely studied. Learning good hash functions requires two properties: 1) a powerful image representation and 2) efficient search in the representation space. In this paper, we focus on deep-network-based hashing that enables efficient search while maintaining good retrieval performance.
In recent years, we have witnessed the great success of deep neural networks, which mainly comes from the powerful representations learned by deep network architectures. Deep-network-based hashing methods learn image representations as well as binary hash codes. Lin et al.
[13] proposed a method that learns hash codes and image representations in a pointwise manner. Li et al. [12] proposed a deep hashing method called deep pairwise-supervised hashing (DPSH) that performs simultaneous hash code learning and feature learning. Zhao et al. [37] presented a deep semantic ranking based method for learning hash functions that preserve multi-level semantic similarity for multi-label images. Further, Zhuang et al. [38] proposed a fast deep network for triplet supervised hashing.
Although powerful binary codes can be learned from deep networks, a linear scan over Hamming distances is still time-consuming on large-scale datasets (e.g., millions or billions of images). Many methods have been proposed for efficient search in Hamming space. One popular approach is to use the binary codes as indices into a hash table [24]. The problem is that the number of buckets grows near-exponentially with the code length. Norouzi et al. [22] proposed the multi-index hashing (MIH) method for fast search, which divides the binary codes into smaller substrings and builds multiple hash tables. MIH assumes that binary codes are uniformly distributed over the Hamming space, which is often not true in practice. Liu et al. [16] and Wan et al. [28] each proposed data-oriented multi-index hashing: they first calculate the correlation matrix between bits and then rearrange the bit indices to obtain a more uniform distribution in each hash table. Ong et al. [23] relaxed the equal-size constraint in MIH and proposed multiple hash tables with variable-length hash keys. Wang et al. [31] used repeat-bits in Hamming space to accelerate the search, at the cost of more storage space. Song et al. [26] proposed a distance-computation-free search scheme for hashing.
Most existing works first use hashing models (e.g., LSH [2], MLH [21]) to encode images into binary codes, followed by separate methods to rebalance the code distribution. Such fixed hashing models may result in suboptimal search. Ideally, the hash model and the balancing procedure should be learned simultaneously during the hash learning process.
In this paper, we propose a deep architecture for fast search and efficient image representation by incorporating the MIH approach into the network. As shown in Figure 1, our architecture consists of three main building blocks. The first block learns a good image representation with stacked convolutional and fully-connected layers, followed by a slice layer that divides the intermediate image features into multiple substrings, each substring corresponding to one hash table as in the MIH approach. The second and third blocks learn a uniform code distribution, balancing the binary codes at the feature level and the instance level, respectively. At the feature level, we make the binary codes distribute as uniformly as possible in each substring hash table by adding a new balanced constraint to the objective. The instance-level constraint penalizes buckets that contain too many items, which would otherwise require checking many candidate codes. Finally, a similarity-preserving objective with the two balanced constraints is proposed to capture the similarities among images, and a fast hash model is learned that encodes all images into more uniformly distributed binary codes.
The main contributions of this work are twofold.

We propose deep multi-index hashing, which learns hash functions for both powerful image representation and fast search.

We conduct extensive evaluations on several benchmark datasets. The empirical results demonstrate the superiority of the proposed method over state-of-the-art baseline methods.
2 Related Work
The learning-to-hash methods learn hash functions from the training data to generate better binary representations. Representative methods include Iterative Quantization (ITQ) [3], Kernelized LSH (KLSH) [10], Anchor Graph Hashing (AGH) [18], Spectral Hashing (SH) [32], Semi-Supervised Hashing (SSH) [29], Kernel-based Supervised Hashing (KSH) [17], Minimal Loss Hashing (MLH) [21], Binary Reconstruction Embedding (BRE) [9], and so on. A comprehensive survey can be found in [30].
Deep-network-based hashing has emerged as one of the leading approaches. Many algorithms [7, 19, 15, 36, 38, 13, 34, 35, 37] have been proposed, including pointwise, pairwise and ranking-based approaches. The pointwise methods take a single image as input, and the loss function is built on individual data points. For example, Lin et al.
[13] showed that binary codes can be learned by employing a hidden layer to represent the latent concepts that dominate the class labels, and thus proposed to learn the hash codes and image representations in a pointwise manner. Yang et al. [34] proposed a loss function defined on the classification error and other desirable properties of hash codes to learn the hash functions. The pairwise methods take image pairs as input, and the loss functions characterize the relationship (i.e., similar or dissimilar) between a pair of images: if two images are similar, the Hamming distance between them should be small; otherwise, the distance should be large. Representative methods include deep pairwise-supervised hashing (DPSH) [12], deep supervised hashing (DSH) [14], and so on. The ranking-based methods cast learning-to-hash as a ranking problem. Zhao et al. [37] proposed a deep semantic ranking-based method for learning hash functions that preserve multi-level semantic similarity between multi-label images. Zhuang et al. [38] proposed a fast deep network for triplet supervised hashing.
Although existing deep learning-to-hash methods obtain powerful image representations, they rarely consider fast search in the learned code space. Multi-index hashing [4, 22] is an efficient method for finding all neighbors of a query by dividing the binary codes into multiple substrings. However, binary codes learned by a deep network are often not uniformly distributed in practice; e.g., all images with the same label may be indexed with a similar key, as shown in Figure 2, which leads to checking many candidate codes. In this paper, we solve this problem by adding two balanced constraints to our network and learn more uniformly distributed binary codes.
3 Background: Multi-Index Hashing
In this section, we briefly review MIH [22].
MIH is a method for fast search in large-scale datasets, in which each b-bit binary code h is partitioned into m disjoint substrings h^(1), ..., h^(m), each consisting of s = b/m bits; for ease of presentation, we assume b is divisible by m. One hash table is built for each of the m substrings.
An r-neighbor of a query g is a code in the database that differs from g in r bits or fewer. To find the r-neighbors of a query g with substrings g^(1), ..., g^(m), MIH searches each of the m substring hash tables for entries within a Hamming distance of r/m. (For ease of presentation, here we assume r is divisible by m. In practice, if r = m r' + a with a < m, we can set the search radii of the first a + 1 hash tables to r' and of the rest to r' - 1.) The set of candidates from the j-th substring hash table is denoted N_j(g). The union of all these sets, N(g) = N_1(g) ∪ ... ∪ N_m(g), is a superset of the r-neighbors of g. The false positives that are not true r-neighbors of g are removed by computing the full Hamming distance.
The k-NN search problem can be formulated as a sequence of r-neighbor problems: initializing the integer r = 0, we progressively increment the search radius until the specified number of neighbors is found.
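To make the review concrete, the search procedure above can be sketched in a few lines of Python. This is a minimal illustration under the simplifying assumption that m divides both b and r, so every table is probed at radius r // m; the function names (`build_mih_tables`, `mih_r_neighbors`) are ours, not from [22].

```python
import itertools
from collections import defaultdict

def build_mih_tables(codes, m):
    """Build m substring hash tables over b-bit codes (tuples of 0/1 bits).

    Returns a list of m dicts mapping a substring key -> list of item ids.
    """
    b = len(codes[0])
    s = b // m  # bits per substring
    tables = [defaultdict(list) for _ in range(m)]
    for idx, code in enumerate(codes):
        for j in range(m):
            tables[j][code[j * s:(j + 1) * s]].append(idx)
    return tables

def hamming(a, b):
    """Full Hamming distance between two bit tuples."""
    return sum(x != y for x, y in zip(a, b))

def keys_within_radius(key, radius):
    """Yield all substring keys within Hamming distance `radius` of `key`."""
    for r in range(radius + 1):
        for positions in itertools.combinations(range(len(key)), r):
            k = list(key)
            for p in positions:
                k[p] ^= 1
            yield tuple(k)

def mih_r_neighbors(query, codes, tables, m, r):
    """Exact r-neighbor search: probe each table at radius r // m,
    union the candidates, then filter by the full Hamming distance."""
    s = len(query) // m
    candidates = set()
    for j in range(m):
        sub = query[j * s:(j + 1) * s]
        for key in keys_within_radius(sub, r // m):
            candidates.update(tables[j].get(key, ()))
    return sorted(i for i in candidates if hamming(query, codes[i]) <= r)
```

By pigeonhole, any code within full distance r of the query must match some substring within r // m bits, so the union of the per-table candidates is a superset of the true r-neighbors; the final filter removes the false positives.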
4 Deep Multi-Index Hashing
This section describes the deep multi-index hashing architecture, which allows us to 1) obtain powerful binary codes and 2) search efficiently in the binary code space.
We first introduce notation. There is a labeled training set {(x_i, y_i)}, i = 1, ..., N, where x_i is the i-th image, y_i is the class label of the i-th image, and N is the number of training samples. Suppose that each binary code comprises b bits. The goal of deep multi-index hashing is to learn a deep hash model in which the similarities among the binary codes are preserved while quick search in a large-scale binary code space remains possible.
As shown in Figure 1, the proposed architecture serves two purposes: 1) a deep network with multiple convolution-pooling layers captures an efficient representation of images, followed by a slice layer that partitions the feature into m disjoint substrings; and 2) a balanced-codes module addresses the ability to quickly search in the binary code space. It generates binary codes that distribute as uniformly as possible in each substring hash table from two aspects: the feature level and the instance level. In the following, we present the details of these parts, respectively.
4.1 Efficient Representation via Deep Network
The deep network, e.g., AlexNet [8], VGG [25], GoogLeNet [27] or a residual network [5], is used to learn a powerful image representation, with the following structural modifications for the image retrieval task. The first modification is to remove the last fully-connected layer (e.g., fc8). The second is to add a fully-connected layer producing b-dimensional intermediate features. The intermediate features are then fed into a tanh layer that restricts the values to the range (-1, 1). MIH maintains m separate hash tables; inspired by this, the third modification is to add a slice layer that divides the features into m slices of equal length b/m. Following the suggestion of MIH, the number of substring hash tables is set to m = b / log2 n, where n is the database size, which showed the best empirical performance in [22]. Finally, the output of the network is denoted h = F(x), where x is the input image and F is the deep network.
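The three modifications can be sketched as follows, assuming a generic backbone feature as input; `hash_head` and its arguments are illustrative names rather than the paper's exact layer definitions.

```python
import numpy as np

def hash_head(features, W, bias, m):
    """Sketch of the modified head: an added fully-connected layer
    producing a b-dimensional intermediate feature, a tanh squashing the
    values into (-1, 1), and a slice layer splitting the result into m
    equal-length substrings (one per hash table).

    features: (N, d) backbone activations (e.g., from fc7).
    W: (d, b) weights and bias: (b,) bias of the added layer.
    Returns an (N, m, b // m) array; its sign gives the binary codes.
    """
    h = np.tanh(features @ W + bias)     # relaxed codes in (-1, 1)
    n_img, b = h.shape
    return h.reshape(n_img, m, b // m)   # one slice per substring hash table
```

Taking the sign of each slice yields the m substrings used to key the hash tables.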
The deep-network-based methods can learn a very powerful image representation for image retrieval, but they do not consider the ability to efficiently search in the representation space. An example of binary codes forming a powerful representation with poor search behavior is shown in Figure 2. Here the substring length is s. Suppose that there are 2 class labels and each class consists of 50,000 images; without loss of generality, the first 50,000 images carry the first label and the remaining 50,000 the second. A hash table is built for the 100,000 learned binary codes as shown in Figure 2, where similar codes fall into the same bucket (with a similar key) and dissimilar codes have the largest possible Hamming distance in the hash table, i.e., s. The learned binary codes are very good for accuracy but very bad for search: given a query, one needs to check very many candidate items (e.g., 50,000 items). It is therefore necessary to find a way to generate more balanced binary codes.
4.2 Fast Searching via Deep Multi-Index Hashing
We first give the following proposition.
Proposition 1.
When we search the buckets in the m substring hash tables that differ from the corresponding query substrings by at most r' bits, i.e., we retrieve the candidate sets N_j(g) for j = 1, ..., m, then we have N_r(g) ⊆ N_1(g) ∪ ... ∪ N_m(g), where r = m(r' + 1) - 1.
For example, suppose that r' = 0. When searching the first substring hash table, we obtain a set of candidates N_1(g); the 0-neighbors of the query g form a subset of these candidates, that is, N_0(g) ⊆ N_1(g). Similarly, we have N_0(g) ⊆ N_2(g), and so on. When searching all m substring hash tables with a radius of r' bits or less, we obtain all r-neighbors of the query, where r = m(r' + 1) - 1.
According to the above proposition, the running time of MIH for k-NN search mainly consists of two parts: index lookups and candidate code checking. To achieve faster search, we should reduce 1) the number of distinct hash buckets to examine, i.e., the smaller the per-table radius r', the better, and 2) the number of candidate codes to check, i.e., the smaller |N_1(g) ∪ ... ∪ N_m(g)|, the better.
4.2.1 Balanced Binary Codes at the Instance Level
To reduce the running time of index lookups, the binary codes of similar images should be indexed with a similar key, as shown in Figure 2; in that case a small search radius r' suffices. Unfortunately, we would then need to check very many candidate binary codes, making the search inefficient. Thus, each bucket should be neither too small nor too large. Balanced binary codes at the instance level are learned to address this problem, requiring that each bucket in the hash table contain at most t items.
Formally, we find all buckets in all substring hash tables that contain more than t items. Let B_k^j denote the set of items in the k-th bucket of the j-th substring hash table. We use the following steps to rebalance these items, as shown in Figure 3.
1) The full Hamming distance is used to split the items into several groups, each group containing the samples that share the same binary code. If every group contains no more than t items, stop the procedure.
2) Otherwise, for each group with more than t items, we further randomly split it into ceil(n_g/t) subgroups of equal size, making sure each subgroup contains at most t items, where n_g is the number of items in the group.
A key principle is that the similarities among these images must not be changed; that is, the distance between the codes of x_i and x_j should preserve relative similarities of the form "(x_i, x_j in the same subgroup) < (x_i, x_j in the same group) < (x_i, x_j in different groups)". Accordingly, the objective sets the target Hamming distances of the j-th substring between examples in B_k^j to 0, 1, and 2 for the three cases, respectively, which rebalances the items.
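The two rebalancing steps can be sketched as follows, assuming codes are stored as bit tuples and t denotes the bucket capacity; `rebalance_bucket` is our name, and the random split mirrors step 2 above.

```python
import math
import random
from collections import defaultdict

def rebalance_bucket(items, codes, t, seed=0):
    """Sketch of the instance-level rebalancing step: items sharing one
    substring bucket are first grouped by their full binary code, then any
    group larger than t is randomly split into equal-sized subgroups of at
    most t items each. The capacity t is a hyperparameter here.

    items: ids falling into one overfull bucket; codes: id -> bit tuple.
    Returns a list of subgroups (lists of ids).
    """
    groups = defaultdict(list)
    for i in items:
        groups[codes[i]].append(i)   # step 1: group by identical full code
    rng = random.Random(seed)
    subgroups = []
    for g in groups.values():
        if len(g) <= t:
            subgroups.append(g)
            continue
        rng.shuffle(g)               # step 2: random equal-size split
        k = math.ceil(len(g) / t)    # number of subgroups needed
        size = math.ceil(len(g) / k)
        subgroups.extend(g[i:i + size] for i in range(0, len(g), size))
    return subgroups
```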
4.2.2 Balanced Binary Codes at the Feature Level
To reduce the running time of candidate code checking, the number of false positives in the candidate set should be small; that is, we want to minimize |N(g) \ N_r(g)|. To achieve this, N_j(g) should not contain many items that are not true neighbors of the query. In other words, when the substrings g^(j) and h^(j) differ by at most r' bits, the full Hamming distance between g and h should be at most m(r' + 1) - 1 bits. This leads to the following proposition:
Proposition 2.
Suppose that for every pair of codes g and h and for every substring index j we have floor(d/m) ≤ d_j ≤ ceil(d/m), where d is the full Hamming distance between g and h and d_j is the Hamming distance between their j-th substrings; then N_1(g) ∪ ... ∪ N_m(g) ⊆ N_r(g), where r = m(r' + 1) - 1.
According to Proposition 2, we add the following new balanced constraint to our objective:

(1)  floor(d(h_i, h_j)/m) ≤ d(h_i^(k), h_j^(k)) ≤ ceil(d(h_i, h_j)/m),  for k = 1, ..., m,

where h_i^(k) denotes the k-th substring of the binary code h_i and d(·, ·) is the Hamming distance. The formulation requires almost equal distances across the substrings: the distance in each substring should be at most the ceiling and at least the floor of d/m, where d is the full distance.
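This per-substring balance condition can be checked directly from two codes; a minimal sketch follows (the helper names are ours):

```python
def substring_distances(g, h, m):
    """Per-substring Hamming distances between two b-bit codes (bit tuples)."""
    s = len(g) // m
    return [sum(a != b for a, b in zip(g[j * s:(j + 1) * s],
                                       h[j * s:(j + 1) * s]))
            for j in range(m)]

def satisfies_balance(g, h, m):
    """Check the feature-level condition: every substring distance lies in
    [floor(d/m), ceil(d/m)], where d is the full Hamming distance."""
    d = sum(a != b for a, b in zip(g, h))
    lo, hi = d // m, -(-d // m)   # floor and ceiling of d / m
    return all(lo <= dj <= hi for dj in substring_distances(g, h, m))
```

When the condition holds, a code retrieved from any table at radius r' cannot lie far away in the full Hamming space, which matches the intuition behind Proposition 2.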
Overall, the similarity-preserving loss function for balanced codes can be formulated as:

(2)  L = L_tri + λ1 L_fea + λ2 L_ins,

where λ1 and λ2 are trade-off parameters, L_tri is the triplet ranking term, L_fea is the feature-level term derived from constraint (1), and L_ins is the instance-level term, with d(·, ·) denoting the distance between two binary codes. For ease of optimization, we replace the Hamming distance with the Euclidean distance, and the trade-off parameters are fixed in all our experiments. The first term of the objective preserves relative similarities of the form "x is more similar to x+ than to x-"; the second term generates balanced codes at the feature level; and the third term generates balanced codes at the instance level.
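As an illustration of how such a relaxed objective can be assembled, here is a sketch combining a triplet ranking term with a feature-level balance term on Euclidean distances; the instance-level term is omitted for brevity, and `margin`, `eps`, `lam` and the function name are our assumptions rather than the paper's exact formulation.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def dmih_loss(a, p, n, m, margin=1.0, eps=0.5, lam=1.0):
    """Illustrative similarity-preserving loss with a feature-level balance
    term, computed on relaxed (real-valued) codes via squared Euclidean
    distances, as the paper's relaxation suggests.

    a, p, n: (b,) relaxed codes of the anchor, positive, negative images.
    """
    d_ap = np.sum((a - p) ** 2)
    d_an = np.sum((a - n) ** 2)
    # triplet ranking term: the anchor should lie closer to the positive
    triplet = relu(margin + d_ap - d_an)
    # feature-level term: spread the anchor-negative distance evenly over
    # the m substrings, so no single table absorbs all the disagreement
    s = len(a) // m
    sub_d = np.array([np.sum((a[j*s:(j+1)*s] - n[j*s:(j+1)*s]) ** 2)
                      for j in range(m)])
    balance = np.sum(relu(np.abs(sub_d - d_an / m) - eps))
    return triplet + lam * balance
```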
Table 1: Speedup factors of MIH over linear scan for k-NN search with different code lengths.

Dataset  # bits  Method  1-NN  2-NN  5-NN  10-NN  20-NN  50-NN  100-NN

NUS-WIDE  64  DMIH  119.23  100.37  87.62  77.59  66.75  50.01  34.68
DeepHash  56.17  55.28  49.34  43.79  36.82  30.70  22.71  
96  DMIH  74.59  60.51  52.78  44.02  41.93  30.96  26.08  
DeepHash  40.83  34.25  33.08  20.06  21.18  16.61  12.08  
128  DMIH  60.90  47.66  38.79  35.56  19.49  18.85  17.36  
DeepHash  30.95  24.00  21.30  19.97  11.82  10.11  9.37  
256  DMIH  30.31  23.91  22.28  20.62  20.72  16.21  14.69  
DeepHash  15.24  12.75  12.45  13.42  12.45  9.51  8.62  
SVHN  64  DMIH  96.39  63.86  65.74  65.76  64.15  43.83  31.95 
DeepHash  10.06  10.33  10.29  10.25  10.03  10.93  9.79  
96  DMIH  92.46  64.02  58.08  54.09  45.82  25.48  20.44  
DeepHash  9.96  10.01  9.79  10.24  9.71  9.06  10.06  
128  DMIH  56.82  49.35  22.17  18.33  16.33  17.94  17.59  
DeepHash  9.12  9.34  7.56  6.21  6.16  6.97  6.39  
256  DMIH  30.27  23.94  21.14  19.46  17.91  14.87  13.19  
DeepHash  7.53  7.60  7.02  7.44  7.62  7.20  7.34 
5 Experiments
In this section, we evaluate and compare the performance of the proposed method with several state-of-the-art algorithms.
5.1 Datasets and Experimental Setting

SVHN [20] (http://ufldl.stanford.edu/housenumbers/) is obtained from house numbers in Google Street View images; it contains over 600,000 images in 10 classes.

NUS-WIDE [1] (http://lms.comp.nus.edu.sg/research/NUSWIDE.htm) consists of 269,648 images and their associated tags from Flickr. Labels for 81 concepts are extracted from the tags associated with the images.
For NUS-WIDE, we follow the settings in [33, 18] for a fair comparison. The 21 most frequent labels are selected, each associated with at least 5,000 images. We randomly select 100 images from each of the 21 selected classes to form a query set of 2,100 images. The remaining images are used as the retrieval database; within it, 500 images from each of the selected 21 classes are randomly chosen as the training set.
For SVHN, we randomly select 1,000 images (100 per class) as the query set and 5,000 images (500 per class) from the remaining images as the training set.
We implement the proposed method using the open-source Caffe [6] framework. We use AlexNet [8] as the basic network, with the layer weights initialized from the pre-trained AlexNet model (http://dl.caffe.berkeleyvision.org/bvlc_alexnet.caffemodel).
5.2 Results
In this subsection, we evaluate the query time of our method by comparing it with an existing deep-network-based method. To make a fair comparison, we compare two methods:

DeepHash. The hash functions are learned without the balanced constraints, i.e., using only the first term of objective (2).

Deep Multi-Index Hashing (DMIH). The hash functions are learned with the balanced constraints, i.e., using all terms of objective (2).
Since the two methods use the same network and differ only in whether the proposed feature-level and instance-level balanced constraints are used, this comparison shows whether the balanced constraints contribute to search speed.
After obtaining the binary codes, we use the MIH implementation provided by the authors (https://github.com/norouzi/mih) to report the speedup ratios of the two methods over linear scan on all the above databases. The speedup factors of MIH over linear scan for both the proposed method and DeepHash on different k-NN problems are shown in Table 1. Note that linear scan does not depend on the underlying distribution of the binary codes, so its running time is the same for the two methods.
The results show that DMIH is more efficient than DeepHash, especially for small k-NN problems. For instance, for 1-NN with 96-bit codes on SVHN, the speedup factor of DMIH is 92.46, compared to 9.96 for DeepHash. On NUS-WIDE, our method shows about a 2x speedup over DeepHash.
The main reason is that the proposed method learns more balanced hash codes than DeepHash. To give an intuitive understanding of our method, we use an entropy-based measurement defined as
(3)  H_j = - Σ_k p_k log2 p_k,

where the sum runs over the 2^s buckets of the j-th hash table (s being the substring length of that table), and p_k = n_k / N is the probability of a code being assigned to bucket k, with n_k the number of codes in bucket k and N the size of the database. Note that a higher entropy value means a more uniform distribution of data items in the hash tables.

Table 2: Entropy of the bucket distribution for each method.

Method  64 bits  96 bits  128 bits  256 bits

SVHN  
DeepHash  4.06  4.32  4.14  4.11 
DMIH  10.40  10.94  9.10  8.17 
NUS-WIDE
DeepHash  9.23  9.59  8.99  8.97 
DMIH  9.72  9.84  9.51  9.39 
Again, for all databases and code lengths, our method yields the higher entropy and beats the baseline. This also explains why our method achieves faster search.
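The entropy measurement can be computed directly from the substring keys of the database codes; a small sketch (`bucket_entropy` is our name):

```python
import math
from collections import Counter

def bucket_entropy(substrings):
    """Entropy of the bucket occupancy of one substring hash table:
    H = -sum_k p_k * log2(p_k) with p_k = n_k / N, where n_k counts the
    codes hashed to bucket k and N is the database size. A higher value
    means a more uniform distribution of items over the buckets.
    """
    counts = Counter(substrings)
    n = len(substrings)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

For a table over s-bit substrings, the maximum is s bits of entropy, attained when all 2^s buckets are equally occupied.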
Further, we evaluate and compare the retrieval performance of the proposed method with several state-of-the-art algorithms: LSH [2], ITQ [3], ITQ-CCA [3], SH [32] and DeepHash are selected as baselines. The results of LSH, ITQ, ITQ-CCA and SH are obtained with the implementations provided by their authors. Note that DeepHash is very similar to the existing One-Stage Hash [11], which also divides the feature into several slices and uses a triplet ranking loss to preserve similarities. Since the results of DeepHash and One-Stage Hash are almost identical, we report only DeepHash. To evaluate hashing quality, we use mean average precision (MAP) and precision curves w.r.t. different numbers of top returned samples as evaluation metrics. For a fair comparison, all methods use identical training and test sets, and the AlexNet model is used to extract deep features (i.e., 4096-dimensional features from the fc7 layer) for LSH, ITQ, ITQ-CCA and SH.
Figure 4 shows the comparison results on the two datasets. We can see that 1) the deep-network-based methods improve over the baselines that use fixed deep features, and 2) DMIH performs comparably to the most related baseline, DeepHash. These results verify that adding the new balanced constraints does not degrade retrieval performance.
In summary, our method runs 2 to 10 times faster than DeepHash while achieving comparable retrieval performance.
5.2.1 Effects of the Feature-level and Instance-level Constraints
In this set of experiments, we show the advantages of the two proposed balanced constraints. To give an intuitive comparison, we report the results of using only the feature-level constraint and only the instance-level constraint, respectively.
Table 3: Speedup factors of MIH over linear scan with different balanced constraints.

Method  1-NN  10-NN  100-NN
NUS-WIDE
Both  74.59  44.02  26.08 
Featurelevel  68.00  37.64  22.29 
Instancelevel  45.48  30.22  17.82 
SVHN  
Both  92.46  54.09  20.44 
Featurelevel  11.73  10.43  9.99 
Instancelevel  89.73  50.23  19.94 
Table 3 shows the comparison results. The instance-level constraint is very useful on SVHN, while the feature-level constraint is more helpful on NUS-WIDE; which constraint matters more depends on the distribution of the learned binary codes.
5.2.2 Effect of the End-to-end Learning
Our framework is end-to-end. To show its advantages, we compare against a baseline that adopts a two-stage strategy: in the first stage, DeepHash is learned and the images are encoded into binary codes; in the second stage, we rebalance the binary codes with data-driven multi-index hashing [28].
Table 4: Speedup factors of the end-to-end method, DeepHash, and the two-stage baseline.

Method  1-NN  10-NN  100-NN
NUS-WIDE
DMIH  74.58  44.02  26.08 
DeepHash  40.83  20.06  12.08 
Twostage  60.55  29.97  17.95 
SVHN  
DMIH  92.46  54.09  20.44 
DeepHash  9.96  10.24  10.06 
Twostage  11.50  10.40  10.33 
Table 4 shows the comparison results. Our method performs better than both DeepHash and the two-stage method, confirming that it is desirable to learn the hash function and the balancing procedure in one end-to-end framework.
6 Conclusion
In this paper, we proposed a deep-network-based multi-index hashing method for fast search and good retrieval performance. In the proposed deep architecture, an image goes through a deep network with stacked convolutional layers and is encoded into a high-level representation consisting of several substrings. We then learn more balanced binary codes by adding two constraints: the feature-level constraint makes the binary codes distribute as uniformly as possible in each hash table, and the instance-level constraint keeps the buckets of each substring hash table balanced in the number of items they contain. Finally, a deep hash model for both powerful image representation and fast search is learned simultaneously. Empirical evaluations on two datasets show that the proposed method runs faster than the baseline while achieving comparable performance.
In future work, we plan to apply DMIH to different networks and methods to further explore the effect of the proposed balanced constraints. We also plan to accelerate feature extraction from the deep network.
Appendices
Proof of Proposition 1
Suppose there exists a binary code h such that h ∈ N_r(g) and h ∉ N_1(g) ∪ ... ∪ N_m(g), where r = m(r' + 1) - 1. Since h ∉ N_j(g) for every j, every substring of h differs from the corresponding substring of g by strictly more than r' bits, i.e., by at least r' + 1 bits. Summing over the m substrings, the full Hamming distance between g and h is at least m(r' + 1) = r + 1 > r, which contradicts h ∈ N_r(g). This is the same pigeonhole argument as Proposition 1 in [22].
Proof of Proposition 2
Suppose that |(N_1(g) ∪ ... ∪ N_m(g)) \ N_r(g)| > 0; then there is at least one binary code h satisfying h ∈ N_j(g) for some j and h ∉ N_r(g). Since h is not an r-neighbor of the query g, the codes g and h differ by at least r + 1 bits. Since h ∈ N_j(g), we have d_j ≤ r'. By the assumption floor(d/m) ≤ d_j, we get floor(d/m) ≤ r', and thus d ≤ m r' + m - 1 = r. Hence h ∈ N_r(g) by the definition of N_r(g), which contradicts the premise.
References
 [1] T.-S. Chua, J. Tang, R. Hong, H. Li, Z. Luo, and Y.-T. Zheng. NUS-WIDE: A real-world web image database from National University of Singapore. In CIVR, Santorini, Greece, July 8-10, 2009.
 [2] A. Gionis, P. Indyk, and R. Motwani. Similarity search in high dimensions via hashing. In VLDB, pages 518–529, 1999.
 [3] Y. Gong and S. Lazebnik. Iterative quantization: A procrustean approach to learning binary codes. In CVPR, pages 817–824, 2011.
 [4] D. Greene, M. Parnas, and F. Yao. Multi-index hashing for information retrieval. In FOCS, 1994.
 [5] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR, pages 770–778, 2016.
 [6] Y. Jia. Caffe: An open source convolutional architecture for fast feature embedding. http://caffe.berkeleyvision.org, 2013.
 [7] Q.-Y. Jiang and W.-J. Li. Deep cross-modal hashing. In CVPR, 2017.
 [8] A. Krizhevsky, I. Sutskever, and G. Hinton. Imagenet classification with deep convolutional neural networks. In NIPS, pages 1106–1114, 2012.
 [9] B. Kulis and T. Darrell. Learning to hash with binary reconstructive embeddings. In NIPS, pages 1042–1050, 2009.
 [10] B. Kulis and K. Grauman. Kernelized locality-sensitive hashing for scalable image search. In ICCV, pages 2130–2137, 2009.
 [11] H. Lai, Y. Pan, Y. Liu, and S. Yan. Simultaneous feature learning and hash coding with deep neural networks. In CVPR, pages 3270–3278, 2015.
 [12] W. Li. Feature learning based deep supervised hashing with pairwise labels. In IJCAI, pages 3485–3492, 2016.
 [13] K. Lin, H.F. Yang, J.H. Hsiao, and C.S. Chen. Deep learning of binary hash codes for fast image retrieval. In CVPR, pages 27–35, 2015.
 [14] H. Liu, R. Wang, S. Shan, and X. Chen. Deep supervised hashing for fast image retrieval. In CVPR, pages 2064–2072, 2016.
 [15] L. Liu, F. Shen, Y. Shen, X. Liu, and L. Shao. Deep sketch hashing: Fast free-hand sketch-based image retrieval. In CVPR, 2017.
 [16] Q. Liu, H. Xie, Y. Liu, C. Zhang, and L. Guo. Data-oriented multi-index hashing. In ICME, pages 1–6. IEEE, 2015.
 [17] W. Liu, J. Wang, R. Ji, Y.G. Jiang, and S.F. Chang. Supervised hashing with kernels. In CVPR, pages 2074–2081, 2012.
 [18] W. Liu, J. Wang, S. Kumar, and S.F. Chang. Hashing with graphs. In ICML, pages 1–8, 2011.
 [19] D. Mandal, K. Chaudhury, and S. Biswas. Generalized semantic preserving hashing for n-label cross-modal retrieval. In CVPR, 2017.
 [20] Y. Netzer, T. Wang, A. Coates, A. Bissacco, B. Wu, and A. Y. Ng. Reading digits in natural images with unsupervised feature learning. In NIPS, volume 2011, page 5, 2011.
 [21] M. Norouzi and D. M. Blei. Minimal loss hashing for compact binary codes. In ICML, pages 353–360, 2011.
 [22] M. Norouzi, A. Punjani, and D. J. Fleet. Fast exact search in Hamming space with multi-index hashing. TPAMI, 36(6):1107–1119, 2014.
 [23] E.J. Ong and M. Bober. Improved hamming distance search using variable length substrings. In CVPR, pages 2000–2008, 2016.
 [24] R. Salakhutdinov and G. Hinton. Learning a nonlinear embedding by preserving class neighbourhood structure. In AISTATS, pages 412–419, 2007.
 [25] K. Simonyan and A. Zisserman. Very deep convolutional networks for largescale image recognition. arXiv preprint arXiv:1409.1556, 2014.
 [26] J. Song, H. Shen, J. Wang, Z. Huang, N. Sebe, and J. Wang. A distancecomputationfree search scheme for binary code databases. IEEE Transactions on Multimedia, 18(3):484–495, 2016.
 [27] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. arXiv preprint arXiv:1409.4842, 2014.
 [28] J. Wan, S. Tang, Y. Zhang, L. Huang, and J. Li. Data driven multi-index hashing. In ICIP, pages 2670–2673. IEEE, 2013.
 [29] J. Wang, S. Kumar, and S.-F. Chang. Semi-supervised hashing for scalable image retrieval. In CVPR, pages 3424–3431, 2010.
 [30] J. Wang, T. Zhang, N. Sebe, et al. A survey on learning to hash. TPAMI, 2017.
 [31] M. Wang, X. Feng, and J. Cui. Multi-index hashing with repeat-bits in Hamming space. In FSKD, pages 1307–1313. IEEE, 2015.
 [32] Y. Weiss, A. Torralba, and R. Fergus. Spectral hashing. In NIPS, pages 1753–1760, 2008.
 [33] R. Xia, Y. Pan, H. Lai, C. Liu, and S. Yan. Supervised hashing for image retrieval via image representation learning. In AAAI, pages 2156–2162, 2014.
 [34] H.F. Yang, K. Lin, and C.S. Chen. Supervised learning of semanticspreserving hashing via deep neural networks for largescale image search. arXiv preprint arXiv:1507.00101, 2015.
 [35] R. Zhang, L. Lin, R. Zhang, W. Zuo, and L. Zhang. Bitscalable deep hashing with regularized similarity learning for image retrieval and person reidentification. TIP, 24(12):4766–4779, 2015.
 [36] Z. Zhang, Y. Chen, and V. Saligrama. Efficient training of very deep neural networks for supervised hashing. In CVPR, pages 1487–1495, 2016.
 [37] F. Zhao, Y. Huang, L. Wang, and T. Tan. Deep semantic ranking based hashing for multilabel image retrieval. arXiv preprint arXiv:1501.06272, 2015.
 [38] B. Zhuang, G. Lin, C. Shen, and I. Reid. Fast training of tripletbased deep binary embedding networks. In CVPR, pages 5955–5964, 2016.