1 Introduction
Person re-identification (ReID) DBLP:conf/eccv/SunZYTW18 ; DBLP:conf/cvpr/ChangHX18 ; DBLP:conf/nips/GeLZYYWL18 has attracted much attention in computer vision. For a given probe person, the goal of ReID is to retrieve (search) the gallery set for pedestrian images containing the same individual in a cross-camera mode. Recently, ReID has been widely used in many real applications, including content-based video retrieval and video surveillance.
Existing ReID methods can be divided into two main categories DBLP:journals/corr/ZhengYH16 . One category, which covers most early ReID approaches DBLP:conf/cvpr/KostingerHWRB12 ; DBLP:conf/cvpr/ZhaoOW14 ; DBLP:conf/cvpr/LiaoHZL15 , uses hand-crafted features to represent person images. The other category DBLP:conf/mm/WangYCLZ18 ; DBLP:conf/eccv/SunZYTW18 ; DBLP:conf/cvpr/ChangHX18 adopts deep learning architectures to extract features. Most of these existing methods, both deep and non-deep, represent person images as real-valued features. This real-valued representation makes ReID inefficient when the gallery set is extremely large, because of the high computation and storage cost during the retrieval (search) procedure.
Recently, hashing DBLP:conf/nips/WeissTF08 ; DBLP:conf/nips/LiSHT17 ; DBLP:conf/nips/SuZHT18 ; DBLP:conf/nips/LiuMKC14 ; DBLP:conf/icml/LiLSHD13 ; DBLP:conf/icml/YuKGC14 ; DBLP:conf/icml/LiuWKC11 ; DBLP:conf/icml/DaiGKHS17 ; DBLP:conf/icml/NorouziF11 ; DBLP:conf/icml/WangKC10 has been introduced into the ReID community for efficiency improvement due to its low storage cost and fast query speed. The goal of hashing is to embed data points into a Hamming space of binary codes where the similarity in the original space is preserved. Several hashing methods have been proposed for ReID DBLP:journals/tip/ZhangLZZZ15 ; DBLP:conf/ijcai/ZhengS16 ; DBLP:conf/cvpr/ChenWQLS17 ; DBLP:journals/tip/ZhuKZFT17 . In DBLP:journals/tip/ZhangLZZZ15 , deep regularized similarity comparison hashing (DRSCH) was designed by combining a triplet-based formulation with bit-scalable binary code generation. In DBLP:conf/ijcai/ZhengS16 , cross-view binary identities (CBI) were learned by constructing two sets of discriminative hash functions. In DBLP:conf/cvpr/ChenWQLS17 , cross-camera semantic binary transformation (CSBT) employed subspace projection to mitigate cross-camera variations. In DBLP:journals/tip/ZhuKZFT17 , part-based deep hashing (PDH) was proposed to incorporate a triplet-based formulation and image partitions to learn part-based binary codes. Among these methods, CBI and CSBT focus on designing models to learn binary codes from hand-crafted features. DRSCH and PDH are deep hashing methods which try to integrate deep feature learning and hash code learning into an end-to-end framework. Recent efforts DBLP:conf/nips/LiSHT17 ; DBLP:conf/nips/SuZHT18 ; DBLP:journals/tip/ZhuKZFT17 show that deep hashing methods can achieve better performance than hashing methods based on hand-crafted features.
However, existing hashing methods usually cannot achieve satisfactory performance for ReID. Exhaustive linear search based on Hamming ranking cannot handle large-scale datasets. More specifically, although one can adopt hash lookup to achieve sub-linear query speed, existing methods DBLP:journals/tip/ZhuKZFT17 ; DBLP:conf/cvpr/ChenWQLS17 usually need long binary codes to achieve reasonable accuracy due to the high complexity of ReID. In this situation, retrieval becomes extremely slow because the number of hash bins that need to be probed increases exponentially with the code length. Hence, although existing hashing methods can be faster than traditional real-valued ReID methods, they typically sacrifice accuracy because the binary code cannot be too long. Furthermore, the efficiency of existing hashing methods is still not high enough.
In this paper, we propose a novel hashing method, called deep multi-index hashing (DMIH), to improve both retrieval efficiency and accuracy for ReID. Our main contributions are summarized as follows. 1) DMIH seamlessly integrates multi-index hashing (MIH) DBLP:journals/pami/0002PF14 and multi-branch networks into the same framework. In DMIH, the feature learning procedure and the hash code learning procedure can facilitate each other. To the best of our knowledge, DMIH is the first hashing-based ReID method to integrate multi-index hashing and deep feature learning into the same framework. 2) In DMIH, a novel block-wise multi-index hashing table construction approach and a search-aware multi-index (SAMI) loss are proposed to improve the retrieval efficiency. 3) Experiments on three widely used datasets show that DMIH can outperform other state-of-the-art baselines, including both hashing methods and real-valued methods, in terms of both efficiency and accuracy.
2 Related Work
Multi-Index Hashing
In real applications, hash lookup with long binary codes suffers from low retrieval speed because of the large number of hash bins that need to be probed. Multi-index hashing (MIH) DBLP:journals/pami/0002PF14 , which enables efficient nearest neighbor search for long codes (here, nearest neighbors are defined based on Hamming distance DBLP:journals/pami/0002PF14 ), was proposed to deal with this situation. MIH divides the long binary codes into several disjoint but consecutive sub-binary codes and builds multiple hash tables on the shorter code substrings, which can enormously reduce the number of hash bins to be probed and improve search efficiency.
However, MIH assumes that the binary codes are distributed in a balanced way across the sub-binary codes, which is usually not satisfied in real applications DBLP:conf/mm/ZhangGZL11 . As a result, the time performance of MIH degrades when dealing with unbalanced codes. Our method learns to adjust the distribution of binary codes by minimizing the SAMI loss to enhance the time performance of MIH.
Multi-Branch Architectures
Multi-branch networks DBLP:conf/cvpr/SzegedyLJSRAEVR15 ; DBLP:conf/cvpr/HeZRS16 have been widely exploited in computer vision tasks. Recently, “grouped convolution” DBLP:conf/cvpr/XieGDTH17 ; DBLP:conf/cvpr/HuangLMW18 has been proposed to construct multi-branch architectures. These building blocks can achieve stronger modeling capacity. In ReID, due to cross-camera variations, partial (local) information is important for improving discriminative performance. Multi-branch networks have been used to learn discriminative information at various granularities in previous works DBLP:journals/tip/ZhuKZFT17 ; DBLP:conf/mm/WangYCLZ18 ; DBLP:conf/eccv/SunZYTW18 ; DBLP:conf/cvpr/ChangHX18 .
3 Notation and Problem Definition
3.1 Notation
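We use boldface lowercase letters like $\mathbf{w}$ to denote vectors and boldface uppercase letters like $\mathbf{W}$ to denote matrices. $\|\mathbf{w}\|_2$ denotes the $L_2$ norm for the vector $\mathbf{w}$. $[x]_+$ is defined as $\max\{0, x\}$. For an integer $m$, we use $[m]$ to denote the set $\{1, 2, \dots, m\}$. $\mathrm{sign}(\cdot)$ is an element-wise sign function where $\mathrm{sign}(x) = 1$ if $x \geq 0$ else $\mathrm{sign}(x) = -1$. Furthermore, $d_H(\mathbf{b}_i, \mathbf{b}_j)$ denotes the Hamming distance between two binary vectors $\mathbf{b}_i$ and $\mathbf{b}_j$, i.e., $d_H(\mathbf{b}_i, \mathbf{b}_j) = (L - \mathbf{b}_i^\top \mathbf{b}_j)/2$. Here, $L$ is the code length of $\mathbf{b}_i$ and $\mathbf{b}_j$.
3.2 Hashing-based ReID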
Assume that we have $n$ training samples, which are denoted as $\mathbf{X} = \{\mathbf{x}_i\}_{i=1}^{n}$. Furthermore, the person identities of the images are also available and denoted as $\mathbf{y} = \{y_i\}_{i=1}^{n}$ with $y_i \in [C]$, where $C$ denotes the number of persons in the training set. Our target is to learn a deep hash function $H(\mathbf{x}) \in \{-1, +1\}^{L}$, which can transform person images into binary codes with $L$ bits.
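As a quick numerical check of the notation, the sketch below binarizes two real-valued vectors with an element-wise sign function and confirms that, for codes in {-1, +1}, counting mismatched bits agrees with the inner-product form (L - b_i^T b_j)/2 of the Hamming distance. The convention sign(0) = +1, the function names and the example vectors are assumptions made for illustration only.

```python
def sign(values):
    """Element-wise sign with the assumed convention sign(0) = +1."""
    return [1 if v >= 0 else -1 for v in values]


def hamming_by_count(b_i, b_j):
    """Hamming distance as the number of positions where the codes differ."""
    return sum(1 for u, v in zip(b_i, b_j) if u != v)


def hamming_by_inner_product(b_i, b_j):
    """Hamming distance of two {-1, +1} codes via (L - <b_i, b_j>) / 2."""
    code_length = len(b_i)
    inner = sum(u * v for u, v in zip(b_i, b_j))
    return (code_length - inner) // 2


# Hypothetical real-valued outputs of a hash layer for two images.
real_i = [0.7, -1.2, 0.0, 3.1, -0.4, 2.2]
real_j = [0.1, 0.9, -0.3, 2.0, -5.0, -0.6]

b_i = sign(real_i)  # [1, -1, 1, 1, -1, 1]
b_j = sign(real_j)  # [1, 1, -1, 1, -1, -1]

print(hamming_by_count(b_i, b_j))          # 3
print(hamming_by_inner_product(b_i, b_j))  # 3
```

The inner-product form is the one that matters for learning, because it stays well defined when the binary codes are later replaced by continuous surrogates.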
4 Deep Multi-Index Hashing for ReID
4.1 Model
The DMIH model is illustrated in Figure 1. It is an end-to-end deep learning framework containing two components, i.e., a multi-branch network part and a binary code learning part. Furthermore, a novel block-wise multi-index hashing table construction approach and a novel search-aware multi-index (SAMI) loss are developed to improve search efficiency.
Multi-Branch Network Part
As a variety of deep methods for ReID DBLP:journals/tip/ZhuKZFT17 ; DBLP:conf/mm/WangYCLZ18 ; DBLP:conf/eccv/SunZYTW18 ; DBLP:conf/cvpr/ChangHX18 have demonstrated that networks based on multi-branch architectures can learn more discriminative features, we adopt the multi-granularity network (MGN) architecture DBLP:conf/mm/WangYCLZ18 as the feature learning part of DMIH. This architecture integrates global and local features to get more powerful pedestrian representations. The MGN architecture is shown in the left part of Figure 1, which contains a ResNet-50 DBLP:conf/cvpr/HeZRS16 network and three branches. The upper branch, without any partition information, learns the global feature representations. The middle and lower branches uniformly split feature maps into several horizontal stripes to learn the local feature representations. For a given pedestrian image $\mathbf{x}_i$, the outputs of the three branches together contain the global and local feature representations. Please note that DMIH is general enough to adopt other multi-branch architectures, since our objective is to improve the retrieval efficiency and accuracy rather than to design a new multi-branch building block. In other words, our method is an extensible learning algorithm that is independent of specific network architectures.
Binary Codes Learning Part
The principle of binary code learning is to preserve the similarity of samples. We use a triplet-based loss to achieve this goal, which has proved effective in deep ReID tasks DBLP:journals/tip/ZhangLZZZ15 ; DBLP:journals/tip/ZhuKZFT17 ; DBLP:conf/mm/WangYCLZ18 . Specifically, for the $i$-th input $\mathbf{x}_i$, we add a fully-connected layer after each branch of the network as a hash layer to project the global and local features into real-valued vectors. Then we apply the $\mathrm{sign}(\cdot)$ function to these vectors to get the corresponding binary codes.
A triplet loss function is then imposed on these binary codes. For a mini-batch, the loss is formulated over the binary codes generated from anchor, positive and negative samples, together with a margin hyperparameter. Here, a pedestrian image with the same (different) identity as the anchor is a positive (negative) sample. For each pedestrian image in a mini-batch, we treat it as an anchor and build the corresponding triplet input by choosing the furthest positive and the closest negative samples in the same batch. This batch-hard variant of the triplet loss enhances the robustness of metric learning DBLP:journals/corr/HermansBL17 and improves the accuracy at the same time. Then we can get the following total triplet loss function:
(1) 
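The batch-hard sampling rule described above (furthest positive, closest negative, hinge with a margin) can be sketched as follows. This is only an illustration under stated assumptions: it measures distance with the Hamming distance on {-1, +1} codes, it handles a single set of codes rather than one per branch, and the function names, toy codes and margin values are hypothetical. It is not a reconstruction of the total loss in (1).

```python
def hamming(b_i, b_j):
    """Number of positions where two equal-length codes differ."""
    return sum(1 for u, v in zip(b_i, b_j) if u != v)


def batch_hard_triplet_loss(codes, identities, margin):
    """Sum over anchors of [margin + hardest positive - hardest negative]_+."""
    total = 0.0
    for a, (code_a, id_a) in enumerate(zip(codes, identities)):
        # Distances to all other samples that share the anchor's identity.
        positives = [hamming(code_a, c)
                     for k, (c, y) in enumerate(zip(codes, identities))
                     if y == id_a and k != a]
        # Distances to all samples with a different identity.
        negatives = [hamming(code_a, c)
                     for c, y in zip(codes, identities) if y != id_a]
        if not positives or not negatives:
            continue  # no valid triplet can be formed for this anchor
        total += max(0.0, margin + max(positives) - min(negatives))
    return total


# Toy mini-batch: two identities with two images each (hypothetical codes).
codes = [[1, 1, 1, 1], [1, 1, 1, -1], [-1, -1, -1, -1], [-1, -1, 1, 1]]
identities = [0, 0, 1, 1]

loss_margin_1 = batch_hard_triplet_loss(codes, identities, margin=1)
loss_margin_2 = batch_hard_triplet_loss(codes, identities, margin=2)
print(loss_margin_1, loss_margin_2)  # 1.0 4.0
```

In training, the Hamming distance would have to be replaced by a differentiable surrogate on relaxed codes, since the mismatch count itself provides no gradient.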
In order to learn more discriminative binary codes, we explore cross entropy loss for classification on the outputs of each branch. Specifically, we utilize the formula: .
Then we can get the total classification loss function:
(2) 
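Multi-Index Hashing Tables Construction
MIH supposes that each binary code $\mathbf{b}$ with $L$ bits is partitioned into $m$ disjoint, equal-length sub-binary codes $\mathbf{b}^{(1)}, \dots, \mathbf{b}^{(m)}$. Given a query code $\mathbf{q}$, we aim to find all binary codes whose Hamming distance from $\mathbf{q}$ is at most $r$. We call them $r$-neighbors. Let $r' = \lfloor r/m \rfloor$ and $a = r - m r'$. According to Proposition 1, proved in DBLP:journals/pami/0002PF14 , we only need to search the first $a+1$ hash tables at radius $r'$ and the remaining $m-(a+1)$ hash tables at radius $r'-1$ to construct a candidate set when performing retrieval for a given query. After that, we remove the points that are not $r$-neighbors from the candidate set by measuring the full Hamming distance.
Proposition 1.
If $d_H(\mathbf{b}, \mathbf{q}) \leq r = m r' + a$, then $\exists\, 1 \leq t \leq a+1$ such that $d_H(\mathbf{b}^{(t)}, \mathbf{q}^{(t)}) \leq r'$, or $\exists\, a+1 < t \leq m$ such that $d_H(\mathbf{b}^{(t)}, \mathbf{q}^{(t)}) \leq r'-1$.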
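The search rule of multi-index hashing can be sketched in a few lines: with m tables and radius r, write r = m * r' + a, probe the first a + 1 tables at radius r' and the remaining tables at radius r' - 1, then filter the candidates by the full Hamming distance. The sketch below is a minimal pure-Python illustration, not the C++ implementation used in the experiments; it stores codes as tuples of 0/1 bits and assumes the code length is divisible by m. All names and the toy database are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def hamming(code_a, code_b):
    return sum(1 for u, v in zip(code_a, code_b) if u != v)


def split(code, m):
    """Cut a code into m consecutive substrings of equal length."""
    size = len(code) // m
    return [tuple(code[t * size:(t + 1) * size]) for t in range(m)]


def build_tables(codes, m):
    """One hash table per substring position: substring -> list of item ids."""
    tables = [defaultdict(list) for _ in range(m)]
    for idx, code in enumerate(codes):
        for t, sub in enumerate(split(code, m)):
            tables[t][sub].append(idx)
    return tables


def ball(sub, radius):
    """Yield every bit string within Hamming distance `radius` of `sub`."""
    for k in range(radius + 1):  # empty when radius is negative
        for flips in combinations(range(len(sub)), k):
            out = list(sub)
            for f in flips:
                out[f] ^= 1
            yield tuple(out)


def mih_search(query, codes, tables, r):
    """Return the ids of all codes within Hamming distance r of the query."""
    m = len(tables)
    r_sub, a = divmod(r, m)  # r = m * r_sub + a
    candidates = set()
    for t, q_sub in enumerate(split(query, m)):
        radius = r_sub if t <= a else r_sub - 1  # first a + 1 tables use r_sub
        for probe in ball(q_sub, radius):
            candidates.update(tables[t].get(probe, ()))
    # Remove false candidates by measuring the full Hamming distance.
    return sorted(i for i in candidates if hamming(codes[i], query) <= r)


codes = [
    (0, 0, 0, 0, 0, 0, 0, 0),
    (0, 0, 0, 1, 0, 0, 0, 0),
    (1, 1, 0, 0, 0, 0, 1, 1),
    (1, 1, 1, 1, 1, 1, 1, 1),
    (0, 1, 0, 1, 1, 0, 0, 0),
]
tables = build_tables(codes, m=2)
query = (0, 0, 0, 0, 0, 0, 0, 1)
neighbors = mih_search(query, codes, tables, r=2)
print(neighbors)  # [0, 1]
```

The final filtering step is where false candidates cost time, which is the quantity a search-aware training objective tries to reduce.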
Once we get the learned binary codes, one way to construct MIH tables is to divide the total binary code, i.e., the concatenation of all the learned binary codes, directly into $m$ hash tables. By doing so, each hash table might contain binary codes of only a single granularity, which leads to a large difference between hash tables. To alleviate this problem, we design a novel block-wise MIH table construction strategy, which is shown in Figure 2. Specifically, we divide each learned binary code into $m$ disjoint sub-binary codes separately. Then we concatenate all the $t$-th sub-binary codes to construct the code for the $t$-th hash table. That is to say, this block-wise partition strategy ensures that each hash table contains multi-granularity binary codes.
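The difference between the direct partition and the block-wise partition can be made concrete with a small sketch. Below, each branch code is a list of labeled bits so that the origin of every bit stays visible. The function names, the two-branch example and the assumption that every branch code length is divisible by the number of tables are illustrative choices, not details taken from the paper.

```python
def split_evenly(code, m):
    """Cut a code into m consecutive pieces of equal length."""
    size = len(code) // m
    return [code[j * size:(j + 1) * size] for j in range(m)]


def naive_table_codes(branch_codes, m):
    """Concatenate all branch codes first, then cut the result into m pieces."""
    total = [bit for code in branch_codes for bit in code]
    return split_evenly(total, m)


def blockwise_table_codes(branch_codes, m):
    """Cut every branch code into m pieces, then join the j-th pieces."""
    pieces = [split_evenly(code, m) for code in branch_codes]
    return [[bit for branch in pieces for bit in branch[j]] for j in range(m)]


# Labeled bits: "g" marks a global-branch bit, "p" a part-branch bit.
branch_codes = [["g1", "g2", "g3", "g4"], ["p1", "p2", "p3", "p4"]]

naive = naive_table_codes(branch_codes, m=2)
blockwise = blockwise_table_codes(branch_codes, m=2)

print(naive)      # [['g1', 'g2', 'g3', 'g4'], ['p1', 'p2', 'p3', 'p4']]
print(blockwise)  # [['g1', 'g2', 'p1', 'p2'], ['g3', 'g4', 'p3', 'p4']]
```

With the direct partition, the first table sees only global bits and the second only part bits; with the block-wise partition, every table sees bits from every branch.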
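Search-Aware Multi-Index Loss
Based on the retrieval procedure of MIH tables, we propose a novel loss, called the search-aware multi-index (SAMI) loss, to give feedback to the training procedure. First, we define the binary codes for $\mathbf{x}_i$ and $\mathbf{x}_j$ as $\mathbf{b}_i$ and $\mathbf{b}_j$, respectively, and we use $\mathbf{b}_i^{(t)}$ and $\mathbf{b}_j^{(t)}$ to denote the corresponding sub-binary codes in the $t$-th hash table. Then we define the Hamming distance between $\mathbf{b}_i^{(t)}$ and $\mathbf{b}_j^{(t)}$ as $d_H(\mathbf{b}_i^{(t)}, \mathbf{b}_j^{(t)})$. As the first hash tables are searched first, we hope that the sub-code Hamming distance in an earlier-searched hash table is larger than that in a later-searched hash table as much as possible. Then we can get the following SAMI loss function: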
(3) 
By minimizing the above loss function, we can avoid the situation where false data points are added to the candidate set too early when we use MIH tables for retrieval. Here, false data points are those whose full Hamming distance to a given query point is larger than the radius we need to retrieve, but whose sub-binary codes are close in Hamming distance to the sub-binary codes of the query in the first-searched hash tables. In this way, we reduce the number of points that are not actually neighbors but are added to the candidate set. As a result, the time for measuring the full Hamming distance and removing the non-neighbor points is saved.
4.2 Learning
The objective function in (4) is NP-hard to optimize due to the binary constraint. One common approach to avoid this problem is to use a relaxation strategy DBLP:conf/ijcai/LiWK16 ; DBLP:conf/iccv/CaoLWY17 . In this paper, we also adopt this strategy. Specifically, we use a continuous function to approximate the $\mathrm{sign}(\cdot)$ function.
Then we can reformulate the problem in (4) as follows:
(5) 
where denote the continuous codes after relaxation.
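To illustrate what such a relaxation looks like, the sketch below uses tanh with a scale factor as the smooth surrogate for the sign function. Both the choice of tanh and the scale factor `beta` are assumptions made for this example; the text here does not preserve which surrogate the method uses. The point is only that the relaxed codes are differentiable, keep the sign of the input, and move toward {-1, +1} as the scale grows.

```python
import math


def sign(x):
    """Sign with the assumed convention sign(0) = +1."""
    return 1.0 if x >= 0 else -1.0


def relaxed_codes(values, beta=1.0):
    """Continuous surrogate for sign: tanh(beta * x), element-wise (assumed)."""
    return [math.tanh(beta * v) for v in values]


def quantization_gap(values, beta):
    """Largest absolute difference between relaxed and binary codes."""
    return max(abs(h - sign(v)) for h, v in zip(relaxed_codes(values, beta), values))


# Hypothetical hash-layer outputs for one image.
outputs = [0.8, -0.3, 2.5, -1.7]

gap_soft = quantization_gap(outputs, beta=1.0)
gap_sharp = quantization_gap(outputs, beta=10.0)
print(gap_soft > gap_sharp)  # True
```

At test time the relaxed codes are binarized with the sign function, so a smaller gap means less mismatch between what is optimized and what is indexed.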
5 Experiments
In this section, we conduct extensive evaluation of the proposed method on three widely used ReID datasets: Market-1501 DBLP:conf/iccv/ZhengSTWWT15 , DukeMTMC-ReID DBLP:conf/iccv/ZhengZY17 and CUHK03 DBLP:conf/cvpr/LiZXW14 , in a single-query mode. DMIH is implemented with PyTorch paszke2017automatic on an NVIDIA M40 GPU server. We use the C++ implementation of MIH provided by the authors of DBLP:journals/pami/0002PF14 (https://github.com/norouzi/mih) and conduct the retrieval experiments on a server with an Intel Core CPU (2.2GHz) and 96GB RAM.
5.1 Datasets
Market-1501
The Market-1501 dataset consists of 32,688 bounding boxes of 1,501 persons from 6 cameras. These bounding boxes are cropped by the deformable-part-model (DPM) detector DBLP:journals/pami/FelzenszwalbGMR10 . 12,936 images of 751 persons are selected from the dataset as the training set, and the remaining 750 persons form the test set with 3,368 query images and 19,732 gallery images.
DukeMTMC-ReID
The DukeMTMC-ReID dataset is a subset of the DukeMTMC dataset DBLP:conf/eccv/RistaniSZCT16 for image-based re-identification. It consists of 36,411 images of 1,812 persons from 8 high-resolution cameras. The whole dataset is divided into a training set with 16,522 images of 702 persons and a test set with 2,228 query images and 17,661 gallery images of the remaining 702 persons.
CUHK03
The CUHK03 dataset contains 14,097 images of 1,467 persons from 6 surveillance cameras. This dataset provides both manually labeled pedestrian bounding boxes and bounding boxes detected by the DPM detector. For this dataset, we choose the labeled images for evaluation. To be more consistent with real applications, we adopt the widely used DBLP:conf/mm/WangYCLZ18 ; DBLP:conf/cvpr/ChangHX18 ; DBLP:conf/eccv/SunZYTW18 protocol proposed in DBLP:conf/cvpr/ZhongZCL17 .
5.2 Experimental Setup
Baselines and Evaluation Protocol
Both hashing methods and real-valued methods are adopted as baselines for comparison. The state-of-the-art hashing methods for comparison include: 1) non-deep hashing methods: COSDISH DBLP:conf/aaai/KangLZ16 , SDH DBLP:conf/cvpr/ShenSLS15 , KSH DBLP:conf/cvpr/LiuWJJC12 , ITQ DBLP:conf/cvpr/GongL11 , LSH DBLP:conf/compgeom/DatarIIM04 ; 2) deep hashing methods: PDH DBLP:journals/tip/ZhuKZFT17 , HashNet DBLP:conf/iccv/CaoLWY17 , DPSH DBLP:conf/ijcai/LiWK16 . Among these baselines, PDH is designed specifically for ReID. DRSCH DBLP:journals/tip/ZhangLZZZ15 is not adopted for comparison because it has been found to be outperformed by PDH. The state-of-the-art real-valued ReID methods for comparison include: 1) metric learning methods: KISSME DBLP:conf/cvpr/KostingerHWRB12 ; 2) deep learning methods: pose-driven deep convolutional (PDC) DBLP:conf/iccv/SuLZX0T17 , Spindle DBLP:conf/cvpr/ZhaoTSSYYWT17 , MGN DBLP:conf/mm/WangYCLZ18 .
Following the standard evaluation protocol for ReID tasks DBLP:conf/mm/WangYCLZ18 , we report the mean average precision (mAP) and Cumulative Matching Characteristic (CMC) to verify the effectiveness of our proposed method. Furthermore, to verify that DMIH achieves both high accuracy and fast query speed, we adopt precision-time and recall-time curves DBLP:journals/corr/Cai16b ; DBLP:journals/corr/abs171106016 to evaluate DMIH and the baselines. Specifically, after constructing the MIH tables, we conduct nearest neighbor search by performing multi-index hash lookup under different settings and then re-rank the selected nearest neighbors according to the corresponding deep features (2048-dim) before the hash layer. Based on the re-ranking results, we choose the top-20 nearest neighbors that have the minimum Euclidean distance to the query image and then calculate the precision and recall. Finally, we sum the time of multi-index hash lookup and re-ranking to draw the precision-time and recall-time curves DBLP:journals/corr/Cai16b ; DBLP:journals/corr/abs171106016 .
Implementation Details
For DMIH, we set and for all the experiments based on crossvalidation strategy. We use Adam algorithm DBLP:journals/corr/KingmaB14 for learning and choose the learning rate from . The initial learning rate is set to and the weight decay parameter is set to
. The multibranch network based on ResNet50 is pretrained on ImageNet dataset
DBLP:conf/cvpr/DengDSLL009 . The input for the image modality is raw pixels with the size of . We fix the size of each mini-batch to be 64, which is made up of 16 pedestrians and 4 images for each pedestrian.
For non-deep hashing methods, we use two image features. The first one is the Local Maximal Occurrence (LOMO) feature DBLP:conf/cvpr/LiaoHZL15 . After getting the LOMO feature, we use PCA to reduce the dimensionality to 3,000. The second one is the deep feature extracted by ResNet-50 pre-trained on ImageNet. Among the hashing baselines, KSH and SDH are kernel-based methods. For these methods, 1,000 data points are randomly selected from the training set as anchors to construct kernels, following the suggestion of the original authors. For deep hashing methods, we adopt the same multi-branch network for a fair comparison. For all hashing-based ReID methods, other hyperparameters are set by following the suggestions of the corresponding authors. The source code is available for all baselines except PDH and MGN. We carefully re-implement PDH and MGN using PyTorch.
5.3 Accuracy
Comparison with Hashing Methods
We report the mAP on the three datasets in Table 1, where “COSDISH”/“COSDISH+CNN” denotes COSDISH with LOMO/deep features, respectively. Other notations are defined similarly. The CMC results are moved to the supplementary materials due to space limitations. From Table 1, we can see that DMIH outperforms all baselines, including the deep hashing-based ReID method, deep hashing methods and non-deep hashing methods, in all cases.
Table 1: mAP on the three datasets (M = Market-1501, D = DukeMTMC-ReID, C = CUHK03; the number is the code length in bits).

| Method | M 32 | M 64 | M 96 | M 128 | D 32 | D 64 | D 96 | D 128 | C 32 | C 64 | C 96 | C 128 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DMIH | 31.41 | 49.80 | 58.14 | 62.24 | 24.29 | 40.88 | 47.70 | 52.35 | 24.95 | 40.28 | 44.82 | 48.75 |
| PDH | 24.75 | 38.73 | 46.07 | 50.76 | 16.30 | 30.79 | 39.16 | 43.48 | 18.72 | 31.27 | 35.10 | 41.02 |
| HashNet | 13.10 | 22.23 | 25.53 | 26.26 | 8.10 | 13.53 | 15.74 | 18.41 | 12.79 | 16.39 | 17.83 | 18.27 |
| DPSH | 12.34 | 20.27 | 24.68 | 29.42 | 12.87 | 20.02 | 26.45 | 29.42 | 11.22 | 16.11 | 19.72 | 20.82 |
| COSDISH | 1.89 | 3.68 | 4.83 | 5.94 | 1.02 | 2.39 | 3.81 | 5.11 | 0.82 | 1.54 | 2.59 | 3.01 |
| SDH | 1.65 | 2.93 | 3.78 | 4.06 | 0.98 | 1.89 | 2.25 | 2.42 | 1.00 | 1.24 | 1.32 | 1.65 |
| KSH | 4.66 | 5.62 | 6.16 | 6.20 | 2.13 | 2.67 | 3.31 | 3.34 | 2.86 | 2.53 | 2.11 | 1.75 |
| ITQ | 1.70 | 3.00 | 3.83 | 4.43 | 0.91 | 1.41 | 1.77 | 2.16 | 0.68 | 0.76 | 0.82 | 0.95 |
| LSH | 0.44 | 0.83 | 1.18 | 1.68 | 0.40 | 0.58 | 0.83 | 1.06 | 0.37 | 0.46 | 0.44 | 0.68 |
| COSDISH+CNN | 0.79 | 1.06 | 1.47 | 1.82 | 0.62 | 1.09 | 1.42 | 1.79 | 0.39 | 0.57 | 0.63 | 0.62 |
| SDH+CNN | 0.73 | 1.26 | 1.55 | 1.67 | 0.66 | 0.89 | 1.06 | 1.40 | 0.44 | 0.65 | 0.63 | 0.63 |
| KSH+CNN | 0.77 | 0.74 | 0.54 | 0.68 | 0.30 | 0.37 | 0.44 | 0.46 | 0.49 | 0.41 | 0.33 | 0.41 |
| ITQ+CNN | 0.77 | 1.07 | 1.21 | 1.29 | 0.56 | 0.92 | 1.21 | 1.34 | 0.38 | 0.38 | 0.43 | 0.45 |
| LSH+CNN | 0.50 | 0.77 | 1.04 | 1.27 | 0.48 | 0.74 | 0.99 | 1.17 | 0.33 | 0.35 | 0.35 | 0.42 |
Furthermore, we also present the precision-time curves on the three datasets in Figure 3. Based on the mAP results in Table 1, the non-deep hashing methods use LOMO features here. The recall-time curves and more precision-time curves with other code lengths are moved to the supplementary materials due to space limitations. From Figure 3, we can find that DMIH achieves the highest precision among all the hashing methods while costing less time in all cases. Hence, DMIH can significantly outperform existing non-deep hashing methods and deep hashing methods in terms of both efficiency and accuracy. In addition, we find that the methods with higher mAP or CMC do not necessarily have better precision-time and recall-time curves. That is to say, using only mAP and CMC to evaluate hashing methods may not be comprehensive. So we adopt mAP, CMC, precision-time and recall-time to comprehensively verify the efficiency and accuracy of our method.
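For readers less familiar with the two ranking metrics, the following sketch computes average precision and CMC@k for a single query from a ranked list of gallery identities. It is a simplified illustration: it omits the same-camera and junk-image filtering that standard ReID evaluation protocols apply, and the function names and toy ranking are hypothetical. mAP is the mean of the average precision over all queries.

```python
def average_precision(ranked_ids, query_id):
    """Mean of the precision values at the ranks of the correct matches."""
    hits = 0
    precisions = []
    for rank, gallery_id in enumerate(ranked_ids, start=1):
        if gallery_id == query_id:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def cmc_at_k(ranked_ids, query_id, k):
    """1.0 if a correct match appears among the top-k results, else 0.0."""
    return 1.0 if query_id in ranked_ids[:k] else 0.0


# Hypothetical ranked gallery identities for one query (best match first).
ranked_ids = [7, 3, 7, 5, 7]

ap_for_7 = average_precision(ranked_ids, query_id=7)  # (1/1 + 2/3 + 3/5) / 3
cmc1_for_3 = cmc_at_k(ranked_ids, query_id=3, k=1)    # 0.0
cmc2_for_3 = cmc_at_k(ranked_ids, query_id=3, k=2)    # 1.0
print(round(ap_for_7, 4), cmc1_for_3, cmc2_for_3)
```

Neither metric accounts for how long the ranking took to produce, which is why time-based curves are reported alongside them.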
Comparison with Real-Valued ReID Methods
We also compare our DMIH with some representative real-valued ReID methods. As the dimension of the real-valued features is usually high, we increase the binary code length of DMIH for a fair comparison. We report CMC@20 and the corresponding retrieval time in Table 2, where the results of BoW+KISSME DBLP:conf/cvpr/KostingerHWRB12 , Spindle DBLP:conf/cvpr/ZhaoTSSYYWT17 and PDC DBLP:conf/iccv/SuLZX0T17 are directly copied from the original papers and “–” denotes that the result of the corresponding setting is not reported in the original papers. “DMIH(RK) (32 bits)” denotes the DMIH method with 32 bits and re-ranking after hash lookup. Other variants of DMIH are named similarly. The retrieval time includes the time for both hash lookup and re-ranking. Furthermore, we also report the speedup of DMIH relative to the best baseline MGN.
From Table 2, we can see that DMIH can still achieve the best performance in all cases when compared with state-of-the-art real-valued ReID methods. In particular, DMIH can outperform the best real-valued baseline MGN in terms of both accuracy and efficiency with a suitable code length. We can also find that DMIH can achieve higher accuracy by increasing the binary code length. However, a longer binary code typically leads to worse retrieval efficiency. In real applications, one can choose a proper binary code length to get a good trade-off between efficiency and accuracy.
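Table 2: CMC@20, retrieval time and speedup relative to MGN (M = Market-1501, D = DukeMTMC-ReID, C = CUHK03).

| Method | M CMC@20 | M Time | M Speedup | D CMC@20 | D Time | D Speedup | C CMC@20 | C Time | C Speedup |
|---|---|---|---|---|---|---|---|---|---|
| DMIH(RK) (32 bits) | 97.4 | 2.34s | 48 | 91.8 | 1.92s | 43 | 85.6 | 0.48s | 32 |
| DMIH(RK) (96 bits) | 98.3 | 4.16s | 27 | 94.3 | 3.44s | 24 | 92.1 | 0.93s | 17 |
| DMIH(RK) (256 bits) | 98.4 | 7.66s | 15 | 94.8 | 5.39s | 15 | 92.9 | 1.54s | 10 |
| DMIH(RK) (512 bits) | 98.6 | 11.26s | 10 | 95.4 | 9.07s | 9 | 93.2 | 2.17s | 7 |
| DMIH(RK) (1024 bits) | 98.8 | 14.34s | 8 | 95.8 | 11.00s | 8 | 93.4 | 2.75s | 6 |
| DMIH(RK) (2048 bits) | 98.7 | 20.12s | 6 | 96.3 | 15.86s | 5 | 93.5 | 4.19s | 4 |
| BoW+KISSME DBLP:conf/cvpr/KostingerHWRB12 | 78.5 | – | – | – | – | – | – | – | – |
| Spindle DBLP:conf/cvpr/ZhaoTSSYYWT17 | 96.7 | – | – | – | – | – | – | – | – |
| PDC DBLP:conf/iccv/SuLZX0T17 | 96.8 | – | – | – | – | – | – | – | – |
| MGN DBLP:conf/mm/WangYCLZ18 | 98.7 | 113.26s | 1 | 95.9 | 82.79s | 1 | 92.2 | 15.73s | 1 |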
5.4 Ablation Study
We conduct experiments to study whether all the loss terms in DMIH are necessary by removing these loss terms separately. The result on Market1501 dataset with 32 bits and 96 bits is presented in Figure 5. Here, “DMIH/SAMI” denotes the DMIH variant without the SAMI loss , i.e., , and other notations are defined similarly. We can find that softmax loss and triplet loss can significantly improve accuracy. Furthermore, by comparing DMIH with DMIH/SAMI, we can find that SAMI loss can further accelerate the query speed without losing accuracy. As the time shown in Figure 5 contains the hash table lookup time and reranking time, we compare the hash lookup time separately of DMIH with DMIH/SAMI in Figure 5. From Figure 5, we can see that DMIH can achieve times acceleration of hash lookup with the help of SAMI loss.
5.5 Effect of Block-Wise MIH Table Construction
Figure panels: (a) Market-1501 @ 32, 96 bits; (b) Market-1501 @ 32, 96 bits.
To verify the effectiveness of the proposed block-wise MIH table construction strategy, we compare DMIH with a variant without this strategy. More specifically, for the variant, we directly divide the learned binary code into sub-binary codes to construct the MIH tables. For a fair comparison, we use the same learned binary code for both DMIH and the variant. The precision-time and recall-time curves on the Market-1501 dataset with binary code lengths of 32 bits and 96 bits are shown in Figure 6, where “DMIH(32)/BW” denotes the variant without the block-wise MIH construction strategy and other notations are defined similarly. From Figure 6, we can see that the block-wise strategy yields a speedup in retrieval without losing accuracy.
6 Conclusion
In this paper, we propose a novel deep hashing method, called DMIH, for person ReID. DMIH is an end-to-end deep learning framework that integrates multi-index hashing and multi-branch networks. Furthermore, we propose a novel block-wise multi-index hashing table construction strategy and a search-aware multi-index loss to further improve the search efficiency. Experiments on real datasets show that DMIH can outperform other baselines and achieve state-of-the-art retrieval performance in terms of both efficiency and accuracy.
References
 (1) D. Cai. A revisit of hashing algorithms for approximate nearest neighbor search. CoRR, abs/1612.07545, 2016.
 (2) D. Cai, X. Gu, and C. Wang. A revisit on deep hashings for large-scale content based image retrieval. CoRR, abs/1711.06016, 2017.
 (3) Z. Cao, M. Long, J. Wang, and P. S. Yu. Hashnet: Deep learning to hash by continuation. In ICCV, pages 5609–5618, 2017.
 (4) X. Chang, T. M. Hospedales, and T. Xiang. Multi-level factorisation net for person re-identification. In CVPR, pages 2109–2118, 2018.
 (5) J. Chen, Y. Wang, J. Qin, L. Liu, and L. Shao. Fast person re-identification via cross-camera semantic binary transformation. In CVPR, pages 5330–5339, 2017.
 (6) B. Dai, R. Guo, S. Kumar, N. He, and L. Song. Stochastic generative hashing. In ICML, pages 913–922, 2017.
 (7) M. Datar, N. Immorlica, P. Indyk, and V. S. Mirrokni. Locality-sensitive hashing scheme based on p-stable distributions. In SCG, pages 253–262, 2004.
 (8) J. Deng, W. Dong, R. Socher, L. Li, K. Li, and F. Li. Imagenet: A large-scale hierarchical image database. In CVPR, pages 248–255, 2009.
 (9) P. F. Felzenszwalb, R. B. Girshick, D. A. McAllester, and D. Ramanan. Object detection with discriminatively trained part-based models. TPAMI, 32(9):1627–1645, 2010.
 (10) Y. Ge, Z. Li, H. Zhao, G. Yin, S. Yi, X. Wang, and H. Li. FD-GAN: pose-guided feature distilling GAN for robust person re-identification. In NeurIPS, pages 1230–1241, 2018.
 (11) Y. Gong and S. Lazebnik. Iterative quantization: A procrustean approach to learning binary codes. In CVPR, pages 817–824, 2011.
 (12) K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR, pages 770–778, 2016.
 (13) A. Hermans, L. Beyer, and B. Leibe. In defense of the triplet loss for person re-identification. CoRR, abs/1703.07737, 2017.
 (14) G. Huang, S. Liu, L. van der Maaten, and K. Q. Weinberger. Condensenet: An efficient densenet using learned group convolutions. In CVPR, pages 2752–2761, 2018.
 (15) W.-C. Kang, W.-J. Li, and Z.-H. Zhou. Column sampling based discrete supervised hashing. In AAAI, pages 1230–1236, 2016.
 (16) D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. CoRR, abs/1412.6980, 2014.
 (17) M. Köstinger, M. Hirzer, P. Wohlhart, P. M. Roth, and H. Bischof. Large scale metric learning from equivalence constraints. In CVPR, pages 2288–2295, 2012.
 (18) Q. Li, Z. Sun, R. He, and T. Tan. Deep supervised discrete hashing. In NeurIPS, pages 2479–2488, 2017.
 (19) W. Li, R. Zhao, T. Xiao, and X. Wang. Deepreid: Deep filter pairing neural network for person re-identification. In CVPR, pages 152–159, 2014.
 (20) W.-J. Li, S. Wang, and W. Kang. Feature learning based deep supervised hashing with pairwise labels. In IJCAI, pages 1711–1717, 2016.
 (21) X. Li, G. Lin, C. Shen, A. van den Hengel, and A. R. Dick. Learning hash functions using column generation. In ICML, pages 142–150, 2013.
 (22) S. Liao, Y. Hu, X. Zhu, and S. Z. Li. Person re-identification by local maximal occurrence representation and metric learning. In CVPR, pages 2197–2206, 2015.
 (23) W. Liu, C. Mu, S. Kumar, and S. Chang. Discrete graph hashing. In NeurIPS, pages 3419–3427, 2014.
 (24) W. Liu, J. Wang, R. Ji, Y. Jiang, and S. Chang. Supervised hashing with kernels. In CVPR, pages 2074–2081, 2012.
 (25) W. Liu, J. Wang, S. Kumar, and S. Chang. Hashing with graphs. In ICML, pages 1–8, 2011.
 (26) M. Norouzi and D. J. Fleet. Minimal loss hashing for compact binary codes. In ICML, pages 353–360, 2011.
 (27) M. Norouzi, A. Punjani, and D. J. Fleet. Fast exact search in hamming space with multi-index hashing. TPAMI, 36(6):1107–1119, 2014.
 (28) A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, and A. Lerer. Automatic differentiation in pytorch. In NeurIPSW, 2017.
 (29) E. Ristani, F. Solera, R. S. Zou, R. Cucchiara, and C. Tomasi. Performance measures and a data set for multi-target, multi-camera tracking. In ECCV, pages 17–35, 2016.
 (30) F. Shen, C. Shen, W. Liu, and H. T. Shen. Supervised discrete hashing. In CVPR, pages 37–45, 2015.
 (31) C. Su, J. Li, S. Zhang, J. Xing, W. Gao, and Q. Tian. Pose-driven deep convolutional model for person re-identification. In ICCV, pages 3980–3989, 2017.
 (32) S. Su, C. Zhang, K. Han, and Y. Tian. Greedy hash: Towards fast optimization for accurate hash coding in CNN. In NeurIPS, pages 806–815, 2018.
 (33) Y. Sun, L. Zheng, Y. Yang, Q. Tian, and S. Wang. Beyond part models: Person retrieval with refined part pooling (and A strong convolutional baseline). In ECCV, pages 501–518, 2018.
 (34) C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. E. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. In CVPR, pages 1–9, 2015.
 (35) G. Wang, Y. Yuan, X. Chen, J. Li, and X. Zhou. Learning discriminative features with multiple granularities for person re-identification. In MM, pages 274–282, 2018.
 (36) J. Wang, S. Kumar, and S. Chang. Sequential projection learning for hashing with compact codes. In ICML, pages 1127–1134, 2010.
 (37) Y. Weiss, A. Torralba, and R. Fergus. Spectral hashing. In NeurIPS, pages 1753–1760, 2008.
 (38) S. Xie, R. B. Girshick, P. Dollár, Z. Tu, and K. He. Aggregated residual transformations for deep neural networks. In CVPR, pages 5987–5995, 2017.
 (39) F. X. Yu, S. Kumar, Y. Gong, and S. Chang. Circulant binary embedding. In ICML, pages 946–954, 2014.

 (40) R. Zhang, L. Lin, R. Zhang, W. Zuo, and L. Zhang. Bit-scalable deep hashing with regularized similarity learning for image retrieval and person re-identification. TIP, 24(12):4766–4779, 2015.
 (41) W. Zhang, K. Gao, Y. Zhang, and J. Li. Efficient approximate nearest neighbor search with integrated binary codes. In MM, pages 1189–1192, 2011.
 (42) H. Zhao, M. Tian, S. Sun, J. Shao, J. Yan, S. Yi, X. Wang, and X. Tang. Spindle net: Person re-identification with human body region guided feature decomposition and fusion. In CVPR, pages 907–915, 2017.
 (43) R. Zhao, W. Ouyang, and X. Wang. Learning mid-level filters for person re-identification. In CVPR, pages 144–151, 2014.
 (44) F. Zheng and L. Shao. Learning cross-view binary identities for fast person re-identification. In IJCAI, pages 2399–2406, 2016.
 (45) L. Zheng, L. Shen, L. Tian, S. Wang, J. Wang, and Q. Tian. Scalable person re-identification: A benchmark. In ICCV, pages 1116–1124, 2015.
 (46) L. Zheng, Y. Yang, and A. G. Hauptmann. Person re-identification: Past, present and future. CoRR, abs/1610.02984, 2016.
 (47) Z. Zheng, L. Zheng, and Y. Yang. Unlabeled samples generated by GAN improve the person re-identification baseline in vitro. In ICCV, pages 3774–3782, 2017.
 (48) Z. Zhong, L. Zheng, D. Cao, and S. Li. Re-ranking person re-identification with k-reciprocal encoding. In CVPR, pages 3652–3661, 2017.
 (49) F. Zhu, X. Kong, L. Zheng, H. Fu, and Q. Tian. Part-based deep hashing for large-scale person re-identification. TIP, 26(10):4806–4817, 2017.