Face recognition, as an active research topic in pattern recognition, has been intensively studied for more than two decades [31, 2, 32, 1, 42, 33, 6, 29]. Face recognition works in two essentially different modes: face verification and face identification. Face verification performs 1:1 matching to provide a binary decision on a claimed identity. The performance of face verification in controlled scenarios has reached a rather high accuracy. Recently, more research effort has been devoted to face verification in unconstrained conditions. By utilizing strong alignment approaches [4, 36, 7], pose-robust matching schemes [14, 25, 18], and advanced deep-learning techniques [16, 29, 28], significant improvements in verification performance have been achieved. The performance on the benchmark Labeled Faces in the Wild (LFW) dataset is close to, or even beyond, that of human beings.
In contrast, face identification is more difficult. It performs 1:N matching to rank the gallery images based on pairwise similarity measurements. Obviously, the operating requirement in face identification is vastly more demanding than in verification: an identifier needs to be roughly N times better than a verifier to achieve comparable odds against making false matches. This is probably why the progress made in face identification over the last five years has not been significant. Even though the proposed approaches became more and more complex, the recognition performance on the benchmark evaluations stayed almost the same [34, 35, 3, 5]. To make a breakthrough in face identification, it seems that we have to revisit the basis of face recognition and take a fresh look at its fundamental building blocks.
In face identification, one of the most basic building blocks is the construction of features for measuring the similarity between two face images. This construction consists of two steps: (1) selecting a suitable representation of the face images; (2) extracting features from that representation. There is a large body of research on how to extract stable, local or global discriminative features, e.g. the commonly used SIFT , HOG , and LBP . Recently, these features have been criticized for being ‘hand-crafted’, and it has been claimed that better features can be learned from big data collections through deep-learning approaches [16, 29, 28]. But due to the inherent complexity of face identification, there is no literature on using deep learning to learn features for face identification. On the other hand, significant progress has been made in selecting more suitable representations of faces for face identification. Compared to using the raw image pixels, extracting features from the amplitude or phase spectrum of Gabor-filtered images has significantly improved face identification performance. Although the complexity is raised 40 times or even more, the identification rate in benchmark evaluations is improved by more than 20%, reaching around 90% . This is due to the so-called ‘blessing of dimensionality’. The question now is how to push the face identification rate from 90% to 95%, or even higher. In this paper we argue that the Gabor phase can enable such a performance improvement for face identification.
Most existing face identification algorithms extract discriminative features from the Gabor amplitude. The major advantage of the Gabor amplitude is that it varies slowly with spatial position, so it is robust to texture variations caused by dynamic expressions and imprecise alignment. The problem is that the Gabor amplitude depends on the imaging contrast and illumination . This means that if two photos were taken at different times in different environments (unfortunately, this is the most common situation in face identification, where the probe and gallery images are captured on separate occasions), it is hard to extract consistent features from the Gabor amplitude.
Alternatively, the Gabor phase can be utilized, since it is robust to lighting changes. In fact, it has long been well known that, from the signal-reconstruction point of view, phase is more important than amplitude (see ). The Gabor phase should therefore play a more important role in face identification. However, the use of Gabor phase in face recognition is far from common or successful: it has often performed worse than, or about the same as, the amplitude in comparative experiments [12, 40, 34, 3]. This is largely due to two challenging issues: (1) the Gabor phase is a periodic function, and a hard quantization occurs in every period; (2) it is very sensitive to spatial shift [32, 40], which imposes a rigid requirement on face image alignment. The first issue was partly solved by introducing the phase-quadrant demodulation technique , but the second is still far from being solved. The state-of-the-art Gabor phase approach (LGXP ) extracts a variant of LBP from the phase spectrum. Since the combination of phase and LBP is still sensitive to spatial shift, the power of the Gabor phase has not yet been demonstrated in face identification.
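The classic observation that phase dominates amplitude in signal reconstruction can be illustrated with a small numerical experiment (our own sketch, not taken from the paper): a hybrid signal built from the Fourier amplitude of one signal and the Fourier phase of another resembles the phase donor far more than the amplitude donor.

```python
import numpy as np

# Combine the Fourier AMPLITUDE of signal `a` with the Fourier PHASE of
# signal `b`; the hybrid correlates strongly with the phase donor `b`.
rng = np.random.default_rng(0)
a = rng.standard_normal(4096)                     # amplitude donor
b = rng.standard_normal(4096)                     # phase donor
A, B = np.fft.fft(a), np.fft.fft(b)
hybrid = np.fft.ifft(np.abs(A) * np.exp(1j * np.angle(B))).real
corr_with_phase_donor = np.corrcoef(hybrid, b)[0, 1]   # large (about pi/4)
corr_with_amp_donor = np.corrcoef(hybrid, a)[0, 1]     # near zero
```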
In this paper, we propose a method that leverages purely the power of the Gabor phase to conduct face matching for identification. By using the Block Matching method , our approach does not depend on a training process or on fusion with any other features. Benchmark evaluations on FERET show that the identification rate reaches 95% on the hardest set, ‘Dup2’, which is the best result reported in the literature so far.
2 Related work
In this section, we first briefly review the Gabor representation and recent Gabor-based methods that utilize the Gabor amplitude or phase in different ways. Then we introduce the Block Matching method used in this work, which features a different matching strategy from most existing approaches.
2.1 Gabor Wavelet and Related Approaches
A Gabor face is obtained by filtering the image with the Gabor filter function, which is defined as:

$\psi_{\mu,\nu}(z) = \frac{\|k_{\mu,\nu}\|^2}{\sigma^2}\exp\!\left(-\frac{\|k_{\mu,\nu}\|^2\|z\|^2}{2\sigma^2}\right)\left[\exp(i\,k_{\mu,\nu}\cdot z) - \exp(-\sigma^2/2)\right],$

where $\mu$ and $\nu$ define the orientation and scale of the Gabor kernels respectively, $z=(x,y)$ denotes the pixel coordinate, and the wave vector $k_{\mu,\nu}$ is defined as:

$k_{\mu,\nu} = k_\nu e^{i\phi_\mu},$

where $k_\nu = k_{\max}/f^{\nu}$ and $\phi_\mu = \pi\mu/8$; $k_{\max}$ is the maximum frequency, $\sigma$ is the relative width of the Gaussian envelope, and $f$ is the spacing factor between kernels in the frequency domain . A discrete filter bank of 5 spatial frequencies ($\nu \in \{0,\dots,4\}$) and 8 orientations ($\mu \in \{0,\dots,7\}$) is most commonly exploited to filter the face images and facilitate multi-scale analysis.
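As a concrete illustration, a Gabor kernel of this standard form (following the common Liu–Wechsler parameterization) can be sketched in a few lines; the default parameter values below are illustrative assumptions, not necessarily the settings used in this paper.

```python
import numpy as np

def gabor_kernel(mu, nu, size=31, k_max=np.pi / 2, f=np.sqrt(2), sigma=2 * np.pi):
    """Complex Gabor kernel for orientation index mu and scale index nu.

    Follows the common Liu-Wechsler parameterization; the default values
    are illustrative, not necessarily the paper's settings.
    """
    k = (k_max / f**nu) * np.exp(1j * np.pi * mu / 8)   # wave vector k_{mu,nu}
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    k2 = np.abs(k) ** 2
    z2 = x**2 + y**2
    envelope = (k2 / sigma**2) * np.exp(-k2 * z2 / (2 * sigma**2))
    # Subtracting exp(-sigma^2/2) removes the kernel's DC response.
    carrier = np.exp(1j * (k.real * x + k.imag * y)) - np.exp(-sigma**2 / 2)
    return envelope * carrier
```

Convolving an image with one such kernel (e.g. via `scipy.signal.fftconvolve(image, gabor_kernel(0, 0), mode="same")`) yields one complex Gabor face whose amplitude and phase can then be analyzed separately.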
To extract Gabor features, one popular approach is to extract LBP-type patterns from the complex Gabor-transformed image. As in , the LGBP feature is extracted from the amplitude spectrum. In  and , a 4-ray phase-quadrant demodulator is applied to demodulate the phase from each complex Gabor coefficient, and local binary phase descriptors are subsequently generated from the demodulated phase spectrum. Gabor local descriptors can also be built by applying a dimension-reduction technique (e.g. FLD) to the raw Gabor amplitude patches as in , or by computing local representations as exemplified by GOM  and SLF .
Fusing features that are independent of the local Gabor features can also lead to better performance: [30, 27, 38] fuse global (holistic) features with local ones at the feature level;  proposes fusing the Gabor phase and amplitude at the score and feature levels;  fuses the real, imaginary, amplitude, and phase components. Alternatively, attaching an illumination-normalization step and weighting the local Gabor features has been shown to be helpful as well .
2.2 The Block Matching method
While most local matching approaches perform matching only between spatially corresponding patches, some others allow each segmented patch of one image to search for its best match among spatially neighboring locations in the other image, to achieve better robustness to spatial shifts of the textures, as in [17, 32, 14, 41].
The Block Matching method proposed in  exploits this searching strategy to find the best patch-wise matches and, in the same process, performs pair-specific normalization to compute the patch-wise distances. It can conduct face matching on any face pair at hand, without training. Although only raw pixel intensities were used, evaluations showed that Block Matching outperformed the LBP method and its improved high-order variants .
3 The Gabor Phase Block Matching Approach
Although their feature descriptors and similarity metrics differ, most existing local methods match local features only between spatially corresponding patches. The matching philosophy embedded in these approaches thus by default treats the spatially corresponding patches/features as the best match. This assumption may approximately hold for well-aligned faces, but due to the movement of facial components, head pose, or imprecise alignment, the spatially corresponding patches can cover different face regions (as in Fig. 2 of ). That is, because of the Gabor phase's sensitivity to spatial shift, the best-matching regions are likely located away from the corresponding locations even if the matching face pairs are aligned. A better solution is a searching scheme that enables each region to search for its best match: to fairly match the local phase descriptors, each local phase patch should search for its best match among the neighboring locations.
Based on this reasoning, we adopt the Block Matching method  for matching the Gabor phase. The matching process of our Gabor Phase Block Matching (GPBM) is illustrated in Fig. 1, with the details of the Block Matching searching scheme shown in Fig. 2. A two-step process is conducted for matching a probe-gallery face pair. First, the face images are transformed to Gabor space using two Gabor filters of 1 scale and 2 orientations, and the filtered images are demodulated by a Gray-coded Phase Shift Keying (PSK) demodulator to quantize the phase. Second, the demodulated phase spectra are input to the Block Matching method  to determine the pairwise distance between a probe () image and a gallery () image.
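A minimal Gray-coded PSK demodulator of this kind can be sketched as follows (our own illustration; function names are ours, and `m = 16` matches the 16-PSK constellation used in the experimental setup):

```python
import numpy as np

def gray_code(n):
    """Binary-reflected Gray code of the integer(s) n."""
    return n ^ (n >> 1)

def demodulate_phase(gabor_response, m=16):
    """Gray-coded m-ary PSK demodulation of a complex Gabor response.

    The phase circle [0, 2*pi) is split into m equal sectors; each sector
    is labelled with a Gray code so that neighbouring sectors differ in
    exactly one bit, keeping the Hamming distance small for small phase
    differences.  (Illustrative sketch, not the authors' reference code.)
    """
    phase = np.angle(gabor_response) % (2 * np.pi)             # wrap into [0, 2*pi)
    sector = np.floor(phase / (2 * np.pi / m)).astype(int) % m
    return gray_code(sector)
```

Because `demodulate_phase` operates element-wise, it can be applied directly to a whole complex Gabor-filtered image to produce the demodulated phase spectrum.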
Specifically, in Step 2 of Fig. 1, we first segment the probe phase spectrum into non-overlapping patches, where each patch is simply formed by the raw phase codes of the block. Each probe patch, centered at image coordinate  (denoted ), searches within its corresponding search window and yields a patch-wise distance vector, denoted:
where  is the number of candidate gallery patches within the search window, i.e.  when applying the full-search method, and  stands for the search offsets in the vertical and horizontal directions respectively. Each element of  is computed by performing explicit matching over the raw demodulated phase as:
where the patch-wise distance metric is the 2-norm of the element-wise Hamming distance in decimal, and  denotes the coordinate of the patch center within the search window on the gallery face image, so that
We then calculate the slope of the linear fit of the first 5 ascendingly sorted values of  to normalize the patch-wise distance for each patch, such that the preliminary normalization factor is
where . It is further normalized as:
Finally, the distance between a matching probe-gallery face pair is
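Putting the pieces together, the patch-wise search and normalization can be sketched as below. Several symbols in the equations above were left unspecified in this text, so this is our reading of the procedure (non-overlapping probe patches, full search within a window, 2-norm of element-wise Hamming distances, slope-of-linear-fit normalization); the helper names and the exact normalization chain are assumptions, not the authors' reference implementation.

```python
import numpy as np

def hamming4(a, b):
    """Element-wise Hamming distance (in decimal) between 4-bit codes."""
    x = np.bitwise_xor(a, b)
    return (x & 1) + ((x >> 1) & 1) + ((x >> 2) & 1) + ((x >> 3) & 1)

def patch_distance(probe_patch, gallery_phase, top, left, offset):
    """Distance for one probe patch whose top-left corner is (top, left).

    Full search over all shifts within +/- offset pixels; each candidate
    distance is the 2-norm of the element-wise Hamming distances between
    the demodulated phase codes.  The best candidate distance is divided
    by the slope of a linear fit to the 5 smallest candidate distances
    (our reading of the normalization; the further normalization step of
    the paper is omitted in this sketch).
    """
    h, w = probe_patch.shape
    cands = []
    for dy in range(-offset, offset + 1):
        for dx in range(-offset, offset + 1):
            y, x = top + dy, left + dx
            if 0 <= y and 0 <= x and y + h <= gallery_phase.shape[0] \
                    and x + w <= gallery_phase.shape[1]:
                diff = hamming4(probe_patch, gallery_phase[y:y + h, x:x + w])
                cands.append(np.linalg.norm(diff))   # 2-norm over the patch
    cands = np.sort(np.asarray(cands))
    slope = np.polyfit(np.arange(5), cands[:5], 1)[0]  # fit the 5 smallest
    return cands[0] / max(slope, 1e-9)

# The face-pair distance is then an aggregate (e.g. the sum) of the
# normalized patch-wise distances over all probe patches.
```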
4 Experiments and Results
4.1 Database Selection
There are a variety of large-scale datasets available for benchmarking face recognition approaches, such as FERET , FRGC 2.0, and LFW. But FRGC 2.0 and LFW are dedicated face verification benchmarks. We therefore select the FERET database, the most widely recognized face identification benchmark, to evaluate and compare our method with state-of-the-art approaches. In addition, the CMU-PIE  dataset is selected to evaluate our GPBM under variations of pose, expression, and illumination in the probe images.
4.2 Experimental Setup
Face images are first normalized based on the positions of both eyes and cropped so that the same height/width ratio (1.1 : 1) is maintained as in [34, 37]. We use a Gabor filter with , , , , . Accordingly, a Gray-coded 16-PSK demodulator is used for phase quantization; the constellation is shown in Fig. 3.
4.3 Evaluations on the CMU-PIE database
The CMU-PIE database contains 41,368 images of 68 subjects. Images of all 68 subjects with pose labels 05, 07, 09, 27, and 29, under 21 illuminations (Flash 2 to 22), are selected as the probe set.
In the Block Matching method, the most important parameters are the block size ( and ) and the search offset ( and ). We conducted a set of empirical tests on other datasets to select suitable parameters, and found that  and  satisfying , and  work quite well. In our experiments, we select a block size of  and search offsets of ,  pixels (a tradeoff between performance and complexity). For this group of parameters, we selected the first 2000 probe images to evaluate the effect on performance. In addition, we checked how sensitive the performance is to the parameter selection by examining values around the selected group. The test results, shown in Fig. 4, indicate that the performance is quite robust to the selection of parameters.
We then conduct experiments on the CMU-PIE probe set and compare our GPBM with G_LBP  and G_LDP . G_LBP is the Gabor-space version of LBP, and G_LDP is an improved local-pattern descriptor over the Gabor amplitude. G_LDP achieved performance equivalent to LGXP (the Gabor phase pattern) in the FERET evaluations, so it is a good reference for comparison. The comparative rank-1 recognition rates are listed in Table 1. Our method is at least 3% better than G_LDP, even though LDP extracts much more effective patterns from the Gabor amplitude space than LBP does. Utilizing the Gabor phase in the Block Matching scheme is thus more effective in dealing with pose and illumination changes than LBP-type patterns extracted from the Gabor amplitude space.
4.4 Evaluations on the FERET database
The FERET database contains 1196 frontal images in the gallery set; 1195 images with different expressions form the probe set ‘Fb’; 194 images with illumination variations form the probe set ‘Fc’; 722 images taken at a later time form the ‘Dup1’ set; and 234 images, taken at least 1 year later than the gallery images, form the hardest set, ‘Dup2’. We faithfully follow the evaluation protocol of the FERET dataset. The results of our GPBM and other Gabor-phase approaches are listed in Table 2.
From Table 2 we can see that in a fair comparison, where only the Gabor phase is utilized for matching, our GPBM is almost 12% better than LGXP on the hardest set, ‘Dup2’; even in unfavorable comparisons, where pre-processing, training, and fusion are exploited by LMGEW/LN+LGXP and S[LGBP_Mag+LGXP], our GPBM still outperforms them. To the best of our knowledge, S[LGBP_Mag+LGXP], aided by the Gabor amplitude and training procedures, was the state-of-the-art Gabor-phase-based method on the hardest FERET set ‘Dup2’, and our GPBM is better than that.
| S[LGBP+LGXP]  | No | Amplitude + Phase | No | 93%   |
| GOM           | No | Amplitude + Phase | No | 93.1% |
We also comprehensively compare our GPBM with other recent state-of-the-art approaches on FERET in Table 3. From the table, one can see that all these approaches are based on Gabor features, which indicates that the Gabor space is a very effective signal representation. Our GPBM outperforms all the other approaches on the hardest set, ‘Dup2’, and it features three advantages: 1) it requires only two Gabor filters, 20 times fewer than the other methods, which utilize 40 Gabor filters; 2) it does not require any training procedure or auxiliary database to perform face matching; 3) it uses merely the raw phase for matching, with no feature extraction at all.
Computational complexity is always a big concern. According to Table 4 in , for an image size of  with a  Gabor filter bank, the histogram extraction of LGBP takes around 0.45 seconds and S[LGBP_Mag+LGXP] takes 0.99 seconds; extracting the GOM feature takes 0.7 seconds . For our method, the ‘feature extraction’ time is zero, since only the raw phase is used for matching; demodulation is the only required on-line computation on the probe face, and it is extremely fast. Our Matlab implementation matches a face pair in 0.05 seconds on average (Gabor filtering included) on a 3.4 GHz Intel CPU. We can therefore safely conclude that our GPBM outperforms the best Gabor-phase-based approach (S[LGBP_Mag+LGXP]) in efficiency by a big margin, and we can also infer that the other methods in Table 3 could hardly be more efficient than our GPBM, due to their higher image resolutions, Gabor face dimensions, and additional photometric processing. We should mention that our GPBM uses the very basic exhaustive-search strategy, which is not an efficient choice for the matching. Our preliminary experiments show that there is large potential for even higher efficiency using an optimized search strategy. (Optimization of the Block Matching framework is outside the scope of this paper.)
5 Conclusion
In this paper, we have proposed a plain approach that leverages the demodulated Gabor phase for face identification, based on the Block Matching method. The proposed approach does not utilize a large Gabor filter bank or any training process. Rather, it depends only on the signal representation from a single-scale Gabor filter pair and performs explicit matching over the raw Gabor phase spectrum.
Comparative experiments show that: 1) our approach achieves the highest accuracy among methods utilizing the Gabor phase for face recognition; 2) our approach has very low computational complexity with performance fully comparable to state-of-the-art methods. Our experiments also confirm that the Gabor phase is a powerful source for constructing features for face identification. To leverage its power, the key is a suitable feature-construction approach; in this paper, we show that Block Matching is a good choice for this purpose. We strongly recommend Block Matching as an alternative to the commonly adopted LBP in face recognition.
-  Ahonen, T., Hadid, A., Pietikäinen, M.: Face recognition with local binary patterns. In: Computer vision-eccv 2004, pp. 469–481. Springer (2004)
-  Belhumeur, P.N., Hespanha, J.P., Kriegman, D.: Eigenfaces vs. fisherfaces: Recognition using class specific linear projection. Pattern Analysis and Machine Intelligence, IEEE Transactions on 19(7), 711–720 (1997)
-  Cament, L.A., Castillo, L.E., Perez, J.P., Galdames, F.J., Perez, C.A.: Fusion of local normalization and gabor entropy weighted features for face identification. Pattern Recognition 47(2), 568–577 (2014)
-  Cao, X., Wei, Y., Wen, F., Sun, J.: Face alignment by explicit shape regression. International Journal of Computer Vision 107(2), 177–190 (2014)
-  Chai, Z., Sun, Z., Mendez-Vazquez, H., He, R., Tan, T.: Gabor ordinal measures for face recognition. Information Forensics and Security, IEEE Transactions on 9(1), 14–26 (Jan 2014)
-  Chan, C.H., Tahir, M.A., Kittler, J., Pietikainen, M.: Multiscale local phase quantization for robust component-based face recognition using kernel fusion of multiple descriptors. Pattern Analysis and Machine Intelligence, IEEE Transactions on 35(5), 1164–1177 (2013)
-  Chen, D., Cao, X., Wang, L., Wen, F., Sun, J.: Bayesian face revisited: A joint formulation. In: Computer Vision–ECCV 2012, pp. 566–579. Springer (2012)
-  Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on. vol. 1, pp. 886–893. IEEE (2005)
-  Daugman, J.: Probing the uniqueness and randomness of iriscodes: Results from 200 billion iris pair comparisons. Proceedings of the IEEE 94(11), 1927–1935 (Nov 2006)
-  Daugman, J.: Statistical richness of visual phase information: update on recognizing persons by iris patterns. International Journal of computer vision 45(1), 25–38 (2001)
-  Daugman, J.: How iris recognition works. Circuits and Systems for Video Technology, IEEE Transactions on 14(1), 21–30 (2004)
-  Gao, Y., Wang, Y., Zhu, X., Feng, X., Zhou, X.: Weighted gabor features in unitary space for face recognition. In: Automatic Face and Gesture Recognition, 2006. FGR 2006. 7th International Conference on. pp. 6–pp. IEEE (2006)
-  Givens, G.H., Beveridge, J.R., Lui, Y.M., Bolme, D.S., Draper, B.A., Phillips, P.J.: Biometric face recognition: from classical statistics to future challenges. Wiley Interdisciplinary Reviews: Computational Statistics 5(4), 288–308 (2013)
-  Hua, G., Akbarzadeh, A.: A robust elastic and partial matching metric for face recognition. In: Computer Vision, 2009 IEEE 12th International Conference on. pp. 2082–2089. IEEE (2009)
-  Huang, G.B., Mattar, M., Berg, T., Learned-Miller, E.: Labeled faces in the wild: A database for studying face recognition in unconstrained environments. Tech. rep. (2007)
-  Huang, G., Lee, H., Learned-Miller, E.: Learning hierarchical representations for face verification with convolutional deep belief networks. In: Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. pp. 2518–2525 (June 2012)
-  Lades, M., Vorbruggen, J.C., Buhmann, J., Lange, J., von der Malsburg, C., Wurtz, R.P., Konen, W.: Distortion invariant object recognition in the dynamic link architecture. Computers, IEEE Transactions on 42(3), 300–311 (1993)
-  Li, H., Hua, G., Lin, Z., Brandt, J., Yang, J.: Probabilistic elastic matching for pose variant face verification. In: Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on. pp. 3499–3506. IEEE (2013)
-  Liu, C., Wechsler, H.: Gabor feature based classification using the enhanced fisher linear discriminant model for face recognition. Image processing, IEEE Transactions on 11(4), 467–476 (2002)
-  Lowe, D.G.: Distinctive image features from scale-invariant keypoints. International journal of computer vision 60(2), 91–110 (2004)
-  Mu, M., Ruan, Q., Guo, S.: Shift and gray scale invariant features for palmprint identification using complex directional wavelet and local binary pattern. Neurocomputing 74(17), 3351–3360 (2011)
-  Oppenheim, A.V., Lim, J.S.: The importance of phase in signals. Proceedings of the IEEE 69(5), 529–541 (1981)
-  Phillips, P.J., Flynn, P.J., Scruggs, T., Bowyer, K.W., Chang, J., Hoffman, K., Marques, J., Min, J., Worek, W.: Overview of the face recognition grand challenge. In: Computer vision and pattern recognition, 2005. CVPR 2005. IEEE computer society conference on. vol. 1, pp. 947–954. IEEE (2005)
-  Phillips, P.J., Moon, H., Rizvi, S.A., Rauss, P.J.: The feret evaluation methodology for face-recognition algorithms. Pattern Analysis and Machine Intelligence, IEEE Transactions on 22(10), 1090–1104 (2000)
-  Pinto, N., DiCarlo, J.J., Cox, D.D.: How far can you get with a modern face recognition test set using only simple features? In: Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on. pp. 2591–2598. IEEE (2009)
-  Sim, T., Baker, S., Bsat, M.: The cmu pose, illumination, and expression (pie) database. In: Automatic Face and Gesture Recognition, 2002. Proceedings. Fifth IEEE International Conference on. pp. 46–51. IEEE (2002)
-  Su, Y., Shan, S., Chen, X., Gao, W.: Hierarchical ensemble of global and local classifiers for face recognition. Image Processing, IEEE Transactions on 18(8), 1885–1896 (2009)
-  Sun, Y., Wang, X., Tang, X.: Deep learning face representation from predicting 10,000 classes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 1891–1898 (2014)
-  Taigman, Y., Yang, M., Ranzato, M., Wolf, L.: Deepface: Closing the gap to human-level performance in face verification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 1701–1708 (2014)
-  Tan, X., Triggs, B.: Fusing gabor and lbp feature sets for kernel-based face recognition. In: Analysis and Modeling of Faces and Gestures, pp. 235–249. Springer (2007)
-  Turk, M.A., Pentland, A.P.: Face recognition using eigenfaces. In: Computer Vision and Pattern Recognition, 1991. Proceedings CVPR’91., IEEE Computer Society Conference on. pp. 586–591. IEEE (1991)
-  Wiskott, L., Fellous, J.M., Kuiger, N., Von Der Malsburg, C.: Face recognition by elastic bunch graph matching. Pattern Analysis and Machine Intelligence, IEEE Transactions on 19(7), 775–779 (1997)
-  Wright, J., Yang, A.Y., Ganesh, A., Sastry, S.S., Ma, Y.: Robust face recognition via sparse representation. Pattern Analysis and Machine Intelligence, IEEE Transactions on 31(2), 210–227 (2009)
-  Xie, S., Shan, S., Chen, X., Chen, J.: Fusing local patterns of gabor magnitude and phase for face recognition. Image Processing, IEEE Transactions on 19(5), 1349–1361 (2010)
-  Yang, M., Zhang, L., Shiu, S.K., Zhang, D.: Robust kernel representation with statistical local features for face recognition. Neural Networks and Learning Systems, IEEE Transactions on 24(6), 900–912 (2013)
-  Yi, D., Lei, Z., Li, S.Z.: Towards pose robust face recognition. In: Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on. pp. 3539–3545. IEEE (2013)
-  Zhang, B., Gao, Y., Zhao, S., Liu, J.: Local derivative pattern versus local binary pattern: face recognition with high-order local pattern descriptor. Image Processing, IEEE Transactions on 19(2), 533–544 (2010)
-  Zhang, B., Shan, S., Chen, X., Gao, W.: Histogram of gabor phase patterns (hgpp): A novel object representation approach for face recognition. Image Processing, IEEE Transactions on 16(1), 57–68 (Jan 2007)
-  Zhang, W., Shan, S., Gao, W., Chen, X., Zhang, H.: Local gabor binary pattern histogram sequence (lgbphs): A novel non-statistical model for face representation and recognition. In: Computer Vision, 2005. ICCV 2005. Tenth IEEE International Conference on. vol. 1, pp. 786–791. IEEE (2005)
-  Zhang, W., Shan, S., Qing, L., Chen, X., Gao, W.: Are gabor phases really useless for face recognition? Pattern Analysis and Applications 12(3), 301–307 (2009)
-  Zhong, Y., Li, H.: Is block matching an alternative tool to lbp for face recognition? In: Image Processing (ICIP), 2014 IEEE International Conference on. pp. 723–727 (Oct 2014)
-  Zou, J., Ji, Q., Nagy, G.: A comparative study of local matching approach for face recognition. Image Processing, IEEE Transactions on 16(10), 2617–2628 (2007)