Local Directional Relation Pattern for Unconstrained and Robust Face Retrieval

09/20/2017
by Shiv Ram Dubey, et al.

Face recognition is still a very demanding area of research. This problem becomes more challenging in unconstrained environment and in the presence of several variations like pose, illumination, expression, etc. Local descriptors are widely used for this task. The existing local descriptors are not able to utilize the wider local information to make the descriptor more discriminative. The wider local information based descriptors mainly suffer due to the increased dimensionality. In this paper, this problem is solved by encoding the relationship among directional neighbors in an efficient manner. The relationship between the center pixel and the encoded directional neighbors is utilized further to form the proposed local directional relation pattern (LDRP). The descriptor is inherently uniform illumination invariant. The multi-scale mechanism is also adapted to further boost the discriminative ability of the descriptor. The proposed descriptor is evaluated under the image retrieval framework over face databases. Very challenging databases like PaSC, LFW, PubFig, ESSEX, FERET, and AT&T are used to test the discriminative ability and robustness of LDRP descriptor. Results are also compared with the recent state-of-the-art face descriptors such as LBP, LTP, LDP, LDN, LVP, DCP, LDGP and LGHP. Very promising performance is observed using the proposed descriptor over very appealing face databases as compared to the existing face descriptors.

I Introduction

I-A Motivation

Unconstrained and robust face recognition is in current demand for improving the quality of life. The face is a biometric trait that can be captured very easily, e.g., through surveillance cameras, without requiring users to be cooperative or making them uncomfortable. Research in this field has been conducted for the last two decades. Most of the early research was carried out in a very controlled environment, where users provided their facial images in frontal pose, under consistent lighting, without glasses or occlusion, etc.

Some researchers have also developed face recognition approaches robust to specific geometric and photometric changes. Wright et al. used sparse representation for face recognition [1]. Ding et al. proposed multi-task pose-invariant face recognition [2]. Discriminant analysis over multi-view images has been conducted for face recognition across pose [3]. Ding et al. used pose normalization for pose-invariant face recognition [4]. Face recognition robust to more than one effect has also been investigated; Punnappurath et al. proposed face recognition across motion blur, illumination, and pose [5]. Face recognition approaches have been surveyed from time to time by many researchers [6], [7], [8].

The face recognition approaches are categorized into two major areas, namely deep learning based face recognition and descriptor based face recognition. Some recent deep learning based approaches are FaceNet [9] and DeepFace [10]. The deep learning based approaches are becoming popular due to their high performance, but at the cost of increased training complexity in terms of time, computing power, and data size. The deep learning based approaches are also biased towards the training data.

The descriptor based face recognition approaches can be divided into learning based descriptors and handcrafted descriptors. Cao et al. learned an encoder from training examples in an unsupervised manner and applied PCA to reduce the dimension [11]. A family of face-image descriptors is used with learned background samples by Wolf et al. for face recognition [12]. Discriminant image filters are learned with a softly determined neighborhood sampling strategy [13]. The pixel difference vectors in local patches are mapped into low-dimensional binary vectors in an unsupervised manner to learn a binary face descriptor for face recognition [14]. The simultaneous local binary feature learning and encoding approach learns the binary codes and the codebook jointly for face recognition [15]. The main drawback of the learning based descriptors is their dependency on the training database and vocabulary size.

The hand-designed local descriptors are very simple from the design aspect. This class of descriptors has shown very promising performance in most computer vision problems [16]. Some typical applications are local image matching [17], image retrieval [18], [19], [20], [21], texture classification [22], [23], [24], [25], medical image retrieval [26], [27], [28], [29], 3D face recognition [30], [31], etc. The main advantages of the handcrafted local descriptors are as follows: a) they do not depend upon a training database, b) they do not require a very complex computing facility, and c) lower dimensional descriptors can boost the time efficiency significantly.

I-B Related Works

Several face descriptors have been investigated in the last decade. Ahonen et al. applied the local binary pattern (LBP) to face recognition [32]. The LBP simply finds a binary code for each pixel with respect to its neighbors: the bit corresponding to a neighbor is 1 if the intensity value of that neighbor is greater than or equal to the intensity value of the center pixel; otherwise it is 0. The features are represented by the histogram of these binary codes. Inspired by the simplicity and success of LBP, several variants have been proposed for face recognition. Huang et al. presented a survey of facial image analysis using LBP based approaches [33]. Another survey was conducted by Yang et al. on LBP based face recognition [34].
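To make the encoding concrete, the following minimal sketch (ours, in Python with NumPy; not code from [32]) computes the basic 3×3 LBP map and its 256-bin histogram under the thresholding rule just described:

```python
import numpy as np

def lbp_map(img):
    """Basic 3x3 LBP: each of the 8 neighbors contributes one bit
    (1 if neighbor >= center), weighted by a power of two."""
    img = img.astype(np.int32)
    h, w = img.shape
    center = img[1:h-1, 1:w-1]
    # the 8 neighbors, ordered around the center
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(center)
    for bit, (dx, dy) in enumerate(offsets):
        neighbor = img[1+dx:h-1+dx, 1+dy:w-1+dy]
        codes += (neighbor >= center).astype(np.int32) << bit
    return codes

def lbp_histogram(img):
    """256-bin histogram of LBP codes, used as the face feature."""
    return np.bincount(lbp_map(img).ravel(), minlength=256)
```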

Some researchers have tried to use the Gabor, wavelet, and Weber concepts in the framework of LBP. Zhang et al. computed the histograms of local Gabor magnitude binary patterns over different blocks of the image for representing face images [35]. The histogram of Gabor phase pattern (HGPP) uses the orientation of the Gabor wavelet and a local XOR pattern for face representation [36]. The Fourier transform phase is quantized in local neighborhoods for blur-robust face recognition [37]. Chen et al. introduced the Weber local descriptor (WLD), which is inspired by Weber's law [38]. The local patterns of Gabor magnitude and phase are fused by using block-based Fisher's linear discriminant for face recognition [39]. Recently, local SVD decomposition was used with local descriptors for near-infrared face retrieval [40]. The drawback of such descriptors is that the time and space complexity of the transformations is too high for practical face recognition.

Instead of individual pixels, the average values of block subregions are used to compute the multi-scale block local binary pattern (MB-LBP) [41]. The binary concept of LBP is extended to ternary by the local ternary pattern (LTP) with the help of a threshold [42]. The local derivative pattern (LDP) computes the LBP over derivative images in four directions [43]. Gradient edge map features are computed using a cascade of processing steps for illumination invariant frontal face representation [44]. The relationships between gradient orientations and magnitudes, like patterns of orientation difference and patterns of oriented edge magnitudes, are exploited by Vu for face recognition [45]. Recently, local intensity orders among two sets of local neighbors over gradient images were used to develop the local gradient order pattern for face recognition [46]. Some researchers have tried to combine descriptors: recently, Lumini et al. combined multiple descriptors like the local binary pattern (LBP), histogram of oriented gradients (HOG), etc. and applied them to face recognition [47]. These descriptors encode the gradient information in a local neighborhood in a circular fashion but fail to utilize the directional information across different radii of the local neighborhood.

Some LBP variants have tried to utilize directions in order to improve it. The local directional number (LDN) pattern uses a compass mask to represent the image in different directions, then encodes it using the directional numbers and sign [48]. The local vector pattern (LVP) uses the pairwise direction of the vector with diverse distances for each pixel to represent the face image [49]. Jabid et al. utilized the relative strength of edge response magnitudes in all eight directions to compute the local directional pattern for facial expression recognition [50]. A single eight-bit code for a block is used to reduce the dimensionality of the local directional pattern [51]. The local directional gradient pattern (LDGP) uses the four directions in a high-order derivative space to capture the local information for recognition [52]. The major drawback of these approaches is the computation of the directional information as a separate stage, similar to pre-processing. These approaches also do not consider the directional relationship at different radii.

Some descriptors have tried to consider a wider neighborhood to increase the discriminative ability. The local quantized pattern (LQP) quantizes the binary code generated from the large neighborhood with the help of a codebook to reduce the dimensionality [53]. The LQP is computed over regional features of the image and combined using kernel fusion into a single descriptor called multiscale LQP [54]. In order to consider a wider neighborhood, the dual cross pattern (DCP) considers the local neighbors at two radii [55]. To reduce the dimension, DCP divides the local neighborhood into two groups: the first one consisting of the horizontal and vertical neighbors, and the second one consisting of the diagonal neighbors. DCP exploits the first derivative of Gaussian operator to encode the directional information. Very recently, the local directional ternary pattern (LDTP) was proposed by Ryu et al. for facial expression recognition [56]. It first converts the image into eight directional images using Robinson compass masks and then finds the primary and secondary directions to generate the feature vector. Also very recently, the local gradient hexa pattern (LGHP) was proposed for face recognition and retrieval [57]. LGHP works by encoding the relationship of the center pixel with its neighboring pixels at different distances across different derivative directions. Because LGHP uses the wider local neighborhood, its dimension is increased drastically. The major problem associated with these existing descriptors is the increased dimension when considering more local neighbors. The relationship between different neighbors in a particular direction is also not utilized by these descriptors.

I-C Major Contribution

As pointed out in the related works, most descriptors use only the immediate local neighbors for feature extraction, which limits the discriminative ability of the descriptor. Some descriptors have tried to utilize a wider local neighborhood, but their dimension is too high. Directional information in a descriptor increases its discriminative power, yet the existing descriptors use filters to create directional gradient images, which increases the complexity of the descriptor.

In order to overcome the above issues of the existing descriptors, this paper proposes the local directional relation pattern (LDRP). LDRP first encodes the relationship among directional neighbors and then utilizes the encoded values together with the center pixel value to generate the final pattern. The major contributions are as follows:

  • The proposed descriptor utilizes the wider local neighborhood without increasing the dimension.

  • In contrast to the existing descriptors which use derivatives to represent the directions, the LDRP descriptor uses direction inherently: it encodes the relationship among the directional neighbors at multiple radii and transforms it into a single value.

  • The proposed descriptor enriches the pattern with the relationship among the local directional neighbors at multiple radii as well as the relationship between the center and the transformed directional local neighboring values.

  • The relation among the directional neighbors at multiple radii is computed by considering the binary relation between each pair in that direction.

  • The binary relation provides robustness against uniform illumination changes, while the wider local neighborhood increases the discriminative ability.

  • The proposed descriptor is evaluated over six benchmark and challenging face databases.

The rest of the paper is structured in the following manner: Section II describes the proposed descriptor; Section III illustrates the experimental setup; Section IV reports the experimental results and comparison; Section V presents the performance analysis; and finally Section VI sets out the concluding remarks.

II Proposed Descriptor

In this section, the construction process of proposed local directional relation pattern is described in detail. The whole process is divided into several steps such as local neighborhood extraction, local directional information coding, local directional relation pattern generation, feature vector computation and multiscale adaptation.

II-A Local Neighborhood Extraction

Let I be an image of dimension X × Y and I(x, y) represent the intensity value of the pixel in the x-th row and y-th column, with 1 ≤ x ≤ X and 1 ≤ y ≤ Y. The coordinate of the top-left corner is (1, 1), with the positive x-axis downside across the rows and the positive y-axis right side across the columns. The local neighbors of I(x, y) at a radius r are represented by $P^{x,y}_{r,i}$, where $P^{x,y}_{r,i}$ is the i-th neighbor with 1 ≤ i ≤ D, as shown in Fig. 1. The coordinate of the i-th neighbor of pixel (x, y) at a radius r is given by $(x_{r,i}, y_{r,i})$, defined as follows,

$x_{r,i} = x - \mathrm{round}(r \sin \theta_i)$ (1)

$y_{r,i} = y + \mathrm{round}(r \cos \theta_i)$ (2)

where $\theta_i$ is the angular displacement of the i-th neighbor w.r.t. the first neighbor and is given as follows,

$\theta_i = \frac{2\pi}{D}(i - 1)$ (3)

So, $P^{x,y}_{r,i}$ can be written as follows,

$P^{x,y}_{r,i} = I(x_{r,i}, y_{r,i})$ (4)

The first neighbor is considered on the right side of the center pixel and the rest of the neighbors are taken w.r.t. the first neighbor in the counter-clockwise direction (see Fig. 1).

Fig. 1: The D neighbors in the local neighborhood of pixel (x, y) at radius r. Note that the first neighbor is considered on the right side and the rest of the neighbors are taken w.r.t. the first neighbor in the counter-clockwise direction.
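A minimal sketch of this sampling step, assuming the reconstruction of (1)-(4) above (Python/NumPy; nearest-pixel rounding of the real-valued coordinates, 0-indexed arrays):

```python
import numpy as np

def neighbor_intensity(img, x, y, r, i, D=8):
    """P_{r,i}: intensity of the i-th neighbor (i = 1..D) of pixel
    (x, y) at radius r. The first neighbor lies to the right of the
    center and the rest follow counter-clockwise; the x-axis runs
    down the rows and the y-axis runs right across the columns."""
    theta = (i - 1) * 2.0 * np.pi / D              # (3)
    xi = int(round(x - r * np.sin(theta)))         # row, (1)
    yi = int(round(y + r * np.cos(theta)))         # column, (2)
    return img[xi, yi]                             # (4)

def directional_neighbors(img, x, y, d, R, D=8):
    """The R neighbors of (x, y) along direction d (radii 1..R)."""
    return [neighbor_intensity(img, x, y, r, d, D) for r in range(1, R + 1)]
```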

II-B Local Directional Information Coding

In order to increase the discriminative ability of the proposed descriptor, a wider neighborhood is used in this work. The relations among the local neighbors at multiple radii are utilized to encode the directional information. Let d represent the direction, with 1 ≤ d ≤ D. Considering the R directional neighbors $P^{x,y}_{r,d}$ for r = 1, 2, ..., R in the d-th direction, the binary codes are computed between each pair. The number of pairs out of R neighbors is T = R(R − 1)/2. Let the η-th directional neighboring pair in the d-th direction be ($P^{x,y}_{r_1,d}$, $P^{x,y}_{r_2,d}$). The index η of a pair can be computed from $r_1$ and $r_2$ as follows,

$\eta = \frac{(r_1 - 1)(2R - r_1)}{2} + (r_2 - r_1)$ (5)

where $1 \leq r_1 < r_2 \leq R$ and $1 \leq \eta \leq T$.

Let $LDB^{x,y}_{d}$ denote the local directional binary pattern for center pixel (x, y) in the d-th direction. The binary code for the η-th directional neighboring pair (i.e., between the two neighbors at radii $r_1$ and $r_2$) in the d-th direction for center pixel (x, y) is generated as follows,

$LDB^{x,y}_{d}(\eta) = \begin{cases} 1, & \text{if } P^{x,y}_{r_2,d} \geq P^{x,y}_{r_1,d} \\ 0, & \text{otherwise} \end{cases}$ (6)
Fig. 2: An illustration of the local directional relation pattern computation. (a) A local neighborhood with D = 8 directions and R = 4 neighbors in each direction. (b) Local directional binary bits are generated in each direction; for R = 4, the number of binary values is T = R(R − 1)/2 = 6. (c) The local directional relation code, $LDC^{x,y}_{d}$, is computed in each direction for 1 ≤ d ≤ D by converting the binary bits into the equivalent decimal value. (d) The transformed center value ($C_T$) is computed from the original center pixel value (C) in order to match its range with the local directional codes. (e) The local directional relation binary values (i.e., $LDRB_{x,y}(d)$ for 1 ≤ d ≤ D) are computed for each direction. (f) Finally, the local directional relation pattern for the center pixel (x, y), $LDRP_{x,y}$, is generated from $LDRB_{x,y}$ by converting the binary values into the equivalent decimal value.

An example of a local neighborhood with D = 8 directions and R = 4 directional neighbors in each direction is considered in Fig. 2(a). The intensity value of the center pixel is 50 in this example. Since 4 directional neighbors at different radii are considered, T = 6 binary values are generated in each direction, as depicted in Fig. 2(b). Taking the neighbors of the 1st direction as 56, 98, 75, and 60 at radii 1 to 4, the local directional binary bits in the 1st direction (i.e., $LDB_1$) following (6) are 1, 1, 1, 0, 0, and 0 for the pairs (56,98), (56,75), (56,60), (98,75), (98,60), and (75,60), respectively. The local directional binary bits in the remaining directions (i.e., $LDB_2$ to $LDB_8$) are generated in the same manner from the corresponding directional neighbors.

For R neighbors in a direction, T = R(R − 1)/2 binary values are generated. In order to reduce the dimension of the descriptor, these binary values must be coded into a single value. A local directional information code (LDC) is generated in each direction from the binary values in that direction. The local directional information code, $LDC^{x,y}_{d}$, in the d-th direction for pixel (x, y) is computed by the following equation,

$LDC^{x,y}_{d} = \sum_{\eta=1}^{T} LDB^{x,y}_{d}(\eta) \times \omega(\eta)$ (7)

where ω is a weight function defined as follows,

$\omega(\eta) = 2^{\eta - 1}$ (8)

The local directional relation codes for the example of Fig. 2(a) are computed in Fig. 2(c). The weights ω(η) for the T = 6 pairs of directional neighbors are 1, 2, 4, 8, 16, and 32. The directional relation code in the 1st direction is $LDC_1 = 1 \times 1 + 1 \times 2 + 1 \times 4 + 0 \times 8 + 0 \times 16 + 0 \times 32 = 7$. The directional relation codes in the remaining directions are obtained in the same manner, as shown in Fig. 2(c).
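The coding of (5)-(8) can be sketched as follows (a non-authoritative Python rendering; the pair ordering matches the index η of (5), and the comparison follows the reconstruction of (6) above):

```python
def directional_code(neighbors):
    """LDC of (7): binary-compare every pair of the R directional
    neighbors as in (6), enumerate the pairs by eta as in (5), and
    weight bit eta by omega(eta) = 2^(eta-1) as in (8)."""
    R = len(neighbors)
    code, eta = 0, 0
    for r1 in range(R):                    # radii, 0-indexed here
        for r2 in range(r1 + 1, R):
            bit = 1 if neighbors[r2] >= neighbors[r1] else 0
            code += bit << eta             # shift realizes 2^(eta-1)
            eta += 1
    return code

# Direction 1 of Fig. 2(a): neighbors 56, 98, 75, 60 at radii 1..4
# yield bits (1, 1, 1, 0, 0, 0) over the six pairs, i.e., code 7.
assert directional_code([56, 98, 75, 60]) == 7
```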

II-C Local Directional Relation Pattern

The local directional relation code computed in the previous sub-section for a direction encodes the relationship among the neighbors at different radii in that direction. The next step is to find the relation between the center pixel and the local directional relation codes. The minimum and maximum values of the local directional code depend upon the number of directional neighbors considered (i.e., R). The code is generated from T = R(R − 1)/2 binary values, so the number of distinct decimal values that can be generated from T binary bits is $2^T$, with a minimum value of 0 and a maximum value of $2^T - 1$. Whereas, the minimum and maximum values of the center pixel are 0 and $2^b - 1$ respectively, where b is the bit-depth of the image. Note that the bit-depth (b) of the images is 8 in the databases used in this paper. A clear mismatch can be observed between the range of the center pixel and the range of the local directional relation codes. Thus, a transformation is required over either the center pixel or the local directional relation codes to match the two ranges. For efficiency reasons, the center pixel C = I(x, y) is transformed into the range of the local directional relation codes as follows,

$C_T = \Phi\left(C \times \frac{2^T - 1}{2^b - 1}\right)$ (9)

where $C_T$ is the transformed version of C and Φ is a function to round to the closest integer value. The transformed value of the center pixel in Fig. 2(d) for R = 4 (i.e., T = 6) and b = 8 is computed as $C_T = \Phi(50 \times 63/255) = \Phi(12.35) = 12$.

Let $LDRB_{x,y}$ be a binary pattern representing the relationship between the center and the directional relation codes, having D values corresponding to the D directions. The $LDRB_{x,y}$ for center pixel (x, y) in the d-th direction is given as follows,

$LDRB_{x,y}(d) = \begin{cases} 1, & \text{if } \Delta_d \geq 0 \\ 0, & \text{otherwise} \end{cases}$ (10)

where $\Delta_d$ is the difference between the local directional relation code in the d-th direction and the transformed value of the center pixel, i.e.,

$\Delta_d = LDC^{x,y}_{d} - C_T$ (11)

The local directional relation pattern for pixel (x, y), considering the local neighbors in D directions with R neighbors in each direction, is computed as follows,

$LDRP_{x,y} = \sum_{d=1}^{D} LDRB_{x,y}(d) \times \omega(d)$ (12)

where ω is the weight function defined in (8).
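A per-pixel sketch of (9)-(12), reusing directional_code from the previous sketch (again a reconstruction under the stated assumptions, not the authors' code):

```python
def ldrp_value(center, dir_codes, R, b=8):
    """LDRP_{x,y} of (12): transform the center into the range of the
    directional codes via (9), take one sign bit per direction via
    (10)-(11), and weight bit d by 2^(d-1)."""
    T = R * (R - 1) // 2                              # bits per direction
    c_t = round(center * (2**T - 1) / (2**b - 1))     # (9); e.g., 50 -> 12
    pattern = 0
    for d, ldc in enumerate(dir_codes):               # d = 0 .. D-1
        if ldc - c_t >= 0:                            # (10)-(11)
            pattern += 1 << d
    return pattern
```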

II-D LDRP Feature Vector

The LDRP feature vector ($F_{LDRP}$) is generated by counting the occurrences of the LDRP values over the whole image. Note that the minimum and maximum values of LDRP are 0 and $2^D - 1$ respectively. Thus, the length of the feature vector is $2^D$. The LDRP feature vector for image I with local neighborhoods from D directions having R neighbors in each direction is defined as follows,

$F_{LDRP}(k) = \sum_{x=1}^{X} \sum_{y=1}^{Y} \Gamma(LDRP_{x,y}, k - 1), \quad k = 1, 2, \ldots, 2^D$ (13)

where Γ(a, b) is calculated by the following rule,

$\Gamma(a, b) = \begin{cases} 1, & \text{if } a = b \\ 0, & \text{otherwise} \end{cases}$ (14)

Note that in this paper the number of directions (i.e., D) is considered to be 8, so the dimension of the single-scale LDRP feature vector is $2^8 = 256$.
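Equations (13)-(14) amount to a histogram over the LDRP map; a compact sketch composing the earlier sketches (np.bincount plays the role of the sum of Γ terms; border pixels whose neighborhood leaves the image are skipped here, an assumption on our part):

```python
import numpy as np

def ldrp_feature(img, R, D=8):
    """F_LDRP of (13)-(14): the 2^D-bin histogram of per-pixel LDRP
    values, computed over all pixels whose R-radius neighborhood
    fits inside the image."""
    h, w = img.shape
    values = []
    for x in range(R, h - R):
        for y in range(R, w - R):
            codes = [directional_code(directional_neighbors(img, x, y, d, R, D))
                     for d in range(1, D + 1)]
            values.append(ldrp_value(int(img[x, y]), codes, R))
    return np.bincount(np.array(values), minlength=2**D)
```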

II-E Multi-scale LDRP

Fig. 3: The construction process of the multi-scale LDRP feature vector for an image. The image is taken from the LFW database [58]. First the LDRP maps are computed for R = 3, 4, 5, and 6 directional neighbors, then a feature vector is computed from each LDRP map, and finally all the feature vectors are concatenated into a single feature vector.

In order to make the LDRP descriptor more discriminative, multi-scale directional neighborhood characteristics are utilized in this work. The LDRP feature descriptors are computed by varying the number of directional neighbors (i.e., R) in each direction. The values of R are considered from $R_l$ to $R_u$ with $R_l \leq R_u$. Finally, the LDRP feature vectors ($F^{R}_{LDRP}$ for $R = R_l, \ldots, R_u$) are concatenated into a single feature vector. Mathematically, the final multi-scale LDRP feature vector ($F_{msLDRP}$) can be written as follows,

$F_{msLDRP} = [F^{R_l}_{LDRP}, F^{R_l+1}_{LDRP}, \ldots, F^{R_u}_{LDRP}]$ (15)

The dimension of the multi-scale LDRP feature vector depends only upon the number of directions (D) and the number of scales ($R_u - R_l + 1$) and is given as follows,

$\dim(F_{msLDRP}) = (R_u - R_l + 1) \times 2^D$ (16)

In order to make the final feature vector invariant to the image resolution, $F_{msLDRP}$ is normalized as follows,

$F'_{msLDRP}(k) = \frac{F_{msLDRP}(k)}{\sum_{j=1}^{\dim(F_{msLDRP})} F_{msLDRP}(j)}$ (17)

for $k = 1, 2, \ldots, (R_u - R_l + 1) \times 2^D$. In the experiments, the normalized version of the feature vector is considered for all descriptors.

In the case of $R_l = R_u$, the multiscale LDRP feature vector is equivalent to the single-scale LDRP feature vector (i.e., $F^{R_l}_{LDRP}$). In Fig. 3, an image is considered from the LFW database [58]. Four LDRP maps are created by considering R = 3, 4, 5, and 6 directional neighbors (i.e., $R_l = 3$ and $R_u = 6$) in D = 8 directions. The final LDRP feature vector is computed by concatenating the feature vectors of the different LDRP maps. In the rest of this paper, the default values of the parameters of the LDRP descriptor are D = 8, $R_l = 3$, $R_u = 6$, and b = 8.
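Putting the pieces together, the multi-scale descriptor of (15)-(17) concatenates the single-scale histograms and normalizes by the total count (a sketch under the default D = 8, $R_l$ = 3, $R_u$ = 6, reusing ldrp_feature from the previous sketch):

```python
import numpy as np

def multiscale_ldrp(img, R_l=3, R_u=6, D=8):
    """F_msLDRP of (15)-(17): concatenate F_LDRP for R = R_l..R_u
    (dimension (R_u - R_l + 1) * 2^D, per (16)) and normalize so the
    vector is invariant to the image resolution, per (17)."""
    feats = [ldrp_feature(img, R, D) for R in range(R_l, R_u + 1)]
    f = np.concatenate(feats).astype(np.float64)      # (15)
    return f / f.sum()                                # (17)
```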

III Experimental Setup

Fig. 4: The face retrieval framework using the proposed LDRP descriptor.

The image retrieval framework is used in this work for the experiments. Face retrieval using the proposed local directional relation pattern (LDRP) descriptor is shown in Fig. 4. The best matching faces from a database are retrieved against a query face using the LDRP descriptor. The LDRP descriptor is generated for all the images of the database as well as for the query image. The similarity scores between the LDRP descriptors of the query image and the database images are computed using a distance measure. Note that a high similarity score (i.e., a low distance) between two descriptors signifies that the corresponding images are more similar, and vice-versa.

III-A Distance Measures

The distance measures play an important role in image matching. The top matching faces are retrieved based on the lowest distances, computed using a distance measure. The Chi-square distance is generally used in the experiments in this paper, whereas other distances like Euclidean, Cosine, L1, and D1 are also tested with the proposed descriptor to find its suitability [19], [20].
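Common forms of these distances between two descriptor histograms p and q are sketched below (the D1 form follows the definition popularized in [19]; the exact variants used in the experiments are those of [19], [20]):

```python
import numpy as np

def chi_square(p, q, eps=1e-12):
    """Chi-square distance between two histograms."""
    return 0.5 * np.sum((p - q) ** 2 / (p + q + eps))

def l1(p, q):
    return np.sum(np.abs(p - q))

def d1(p, q):
    """D1 distance as commonly defined for retrieval histograms."""
    return np.sum(np.abs(p - q) / (1.0 + p + q))

def euclidean(p, q):
    return np.sqrt(np.sum((p - q) ** 2))

def cosine_dist(p, q):
    return 1.0 - np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q))
```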

III-B Evaluation Criteria

The image retrieval algorithms are generally evaluated using precision, recall, F-score, and retrieval rank metrics. These metrics are also adopted in this work. In order to find the performance over a database, each image of that database is used as the query image in turn and the metrics are computed. The average retrieval precision (ARP) and average retrieval rate (ARR) over the whole database are calculated by averaging the mean precision (MP) and mean recall (MR) of each category of that database, respectively. The MP and MR for a category are computed by averaging the precision and recall obtained by turning each of the images in that category into the query image, one by one. The F-score is computed from the ARP and ARR values as follows,

$F\text{-}Score = \frac{2 \times ARP \times ARR}{ARP + ARR}$

The average normalized modified retrieval rank (ANMRR) metric is also calculated to judge the rank of correctly retrieved faces [59]. Higher values of ARP, ARR, and F-Score indicate better retrieval performance, whereas a lower value of ANMRR indicates better retrieval performance.
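For one query, precision and recall over the top retrieved faces reduce to simple counts; ARP and ARR then average these per category and across categories, as described above. A minimal sketch (function and variable names are ours):

```python
def precision_recall(retrieved_labels, query_label, n_relevant):
    """Precision and recall for one query, given the labels of the
    top-n retrieved faces (n = len(retrieved_labels))."""
    hits = sum(1 for lab in retrieved_labels if lab == query_label)
    return hits / len(retrieved_labels), hits / n_relevant

def f_score(arp, arr):
    """F-Score from ARP and ARR, matching the formula above."""
    return 2.0 * arp * arr / (arp + arr)
```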

III-C Databases Used

In the experiments, six challenging face databases are used to demonstrate the performance of the proposed LDRP descriptor: PaSC [60], LFW [58], [61], PubFig [62], FERET [63], [64], ESSEX [65], and AT&T (i.e., the ORL face database) [66]. All the face images are down-sampled to a fixed dimension. The PaSC still-images face database is one of the most challenging databases, having variations like pose, illumination, and blur [60]. This database has 293 subjects with 9376 images in total (i.e., 32 images per subject). The Viola-Jones object detection method [67] is used for facial part extraction over the PaSC images; 8718 faces are successfully detected in the PaSC database. Unconstrained face retrieval is very challenging and close to real scenarios. The LFW and PubFig databases contain images from the Internet, taken in totally unconstrained scenarios without the subjects' cooperation. Variations like pose, lighting, expression, scene, camera, etc. are part of these databases. The gray-scale version of the LFW cropped database [61] is used in this work for the experiments. In the image retrieval framework, more than one best matching image (typically 5, 10, etc.) must be retrieved, so a sufficient number of images should be present in each category of the database. Thus, the subjects having at least 20 images are considered. In total, 3023 face images from 62 individuals are present in the LFW database used here. The Public Figure database (i.e., PubFig) consists of 6472 images from 60 individuals [62]. The images are downloaded from the Internet directly, following the URLs given with this database (dead URLs are removed).

The ESSEX face database is a very appealing database with a variety of backgrounds, scales, illuminations, blurs, and extreme variations of expression [65]. The Viola-Jones algorithm is used to extract the faces from the images [67]. In total, 7740 faces from 392 subjects are present, with nearly 20 images per subject. The AT&T face database (formerly known as the ORL Database of Faces) consists of 10 images per subject from 40 different individuals [66]. Different lighting conditions and facial expressions are present in the images of some subjects. The images in the AT&T database are captured in an upright and frontal position against a dark homogeneous background. "Portions of the research in this paper use the FERET database of facial images collected under the FERET program, sponsored by the DOD Counterdrug Technology Development Program Office" [63], [64]. The Color-FERET database is considered due to its severe variations in expression and pose (13 different poses). The subjects having at least 20 images are considered and all the color images are converted into grayscale images. The FERET database used in this work contains 4053 images from 141 subjects.
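The paper extracts faces with the Viola-Jones detector [67]; one typical way to reproduce this preprocessing is OpenCV's pretrained frontal-face Haar cascade (our choice of tooling and parameters, not specified by the authors):

```python
import cv2

def extract_faces(gray_img):
    """Detect faces with the Viola-Jones cascade bundled with OpenCV
    and return the cropped face regions."""
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    boxes = cascade.detectMultiScale(gray_img, scaleFactor=1.1, minNeighbors=5)
    return [gray_img[y:y + h, x:x + w] for (x, y, w, h) in boxes]
```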

IV Face Retrieval Experimental Results

Fig. 5: The results over PaSC, LFW, PubFig, ESSEX, AT&T, and FERET databases in terms of the ARP, ARR, F-Score, and ANMRR vs number of retrieved images.

In order to demonstrate the superior performance of the proposed local directional relation pattern (LDRP) descriptor for face retrieval, the state-of-the-art face descriptors LBP [32], LTP [42], LDP [43], LDN [48], LVP [49], LDGP [52], DCP [55], and LGHP [57] are used for comparison over all six databases. The dimensions of the LBP, LTP, LDP, LDN, LVP, LDGP, DCP, LGHP, and LDRP descriptors are 256, 512, 1024, 64, 1024, 65, 512, 9216, and 1024 respectively. Note that all these descriptors have shown very promising results for facial analysis under varying conditions such as rotation, scale, background, blur, illumination, pose, masking, etc. The parameters of all the compared descriptors are set as per their source papers.

The results comparison using different descriptors in terms of ARP (%), ARR (%), F-score (%), and ANMRR (%) vs the number of retrieved images over the PaSC, LFW, PubFig, ESSEX, AT&T, and FERET face databases is presented in Fig. 5. The six rows correspond to the results over the PaSC, LFW, PubFig, ESSEX, AT&T, and FERET face databases respectively, and the four columns correspond to ARP (%), ARR (%), F-Score (%), and ANMRR (%) vs the number of retrieved images respectively. The LDRP descriptor outperforms the existing face descriptors over the PaSC database, as it has the highest values of ARP, ARR, and F-Score and the lowest values of ANMRR (see Fig. 5(a)-5(d)). The PaSC database has variations like scale, blur, pose, and illumination. The LFW and PubFig databases are fully unconstrained databases. The performance of the LDRP descriptor is comparable with the LGHP descriptor over both the LFW and PubFig databases, as depicted in Fig. 5(e)-5(h) and Fig. 5(i)-5(l) respectively, whereas the dimension of LDRP (1024) is much lower than that of LGHP (9216). Thus, the time efficiency of LDRP is far better than LGHP while maintaining similar performance over the unconstrained databases.

The performance of LDRP improves significantly over the frontal face databases with other variations like scale, background, illumination, blur, expressions, etc., as shown in Fig. 5(m)-5(p) over the ESSEX database and Fig. 5(q)-5(t) over the AT&T database. The proposed LDRP descriptor is outstanding over both the ESSEX and AT&T databases as compared to the state-of-the-art face descriptors. In order to test the suitability of LDRP under pose and scale variations, the FERET database is considered because it contains faces with huge pose and scale variations. The results over the FERET database are summarized in Fig. 5(u)-5(x). LDRP is found to be equivalent to the other top-performing descriptors such as LGHP over the FERET database, whereas its dimension is much lower than that of LGHP.

From the experimental results of Fig. 5, it is observed that the LDRP and LGHP descriptors outperform the other descriptors over the PaSC, LFW, PubFig, ESSEX, AT&T, and FERET face databases. It is also noticed that the LGHP descriptor is mostly the second best performing method and has very discriminative features, but these are redundant and come at the cost of increased dimensionality. In contrast, it is clear from the results of Fig. 5 that, despite having much lower dimensionality, the proposed LDRP descriptor either outperforms LGHP or has comparable performance.

R_l  R_u |  PubFig   PaSC    LFW     FERET   AT&T    ESSEX
 3    3  |  39.90   26.25   32.18   67.51   89.80   82.78
 3    4  |  43.81   33.43   37.07   74.44   94.95   98.34
 3    5  |  45.83   36.92   39.36   75.62   95.35   98.88
 3    6  |  47.76   39.12   41.04   75.91*  96.10   99.05
 3    7  |  48.61   40.96   42.15   75.90   96.30*  99.11
 4    4  |  43.24   38.28   38.42   70.61   94.50   98.44
 4    5  |  46.53   42.09   41.00   73.47   94.45   99.01
 4    6  |  48.08   44.20   42.75   74.07   95.65   99.07
 4    7  |  49.22*  45.74   44.05*  74.07   96.05   99.14*
 5    5  |  44.17   42.99   37.26   69.12   92.65   98.91
 5    6  |  46.50   45.42   40.47   71.21   94.35   99.05
 5    7  |  48.01   46.86*  42.51   71.87   95.30   99.10
 6    6  |  44.20   43.86   37.55   67.86   93.55   98.87
 6    7  |  46.32   45.97   40.47   69.86   95.10   99.03
 7    7  |  43.66   44.08   37.58   66.71   93.70   98.86
TABLE I: The performance comparison of the LDRP descriptor in terms of ARP(%) for 5 retrieved images over the PubFig, PaSC, LFW, FERET, AT&T, and ESSEX face databases by varying the multiscale parameters R_l and R_u (the lower and upper numbers of directional neighbors, i.e., radii). The Chi-square distance is used. The highest ARP value for each database is marked with *.

V Performance Analysis

This section is devoted to the performance analysis of the proposed descriptor. First, the effect of the local neighborhood size and the multi-scale parameters is analyzed, and then the effect of the distance measure is tested over each database.

V-A Effect of Local Neighborhood

In the previous experiments, the LDRP parameters were D = 8, $R_l = 3$, and $R_u = 6$. In this subsection, the performance of LDRP is tested by varying the values of $R_l$ and $R_u$ from 3 to 7. The ARP(%) values for 5 retrieved images over the PubFig, PaSC, LFW, FERET, AT&T, and ESSEX face databases are summarized in Table I. The highest ARP for a particular database is highlighted. It is observed that LDRP with $R_l = 3$ and $R_u = 6$ has the highest precision only over the FERET database, which is due to the huge pose variations present in FERET. The performance of LDRP improves over each database when the upper limit $R_u$ is at its maximum (i.e., $R_u = 7$). From this experiment, it is also clear that $R_l = 4$ and $R_u = 7$ are better suited for the unconstrained scenario. Though $R_l = 3$ and $R_u = 6$ are used in the previous results, the performance of LDRP can be further improved by considering $R_l = 4$ and $R_u = 7$.

V-B Effect of Distance Measures

Distance   |  PubFig   PaSC    LFW     FERET   AT&T    ESSEX
Euclidean  |  35.49   27.53   30.17   65.09   88.45   95.83
Cosine     |  37.24   30.93   32.37   67.23   93.40   97.46
L1         |  44.91   36.96   38.57   75.54   95.50   98.80
D1         |  45.25   37.62   39.14   75.53   95.65   98.86
Chi-square |  47.76*  39.12*  41.04*  75.91*  96.10*  99.05*
TABLE II: The ARP(%) using the proposed LDRP descriptor with the Euclidean, Cosine, L1, D1, and Chi-square distance measures over the PubFig, PaSC, LFW, FERET, AT&T, and ESSEX face databases. The number of retrieved images is 5. The highest ARP value for each database is marked with *.
Distance   |  PubFig   PaSC    LFW     FERET   AT&T    ESSEX
Euclidean  |  34.28   24.21   27.82   49.11   81.90   90.57
Cosine     |  36.10   25.82   29.51   54.60   83.10   91.75
L1         |  43.89   30.25   36.30   69.60   91.65   95.09
D1         |  43.96   30.29   36.33   69.71   91.75   95.11
Chi-square |  47.05*  32.93*  39.53*  74.88*  94.00*  95.73*
TABLE III: The ARP(%) using the LGHP descriptor [57] with the Euclidean, Cosine, L1, D1, and Chi-square distance measures over the PubFig, PaSC, LFW, FERET, AT&T, and ESSEX face databases. The number of retrieved images is 5. The highest ARP value for each database is marked with *.

In order to find a suitable distance measure for the proposed descriptor, an experiment is conducted using different distance measures: Euclidean, Cosine, L1, D1, and Chi-square [19], [20]. The ARP in percentage over the PubFig, PaSC, LFW, FERET, AT&T, and ESSEX databases for 5 top matches is displayed in Table II using the proposed LDRP descriptor. In this experiment, the default parameter values are used for the LDRP descriptor (i.e., D = 8, $R_l = 3$, and $R_u = 6$). The best result over each database is highlighted. It is observed from the results that the Chi-square distance measure is the best suited to the proposed LDRP descriptor for the face retrieval task. The effect of the distances is also tested with the LGHP descriptor [57] in Table III; interestingly, the Chi-square distance is also best suited for LGHP. The Chi-square distance performs better because the descriptors are in the form of histograms representing the occurrences of patterns.

VI Conclusion

In this paper, a local directional relation pattern (LDRP) is proposed that utilizes wider neighborhood information to increase the discriminative ability and local relations to increase the robustness. LDRP first converts the wider local neighborhood into local directional codes, decreasing the dimension of the descriptor by exploiting the relations among the directional neighbors at multiple radii; then it transforms the center pixel into the range of the local directional relation codes; and finally the descriptor is computed by utilizing the relation of the transformed center pixel with the directional relation codes. The proposed LDRP descriptor is tested in an image retrieval framework over six very challenging face databases such as PaSC, LFW, FERET, etc. Some databases are totally unconstrained while others have very severe variations in pose, expression, etc. The retrieval results of LDRP are compared with state-of-the-art face descriptors like LBP, LDN, DCP, LGHP, etc. The experimental results confirm the superiority of the LDRP descriptor as compared to the existing face descriptors. It is noticed that the performance of LDRP can be further boosted by considering wider neighborhoods. The Chi-square distance measure is found to be the best suited to LDRP for face retrieval.

References

  • [1] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, “Robust face recognition via sparse representation,” IEEE transactions on pattern analysis and machine intelligence, vol. 31, no. 2, pp. 210–227, 2009.
  • [2] C. Ding, C. Xu, and D. Tao, “Multi-task pose-invariant face recognition,” IEEE Transactions on Image Processing, vol. 24, no. 3, pp. 980–993, 2015.
  • [3] M. Kan, S. Shan, H. Zhang, S. Lao, and X. Chen, “Multi-view discriminant analysis,” IEEE transactions on pattern analysis and machine intelligence, vol. 38, no. 1, pp. 188–194, 2016.
  • [4] L. Ding, X. Ding, and C. Fang, “Continuous pose normalization for pose-robust face recognition,” IEEE Signal Processing Letters, vol. 19, no. 11, pp. 721–724, 2012.
  • [5] A. Punnappurath, A. N. Rajagopalan, S. Taheri, R. Chellappa, and G. Seetharaman, “Face recognition across non-uniform motion blur, illumination, and pose,” IEEE Transactions on Image Processing, vol. 24, no. 7, pp. 2067–2082, 2015.
  • [6] W. Zhao, R. Chellappa, P. J. Phillips, and A. Rosenfeld, “Face recognition: A literature survey,” ACM computing surveys (CSUR), vol. 35, no. 4, pp. 399–458, 2003.
  • [7] X. Zhang and Y. Gao, “Face recognition across pose: A review,” Pattern Recognition, vol. 42, no. 11, pp. 2876–2896, 2009.
  • [8] C. Ding and D. Tao, “A comprehensive survey on pose-invariant face recognition,” ACM Transactions on intelligent systems and technology (TIST), vol. 7, no. 3, p. 37, 2016.
  • [9] F. Schroff, D. Kalenichenko, and J. Philbin, “Facenet: A unified embedding for face recognition and clustering,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 815–823.
  • [10] Y. Taigman, M. Yang, M. Ranzato, and L. Wolf, “Deepface: Closing the gap to human-level performance in face verification,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2014, pp. 1701–1708.
  • [11] Z. Cao, Q. Yin, X. Tang, and J. Sun, “Face recognition with learning-based descriptor,” in Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on.   IEEE, 2010, pp. 2707–2714.
  • [12] L. Wolf, T. Hassner, and Y. Taigman, “Effective unconstrained face recognition by combining multiple descriptors and learned background statistics,” IEEE transactions on pattern analysis and machine intelligence, vol. 33, no. 10, pp. 1978–1990, 2011.
  • [13] Z. Lei, M. Pietikäinen, and S. Z. Li, “Learning discriminant face descriptor,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 2, pp. 289–302, 2014.
  • [14] J. Lu, V. E. Liong, X. Zhou, and J. Zhou, “Learning compact binary face descriptor for face recognition,” IEEE transactions on pattern analysis and machine intelligence, vol. 37, no. 10, pp. 2041–2056, 2015.
  • [15] J. Lu, V. Erin Liong, and J. Zhou, “Simultaneous local binary feature learning and encoding for face recognition,” in Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 3721–3729.
  • [16] M. Pietikäinen, A. Hadid, G. Zhao, and T. Ahonen, “Local binary patterns for still images,” in Computer vision using local binary patterns.   Springer, 2011, pp. 13–47.
  • [17] S. R. Dubey, S. K. Singh, and R. K. Singh, “Rotation and illumination invariant interleaved intensity order-based local descriptor,” IEEE Transactions on Image Processing, vol. 23, no. 12, pp. 5323–5333, 2014.
  • [18] ——, “Local neighbourhood-based robust colour occurrence descriptor for colour image retrieval,” IET Image Processing, vol. 9, no. 7, pp. 578–586, 2015.
  • [19] S. Murala, R. Maheshwari, and R. Balasubramanian, “Local tetra patterns: a new feature descriptor for content-based image retrieval,” IEEE Transactions on Image Processing, vol. 21, no. 5, pp. 2874–2886, 2012.
  • [20] S. R. Dubey, S. K. Singh, and R. K. Singh, “Multichannel decoded local binary patterns for content-based image retrieval,” IEEE Transactions on Image Processing, vol. 25, no. 9, pp. 4018–4032, 2016.
  • [21] ——, “Boosting local binary pattern with bag-of-filters for content based image retrieval,” in Electrical Computer and Electronics (UPCON), 2015 IEEE UP Section Conference on.   IEEE, 2015, pp. 1–6.
  • [22] T. Ojala, M. Pietikainen, and T. Maenpaa, “Multiresolution gray-scale and rotation invariant texture classification with local binary patterns,” IEEE Transactions on pattern analysis and machine intelligence, vol. 24, no. 7, pp. 971–987, 2002.
  • [23] S. Liao, M. W. Law, and A. C. Chung, “Dominant local binary patterns for texture classification,” IEEE transactions on image processing, vol. 18, no. 5, pp. 1107–1118, 2009.
  • [24] L. Liu, Y. Long, P. W. Fieguth, S. Lao, and G. Zhao, “Brint: binary rotation invariant and noise tolerant texture classification,” IEEE Transactions on Image Processing, vol. 23, no. 7, pp. 3071–3084, 2014.
  • [25] X. Qi, R. Xiao, C.-G. Li, Y. Qiao, J. Guo, and X. Tang, “Pairwise rotation invariant co-occurrence local binary pattern,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 11, pp. 2199–2213, 2014.
  • [26] S. R. Dubey, S. K. Singh, and R. K. Singh, “Local wavelet pattern: A new feature descriptor for image retrieval in medical ct databases,” IEEE Transactions on Image Processing, vol. 24, no. 12, pp. 5892–5903, 2015.
  • [27] ——, “Local bit-plane decoded pattern: A novel feature descriptor for biomedical image retrieval,” IEEE Journal of Biomedical and Health Informatics, vol. 20, no. 4, pp. 1139–1147, 2016.
  • [28] ——, “Local diagonal extrema pattern: a new and efficient feature descriptor for ct image retrieval,” IEEE Signal Processing Letters, vol. 22, no. 9, pp. 1215–1219, 2015.
  • [29] ——, “Novel local bit-plane dissimilarity pattern for computed tomography image retrieval,” Electronics Letters, vol. 52, no. 15, pp. 1290–1292, 2016.
  • [30] H. Tang, B. Yin, Y. Sun, and Y. Hu, “3d face recognition using local binary patterns,” Signal Processing, vol. 93, no. 8, pp. 2190–2198, 2013.
  • [31] S. Elaiwat, M. Bennamoun, F. Boussaid, and A. El-Sallam, “3-d face recognition using curvelet local features,” IEEE Signal Processing Letters, vol. 21, no. 2, pp. 172–175, 2014.
  • [32] T. Ahonen, A. Hadid, and M. Pietikainen, “Face description with local binary patterns: Application to face recognition,” IEEE transactions on pattern analysis and machine intelligence, vol. 28, no. 12, pp. 2037–2041, 2006.
  • [33] D. Huang, C. Shan, M. Ardabilian, Y. Wang, and L. Chen, “Local binary patterns and its application to facial image analysis: a survey,” IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), vol. 41, no. 6, pp. 765–781, 2011.
  • [34] B. Yang and S. Chen, “A comparative study on local binary pattern (lbp) based face recognition: Lbp histogram versus lbp image,” Neurocomputing, vol. 120, pp. 365–379, 2013.
  • [35] W. Zhang, S. Shan, W. Gao, X. Chen, and H. Zhang, “Local gabor binary pattern histogram sequence (lgbphs): A novel non-statistical model for face representation and recognition,” in Computer Vision, 2005. ICCV 2005. Tenth IEEE International Conference on, vol. 1.   IEEE, 2005, pp. 786–791.
  • [36] B. Zhang, S. Shan, X. Chen, and W. Gao, “Histogram of gabor phase patterns (hgpp): A novel object representation approach for face recognition,” IEEE Transactions on Image Processing, vol. 16, no. 1, pp. 57–68, 2007.
  • [37] T. Ahonen, E. Rahtu, V. Ojansivu, and J. Heikkila, “Recognition of blurred faces using local phase quantization,” in Pattern Recognition, 2008. ICPR 2008. 19th International Conference on.   IEEE, 2008, pp. 1–4.
  • [38] J. Chen, S. Shan, C. He, G. Zhao, M. Pietikainen, X. Chen, and W. Gao, “Wld: A robust local image descriptor,” IEEE transactions on pattern analysis and machine intelligence, vol. 32, no. 9, pp. 1705–1720, 2010.
  • [39] S. Xie, S. Shan, X. Chen, and J. Chen, “Fusing local patterns of gabor magnitude and phase for face recognition,” IEEE transactions on image processing, vol. 19, no. 5, pp. 1349–1361, 2010.
  • [40] S. R. Dubey, S. K. Singh, and R. K. Singh, “Local svd based nir face retrieval,” Journal of Visual Communication and Image Representation, 2017.
  • [41] S. Liao, X. Zhu, Z. Lei, L. Zhang, and S. Z. Li, “Learning multi-scale block local binary patterns for face recognition,” in International Conference on Biometrics.   Springer, 2007, pp. 828–837.
  • [42] X. Tan and B. Triggs, “Enhanced local texture feature sets for face recognition under difficult lighting conditions,” IEEE transactions on image processing, vol. 19, no. 6, pp. 1635–1650, 2010.
  • [43] B. Zhang, Y. Gao, S. Zhao, and J. Liu, “Local derivative pattern versus local binary pattern: face recognition with high-order local pattern descriptor,” IEEE transactions on image processing, vol. 19, no. 2, pp. 533–544, 2010.
  • [44] O. Arandjelovic, “Gradient edge map features for frontal face recognition under extreme illumination changes,” in BMVC 2012: Proceedings of the British machine vision association conference.   BMVA Press, 2012, pp. 1–11.
  • [45] N.-S. Vu, “Exploring patterns of gradient orientations and magnitudes for face recognition,” IEEE Transactions on Information Forensics and Security, vol. 8, no. 2, pp. 295–304, 2013.
  • [46] C.-X. Ren, Z. Lei, D.-Q. Dai, and S. Z. Li, “Enhanced local gradient order features and discriminant analysis for face recognition,” IEEE transactions on cybernetics, vol. 46, no. 11, pp. 2656–2669, 2016.
  • [47] A. Lumini, L. Nanni, and S. Brahnam, "Ensemble of texture descriptors and classifiers for face recognition," Applied Computing and Informatics, vol. 13, no. 1, pp. 79–91, 2017.
  • [48] A. R. Rivera, J. R. Castillo, and O. O. Chae, “Local directional number pattern for face analysis: Face and expression recognition,” IEEE transactions on image processing, vol. 22, no. 5, pp. 1740–1752, 2013.
  • [49] K.-C. Fan and T.-Y. Hung, “A novel local pattern descriptor—local vector pattern in high-order derivative space for face recognition,” IEEE transactions on image processing, vol. 23, no. 7, pp. 2877–2891, 2014.
  • [50] T. Jabid, M. H. Kabir, and O. Chae, “Facial expression recognition using local directional pattern (ldp),” in Image Processing (ICIP), 2010 17th IEEE International Conference on.   IEEE, 2010, pp. 1605–1608.
  • [51] C. M. PVSSR et al., “Dimensionality reduced local directional pattern (dr-ldp) for face recognition,” Expert Systems with Applications, vol. 63, pp. 66–73, 2016.
  • [52] S. Chakraborty, S. K. Singh, and P. Chakraborty, “Local directional gradient pattern: a local descriptor for face recognition,” Multimedia Tools and Applications, vol. 76, no. 1, pp. 1201–1216, 2017.
  • [53] S. U. Hussain, T. Napoléon, and F. Jurie, “Face recognition using local quantized patterns,” in British Machine Vision Conference, 2012.
  • [54] C. H. Chan, M. A. Tahir, J. Kittler, and M. Pietikäinen, “Multiscale local phase quantization for robust component-based face recognition using kernel fusion of multiple descriptors,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 5, pp. 1164–1177, 2013.
  • [55] C. Ding, J. Choi, D. Tao, and L. S. Davis, “Multi-directional multi-level dual-cross patterns for robust face recognition,” IEEE transactions on pattern analysis and machine intelligence, vol. 38, no. 3, pp. 518–531, 2016.
  • [56] B. Ryu, A. R. Rivera, J. Kim, and O. Chae, “Local directional ternary pattern for facial expression recognition,” IEEE Transactions on Image Processing, 2017.
  • [57] S. Chakraborty, S. Singh, and P. Chakraborty, “Local gradient hexa pattern: A descriptor for face recognition and retrieval,” IEEE Transactions on Circuits and Systems for Video Technology, 2016.
  • [58] G. B. Huang, M. Ramesh, T. Berg, and E. Learned-Miller, “Labeled faces in the wild: A database for studying face recognition in unconstrained environments,” Technical Report 07-49, University of Massachusetts, Amherst, Tech. Rep., 2007.
  • [59] K. Lu, N. He, J. Xue, J. Dong, and L. Shao, “Learning view-model joint relevance for 3d object retrieval,” IEEE Transactions on Image Processing, vol. 24, no. 5, pp. 1449–1459, 2015.
  • [60] J. R. Beveridge, P. J. Phillips, D. S. Bolme, B. A. Draper, G. H. Givens, Y. M. Lui, M. N. Teli, H. Zhang, W. T. Scruggs, K. W. Bowyer et al., “The challenge of face recognition from digital point-and-shoot cameras,” in Biometrics: Theory, Applications and Systems (BTAS), 2013 IEEE Sixth International Conference on.   IEEE, 2013, pp. 1–8.
  • [61] C. Sanderson and B. C. Lovell, “Multi-region probabilistic histograms for robust and scalable identity inference,” in International Conference on Biometrics.   Springer, 2009, pp. 199–208.
  • [62] N. Kumar, A. C. Berg, P. N. Belhumeur, and S. K. Nayar, “Attribute and simile classifiers for face verification,” in Computer Vision, 2009 IEEE 12th International Conference on.   IEEE, 2009, pp. 365–372.
  • [63] P. J. Phillips, H. Wechsler, J. Huang, and P. J. Rauss, “The feret database and evaluation procedure for face-recognition algorithms,” Image and vision computing, vol. 16, no. 5, pp. 295–306, 1998.
  • [64] P. J. Phillips, H. Moon, S. A. Rizvi, and P. J. Rauss, “The feret evaluation methodology for face-recognition algorithms,” IEEE Transactions on pattern analysis and machine intelligence, vol. 22, no. 10, pp. 1090–1104, 2000.
  • [65] L. Spacek, “University of essex face database.” [Online]. Available: http://cswww.essex.ac.uk/mv/allfaces/
  • [66] F. S. Samaria and A. C. Harter, “Parameterisation of a stochastic model for human face identification,” in Applications of Computer Vision, 1994., Proceedings of the Second IEEE Workshop on.   IEEE, 1994, pp. 138–142.
  • [67] P. Viola and M. Jones, “Rapid object detection using a boosted cascade of simple features,” in Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001 IEEE Computer Society Conference on, vol. 1.   IEEE, 2001, pp. I–I.