Collaborative representation-based robust face recognition by discriminative low-rank representation

12/17/2019 · by Wen Zhao, et al.

We consider the problem of robust face recognition in which both the training and test samples might be corrupted because of disguise and occlusion. The performance of conventional subspace learning methods and the recently proposed sparse representation-based classification (SRC) can degrade when corrupted training samples are provided. In addition, sparsity-based approaches are time-consuming due to the sparsity constraint. To alleviate these problems, in this paper we propose a discriminative low-rank representation method for collaborative representation-based (DLRR-CR) robust face recognition. DLRR-CR not only obtains a clean dictionary but also forces the sub-dictionaries of distinct classes to be as independent as possible by introducing a structural incoherence regularization term. Simultaneously, a low-rank projection matrix is learned to remove possible corruptions in the test samples. The collaborative representation-based classification (CRC) method, which has a closed-form solution, is exploited in our proposed method. Experimental results obtained on public face databases verify the effectiveness and robustness of our method.


1 Introduction

Face recognition (FR), as one of the biometric technologies, has attracted increasing attention due to its critical applications in many real-world scenarios, including access control, social networks, card identification, digital entertainment and intelligent interaction Ding and Tao (2015); Zhen et al. (2015). Improving FR technology therefore has practical value. It is well known that original face images usually have high dimensionality, leading to high computational complexity. Because face images usually reside on a low-dimensional subspace, many feature extraction techniques, such as Eigenfaces Turk and Pentland (1991), Fisherfaces Belhumeur et al. (1997), Laplacianfaces He et al. (2005) and their variants Xiao-Jun et al. (2004); Zheng et al. (2006a, b), have been developed. A classifier such as the nearest neighbor classifier (NNC) or the support vector machine (SVM) can then be applied to the extracted features. When face images are taken under a well-controlled setting, these subspace learning methods show promising results on FR. Nevertheless, robust FR remains challenging due to variability in expression, illumination, pose, occlusion and even disguise. Moreover, the effectiveness of the learned subspace degrades when only a small number of training samples is available. Luo et al. Luo et al. (2015) proposed a selective regularized subspace learning (SRSL) algorithm which learns a local subspace for each sample instead of learning a global subspace for all samples.

In recent years, sparse representation-based classification (SRC) Wright et al. (2008) has been proposed as a new robust FR framework, which represents each test image as a sparse linear combination of the whole set of training images. By solving an $\ell_1$-minimization problem Donoho (2006), the identity is determined by the minimum class-specific reconstruction error. If the test image is corrupted due to occlusion or corruption, SRC shows good robustness by introducing an identity matrix as the occlusion dictionary. However, because of the high dimensionality of the identity matrix, the sparse coding procedure becomes computationally expensive. To address this issue, an extended SRC (ESRC) method was proposed by Deng et al. Deng et al. (2012), which builds an auxiliary intra-class variant dictionary, obtained by subtracting the neutral image from the other images of the same subject, to represent possible corruptions. This variant matrix, with much smaller dimensionality, is appended to the raw training data as the occlusion dictionary. However, the auxiliary intra-class variant dictionary might not depict the various corruptions of face images accurately. Additionally, Yang et al. Yang et al. (2013) proposed the Gabor feature based SRC (GSRC) method, in which image Gabor features are used for SRC. The use of Gabor kernels makes the occlusion dictionary compressible, which reduces the computational cost to some extent. Although both ESRC Deng et al. (2012) and GSRC Yang and Zhang (2010) are more robust than SRC when handling test images with occlusion or corruption, solving the $\ell_1$-minimization problem remains time-consuming.

Moreover, some recent works Rigamonti et al. (2011), Zhang et al. (2011), Wang et al. (2010), Shi et al. (2011) have begun to question the necessity of $\ell_1$-norm-based sparsity, as used in Wright et al. (2008), for improving face image classification performance. Meanwhile, sparsity-based classification schemes such as SRC are computationally expensive Yang and Zhang (2010). Zhang et al. Zhang et al. (2011) replaced the $\ell_1$-norm with the $\ell_2$-norm to emphasize the role of collaborative representation (CR), which represents the test image collaboratively using the training images from all classes. This question has been examined directly or indirectly by several recent works. For example, Rigamonti et al. Rigamonti et al. (2011) compared the discriminative power of two data models: a sparse representation based on the $\ell_1$-norm, and a representation obtained by passing the input signal through a simple convolution filter. The two models achieved similar recognition accuracy. Therefore, some researchers concluded that $\ell_1$-norm-based sparsity is not as important as claimed in previous methods. With this insight, by replacing the $\ell_1$-norm with the $\ell_2$-norm, a very simple yet much more efficient face classification scheme named CRC was proposed by Zhang et al. Zhang et al. (2011). The collaborative representation technique has been adopted in many papers Zhang et al. (2015), Chi and Porikli (2013) in recent years.

CRC considers the role of collaboration among different subjects in the linear representation of the test image; it employs the much weaker $\ell_2$-norm instead of the $\ell_1$-norm to regularize the representation coefficients, and achieves very competitive FR accuracy with significantly lower complexity. Unfortunately, when the training and test samples are corrupted simultaneously Chen et al. (2012), the effectiveness of CRC drops. To alleviate these problems De La Torre and Black (2003), Candès et al. (2011), many recent works on low-rank matrix recovery (LRMR) have been proposed. Among them, the robust PCA (RPCA) method presented by Wright et al. Wright et al. (2009) assumes that all data reside on a single subspace; a low-rank data matrix is then recovered by separating occluded pixels and various corruptions from the original training samples. In practice, face image samples usually come from multiple subspaces, which affects the performance of RPCA. Liu et al. Liu et al. (2010), Liu et al. (2012) proposed the low-rank representation (LRR) method under the assumption that data samples are drawn from multiple subspaces. Although LRR can remove the corruptions from the training data, it neglects the local structure of the data, which may degrade recovery performance. Other works have been proposed in recent years to improve the performance of LRMR. For example, under a Bayesian framework, Zhao et al. Zhao et al. (2014) presented a generative RPCA model in which the noise in face images is modeled as a mixture of Gaussians to cover a wide range of noise types. Yin et al. Yin and Wu (2016) presented low-rank matrix recovery with structural incoherence and low-rank projection (LRSI_LRP), which corrects corrupted test images with a low-rank projection matrix. Zhao et al. Zhao et al. (2015) developed a discriminative low-rank representation method for collaborative representation-based (DLRR-CR) robust face recognition. Chen et al. Chen et al. (2018) proposed a robust low-rank recovery algorithm (RLRR) with a distance-measure structure for face recognition. Zhang et al. Zhang et al. (2014) proposed a new image classification scheme which utilizes non-negative sparse coding and low-rank and sparse matrix decomposition techniques. Zhang and Li Zhang and Li (2012) presented a new image decomposition technique for an ensemble of associated image samples, which decomposes every face image into three parts: a common image, a low-rank image and a sparse corruption matrix. However, if the test images are corrupted, the performance of the above methods degrades, because they cannot correct the corrupted test images.

In this paper, to address the problem of robust face recognition in which both training and test samples might be corrupted by unknown types of noise, we propose a discriminative low-rank representation method for collaborative representation-based classification (DLRR-CR). As revealed in prior work, if the original corrupted training images are directly used as the dictionary, the FR performance degenerates. To avoid this problem, the proposed method first constructs a discriminative LRR framework to separate the corruptions and recover clean training samples. The LRR method presented by Liu et al. Liu et al. (2012) only imposes the low-rank constraint on the representation coefficient matrix of the training samples. To reflect the incoherence between different classes, a regularization term is added to the LRR formulation, which provides additional discriminating ability to our framework and yields better representations. In addition, a low-rank projection matrix can be learned and applied to project corrupted test images onto their corresponding underlying subspaces to obtain corrected test images. More importantly, our approach is designed not just for reconstructing the test samples but for recognition. As illustrated in Fig. 1, the standard SRC and CRC classify the test image into the class with the most similar images, while our proposed method alleviates this problem. The class-wise reconstruction errors of the three methods, shown in Fig. 1 (c), exhibit that with our method the correct class has the smallest reconstruction error, demonstrating that our method is more robust to occlusions present in both training and test images. Section 3 presents more details. Furthermore, it is worth noting that in the testing stage CRC is exploited, which has markedly lower complexity but excellent FR performance. In the experimental part, DLRR-CR demonstrates its effectiveness and robustness for FR.

Figure 1: Comparison between SRC, CRC and our approach. (a) An original test image occluded by a scarf from the AR database. (b) The first column includes the coefficients of the three methods for the same test image respectively. The second column shows the most similar training images chosen by the three methods. (c) The results of class-wise reconstruction errors of the three methods. Our approach can classify the test image to the correct class while SRC and CRC classify the test image as that class with the most similar training images.

The remainder of this paper is organized as follows. Section 2 briefly reviews some related works on CRC and LRR for FR. In Section 3, we present the proposed model for FR in detail. Experimental results on real-world face image data are presented in Section 4. Finally, Section 5 concludes this paper.

2 Related work

2.1 Collaborative representation-based classification (CRC)

We consider $n$ original bases $X = [X_1, X_2, \ldots, X_C] \in \mathbb{R}^{d \times n}$ collected from $C$ different classes, where $d$ is the dimension of each base. Class $i$ includes $n_i$ training images denoted by $X_i \in \mathbb{R}^{d \times n_i}$, so $X$ can be rewritten as $X = [X_1, X_2, \ldots, X_C]$. Given a new test sample $y \in \mathbb{R}^d$, SRC aims to find a sparse linear representation coefficient vector $\alpha$ so that $y$ can be represented as $y \approx X\alpha$. This approximation problem can be solved by minimizing the following problem:

\hat{\alpha} = \arg\min_{\alpha} \|y - X\alpha\|_2^2 + \lambda \|\alpha\|_1    (1)

where $\lambda$ denotes a scalar constant, $\|\cdot\|_2$ denotes the $\ell_2$-norm, and $\|\cdot\|_1$ denotes the $\ell_1$-norm. Many algorithms, such as basis pursuit Chen et al. (2001) and Homotopy Yang and Zhang (2011), can be used to solve the above $\ell_1$-norm minimization problem. The test sample $y$ should lie in the space spanned by the training samples of the correct class. Once we obtain the solution $\hat{\alpha} = [\hat{\alpha}_1; \hat{\alpha}_2; \ldots; \hat{\alpha}_C]$ of (1), where $\hat{\alpha}_i$ is the representation vector of $\hat{\alpha}$ associated with class $i$, the test sample can be recognized by the reconstruction error of each class, i.e.,

\mathrm{identity}(y) = \arg\min_{i} \|y - X_i \hat{\alpha}_i\|_2    (2)

Based on the fact that face images from different subjects may have similar appearances, samples from uncorrelated classes can also participate in representing the test sample $y$. A regularized least squares method named CRC is developed with significantly lower complexity, formulated as,

\hat{\alpha} = \arg\min_{\alpha} \|y - X\alpha\|_2^2 + \lambda \|\alpha\|_2^2    (3)

where $\lambda$ is a balance factor. CRC offers a decrease in computational complexity by using the $\ell_2$-norm-based model. A closed-form solution $\hat{\alpha} = (X^T X + \lambda I)^{-1} X^T y$ can be derived by solving (3), in which the projection $(X^T X + \lambda I)^{-1} X^T$ can be pre-calculated, leading to the fast computation speed of CRC. In the classification stage, the regularized residual $r_i = \|y - X_i \hat{\alpha}_i\|_2 / \|\hat{\alpha}_i\|_2$ is used to classify the test image by utilizing the discrimination information contained in $\hat{\alpha}$, where $\hat{\alpha}_i$ is the coefficient vector associated with class $i$. Finally, the test sample is assigned to the class with the least regularized residual.
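The closed-form CRC pipeline above can be sketched in a few lines of numpy. This is a minimal illustration, not the authors' implementation; the column-per-sample layout of `X` and the `labels` array are assumptions for the sketch:

```python
import numpy as np

def crc_fit(X, lam):
    """Pre-compute the CRC projection matrix (X^T X + lam*I)^{-1} X^T."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T)

def crc_classify(P, X, labels, y):
    """Assign y to the class with the smallest regularized residual
    ||y - X_i a_i||_2 / ||a_i||_2 (the CRC decision rule)."""
    alpha = P @ y  # collaborative coefficients over all classes
    residuals = {}
    for c in np.unique(labels):
        idx = (labels == c)
        a_c = alpha[idx]
        residuals[c] = np.linalg.norm(y - X[:, idx] @ a_c) / (np.linalg.norm(a_c) + 1e-12)
    return min(residuals, key=residuals.get)
```

Because the projection matrix depends only on the training data, it is computed once and reused for every test sample, which is the source of CRC's speed advantage over $\ell_1$ solvers.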

2.2 Low-rank representation (LRR)

The low-rank matrix recovery technique is used in our proposed method to recover a clean data matrix, so we review its formulation for completeness. The corrupted training samples $X$ can be decomposed into $X = A + E$ by RPCA Wright et al. (2009), in which $A$ is the clean low-rank matrix and $E$ is the associated sparse error matrix. RPCA minimizes the rank of $A$ while reducing $\|E\|_0$, for the sake of deriving the low-rank approximation of $X$. The original formulation of RPCA is,

\min_{A, E} \ \mathrm{rank}(A) + \lambda \|E\|_0 \quad \text{s.t.} \quad X = A + E    (4)

where $\|\cdot\|_0$ denotes the $\ell_0$-norm. Eq. (4) is NP-hard and highly nonconvex; it is relaxed by replacing the $\ell_0$-norm with the $\ell_1$-norm and the rank function with the nuclear norm. The resulting optimization problem is more tractable:

\min_{A, E} \ \|A\|_* + \lambda \|E\|_1 \quad \text{s.t.} \quad X = A + E    (5)

The limitation of RPCA is its assumption that all the column vectors in $X$ are drawn from a single low-rank subspace Liu et al. (2012). This hypothesis is not general, because face images usually reside on a union of multiple subspaces. A modified rank optimization problem, LRR, is presented by Liu et al. Liu et al. (2010, 2012) and defined as follows:

\min_{Z, E} \ \|Z\|_* + \lambda \|E\|_{2,1} \quad \text{s.t.} \quad X = XZ + E    (6)

where $Z$ denotes the representation matrix of $X$ and the $\ell_{2,1}$-norm on $E$ is a regularization strategy for modeling sample-specific corruptions. The inexact augmented Lagrange multipliers (ALM) algorithm Lin et al. (2010) is employed to efficiently solve problem (6). After the optimal solution $Z^*$ is obtained, the recovered clean data matrix can be acquired from the corrupted data matrix by $A = XZ^*$.
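The nuclear-norm subproblems that appear when solving (5) or (6) by ALM reduce to singular value thresholding (SVT). A hedged sketch of this standard operator (a generic building block, not the authors' code):

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: the proximal operator of tau*||.||_* at M.
    Each singular value is shrunk by tau; values below tau are discarded,
    so the result has reduced rank."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    return (U * s_shrunk) @ Vt
```

Applied repeatedly inside an ALM loop, this operator drives the recovered matrix toward low rank while the remaining terms absorb the sparse errors.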

3 Proposed method

To deal with the problem of robust FR, where training and test images might be simultaneously corrupted by outliers and thus cannot be well handled by CRC, a discriminative low-rank representation method is proposed in this section. Our method recovers a clean and discriminative dictionary from highly corrupted training images. To handle corruption appearing in the test samples, a low-rank projection matrix is learned to project test samples onto their corresponding underlying subspaces and obtain new clean test samples. In the testing stage, CRC is exploited to classify the corrected test samples.

3.1 Discriminative low-rank representation for matrix recovery

Low-rank matrix recovery techniques can improve the recognition accuracy of CRC because they alleviate the problems caused by corrupted training samples. A recovered clean dictionary $A = XZ^*$ with better representation ability can be obtained from the original matrix $X$ by solving (6). However, face images from different people share similarities due to the location of eyes, mouth, etc., so the derived $A$ contains little discriminative information and is not ideal for classification. Inspired by Ramirez et al. (2010), we propose a DLRR scheme for matrix recovery: different from LRR, the incoherence between different subjects in $A$ is promoted. Introducing such incoherence makes the derived low-rank matrices of different subjects as independent as possible; the discrimination capacity is improved while commonly shared features are suppressed.

A set of training images with corruptions is arranged as the original data matrix $X = [X_1, X_2, \ldots, X_C]$, where $X_i$ is composed of the training images from class $i$. As mentioned above, each $X_i$ can be decomposed into a low-rank matrix $A_i = X_i Z_i$ and a sparse error matrix $E_i$ by the formulation in (6), where $A_i$ represents the clean data matrix of subject $i$. A regularization term $\sum_{j \neq i} \|A_j^T A_i\|_F^2$, which sums the squared Frobenius norms over each pair of low-rank matrices, is added to the original LRR formulation to improve the independence between different classes. If this value is as small as possible, between-class independence is achieved. The new optimization problem is formulated as follows,

\min_{Z_i, E_i} \ \|Z_i\|_* + \lambda \|E_i\|_{2,1} + \eta \sum_{j \neq i} \|A_j^T X_i Z_i\|_F^2 \quad \text{s.t.} \quad X_i = X_i Z_i + E_i    (7)

where $\eta$ is a scalar parameter. The last term promotes the structural incoherence between different subjects; the scalar parameter $\eta$ balances the low-rank matrix decomposition against the discriminative features. Comparing (7) with (6), the regularization term provides improved discriminative ability, which encourages more training samples from the correct subject to participate in representing the test samples. We use the $\ell_{2,1}$-norm to encourage the columns of the error matrix $E_i$, which represents extreme noise, to be zero; no extra regularization is needed on $E_i$ since it is already sparse. As a result, our new formulation (7) can fully exploit the discrimination capacity contained in the original corrupted training images.
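The structural incoherence penalty is simply a sum of squared Frobenius norms over pairs of class-wise clean matrices. An illustrative sketch of how it can be evaluated (the list-of-arrays layout is an assumption for the example):

```python
import numpy as np

def structural_incoherence(A):
    """Sum of ||A_j^T A_i||_F^2 over all ordered pairs of distinct classes.
    A is a list of per-class clean matrices A_i. Small values mean the
    class-wise sub-dictionaries are nearly mutually orthogonal."""
    total = 0.0
    for i in range(len(A)):
        for j in range(len(A)):
            if i != j:
                total += np.linalg.norm(A[j].T @ A[i], 'fro') ** 2
    return total
```

Minimizing this quantity pushes the sub-dictionaries of distinct classes toward mutual orthogonality, which is exactly the between-class independence the text describes.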

The discriminatory ability of the DLRR scheme in classifying face samples from different subjects is illustrated in Fig. 2. Similar to Eigenfaces and CRC-based methods, we take the training and test face images of two different subjects from the AR face database and project them onto the first two eigenvectors of the covariance matrix of the original data matrix $X$, as shown in Fig. 2 (a). The same data is then projected onto the subspace derived from the low-rank matrix $A$ without structural incoherence, with the result shown in Fig. 2 (b), while Fig. 2 (c) shows the result of our proposed method. The separation of the data projected onto $A$, with or without structural incoherence, is clearly improved compared with Fig. 2 (a). However, comparing Fig. 2 (b) with Fig. 2 (c), the within-class scatter in Fig. 2 (c) is smaller than in Fig. 2 (b), so a better discriminative ability is obtained by our proposed approach. We also choose images from three different classes in the Extended Yale B face database for the same experiment, as shown in the second row of Fig. 2. The corresponding 2D subspaces spanned by the first two eigenvectors are plotted in Figs. 2 (d), (e) and (f), respectively. Our method, i.e. Fig. 2 (f), exhibits the desired discrimination capacity, while the separation of the data projected onto the original data matrix and the plain low-rank matrix is observed to be degraded. Hence, better representation ability can be achieved by utilizing our derived low-rank matrix.

Figure 2: Data distributions for different classes represented by different colors. The training and test data are projected onto the first two eigenvectors of the covariance matrices of (a) the original data matrix $X$, (b) the LR matrix $A$ without structural incoherence, (c) the LR matrix $A$ with structural incoherence. When the training and test data are drawn from three different classes, the corresponding plots for (a), (b) and (c) are shown in (d), (e) and (f), respectively.

3.2 Optimization via ALM

Our proposed optimization problem (7) is solved by ALM Lin et al. (2010) in this section. First, we introduce an auxiliary variable $J_i$ to make problem (7) separable; the new equivalent optimization problem is given as follows,

\min_{Z_i, E_i, J_i} \ \|J_i\|_* + \lambda \|E_i\|_{2,1} + \eta \sum_{j \neq i} \|A_j^T X_i Z_i\|_F^2 \quad \text{s.t.} \quad X_i = X_i Z_i + E_i, \ Z_i = J_i    (8)

The augmented Lagrangian function of (8) is formulated as an unconstrained optimization problem,

L = \|J_i\|_* + \lambda \|E_i\|_{2,1} + \eta \sum_{j \neq i} \|A_j^T X_i Z_i\|_F^2 + \langle Y_1, X_i - X_i Z_i - E_i \rangle + \langle Y_2, Z_i - J_i \rangle + \frac{\mu}{2} \left( \|X_i - X_i Z_i - E_i\|_F^2 + \|Z_i - J_i\|_F^2 \right)    (9)

where $Y_1$ and $Y_2$ are Lagrangian multipliers, and $\mu > 0$ is a penalty parameter. Problem (9) can be rewritten by some simple algebra as follows,

L = \|J_i\|_* + \lambda \|E_i\|_{2,1} + \eta \sum_{j \neq i} \|A_j^T X_i Z_i\|_F^2 + \frac{\mu}{2} \left( \left\|X_i - X_i Z_i - E_i + \frac{Y_1}{\mu}\right\|_F^2 + \left\|Z_i - J_i + \frac{Y_2}{\mu}\right\|_F^2 \right) + \varphi(Y_1, Y_2, \mu)    (10)

where $\varphi(Y_1, Y_2, \mu) = -\frac{1}{2\mu}\left(\|Y_1\|_F^2 + \|Y_2\|_F^2\right)$.

The variables $J_i$, $Z_i$ and $E_i$ are updated iteratively. Two variables are fixed at each iteration to update the remaining one. The detailed updating schemes are shown in each step as follows.

  1. Updating $J_i$

    Updating $J_i$ by minimizing $L$ is equivalent to minimizing the following unconstrained function for $J_i$:

    J_i^* = \arg\min_{J_i} \frac{1}{\mu} \|J_i\|_* + \frac{1}{2} \left\|J_i - \left(Z_i + \frac{Y_2}{\mu}\right)\right\|_F^2

    where $U \Sigma V^T$ denotes the SVD of $Z_i + Y_2/\mu$.

    The above problem has the following closed-form solution given by the singular value thresholding operator,

    J_i^* = U \, \mathcal{S}_{1/\mu}(\Sigma) \, V^T    (11)
  2. Updating $Z_i$

    Now by fixing $J_i$, $E_i$, $Y_1$ and $Y_2$, we optimize the variable $Z_i$ for class $i$. The updating scheme of $Z_i$ is as follows:

    The solution is obtained by setting the partial derivative of $L$ w.r.t. $Z_i$ to zero, which gives,

    Z_i^* = \left(I + X_i^T X_i + \frac{2\eta}{\mu} X_i^T B_i B_i^T X_i\right)^{-1} \left(X_i^T (X_i - E_i) + J_i + \frac{X_i^T Y_1 - Y_2}{\mu}\right)    (12)

    where $B_i = [A_1, \ldots, A_{i-1}, A_{i+1}, \ldots, A_C]$ collects the low-rank matrices of the other classes.

  3. Updating $E_i$

    By fixing $Z_i$, $J_i$, $Y_1$ and $Y_2$, we update the error matrix $E_i$ for subject $i$ by column-wise shrinkage of $Q_i = X_i - X_i Z_i + Y_1/\mu$,

    [E_i^*]_{:,j} = \max\left(0,\ 1 - \frac{\lambda/\mu}{\|[Q_i]_{:,j}\|_2}\right) [Q_i]_{:,j}    (13)

Algorithm 1 summarizes the detailed procedure for solving the optimization problem (10).

  Input: Training data matrix X = [X_1, ..., X_C] for C classes, and parameters λ, η
      1. for i = 1 : C
      2. Initialize: Z_i = J_i = 0, E_i = 0, Y_1 = Y_2 = 0, μ = μ_0 > 0, ρ > 1, ε > 0
      3. while not converged and iter < maxiter do
      4. update J_i according to (11);
      5. update Z_i according to (12);
      6. update E_i according to (13);
      7. update the Lagrange multipliers
         Y_1 = Y_1 + μ(X_i - X_i Z_i - E_i);
         Y_2 = Y_2 + μ(Z_i - J_i);
      8. update the parameter μ by μ = min(ρμ, μ_max);
      9. check the convergence conditions
          ||X_i - X_i Z_i - E_i||_∞ < ε and ||Z_i - J_i||_∞ < ε
      10. update iter = iter + 1
      11. end while
      12. end for
      13. obtain an optimal solution (Z_i*, E_i*)
  Output: the recovered clean data matrix A = [X_1 Z_1*, ..., X_C Z_C*].
Algorithm 1 Solving Problem (8) by ALM
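Besides the nuclear-norm step, Algorithm 1 needs the proximal operator of the $\ell_{2,1}$-norm for the error update. A sketch of this standard column-wise shrinkage (an illustration of the generic operator, not the authors' code):

```python
import numpy as np

def l21_shrink(Q, tau):
    """Proximal operator of tau*||.||_{2,1} at Q: each column of Q is
    shortened by tau in l2 length, and columns shorter than tau are set
    to zero. This models sample-specific (column-wise) corruption."""
    E = np.zeros_like(Q)
    for j in range(Q.shape[1]):
        nrm = np.linalg.norm(Q[:, j])
        if nrm > tau:
            E[:, j] = (1.0 - tau / nrm) * Q[:, j]
    return E
```

Columns whose energy falls below the threshold are treated as clean (zero error), while heavily corrupted columns survive in the error term, which is why the $\ell_{2,1}$-norm suits outlier samples.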

An example is given in Fig. 3 to intuitively display the recovery results of DLRR for matrix recovery: the corrupted images are successfully separated into recovered clean images and error images. Some training samples from one subject in the AR database, with illumination variations, pose changes, and occlusions, are shown in Fig. 3 (a). The recovered clean samples and the corresponding sparse errors are shown in Figs. 3 (b) and (c), respectively.

Figure 3: Recovery results of DLRR.

3.3 Low-rank projection matrix

In the testing phase, possible occlusion variations appearing in the test samples could degrade the performance of CRC. Motivated by Bao et al. Bao et al. (2012), we therefore learn a low-rank projection matrix which can project new corrupted samples onto their corresponding underlying subspaces.

After the original corrupted samples $X$ are separated into the recovered matrix $A$ and the sparse error matrix $E$, the matrix $A$ can be seen as the set of principal components of $X$, so a linear projection $P$ can be learned between $X$ and $A$ such that $A = PX$. Any data sample $y$ can then be projected onto its underlying subspace to get the recovered result $\hat{y} = Py$. Based on the assumption that the recovered result is drawn from a union of multiple low-rank subspaces, we hypothesize that $P$ is a low-rank matrix, and the optimization problem can be formulated as follows,

\min_{P} \ \mathrm{rank}(P) \quad \text{s.t.} \quad A = PX    (14)

As mentioned in Section 2, a convex relaxation of problem (14) is obtained by replacing the rank function with the nuclear norm, which decreases the computational complexity. The convex optimization problem is formulated as,

\min_{P} \ \|P\|_* \quad \text{s.t.} \quad A = PX    (15)

$P^* = A X^{\dagger}$ is the unique minimizer of problem (15) under the hypothesis that $A = PX$ has a feasible solution, where $X^{\dagger}$ is the pseudo-inverse of $X$. After the optimal solution $P^*$ is obtained, a new test sample $y$ can be recovered by $\hat{y} = P^* y$.
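The closed-form minimizer above is directly computable with a Moore-Penrose pseudo-inverse. A brief sketch (illustrative; `X` and `A` are assumed to be the corrupted and recovered training matrices with one sample per column):

```python
import numpy as np

def learn_projection(X, A):
    """Closed-form low-rank projection P* = A X^+ (X^+ is the pseudo-inverse)."""
    return A @ np.linalg.pinv(X)

def correct_sample(P, y):
    """Project a (possibly corrupted) test sample onto the learned subspace."""
    return P @ y
```

Since $P^*$ is computed once from the training data, correcting each incoming test sample costs only one matrix-vector product.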

We outline the detailed procedures of collaborative representation-based classification by discriminative low-rank representation in Algorithm 2.

  Input: Training data matrix X = [X_1, ..., X_C] for C classes, a test sample y and parameters λ, η
      1. for i = 1 : C
      2. Find an optimal solution (Z_i*, E_i*) of the following optimization problem by Algorithm 1:
         min ||Z_i||_* + λ||E_i||_{2,1} + η Σ_{j≠i} ||A_j^T X_i Z_i||_F^2
         s.t. X_i = X_i Z_i + E_i
      3. end for
      4. Correct the test sample y:
         A = [X_1 Z_1*, ..., X_C Z_C*]
         P* = A X^†
         ŷ = P* y
      5. Classify ŷ by CRC
         α̂ = (A^T A + λI)^{-1} A^T ŷ
      6. for i = 1 : C
      7. r_i = ||ŷ - A_i α̂_i||_2 / ||α̂_i||_2
      8. end for
  Output: identity(y) = arg min_i r_i
Algorithm 2 Collaborative representation-based classification by discriminative low-rank representation

4 Experiments

In this section, the performance of our proposed DLRR-CR is evaluated on two face databases: the AR Martinez (1998) and Extended Yale B Georghiades et al. (2001) databases. Most of the face images chosen for face recognition have variations in illumination, expression, corruption, etc. We compare the performance of our method with state-of-the-art methods, including SRC Wright et al. (2008), CRC Zhang et al. (2011), LRC (linear regression classification) Naseem et al. (2010) and NN. To demonstrate the discriminating ability of the additional term in (7), LRR-CRC is also implemented in our experiments. The DLRR-based SRC algorithm, denoted by DLRR-SRC, is implemented to compare the effectiveness of SRC and CRC in the testing phase. In our experiments, the high dimensionality of face images leads to high computational complexity, so PCA is used for dimensionality reduction before testing. In our methods, the new learned eigenspace is spanned by the eigenvectors of the covariance matrix of the low-rank matrix learned with structural incoherence. We choose feature dimensions of 25, 50, 75, 100, 200 and 300. The ALM method Yang and Zhang (2011) is used to solve the $\ell_1$-norm problem, and the regularization parameter in ALM is set to 0.001.
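The PCA step described above, spanning the eigenspace with the leading eigenvectors of the covariance of the recovered low-rank matrix, can be sketched as follows (an illustrative snippet, not the authors' code; the column-per-sample layout is an assumption):

```python
import numpy as np

def pca_basis(A, dim):
    """Leading eigenvectors of the covariance matrix of A (columns = samples),
    sorted by decreasing eigenvalue; returns a d x dim projection basis."""
    A_centered = A - A.mean(axis=1, keepdims=True)
    cov = A_centered @ A_centered.T / A.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending order
    return eigvecs[:, ::-1][:, :dim]        # keep the `dim` leading eigenvectors

# Projecting data onto the eigenspace: features = pca_basis(A, 100).T @ A
```

Because the basis is computed from the clean low-rank matrix rather than the raw corrupted data, the resulting eigenspace is less sensitive to the sparse errors, as the text argues.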

Figure 4: Example training and test images used in our experiments.

4.1 AR database

The AR database Martinez (1998) contains over 4,000 frontal images of 126 subjects. In our experiments a subset of the AR database containing 50 male and 50 female subjects is used. There are 26 face images available for each subject, divided into two sessions, with variations in expression, illumination, and disguise. In each session, there are 7 clean images with illumination and expression variations, 3 images with sunglasses and the remaining 3 images with scarf disguise. All face images are cropped to 165×120 pixels and converted to grayscale before training and testing. Some images of the first subject in the AR database are shown in Fig. 4 (a). We consider the following three scenarios to validate the performance of DLRR-CR, as in Chen et al. (2012).

  1. Sunglasses: We consider the situation in which training and test images are corrupted by sunglasses simultaneously. The presence of sunglasses produces about 20% occlusion of the frontal face image. From session one, we use eight training images: 7 neutral images plus one image with sunglasses. We use twelve test images: all non-occluded images from session two plus the remaining face images with sunglasses.

  2. Scarf: We consider the situation in which training and test images are corrupted by disguise simultaneously due to scarves, which occlude about 40% of the frontal face image. A similar choice of training and testing sets is applied as above: from the first session, we use 7 neutral images plus one with a scarf for training, and 7 non-occluded images from the other session plus the remaining images with scarves for testing.

  3. Sunglasses+Scarf: Finally, we consider the most challenging case, in which training and test images are occluded by mixed corruption due to sunglasses and scarves. From the first session, we use 7 neutral images plus one with sunglasses and another with a scarf for training, and 7 non-occluded images from the other session plus the remaining images for testing.

For fair comparison, the feature dimensions are reduced to the same size for all methods. The two regularization parameters in our method are set to 1.1 and 0.02, respectively. The recognition accuracies for the three situations are given in Tables 1-3. From the results we can see that our method exceeds the other methods in almost every situation, which demonstrates its superiority over the other approaches in dealing with real disguise.

Dim 25 50 75 100 200 300
DLRR-CR 65.58 82.00 87.33 90.00 92.00 91.75
LRR-CR 61.25 81.58 87.50 89.50 90.67 90.58
LRR-CRC 54.75 75.67 83.08 86.67 91.33 92.08
CRC 52.08 73.67 80.25 84.67 89.25 90.50
SRC 56.67 71.83 75.67 77.92 82.25 84.00
LRC 57.50 68.08 70.50 71.75 73.67 73.92
NN 45.17 51.00 53.17 54.58 56.92 57.17
Table 1: Recognition accuracy (%) of different methods on the AR database with the occlusion of sunglasses.
Dim 25 50 75 100 200 300
DLRR-CR 58.25 84.75 88.50 90.83 91.58 91.83
LRR-CR 53.50 82.58 87.75 89.25 89.67 89.50
LRR-CRC 46.08 76.17 83.33 86.25 90.67 90.75
CRC 45.08 72.25 80.50 84.75 90.00 90.33
SRC 51.42 66.25 70.75 74.75 79.17 80.58
LRC 56.42 65.67 68.08 70.00 70.58 70.50
NN 39.75 45.42 47.00 48.83 50.50 50.75
Table 2: Recognition accuracy (%) of different methods on the AR database with the occlusion of scarves.
Dim 25 50 75 100 200 300
DLRR-CR 55.82 81.53 87.06 88.59 90.65 90.29
LRR-CR 53.82 78.29 85.53 88.00 88.82 88.53
LRR-CRC 45.65 72.76 80.29 85.53 89.71 90.12
CRC 42.94 69.29 78.35 82.12 88.18 89.53
SRC 51.06 66.00 70.88 73.65 78.06 80.47
LRC 53.53 64.71 68.35 69.41 70.94 70.76
NN 35.47 40.65 42.65 44.12 46.41 47.06
Table 3: Recognition accuracy (%) of different methods on the AR database with the occlusion of sunglasses and scarves.

4.2 Extended Yale B database

The Extended Yale B database Georghiades et al. (2001) includes 2414 frontal face images of 38 subjects; each subject has 64 face images obtained under different laboratory-controlled lighting conditions. The face images are cropped to a size of 192×168 pixels and normalized in advance. Some example face images from the Extended Yale B database are shown in Fig. 4 (b). First, 16 face images are randomly selected from each individual for training, with the remaining images used for testing. Second, 32 face images are randomly selected from each subject for training, with the remaining images used for testing. The eigenface feature dimensions are set the same as in the AR database experiments. The two regularization parameters used in DLRR-CR are set to 1.1 and 0.001, respectively; depending on the feature dimension, the remaining parameter ranges from 0.004 to 0.1 for the two cases. All experiments are run 5 times and the average accuracy is reported for performance evaluation in Tables 4-5.

From the results in Tables 4-5, for each dimension DLRR-CR outperforms the other methods, which indicates that our method can handle changes in illumination and expression. It should be noted that in step 5 of Algorithm 2, the original training samples are used to reduce dimensions. The main reason is that we have already learned a desirable PCA subspace from the derived clean, discriminative dictionary, and this subspace is not too sensitive to sparse errors. The second reason is that CRC represents the test sample collaboratively using all classes, so a small proportion of corrupted training samples has little influence, and there are also abundant images taken under well-controlled settings that can participate in representing the test sample.

Dim       25     50     75     100    200    300
DLRR-CR   83.61  92.43  94.57  95.04  97.19  97.82
LRR-CR    72.43  91.20  93.36  94.68  94.74  95.85
LRR-CRC   63.16  81.77  88.19  91.31  95.16  96.27
CRC       61.17  79.20  85.60  89.17  94.21  96.09
SRC       45.16  67.00  77.95  84.29  93.36  95.07
NN        32.52  42.36  46.16  48.43  51.90  52.73
Table 4: Recognition accuracy (%) of different methods on the Extended Yale B database with 16 training images per person.
Dim       25     50     75     100    200    300
DLRR-CR   89.26  97.83  98.22  98.86  99.39  99.61
LRR-CR    75.88  96.27  96.94  96.94  98.38  98.99
LRR-CRC   60.18  84.03  90.07  93.21  96.30  97.11
CRC       65.44  85.23  90.62  93.66  97.02  97.97
SRC       48.05  72.73  82.08  87.31  94.99  96.61
NN        39.93  52.03  56.93  59.85  64.38  65.19
Table 5: Recognition accuracy (%) of different methods on the Extended Yale B database with 32 training images per person.

4.3 FR with artificial corruption

In this subsection, we consider the scenario in which both the training and test images are corrupted by artificial occlusion. Face images from the Extended Yale B database are used to investigate the robustness of all competing approaches. In the first setting, we randomly select 10% of all the images, randomly select pixels within these images, and replace those pixels with random values in the range [0, 1]. In the second setting, to examine robustness to block occlusion, we again randomly select 10% of all the samples from the database, randomly select square blocks within these images, and replace the blocks with an unrelated image. Some representative examples of images with these two kinds of artificial occlusion are shown in Fig. 5. The percentage of corrupted samples in both settings is set to 10% and 20%.
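The first (pixel) corruption setting can be sketched as below; the fraction of pixels corrupted within each selected image (`frac_pixels`) is an assumed parameter, since the text does not fix it.

```python
import numpy as np

def add_pixel_corruption(imgs, frac_images=0.1, frac_pixels=0.3, rng=None):
    """Replace randomly chosen pixels of a randomly chosen fraction of
    images with uniform noise in [0, 1]. Illustrative sketch of the first
    corruption setting; frac_pixels is an assumed level, not from the paper."""
    rng = np.random.default_rng(rng)
    out = imgs.copy()
    n = len(out)
    for i in rng.choice(n, size=int(frac_images * n), replace=False):
        mask = rng.random(out[i].shape) < frac_pixels  # pixels to corrupt
        out[i][mask] = rng.random(int(mask.sum()))     # uniform [0, 1) noise
    return out
```

The block-occlusion setting is analogous, with a random square region pasted over with an unrelated image instead of per-pixel noise.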

Figure 5: Example images with artificial corruption.

After the artificial occlusion is added, 32 images are randomly selected from each class for training and the rest for testing. The two regularization parameters are again set to 1.1 and 0.001, and the remaining parameter, depending on the feature dimension, ranges from 0.005 to 0.1. We investigate the classification accuracy with the eigenface feature dimensions set to 50, 100 and 300. All experiments are run 5 times and the average accuracy is recorded. Table 6 shows the recognition accuracy of all seven algorithms for the two percentages of pixel corruption; our method achieves the best recognition performance, higher than that of all the other methods. The recognition accuracy under block occlusion is shown in Table 7, where the performance gain of our algorithm is again significant.

              10%                   20%
Dim           50     100    300    50     100    300
DLRR-CR       96.69  98.28  99.08  95.10  96.99  98.36
DLRR-SRC      94.38  96.88  97.94  92.85  95.94  95.88
LRR-CRC       88.84  95.66  98.55  87.76  94.88  97.94
CRC           86.95  93.10  96.85  85.62  93.04  95.27
SRC           72.76  86.70  94.57  70.31  85.00  90.93
LRC           92.21  93.52  94.27  91.37  93.10  93.18
NN            51.45  58.51  62.99  51.45  57.09  57.98
Table 6: Recognition accuracy (%) of different methods on the Extended Yale B database with pixel corruptions.
              10%                   20%
Dim           50     100    300    50     100    300
DLRR-CR       95.83  97.74  98.78  94.46  96.55  97.30
DLRR-SRC      93.57  96.33  97.80  91.93  95.13  96.46
LRR-CRC       84.73  92.85  97.36  82.39  90.82  95.88
CRC           81.58  91.62  96.72  79.16  89.18  94.46
SRC           67.75  84.20  95.60  65.38  82.41  93.35
LRC           86.06  89.96  92.07  82.33  85.81  88.34
NN            49.78  57.71  62.88  46.38  54.20  59.10
Table 7: Recognition accuracy (%) of different methods on the Extended Yale B database with block occlusions.

5 Conclusions

The collaborative representation mechanism applied to face recognition has aroused considerable interest during the past few years. The most challenging case of robust face recognition occurs when both the training and test images are corrupted by unknown types of corruption. A discriminative low-rank representation method for collaborative representation-based classification (DLRR-CR) is proposed to solve this challenging problem. Our key contribution is the use of a structural incoherence term that promotes the discrimination between different subjects. Meanwhile, CRC achieves superb performance in various pattern classification tasks with low computational complexity. The experimental results show that DLRR-CR is robust and effective against the possible corruptions present in face images.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant No. 61672265, U1836218), the 111 Project of Ministry of Education of China (Grant No. B12018), the Postgraduate Research & Practice Innovation Program of Jiangsu Province under Grant No. KYLX_1123, the Overseas Studies Program for Postgraduates of Jiangnan University and the China Scholarship Council (CSC, No.201706790096).

References

  • B. Bao, G. Liu, C. Xu, and S. Yan (2012) Inductive robust principal component analysis. IEEE Transactions on Image Processing 21 (8), pp. 3794–3800.
  • P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman (1997) Eigenfaces vs. fisherfaces: recognition using class specific linear projection. IEEE Transactions on Pattern Analysis and Machine Intelligence (7), pp. 711–720.
  • E. J. Candès, X. Li, Y. Ma, and J. Wright (2011) Robust principal component analysis?. Journal of the ACM 58 (3), pp. 11.
  • C. Chen, C. Wei, and Y. F. Wang (2012) Low-rank matrix recovery with structural incoherence for robust face recognition. In 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2618–2625.
  • S. S. Chen, D. L. Donoho, and M. A. Saunders (2001) Atomic decomposition by basis pursuit. SIAM Review 43 (1), pp. 129–159.
  • Z. Chen, X. Wu, H. Yin, and J. Kittler (2018) Robust low-rank recovery with a distance-measure structure for face recognition. In Pacific Rim International Conference on Artificial Intelligence, pp. 464–472.
  • Y. Chi and F. Porikli (2013) Classification and boosting with multiple collaborative representations. IEEE Transactions on Pattern Analysis and Machine Intelligence 36 (8), pp. 1519–1531.
  • F. De La Torre and M. J. Black (2003) A framework for robust subspace learning. International Journal of Computer Vision 54 (1-3), pp. 117–142.
  • W. Deng, J. Hu, and J. Guo (2012) Extended SRC: undersampled face recognition via intraclass variant dictionary. IEEE Transactions on Pattern Analysis and Machine Intelligence 34 (9), pp. 1864–1870.
  • C. Ding and D. Tao (2015) Robust face recognition via multimodal deep face representation. IEEE Transactions on Multimedia 17 (11), pp. 2049–2058.
  • D. L. Donoho (2006) For most large underdetermined systems of linear equations the minimal l1-norm solution is also the sparsest solution. Communications on Pure and Applied Mathematics 59 (6), pp. 797–829.
  • A. S. Georghiades, P. N. Belhumeur, and D. J. Kriegman (2001) From few to many: illumination cone models for face recognition under variable lighting and pose. IEEE Transactions on Pattern Analysis and Machine Intelligence (6), pp. 643–660.
  • X. He, S. Yan, Y. Hu, P. Niyogi, and H. Zhang (2005) Face recognition using laplacianfaces. IEEE Transactions on Pattern Analysis and Machine Intelligence (3), pp. 328–340.
  • Z. Lin, M. Chen, and Y. Ma (2010) The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices. arXiv preprint arXiv:1009.5055.
  • G. Liu, Z. Lin, S. Yan, J. Sun, Y. Yu, and Y. Ma (2012) Robust recovery of subspace structures by low-rank representation. IEEE Transactions on Pattern Analysis and Machine Intelligence 35 (1), pp. 171–184.
  • G. Liu, Z. Lin, and Y. Yu (2010) Robust subspace segmentation by low-rank representation. In ICML, Vol. 1, pp. 8.
  • C. Luo, B. Ni, S. Yan, and M. Wang (2015) Image classification by selective regularized subspace learning. IEEE Transactions on Multimedia 18 (1), pp. 40–50.
  • A. M. Martinez (1998) The AR face database. CVC Technical Report 24.
  • I. Naseem, R. Togneri, and M. Bennamoun (2010) Linear regression for face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 32 (11), pp. 2106–2112.
  • I. Ramirez, P. Sprechmann, and G. Sapiro (2010) Classification and clustering via dictionary learning with structured incoherence and shared features. In 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 3501–3508.
  • R. Rigamonti, M. A. Brown, and V. Lepetit (2011) Are sparse representations really relevant for image classification?. In CVPR 2011, pp. 1545–1552.
  • Q. Shi, A. Eriksson, A. Van Den Hengel, and C. Shen (2011) Is face recognition really a compressive sensing problem?. In CVPR 2011, pp. 553–560.
  • M. Turk and A. Pentland (1991) Eigenfaces for recognition. Journal of Cognitive Neuroscience 3 (1), pp. 71–86.
  • J. Wang, J. Yang, K. Yu, F. Lv, T. Huang, and Y. Gong (2010) Locality-constrained linear coding for image classification. In 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 3360–3367.
  • J. Wright, A. Ganesh, S. Rao, Y. Peng, and Y. Ma (2009) Robust principal component analysis: exact recovery of corrupted low-rank matrices via convex optimization. In Advances in Neural Information Processing Systems, pp. 2080–2088.
  • J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma (2008) Robust face recognition via sparse representation. IEEE Transactions on Pattern Analysis and Machine Intelligence 31 (2), pp. 210–227.
  • W. Xiao-Jun, J. Kittler, Y. Jing-Yu, K. Messer, and W. Shitong (2004) A new direct LDA (D-LDA) algorithm for feature extraction in face recognition. In Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004), Vol. 4, pp. 545–548.
  • J. Yang and Y. Zhang (2011) Alternating direction algorithms for ℓ1-problems in compressive sensing. SIAM Journal on Scientific Computing 33 (1), pp. 250–278.
  • M. Yang, L. Zhang, S. C. Shiu, and D. Zhang (2013) Gabor feature based robust representation and classification for face recognition with Gabor occlusion dictionary. Pattern Recognition 46 (7), pp. 1865–1878.
  • M. Yang and L. Zhang (2010) Gabor feature based sparse representation for face recognition with Gabor occlusion dictionary. In European Conference on Computer Vision, pp. 448–461.
  • H. Yin and X. Wu (2016) Face recognition based on structural incoherence and low rank projection. In International Conference on Intelligent Data Engineering and Automated Learning, pp. 68–78.
  • C. Zhang, J. Liu, C. Liang, Z. Xue, J. Pang, and Q. Huang (2014) Image classification by non-negative sparse coding, correlation constrained low-rank and sparse decomposition. Computer Vision and Image Understanding 123, pp. 14–22.
  • L. Zhang, M. Yang, and X. Feng (2011) Sparse representation or collaborative representation: which helps face recognition?. In 2011 International Conference on Computer Vision, pp. 471–478.
  • Q. Zhang and B. Li (2012) Mining discriminative components with low-rank and sparsity constraints for face recognition. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1469–1477.
  • Y. Zhang, Y. Zhang, J. Zhang, and Q. Dai (2015) CCR: clustering and collaborative representation for fast single image super-resolution. IEEE Transactions on Multimedia 18 (3), pp. 405–417.
  • Q. Zhao, D. Meng, Z. Xu, W. Zuo, and L. Zhang (2014) Robust principal component analysis with complex noise. In International Conference on Machine Learning, pp. 55–63.
  • W. Zhao, X. Wu, and H. Yin (2015) Collaborative representation-based robust face recognition by discriminative low-rank representation. In 2015 IEEE International Conference on Smart City/SocialCom/SustainCom (SmartCity), pp. 21–27.
  • Q. Zhen, D. Huang, Y. Wang, and L. Chen (2015) Muscular movement model based automatic 3D facial expression recognition. In International Conference on MultiMedia Modeling, pp. 522–533.
  • Y. Zheng, J. Yang, J. Yang, X. Wu, and Z. Jin (2006a) Nearest neighbour line nonparametric discriminant analysis for feature extraction. Electronics Letters 42 (12), pp. 679–680.
  • Y. Zheng, J. Yang, J. Yang, and X. Wu (2006b) A reformative kernel Fisher discriminant algorithm and its application to face recognition. Neurocomputing 69 (13-15), pp. 1806–1810.