Multiplication fusion of sparse and collaborative-competitive representation for image classification

01/20/2020 · Zi-Qi Li, et al.

Representation based classification methods have become a hot research topic during the past few years, and the two most prominent approaches are sparse representation based classification (SRC) and collaborative representation based classification (CRC). CRC reveals that it is the collaborative representation rather than the sparsity that makes SRC successful. Nevertheless, the dense representation of CRC may not be discriminative, which degrades its performance in classification tasks. To alleviate this problem to some extent, we propose a new method called sparse and collaborative-competitive representation based classification (SCCRC) for image classification. Firstly, the coefficients of the test sample are obtained by SRC and CCRC, respectively. Then the fused coefficient is derived by multiplying the coefficients of SRC and CCRC. Finally, the test sample is assigned to the class that has the minimum residual. Experimental results on several benchmark databases demonstrate the efficacy of our proposed SCCRC. The source code of SCCRC is accessible at https://github.com/li-zi-qi/SCCRC.


1 Introduction

Representation based classification methods (RBCM) have gained increasing attention in various research fields, e.g., character recognition Qu et al. (2018), person re-identification Prates and Schwartz (2019) and hyperspectral image classification Yang and Qian (2018). SRC Wright et al. (2009) is a pioneering work of RBCM: it directly uses all the training data as the dictionary to represent the test sample, and it classifies the test sample by checking which class leads to the minimal reconstruction error. SRC solves an $\ell_1$-norm optimization problem, so when the dictionary is large, the sparse decomposition process can be very slow. One way to speed up sparse coding is to reduce the size of the dictionary by selecting representative training samples. Li et al. Li et al. (2010) proposed a local sparse representation based classification (LSRC) scheme, which performs sparse decomposition in a local neighborhood. Similarly, Zhang et al. Zhang and Yang (2010) presented KNN-SRC, which chooses the $k$ nearest neighbors of a test sample from all the training samples to represent it. Ortiz et al. Ortiz and Becker (2014) developed a linearly approximated sparse representation-based classification (LASRC) algorithm that employs linear regression to perform sample selection for $\ell_1$-minimization. The other way is to obtain a compact and discriminative dictionary through dictionary learning. The most classic dictionary learning approach is K-SVD Aharon et al. (2006), an iterative method that alternates between sparse coding and updating the dictionary atoms. To make K-SVD more suitable for classification tasks, Zhang et al. Zhang and Li (2010) proposed a discriminative K-SVD (D-KSVD) method which incorporates the classification error into the objective function of K-SVD. Jiang et al. Jiang et al. (2013) presented a label consistent K-SVD (LC-KSVD) algorithm which combines a label consistency constraint with the reconstruction error and the classification error in a unified objective function. Later, Kviatkovsky et al. Kviatkovsky et al. (2017) proved that, under identical initialization conditions, LC-KSVD with uniform atom allocation is essentially a reformulation of D-KSVD. Very recently, Song et al. Song et al. (2018) designed a kernel dictionary learning approach called Euler Label Consistent K-SVD (ELC-KSVD) to capture the nonlinear similarity of features.

Another prominent approach of RBCM is CRC Zhang et al. (2011), which replaces the $\ell_1$-norm in SRC with the $\ell_2$-norm. Furthermore, Zhang et al. Zhang et al. (2011) revealed that it is the collaborative representation (CR) mechanism, rather than the $\ell_1$-norm sparsity, that makes SRC successful for classification. Likewise, Xu et al. Xu et al. (2017) introduced a discriminative sparse representation (DSR) method for robust face recognition via $\ell_2$ regularization; the representation fidelity term in DSR is measured by the $\ell_2$-norm. To enhance the robustness of DSR, Zeng et al. Zeng et al. (2017a) proposed a robust version of DSR, an antinoise sparse representation method based on joint $\ell_1$ and $\ell_2$ regularization (Anti-L1L2), which employs the $\ell_1$-norm to measure the fidelity term. Although CRC comes with a geometric interpretation of its working mechanism, this interpretation is not easy to grasp. Afterwards, Cai et al. Cai et al. (2016) analyzed the classification mechanism of CRC from a probabilistic viewpoint and proposed a probabilistic collaborative representation based classifier (ProCRC). Based on ProCRC, Lan et al. Lan et al. (2018) explored a prior knowledge-based probabilistic CRC (PKPCRC), which further exploits prior knowledge extracted from the training samples. Yuan et al. Yuan et al. (2018) constructed a collaborative-competitive representation based classifier (CCRC) model by introducing a competitive regularization term into the objective function of CRC. Using the training samples of all classes to collaboratively represent a test sample may produce negative effects; to overcome this problem, Zheng et al. Zheng and Wang (2019) presented a $k$-nearest classes based classification scheme. Moreover, Waqas et al. Waqas et al. (2013) proposed a method known as the collaborative neighbor representation classifier (CNRC), which represents a test sample over the whole training dictionary by automatically selecting bases from the training samples close to the test sample. By integrating CNRC and DSR, Gou et al. Gou et al. (2018) designed a new discriminative collaborative neighbor representation (DCNR) method for face recognition.

The above RBCM and their variants place too much emphasis on the role of $\ell_1$-norm sparsity or collaborative representation alone, and thus researchers are seeking new ways of combining the two to enhance classification performance. Akhtar et al. Akhtar et al. (2017) argued that the sparseness of a collaborative representation explicitly contributes to accurate classification, and they developed a sparsity augmented collaborative representation based classification (SA-CRC) scheme. Zeng et al. Zeng et al. (2017b) proposed a representation-based image classification method that integrates SRC with CRC through a multiplication operation on their solutions. Li et al. Li et al. (2016) presented a fused representation-based classification (FRC) method which attempts to balance CR and SR in the residual domain. However, FRC contains a weight parameter that needs to be set manually. Furthermore, the results in Akhtar et al. (2017) demonstrate that the sparsity of a collaborative representation does play a critical role in the correct classification of test samples. By multiplying the coefficient vectors obtained by SRC and CRC, the sparsity of the fused coefficient vector in SCRC Zeng et al. (2017b) is enhanced to some extent. Unfortunately, the coefficient vector obtained by CRC has only limited sparsity. To further promote the sparsity of the fused coefficient vector, in this paper we propose a new method coined sparse and collaborative-competitive representation based classification (SCCRC) for image classification. The fused coefficient is obtained by multiplying the coefficients of SRC and CCRC, and the test sample is then classified by checking which class yields the least residual. Some representative RBCM and our proposed SCCRC are summarized in Table 1, and our main contributions are summarized as follows,

  1. Our proposed SCCRC involves only the multiplication of the coefficients obtained by SRC and CCRC; it does not need any additional fusion parameter, which makes SCCRC very efficient.

  2. Although SCCRC is simple in principle, it outperforms some state-of-the-art RBCM on both clean and corrupted images in terms of classification accuracy.

  3. Statistical significance tests indicate that the performance differences between SCCRC and most of the competing approaches are statistically significant.

Algorithm | Formulation
SRC Wright et al. (2009) | $\hat{x}_s = \arg\min_x \|y - Ax\|_2^2 + \lambda\|x\|_1$
CRC Zhang et al. (2011) | $\hat{x}_c = \arg\min_x \|y - Ax\|_2^2 + \lambda\|x\|_2^2$
FRC Li et al. (2016) | $r_i(y) = \theta\, r_i^{SR}(y) + (1 - \theta)\, r_i^{CR}(y)$
SCRC Zeng et al. (2017b) | $\hat{x} = \hat{x}_s \odot \hat{x}_c$ ($\hat{x}_c$ from CRC)
NRC Xu et al. (2019) | $\hat{x} = \arg\min_x \|y - Ax\|_2^2 \ \text{s.t.}\ x \ge 0$
SA-CRC Akhtar et al. (2017) | $\hat{x} = \hat{x}_s + \hat{x}_c$
CCRC Yuan et al. (2018) | $\hat{x}_c = \arg\min_x \|y - Ax\|_2^2 + \lambda\|x\|_2^2 + \gamma\sum_{i=1}^{c}\|y - A_i x_i\|_2^2$
SCCRC | $\hat{x} = \hat{x}_s \odot \hat{x}_c$ ($\hat{x}_c$ from CCRC)
Table 1: Summary of some representative RBCM (see the cited papers for the exact formulations). Note that in FRC, the residuals obtained by SRC and CRC are fused by a weighting parameter $\theta$. In SCRC and SA-CRC, $\hat{x}_s$ and $\hat{x}_c$ are the coefficient vectors obtained by SRC and CRC, respectively. In our proposed SCCRC, $\hat{x}_s$ and $\hat{x}_c$ are the coefficient vectors obtained by SRC and CCRC, respectively.

The remainder of this paper is structured as follows: Section 2 introduces several related approaches which include SRC, CRC, LRC Naseem et al. (2010) and CCRC. Section 3 presents our SCCRC algorithm. Section 4 reports the experiments on several benchmark databases. Finally, Section 5 concludes the paper.

2 Related work

As our SCCRC builds on SRC and CCRC, and CRC and LRC serve as baselines throughout, we briefly review these approaches for the sake of completeness.

We consider a set of $n$ training samples collected from $c$ subjects, where each training image is represented as a vector $a_j \in \mathbb{R}^d$ corresponding to the $j$th column of the dictionary. Thus, all training samples form the matrix $A = [A_1, A_2, \dots, A_c] \in \mathbb{R}^{d \times n}$ with $n = \sum_{i=1}^{c} n_i$, where $d$ is the dimensionality of each sample and $n_i$ denotes the number of training samples in the $i$th class.

2.1 Sparse representation based classification

Given a test sample $y \in \mathbb{R}^d$, SRC employs a sparse linear superposition of all the training data to represent the test sample by solving the following $\ell_1$-norm minimization problem,

$\hat{x}_s = \arg\min_{x} \|y - Ax\|_2^2 + \lambda \|x\|_1$  (1)

where $\lambda$ is a balancing parameter. Then the reconstruction error (residual) for each class is obtained by,

$r_i(y) = \|y - A\,\delta_i(\hat{x}_s)\|_2, \quad i = 1, \dots, c$  (2)

where $\delta_i(\hat{x}_s)$ keeps the coefficients that correspond to the $i$th class and sets all other entries to zero. Finally, the identity of the test sample is determined by evaluating which class leads to the least residual, i.e.,

$\mathrm{identity}(y) = \arg\min_{i} r_i(y)$  (3)
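
To make the procedure concrete, the following Python/NumPy sketch implements Eqs. (1)-(3). The paper's released code is MATLAB and uses SolveFISTA.m; the plain ISTA solver and the function names below are our own illustrative stand-ins.

```python
import numpy as np

def soft_threshold(u, tau):
    """Element-wise soft-thresholding, the proximal operator of the l1-norm."""
    return np.sign(u) * np.maximum(np.abs(u) - tau, 0.0)

def src_coefficients(A, y, lam=1e-3, n_iter=500):
    """Solve Eq. (1), min_x ||y - Ax||_2^2 + lam*||x||_1, with plain ISTA.
    The paper solves this with SolveFISTA.m; ISTA is a slower but simpler stand-in."""
    L = 2.0 * np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * A.T @ (A @ x - y)         # gradient of the quadratic term
        x = soft_threshold(x - grad / L, lam / L)
    return x

def classify_by_residual(A, y, x, class_index):
    """Eqs. (2)-(3): keep one class's coefficients at a time, pick the least residual."""
    labels = np.unique(class_index)
    residuals = [np.linalg.norm(y - A @ np.where(class_index == i, x, 0.0))
                 for i in labels]
    return labels[int(np.argmin(residuals))]
```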

2.2 Collaborative representation based classification

CRC estimates the representation of the test sample $y$ by relaxing the $\ell_1$-norm in (1) to the $\ell_2$-norm. The objective function of CRC is formulated as follows,

$\hat{x}_c = \arg\min_{x} \|y - Ax\|_2^2 + \lambda \|x\|_2^2$  (4)

where $\lambda$ is a balancing parameter. Eq. (4) has the closed-form solution $\hat{x}_c = (A^{\top}A + \lambda I)^{-1}A^{\top}y$. In Zhang et al. (2011), Zhang et al. found that, in addition to the residual, the class-wise coefficients can also bring some discrimination information for classification. Therefore, they proposed the following regularized residual for classification,

$r_i(y) = \dfrac{\|y - A_i\,\hat{x}_c^{\,i}\|_2}{\|\hat{x}_c^{\,i}\|_2}, \quad i = 1, \dots, c$  (5)

where $\hat{x}_c^{\,i}$ denotes the sub-vector of $\hat{x}_c$ associated with the $i$th class.
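
A minimal sketch of CRC under the notation above (same illustrative Python/NumPy conventions as before; the helper names are ours):

```python
import numpy as np

def crc_coefficients(A, y, lam=1e-3):
    """Closed-form solution of Eq. (4): x_c = (A^T A + lam*I)^{-1} A^T y."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ y)

def crc_classify(A, y, x, class_index):
    """Eq. (5): regularized residual ||y - A_i x^i||_2 / ||x^i||_2 for each class."""
    labels = np.unique(class_index)
    residuals = []
    for i in labels:
        mask = class_index == i
        r = np.linalg.norm(y - A[:, mask] @ x[mask]) / (np.linalg.norm(x[mask]) + 1e-12)
        residuals.append(r)
    return labels[int(np.argmin(residuals))]
```

Because the projection matrix $(A^{\top}A + \lambda I)^{-1}A^{\top}$ does not depend on $y$, it can be precomputed once and reused for every test sample, which is why CRC is fast in practice.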

2.3 Linear regression classification

Different from SRC and CRC, which employ all the training samples for representation, LRC uses the training samples of each class to reconstruct the test sample in a class-wise manner. Specifically, the objective function of LRC is formulated as,

$\hat{x}_i = \arg\min_{x_i} \|y - A_i x_i\|_2^2, \quad i = 1, \dots, c$  (6)

Eq. (6) has a closed-form solution, which is given by,

$\hat{x}_i = (A_i^{\top}A_i)^{-1}A_i^{\top}y$  (7)

Finally, the test sample is classified according to the following rule,

$\mathrm{identity}(y) = \arg\min_{i} \|y - A_i \hat{x}_i\|_2$  (8)

In fact, LRC is in essence a nearest subspace (NS) classifier operating on downsampled features.
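
The class-wise procedure of Eqs. (6)-(8) can be sketched as follows (least squares is used in place of the explicit inverse in Eq. (7) for numerical stability):

```python
import numpy as np

def lrc_classify(A, y, class_index):
    """LRC, Eqs. (6)-(8): class-wise least squares, then the nearest-subspace rule."""
    labels = np.unique(class_index)
    residuals = []
    for i in labels:
        Ai = A[:, class_index == i]
        xi, *_ = np.linalg.lstsq(Ai, y, rcond=None)   # Eq. (7), solved stably
        residuals.append(np.linalg.norm(y - Ai @ xi)) # class-wise residual
    return labels[int(np.argmin(residuals))]          # Eq. (8)
```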

2.4 Collaborative-competitive representation based classification (CCRC)

Yuan et al. Yuan et al. (2018) proposed a collaborative-competitive representation based classifier (CCRC) model which incorporates a competitive term into the formulation of CRC. The objective function of CCRC is formulated as follows,

$\hat{x}_c = \arg\min_{x} \|y - Ax\|_2^2 + \lambda \|x\|_2^2 + \gamma \sum_{i=1}^{c} \|y - A_i x_i\|_2^2$  (9)

where the fidelity term $\|y - Ax\|_2^2$ aims to collaboratively express the test sample by using all the training samples, the term $\sum_{i=1}^{c}\|y - A_i x_i\|_2^2$ encourages the competitive representation across different classes, and $\lambda$ and $\gamma$ are balancing parameters. As we can see from Eq. (9), if $\gamma$ equals 0, CCRC boils down to CRC. CCRC has a closed-form solution, which is given by,

$\hat{x}_c = P^{-1} q$  (10)

where $P = A^{\top}A + \lambda I + \gamma \sum_{i=1}^{c} B_i^{\top}B_i$ and $q = (1 + \gamma)A^{\top}y$, with $B_i = [0, \dots, 0, A_i, 0, \dots, 0]$ denoting the matrix of the same size as $A$ that retains only the columns of the $i$th class (so that $A_i x_i = B_i x$).
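
A sketch of the closed form in Eq. (10), under the competitive-term form reconstructed above (consult Yuan et al. (2018) for the authoritative formulation):

```python
import numpy as np

def ccrc_coefficients(A, y, class_index, lam=1e-3, gamma=1e-3):
    """Closed form of Eq. (10) as reconstructed above:
    x_c = (A^T A + lam*I + gamma * sum_i B_i^T B_i)^{-1} (1 + gamma) A^T y,
    where sum_i B_i^T B_i is block-diagonal with blocks A_i^T A_i."""
    n = A.shape[1]
    G = A.T @ A
    blocks = np.zeros((n, n))
    for i in np.unique(class_index):
        m = class_index == i
        blocks[np.ix_(m, m)] = G[np.ix_(m, m)]   # block A_i^T A_i
    P = G + lam * np.eye(n) + gamma * blocks
    return np.linalg.solve(P, (1.0 + gamma) * A.T @ y)
```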

3 Sparse and collaborative-competitive representation based classification

3.1 SCCRC method

Although CRC and its improved approaches achieve impressive results in various classification tasks, it does not mean that sparsity can be totally ignored. Deng et al. Deng et al. (2013) pointed out that the dense coefficients of CRC may mislead the classification, and in a more recent work Deng et al. (2018), they found that given uncontrolled and limited training data, $\ell_1$-minimization obtains more desirable results than $\ell_2$-minimization. Akhtar et al. Akhtar et al. (2017) also argued that sparsity plays an explicit role in accurate classification, and they proposed a sparsity augmented collaborative representation based classification (SA-CRC) algorithm. Inspired by the above work, we present the sparse and collaborative-competitive representation based classification (SCCRC) method, which combines sparse and collaborative-competitive representation for classification. Concretely, we first obtain the coefficients of the test sample by SRC and CCRC, respectively. Then the two coefficient vectors are fused by element-wise multiplication. Finally, we classify the test sample to the class that has the minimal residual. The detailed procedure of our SCCRC is summarized in Algorithm 1.

Input: training data matrix $A$, test sample $y$, parameter $\lambda$ for SRC, parameters $\lambda$ and $\gamma$ for CCRC.
1:  Compute the sparse coefficient vector $\hat{x}_s$ of the test sample $y$ according to Eq. (1);
2:  Obtain the CCRC coefficient vector $\hat{x}_c$ for $y$ based on Eq. (10);
3:  Compute the fused coefficient vector $\hat{x} = \hat{x}_s \odot \hat{x}_c$, where $\odot$ denotes element-wise multiplication;
4:  Classify $y$ to the class that has the least residual: $\mathrm{identity}(y) = \arg\min_i \|y - A_i \hat{x}^{\,i}\|_2$;
Output: the identity of $y$.
Algorithm 1 SCCRC
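
Chaining the earlier sketches gives the following illustrative implementation of Algorithm 1 (lam_src, lam and gamma are the three parameters discussed in Section 4.8; all names are ours):

```python
import numpy as np

def sccrc_classify(A, y, class_index, lam_src=1e-3, lam=1e-3, gamma=1e-3):
    """Algorithm 1 sketch, chaining the earlier SRC and CCRC sketches:
    element-wise multiplication fusion followed by the least-residual rule."""
    A = A / (np.linalg.norm(A, axis=0, keepdims=True) + 1e-12)  # unit l2-norm atoms
    y = y / (np.linalg.norm(y) + 1e-12)
    x_s = src_coefficients(A, y, lam_src)                   # step 1, Eq. (1)
    x_c = ccrc_coefficients(A, y, class_index, lam, gamma)  # step 2, Eq. (10)
    x = x_s * x_c                                 # step 3, multiplication fusion
    labels = np.unique(class_index)
    residuals = [np.linalg.norm(y - A[:, class_index == i] @ x[class_index == i])
                 for i in labels]
    return labels[int(np.argmin(residuals))]      # step 4, least residual
```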

3.2 Difference between SCCRC and CCRC-$\ell_1$

Our proposed SCCRC aims to promote the sparsity of the coefficient vector obtained by CCRC through a simple multiplication of the coefficient vectors derived by SRC and CCRC. Another intuitive way to increase the sparsity of the CCRC coefficient vector is to modify the objective function of CCRC, i.e., to replace the $\ell_2$-norm constraint on $x$ with the $\ell_1$-norm. We call this method CCRC-$\ell_1$, and its objective function is formulated as follows,

$\min_{x} \|y - Ax\|_2^2 + \lambda \|x\|_1 + \gamma \sum_{i=1}^{c} \|y - A_i x_i\|_2^2$  (11)

To solve Eq. (11), we introduce an auxiliary variable $z$, so that Eq. (11) can be converted into the following equivalent optimization problem,

$\min_{x,z} \|y - Ax\|_2^2 + \lambda \|z\|_1 + \gamma \sum_{i=1}^{c} \|y - A_i x_i\|_2^2 \quad \text{s.t.} \quad x = z$  (12)

The Augmented Lagrange Multiplier (ALM) scheme can be adopted to solve Eq. (12), and the augmented Lagrangian function is formulated as,

$\mathcal{L}(x, z, \delta) = \|y - Ax\|_2^2 + \lambda \|z\|_1 + \gamma \sum_{i=1}^{c} \|y - A_i x_i\|_2^2 + \delta^{\top}(x - z) + \dfrac{\rho}{2}\|x - z\|_2^2$  (13)

where $\delta$ is the Lagrange multiplier and $\rho > 0$ is a penalty parameter. Eq. (13) can be solved by iteratively updating $x$ and $z$, one at a time. The detailed procedures are presented as follows.

Update $x$: fix $z$ and update $x$ by solving the following problem,

$x = \arg\min_{x} \|y - Ax\|_2^2 + \gamma \sum_{i=1}^{c} \|y - A_i x_i\|_2^2 + \dfrac{\rho}{2}\Big\|x - z + \dfrac{\delta}{\rho}\Big\|_2^2$  (14)

Suppose $B_i$ is a matrix that has the same size as $A$ and only consists of the samples from the $i$th class, i.e., $B_i = [0, \dots, 0, A_i, 0, \dots, 0]$, so that $A_i x_i = B_i x$. Equation (14) can then be reformulated as,

$x = \arg\min_{x} \|y - Ax\|_2^2 + \gamma \sum_{i=1}^{c} \|y - B_i x\|_2^2 + \dfrac{\rho}{2}\Big\|x - z + \dfrac{\delta}{\rho}\Big\|_2^2$  (15)

which has the following closed-form solution,

$x = \Big(2A^{\top}A + 2\gamma \sum_{i=1}^{c} B_i^{\top}B_i + \rho I\Big)^{-1}\Big(2(1+\gamma)A^{\top}y + \rho z - \delta\Big)$  (16)

where $I$ is an identity matrix.

Update $z$: fix $x$ and update $z$ by solving the following problem,

$z = \arg\min_{z} \lambda \|z\|_1 + \dfrac{\rho}{2}\Big\|x - z + \dfrac{\delta}{\rho}\Big\|_2^2$  (17)

Eq. (17) can be solved by the soft-thresholding operator Combettes and Wajs (2005), i.e., $z = \mathrm{sign}(v)\max(|v| - \lambda/\rho,\, 0)$ applied element-wise, with $v = x + \delta/\rho$. The complete procedures for solving Eq. (12) are summarized in Algorithm 2.

Input: test sample $y$, training data matrix $A$, balancing parameters $\lambda$ and $\gamma$.
1:  Initialize $x = 0$, $z = 0$, $\delta = 0$, $\rho > 0$, $\mu > 1$;
2:  while not converged do
3:     Update $x$ by Eq. (16);
4:     Update $z$ by solving Eq. (17);
5:     Update $\delta$ by $\delta = \delta + \rho(x - z)$;
6:     Update $\rho$ by $\rho = \min(\mu\rho, \rho_{\max})$;
7:     Check the convergence condition: $\|x - z\|_{\infty} < \varepsilon$.
8:  end while
Output: coefficient vector $\hat{x}_c = x$.
Algorithm 2 Solve Eq. (12) via ALM
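
A compact sketch of Algorithm 2; the update rules follow Eqs. (16)-(17) as reconstructed above, and the defaults for rho, mu and the tolerance are illustrative choices, not values taken from the paper:

```python
import numpy as np

def ccrc_l1_alm(A, y, class_index, lam=1e-3, gamma=1e-3,
                rho=1e-2, mu=1.1, rho_max=1e6, tol=1e-6, max_iter=500):
    """ALM solver sketch for the CCRC-l1 problem (12)."""
    n = A.shape[1]
    G = A.T @ A
    blocks = np.zeros((n, n))                     # sum_i B_i^T B_i (block diagonal)
    for i in np.unique(class_index):
        m = class_index == i
        blocks[np.ix_(m, m)] = G[np.ix_(m, m)]
    q = (1.0 + gamma) * A.T @ y
    x = np.zeros(n); z = np.zeros(n); delta = np.zeros(n)
    for _ in range(max_iter):
        # x-update, Eq. (16): ridge-type closed form
        x = np.linalg.solve(2.0 * G + 2.0 * gamma * blocks + rho * np.eye(n),
                            2.0 * q + rho * z - delta)
        # z-update, Eq. (17): element-wise soft-thresholding
        v = x + delta / rho
        z = np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0)
        delta = delta + rho * (x - z)             # multiplier step
        rho = min(mu * rho, rho_max)              # penalty step
        if np.max(np.abs(x - z)) < tol:           # convergence check
            break
    return x
```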

From Eq. (11), one can see that CCRC-$\ell_1$ directly imposes the $\ell_1$-norm constraint on the coefficient vector $x$, and this constraint induces sparsity in $x$. CCRC-$\ell_1$ is solved by ALM, which updates the variables iteratively. By contrast, our proposed SCCRC is more straightforward: it fuses the coefficients of SRC and CCRC by multiplication. Note that the coefficient vectors of test data can be viewed as a kind of feature, so the multiplication fusion of coefficient vectors amounts to data integration at the feature level. We believe that this feature-level fusion brings improved performance, as demonstrated by the experimental results in Section 4.

3.3 Rationale of SCCRC

In our proposed SCCRC, the fused coefficient vector of a test sample is derived by $\hat{x} = \hat{x}_s \odot \hat{x}_c$, where $\hat{x}_s$ and $\hat{x}_c$ are obtained by SRC and CCRC, respectively. Following the procedures of SRC and CCRC, the test sample and the training data matrix are normalized to have unit $\ell_2$-norm. Therefore, the absolute values of the entries in the coefficient vectors are less than 1. Only when the corresponding elements in both $\hat{x}_s$ and $\hat{x}_c$ have large absolute values will the corresponding element in $\hat{x}$ also have a (relatively) large absolute value. In most cases, the coefficient vector obtained by SRC is sparse, while the coefficient vector obtained by CCRC is somewhat dense, and the coefficients of the training samples whose labels match that of the test sample tend to have large absolute values. Therefore, the sparsity of the fused coefficient vector is promoted, which is beneficial for the correct classification of the test sample.

To vividly illustrate the effectiveness of SCCRC, we present an example on the ORL database. We choose a test image from the 28th subject, and the training data consist of the first 3 images per person; the dictionary thus contains 120 atoms. Fig. 1 shows the coefficients obtained by SRC; the prominent coefficients correspond to the 28th class. Fig. 2 depicts the reconstruction error for each class; the 28th class has the least residual, so the test sample is correctly classified. Fig. 3 plots the coefficients computed by CCRC; we can find that the largest coefficient corresponds to the 28th class. However, when we check the residuals presented in Fig. 4, the test sample is wrongly classified into the 6th class. The reason is that the other two coefficients of the 28th subject are negative, so the 28th class does not produce the least residual. Fig. 5 shows the coefficients computed by our SCCRC; we can see that by fusing the coefficients of SRC and CCRC, the dominant coefficients correspond to the 28th class. Though the other two coefficients are negative, they are relatively small compared with the positive one. Thus, the residual of the 28th subject is minimal, as illustrated in Fig. 6. By comparing Fig. 5 with Figs. 1 and 3, we can observe that the fused coefficient vector of SCCRC is sparser than those of SRC and CCRC, which validates the effectiveness of multiplication fusion.

Figure 1: Coefficients computed by SRC; the prominent coefficients concentrate on the correct class (the 28th class).
Figure 2: The residuals obtained by SRC; the 28th class has the least residual.
Figure 3: Coefficients computed by CCRC; although the 28th class has the largest coefficient, the other two coefficients are negative. In fact, the test sample is wrongly classified into the 6th class.
Figure 4: The residuals obtained by CCRC; the 6th class has the least residual, so the test sample is misclassified.
Figure 5: Coefficients computed by SCCRC; the dominant coefficients correspond to the correct class (the 28th class).
Figure 6: The residuals obtained by SCCRC; the 28th class has the least residual.

From Figs. 2, 4 and 6, we can see that by augmenting the sparsity of the coefficients obtained by CCRC, SCCRC can avoid such misclassification to some degree. Next we employ another criterion to illustrate the effectiveness of SCCRC. In Wright et al. (2009), Wright et al. defined a measure of how concentrated the coefficients are on a single class, the sparsity concentration index (SCI), which is defined as,

$\mathrm{SCI}(x) = \dfrac{c \cdot \max_i \|\delta_i(x)\|_1 / \|x\|_1 - 1}{c - 1} \in [0, 1]$  (18)

Larger values of SCI indicate enhanced sparsity; therefore, we exploit this index on the above test sample. The SCI values of SRC, CCRC and our proposed SCCRC are 0.0428, 0.0714 and 0.2079, respectively. We can see that by introducing the competitive regularization term, the sparsity of CCRC is improved compared with SRC. By combining sparse and collaborative-competitive representation, our proposed SCCRC achieves the highest sparsity; thus, improved classification results can be expected.
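
A one-function sketch of Eq. (18), useful for reproducing the SCI comparison above (the helper name is ours):

```python
import numpy as np

def sci(x, class_index):
    """Sparsity concentration index, Eq. (18):
    SCI(x) = (c * max_i ||delta_i(x)||_1 / ||x||_1 - 1) / (c - 1), in [0, 1]."""
    labels = np.unique(class_index)
    c = len(labels)
    per_class = [np.abs(x[class_index == i]).sum() for i in labels]
    return (c * max(per_class) / (np.abs(x).sum() + 1e-12) - 1.0) / (c - 1.0)
```

For instance, sci(x_s * x_c, class_index) can be compared against sci(x_s, class_index) and sci(x_c, class_index) to check that the fused vector is the most concentrated.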

4 Experimental results and analysis

In this section, we report the performance of SCCRC on five publicly available datasets: the ORL Samaria and Harter (1994), Georgia Tech (GT) Goel et al. (2005), FERET Phillips et al. (1997), Extended Yale B Georghiades et al. (2001) and AR Martínez and Benavente (1998) databases. We compare the classification accuracy of SCCRC with SRC Wright et al. (2009), LRC Naseem et al. (2010), CRC Zhang et al. (2011), SCRC Zeng et al. (2017b), NRC Xu et al. (2019), ProCRC Cai et al. (2016), CCRC Yuan et al. (2018) and CCRC-$\ell_1$. In addition, we present the classification time (in seconds) of all the competing approaches. We use SolveFISTA.m Yang et al. (2010) to solve the sparse optimization problem. The parameters $\lambda$ and $\gamma$ in CCRC and our proposed SCCRC are selected from a candidate set of values (see the sensitivity analysis in Section 4.8). All experiments are conducted with MATLAB R2019b under Windows 10 on a PC equipped with an Intel i9-8950HK 2.90 GHz CPU and 32 GB RAM.

4.1 Experiments on the ORL database

The ORL database contains 400 images of 40 individuals. For each subject, there are 10 images with variations in lighting, facial expression and facial details (with or without glasses); Fig. 7 shows some face images from this database. In our experiments, each image is resized to 56×46, and the first 1 to 6 images per person are selected as training samples, with the remaining images used as testing samples.

The recognition accuracy and the testing time (when the first 6 images per subject are used as training samples) of the different methods are shown in Table 2. We can see that SCCRC outperforms all the competing approaches in terms of recognition accuracy. Meanwhile, the testing time of SCCRC is comparable to that of SCRC, and it is about 31 times faster than NRC. With an increasing number of training samples, the classification accuracy of SCCRC increases consistently. By integrating SRC and CRC, SCRC obtains better results than SRC and CRC alone, especially when the number of training samples is relatively small. By introducing the non-negative constraint, NRC exhibits superiority over SRC and CRC. Thanks to the competitive regularization term, CCRC outperforms CRC as the number of training samples increases.

Figure 7: Example images from the ORL database.
Methods 1 2 3 4 5 6 testing time (s)
SRC Wright et al. (2009) 73.06 82.50 84.64 88.33 88.50 92.50 0.79
LRC Naseem et al. (2010) 67.50 79.38 81.43 86.25 88.00 94.38 0.70
CRC Zhang et al. (2011) 71.67 83.75 86.07 91.25 90.50 92.50 0.34
SCRC Zeng et al. (2017b) 73.89 86.56 86.43 91.25 92.50 92.50 0.77
NRC Xu et al. (2019) 72.78 87.19 88.21 90.00 90.00 93.75 28.66
ProCRC Cai et al. (2016) 67.50 85.00 86.07 90.41 92.00 94.37 0.09
CCRC Yuan et al. (2018) 68.06 85.63 87.50 90.83 92.50 94.38 0.29
CCRC-$\ell_1$ 71.67 85.63 87.14 90.00 89.50 93.13 0.29
SCCRC 73.89 87.81 88.57 91.67 93.00 95.00 0.91
Table 2: Classification accuracy (%) and the testing time of competing approaches on the ORL database.

4.2 Experiments on the GT database

There are 750 face images (50 individuals, each with 15 images) in the GT face database. These images show frontal and/or tilted faces with different facial expressions, lighting conditions and scales. In our experiments, all images are cropped and resized to 60×50. Fig. 8 shows some cropped face images from the GT face dataset. The first 1-6 face images of each subject are used as training samples, and the remaining images are regarded as testing samples.

Table 3 lists the classification accuracy and the testing time (when the first 6 images per subject are used as training samples) of the compared methods on the GT database. One can see that the accuracy of all competing methods increases with the number of training samples, and the proposed SCCRC achieves the best results in most cases. Moreover, ProCRC has the shortest testing time, and SCCRC is about 25 times faster than NRC.

Figure 8: Example images from the GT database.
Methods 1 2 3 4 5 6 testing time (s)
SRC Wright et al. (2009) 38.71 47.38 49.17 52.18 53.60 62.44 5.38
LRC Naseem et al. (2010) 36.14 46.62 51.33 55.64 59.60 67.33 2.81
CRC Zhang et al. (2011) 36.43 46.00 49.33 54.36 58.40 64.67 1.68
SCRC Zeng et al. (2017b) 37.71 48.31 53.83 56.91 61.20 68.00 4.47
NRC Xu et al. (2019) 36.86 47.69 50.33 57.82 60.20 66.89 116.20
ProCRC Cai et al. (2016) 34.14 45.69 51.33 55.09 58.40 63.77 0.35
CCRC Yuan et al. (2018) 33.43 44.46 48.67 51.64 55.40 61.33 1.30
CCRC-$\ell_1$ 33.86 44.77 48.67 51.64 55.20 62.89 1.32
SCCRC 39.14 50.15 54.50 57.64 62.20 68.89 4.57
Table 3: Classification accuracy (%) and the testing time of various approaches on the GT database.

4.3 Experiments on the FERET database

The subset used here is from the well-known FERET face database and consists of 1400 face images of 200 subjects, each providing seven different face images. It is composed of images in the original FERET face dataset whose names are marked with the two-character strings ba, bj, bk, be, bf, bd and bg. Fig. 9 shows some of these face images. Each face image is resized to 40×40. The first 1-6 face images of each subject are selected as training samples, and the rest as testing samples.

Table 4 details the classification results of the competing methods. One can see that SCCRC is very efficient, its testing time being only about one-fifth of that of NRC. The classification results of all approaches drop when the number of training samples per subject is 3; Zeng et al. (2017b) also reported this observation. The reason behind this phenomenon is that we select the first 3 images for all subjects, that is, the indices of the training samples are fixed across different individuals. A more reasonable way would be to randomly choose the training samples for each subject.

Figure 9: Example images from the FERET database.
Methods 1 2 3 4 5 6 testing time (s)
SRC Wright et al. (2009) 45.83 57.20 52.38 68.00 76.25 79.50 8.57
LRC Naseem et al. (2010) 44.50 63.10 58.88 76.33 81.50 75.50 3.36
CRC Zhang et al. (2011) 40.67 55.60 47.38 56.83 65.25 69.50 12.96
SCRC Zeng et al. (2017b) 45.67 61.90 56.00 69.17 80.50 80.50 8.61
NRC Xu et al. (2019) 46.33 59.90 52.62 63.00 73.75 73.00 42.46
ProCRC Cai et al. (2016) 39.75 54.90 42.87 49.66 56.75 50.00 0.43
CCRC Yuan et al. (2018) 42.08 57.20 48.25 53.83 57.00 47.00 5.59
CCRC-$\ell_1$ 43.42 58.50 48.63 59.33 69.75 68.50 6.09
SCCRC 47.83 65.50 60.12 74.83 83.25 85.50 8.92
Table 4: Classification accuracy (%) and the testing time of different methods on the FERET database.

In order to demonstrate the statistical significance of our proposed SCCRC compared with the other methods, we conduct a significance test, McNemar's test Wen et al. (2018); Li and Fu (2015), on the results shown in Table 4. The significance level is set to 0.05, which means that the performance difference between two methods is deemed statistically significant if the estimated p-value is lower than 0.05. Table 5 lists the p-values between SCCRC and the other methods. From this table, one can see that the performance differences between SCCRC and SRC, CRC, SCRC, ProCRC, CCRC and CCRC-$\ell_1$ are statistically significant in all cases. The performance differences between SCCRC and LRC/NRC are not always statistically significant; however, SCCRC outperforms NRC in all cases, and SCCRC is superior to LRC except when there are 4 training samples per subject. The above experimental results validate the effectiveness of our proposed SCCRC.

Methods 1 2 3 4 5 6
SRC Wright et al. (2009) 6.59 2.54 2.73 2.64
LRC Naseem et al. (2010) 0.3291 0.2370 0.3222 1.03
CRC Zhang et al. (2011) 3.90 1.05 1.83 5.67 2.52 1.02
SCRC Zeng et al. (2017b) 2.18 3.74 7.27
NRC Xu et al. (2019) 0.1783 6.71 5.68 1.13 4.00 8.67
ProCRC Cai et al. (2016) 1.79 6.32 1.28 1.46 2.58 5.23
CCRC Yuan et al. (2018) 5.55 2.93 2.74 4.65 2.20 1.02
CCRC-$\ell_1$ 1.77 1.32 5.26 1.70 1.49 3.62
Table 5: p-values between SCCRC and the other methods on the FERET database. A value lower than 0.05 indicates that the difference between the two methods is statistically significant at the 0.05 significance level.
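
For reference, here is a minimal sketch of the exact (binomial) variant of McNemar's test on paired classifier outputs; the paper does not state which variant it uses, so this is one reasonable choice:

```python
import math

def mcnemar_p_value(correct_a, correct_b):
    """Exact (binomial) McNemar test on paired classifier outputs.
    correct_a / correct_b are booleans: whether each test sample was
    classified correctly by method A and by method B, respectively."""
    n01 = sum(a and not b for a, b in zip(correct_a, correct_b))  # A right, B wrong
    n10 = sum(b and not a for a, b in zip(correct_a, correct_b))  # B right, A wrong
    n = n01 + n10
    if n == 0:
        return 1.0                        # the two methods never disagree
    k = min(n01, n10)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2.0 ** n
    return min(1.0, 2.0 * tail)           # two-sided p-value under Binomial(n, 0.5)
```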

4.4 Experiments on the Extended Yale B database

The Extended Yale B Georghiades et al. (2001) face dataset contains 2414 frontal facial images of 38 individuals. These images were captured under various controlled lighting conditions. The size of each image is 192×168 pixels. In our experiments, all images are cropped and resized to 32×32 pixels. Fig. 10 shows some face images from the Extended Yale B face dataset. The first 5, 10, 15, 20, 25 and 30 face images of each subject are treated as training samples and the remaining as testing samples.

The classification accuracy and the testing time (when the first 30 images per subject are used as training samples) of the different approaches on this database are listed in Table 6. We can observe that the testing time of SCCRC is comparable to that of SCRC, and it is about 3 times faster than NRC. Similar to the experimental results on the above three databases, the classification accuracy of SCCRC increases steadily with the number of training samples per subject, and the performance gain can be substantial. For example, when the number of training samples is 15, SCCRC achieves 6.73% higher accuracy than the second-best method, i.e., LRC.

Figure 10: Example images from the Extended Yale B database.
Methods 5 10 15 20 25 30 testing time (s)
SRC Wright et al. (2009) 40.51 41.79 38.77 41.17 44.81 44.82 38.82
LRC Naseem et al. (2010) 46.27 75.22 75.27 76.06 83.06 85.01 14.69
CRC Zhang et al. (2011) 48.29 68.24 74.13 77.81 79.85 85.09 5.45
SCRC Zeng et al. (2017b) 46.09 57.03 65.94 69.11 67.62 71.82 41.59
NRC Xu et al. (2019) 51.08 65.93 71.04 74.12 77.12 81.63 113.94
ProCRC Cai et al. (2016) 52.02 69.27 73.69 77.14 80.94 85.63 0.36
CCRC Yuan et al. (2018) 54.99 72.62 74.95 77.33 83.06 85.56 2.54
CCRC-$\ell_1$ 51.89 71.14 75.05 78.66 85.25 89.80 5.86
SCCRC 52.97 79.25 82.00 83.56 83.81 87.21 40.41
Table 6: Classification accuracy (%) and the testing time of competing algorithms on the Extended Yale B database.

4.5 Experiments on the AR database

The AR database includes over 4000 face images of 126 people (70 men and 56 women), which vary in expression, illumination and disguise (wearing sunglasses or scarves). Each subject has 26 images: 14 clean images, 6 images with sunglasses and 6 images with scarves. As in Jiang et al. (2013); Yuan et al. (2018), we use a subset that contains 1400 clean faces of 50 male and 50 female subjects; all images are resized to 28×20, and some example images are shown in Fig. 11. For each subject, we use the first 1-6 face images as training samples, and the remaining as testing samples. Experimental results are shown in Table 7. Once again, SCCRC outperforms the other competing approaches in terms of classification accuracy, except in the case of 2 training samples per person.

Figure 11: Example images from the AR database.
Methods 1 2 3 4 5 6 testing time (s)
SRC Wright et al. (2009) 59.00 56.33 53.91 52.90 55.67 61.12 4.19
LRC Naseem et al. (2010) 58.08 56.92 55.64 56.50 60.00 75.50 5.06
CRC Zhang et al. (2011) 58.15 60.42 61.09 65.70 76.89 91.50 2.91
SCRC Zeng et al. (2017b) 62.08 64.17 65.00 66.70 75.78 88.62 4.48
NRC Xu et al. (2019) 65.31 68.33 68.64 72.60 84.00 93.00 17.41
ProCRC Cai et al. (2016) 68.46 73.00 75.72 75.70 84.00 91.62 0.28
CCRC Yuan et al. (2018) 67.62 71.92 72.82 73.20 83.00 91.25 1.17
CCRC-$\ell_1$ 63.77 68.75 70.36 73.30 85.11 93.13 1.49
SCCRC 70.00 72.75 77.27 78.90 86.11 93.63 4.54
Table 7: Classification accuracy (%) and the testing time of evaluating methods on the AR database.

4.6 Experiments on the corrupted face images

To explore the robustness of our proposed method to noise, we use corrupted face images as test data. The AR database is used for evaluation; as in Section 4.5, 1400 images of 100 subjects are selected, and the image size is 28×20. The first seven images of each subject are used as training samples and the remaining as test samples. We add zero-mean Gaussian white noise with variance 0.01 to all the test images; some corrupted test face images are shown in Fig. 12. Experimental results of all competing approaches are shown in Table 8. It can be seen that by introducing the $\ell_1$-norm constraint on the coefficient vector, CCRC-$\ell_1$ outperforms CCRC by 10.14%. Our proposed SCCRC achieves the highest accuracy, and it is around 2.4 times faster than NRC.
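
The corruption step can be sketched as follows; we assume intensities scaled to [0, 1] and mirror the behavior of MATLAB's imnoise(I, 'gaussian', 0, 0.01), although the paper does not state its exact implementation:

```python
import numpy as np

def corrupt(images, variance=0.01, seed=0):
    """Add zero-mean Gaussian white noise with the stated variance to test images.
    Assumes intensities in [0, 1]; the clipping is part of that assumption."""
    rng = np.random.default_rng(seed)
    noisy = images + rng.normal(0.0, np.sqrt(variance), size=images.shape)
    return np.clip(noisy, 0.0, 1.0)
```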

Figure 12: Some corrupted test images from the AR database.
Methods Accuracy (%) testing time (s)
SRC Wright et al. (2009) 84.57 10.82
LRC Naseem et al. (2010) 74.57 4.75
CRC Zhang et al. (2011) 82.14 2.80
SCRC Zeng et al. (2017b) 84.00 4.84
NRC Xu et al. (2019) 86.14 26.85
ProCRC Cai et al. (2016) 83.14 0.29
CCRC Yuan et al. (2018) 75.00 1.25
CCRC-$\ell_1$ 85.14 1.65
SCCRC 87.29 11.27
Table 8: Classification accuracy and the testing time of evaluating methods on the corrupted face images.

4.7 Experiment analysis

The classification results on five databases validate the effectiveness and robustness of our proposed SCCRC. Based on the experimental results on these databases, the following observations can be made:

(1) By enhancing the sparsity of representation of CRC, SCRC outperforms SRC and CRC in most cases, which reveals that sparseness of collaborative representation explicitly contributes to accurate classification of test samples.

(2) Thanks to the non-negative constraint on the coefficient vector, NRC achieves higher classification accuracy than SRC and CRC, which demonstrates the discriminative capability of the non-negative regularization term.

(3) By introducing the competitive representation term, CCRC is superior to CRC in terms of classification accuracy. This regularization term promotes competitive representation between distinct classes, which encourages the coefficient vector to be sparse to some extent.

(4) On the clean test images, the improvement of CCRC-$\ell_1$ over CCRC is not that significant. However, on the corrupted test images, CCRC-$\ell_1$ outperforms CCRC by 10.14%, which again verifies that sparsity of the coefficient vector is necessary to improve the classification performance.

(5) On both the clean and corrupted test images, our proposed SCCRC performs the best. Note, however, that SCCRC assumes clean training images. Like conventional RBCM, our proposed SCCRC is a general classification framework and can be applied to other pattern classification tasks. For corrupted training images, as in Iliadis et al. (2017), we can first employ low-rank matrix recovery techniques (e.g., robust PCA Candès et al. (2011) and its variants) to obtain clean training images, and then SCCRC can be used for classification.

4.8 Parameter sensitiveness analysis

In our proposed SCCRC, there are three parameters to be determined: one parameter $\lambda$ for SRC and two parameters $\lambda$ and $\gamma$ for CCRC. We set $\lambda = 0.001$ for SRC, as in SCRC. To examine how the remaining parameters $\lambda$ and $\gamma$ influence the performance of SCCRC, we conduct experiments on the AR database. The experimental setting is the same as in Section 4.5, and the number of training samples per subject is 6. Fig. 13 illustrates the effect of parameter selection. We can see that SCCRC achieves superb results when both $\lambda$ and $\gamma$ are in a proper range. More specifically, the performance is better when $\gamma$ is in the range [0.0001, 0.001], which validates the necessity of the competitive term in CCRC. Similarly, SCCRC achieves better results when $\lambda$ is in the range [0.0001, 0.001]. To better illustrate the influence of $\lambda$ and $\gamma$, we plot the recognition accuracy against the variation of one parameter when the other is fixed. Fig. 14 shows the influence of $\lambda$ when $\gamma = 0.001$; we can see that the performance of SCCRC is desirable when $\lambda$ is assigned a small value. Fig. 15 presents the effect of $\gamma$ when $\lambda = 0.001$; one can see that SCCRC performs stably when $\gamma$ is in the range [0.0001, 0.01]. Based on the above experimental results, we set $\lambda = 0.001$ and $\gamma = 0.001$ on the AR database.
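
A hypothetical grid-search sketch over the two CCRC parameters (the grid values are illustrative, chosen to bracket the ranges reported above; sccrc_classify is the sketch from Section 3.1):

```python
import numpy as np
from itertools import product

def tune_ccrc_params(A, val_samples, val_labels, class_index,
                     grid=(1e-4, 1e-3, 1e-2, 1e-1, 1.0)):
    """Score SCCRC accuracy on held-out samples for each (lam, gamma) pair,
    keeping the SRC parameter fixed at 0.001 as in the text."""
    best_params, best_acc = None, -1.0
    for lam, gamma in product(grid, grid):
        preds = [sccrc_classify(A, y, class_index, 1e-3, lam, gamma)
                 for y in val_samples]
        acc = float(np.mean([p == t for p, t in zip(preds, val_labels)]))
        if acc > best_acc:
            best_params, best_acc = (lam, gamma), acc
    return best_params, best_acc
```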

Figure 13: Recognition accuracy of SCCRC versus parameters $\lambda$ and $\gamma$ on the AR database.
Figure 14: Recognition accuracy versus the variation of parameter $\lambda$ when $\gamma = 0.001$.
Figure 15: Recognition accuracy versus the variation of parameter $\gamma$ when $\lambda = 0.001$.

5 Conclusions

Sparse representation based classification (SRC) and collaborative representation based classification (CRC) have been widely studied due to their promising classification results. Although CRC reveals that it is the collaborative representation mechanism rather than the $\ell_1$-norm sparsity that makes SRC powerful, sparsity should not be completely ignored. Thus, in this paper, we proposed a new technique that enhances the sparsity of CCRC by multiplying the coefficients of SRC and CCRC. Experiments on five publicly available databases validate the effectiveness of our proposed SCCRC and demonstrate that SCCRC outperforms the other representation based classification approaches.

In this paper, we did not explicitly consider the situation in which both the training and test samples are contaminated due to occlusion or corruption; in future work, we will extend SCCRC to tackle these scenarios.

Acknowledgements.
This work was supported in part by the National Natural Science Foundation of China (Projects Numbers: 61673194, 61672263, 61672265 and 61876072), and in part by the national first-class discipline program of Light Industry Technology and Engineering (Project Number: LITE2018-25).

References

  • M. Aharon, M. Elad, A. Bruckstein, et al. (2006) K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on Signal Processing 54 (11), pp. 4311–4322. Cited by: §1.
  • N. Akhtar, F. Shafait, and A. Mian (2017) Efficient classification with sparsity augmented collaborative representation. Pattern Recognition 65, pp. 136–145. Cited by: Table 1, §1, §3.1.
  • S. Cai, L. Zhang, W. Zuo, and X. Feng (2016) A probabilistic collaborative representation based approach for pattern classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2950–2959. Cited by: §1, Table 2, Table 3, Table 4, Table 5, Table 6, Table 7, Table 8, §4.
  • E. J. Candès, X. Li, Y. Ma, and J. Wright (2011) Robust principal component analysis?. Journal of the ACM (JACM) 58 (3), pp. 11. Cited by: §4.7.
  • P. L. Combettes and V. R. Wajs (2005) Signal recovery by proximal forward-backward splitting. Multiscale Modeling & Simulation 4 (4), pp. 1168–1200. Cited by: §3.2.
  • W. Deng, J. Hu, and J. Guo (2013) In defense of sparsity based face recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 399–406. Cited by: §3.1.
  • W. Deng, J. Hu, and J. Guo (2018) Face recognition via collaborative representation: its discriminant nature and superposed representation. IEEE transactions on pattern analysis and machine intelligence 40 (10), pp. 2513–2521. Cited by: §3.1.
  • A. S. Georghiades, P. N. Belhumeur, and D. J. Kriegman (2001) From few to many: illumination cone models for face recognition under variable lighting and pose. IEEE Transactions on Pattern Analysis & Machine Intelligence (6), pp. 643–660. Cited by: §4.4, §4.
  • N. Goel, G. Bebis, and A. Nefian (2005) Face recognition experiments with random projection. In Biometric Technology for Human Identification II, Vol. 5779, pp. 426–438. Cited by: §4.
  • J. Gou, L. Wang, Z. Yi, J. Lv, Q. Mao, and Y. Yuan (2018) A new discriminative collaborative neighbor representation method for robust face recognition. IEEE Access 6, pp. 74713–74727. Cited by: §1.
  • M. Iliadis, H. Wang, R. Molina, and A. K. Katsaggelos (2017) Robust and low-rank representation for fast face identification with occlusions. IEEE Transactions on Image Processing 26 (5), pp. 2203–2218. Cited by: §4.7.
  • Z. Jiang, Z. Lin, and L. S. Davis (2013) Label consistent k-svd: learning a discriminative dictionary for recognition. IEEE transactions on pattern analysis and machine intelligence 35 (11), pp. 2651–2664. Cited by: §1, §4.5.
  • I. Kviatkovsky, M. Gabel, E. Rivlin, and I. Shimshoni (2017) On the equivalence of the lc-ksvd and the d-ksvd algorithms. IEEE transactions on pattern analysis and machine intelligence 39 (2), pp. 411–416. Cited by: §1.
  • R. Lan, Y. Zhou, Z. Liu, and X. Luo (2018) Prior knowledge-based probabilistic collaborative representation for visual recognition. IEEE transactions on cybernetics. Cited by: §1.
  • C. Li, J. Guo, and H. Zhang (2010) Local sparse representation based classification. In 2010 20th International Conference on Pattern Recognition, pp. 649–652. Cited by: §1.
  • S. Li and Y. Fu (2015) Learning robust and discriminative subspace with low-rank constraints. IEEE Transactions on Neural Networks and Learning Systems 27 (11), pp. 2160–2173. Cited by: §4.3.
  • W. Li, Q. Du, F. Zhang, and W. Hu (2016) Hyperspectral image classification by fusing collaborative and sparse representations. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 9 (9), pp. 4178–4187. Cited by: Table 1, §1.
  • A. Martínez and R. Benavente (1998) The ar face database. Computer Vision Center, Technical Report 24. Cited by: §4.
  • I. Naseem, R. Togneri, and M. Bennamoun (2010) Linear regression for face recognition. IEEE transactions on pattern analysis and machine intelligence 32 (11), pp. 2106–2112. Cited by: §1, Table 2, Table 3, Table 4, Table 5, Table 6, Table 7, Table 8, §4.
  • E. G. Ortiz and B. C. Becker (2014) Face recognition for web-scale datasets. Computer Vision and Image Understanding 118, pp. 153–170. Cited by: §1.
  • P. J. Phillips, H. Moon, P. Rauss, and S. A. Rizvi (1997) The feret evaluation methodology for face-recognition algorithms. In Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 137–143. Cited by: §4.
  • R. Prates and W. R. Schwartz (2019) Kernel cross-view collaborative representation based classification for person re-identification. Journal of Visual Communication and Image Representation 58, pp. 304–315. Cited by: §1.
  • X. Qu, W. Wang, K. Lu, and J. Zhou (2018) In-air handwritten chinese character recognition with locality-sensitive sparse representation toward optimized prototype classifier. Pattern Recognition 78, pp. 267–276. Cited by: §1.
  • F. S. Samaria and A. C. Harter (1994) Parameterisation of a stochastic model for human face identification. In Proceedings of 1994 IEEE Workshop on Applications of Computer Vision, pp. 138–142. Cited by: §4.
  • Y. Song, Y. Liu, Q. Gao, X. Gao, F. Nie, and R. Cui (2018) Euler label consistent k-svd for image classification and action recognition. Neurocomputing 310, pp. 277–286. Cited by: §1.
  • J. Waqas, Z. Yi, and L. Zhang (2013) Collaborative neighbor representation based classification using l2-minimization approach. Pattern Recognition Letters 34 (2), pp. 201–208. Cited by: §1.
  • J. Wen, X. Fang, Y. Xu, C. Tian, and L. Fei (2018) Low-rank representation with adaptive graph regularization. Neural Networks 108, pp. 83–96. Cited by: §4.3.
  • J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma (2009) Robust face recognition via sparse representation. IEEE transactions on pattern analysis and machine intelligence 31 (2), pp. 210–227. Cited by: Table 1, §1, §3.3, Table 2, Table 3, Table 4, Table 5, Table 6, Table 7, Table 8, §4.
  • J. Xu, W. An, L. Zhang, and D. Zhang (2019) Sparse, collaborative, or nonnegative representation: which helps pattern classification?. Pattern Recognition 88, pp. 679–688. Cited by: Table 1, Table 2, Table 3, Table 4, Table 5, Table 6, Table 7, Table 8, §4.
  • Y. Xu, Z. Zhong, J. Yang, J. You, and D. Zhang (2017) A new discriminative sparse representation method for robust face recognition via l2 regularization. IEEE Transactions on Neural Networks and Learning Systems 28 (10), pp. 2233–2242. Cited by: §1.
  • A. Y. Yang, S. S. Sastry, A. Ganesh, and Y. Ma (2010) Fast l1-minimization algorithms and an application in robust face recognition: a review. In 2010 IEEE International Conference on Image Processing, pp. 1849–1852. Cited by: §4.
  • J. Yang and J. Qian (2018) Hyperspectral image classification via multiscale joint collaborative representation with locally adaptive dictionary. IEEE Geoscience and Remote Sensing Letters 15 (1), pp. 112–116. Cited by: §1.
  • H. Yuan, X. Li, F. Xu, Y. Wang, L. L. Lai, and Y. Y. Tang (2018) A collaborative-competitive representation based classifier model. Neurocomputing 275, pp. 627–635. Cited by: Table 1, §1, §2.4, §4.5, Table 2, Table 3, Table 4, Table 5, Table 6, Table 7, Table 8, §4.
  • S. Zeng, J. Gou, and L. Deng (2017a) An antinoise sparse representation method for robust face recognition via joint l1 and l2 regularization. Expert Systems with Applications 82, pp. 1–9. Cited by: §1.
  • S. Zeng, X. Yang, and J. Gou (2017b) Multiplication fusion of sparse and collaborative representation for robust face recognition. Multimedia Tools and Applications 76 (20), pp. 20889–20907. Cited by: Table 1, §1, §4.3, Table 2, Table 3, Table 4, Table 5, Table 6, Table 7, Table 8, §4.
  • L. Zhang, M. Yang, and X. Feng (2011) Sparse representation or collaborative representation: which helps face recognition?. In 2011 International conference on computer vision, pp. 471–478. Cited by: Table 1, §1, §2.2, Table 2, Table 3, Table 4, Table 5, Table 6, Table 7, Table 8, §4.
  • N. Zhang and J. Yang (2010) K nearest neighbor based local sparse representation classifier. In 2010 Chinese conference on pattern recognition (CCPR), pp. 1–5. Cited by: §1.
  • Q. Zhang and B. Li (2010) Discriminative k-svd for dictionary learning in face recognition. In 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 2691–2698. Cited by: §1.
  • C. Zheng and N. Wang (2019) Collaborative representation with k-nearest classes for classification. Pattern Recognition Letters 117, pp. 30–36. Cited by: §1.