1 Introduction
With the advent of the age of big data, we are confronted with many different kinds of data every day. Unsupervised learning addresses problems in big data where the samples carry no category labels. Clustering is a classic and important unsupervised learning task that has been extensively applied in data mining, image segmentation, computer vision, pattern recognition, finance and other fields [1, 2, 3, 4]. Classic clustering algorithms include k-means [5], spectral clustering [6, 7], density clustering [8], fuzzy clustering [9], etc. Spectral clustering has several advantages: it is efficient, it can handle data of any shape, it is not sensitive to abnormal data, and it can be applied to high-dimensional problems. However, spectral clustering requires an affinity matrix as input, and this matrix has a great influence on the clustering results.
In 2010, Liu et al. proposed the Low-Rank Representation (LRR) problem [10]. The affinity matrix is obtained by solving the LRR problem and is then clustered with a spectral clustering method (such as Normalized Cuts [6]). They assume that the data samples are drawn from a union of multiple subspaces; the purpose of the algorithm is to remove noise and assign each sample to the subspace to which it belongs. They proved that LRR exactly recovers each true subspace for clean data, and that for noisy data LRR approximately recovers the subspaces of the original data with theoretical guarantees. In [10], when the number of classes is specified, spectral clustering with the affinity matrix obtained by LRR is more accurate and more robust than traditional methods.
When solving the LRR problem, the traditional approach minimizes the nuclear norm as a surrogate for the rank in the objective function. This is a convex approximation that guarantees the convergence of the resulting algorithm. However, a singular value decomposition (SVD) must be computed during the solution process; SVD is time-consuming, with computational complexity O(n^3) for an n × n affinity matrix. Classic SVD-based algorithms for LRR include APG [11], ADM [12], LADM and LADMAP [13]. Of these, APG solves only an approximation of the LRR problem, and its clustering results are poor. LADMAP, which combines LADM with adaptive adjustment of the penalty parameter, performs best among these algorithms, but it is still slow, especially for high-dimensional data. Along this line, Lin et al. proposed the accelerated LADMAP, which uses the skinny SVD to reduce the complexity to O(rn^2), where r is the rank of the affinity matrix. However, its rate of convergence is sublinear, so it requires more iterations, and the rank depends on the selection of hyperparameters. Lu et al. introduced a smoothed objective function with regularization terms and solved it with the Iteratively Reweighted Least Squares (IRLS) method [14]. The new method does not need SVD, but its matrix multiplications still cost O(n^3). Their numerical experiments show that the convergence is linear, so it is faster than LADMAP in some cases.
To avoid the SVD, Chen et al. proposed a matrix factorization LRR model and the hidden matrix factors augmented Lagrangian method (HMFALM) [15]. They factorize the affinity matrix as Z = UV and then use the Augmented Lagrangian Method (ALM) to solve the model. To choose the rank, they first fix a proper interval d, then run the algorithm on the ranks 1, d+1, 2d+1, …, kd+1, … and stop when the results begin to worsen; the candidate ranks are then searched one by one to find the optimum. Although the problem becomes nonconvex, the algorithm does not require SVD; only multiplications by the factor matrices are required. Its complexity is , where m is the dimension of the data. The numerical results show that HMFALM needs far fewer iterations than IRLS and converges faster. However, HMFALM needs an outer loop to find the rank r, while the inner loop iterates until the stopping criterion is met, and the rank it finds depends heavily on the given hyperparameter.
We introduce a group norm regularization to design a matrix factorization model that finds the rank adaptively when solving LRR. We start from a factorization whose rank K is a large number. The group norm regularization drives some columns of the factor matrix to zero, so that the rank of the affinity matrix is reduced automatically and is thus adjusted adaptively. Although the rank decreases from a large number, the numerical results show that zero columns appear very quickly (and can be deleted to speed up the computation): the rank drops to a low value within a few steps, and the iteration converges in about ten steps. The model is solved by an ALM similar to that of [15], and the algorithm complexity is . Numerical results on noisy synthetic data and real data (Hopkins155 and EYaleB) show that our model is faster, more accurate, and more robust to noise than the above algorithms.
The rest of the paper is organized as follows. Section 2 introduces the LRR problem, its convex approximation model [10] and the matrix factorization model [15]. Section 3 introduces our model, gives an ALM for it, proposes acceleration techniques for the ALM and describes how to use the solution of LRR for spectral clustering. Numerical experimental results are reported in Section 4. Finally, Section 5 concludes the paper.
2 LRR problem and two types of models
First, let us recall the following LRR problem:
$$\min_{Z} \ \operatorname{rank}(Z) \quad \text{s.t.} \quad X = AZ, \tag{2.1}$$
where X is the data matrix, m is the dimension of each data vector, and n is the number of data vectors. We call the optimal solution of the above problem the "lowest-rank representation" of the data X with respect to a dictionary A. This is an NP-hard problem, because the rank is an ℓ0-type norm, and the solution is not unique. As in the classic approach to low-rank problems, Liu et al. [10] use the nuclear norm to approximate the rank and obtain the following convex optimization problem:
$$\min_{Z} \ \|Z\|_{*} \quad \text{s.t.} \quad X = AZ. \tag{2.2}$$
Liu et al. proved in [16] that, under some conditions, the solution of (2.2) is unique and is also a solution of (2.1); this solution can be transformed into an affinity matrix of the data X, which can be used for spectral clustering. The uniqueness of the solution of (2.2) is given by Wei and Lin [17]:
Theorem 2.1
Suppose the skinny SVD of is , then the minimizer to problem (2.2) is
uniquely defined by
(2.3) 
This formula naturally implies that the minimizer exactly recovers the affinity matrix of Costeira and Kanade [18].
For the fact that the solution of (2.2) is also a solution of (2.1), see Corollary 4.1 in [16]. To make the model robust to noise, Liu et al. [16] proposed the following noisy LRR nuclear norm model:
(2.4) 
Where .
Several algorithms have been designed to solve (2.4), but they need to compute an SVD and are therefore slow. For this reason, Chen et al. [15] wrote Z as a low-rank factorization Z = UV and proposed the following matrix factorization model:
(2.5) 
where . We can write , then the problem is expressed as follows
(2.6) 
However, the rank of this model must be specified separately. Chen et al. [15] gave a method to find the optimal rank:
1. Choose the interval d and the hyperparameter .
2. Solve problem (2.6) for the ranks 1, d+1, 2d+1, … and stop when the results begin to worsen. Then search the candidate ranks one by one to find the optimum rank.
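The two-stage search in the list above can be sketched as follows. This is only an illustration in Python, not the implementation of [15]: the function name `search_rank`, the callable `score` (a stand-in for solving (2.6) at a given rank and measuring the quality of the result, lower being better), the cap `max_rank`, and the exact window used in the fine pass are all assumptions introduced here.

```python
from typing import Callable, Tuple


def search_rank(score: Callable[[int], float], d: int, max_rank: int) -> Tuple[int, float]:
    """Coarse-to-fine rank search; a lower score is assumed to be better."""
    # Coarse pass: ranks 1, d+1, 2d+1, ... until the score stops improving.
    best_r, best_s = 1, score(1)
    r = 1 + d
    while r <= max_rank:
        s = score(r)
        if s >= best_s:  # results begin to worsen: stop the coarse pass
            break
        best_r, best_s = r, s
        r += d
    # Fine pass (assumed window): try every rank strictly within d of the best coarse rank.
    lo = max(1, best_r - d + 1)
    hi = min(max_rank, best_r + d - 1)
    for r in range(lo, hi + 1):
        s = score(r)
        if s < best_s:
            best_r, best_s = r, s
    return best_r, best_s


# Toy score with its minimum at rank 7 (hypothetical, for illustration only).
print(search_rank(lambda r: (r - 7) ** 2, d=5, max_rank=30))
```

Every call to `score` corresponds to a full inner solve, which is why the number of additional iterations grows with the number of candidate ranks.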
Suppose the optimal rank has been found and (2.6) has been solved for it. Since the data space is full, the theorem in [10, 16] then yields the optimal representation by means of a pseudoinverse. The rank obtained in this way depends heavily on the hyperparameter, and many additional iterations must be carried out before the optimal rank is found. In order to reduce the number of iteration steps and to find the rank adaptively, we design a new model in Section 3, which adds a group norm regularization term to model (2.6).
3 Group norm regularized LRR factorization model and algorithm
The matrix factorization model is superior to the nuclear norm approximation in computation speed. However, with the factorization model it is difficult to estimate the rank of the recovered matrix, so we seek an adaptive method of estimating the rank for different types of data. The rank of a product of two factor matrices is bounded by the number of columns (equivalently, rows) of the factors, and it is reduced further if some of those columns are zero. We therefore start from an oversized factor matrix and drive some of its columns to zero by introducing the group norm regularization, so as to adjust the rank adaptively.
3.1 Group norm regularized LRR factorization model
Assume that X is a matrix of data samples, where m is the dimension of the data and n is the number of data vectors, and that some of the data contain noise. We hope to remove the noise and represent the clean data at a low rank to obtain an affinity matrix. We obtain the group norm regularized LRR factorization model (GNRLRRFM) by adding the group norm regularization term to (2.6):
(3.1) 
where K is a large number and the regularization term is the group norm of U. The true rank of Z is usually unknown, and K is an initial guess that is deliberately large (for example K = n). Owing to the group norm regularization, some columns of U become zero under a proper choice of the parameters. If some columns of U are driven to zero by the group norm, the rank of Z = UV is at most the number of remaining nonzero columns. We thus reach the goal of adjusting the rank of Z adaptively only by introducing the group norm regularization. is also very important because and play a role of balance and mutual restraint in GNRLRRFM.
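The link between zero columns and rank can be made concrete with a small sketch. The code below assumes that the group norm is the sum of the Euclidean norms of the columns of U (the usual column-wise group norm; the exact definition used in (3.1) is not reproduced here), and the helper names `column_norms`, `group_norm` and `nonzero_columns` are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]  # row-major dense matrix


def column_norms(U: Matrix) -> List[float]:
    """Euclidean norm of every column of U."""
    n_cols = len(U[0])
    return [math.sqrt(sum(row[j] ** 2 for row in U)) for j in range(n_cols)]


def group_norm(U: Matrix) -> float:
    """Assumed group norm: the sum of the column norms of U."""
    return sum(column_norms(U))


def nonzero_columns(U: Matrix, tol: float = 1e-12) -> List[int]:
    """Indices of columns whose norm exceeds tol."""
    return [j for j, c in enumerate(column_norms(U)) if c > tol]


U = [[3.0, 0.0, 1.0],
     [4.0, 0.0, 0.0]]
print(group_norm(U))       # column norms are 5, 0 and 1
print(nonzero_columns(U))  # the middle column is zero
```

Since only two columns of this U are nonzero, any product UV has rank at most 2, whatever V is.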
In summary, the GNRLRRFM model can adaptively estimate the rank under the constraint for different types of data, without the need to design an additional rank-updating strategy, and the regularization term makes the model more resistant to noise. Of course, we have introduced two extra hyperparameters, but numerical results show that our model is less sensitive to the choice of hyperparameters than the other models.
3.2 Augmented Lagrangian Method
In this section, we introduce the ALM for solving (3.1). The problem is biconvex, i.e., convex in U for fixed V and convex in V for fixed U. Sun and Fevotte [19], Shen et al. [20], Xu et al. [21] and Chen et al. [15] all used similar ALM schemes for such biconvex problems and obtained relatively good numerical results. The augmented Lagrangian function of (3.1) is as follows:
(3.2) 
where is a penalty parameter, is the lagrange multiplier corresponding to the constraint , is the usual inner product.
It is well known that, starting from , the classic augmented Lagrangian method solves
(3.3) 
at the th iteration and then updates . Similar to classical ALM, we can update and at the th iteration separately:
(3.4a)  
(3.4b) 
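For readers less familiar with the method, the classical ALM iteration (minimize the augmented Lagrangian, then update the multiplier) can be illustrated on a toy problem unrelated to LRR: minimize x^2 + y^2 subject to x + y = 1. The sketch below is an illustration under stated assumptions, not Algorithm 1; the names `alm_toy`, `beta` and `lam` are hypothetical, the penalty is kept fixed, and the sign convention of the multiplier update may differ from the one used in (3.4b).

```python
from typing import Tuple


def alm_toy(beta: float = 1.0, iters: int = 60) -> Tuple[float, float, float]:
    """Classical ALM on: minimize x^2 + y^2 subject to x + y = 1."""
    lam = 0.0  # Lagrange multiplier for the constraint x + y - 1 = 0
    x = y = 0.0
    for _ in range(iters):
        # Primal step: minimize x^2 + y^2 + lam*(x+y-1) + (beta/2)*(x+y-1)^2.
        # Both partial derivatives vanish at x = y = (beta - lam) / (2*(1 + beta)).
        x = y = (beta - lam) / (2.0 * (1.0 + beta))
        # Dual step: move the multiplier along the constraint residual.
        lam += beta * (x + y - 1.0)
    return x, y, lam


x, y, lam = alm_toy()
print(x, y, lam)  # approaches x = y = 0.5 and lam = -1
```

In this toy problem the primal step has a closed form. In (3.4a) it does not, which is why an approximate inner iteration is needed.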
It is difficult to solve (3.4a) directly because U and V are coupled, so we use an inner iteration technique to obtain an approximate solution:
(3.5) 
(3.6) 
where is the number of inner iteration steps. At this point, can be solved by the least squares method:
(3.7) 
Since is difficult to solve, inspired by [13] we linearize the quadratic term in (3.5) and add a proximal term:
(3.8) 
where is the same as proposed in [13].
The solution of (3.8) is given by soft-threshold shrinkage:
(3.9) 
where , means the th column of .
Owing to the soft-thresholding rule, some columns of U become zero, so we obtain a low-rank solution. Similarly, we can get the explicit expression of :
(3.10) 
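The column-wise soft-threshold shrinkage behind (3.9) can be sketched as follows. This is the standard proximal operator of a sum of column norms: each column is scaled by max(1 - tau/norm, 0), so columns whose norm is at most tau are set exactly to zero. The function name `group_soft_threshold` and the threshold `tau` are hypothetical; in (3.9) the threshold is determined by the model and algorithm parameters, which are not reproduced here.

```python
import math
from typing import List

Matrix = List[List[float]]  # row-major dense matrix


def group_soft_threshold(U: Matrix, tau: float) -> Matrix:
    """Shrink each column of U towards zero by tau in Euclidean norm."""
    n_cols = len(U[0])
    norms = [math.sqrt(sum(row[j] ** 2 for row in U)) for j in range(n_cols)]
    # Scale factor per column; columns with norm <= tau are zeroed out.
    scale = [max(1.0 - tau / c, 0.0) if c > 0.0 else 0.0 for c in norms]
    return [[row[j] * scale[j] for j in range(n_cols)] for row in U]


U = [[3.0, 0.1],
     [4.0, 0.0]]
# First column (norm 5) is scaled by 0.8; second column (norm 0.1) is zeroed.
print(group_soft_threshold(U, tau=1.0))
```

A larger threshold zeroes more columns per step, which is how the regularization weight controls how aggressively the rank is reduced.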
To prevent ALM from converging to an infeasible point, we adopt the strategy proposed by Lu and Zhang [22] to update in the third part of Algorithm 1. At this point, we have given explicit formulas for updating the variables of (3.3) at the th iteration. Based on these update formulas, we give Algorithm 1 to solve problem (3.1).
(3.11) 
(3.12) 
(3.13) 
(3.14) 
3.3 Convergence issue
For nonconvex matrix factorization problems such as (3.1), many books and articles (Boyd et al. [23], Sun and Fevotte [19], Shen et al. [20], Xu et al. [21], Chen et al. [15]) report strong convergence behavior numerically, with results that are faster and better than those of the SVD-based methods for the original convex problem. Nevertheless, proving the convergence of ALM for nonconvex problems remains very difficult at present. The last three articles only establish convergence to a KKT point under strong conditions that are difficult to verify theoretically, and this topic deserves further research.
Here we summarize the conditions and results of these three articles. The models studied by Shen et al. [20] and Xu et al. [21] have no regularization terms, in contrast to our model; they need to assume that the iterates are bounded and convergent, in which case the ADMM algorithm in those articles converges to a KKT point. The algorithm of Chen et al. is the same as ours, and our model falls within the general framework they proposed. Chen et al. [15] give a convergence analysis of the ALM algorithm for the general form:
(3.15)  
where is a lower semicontinuous function and is a continuously differentiable function. The Relaxed Constant Positive Linear Dependence (RCPLD) condition is required for a local minimizer of problem (3.15) to be a KKT point. RCPLD is introduced in [15] as follows:
Definition 3.1
For the above problem (3.15), let be the feasible region, , and is a basis for the space with .We say that RCPLD holds for the system at if such that the space has the same rank for each
For the detailed proof, we refer the reader to [15].
3.4 Accelerated ALM for GNRLRRFM
In this section, we propose two techniques to accelerate ALM for GNRLRRFM. The techniques aim to reduce the computational cost of each iteration and the number of iterations. In Section 4, we compare accelerated and unaccelerated ALM on synthetic data.
The computational cost comes mainly from the matrix multiplications at each iteration. In the present case, some columns of the matrix U are zero owing to the group norm regularization. This fact inspires the first technique: we delete the zero columns of U and the corresponding rows of V before we perform the matrix multiplication. In the numerical experiments, we found that , where is the number of nonzero columns of at the th iteration. Therefore, the first technique does not affect the convergence and speeds up the calculation.
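The first technique relies on the fact that a zero column of U contributes nothing to the product UV, so it can be dropped together with the matching row of V. A minimal Python sketch of this pruning step, with hypothetical helper names `matmul` and `prune_zero_columns`, is:

```python
from typing import List, Tuple

Matrix = List[List[float]]  # row-major dense matrix


def matmul(A: Matrix, B: Matrix) -> Matrix:
    """Plain dense matrix product A @ B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def prune_zero_columns(U: Matrix, V: Matrix, tol: float = 0.0) -> Tuple[Matrix, Matrix]:
    """Drop columns of U that are entirely zero, and the matching rows of V."""
    keep = [j for j in range(len(V)) if any(abs(row[j]) > tol for row in U)]
    U_small = [[row[j] for j in keep] for row in U]
    V_small = [V[j] for j in keep]
    return U_small, V_small


U = [[1.0, 0.0, 2.0],
     [3.0, 0.0, 4.0]]
V = [[1.0, 1.0],
     [5.0, 5.0],
     [0.0, 1.0]]
U_small, V_small = prune_zero_columns(U, V)
print(matmul(U, V) == matmul(U_small, V_small))  # the product is unchanged
```

After pruning, the inner dimension of the product equals the number of nonzero columns, so each multiplication becomes cheaper as the rank drops.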
The second technique is to perform only one inner iteration step for U and V, that is:
(3.16) 
(3.17) 
where the specific update steps can be seen in Algorithm 2. Although (3.4a) is then solved only approximately, with a single inner iteration step that updates (U, V) together, the numerical results show that Algorithm 2 converges in about ten steps. Applying the above acceleration techniques, we arrive at Algorithm 2 below.
3.5 Subspace Segmentation (Clustering)
As in Liu et al. [16], we design the following algorithm to perform subspace segmentation (clustering) based on the solution obtained by solving (3.1).
In the fifth step, each entry is squared to ensure that the elements of the similarity matrix are nonnegative. In summary, Algorithm 3 describes how to use the solution obtained by GNRLRRFM for clustering.
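The squaring step can be sketched in a few lines of Python. The input `M` below is a hypothetical symmetric matrix standing in for whatever the earlier steps of Algorithm 3 produce, and the name `squared_affinity` is an assumption; only the entrywise squaring itself is taken from the text.

```python
from typing import List

Matrix = List[List[float]]  # row-major dense matrix


def squared_affinity(M: Matrix) -> Matrix:
    """Square every entry so that all similarities are nonnegative."""
    return [[v * v for v in row] for row in M]


M = [[1.0, -0.5],
     [-0.5, 1.0]]
print(squared_affinity(M))  # negative entries become positive, symmetry is kept
```

Spectral clustering methods such as Normalized Cuts expect nonnegative edge weights, which is what this step guarantees.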
4 Numerical experiments
In this section, we test the efficiency of our algorithm and compare it with some other algorithms. We implemented our algorithm on a PC with a 3.2 GHz AMD Ryzen 7 2700 processor and 16 GB of memory. All computations are done in Matlab 2016b, with a few tasks written in C++. We compare our algorithm with three methods: LADMAP(A) [13], IRLS [14] and HMFALM [15]. The first method is based on model (2.4). It is faster than other SVD-based algorithms because it uses an adaptive penalty to accelerate convergence and uses the skinny SVD instead of the full SVD, reducing the complexity from O(n^3) to O(rn^2), where r is the predicted rank of Z. IRLS smooths the objective function by introducing regularization terms and then uses weighted least squares to solve for the variables alternately. Although no singular value decomposition is required, the complexity of its matrix products is still O(n^3). During the solution process, the Matlab command lyap is used to solve a Sylvester equation (sometimes the solution of the equation is not unique, and the program terminates), but on some problems it needs fewer iteration steps than LADMAP(A). HMFALM is based on the matrix factorization model (2.6); it does not need to compute an SVD and only performs matrix multiplications, so its complexity is , where m is the dimension of the data. Its outer loop starts from r = 1 and increases r by the step d. For each r, the inner loop iterates until the stopping condition is met; once the best rank interval has been found, the ranks in it are tried one by one to find the optimal r. HMFALM is faster than the first two algorithms, but it is very sensitive to the hyperparameter, and its noise resistance is poor because it has no regularization term.
Our model adds the group norm regularization term to the matrix factorization model (2.6) and exploits a property of this term: the factor matrix acquires zero columns, which adaptively reduces the rank. Although the rank starts from a large number K, it takes only a few iterations to go from a large rank to a small one. The numerical results show that our algorithm AALM converges in about ten iteration steps on (3.1). The stopping criterion in our numerical experiments is defined as follows:
(4.1) 
where is a moderately small number.
4.1 Experiments on synthetic data
We first compare ALM and AALM (before and after acceleration) on the synthetic data. For the inner iteration of ALM, we tried two stopping criteria: (1) the inner iteration stops after a fixed 5 steps; (2) the inner iteration stops when .
The noisy synthetic data are constructed in the same way as in [13], [10], [24], [15]. The specific construction procedure is as follows. First, we denote the number of subspaces by s, the number of basis vectors in each subspace by r, and the dimensionality of the data by d. For the first subspace, we construct the basis , which is a random orthogonal matrix with the dimension , while the bases of the other subspaces are obtained by , where is a random rotation matrix. This ensures that the subspaces are independent of each other and that the basis vectors in each subspace are linearly independent. Then, in the th subspace, we use the basis to generate samples : , where is independent and identically distributed and follows the standard normal distribution . Then we randomly select 20% of all the data to be contaminated; if the data vector is drawn, we add noise to it according to the following formula: We denote , , , , and generate synthetic data as described above.
In Figure 1, ALM and AALM are compared; the figure shows that the accelerated variant performs better than the unaccelerated one. The horizontal axis is a transformation of the running time. The vertical axis is the error . The purple line is ALM with the second inner stopping criterion: at each step the inner iteration runs until convergence. The green line uses a fixed five inner steps. The red line uses a fixed single inner step, and the blue line uses a single inner step and additionally deletes the zero columns of each . Comparing the blue line with the red line confirms our previous analysis: deleting the zero columns has no effect on the convergence result, while it saves memory and speeds up the computation. Figure 1 as a whole shows that the inner iteration does not need to converge; even a single inner step suffices, which greatly reduces the computation time.
In Tables 1 to 3, we use LADMAP(A), HMFALM and AALM separately to obtain the corresponding affinity matrix on the noisy synthetic data, and Algorithm 3 is then adopted to perform clustering. We want to compare the noise resistance and the sensitivity to hyperparameters of AALM and the other algorithms. For the intensity of the noise, we select . For the selection of hyperparameters, we select the for the three algorithms LADMAP(A), IRLS, HMFALM. With respect to our algorithm, , and are selected. For the other parameters of the IRLS and LADMAP(A) algorithms, we use the optimal parameters given in the corresponding articles, and we select for the HMFALM algorithm, with the searching gap and searching exactly. We select as the other parameters in our algorithm. Each algorithm is run three times on each synthetic data set, and the average is reported.
Table 1

| (s,p,d,r) | Method | Time(s) | Ite | Acc(%) | Time(s) | Ite | Acc(%) | Time(s) | Ite | Acc(%) |
|---|---|---|---|---|---|---|---|---|---|---|
| (10,20,200,5) | HMFALM | 0.0687 | 106 | 53.50 | 0.2040 | 206 | 100.00 | 0.3253 | 245 | 98.50 |
| | LADM | 0.4223 | 53 | 98.50 | 2.7070 | 258 | 100.00 | 13.986 | 1246 | 92.67 |
| | AALM | 0.0493 | 10 | 100.00 | 0.0407 | 9 | 100.00 | 0.0373 | 9 | 99.67 |
| (15,20,200,5) | HMFALM | 0.1073 | 73 | 44.33 | 0.4520 | 184 | 100.00 | 1.0693 | 259 | 87.67 |
| | LADM | 0.9667 | 56 | 99.44 | 6.9257 | 311 | 100.00 | 34.369 | 1359 | 87.78 |
| | AALM | 0.1147 | 10 | 100.00 | 0.0967 | 9 | 100.00 | 0.0900 | 9 | 99.89 |
| (20,25,500,5) | HMFALM | 0.6620 | 102 | 87.07 | 1.8550 | 188 | 100.00 | 4.8747 | 282 | 81.00 |
| | LADM | 3.4753 | 80 | 99.93 | 24.856 | 409 | 99.40 | 72.036 | 902 | 82.40 |
| | AALM | 0.3883 | 10 | 100.00 | 0.3413 | 9 | 100.00 | 0.3087 | 9 | 100.00 |
| (30,30,900,5) | HMFALM | 3.1897 | 126 | 99.52 | 9.9670 | 199 | 100.00 | 24.252 | 267 | 80.33 |
| | LADM | 16.577 | 83 | 100.00 | 115.86 | 472 | 88.44 | 902.24 | 2598 | 82.74 |
| | AALM | 1.8543 | 11 | 100.00 | 1.5953 | 10 | 100.00 | 1.3303 | 9 | 100.00 |
| (35,40,1400,5) | HMFALM | 12.482 | 145 | 100.00 | 50.680 | 236 | 86.90 | 55.900 | 241 | 80.62 |
| | LADM | 119.13 | 198 | 99.17 | 259.32 | 349 | 81.05 | 20961 | 17467 | 63.50 |
| | AALM | 6.4487 | 13 | 99.81 | 5.0647 | 11 | 99.98 | 4.2353 | 9 | 100.00 |
| (40,50,2000,5) | HMFALM | 36.148 | 146 | 100.00 | 124.23 | 231 | 80.55 | 120.70 | 230 | 80.60 |
| | LADM | 462.56 | 314 | 86.95 | 498.80 | 292 | 82.50 | 3987.7 | 1633 | 80.52 |
| | AALM | 17.832 | 16 | 99.55 | 15.046 | 12 | 99.97 | 11.490 | 9 | 100.00 |

Table 2

| (s,p,d,r) | Method | Time(s) | Ite | Acc(%) | Time(s) | Ite | Acc(%) | Time(s) | Ite | Acc(%) |
|---|---|---|---|---|---|---|---|---|---|---|
| (10,20,200,5) | HMFALM | 0.0410 | 57 | 22.00 | 0.1907 | 192 | 100.00 | 0.5063 | 290 | 81.67 |
| | LADM | 0.4153 | 55 | 97.17 | 2.4267 | 215 | 100.00 | 14.404 | 944 | 86.00 |
| | AALM | 0.0513 | 10 | 100.00 | 0.0397 | 9 | 100.00 | 0.0410 | 9 | 99.83 |
| (15,20,200,5) | HMFALM | 0.0860 | 56 | 20.33 | 0.4123 | 158 | 97.67 | 1.1663 | 264 | 80.22 |
| | LADM | 0.8593 | 58 | 98.56 | 5.4410 | 229 | 97.56 | 38.563 | 957 | 84.56 |
| | AALM | 0.1227 | 9 | 99.89 | 0.1130 | 9 | 100.00 | 0.1130 | 9 | 99.89 |
| (20,25,500,5) | HMFALM | 0.4450 | 74 | 33.73 | 3.0117 | 204 | 95.73 | 4.7363 | 271 | 81.07 |
| | LADM | 3.0857 | 69 | 99.80 | 32.630 | 509 | 89.00 | 178.75 | 1672 | 81.33 |
| | AALM | 0.4253 | 10 | 99.40 | 0.3997 | 9 | 99.93 | 0.3607 | 9 | 100.00 |
| (30,30,900,5) | HMFALM | 2.5263 | 108 | 90.30 | 23.785 | 255 | 80.81 | 23.144 | 255 | 80.56 |
| | LADM | 24.673 | 106 | 97.63 | 162.50 | 542 | 81.89 | 499.30 | 1187 | 80.15 |
| | AALM | 2.0857 | 13 | 98.52 | 1.7333 | 9 | 99.96 | 1.5700 | 9 | 100.00 |
| (35,40,1400,5) | HMFALM | 22.388 | 184 | 85.21 | 53.265 | 227 | 80.48 | 53.220 | 229 | 80.57 |
| | LADM | 237.34 | 316 | 89.95 | 3378.0 | 3267 | 81.38 | 1388.8 | 1206 | 80.76 |
| | AALM | 7.0320 | 16 | 99.31 | 5.8153 | 12 | 99.88 | 4.2393 | 8 | 100.00 |
| (40,50,2000,5) | HMFALM | 114.30 | 216 | 80.63 | 117.18 | 216 | 80.38 | 117.57 | 216 | 80.57 |
| | LADM | 707.86 | 419 | 80.92 | 8509.0 | 3627 | 80.72 | 2686.7 | 1053 | 80.63 |
| | AALM | 17.972 | 16 | 99.92 | 16.859 | 14 | 99.85 | 12.191 | 9 | 100.00 |

Table 3

| (s,p,d,r) | Method | Time(s) | Ite | Acc(%) | Time(s) | Ite | Acc(%) | Time(s) | Ite | Acc(%) |
|---|---|---|---|---|---|---|---|---|---|---|
| (10,20,200,5) | HMFALM | 0.0483 | 67 | 25.17 | 0.2960 | 224 | 66.67 | 0.4907 | 279 | 81.67 |
| | LADM | 0.7967 | 83 | 84.17 | 4.2673 | 383 | 88.50 | 36.212 | 1657 | 83.17 |
| | Ours | 0.0507 | 9 | 95.67 | 0.0517 | 9 | 97.67 | 0.0610 | 9 | 97.83 |
| (15,20,200,5) | HMFALM | 0.1247 | 79 | 36.00 | 0.8313 | 222 | 68.44 | 1.1647 | 253 | 80.89 |
| | LADM | 2.7097 | 138 | 67.89 | 12.076 | 496 | 90.89 | 146.66 | 3193 | 82.00 |
| | Ours | 0.1250 | 9 | 91.56 | 0.1230 | 9 | 95.56 | 0.1187 | 8 | 97.11 |
| (20,25,500,5) | HMFALM | 1.2063 | 142 | 69.20 | 4.5427 | 260 | 81.20 | 4.6800 | 261 | 79.87 |
| | LADM | 20.271 | 293 | 30.20 | 85.304 | 1017 | 84.67 | 295.25 | 2684 | 79.93 |
| | Ours | 0.4310 | 10 | 91.53 | 0.3973 | 9 | 94.73 | 0.3627 | 8 | 95.60 |
| (30,30,900,5) | HMFALM | 10.736 | 184 | 6.11 | 22.449 | 245 | 80.74 | 22.574 | 246 | 79.00 |
| | LADM | 117.06 | 438 | 52.19 | 636.09 | 1536 | 83.44 | 1220.9 | 2795 | 80.59 |
| | Ours | 2.1647 | 13 | 83.85 | 1.5993 | 9 | 91.96 | 1.4360 | 8 | 97.15 |
| (35,40,1400,5) | HMFALM | 51.562 | 220 | 80.52 | 51.515 | 221 | 80.62 | 50.354 | 218 | 80.60 |
| | LADM | 555.73 | 632 | 83.40 | 2605.8 | 2218 | 80.60 | 2830.0 | 2416 | 80.55 |
| | Ours | 7.3887 | 16 | 89.86 | 5.8400 | 12 | 83.90 | 3.9590 | 8 | 96.45 |
| (40,50,2000,5) | HMFALM | 114.17 | 208 | 79.98 | 113.02 | 207 | 80.45 | 111.18 | 207 | 80.38 |
| | LADM | 7750.0 | 3451 | 81.95 | 9314.1 | 3654 | 80.55 | 7959.7 | 3065 | 79.72 |
| | Ours | 18.338 | 16 | 95.75 | 15.982 | 14 | 85.65 | 10.763 | 8 | 92.70 |

From Table 1 to Table 3, we can observe that our AALM algorithm is generally faster and clusters more accurately than HMFALM and LADM on synthetic data, and that it needs only about ten iterations. Furthermore, compared with the other two algorithms, our clustering results are essentially unchanged as the hyperparameters vary. Thus, although our model (3.1) has one more hyperparameter than (2.6), it is not sensitive to the hyperparameters, whereas the clustering results of the other two models are greatly affected by them. In addition, when , our clustering accuracy is the best, and in some cases it is close to 20% higher than that of the other algorithms, so the GNRLRRFM model with the group norm regularization term has good noise immunity and is robust.
4.2 Experiments on real data
In this section, we test the clustering effectiveness of our algorithm on the Hopkins155 dataset [25] and the Extended Yale B dataset [26].
The Hopkins155 dataset contains 156 data sequences. Each data sequence contains from 39 to 550 data vectors (from two or three motion modes), and the dimension of each data vector is 72 (24 frames × 3). We specify the number of classes (two or three) of each data sequence and use HMFALM, LADM, IRLS and AALM respectively on these 156 sequences to compute the similarity matrix and cluster. In Table 4, we give the total accuracy, the average number of iteration steps and the average time on the data sequences with two motions, three motions and all motions. For HMFALM, LADM and IRLS, we select the (the optimal parameters tested by the authors in their articles); for the AALM algorithm, we select .
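
Table 4

| Method | Two motions: Time | Two motions: Iter. | Two motions: Acc(%) | Three motions: Time | Three motions: Iter. | Three motions: Acc(%) | All motions: Time | All motions: Iter. | All motions: Acc(%) |
|---|---|---|---|---|---|---|---|---|---|
| HMFALM | 0.0509 | 123.54 | 96.62 | 0.0734 | 137.03 | 95.04 | 0.0561 | 126.65 | 96.14 |
| LADM | 74.160 | 28724 | 96.39 | 119.07 | 37564 | 95.72 | 84.525 | 30764 | 96.19 |
| IRLS | 36.054 | 189.48 | 97.15 | 74.160 | 181.89 | 95.90 | 43.335 | 187.72 | 96.77 |
| AALM | 0.0190 | 13.667 | 97.75 | 0.0231 | 13.889 | 96.62 | 0.0199 | 13.718 | 97.41 |
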
As can be seen from Table 4, our algorithm is faster than the other three algorithms, with the fewest iteration steps and the highest clustering accuracy.
The Extended Yale B dataset contains 38 subjects (people), and each subject has 64 face images. Figure 2 shows thirty pictures of one person's face; the data contain lighting noise, so that some faces cannot be seen clearly or are even completely dark. For instance, the fourth picture cannot be recognized even by people. Similar to [14], we conduct two experiments by assembling the first 5 subjects and the first 10 subjects into a dataset X. First, we resize all the pictures to 32 × 32. Second, to reduce noise, we project the data onto a 30-dimensional subspace for the 5-subject clustering problem and onto a 60-dimensional subspace for the 10-subject problem by principal component analysis (PCA). Third, we apply HMFALM, LADM, IRLS and AALM to solve the low-rank representation problem and obtain different affinity matrices. Finally, we compare the spectral clustering results obtained by Algorithm 3 with the different affinity matrices:
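
Table 5

| Method | 10 subjects: Time | 10 subjects: Iter. | 10 subjects: Acc(%) | 5 subjects: Time | 5 subjects: Iter. | 5 subjects: Acc(%) |
|---|---|---|---|---|---|---|
| HMFALM | 0.4280 | 396 | 81.87 | 0.1060 | 336 | 88.44 |
| LADM | 98.7680 | 8430 | 81.56 | 15.8250 | 4324 | 88.44 |
| IRLS | 97.9000 | 107 | 81.87 | 16.3400 | 102 | 88.44 |
| Ours | 0.0720 | 16 | 81.87 | 0.0200 | 16 | 88.44 |
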
The clustering accuracy of the four algorithms is the same for the 5 subjects, but our algorithm AALM is the fastest. AALM, IRLS and HMFALM achieve the same accuracy for the 10 subjects, and our algorithm is again the fastest. In summary, our algorithm achieves the best accuracy with the fastest computing speed on both real problems, Hopkins155 motion clustering and Extended Yale B face clustering.
5 Conclusion
In this paper, we propose a group norm regularized factorization LRR model based on the low-rank representation factorization model, and design an accelerated ALM (AALM) algorithm to obtain an affinity matrix, which is then used to cluster the data by spectral clustering. For noisy synthetic data, our model and algorithm yield more accurate clustering results than both the traditional nuclear norm-based LRR model and the low-rank representation factorization model without regularization; compared with the selected classic algorithms, our model is also more robust and less sensitive to parameters. On the real data, Hopkins155 motion clustering and Extended Yale B face clustering, our algorithm achieves the best clustering accuracy at the fastest speed among the compared algorithms. In short, this paper proposes a group norm regularized factorization LRR model for computing similarity matrices. Compared with previous LRR models, numerical experiments illustrate that our model obtains the similarity matrix quickly and that the clustering results are good.
References
[1] Xindong Wu, Vipin Kumar, J Ross Quinlan, Joydeep Ghosh, Qiang Yang, Hiroshi Motoda, Geoffrey J McLachlan, Angus Ng, Bing Liu, S Yu Philip, et al. Top 10 algorithms in data mining. Knowledge and information systems, 14(1):1–37, 2008.
[2] Chihau Chen. Handbook of pattern recognition and computer vision. World Scientific, 2015.
[3] Newton Da Costa Jr, Jefferson Cunha, and Sergio Da Silva. Stock selection based on cluster analysis. Economics Bulletin, 13(1):1–9, 2005.
[4] Amit Saxena, Mukesh Prasad, Akshansh Gupta, Neha Bharill, Om Prakash Patel, Aruna Tiwari, Meng Joo Er, Weiping Ding, and Chin-Teng Lin. A review of clustering techniques and developments. Neurocomputing, 267:664–681, 2017.
[5] James MacQueen et al. Some methods for classification and analysis of multivariate observations. In Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, volume 1, pages 281–297. Oakland, CA, USA, 1967.
[6] Jianbo Shi and Jitendra Malik. Normalized cuts and image segmentation. Departmental Papers (CIS), page 107, 2000.
[7] Cuimei Guo, Sheng Zheng, Yaocheng Xie, and Wei Hao. A survey on spectral clustering. In World Automation Congress 2012, pages 53–56. IEEE, 2012.
[8] Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu, et al. A density-based algorithm for discovering clusters in large spatial databases with noise. In Kdd, volume 96, pages 226–231, 1996.
[9] James C Bezdek, Robert Ehrlich, and William Full. Fcm: The fuzzy c-means clustering algorithm. Computers & Geosciences, 10(2-3):191–203, 1984.
[10] Guangcan Liu, Zhouchen Lin, and Yong Yu. Robust subspace segmentation by low-rank representation. In ICML, volume 1, page 8, 2010.
[11] Kim-Chuan Toh and Sangwoon Yun. An accelerated proximal gradient algorithm for nuclear norm regularized linear least squares problems. Pacific Journal of optimization, 6(615-640):15, 2010.
[12] Zhouchen Lin, Minming Chen, and Yi Ma. The augmented lagrange multiplier method for exact recovery of corrupted low-rank matrices. arXiv preprint arXiv:1009.5055, 2010.
[13] Zhouchen Lin, Risheng Liu, and Zhixun Su. Linearized alternating direction method with adaptive penalty for low-rank representation. In Advances in neural information processing systems, pages 612–620, 2011.
[14] Canyi Lu, Zhouchen Lin, and Shuicheng Yan. Smoothed low rank and sparse matrix recovery by iteratively reweighted least squares minimization. IEEE Transactions on Image Processing, 24(2):646–654, 2014.
[15] Baiyu Chen, Zi Yang, and Zhouwang Yang. An algorithm for low-rank matrix factorization and its applications. Neurocomputing, 275:1012–1020, 2018.
[16] Guangcan Liu, Zhouchen Lin, Shuicheng Yan, Ju Sun, Yong Yu, and Yi Ma. Robust recovery of subspace structures by low-rank representation. IEEE transactions on pattern analysis and machine intelligence, 35(1):171–184, 2012.
[17] Siming Wei and Zhouchen Lin. Analysis and improvement of low rank representation for subspace segmentation. arXiv preprint arXiv:1107.1561, 2011.
[18] João Paulo Costeira and Takeo Kanade. A multibody factorization method for independently moving objects. International Journal of Computer Vision, 29(3):159–179, 1998.
[19] Dennis L Sun and Cedric Fevotte. Alternating direction method of multipliers for non-negative matrix factorization with the beta-divergence. In 2014 IEEE international conference on acoustics, speech and signal processing (ICASSP), pages 6201–6205. IEEE, 2014.
[20] Yuan Shen, Zaiwen Wen, and Yin Zhang. Augmented lagrangian alternating direction method for matrix separation based on low-rank factorization. Optimization Methods and Software, 29(2):239–263, 2014.
[21] Yangyang Xu, Wotao Yin, Zaiwen Wen, and Yin Zhang. An alternating direction algorithm for matrix completion with nonnegative factors. Frontiers of Mathematics in China, 7(2):365–384, 2012.
[22] Zhaosong Lu and Yong Zhang. An augmented lagrangian approach for sparse principal component analysis. Mathematical Programming, 135(1-2):149–193, 2012.
[23] Stephen Boyd, Neal Parikh, Eric Chu, Borja Peleato, Jonathan Eckstein, et al. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends® in Machine learning, 3(1):1–122, 2011.
[24] Shijie Xiao, Wen Li, Dong Xu, and Dacheng Tao. Falrr: A fast low rank representation solver. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4612–4620, 2015.
[25] Roberto Tron and René Vidal. A benchmark for the comparison of 3d motion segmentation algorithms. In 2007 IEEE conference on computer vision and pattern recognition, pages 1–8. IEEE, 2007.
[26] Athinodoros S Georghiades, Peter N Belhumeur, and David J Kriegman. From few to many: Illumination cone models for face recognition under variable lighting and pose. IEEE Transactions on Pattern Analysis & Machine Intelligence, (6):643–660, 2001.