Visual Processing by a Unified Schatten-p Norm and ℓ_q Norm Regularized Principal Component Pursuit

08/20/2016, by Jing Wang, et al.

In this paper, we propose a non-convex formulation to recover the authentic structure from corrupted real data. Typically, the specific structure is assumed to be low rank, which holds for a wide range of data, such as images and videos. Meanwhile, the corruption is assumed to be sparse. In the literature, such a problem is known as Robust Principal Component Analysis (RPCA), which usually recovers the low rank structure by approximating the rank function with a nuclear norm and penalizing the error by an ℓ_1-norm. Although RPCA is a convex formulation and can be solved effectively, the introduced norms are not tight approximations, which may cause the solution to deviate from the authentic one. Therefore, we consider here a non-convex relaxation, consisting of a Schatten-p norm and an ℓ_q-norm that promote low rank and sparsity respectively. We derive a proximal iteratively reweighted algorithm (PIRA) to solve the problem. Our algorithm is based on an alternating direction method of multipliers, where in each iteration we linearize the underlying objective function, which yields a closed form solution. We demonstrate that the solutions produced by the linearized approximation always converge and give a tighter approximation than the convex counterpart. Experiments on benchmarks show encouraging results for our approach.


1 Introduction

The popularity of webcams and mobile phone cameras has generated a large amount of visual data. However, visual data are easily corrupted by artifacts arising from imaging devices or natural factors such as illumination. The human visual system can recognize such corruption using accumulated information and knowledge, but for computer vision systems it results in irrelevant or noisy information. A number of methods have therefore been proposed to obtain authentic data for visual processing tasks. Image denoising aims at reducing the noise in the original image [Lee80][LO00][ZYZH13][CHH13][SYLL08]. Specifically, some approaches focus on statistical image modeling for the purpose of optimal signal representation and transmission, such as the Gaussian Scale Mixture (GSM) model, the variance-adaptive model, and Bayesian estimation [SCD02][SYLL08]. Portilla et al. presented a denoising method based on a local Gaussian scale mixture model in an overcomplete oriented pyramid representation [PSWS03]. The approaches mentioned above operate on the initial features of the visual data, and better features generally enhance the representation performance. For instance, Shao et al. generated domain-adaptive global feature descriptors to obtain better performance in image classification [SLL14]. Zhu et al. utilized weakly labeled data from other domains as the feature space for the visual categorization problem [ZS14]. Based on a comprehensive feature space, effective and promising denoising approaches have been proposed that exploit sparse and redundant representations over a trained dictionary [ER06]. Elad et al. proposed the K-SVD algorithm [AEB06], the first successful application of sparse modeling of image patches to image denoising. Yan et al. exploited the sparsity of representations in the wavelet domain to handle high-level noise [YSL13].

One reason for the success of these algorithms is the statistical properties of the noise: it is natural to assume the noise is sparse. Moreover, visual data such as images often have low rank structure [HZY12]. For example, for a facial image taken under certain illumination conditions, the low-rank component captures the face, and the sparse component captures the light on the face [WYG09]. Thus, the idea of casting the task as the recovery of a low rank matrix plus a sparse matrix has drawn considerable attention. In the following, we first describe the problem.

1.1 The Problem Description

Suppose X ∈ ℝ^(m×n) is an observed data matrix, where m denotes the ambient dimension of a sample and n is the number of samples. The problem can be formulated as:

$$\min_{L,\,S}\ \mathrm{rank}(L) + \lambda\,\|S\|_0 \quad \text{s.t.}\quad X = L + S, \tag{1.1}$$

where L has a low rank structure that is assumed to be the authentic structure of the observed data, and S is assumed to be a sparse representation of the noise. rank(L) is the rank of the matrix L, ‖S‖_0 is the ℓ_0-norm, which counts the number of non-zero entries in S, and λ > 0 is a parameter balancing the two components. The optimization problem (1.1) is called Robust Principal Component Analysis (RPCA); it aims to recover the low-rank component L and the sparse component S under the constraint X = L + S.

1.2 The Reformulation and Solutions

It is challenging to solve problem (1.1), because rank(L) and ‖S‖_0 are both discontinuous and non-convex; in fact, the problem is NP-hard. A common strategy [CLMW11] is to relax the rank function to the convex nuclear norm ‖L‖_* = Σ_i σ_i(L), where σ_i(L) denotes the i-th singular value of L, and to relax the ℓ_0-norm to the ℓ_1-norm ‖S‖_1 = Σ_{i,j} |S_{ij}|, where |S_{ij}| is the magnitude of the (i,j)-th entry of S. Problem (1.1) can then be reformulated as:

$$\min_{L,\,S}\ \|L\|_* + \lambda\,\|S\|_1 \quad \text{s.t.}\quad X = L + S. \tag{1.2}$$

Candès et al. theoretically proved that if L and S satisfy certain assumptions, they can be recovered exactly by solving a convex program called Principal Component Pursuit with λ = 1/√max(m, n) [CLMW11]. Unlike the formulation in (1.1), RPCA in (1.2) is convex, and the optimal solution is tractable. An efficient solver for (1.2) is the Alternating Direction Method (ADM) [LCM09], which is guaranteed to obtain the optimal solution. Another well-known first-order algorithm is the Accelerated Proximal Gradient (APG) method, which solves an unconstrained Stable Principal Component Pursuit (SPCP) problem [ZLW10] as follows:

$$\min_{L,\,S}\ \|L\|_* + \lambda\,\|S\|_1 + \frac{1}{2\mu}\,\|X - L - S\|_F^2, \tag{1.3}$$

where λ and μ are balancing parameters. APG is a fast method with a convergence rate of O(1/k²), where k is the number of iterations.
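For concreteness, the following is a minimal sketch, in Python/NumPy, of an ADM/inexact-ALM iteration for the convex PCP problem (1.2). This is not the solver of [LCM09]; the default choices for λ and the penalty μ are common heuristics and should be treated as assumptions.

    import numpy as np

    def pcp_adm(X, lam=None, mu=None, tol=1e-7, max_iter=500):
        # Sketch of an ADM solver for (1.2): min ||L||_* + lam*||S||_1
        # s.t. X = L + S, with dual variable Y and penalty mu.
        m, n = X.shape
        lam = 1.0 / np.sqrt(max(m, n)) if lam is None else lam
        mu = 0.25 * m * n / np.abs(X).sum() if mu is None else mu  # heuristic
        L, S, Y = np.zeros((m, n)), np.zeros((m, n)), np.zeros((m, n))
        for _ in range(max_iter):
            # L-step: singular value thresholding at level 1/mu
            U, sig, Vt = np.linalg.svd(X - S + Y / mu, full_matrices=False)
            L = (U * np.maximum(sig - 1.0 / mu, 0)) @ Vt
            # S-step: entrywise soft thresholding at level lam/mu
            T = X - L + Y / mu
            S = np.sign(T) * np.maximum(np.abs(T) - lam / mu, 0)
            R = X - L - S                     # primal residual
            Y = Y + mu * R                    # dual ascent
            if np.linalg.norm(R) <= tol * np.linalg.norm(X):
                break
        return L, S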

1.3 Related Works

As the RPCA model is capable of recovering low rank components from grossly corrupted data, and the theoretical conditions ensuring perfect recovery have been analyzed in depth, RPCA and its extensions have been applied to many applications, including background modeling [CLMW11], image alignment [PGW10] and subspace segmentation [LLY13a]. Specifically, Ji et al. presented a patch-based video denoising algorithm using low-rank matrix completion [JLSX10]. Wang et al. studied the problem of aligning correlated images by decomposing the matrix of corrupted images into the sum of a sparse matrix of errors and a low-rank matrix of recovered aligned images [WLLL13]. Hu et al. proposed a truncated nuclear norm regularization for estimating missing values in corrupted images [HZY12].

There are several works aimed at improving low-rank and sparse matrix recovery. Mu et al. [MDYY11] proposed an accelerated RPCA using random projection. Zhou and Tao [ZT11] developed a fast solver for low-rank and sparse matrix recovery with hard constraints on both the rank of L and the cardinality of S. To alleviate the challenges raised by coherent data, Liu et al. recently recovered coherent data by Low-Rank Representation (LRR) [LLY13b]. Aybat et al. developed a fast first-order algorithm to solve the SPCP problem [AGI11]. Fazel suggested reformulating the rank optimization problem as a Semi-Definite Program (SDP) [Faz02]. Accelerated proximal gradient techniques have also been applied to solve the nuclear norm regularized least squares problem [TY10][JY09].

However, existing algorithms may lead to solutions that deviate from those of the original problem. Most previous works use the convex nuclear norm as a surrogate of the rank function and the ℓ_1-norm as a surrogate of the ℓ_0-norm, and then solve the relaxed problem instead. But the nuclear norm is the sum of the singular values, while the rank function counts the non-zero singular values, each of which contributes equally; a similar gap exists between the ℓ_1-norm and the ℓ_0-norm [RFP10]. Hence, the solution of the relaxed problem may be far from that of the original one. Some researchers therefore consider non-convex surrogate functions.

The smoothed Schatten-p norm is a popular non-convex surrogate of the rank function, defined as [MF12][NWC12]:

$$f_p(L) = \mathrm{Tr}\big((L^\top L + \gamma I)^{p/2}\big), \tag{1.4}$$

where γ > 0 is a smoothing parameter, I is the identity matrix with the same size as L^⊤L, and f_p is differentiable for p ≥ 1 and non-convex for 0 < p < 1. Mohan and Fazel used the Schatten-p norm to replace the rank function and considered the problem [MF12]:

$$\min_L\ f_p(L) \quad \text{s.t.}\quad \mathcal{A}(L) = b, \tag{1.5}$$

where 𝒜 is a linear map and b denotes the measurements. They also proposed the Iterative Reweighted Least Squares (IRLS) algorithm for rank minimization. Under certain conditions, IRLS-1 converges to the global minimum of the smoothed nuclear norm, and IRLS-p (0 < p < 1) converges to a stationary point of the corresponding non-convex yet smooth approximation of the rank function. Nie et al. [NHD12] proposed the extended Schatten-p norm as an efficient surrogate of the rank function and considered:

$$\min_L\ \|L\|_{S_p}^p = \sum_{i=1}^{\min(m,n)} \sigma_i^p(L) \quad \text{s.t.}\quad \mathcal{A}(L) = b. \tag{1.6}$$

They derived an efficient algorithm to solve the above problem.
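As a small illustration, both Schatten-p penalties above can be evaluated from the singular values alone; the helper below is our own sketch, not code from [MF12] or [NHD12].

    import numpy as np

    def schatten_p(L, p, gamma=0.0):
        # sum_i (sigma_i^2 + gamma)^(p/2): with gamma > 0 this is the smoothed
        # penalty Tr((L^T L + gamma I)^(p/2)) of (1.4); gamma = 0 gives the
        # plain Schatten-p quasi-norm sum_i sigma_i^p of (1.6).
        sig = np.linalg.svd(L, compute_uv=False)
        return float(np.sum((sig ** 2 + gamma) ** (p / 2.0)))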

For the ℓ_0-norm, many non-convex surrogate functions have been proposed, e.g., the ℓ_q-norm with 0 < q < 1 [FL09] and the Smoothly Clipped Absolute Deviation (SCAD) penalty [FL01]. Nie et al. [NWC12] used the Alternating Direction Method (ADM) to solve a similar non-convex matrix completion problem. Candès et al. [CWB08] proposed an algorithm for the reweighted ℓ_1 minimization problem, which better approximates the ℓ_0-norm. Conditions for sparse vector recovery via ℓ_q-minimization are given in [FL09].

The major drawback of the above approaches is that previous iteratively reweighted algorithms can only approximate either the low-rank component or the sparse one with a non-convex surrogate [CZ][LXY13]. One important reason is that it is difficult to solve a problem whose objective function contains two or more non-smooth terms. Thus, in this paper, we simultaneously approximate the rank function and the ℓ_0-norm with non-convex surrogates.

1.4 Introducing Our Approach

In this paper, we propose a new formulation, the Schatten-p norm and ℓ_q-norm regularized Principal Component Pursuit (p,q-PCP) with 0 < p, q ≤ 1, for recovering the low-rank and sparse matrices. We also provide an algorithm to solve this non-convex problem with two non-smooth components. Empirically, our proposed Proximal Iteratively Reweighted Algorithm (PIRA) solves p,q-PCP effectively without loss of efficiency. In each iteration, PIRA yields closed-form solutions, which makes the algorithm efficient. To the best of our knowledge, this is the first time that the joint Schatten-p norm and ℓ_q-norm has been used to approximate the RPCA problem, and we are the first to provide the corresponding solutions. Experimental results demonstrate that the solutions tightly approximate those of the RPCA problem and that the objective function converges within a few iterations. The main contributions of this study are summarized as follows.

  • We propose a joint Schatten-p norm and ℓ_q-norm regularized Principal Component Pursuit (p,q-PCP) model for low-rank and sparse matrix recovery.

  • A new Proximal Iteratively Reweighted Algorithm (PIRA) is presented to solve the p,q-PCP problem. We demonstrate the effectiveness and efficiency of our algorithm.

  • We empirically show that our solutions approximate those of the original problem and that the objective function converges within a few iterations.

  • Extensive experiments on synthetic data and real world data show that our proposed algorithm outperforms state-of-the-art algorithms.

1.5 Overview of the Paper

The rest of the paper is organized as follows. In Section 2, we give detailed information about our proposed non-convex p,q-PCP model and an iteratively reweighted algorithm (PIRA) to solve p,q-PCP. We provide a detailed analysis of the optimization algorithm in Section 3. Experimental results are presented in Section 4. We conclude this paper in Section 5.

2 Non-convex Principal Component Pursuit

In this section, we first present the non-convex principal component pursuit model. We then propose a new iteratively reweighted algorithm to solve the non-convex principal component pursuit problem.

2.1 The p,q-PCP Model

The motivation for approximating the rank function with the Schatten-p norm is to obtain better empirical performance, in terms of recovering low-rank matrices, than the nuclear norm [MF12] when p < 1; Mohan and Fazel theoretically proved that Schatten-p norm minimization behaves like nuclear norm minimization when p = 1 [MF12]. The ℓ_q-norm with 0 < q < 1 is used as a surrogate of the ℓ_0-norm because it generalizes and improves on the ℓ_1-norm [FL09]; the ℓ_q-norm degenerates into the ℓ_1-norm when q = 1, and a similar property holds for the Schatten-p norm as a surrogate of the rank function. It is therefore of interest to consider the non-convex principal component pursuit that uses the Schatten-p norm and ℓ_q-norm jointly:

$$\min_{L,\,S}\ \|L\|_{S_p}^p + \lambda\,\|S\|_q^q \quad \text{s.t.}\quad X = L + S, \tag{2.1}$$

where X ∈ ℝ^(m×n) (we assume m ≥ n in this paper) is the observed data matrix, ‖L‖_{S_p}^p = Σ_{i=1}^n σ_i^p(L), where σ_i(L), i = 1, …, n, denotes the i-th singular value of L, and ‖S‖_q^q = Σ_{i,j} |S_{ij}|^q. More generally, we further consider the stable model as follows:

$$\min_{L,\,S}\ \|L\|_{S_p}^p + \lambda\,\|S\|_q^q + \frac{\mu}{2}\,\|X - L - S\|_F^2. \tag{2.2}$$

When p = q = 1, the above p,q-PCP model degenerates into the convex PCP as in (1.2) or (1.3).

It is expected that smaller values of p and q help p,q-PCP better approximate the RPCA problem (1.1). It is worth mentioning that many non-convex penalty functions could be used in a non-convex principal component pursuit model. We use the Schatten-p norm and the ℓ_q-norm in this study because, compared with other non-convex penalties, they are matrix (quasi-)norms sharing special properties with the nuclear norm and ℓ_1-norm. Low-rank and sparse matrix recovery conditions based on the Null Space Property (NSP) have been presented in previous works [FL09][MF12]; in fact, they extend those for the nuclear norm and the convex ℓ_1-norm. It is also easy to tune the parameters p and q within (0, 1]. Many previous works empirically showed that the ℓ_q-norm improves recovery performance compared with the convex ℓ_1-norm [CWB08], and a similar improvement was recently found for the Schatten-p norm compared with the convex nuclear norm [MF12]. It is expected that jointly combining them in one model will surpass the recovery performance of the convex PCP.
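To make the model concrete, a short sketch evaluating the stable p,q-PCP objective (2.2) as reconstructed above; pq_pcp_objective is our own hypothetical helper.

    import numpy as np

    def pq_pcp_objective(X, L, S, p, q, lam, mu):
        # ||L||_{S_p}^p + lam * ||S||_q^q + (mu/2) * ||X - L - S||_F^2, cf. (2.2)
        sig = np.linalg.svd(L, compute_uv=False)
        return (np.sum(sig ** p)
                + lam * np.sum(np.abs(S) ** q)
                + 0.5 * mu * np.linalg.norm(X - L - S, 'fro') ** 2)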

2.2 Proximal Iteratively Reweighted Algorithm

In this section, we demonstrate how to solve problem (2.2) with the Schatten-p norm and ℓ_q-norm regularizers. The ℓ_q-minimization is non-smooth, non-Lipschitz continuous, and NP-hard [CGWY11]. We use the strategy of shifting σ_i(L) to σ_i(L) + ε_1 and |S_{ij}| to |S_{ij}| + ε_2, with ε_1, ε_2 > 0, and solve the relaxed problem as follows:

$$\min_{L,\,S}\ \sum_{i=1}^{n}\big(\sigma_i(L) + \epsilon_1\big)^p + \lambda \sum_{i,j}\big(|S_{ij}| + \epsilon_2\big)^q + \frac{\mu}{2}\,\|X - L - S\|_F^2. \tag{2.3}$$

The shifts ε_1 and ε_2 ensure that the weights associated with zero singular values and with entries of the sparse component are well defined. The above problem is non-smooth and has two variables. We present a Proximal Iteratively Reweighted Algorithm (PIRA) to solve it.

Intuitively, we need to update L and S alternately. For fixed S = S_k in the k-th iteration, problem (2.3) reduces to:

$$\min_{L}\ \sum_{i=1}^{n}\big(\sigma_i(L) + \epsilon_1\big)^p + \frac{\mu}{2}\,\|X - L - S_k\|_F^2. \tag{2.4}$$

To solve the above problem, we linearize the objective function of (2.3) using the Taylor expansion w.r.t. L at L_k and add a proximal term. L is then updated by minimizing the relaxed function:

$$L_{k+1} = \arg\min_{L}\ \sum_{i=1}^{n} w_i^k\,\sigma_i(L) + \langle \nabla f(L_k),\, L - L_k\rangle + \frac{\rho}{2}\,\|L - L_k\|_F^2, \tag{2.5}$$

where f(L) = (μ/2)‖X − L − S_k‖²_F and

$$w_i^k = p\,\big(\sigma_i(L_k) + \epsilon_1\big)^{p-1}, \qquad i = 1, \ldots, n, \tag{2.6}$$

are the weights corresponding to the singular values σ_i(L). They are actually the gradients of (σ_i + ε_1)^p w.r.t. σ_i at σ_i(L_k). The backtracking rule can be used to estimate the step size ρ in each iteration [BT09]. Note that problem (2.5) is non-convex. Fortunately, it has a closed form solution, as shown in [CDC13].

Lemma 1.

Given Y ∈ ℝ^(m×n), non-negative non-decreasing weights 0 ≤ w_1 ≤ ⋯ ≤ w_n, and ρ > 0, the optimal solution to the following problem:

$$\min_{L}\ \sum_{i=1}^{n} w_i\,\sigma_i(L) + \frac{\rho}{2}\,\|L - Y\|_F^2 \tag{2.7}$$

is given by the weighted singular value thresholding:

$$L^* = U\,\mathcal{S}_{w/\rho}(\Sigma)\,V^\top, \tag{2.8}$$

where Y = UΣV^⊤ is the SVD of Y, and (S_{w/ρ}(Σ))_{ii} = max(Σ_{ii} − w_i/ρ, 0).

Using Lemma 1, L can be updated by:

$$L_{k+1} = U\,\mathcal{S}_{w^k/\rho}(\Sigma)\,V^\top, \quad \text{where } U\Sigma V^\top \text{ is the SVD of } L_k - \tfrac{1}{\rho}\nabla f(L_k). \tag{2.9}$$
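A NumPy sketch of the weighted singular value thresholding update (2.8)/(2.9); weighted_svt is our own hypothetical helper. It assumes non-decreasing weights as Lemma 1 requires, which (2.6) produces automatically, since p − 1 < 0 and singular values are sorted in decreasing order.

    import numpy as np

    def weighted_svt(Y, w, rho):
        # Solves min_L sum_i w_i * sigma_i(L) + (rho/2)*||L - Y||_F^2, cf. (2.7),
        # via U * max(Sigma - w/rho, 0) * V^T, cf. (2.8).
        U, sig, Vt = np.linalg.svd(Y, full_matrices=False)
        return (U * np.maximum(sig - np.asarray(w) / rho, 0)) @ Vt

    # One L-update per (2.6) and (2.9), given X, L_k, S_k, p, eps1, mu, rho:
    # w = p * (np.linalg.svd(L_k, compute_uv=False) + eps1) ** (p - 1)
    # grad = -mu * (X - L_k - S_k)           # gradient of f at L_k
    # L_next = weighted_svt(L_k - grad / rho, w, rho)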

Input: X ∈ ℝ^(m×n), λ, μ, p and q.
Initialize: k = 0, L_0, S_0, w^0, v^0, ε_1, ε_2 and ρ.
while not converged do

  1. Update L_{k+1} by (2.9).

  2. Update S_{k+1} by (2.15).

  3. Update the weight vector w^{k+1}, i = 1, …, n, by (2.6).

  4. Update the weight matrix v^{k+1}, i = 1, …, m, j = 1, …, n, by (2.12).

end while
Output: L = L_{k+1}, S = S_{k+1}.

Algorithm 1 Solving Problem (2.3) using PIRA

Note that the main computation in each iteration is one SVD. The reweighting is expected to progressively improve the rank estimate: even if the initial rank is large, small singular values receive large weights and are shrunk to zero in subsequent iterations. Thus, the rank of L_k decreases as the iterations proceed.

Figure 1: The convergence of the subproblems (2.4) and (2.10) is shown in (a) and (b), respectively. The convergence of the global objective function (2.2) is shown in (c).

For fixed L = L_{k+1} in the k-th iteration, problem (2.3) reduces to:

$$\min_{S}\ \lambda \sum_{i,j}\big(|S_{ij}| + \epsilon_2\big)^q + \frac{\mu}{2}\,\|X - L_{k+1} - S\|_F^2. \tag{2.10}$$

To solve the above problem, we again linearize the objective function of (2.3) using the Taylor expansion w.r.t. S at S_k and add a proximal term. S is then updated by minimizing the relaxed function:

$$S_{k+1} = \arg\min_{S}\ \lambda \sum_{i,j} v_{ij}^k\,|S_{ij}| + \langle \nabla g(S_k),\, S - S_k\rangle + \frac{\rho}{2}\,\|S - S_k\|_F^2, \tag{2.11}$$

where g(S) = (μ/2)‖X − L_{k+1} − S‖²_F and

$$v_{ij}^k = q\,\big(|(S_k)_{ij}| + \epsilon_2\big)^{q-1}, \qquad i = 1, \ldots, m,\ j = 1, \ldots, n, \tag{2.12}$$

are the weights corresponding to the entries of S. They are actually the gradients of (|S_{ij}| + ε_2)^q w.r.t. |S_{ij}| at |(S_k)_{ij}|. Note that this update requires only O(mn) flops. The value of the step size ρ influences the convergence of the iterations [HYZ08]; we set it empirically so that the required convergence condition holds. Note that problem (2.11) is separable: each entry S_{ij} can be solved for separately using the following closed form solution [HYZ08]:

Lemma 2.

Given y ∈ ℝ and τ > 0, the optimal solution to the following problem

$$\min_{x}\ \tau\,|x| + \frac{1}{2}(x - y)^2 \tag{2.13}$$

is given by the soft-thresholding operator:

$$x^* = \mathrm{sign}(y)\,\max(|y| - \tau,\ 0). \tag{2.14}$$

Using Lemma 2, S can be updated entrywise by

$$S_{k+1} = \mathrm{sign}(Y^k) \odot \max\!\big(|Y^k| - \tfrac{\lambda}{\rho}\,v^k,\ 0\big), \quad \text{where } Y^k = S_k - \tfrac{1}{\rho}\nabla g(S_k), \tag{2.15}$$

and ⊙ denotes the entrywise product.
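The corresponding S-update is an entrywise weighted soft thresholding; a minimal sketch under the same reconstruction, with weighted_soft_threshold as our own hypothetical helper.

    import numpy as np

    def weighted_soft_threshold(Y, T):
        # Entrywise solution of min_S sum_ij T_ij*|S_ij| + (1/2)*||S - Y||_F^2,
        # applying Lemma 2 to each entry with threshold T_ij >= 0.
        return np.sign(Y) * np.maximum(np.abs(Y) - T, 0)

    # One S-update per (2.12) and (2.15), given X, S_k, L_next, q, eps2, lam, mu, rho:
    # v = q * (np.abs(S_k) + eps2) ** (q - 1)
    # grad = -mu * (X - L_next - S_k)        # gradient of g at S_k
    # S_next = weighted_soft_threshold(S_k - grad / rho, lam * v / rho)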

Alternately updating L by (2.9) and S by (2.15), together with their weights w by (2.6) and v by (2.12), yields the proposed Proximal Iteratively Reweighted Algorithm (PIRA), as shown in Algorithm 1. Note that in each iteration, PIRA solves a weighted nuclear norm minimization problem and a weighted ℓ_1-norm minimization problem. Both have closed form solutions, and their computational costs are the same as for the corresponding convex problems.

The detailed procedure of our algorithm is shown in Algorithm 1. We first use a common strategy that relaxes the problem to the form (2.3) by introducing ε_1 and ε_2, and then fix S and L in turn to obtain the solutions of the two subproblems (steps 1-2). Updated weights w and v, computed from the current solutions, are used in the next iteration (steps 3-4). We provide further analysis of our algorithm in the following section.
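Putting the pieces together, the following is a compact sketch of Algorithm 1 under the reconstruction above; the initialization, step size ρ, default λ, and stopping rule are illustrative assumptions, not the authors' exact settings.

    import numpy as np

    def pira(X, p=0.5, q=0.5, lam=None, mu=1.0, rho=2.0,
             eps1=1e-2, eps2=1e-2, max_iter=100, tol=1e-6):
        m, n = X.shape
        lam = 1.0 / np.sqrt(max(m, n)) if lam is None else lam
        L, S = X.astype(float).copy(), np.zeros((m, n))
        for _ in range(max_iter):
            L_old = L
            # Step 1: L-update by weighted SVT, cf. (2.6) and (2.9)
            w = p * (np.linalg.svd(L, compute_uv=False) + eps1) ** (p - 1)
            G = -mu * (X - L - S)                     # grad of f at L_k
            U, sig, Vt = np.linalg.svd(L - G / rho, full_matrices=False)
            L = (U * np.maximum(sig - w / rho, 0)) @ Vt
            # Step 2: S-update by weighted soft thresholding, cf. (2.12), (2.15)
            v = q * (np.abs(S) + eps2) ** (q - 1)
            G = -mu * (X - L - S)                     # grad of g at S_k
            Y = S - G / rho
            S = np.sign(Y) * np.maximum(np.abs(Y) - lam * v / rho, 0)
            # Steps 3-4 (weight updates) happen implicitly at the top of the loop.
            if np.linalg.norm(L - L_old, 'fro') <= tol * (1 + np.linalg.norm(L_old, 'fro')):
                break
        return L, S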

3 Algorithmic Analysis

In this section, we give a detailed analysis of our algorithm. We first show that the obtained solution L_{k+1} converges to a stationary point of problem (2.5), and that the solution S_{k+1} is optimal for problem (2.11). We then argue that the obtained solutions approximate the optimal solutions of the original problems (2.4) and (2.10). Finally, numerical results show that PIRA decreases the objective function (2.3) monotonically.

The experimental data are generated as X = L* + S* + N, where L* = UV^⊤ is a rank-r matrix whose factors U and V are generated by the Matlab function randn. Each element of S* is set to zero with probability 1 − ρ_s, and the non-zero entries are sampled uniformly from the interval [-5, 5]. N is Gaussian noise with i.i.d. entries N(0, σ²). We set p and q to 0.5; as shown in the synthetic data experiment later, p = q = 0.5 is representative.
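The synthetic data described above can be generated as follows; the specific sizes m, n, rank r, sparsity ρ_s and noise level σ below are illustrative placeholders, since the text's exact values did not survive extraction.

    import numpy as np

    rng = np.random.default_rng(0)
    m, n, r, rho_s, sigma = 400, 400, 10, 0.1, 0.01   # illustrative values only
    L_star = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # rank r
    mask = rng.random((m, n)) < rho_s                 # non-zeros w.p. rho_s
    S_star = np.where(mask, rng.uniform(-5, 5, (m, n)), 0.0)
    N = sigma * rng.standard_normal((m, n))           # i.i.d. Gaussian noise
    X = L_star + S_star + N                           # observed matrix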

For the low rank part recovery, the original problem has the form (2.4). After linearization, problem (2.4) becomes (2.5), which falls into the general non-convex low-rank minimization form

$$\min_{L}\ \sum_{i=1}^{n} g\big(\sigma_i(L)\big) + f(L). \tag{3.1}$$

We observe that the penalty function in our problem (2.5), g(σ) = (σ + ε_1)^p, is continuous, concave and monotonically increasing on [0, ∞). The loss function f in Eqn. (2.5) is smooth and continuously differentiable with a Lipschitz continuous gradient, that is,

$$\|\nabla f(L_1) - \nabla f(L_2)\|_F \leq L_f\,\|L_1 - L_2\|_F, \tag{3.2}$$

where, in the k-th iteration, L_f is the Lipschitz constant of ∇f, and ∇f denotes the gradient of f [CSZ14]. The solution obtained by PIRA then has the attractive properties stated in the following lemmas.

Figure 2: Comparison of low-rank and sparse matrix recovery under varying noise levels (panels (a) and (b)).
Lemma 3.

The sequence {L_k} generated by Algorithm 1 satisfies the following properties: (1) the objective value of (2.5) decreases monotonically; (2) the sequence {L_k} is bounded; (3) lim_{k→∞} ‖L_{k+1} − L_k‖_F = 0.

Lemma 4.

Let {L_k} generated by Algorithm 1 be bounded, as shown in Lemma 3. Then any accumulation point of {L_k} is a stationary point.

Thus, the objective function value of (2.5) decreases monotonically, and any limit point of {L_k} is a stationary point. We next empirically demonstrate the convergence of the subproblems (2.4) and (2.10) and of the global problem (2.2). In the k-th iteration we record the objective value of the L-subproblem, the objective value of the S-subproblem, and the global objective value. Figure 1 reports these objective values as functions of the iteration number; panel (a) plots the L-subproblem objective. The plots show that the algorithm converges within a limited number of iterations.

4 Experiments

In this section, we conduct experiments on both synthetic and real visual data to validate the effectiveness of our proposed method, p,q-PCP. In the experiments on synthetic data, we mainly discuss the influence of various p, q values and noise levels. For the real-world data sets, we test our method on multiple tasks, such as image denoising and light/shadow removal.

The comparative algorithms include the classical convex solution SPCP [ZLW10], the Non-Smooth Augmented Lagrangian algorithm (NSA) [AGI11] and Truncated Nuclear Norm Regularization (TNNR) [HZY12]. We use the ADMM-based solver for the TNNR subproblem in the released code (denoted TNNR-ADMM). As TNNR-ADMM can only recover the authentic structure, we compare our method with it only in the image denoising application. For the parameters of our algorithm, λ is set as in (2.1) (empirical value); the smoothing parameters are initialized to a large value and decreased by a factor of 1.1 after each iteration; and p and q are tuned using 3-fold cross validation. The parameters of the comparative algorithms are tuned similarly.

All the experiments are conducted in Matlab on a PC with an Intel Core 2 Q9550 2.83 GHz CPU and 8 GB RAM.

4.1 Synthetic Data

In this experiment, we verify the effectiveness and robustness of our algorithm by comparing with NSA and SPCP. For each parameter setting, we report the average result over 10 trials. We generate a rank-r matrix L* as L* = UV^⊤, where U ∈ ℝ^(m×r) and V ∈ ℝ^(n×r) are generated following the Gaussian distribution. The elements of S* are zero with probability 1 − ρ_s, and the non-zero entries are sampled uniformly from the interval [-5, 5] (with probability ρ_s). We further add Gaussian noise N with i.i.d. entries N(0, σ²). X = L* + S* + N is then the observed matrix to be recovered.

Figure 3: Comparison of low-rank and sparse matrix recovery with varying matrix sizes; panels: (a) RSE(L), (b) RSE(S). In the case of p = q = 1, our method p,q-PCP is the same as SPCP.
rank(L*) | Algorithm | rank(L̂) | ‖Ŝ‖_0 | RSE(L) | RSE(S) | Acc
r = 5  | NSA      | 118 | 23041 |  3.32 |   6.50 | 0.9986
r = 5  | SPCP     |  24 |  9635 |  2.33 |   7.43 | 0.8262
r = 5  | p,q-PCP  |   5 |  8023 |  1.12 |   3.75 | 0.9911
r = 15 | NSA      | 118 | 23209 |  4.67 |  14.36 | 0.9978
r = 15 | SPCP     |  34 | 10348 |  3.20 |  10.87 | 0.7704
r = 15 | p,q-PCP  |  15 |  7969 |  1.14 |   4.06 | 0.9919
r = 20 | NSA      | 120 | 23452 | 14.66 |  50.42 | 0.9941
r = 20 | SPCP     |  50 | 12569 |  6.83 |  23.78 | 0.6297
r = 20 | p,q-PCP  |  20 |  7921 |  1.25 |   4.87 | 0.9870
r = 25 | NSA      | 122 | 23825 | 36.82 | 142.39 | 0.9775
r = 25 | SPCP     |  83 | 18483 | 21.95 |  84.89 | 0.4249
r = 25 | p,q-PCP  |  26 |  8077 |  3.25 |  14.27 | 0.9740
Table 1: Comparison of low-rank and sparse matrix recovery with varying underlying ranks of data.

We first examine our proposed p,q-PCP algorithm with different p and q. The data are generated with different noise levels σ. For simplicity, we set p = q. When p = q = 1, our p,q-PCP model is exactly the SPCP model. We measure the recovery performance by the Relative Squared Error (RSE) of the low rank part and the sparse part:

$$\mathrm{RSE}(L) = \|\hat{L} - L^*\|_F \,/\, \|L^*\|_F, \tag{4.1}$$
$$\mathrm{RSE}(S) = \|\hat{S} - S^*\|_F \,/\, \|S^*\|_F, \tag{4.2}$$

where L̂ and Ŝ are the recovered matrices and L*, S* are the ground truth. The experimental results are shown in Figure 2. It can be seen that p,q-PCP achieves better recovery performance with smaller values of p and q; the performance with p = q = 0.1 is comparable to that with p = q = 0.5.
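A one-line helper matching (4.1)/(4.2) as reconstructed here (relative error measured in the Frobenius norm):

    import numpy as np

    def rse(M_hat, M_star):
        # RSE = ||M_hat - M*||_F / ||M*||_F, cf. (4.1) and (4.2)
        return np.linalg.norm(M_hat - M_star, 'fro') / np.linalg.norm(M_star, 'fro')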

To verify the effectiveness and robustness of our algorithm, we further design two comparison experiments: one varies the underlying rank of the observed data, and the other varies the dimension of the matrix. The experimental settings are as follows:

  • We fix the matrix size and the noise level, and vary the rank r in the set {5, 15, 20, 25};

  • We fix the rank r and the noise level, and vary the matrix size.

Following these two settings, the experimental results for the different algorithms are shown in Table 1 and Figure 3. In all cases, our algorithm outperforms NSA and SPCP in terms of the rank of L̂ and the sparsity of Ŝ. Specifically, Figure 3 records the relative squared errors of the low-rank and sparse matrices; our algorithm with three representative values of p = q outperforms NSA and SPCP in every case. For our own algorithm, the differences between p = q = 0.1 and p = q = 0.5 are limited, so in the following experiments we adopt p = q = 0.5. Table 1 shows comprehensive results: the recovery errors RSE(L) and RSE(S), and the accuracy of the captured sparse locations (Acc). L̂ and Ŝ are the solutions obtained by the different algorithms, and L*, S* are the ground truth matrices; ‖Ŝ‖_0 is the number of non-zero entries of Ŝ, and Acc records the percentage of correctly located entries. From the results in Table 1, our algorithm and NSA both perform much better than SPCP in capturing the sparse locations in S, and our algorithm obtains the best approximation of the low-rank and sparse matrices. In particular, for r = 25, we obtain RSE(L) of 3.25, compared with NSA (36.82) and SPCP (21.95); furthermore, our algorithm achieves RSE(S) of 14.27, much better than NSA (142.39) and SPCP (84.89).

Figure 4: Comparison of image recovery using different low rank approximation algorithms (panels: (a) Original, (b) Noisy, (c) TNNR-ADMM, (d) NSA, (e) SPCP, (f) p,q-PCP). The images in the second and last rows are the amplified patches outlined in red in the previous row.
Figure 5: Comparison of the relative squared error (a) and PSNR values (b) on the 50 natural images.

4.2 Image Denoising

Real images may be corrupted by different types of noise. In this experiment, we apply the low-rank approximation algorithms to image denoising. We download 50 images from the Google image search engine; all have three channels and the same size. We add Gaussian noise to a portion of the pixels of each image. Since a color image consists of three channels (R, G and B), we run the principal component pursuit algorithms on each channel separately and recover the image by combining the results from the three channels. Some recovered images are shown in Figure 4. It can be seen that our method achieves the best image recovery performance: the images recovered by NSA and SPCP are blurred and miss important details, while those of TNNR-ADMM are much clearer than NSA and SPCP but still not as good as ours.
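The per-channel processing amounts to running a matrix-recovery solver on each color plane and recombining the results; a sketch under that description, where denoise_rgb and the solver argument are our own hypothetical names.

    import numpy as np

    def denoise_rgb(img, solver):
        # img: H x W x 3 array; solver: e.g. pira, returning (L, S) per channel.
        out = np.empty(img.shape, dtype=float)
        for c in range(3):
            L, _ = solver(img[:, :, c].astype(float))
            out[:, :, c] = L                  # keep the low rank (clean) part
        return out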

We measure the recovery performance by the RSE(L) defined in (4.1) and the PSNR (Peak Signal-to-Noise Ratio) [HTG08]. Figure 5 plots the RSE and PSNR values on the 50 test images. Our algorithm obtains the highest PSNR and the smallest RSE on all images, indicating that our low-rank approximation is better than the traditional nuclear norm heuristic in this setting. Although TNNR-ADMM is a non-convex method, it is still inferior to ours, which demonstrates the effectiveness of both our model and the optimization method.

Figure 6: Background extraction using different algorithms (panels: (a) Original, (b) NSA, (c) SPCP, (d) p,q-PCP). Because of limited space, we only show the results for three frames (corresponding to the three rows).

4.3 Other Applications

To test the generality of our method, we apply it to two more tasks: separating the foreground and background in a surveillance video, and removing light/shadow from facial images.

Specifically, we first apply the different approaches to background separation and foreground object detection in an airport surveillance video [LHGT04]. It contains a sequence of 201 grayscale frames of 144×176 pixels captured during an observation period, so the size of the observed data matrix is 25344×201.
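Forming the data matrix is a matter of vectorizing each frame into a column; the low rank part L then models the (nearly static) background and S the moving foreground. A sketch with hypothetical helper names:

    import numpy as np

    def frames_to_matrix(frames):
        # frames: list of h x w grayscale arrays -> (h*w) x num_frames matrix
        return np.stack([f.ravel() for f in frames], axis=1)

    # X = frames_to_matrix(frames)           # e.g. 25344 x 201 for this video
    # L, S = pira(X)                         # background in L, foreground in S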

Figure 6 shows the background extraction results of the NSA, SPCP and p,q-PCP algorithms. All three methods are able to separate the background from the foreground. To compare the background recovery more clearly, we mark the ghosting parts of the NSA result with rectangles, and mark the same regions in the original frame and in the backgrounds recovered by SPCP and p,q-PCP. Although SPCP recovers a better background than NSA, ghosting shadows remain. Our algorithm recovers the background best, removing almost all the shadows.

Figure 7: Shadow removal results for facial images of Subject 1 (a) and Subject 2 (b); light removal results for facial images of Subject 3 (c) and Subject 4 (d). All images are from the YaleB database. In each subfigure, the first row shows the low rank and sparse parts recovered by NSA; the second and third rows show the results of SPCP and our method.

We further apply the principal component pursuit algorithms to removing light and shadow from facial images. Such processing is usually very important for face recognition. PCP can be applied here because a captured facial image can be regarded as the sum of a low rank part (the common face) and sparse errors (e.g., light/shadow). We test the competing algorithms on the YaleB data set, which consists of 38 subjects under different illuminations, with 64 images per subject. We apply NSA, SPCP and p,q-PCP to this data set and plot four example images (one per individual) in Figure 7. The shadow and light on the faces are removed, and the recovered facial images are very clear, which verifies the effectiveness of our proposed algorithm; all three methods perform well in this setting.

5 Conclusions and Future Work

This study investigates the Schatten-p norm and ℓ_q-norm regularized non-convex principal component pursuit. We develop an iteratively reweighted algorithm, PIRA, to solve the non-convex problem. In each iteration, PIRA solves two sub-problems that have closed form solutions, and the obtained L and S are optimal for the linearized sub-problems. We demonstrate the convergence of the objective function on the computed low rank and sparse parts by experiments, and present extensive experiments on both synthetic and real-world data to demonstrate the attractive properties and effectiveness of our algorithm. Interesting future work includes applying the joint Schatten-p norm and ℓ_q-norm to other low-rank and sparse problems, such as video denoising.

References

  • [AEB06] Michal Aharon, Michael Elad, and Alfred Bruckstein. K-svd: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on Signal Processing, 54(11):4311–4322, 2006.
  • [AGI11] Necdet Serhat Aybat, Donald Goldfarb, and Garud Iyengar. Fast first-order methods for stable principal component pursuit. arXiv preprint arXiv:1105.2126, 2011.
  • [BT09] Amir Beck and Marc Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2(1):183–202, 2009.
  • [CDC13] Kun Chen, Hongbo Dong, and Kung-Sik Chan. Reduced rank regression via adaptive nuclear norm penalization. Biometrika, 2013.
  • [CGWY11] Xiaojun Chen, Dongdong Ge, Zizhuo Wang, and Yinyu Ye. Complexity of unconstrained L_2-L_p minimization. Mathematical Programming, pages 1–13, 2011.
  • [CHH13] Hsien-Hsin Chou, Ling-Yuan Hsu, and Hwai-Tsu Hu. Turbulent-pso-based fuzzy image filter with no-reference measures for high-density impulse noise. IEEE Transactions on Cybernetics, 43(1):296–307, 2013.
  • [CLMW11] Emmanuel J Candès, Xiaodong Li, Yi Ma, and John Wright. Robust principal component analysis? Journal of the ACM, 58(3), 2011.
  • [CSZ14] Canyi Lu, Shuicheng Yan, and Zhouchen Lin. Generalized nonconvex nonsmooth low-rank minimization. In CVPR, 2014.
  • [CWB08] Emmanuel J Candès, Michael B Wakin, and Stephen P Boyd. Enhancing sparsity by reweighted ℓ_1 minimization. Journal of Fourier Analysis and Applications, 14(5-6):877–905, 2008.
  • [CZ] Xiaojun Chen and Weijun Zhou. Convergence of the reweighted ℓ_1 minimization algorithm for ℓ_2-ℓ_p minimization. To appear in Computational Optimization and Applications.
  • [ER06] Ramin Eslami and Hayder Radha. Translation-invariant contourlet transform and its application to image denoising. IEEE Transactions on Image Processing, 15(11):3362–3374, 2006.
  • [Faz02] Maryam Fazel. Matrix rank minimization with applications. PhD thesis, Stanford University, 2002.
  • [FL01] Jianqing Fan and Runze Li. Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456):1348–1360, 2001.
  • [FL09] S. Foucart and M.J. Lai. Sparsest solutions of underdetermined linear systems via ℓ_q-minimization for 0 < q ≤ 1. Applied and Computational Harmonic Analysis, 26(3):395–407, 2009.
  • [HTG08] Quan Huynh-Thu and Mohammed Ghanbari. Scope of validity of psnr in image/video quality assessment. Electronics letters, 44(13):800–801, 2008.
  • [HYZ08] Elaine T Hale, Wotao Yin, and Yin Zhang. Fixed-point continuation for ℓ_1-minimization: Methodology and convergence. SIAM Journal on Optimization, 19(3):1107–1130, 2008.
  • [HZY12] Yao Hu, Debing Zhang, Jieping Ye, Xuelong Li, and Xiaofei He. Fast and accurate matrix completion via truncated nuclear norm regularization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(9):2117–2130, 2013.
  • [JLSX10] Hui Ji, Chaoqiang Liu, Zuowei Shen, and Yuhong Xu. Robust video denoising using low rank matrix completion. In CVPR, pages 1791–1798, 2010.
  • [JY09] Shuiwang Ji and Jieping Ye. An accelerated gradient method for trace norm minimization. In ICML, pages 457–464, 2009.
  • [LCM09] Zhouchen Lin, Minming Chen, and Yi Ma. The augmented lagrange multiplier method for exact recovery of corrupted low- rank matrices. UIUC Technical Report UILU-ENG-09-2215, 2009.
  • [Lee80] Jong-Sen Lee. Digital image enhancement and noise filtering by use of local statistics. IEEE Transactions on Pattern Analysis and Machine Intelligence, (2):165–168, 1980.
  • [LHGT04] Liyuan Li, Weimin Huang, Irene Yu-Hua Gu, and Qi Tian. Statistical modeling of complex backgrounds for foreground object detection. IEEE Transactions on Image Processing, 13(11):1459–1472, 2004.
  • [LLY13a] Guangcan Liu, Zhouchen Lin, Shuicheng Yan, Ju Sun, Yong Yu, and Yi Ma. Robust recovery of subspace structures by low-rank representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(1):171–184, 2013.
  • [LLY13b] Guangcan Liu, Zhouchen Lin, Shuicheng Yan, Ju Sun, Yong Yu, and Yi Ma. Robust recovery of subspace structures by low-rank representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(1):171–184, 2013.
  • [LO00] Xin Li and Michael T Orchard. Spatially adaptive image denoising under overcomplete expansion. In ICIP, volume 3, pages 300–303, 2000.
  • [LXY13] Ming-Jun Lai, Yangyang Xu, and Wotao Yin. Improved iteratively reweighted least squares for unconstrained smoothed ℓ_q minimization. SIAM Journal on Numerical Analysis, 51(2):927–957, 2013.
  • [MDYY11] Yadong Mu, Jian Dong, Xiaotong Yuan, and Shuicheng Yan. Accelerated low-rank visual recovery by random projection. In CVPR, pages 2609–2616, 2011.
  • [MF12] Karthik Mohan and Maryam Fazel. Iterative reweighted algorithms for matrix rank minimization. Journal of Machine Learning Research, 13:3441–3473, 2012.
  • [NHD12] Feiping Nie, Heng Huang, and Chris HQ Ding. Low-rank matrix recovery via efficient schatten p-norm minimization. In AAAI, 2012.
  • [NWC12] Feiping Nie, Hua Wang, Xiao Cai, Heng Huang, and Chris Ding. Robust matrix completion via joint Schatten p-norm and ℓ_p-norm minimization. In ICDM, pages 566–574, 2012.
  • [PGW10] Yigang Peng, Arvind Ganesh, John Wright, Wenli Xu, and Yi Ma. RASL: Robust alignment by sparse and low-rank decomposition for linearly correlated images. In CVPR, pages 763–770, 2010.
  • [PSWS03] Javier Portilla, Vasily Strela, Martin J Wainwright, and Eero P Simoncelli. Image denoising using scale mixtures of gaussians in the wavelet domain. IEEE Transactions on Image Processing, 12(11):1338–1351, 2003.
  • [RFP10] Benjamin Recht, Maryam Fazel, and Pablo A Parrilo. Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization. SIAM review, 52(3):471–501, 2010.
  • [SCD02] J-L Starck, Emmanuel J Candès, and David L Donoho. The curvelet transform for image denoising. IEEE Transactions on Image Processing, 11(6):670–684, 2002.
  • [SLL14] Ling Shao, Li Liu, and Xuelong Li. Feature learning for image classification via multiobjective genetic programming. IEEE Transactions on Neural Networks and Learning Systems, 25(7):1359–1371, 2014.
  • [SYLL08] Ling Shao, Ruomei Yan, Xuelong Li, and Yan Liu. From heuristic optimization to dictionary learning: a review and comprehensive comparison of image denoising algorithms. IEEE Transactions on Cybernetics, 44(7):1001–1013, 2014.
  • [TY10] Kim-Chuan Toh and Sangwoon Yun. An accelerated proximal gradient algorithm for nuclear norm regularized linear least squares problems. Pacific Journal of Optimization, 6(3):615–640, 2010.
  • [WLLL13] Zhangyang Wang, Houqiang Li, Qing Ling, and Weiping Li. Robust temporal-spatial decomposition and its applications in video processing. IEEE Transactions on Circuits and Systems for Video Technology, 23(3):387–400, 2013.
  • [WYG09] John Wright, Allen Y Yang, Arvind Ganesh, Shankar S Sastry, and Yi Ma. Robust face recognition via sparse representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(2):210–227, 2009.
  • [YSL13] Ruomei Yan, Ling Shao, and Yan Liu. Nonlocal hierarchical dictionary learning using wavelets for image denoising. IEEE Transactions on Image Processing, 22(12):4689–4698, 2013.
  • [ZLW10] Zihan Zhou, Xiaodong Li, John Wright, Emmanuel Candes, and Yi Ma. Stable principal component pursuit. In ISIT, pages 1518–1522, 2010.
  • [ZS14] Fan Zhu and Ling Shao. Weakly-supervised cross-domain dictionary learning for visual recognition. International Journal of Computer Vision, 109(1-2):42–59, 2014.
  • [ZT11] Tianyi Zhou and Dacheng Tao. Godec: Randomized low-rank & sparse matrix decomposition in noisy case. In ICML, pages 33–40, 2011.
  • [ZYZH13] Haichao Zhang, Jianchao Yang, Yanning Zhang, and Thomas S Huang. Image and video restorations via nonlocal kernel regression. IEEE Transactions on Cybernetics, 43(3):1035–1046, 2013.