I Introduction
Fig. 1: (a) Explicit singular value shrinkage. (b) CPA-based singular value shrinkage. (c) Matrix rank minimization.
The low-rank structure inherent in various signals has been widely exploited in many signal processing applications, such as matrix and tensor completion
[1, 2, 3, 4], image decomposition [5, 6], photometric stereo [7, 8], image alignment [9, 10, 11], inpainting [12, 13], background modeling [14, 15, 16, 17, 18], color artifact removal [19], cognitive radio [20], and voice separation [21]. In such applications, the low-rank structure is incorporated into a minimization problem involving the rank function or its continuous relaxation. The problem is solved using iterative algorithms, which usually require the thresholding of singular values at each iteration. We refer to this methodology as matrix rank minimization. There are two representative approaches to matrix rank minimization. One is the exact method. It is an ideal formulation, but the resulting problem is very difficult to solve due to the nonconvexity and combinatorial nature of the rank function. The other is nuclear norm relaxation [22]. Since the nuclear norm, the sum of the singular values of a matrix, is the tightest convex relaxation of the rank function, the resulting problem can be solved efficiently via convex optimization techniques. Weighted nuclear norm relaxation [23, 24] has recently been proposed as a nonconvex but continuous approximation of the rank function.
Essentially, both of the above methods require the thresholding of singular values, which we call singular value shrinkage, at each iteration of certain optimization methods (Fig. 1). That is, most methods for matrix rank minimization must carry out singular value decomposition (SVD) many times. This is a serious problem in terms of computational cost when handling large matrices, even on high-spec computers.
Several methods have been proposed to tackle this issue [25, 26, 27]. The basic concept of [25, 26] is to approximately compute partial singular values and/or vectors. These methods can drastically reduce the computation time of singular value shrinkage but are not well suited to matrix rank minimization. Since the number of singular values above a threshold cannot be identified without the full decomposition, singular values that exceed the threshold but are not among the computed ones are implicitly set to zero in each iteration. As a result, large approximation errors are produced, which leads to unstable convergence in the matrix rank minimization. With the other method [27], singular value shrinkage is carried out by computing neither singular values nor vectors, but the reduction in computation time is still limited. This is because the method requires a complete orthogonal decomposition [28] and the calculation of the inverse of a large matrix. It also produces large approximation errors, i.e., unstable convergence in the matrix rank minimization. Our method shares the viewpoint of Cai and Osher [27]: we only need a "processed" matrix with thresholded singular values.
In this paper, we propose a fast singular value shrinkage method for reducing the computational cost of matrix rank minimization for high-dimensional matrices. The proposed method computes neither singular values nor vectors during the shrinkage process, similar to the method by Cai and Osher [27]. Furthermore, our method maintains sufficient computational precision so that matrix rank minimization algorithms converge stably. The two key tools of our method are described as follows.

Chebyshev polynomial approximation (CPA) [29, 30, 31]: This tool is often used for designing filters in signal processing [32, 33] and is a key tool for reducing computational cost. The applications of CPA have been studied by Saad et al. [34, 35, 36, 37]. In those applications, CPA is used to calculate a vector after it is transformed by a matrix with singular value shrinkage; that is, it requires iterative multiplications of a matrix and a vector to derive the Chebyshev polynomials. The concept has recently been used for improving the performance of image filtering methods such as the bilateral filter, nonlocal means, and BM3D [38, 39, 40]. In contrast, we propose a method to obtain a matrix whose singular values are processed by using CPA. Since CPA incurs truncation errors, such as ripples in lower-order approximations, we also investigate the design of thresholding functions and appropriate approximation orders for reducing approximation errors.

Sparsity of signals: By using CPA, our method can represent singular value shrinkage as a multiplication of matrices. Since the multiplication can be computed efficiently when the matrices are sparse, our method exploits the inherent sparsity of signals in their frequency domain for further acceleration.
Since matrix rank minimization plays a central role in various signal processing tasks, our method offers many promising applications. For this study, we validated the proposed method on two image processing applications: image inpainting [12] and background modeling [14, 15, 16, 17, 18]. In these applications, the target problems are formulated as convex optimization problems involving the nuclear norm so that they can be efficiently solved using the alternating direction method of multipliers (ADMM) [41] with our method.
Although the ADMM is widely known to be robust against computation errors in each iteration, optimization methods (including the ADMM) combined with the other fast singular value shrinkage methods [25, 26, 27] do not converge well due to their large approximation errors. In contrast, our CPA-based singular value shrinkage method leads optimization methods to stable convergence. We validated this advantage experimentally by comparing our method with the other fast singular value shrinkage methods on several image processing tasks and on synthetic data.
A preliminary version of this study has previously been published [42]; it did not exploit signal sparsity and lacked the analysis and new applications presented here.
This paper is organized as follows. Section II defines notations and preliminaries. We discuss our CPA-based singular value shrinkage method, which is the main contribution of this paper, in Section III. We discuss the approximation order of CPA for reducing the size of approximation errors in Section IV and verify our method through applications in Section V. Finally, we conclude the paper in Section VI.
II Notations and Preliminaries
II-A Notations
Boldface capital and small letters indicate a matrix and a vector, respectively. The superscript $\top$ denotes the transpose of a matrix or a vector, and the superscript $-1$ denotes the inverse of a nonsingular matrix. The matrices $\mathbf{I}$ and $\mathbf{O}$ are the identity matrix and the null matrix, respectively. The vector $\mathbf{1}$ denotes the all-one vector. The $\ell_1$ norm of a vector $\mathbf{x}$ is defined as $\|\mathbf{x}\|_1 = \sum_i |x_i|$. We also use CPA as follows.

II-B Chebyshev Polynomial Approximation
Let $f(x)$ and $\tilde{f}(x)$ be a real-valued function defined on the interval $[-1, 1]$ and its approximation by CPA, respectively. Chebyshev polynomial approximation [29, 30, 31] gives an approximate solution of $f(x)$ by using the truncated Chebyshev series:

(1) $\tilde{f}(x) = \frac{1}{2}c_0 + \sum_{k=1}^{K} c_k T_k(x)$

where $c_k$ and $K$ denote a Chebyshev coefficient (described later) and the approximation order, respectively. Additionally, $T_k(x)$ denotes the $k$th-order Chebyshev polynomial of the first kind, defined as

(2) $T_k(x) = \cos(k \arccos x)$

It can also be computed using the stable recurrence relation:

(3) $T_k(x) = 2x\,T_{k-1}(x) - T_{k-2}(x)$

The initial conditions are defined as $T_0(x) = 1$ and $T_1(x) = x$. Since the polynomials consist of cosine functions, the value of $T_k(x)$ is bounded between $-1$ and $1$ for $x \in [-1, 1]$. By using (2) and the orthogonality of the cosine function, $c_k$ is calculated as

(4) $c_k = \frac{2}{\pi} \int_0^{\pi} \cos(k\theta)\, f(\cos\theta)\, d\theta$

where $\theta = \arccos x$.
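As a concrete illustration, the truncated series of (1)-(4) can be sketched in a few lines of Python. The soft-threshold kernel, the quadrature size, and all function names below are illustrative assumptions, not part of the original method.

```python
import numpy as np

# Sketch of Eqs. (1)-(4): truncated Chebyshev series of a scalar kernel.

def cheb_coeffs(f, K, n_quad=2000):
    # c_k = (2/pi) * int_0^pi cos(k*theta) f(cos theta) dtheta (cf. Eq. (4)),
    # approximated with the midpoint rule.
    theta = (np.arange(n_quad) + 0.5) * np.pi / n_quad
    fv = f(np.cos(theta))
    return np.array([(2.0 / n_quad) * np.sum(fv * np.cos(k * theta))
                     for k in range(K + 1)])

def cheb_eval(c, x):
    # Evaluate (1/2)c_0 + sum_{k>=1} c_k T_k(x) with the recurrence (3).
    x = np.asarray(x, dtype=float)
    T_prev, T_curr = np.ones_like(x), x
    out = 0.5 * c[0] * T_prev + c[1] * T_curr
    for k in range(2, len(c)):
        T_prev, T_curr = T_curr, 2.0 * x * T_curr - T_prev
        out = out + c[k] * T_curr
    return out

# Usage: approximate an illustrative soft-threshold kernel on [-1, 1] with K = 20.
soft = lambda x: np.sign(x) * np.maximum(np.abs(x) - 0.3, 0.0)
c = cheb_coeffs(soft, K=20)
x = np.linspace(-1.0, 1.0, 201)
err = np.max(np.abs(cheb_eval(c, x) - soft(x)))
```

Because the kernel is continuous, the truncation error decays steadily with the order K, in line with the discussion in Section IV.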
III Singular Value Shrinkage Using Chebyshev Polynomial Approximation by Exploiting Sparsity
We discuss singular value shrinkage using CPA. First, the CPA of a matrix form, which can approximately shrink the eigenvalues of a matrix (eigenvalue shrinkage), is introduced and then extended to the singular value case.

III-A Chebyshev Polynomial Approximation for a Matrix
Let $\mathbf{X}$ be a full-rank symmetric matrix and $\mathbf{X} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^{\top}$ be its eigendecomposition (EVD), where
$\mathbf{U}$ is the matrix composed of eigenvectors and
$\boldsymbol{\Lambda}$ is the diagonal matrix with the corresponding eigenvalues. We assume that the eigenvalues are bounded between $\lambda_{\min}$ and $\lambda_{\max}$, where $\lambda_{\min} < \lambda_{\max}$. Hence, the eigenvalues of $\mathbf{X}$ are shrunk as

(5) $\mathcal{G}(\mathbf{X}) = \mathbf{U}\, g(\boldsymbol{\Lambda})\, \mathbf{U}^{\top}$

where $\mathcal{G}$ is the eigenvalue shrinkage function, and $g$ is the filter kernel defined on $[\lambda_{\min}, \lambda_{\max}]$. In this subsection, we consider the approximated solution of (5) using the CPA.
The CPA of the matrix form [30, 34, 36, 37] gives an approximated solution of the eigenvalue shrinkage function by using the truncated Chebyshev series as

(6) $\tilde{\mathcal{G}}(\mathbf{X}) = \frac{1}{2}\tilde{c}_0 \mathbf{I} + \sum_{k=1}^{K} \tilde{c}_k T_k(\tilde{\mathbf{X}})$

where $\tilde{c}_k$ and $T_k(\tilde{\mathbf{X}})$ are Chebyshev coefficients and Chebyshev polynomials, respectively, which are defined later. Additionally, $\tilde{\mathbf{X}}$ is the eigenvalue-shifted matrix given by

(7) $\tilde{\mathbf{X}} = \frac{2\mathbf{X} - (\lambda_{\max} + \lambda_{\min})\mathbf{I}}{\lambda_{\max} - \lambda_{\min}}$

whose eigenvalues are obviously within $[-1, 1]$. Thanks to (7), the $k$th-order Chebyshev polynomial of $\tilde{\mathbf{X}}$ is computed as

(8) $T_k(\tilde{\mathbf{X}}) = \mathbf{U}\, T_k(\tilde{\boldsymbol{\Lambda}})\, \mathbf{U}^{\top}$

where $\tilde{\boldsymbol{\Lambda}}$ is the eigenvalue matrix of $\tilde{\mathbf{X}}$. Similarly to (3), the Chebyshev polynomials are obtained using the recurrence relation:

(9) $T_k(\tilde{\mathbf{X}}) = 2\tilde{\mathbf{X}}\, T_{k-1}(\tilde{\mathbf{X}}) - T_{k-2}(\tilde{\mathbf{X}})$, with $T_0(\tilde{\mathbf{X}}) = \mathbf{I}$ and $T_1(\tilde{\mathbf{X}}) = \tilde{\mathbf{X}}$
Recall that $T_k$ is defined only on the interval $[-1, 1]$. Therefore, the range of the filter kernel is modified by deriving $\tilde{g}$ as

(10) $\tilde{g}(x) = g\!\left(\frac{(\lambda_{\max} - \lambda_{\min})\,x + \lambda_{\max} + \lambda_{\min}}{2}\right)$

The argument of $g$ in (10) returns the shifted range $[-1, 1]$ back to the original range $[\lambda_{\min}, \lambda_{\max}]$. From (10), the Chebyshev coefficients can also be represented using (4) as

(11) $\tilde{c}_k = \frac{2}{\pi} \int_0^{\pi} \cos(k\theta)\, \tilde{g}(\cos\theta)\, d\theta$

The function $\tilde{\mathcal{G}}$, which is referred to as the CPA-based eigenvalue shrinkage function, results in approximate eigenvalue shrinkage. The CPA-based eigenvalue shrinkage computes neither eigenvalues nor eigenvectors thanks to the recurrence relation (9).
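The eigenvalue shrinkage just described can be sketched as follows, assuming a symmetric positive semidefinite input with eigenvalues in $[0, \lambda_{\max}]$; the kernel, sizes, and names are illustrative assumptions. The recurrence evaluates the matrix polynomial directly, so no EVD is computed inside the routine (the exact EVD below serves only as a reference for comparison).

```python
import numpy as np

# Sketch of Eqs. (5)-(11): eigenvalue shrinkage via the matrix Chebyshev
# recurrence, with no EVD inside the routine.

def cheb_coeffs(g, K, n_quad=2000):
    theta = (np.arange(n_quad) + 0.5) * np.pi / n_quad
    gv = g(np.cos(theta))
    return np.array([(2.0 / n_quad) * np.sum(gv * np.cos(k * theta))
                     for k in range(K + 1)])

def cpa_eig_shrink(X, g, lam_max, K=30):
    # Approximate U g(Lambda) U^T; eigenvalues assumed in [0, lam_max].
    n = X.shape[0]
    X_t = (2.0 * X - lam_max * np.eye(n)) / lam_max      # spectral shift, cf. (7)
    g_t = lambda x: g(lam_max * (x + 1.0) / 2.0)         # shifted kernel, cf. (10)
    c = cheb_coeffs(g_t, K)
    T_prev, T_curr = np.eye(n), X_t                      # T_0 = I, T_1 = X_t
    out = 0.5 * c[0] * T_prev + c[1] * T_curr
    for k in range(2, K + 1):
        T_prev, T_curr = T_curr, 2.0 * X_t @ T_curr - T_prev   # recurrence, cf. (9)
        out = out + c[k] * T_curr
    return out

# Usage: soft-shrink eigenvalues of a random PSD matrix; compare with exact EVD.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 40))
X = A @ A.T / 40.0
w, U = np.linalg.eigh(X)                                 # exact reference only
tau = 0.2 * w.max()
g = lambda lam: np.maximum(lam - tau, 0.0)
approx = cpa_eig_shrink(X, g, w.max(), K=30)
exact = U @ np.diag(g(w)) @ U.T
rel_err = np.linalg.norm(approx - exact) / np.linalg.norm(exact)
```

In practice the maximum eigenvalue would be estimated cheaply (e.g., by power iteration) rather than via a full eigendecomposition.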
III-B CPA-based Singular Value Shrinkage
Let $\mathbf{Y} \in \mathbb{R}^{M \times N}$ be a rectangular matrix and $\mathbf{Y} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^{\top}$ be its singular value decomposition, where $\mathbf{U}$ and $\mathbf{V}$ are orthogonal matrices. The matrix $\boldsymbol{\Sigma}$ is the singular value matrix represented as

(12) $\boldsymbol{\Sigma} = [\,\mathrm{diag}(\sigma_1, \ldots, \sigma_M)\;\; \mathbf{O}\,]$

Without loss of generality, we can assume $M \leq N$. The singular values of $\mathbf{Y}$ are shrunk with the singular value shrinkage function as

(13) $\mathcal{F}(\mathbf{Y}) = \mathbf{U}\, f(\boldsymbol{\Sigma})\, \mathbf{V}^{\top}$

where $f$ is an arbitrary function applied to the singular values.
The eigenvalue shrinkage in (5) can be extended to the singular value shrinkage in (13) as [35]

(14) $\mathcal{F}(\mathbf{Y}) = \mathcal{G}(\mathbf{Y}\mathbf{Y}^{\top})\,\mathbf{Y}$

where the kernel in (5) is taken as $g(\lambda) = f(\sqrt{\lambda})/\sqrt{\lambda}$. Equation (14) is derived as follows. First, (13) can be expanded as

(15) $\mathcal{F}(\mathbf{Y}) = \mathbf{U}\, f(\boldsymbol{\Sigma})\, \mathbf{V}^{\top} = \mathbf{U}\, f(\boldsymbol{\Sigma})\boldsymbol{\Sigma}^{-1}\, \mathbf{U}^{\top}\mathbf{Y}$

When the eigenvalue matrix of $\mathbf{Y}\mathbf{Y}^{\top}$ is defined as $\boldsymbol{\Lambda}$, it is obviously represented using the singular values of $\mathbf{Y}$ as $\boldsymbol{\Lambda} = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_M^2)$. Consequently, (15) is equally calculated using $\mathcal{G}$ in (5) as

(16) $\mathcal{G}(\mathbf{Y}\mathbf{Y}^{\top})\,\mathbf{Y} = \mathbf{U}\, g(\boldsymbol{\Lambda})\, \mathbf{U}^{\top}\mathbf{Y} = \mathbf{U}\, g(\boldsymbol{\Lambda})\, \boldsymbol{\Sigma}\mathbf{V}^{\top} = \mathcal{F}(\mathbf{Y})$
Note that [35] aims to calculate a vector represented as

(17) $\mathbf{b} = \mathcal{F}(\mathbf{Y})\,\mathbf{a}$

where $\mathbf{a}$ and $\mathbf{b}$ are the input and output vectors, respectively. The CPA is applied to $\mathcal{G}$ in [35] to quickly derive $\mathbf{b}$. In contrast, our method focuses on deriving the matrix in (16) itself. When the matrix whose singular values are shrunk using CPA is represented as $\tilde{\mathcal{G}}(\mathbf{Y}\mathbf{Y}^{\top})\,\mathbf{Y}$, the explicit SVD of $\mathbf{Y}$ can be avoided. However, deriving the matrix, not the vectors, usually requires enormous computation time because of the multiplication of dense matrices. To accelerate the calculation, our method exploits the sparseness of a matrix, as indicated below.
Assume that $\mathbf{Y}$ is a matrix composed of an inherently sparse signal. Let
$\mathbf{W}$ be an arbitrary orthogonal matrix that efficiently sparsifies
$\mathbf{Y}$; e.g., $\mathbf{W}$ can be the discrete Fourier transform
[43], discrete cosine transform (DCT) [44], or discrete wavelet transform (DWT) [45]. Note that the matrix $\mathbf{W}$ is the forward transform, i.e., letting $\mathbf{x}$ be a column vector, the forward transform is represented as $\mathbf{W}\mathbf{x}$. From the above, (14) is further rewritten with $\mathbf{W}$ as

(18) $\mathcal{F}(\mathbf{Y}) = \mathbf{W}^{\top}\,\mathcal{G}(\mathbf{B}\mathbf{B}^{\top})\,\mathbf{B}$
where $\mathbf{B} = \mathbf{W}\mathbf{Y}$ for simplicity. With the CPA-based eigenvalue shrinkage function in (6) and (11), $\mathcal{G}(\mathbf{B}\mathbf{B}^{\top})$ in (18) is efficiently approximated as

(19) $\mathcal{F}(\mathbf{Y}) \approx \mathbf{W}^{\top}\,\tilde{\mathcal{G}}(\mathbf{B}\mathbf{B}^{\top})\,\mathbf{B}$
The form of (19) enables us to use the sparsity of a signal in its frequency domain.
For further enhancing the sparsity of $\mathbf{B}$, its components are thresholded as

(20) $[\hat{\mathbf{B}}]_{i,j} = \begin{cases} [\mathbf{B}]_{i,j}, & |[\mathbf{B}]_{i,j}| \geq \epsilon \\ 0, & \text{otherwise} \end{cases}$

where $[\mathbf{B}]_{i,j}$ and $[\hat{\mathbf{B}}]_{i,j}$ are the $i$th-row, $j$th-column components of $\mathbf{B}$ and of its truncated version, respectively, and $\epsilon$ is an arbitrary small value. We show that this truncation has little effect on the performance of our method and provide recommended settings of $\epsilon$ in Section V. As a result, the CPA-based eigenvalue shrinkage is approximately given by

(21) $\tilde{\mathcal{G}}(\mathbf{B}\mathbf{B}^{\top}) \approx \tilde{\mathcal{G}}(\hat{\mathbf{B}}\hat{\mathbf{B}}^{\top})$
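The truncation step of (20) is simple to state in code. The data, the threshold, and the names below are illustrative assumptions; the point is that zeroing small transform-domain entries increases sparsity while changing the matrix very little.

```python
import numpy as np

# Sketch of the coefficient truncation in Eq. (20): transform-domain entries
# with magnitude below eps are set to zero to increase sparsity.

def truncate(B, eps):
    C = B.copy()
    C[np.abs(C) < eps] = 0.0
    return C

# Usage on synthetic data that mimics many small transform coefficients.
rng = np.random.default_rng(1)
B = rng.standard_normal((64, 64))
small = np.abs(B) < 1.0
B[small] = 0.1 * B[small]          # most entries become small, a few stay large
C = truncate(B, eps=0.2)
nnz_before = np.count_nonzero(B)
nnz_after = np.count_nonzero(C)
rel_change = np.linalg.norm(B - C) / np.linalg.norm(B)
```

Here the number of nonzero elements drops sharply while the relative change of the matrix stays small, which is the trade-off exploited in (21).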
In summary, the singular value shrinkage of $\mathbf{Y}$ is approximately represented as

(22) $\mathcal{F}(\mathbf{Y}) \approx \tilde{\mathcal{F}}(\mathbf{Y}) = \mathbf{W}^{\top}\,\tilde{\mathcal{G}}(\hat{\mathbf{B}}\hat{\mathbf{B}}^{\top})\,\hat{\mathbf{B}}$

where the function $\tilde{\mathcal{F}}$ is the CPA-based singular value shrinkage function. It can be calculated with the recurrence relation as

(23) $T_k(\tilde{\mathbf{Z}}) = 2\tilde{\mathbf{Z}}\,T_{k-1}(\tilde{\mathbf{Z}}) - T_{k-2}(\tilde{\mathbf{Z}})$

where

(24) $\tilde{\mathbf{Z}} = \frac{2\hat{\mathbf{B}}\hat{\mathbf{B}}^{\top} - \lambda_{\max}\mathbf{I}}{\lambda_{\max}}$

in which $\lambda_{\max}$ is the maximum eigenvalue of $\hat{\mathbf{B}}\hat{\mathbf{B}}^{\top}$ (the minimum eigenvalue can be set to zero because $\hat{\mathbf{B}}\hat{\mathbf{B}}^{\top}$ is positive semidefinite). The pseudocode of the CPA-based singular value shrinkage is presented in Algorithm 1.
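Algorithm 1 is not reproduced here, but its core computation, singular value soft-shrinkage via a polynomial of $\mathbf{Y}\mathbf{Y}^{\top}$ (cf. (14)-(16) and (22)), can be sketched as follows. The kernel $h$, the dense call used to obtain $\lambda_{\max}$, and all names are illustrative assumptions; a power iteration would estimate $\lambda_{\max}$ in practice, and the sparsification step of (20) is omitted for clarity.

```python
import numpy as np

# Sketch: F(Y) = h(Y Y^T) Y with h(lam) = max(sqrt(lam) - tau, 0) / sqrt(lam),
# so that the singular values of Y are soft-thresholded without an SVD.

def cheb_coeffs(g, K, n_quad=4000):
    theta = (np.arange(n_quad) + 0.5) * np.pi / n_quad
    gv = g(np.cos(theta))
    return np.array([(2.0 / n_quad) * np.sum(gv * np.cos(k * theta))
                     for k in range(K + 1)])

def cpa_svt(Y, tau, K=40):
    M = Y @ Y.T
    lam_max = np.linalg.eigvalsh(M).max()     # dense for the sketch only
    def h(lam):
        s = np.sqrt(np.maximum(lam, 0.0))
        return np.where(s > tau, (s - tau) / np.maximum(s, 1e-12), 0.0)
    m = M.shape[0]
    M_t = (2.0 * M - lam_max * np.eye(m)) / lam_max        # cf. (24)
    c = cheb_coeffs(lambda x: h(lam_max * (x + 1.0) / 2.0), K)
    T_prev, T_curr = np.eye(m), M_t
    G = 0.5 * c[0] * T_prev + c[1] * T_curr
    for k in range(2, K + 1):
        T_prev, T_curr = T_curr, 2.0 * M_t @ T_curr - T_prev   # cf. (23)
        G = G + c[k] * T_curr
    return G @ Y                               # approximates U soft_tau(Sigma) V^T

# Usage: compare against exact SVD-based soft-shrinkage.
rng = np.random.default_rng(2)
Y = rng.standard_normal((20, 30)) / np.sqrt(30.0)
tau = 0.5
approx = cpa_svt(Y, tau, K=40)
U, s, Vt = np.linalg.svd(Y, full_matrices=False)
exact = U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt
rel_err = np.linalg.norm(approx - exact) / np.linalg.norm(exact)
```

The only operations inside the loop are matrix-matrix products, which is exactly where the sparsity of the transformed signal pays off.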
III-C Computational Complexity of CPA-based Singular Value Shrinkage
We now discuss the computational complexity of our method. Assume that the sparse matrices appearing in the recurrence have known numbers of nonzero elements. For sparse matrices, the number of scalar multiplications required for a matrix product is determined by the overlap pattern of the nonzero elements and is far smaller than for dense matrices. The computational complexity of line 9 in Algorithm 1 is governed by the matrix multiplication in the recurrence, and that of line 10 by the multiplication with the accumulated sum. That is, the total computational complexity is proportional to the approximation order times the cost of one sparse matrix multiplication, evaluated at the maximum number of nonzero elements over the iterations. Consequently, when the number of nonzero elements becomes small, the computational cost is also reduced.
Fig. 2: (a) A hard shrinkage response. (b) Its CPA-approximated response.
For low computational complexity, the matrix $\mathbf{W}$ should be constructed so that the number of nonzero elements grows as little as possible through the matrix multiplications.
IV Shrinkage Functions and Approximation Order
In this section, we discuss suitable approximation orders for shrinkage functions approximated by CPA so that truncation errors remain small. Additionally, we argue that CPA is a reasonable choice for our method among a variety of polynomial approximation methods.
As an introduction, we consider the shrinkage function shown in Fig. 2(a). Let $h(x)$ be the hard shrinkage response defined as

(25) $h(x) = \begin{cases} x, & |x| \geq \tau \\ 0, & \text{otherwise} \end{cases}$

where $\tau$ is an arbitrary real value and $x \in [-1, 1]$. Chebyshev polynomial approximation gives an approximate response of $h(x)$ via (1). As shown in Fig. 2(b), the approximated response has ripples, a phenomenon widely known in digital filter design [46, 47, 32, 48, 49]. Therefore, studying the design of appropriate shrinkage responses and approximation orders is an important topic for our method.
IV-A Approximation Order
Possible shrinkage responses handled with our method can be expressed as the following generic form:
(26) 
where $w$ is a weight function, and $\tau_1$ and $\tau_2$ are arbitrary thresholding values. The choices of $w$, $\tau_1$, and $\tau_2$ determine the characteristics of (26) as follows:

- Hard-shrinkage.

- Weighted soft-shrinkage.

- Soft-shrinkage.¹

¹Soft-shrinkage is widely known as $\mathrm{sgn}(x)\max(|x| - \tau, 0)$, in which $\max(a, b)$ is an operator choosing the greater of $a$ and $b$. However, we call this case soft-shrinkage because it is finally transformed into the form in (22).
Note that we defined $g(\lambda) = f(\sqrt{\lambda})/\sqrt{\lambda}$ in (14), where $f$ is an arbitrary shrinkage function applied through the eigenvalues of $\mathbf{Y}\mathbf{Y}^{\top}$.
Fig. 3: Shrinkage responses and their differences from the exact responses. (a) Hard-shrinkage. (b) Weighted soft-shrinkage. (c) Soft-shrinkage (difference shown for (c)). (d) Hard-shrinkage. (e) Weighted soft-shrinkage. (f) Soft-shrinkage (difference shown for (f)).
As a result, the shrinkage in (26) becomes the hard-shrinkage, weighted soft-shrinkage, or soft-shrinkage function when its parameters are set as above.
These choices among the shrinkages are shown in Fig. 3. It is clear that hard-shrinkage has a sharp transition band (see Fig. 3(a)); therefore, CPA, which is computed as a linear combination of cosine functions, may not approximate it well. In contrast, one can expect that a response with a smooth transition band is suitable for CPA. To verify this numerically, the approximated responses were compared among hard-shrinkage, weighted soft-shrinkage, and soft-shrinkage, with the weight for weighted soft-shrinkage and the thresholding value for hard-shrinkage fixed in advance.
Figure 3 also shows the approximated shrinkage responses of (26) for various shrinkage conditions. Hard-shrinkage yields larger errors than the soft ones. Empirically, hard-shrinkage requires more than a 50th-order approximation, whereas the soft-shrinkages only require 10-20th-order approximations. To be more specific, a higher order within this range is recommended for the small weight shown in Fig. 3(b), whereas a lower order suffices for the soft-shrinkage response shown in Fig. 3(d).
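The gap between the hard and soft responses under CPA can be checked numerically. The thresholds, the grid, and the order below are illustrative assumptions; the discontinuous hard response exhibits persistent Gibbs-type errors while the continuous soft response does not.

```python
import numpy as np

# Compare CPA truncation errors for a discontinuous (hard) and a
# continuous (soft) shrinkage response at the same order K.

def cheb_coeffs(f, K, n_quad=4000):
    theta = (np.arange(n_quad) + 0.5) * np.pi / n_quad
    fv = f(np.cos(theta))
    return np.array([(2.0 / n_quad) * np.sum(fv * np.cos(k * theta))
                     for k in range(K + 1)])

def cheb_eval(c, x):
    T_prev, T_curr = np.ones_like(x), x
    out = 0.5 * c[0] * T_prev + c[1] * T_curr
    for k in range(2, len(c)):
        T_prev, T_curr = T_curr, 2.0 * x * T_curr - T_prev
        out = out + c[k] * T_curr
    return out

hard = lambda x: np.where(np.abs(x) >= 0.5, x, 0.0)             # jumps at +-0.5
soft = lambda x: np.sign(x) * np.maximum(np.abs(x) - 0.5, 0.0)  # continuous
x = np.linspace(-1.0, 1.0, 401)
K = 20
err_hard = np.max(np.abs(cheb_eval(cheb_coeffs(hard, K), x) - hard(x)))
err_soft = np.max(np.abs(cheb_eval(cheb_coeffs(soft, K), x) - soft(x)))
```

At the same order, the maximum error of the hard response is far larger than that of the soft response, matching the empirical 50th-order versus 10-20th-order observation above.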
IV-B Suitability of CPA
There are many polynomial approximation methods. Among them, minimax polynomial approximation [50, 46, 47, 51, 52] and least squares approximation [53] are well known as the best approximations in the sense of minimizing the infinity norm and the least squares error, respectively, w.r.t. the difference between the exact and approximated responses. To derive the polynomial coefficients, their optimization requires a norm minimization represented as

(27) $\min_{\mathbf{a}} \; \|\mathbf{f} - \mathbf{V}\mathbf{a}\|$

where $\mathbf{V}\mathbf{a}$ is the approximated shrinkage response (the norm is the infinity norm for the minimax approximation and the $\ell_2$ norm for the least squares approximation). Clearly, (27) requires the exact response. When the exact response is precisely represented using many sampling points, the approximation exhibits good performance. However, the computational complexity becomes high when many sampling points are used, especially in the case of least squares approximation. Let $\mathbf{a}$ be the column vector of the $K + 1$ polynomial coefficients, so that the approximated shrinkage response is calculated as $\mathbf{V}\mathbf{a}$. Additionally, let $\mathbf{x}$ and $\mathbf{f}$ be the column vectors composed of the $P$ sampling points and the corresponding exact response values, respectively. The Vandermonde matrix is defined as

(28) $\mathbf{V} = \begin{bmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^K \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_P & x_P^2 & \cdots & x_P^K \end{bmatrix}$

From the above definitions, the coefficients of the least squares approximation are calculated as $\mathbf{a} = \mathbf{V}^{+}\mathbf{f}$, where $\mathbf{V}^{+}$ is the pseudo-inverse of a matrix. This calculation requires high computational cost when $P$ and/or $K$ are large. In contrast, CPA only requires the inner products in (4) to derive the polynomial coefficients. Additionally, CPA attains a better approximation for our purpose than the other methods. To verify this, CPA was compared with the minimax approximation and the least squares approximation, as shown in Fig. 4. As can be seen, CPA and the least squares approximation have a similar oscillation pattern, and CPA sufficiently attenuates the ripples in the stopband compared with the other methods, as shown in the differences in Fig. 4.
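For reference, the least squares route through (27)-(28) can be sketched as follows; the sampling grid, the order, and the kernel are illustrative assumptions. The point of comparison is that this path solves a $P \times (K+1)$ system, whereas CPA needs only the inner products in (4).

```python
import numpy as np

# Sketch of Eqs. (27)-(28): least squares shrinkage coefficients from a
# Vandermonde system.
soft = lambda x: np.sign(x) * np.maximum(np.abs(x) - 0.3, 0.0)
x = np.linspace(-1.0, 1.0, 400)           # P = 400 sampling points
K = 15                                    # polynomial order
V = np.vander(x, K + 1, increasing=True)  # P x (K+1) Vandermonde matrix, Eq. (28)
a, *_ = np.linalg.lstsq(V, soft(x), rcond=None)   # a = V^+ f via pseudo-inverse
err_ls = np.max(np.abs(V @ a - soft(x)))
```

The monomial basis also becomes increasingly ill-conditioned as K grows, which is another practical argument for the Chebyshev basis.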
V Applications
We compared our CPA-based singular value shrinkage method with exact and approximate singular value shrinkage methods. Specifically, we applied our method to two applications using nuclear norm relaxation, i.e., inpainting of texture images and background subtraction of videos. Additionally, we compared our CPA-based method with the existing methods, i.e., the exact partial singular value decomposition (PSVD) based method and the fast singular value shrinkage methods [25, 26, 27], in Section V-F. The computation time and approximation precision are reported for the comparisons.
V-A Experimental Conditions
The applications were implemented in MATLAB R2015b and run on a 3.2-GHz Intel Xeon E5-2667 processor with 512 GB RAM. We compared our method with the SVD-based naive method (denoted as the SVD-based method) in (13) and the EVD-based method in (15) with respect to approximation precision and computation time. Both the SVD- and EVD-based methods are exact singular value soft-shrinkage methods. The EVD-based method (which could lose some computational precision compared with the SVD-based one; though such errors may affect application performance, we did not encounter this problem in the experiments described in this paper) is usually faster than the SVD-based method and is widely used in many applications. Therefore, the computation time of only the EVD-based method is indicated for the results of the exact methods. With the SVD-based method, the SVD of a matrix is first performed, and then the obtained singular values are soft-thresholded. The EVD-based method uses the relation between singular value shrinkage and eigenvalue shrinkage: the EVD of $\mathbf{Y}\mathbf{Y}^{\top}$ is first computed, and then the obtained eigenvalues are shrunk to derive the singular value shrinkage. Also, for our method, the shrinkage function in (14) is defined as the soft-shrinkage case of (26). The DWT [45] was used in (22) to sparsify the signals. We used the Haar wavelet transform as the DWT; a one-level transform was performed, and all high-frequency components were set to zero. The selection of a transform method naturally affects the computation time of our method; we examine this effect in Section V-E. To indicate the approximation precision, the root mean squared error (RMSE) was used, computed between the results of our method and those of the SVD/EVD-based methods.
Furthermore, the computation times of all the methods are shown, and the average computation times of the CPA-based/exact singular value shrinkage in each iteration are also indicated. In all applications, we used the 5th-, 10th-, 15th-, and 20th-order approximations. We also used the following optimization tools to solve the above applications.
V-B Optimization Tools

V-B1 Proximity Operator
Let $\Gamma_0(\mathbb{R}^N)$ be the set of all proper lower semicontinuous convex functions over $\mathbb{R}^N$ (a function is proper if its domain is nonempty, lower semicontinuous if its epigraph is closed, and convex if its epigraph is convex). The proximity operator [54] of a function $f \in \Gamma_0(\mathbb{R}^N)$ of index $\gamma > 0$ is defined as

(29) $\mathrm{prox}_{\gamma f}(\mathbf{x}) = \operatorname*{arg\,min}_{\mathbf{y}} \; f(\mathbf{y}) + \frac{1}{2\gamma}\|\mathbf{x} - \mathbf{y}\|_2^2$
The proximity operator plays a central role in the optimization of the applications discussed in this section. When the function is defined as the nuclear norm, i.e., $f = \|\cdot\|_*$, the proximity operator can be calculated by singular value soft-shrinkage with the thresholding parameter $\gamma$ [2]. Therefore, our CPA-based method is applied to this operator in the case of the nuclear norm.
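The nuclear norm case of (29) reduces to exact singular value soft-shrinkage, a standard fact that can be stated compactly; the sketch below is the exact reference operator, not the paper's accelerated method, and the names are illustrative.

```python
import numpy as np

# prox of gamma * ||.||_* : soft-threshold the singular values by gamma.

def prox_nuclear(Y, gamma):
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ np.diag(np.maximum(s - gamma, 0.0)) @ Vt

# Usage: shrinking never increases the rank or the singular values.
rng = np.random.default_rng(3)
Y = rng.standard_normal((6, 5))
Z = prox_nuclear(Y, 1.0)
Z_all = prox_nuclear(Y, np.linalg.svd(Y, compute_uv=False)[0] + 1.0)  # shrinks all
```

It is exactly this SVD inside the proximity operator that the CPA-based shrinkage of Section III replaces.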
V-B2 Alternating Direction Method of Multipliers
The ADMM [41] is an algorithm for solving a convex optimization problem represented as

(30) $\min_{\mathbf{x}} \; f_1(\mathbf{x}) + f_2(\mathbf{z}) \quad \text{s.t.} \quad \mathbf{z} = \mathbf{G}\mathbf{x}$

where $f_1, f_2 \in \Gamma_0$ and $\mathbf{G}$ is a matrix with full column rank. For arbitrary initial values $\mathbf{z}^{(0)}$ and $\mathbf{u}^{(0)}$ and a step size $\rho > 0$, the ADMM algorithm is given by

(31) $\begin{cases} \mathbf{x}^{(i+1)} = \operatorname*{arg\,min}_{\mathbf{x}} \; f_1(\mathbf{x}) + \frac{\rho}{2}\|\mathbf{G}\mathbf{x} - \mathbf{z}^{(i)} + \mathbf{u}^{(i)}\|_2^2 \\ \mathbf{z}^{(i+1)} = \mathrm{prox}_{f_2/\rho}(\mathbf{G}\mathbf{x}^{(i+1)} + \mathbf{u}^{(i)}) \\ \mathbf{u}^{(i+1)} = \mathbf{u}^{(i)} + \mathbf{G}\mathbf{x}^{(i+1)} - \mathbf{z}^{(i+1)} \end{cases}$
We recall a convergence analysis of the ADMM by Eckstein and Bertsekas [41].
Fact 1 (Convergence of the ADMM [41])
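A minimal ADMM instance may make the iteration concrete. The tiny problem below (a quadratic data term, an $\ell_1$ term, identity coupling) is an illustrative special case, not the paper's formulation; all values are assumptions for demonstration.

```python
import numpy as np

# ADMM for min_x f(x) + g(z) s.t. x = z, with f(x) = 0.5||x - b||^2 and
# g = lam * ||.||_1. The minimizer is the soft-threshold of b.

def soft(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

b = np.array([3.0, -0.2, 1.5, 0.05])
lam, rho = 1.0, 1.0
x = np.zeros_like(b)
z = np.zeros_like(b)
u = np.zeros_like(b)                       # scaled dual variable
for _ in range(100):
    x = (b + rho * (z - u)) / (1.0 + rho)  # prox step for f
    z = soft(x + u, lam / rho)             # prox step for g (soft-threshold)
    u = u + x - z                          # dual update
# z converges to soft(b, lam) = [2.0, 0.0, 0.5, 0.0]
```

In the applications below, the $\ell_1$ prox is replaced by the nuclear norm prox of (29), which is where the CPA-based shrinkage enters each iteration.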
V-C Texture Image Inpainting [12, 13]
The objective of this application is to recover a missing region (as shown later in Fig. 6(b)).
Let $\mathbf{X}$ and $\mathbf{Y}$ be a texture image and a given image with missing regions, respectively. Then, let $\Omega$ and $\bar{\Omega}$ be the observed and missing regions, and let $\mathbf{P}_{\Omega}$ and $\mathbf{P}_{\bar{\Omega}}$ be linear operators extracting the pixels in those regions. With these notations, the missing region is represented as $\mathbf{P}_{\bar{\Omega}}\mathbf{Y}$. The pixels surrounding the missing region within a width of five pixels, as shown in Fig. 5, are used as a reference region. Let $\mathbf{C}_v$ and $\mathbf{C}_h$ be the DCT matrices in the vertical and horizontal matrix directions, i.e., these matrices transform an image into its frequency domain. Since a regular texture image is basically sparse in its frequency domain, it can be represented as $\mathbf{X} = \mathbf{C}_v^{\top}\mathbf{A}\mathbf{C}_h$, where $\mathbf{A}$ contains the coefficients in the frequency domain of $\mathbf{X}$. Additionally, a normalized dynamic range constraint restricts the pixel values to $[0, 1]$. When $\mathbf{X}$ and $\mathbf{A}$ are assumed to be low rank and sparse, respectively, the reconstruction problem can approximately be solved using the nuclear norm (defined as the sum of the singular values, $\|\mathbf{X}\|_* = \sum_i \sigma_i(\mathbf{X})$) and the $\ell_1$ norm as
Table I: Computation times and RMSEs for texture image inpainting.

Bricks
Approximation order  5  10  15  20  EVD-based method
Total computation time (s)  401.85  373.91  347.53  348.38  464.81
Average computation time of singular value soft-shrinkage (s)  1.35  1.64  1.81  2.00  2.48
RMSE between our method and SVD-based method (x10^-3)  9.92  9.83  9.77  9.73  -

Office windows
Approximation order  5  10  15  20  EVD-based method
Total computation time (s)  418.50  384.95  366.74  369.93  464.70
Average computation time of singular value soft-shrinkage (s)  1.39  1.59  1.85  1.87  2.48
RMSE between our method and SVD-based method (x10^-3)  8.73  8.75  8.75  8.72  -
(32)  
where a positive real value $\lambda$ is a regularization parameter, $\mathrm{avg}(\cdot)$ calculates the arithmetic average, and $\mathrm{vec}(\cdot)$ is the operator vectorizing a matrix. The average pixel value in the recovered region is assumed to be identical to that of its surrounding pixel values. As can be seen, (32) contains nuclear norm relaxation, so our method can be used for its efficient calculation. Hereafter, we validate our method by applying the ADMM to (32) to obtain the optimal solution. In Appendix A-A, (32) is converted into a form to which the ADMM is applicable.
Eight-bit color images Bricks and Office windows (available at http://www.mayang.com/textures/), as shown in Figs. 6(a) and (f), were used for the application, where the size of each color component was 1920 x 2560 (in the experiment, the images were transposed to "portrait"). The pixel values of each color component were in the range from 0 to 1, and each color component was inpainted separately. The observed images with missing regions are shown in Figs. 6(b) and (g). In the ADMM applicable form (see Appendix A-A), the initial vectors were set to all-one vectors. Additionally, the thresholding parameters were chosen separately for Bricks and Office windows so as to achieve fast and stable convergence of the optimization. For fast computation, the color components were processed in parallel (using the MATLAB function parfor from the Parallel Computing Toolbox, only in the image inpainting method).
Figure 6 shows the results of image inpainting with the 20th-order approximation. The image recovered using our method was practically equivalent to that of the SVD-based method, as can be seen by comparing Figs. 6(c), (d), (h), and (i). Figures 6(e) and (j) show that the exact and approximated solutions had little difference, which visually indicates the high approximation precision of our method.
Table I lists the computation time and RMSE comparisons. Our method was faster than the EVD-based method while maintaining reconstruction performance. Regarding the total computation times for Bricks and Office windows, our method with the 5th- and 10th-order approximations was slower than with the 15th-order approximation. This is because the low-order approximations did not converge well due to their low approximation precision.
V-D Background Modeling of Video [14, 15, 16, 17, 18]

The objective of this application is to divide a video sequence into background and object sequences (as shown in Figs. 7(a) and (b)).
Let $\mathbf{F}_t$ be the vectorized $t$th frame of a video sequence. The sequence is rearranged into a matrix as

(33) $\mathbf{Y} = [\mathbf{F}_1 \;\; \mathbf{F}_2 \;\; \cdots \;\; \mathbf{F}_T]$

Then, let $\mathbf{L}$ and $\mathbf{S}$ be the background sequence and the sequence of moving objects of the video. In $\mathbf{S}$, the pixel values corresponding to the background are zero, and vice versa. The background and the moving objects can be assumed to be low rank and sparse, respectively; hence, background modeling is solved as the following convex optimization problem:

(34) $\min_{\mathbf{L}, \mathbf{S}} \; \|\mathbf{L}\|_* + \lambda\|\mathbf{S}\|_1 \quad \text{s.t.} \quad \mathbf{Y} = \mathbf{L} + \mathbf{S}$
Problem (34) is also solved using the ADMM.
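A self-contained sketch of this low rank + sparse splitting, assuming the standard robust PCA form for (34) and solved by ADMM with exact SVT in place of the CPA-based shrinkage, is given below; the synthetic sizes, $\lambda$, and $\rho$ are illustrative assumptions.

```python
import numpy as np

# Low rank + sparse decomposition: min ||L||_* + lam ||S||_1 s.t. Y = L + S.
# Exact SVT is used here; the paper's method would replace svt with the
# CPA-based shrinkage of Section III.

def svt(Y, t):
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ np.diag(np.maximum(s - t, 0.0)) @ Vt

def soft(V, t):
    return np.sign(V) * np.maximum(np.abs(V) - t, 0.0)

rng = np.random.default_rng(0)
L_true = rng.standard_normal((30, 1)) @ rng.standard_normal((1, 40))  # rank-1 background
S_true = np.zeros((30, 40))
S_true[rng.random((30, 40)) < 0.05] = 5.0                             # sparse foreground
Y = L_true + S_true

lam, rho = 1.0 / np.sqrt(40.0), 1.0
L = np.zeros_like(Y)
S = np.zeros_like(Y)
U_d = np.zeros_like(Y)                     # scaled dual variable
for _ in range(300):
    L = svt(Y - S + U_d, 1.0 / rho)        # nuclear norm prox
    S = soft(Y - L + U_d, lam / rho)       # l1 prox
    U_d = U_d + Y - L - S                  # dual update
rel = np.linalg.norm(L - L_true) / np.linalg.norm(L_true)
```

In a video application, each column of Y is a vectorized frame, L collects the static background, and S the moving objects, as described above.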
In the background modeling, an eight-bit grayscale video Laboratory (recorded with our video camera, then downsampled and converted to grayscale for the experiment) was used. Each frame of the video was vectorized and stacked to form the matrix of the sequence as in (33). The pixel values of the video were in the range from 0 to 1. In the ADMM applicable form (see Appendix A-B), the initial vectors were set to all-one vectors.
Fig. 7: Background modeling results: the original frame; (a) low rank and (b) sparse components of the CPA-based method; (c) low rank and (d) sparse components of the EVD-based method; and the difference between (a) and (c).
Table II: Computation times and RMSEs for background modeling.

Approximation order  5  10  15  20  EVD-based method
Total computation time (s)  1200.21  1127.60  1150.27  2192.24  6457.53
Average computation time of singular value soft-shrinkage (s)  20.95  21.21  21.79  22.20  30.77
RMSE between our method and EVD-based method (x10^-3)  56.27  23.89  10.86  3.71  -
Additionally, the thresholding parameters were chosen so as to achieve fast and stable convergence of the optimization. For comparison, the low rank and sparse components in the 160th frame are shown in Fig. 7 with the 20th-order approximation.
Our method effectively decomposed the video sequence into low rank and sparse sequences, as shown in Figs. 7(a) and (b). They are almost equivalent to those of the EVD-based method; hence, the difference between the low rank images is not visible even when amplified (bottom left of Fig. 7).
Table II summarizes the computation times and the RMSEs between the background modeling of our method and that of the EVD-based method (the SVD-based method ran out of memory on our machine, so the results of the EVD-based method were used for the RMSEs). Our method was sufficiently faster than the EVD-based method at all approximation orders.
Table III: Performance comparisons of transform selections (proposed method; see Section V-E) and of existing methods (see Section V-F).

Proposed method
Used transformation method  DWT  Block DCT  DCT
Total computation time (s)  347.53  344.03  380.23
Average computation time of singular value soft-shrinkage (s)  1.81  1.06  1.33
RMSE between our method and SVD-based method (x10^-3)  9.77  3.82  3.81

Existing methods
Used algorithm  Exact PSVD  FRSVS [25]  NSVS [26]  FSVS [27]
Total computation time (s)  1935.73  334.59  336.22  1926.12
Average computation time of singular value soft-shrinkage (s)  6.70  1.63  1.67  21.26
RMSE between existing methods and SVD-based method (x10^-3)  12.17  13.94  117.61  26.40
Table IV: Effects of the thresholding value (DCT used as the sparsifying transform).

Methods  DCT with adaptive threshold  DCT with fixed thresholds
Threshold used  defined in (35)  four fixed values (the last retains all components)
Total computation time (s)  358.16  Not converged  416.52  380.23  765.84
Average computation time of singular value soft-shrinkage (s)  1.26  1.25  1.26  1.33  5.92
RMSE between our method and SVD-based method (x10^-3)  3.81  -  3.82  3.81  3.81
This is because the reduction in computational complexity by thresholding the transformed coefficients, described in (21), enables our method to achieve low computational complexity while retaining high approximation precision. However, our method with the 5th-order approximation took more time than with the 10th-order approximation because it did not converge well due to its low approximation precision.
V-E Effects of Selections: Transform Matrix and Thresholding Value

The selections of the transform matrix in (22) and the thresholding value in (20) affect the computation time and the size of the approximation error. In this subsection, we show these effects experimentally by using the image inpainting method for Bricks. In all experiments, the 15th-order approximation was used for our method.
V-E1 Effects of Selected Transform Matrix

We compared the DWT with the DCT and the block diagonal form of the DCT (block DCT) [32] with a fixed block size to show the differences among the transform matrices chosen in (22). The threshold values in (20) were fixed for the DCTs. The other experimental conditions were the same as those discussed in Section V-C.
The results of the proposed method in Table III show the performance comparisons. Our method with the DWT is as fast as that with the block DCT in total computation time, though the singular value shrinkage with the DWT takes more time than the others. This is because the maximum number of iterations of our method with the DWT is only 83, whereas those with the block DCT and DCT are 101 and 97, respectively. Therefore, the DWT leads our method to stable convergence. However, our method with the DWT shows a higher RMSE than the others since all the high-frequency components were removed. This is not a fatal problem because the images derived using our method with the DWT were very similar to those of the exact method, as shown in the previous section. Additionally, the block DCT can sparsify an image substantially more than the DWT and DCT. Therefore, our method with the block DCT is faster than the others. In spite of the fast computation, the results with the block DCT show an RMSE as low as that of the DCT.
V-E2 Effects of Thresholding Value

When $\epsilon$ is excessively large, our method becomes fast, but the matrix rank minimization cannot converge well due to the errors caused by discarding too many components. To achieve both fast computation and stable convergence, we present a recommended guideline for the thresholding value.
Let $e^{(i)}$ be the error at the $i$th iteration of the ADMM (defined from the formulation in Appendix A-A), and let the threshold be chosen as the element of $\mathbf{B}$ whose magnitude ranks at a given percentage of all elements. Basically, when $e^{(i)}$ is large, a small threshold causes no problem in decreasing the error. In contrast, when $e^{(i)}$ is small, i.e., the optimization has almost converged, the threshold should be large for stable convergence. For this purpose, we recommend setting the thresholding percentage adaptively as
(35) 
[Figure: panels (a), (b), and (c); plots not recovered]
[Table: error vs. reduction rate (1, 10, 20) and rank (10, 100, 200, 500); cell values not recovered]
where the constants were determined experimentally. Recall the size of the target matrix. To verify (35), the adaptive rule was compared with fixed thresholds. Four fixed values were used for this comparison, one of which retains all components of a matrix. Additionally, the DCT was used as the sparsifying matrix.
The results are listed in Table IV. The adaptive method was faster than the fixed ones. Our method with the smallest fixed value was the fastest, but it did not converge well because of the large approximation errors; this value can be regarded as the limit of the thresholding percentage in our method. In addition, the singular value shrinkage of our CPA-based method with thresholding of the matrix components was about five times faster than that retaining all of them.
V-F Comparison with Existing Methods
As previously mentioned, there are several fast singular value shrinkage methods [25, 26, 27]. To illustrate the advantage of our method, we compared it with singular value shrinkage using the exact PSVD, fast randomized singular value shrinkage (FRSVS) [25], singular value shrinkage using the Nyström method (NSVS) [26], and fast singular value shrinkage without the exact SVD (FSVS) [27]. The experiments were conducted using the image inpainting method for Bricks and synthetic data. The synthetic data are constructed as a block diagonal matrix whose number of main diagonal blocks equals its rank. Ranks of 10, 100, 200, and 500 were used, and a fixed percentage of the elements was randomly replaced with zero. To restore the corrupted data, the optimization problem in (32) was solved without the average term, because the corresponding matrices can hardly be defined for random corruption. The function svdsecon (available at https://www.mathworks.com/matlabcentral/fileexchange/47132fastsvdandpca) was used to carry out the exact PSVD, and the 500 largest singular values were calculated for Bricks. All parameters of the FRSVS [25] were set as in the original code (available at http://thohkaistackr.wixsite.com/page/projectfrsvt) provided by the authors. In the FRSVS, the 550 largest singular values were approximately derived for Bricks; this number was determined experimentally to achieve precise and fast singular value shrinkage. In addition, the 200 largest singular values were derived for the synthetic data in the exact PSVD and the FRSVS. The partial singular values derived in the exact PSVD and the FRSVS were soft-thresholded, i.e., each partial singular value was shrunk by subtracting the threshold and clipping at zero.
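The PSVD-based baseline above can be sketched as follows; `np.linalg.svd` stands in for a truncated SVD solver, and the threshold `tau` is a generic parameter of our sketch.

```python
import numpy as np

def partial_svd_shrink(X, k, tau):
    """Soft-threshold the k leading singular values and drop the rest.

    Each retained singular value sigma_i becomes max(sigma_i - tau, 0);
    all remaining singular values are implicitly zeroed, which is the
    source of the approximation error discussed in the text.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_shr = np.maximum(s[:k] - tau, 0.0)
    # broadcast the shrunk values over the leading singular vectors
    return (U[:, :k] * s_shr) @ Vt[:k]
```

When the true rank exceeds `k`, singular values above the threshold are discarded, which is exactly the failure mode reported for the partial methods below.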
Since the Nyström method requires a square matrix to derive partial eigenvalues, it was applied to the same square matrix as in the EVD-based method. The 500 and 200 largest eigenvalues were calculated for Bricks and the synthetic data, respectively. These eigenvalues were then shrunk in the same way as in the EVD-based method, and the number of calculated eigenvalues was determined experimentally for the same reason as in the FRSVS. All parameters of the FSVS were set as suggested in [27].
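An eigenvalue-based route can be sketched as follows; the Gram-matrix formulation and the reconstruction formula are our assumptions for illustration, with a full eigendecomposition standing in for a Nyström approximation of the leading eigenpairs.

```python
import numpy as np

def evd_shrink(X, tau):
    """Singular value shrinkage via the eigendecomposition of X @ X.T.

    The eigenvalues of the Gram matrix are the squared singular values
    of X, so sigma_i = sqrt(lambda_i); each is soft-thresholded and the
    result is mapped back through the eigenvectors:
    U diag(shr(s)/s) U.T X = U diag(shr(s)) V.T.
    """
    lam, U = np.linalg.eigh(X @ X.T)      # eigenvalues in ascending order
    s = np.sqrt(np.maximum(lam, 0.0))     # singular values of X
    shr = np.maximum(s - tau, 0.0)
    # guard against division by zero for null singular values
    scale = np.where(s > 0, shr / np.where(s > 0, s, 1.0), 0.0)
    return U @ (scale[:, None] * (U.T @ X))
```

This avoids computing the right singular vectors explicitly, but a truncated eigensolver still zeroes every eigenvalue beyond the computed ones, with the same error mechanism as the partial SVD.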
The results for Bricks are shown in Table III and Fig. 8. Note that the experiments with the existing methods were stopped at the 80th iteration since these methods did not converge. The concept of the FSVS is similar to ours, but it requires a longer computation time, as shown in Table III, and does not converge. Although the average computation times of the FRSVS and NSVS are slightly shorter than ours, they result in much larger errors. This would be because many singular values or eigenvalues above the threshold were reduced to zero, so the exact PSVD, FRSVS, and NSVS produced large errors in each iteration, leading to unstable convergence. In contrast, our method is stable and does not affect the convergence of the optimization method because the CPA-based method can shrink all the singular values.
Figure 9 compares the errors for the synthetic data under several conditions according to the reduction rate of data elements and the matrix rank. The optimization methods using the existing shrinkage methods do not converge well when the matrix rank is 500. This is because many singular values above the threshold are discarded, i.e., enormous computation errors are produced in each iteration. These results suggest that the matrix rank of the target data should be estimated beforehand, and the numbers of partial singular values and vectors should then be set larger than that rank, in order to make the optimization method converge. In contrast, our method leads the optimization method to stable convergence. It certainly generates some approximation errors, but it can process all singular values, which means that most singular values above the threshold are retained.

VI Conclusion
We proposed a fast thresholding method for singular values that computes neither singular values nor singular vectors. The key tool of the proposed method is CPA. Owing to the characteristics of CPA, singular value shrinkage can be computed through matrix multiplications. The proposed method was further accelerated by exploiting the sparsity of a signal, where a frequency transform was used to obtain sparse coefficients. Moreover, we studied the approximation order for reducing the approximation errors. The experimental results revealed that our method was much faster than the exact methods while maintaining high approximation precision for large data sizes. In addition, our method leads the optimization method to stable convergence, in contrast to the existing fast singular value shrinkage methods, because of its approximation precision.
Appendix A ADMM Applicable Forms
A-A Texture Image Inpainting
Let the relevant variables be defined first. The 2D DCT matrix and the matrix forms of the corresponding operators are represented accordingly. In addition, the indicator functions of the sets