I Introduction
In the era of big data, the low-rank matrix has become a useful and popular tool to express two-dimensional information. One well-known example is the rating matrix in recommendation systems, which represents users’ tastes on products [1]. Since users expressing similar ratings on multiple products tend to have the same interest in a new product, columns associated with users sharing the same interest are highly likely to be the same, resulting in the low-rank structure of the rating matrix (see Fig. 1). Another example is the Euclidean distance matrix formed by the pairwise distances of a large number of sensor nodes. Since the rank of a Euclidean distance matrix in the k-dimensional Euclidean space is at most k + 2 (if k = 2, then the rank is at most 4), this matrix can be readily modeled as a low-rank matrix [2, 3, 4].
One major benefit of the low-rank matrix is that the essential information in the matrix, expressed in terms of degrees of freedom, is much smaller than the total number of entries. Therefore, even when the number of observed entries is small, we still have a good chance of recovering the whole matrix. There are a variety of scenarios where only a tiny fraction of the entries of a matrix is observed. In recommendation systems, for example, users are asked to submit feedback in the form of a rating number, e.g., 1 to 5, for a purchased product. However, users often do not want to leave feedback, and thus the rating matrix has many missing entries. Also, in an internet of things (IoT) network, sensor nodes are limited in radio communication range or suffer from power outages, so that only a small portion of the entries of the Euclidean distance matrix is available.
When there is no restriction on the rank of a matrix, the problem of recovering the unknown entries of the matrix from partially observed entries is ill-posed. This is because any value can be assigned to an unknown entry, which in turn means that there is an infinite number of matrices agreeing with the observed entries. As a simple example, consider the following matrix with one unknown entry marked ?
(1) 
If M is full rank, i.e., the rank of M is two, then any value except 10 can be assigned to ?. Whereas, if M is a low-rank matrix (the rank is one in this trivial example), the two columns differ only by a constant factor, and hence the unknown element ? can be easily determined using the linear relationship between the two columns (? = 10). This example is obviously simple, but the fundamental principle used to recover a large-dimensional matrix is not much different, and the low-rank constraint plays a central role in recovering the unknown entries of the matrix.
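Since the entries of (1) are not reproduced here, the sketch below uses a hypothetical 2 × 2 rank-one matrix whose entries are chosen only to be consistent with the completion ? = 10 described above:

```python
import numpy as np

# Hypothetical stand-in for the 2x2 example in (1); the entries are
# illustrative, chosen so that the rank-one completion yields ? = 10.
M = np.array([[1.0, 5.0],
              [2.0, np.nan]])  # nan marks the unknown entry "?"

# Under the rank-one assumption, the second column is a scalar multiple of
# the first, so the ratio of the known entries in the first row fixes "?".
ratio = M[0, 1] / M[0, 0]
unknown = M[1, 0] * ratio
print(unknown)  # 10.0
```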
Before we proceed, we discuss a few notable applications where the underlying matrix is modeled as a low-rank matrix.

Recommendation system: In 2006, the online DVD rental company Netflix announced a contest to improve the quality of the company’s movie recommendation system. The company released a training set of about half a million customers containing ratings on more than ten thousand movies, each movie being rated on a scale from 1 to 5 [1]. The training data can be represented as a large-dimensional matrix in which each column represents the ratings of a customer for the movies. The primary goal of the recommendation system is to estimate the users’ interests in products using the sparsely sampled rating matrix (the Netflix dataset consists of ratings of more than 17,000 movies by more than 2.5 million users, yet the number of known entries is only about 1% [1]; customers do not necessarily rate all of the movies). Users sharing the same interests in key factors such as the type, the price, and the appearance of the product often tend to provide the same ratings on the movies. The ratings of those users might form a low-rank column space, resulting in the low-rank model of the rating matrix (see Fig. 1).
Phase retrieval: The problem of recovering a signal, not necessarily sparse, from the magnitude of its observations is referred to as phase retrieval. Phase retrieval is an important problem in X-ray crystallography and quantum mechanics since only the magnitude of the Fourier transform is measured in these applications [5]. Suppose the unknown time-domain signal x is acquired in the form of the measured magnitudes of its Fourier transform. That is,

y_ω = |⟨x, f_ω⟩|^2, ω ∈ Ω, (2)

where Ω is the set of sampled frequencies and f_ω is the Fourier vector at frequency ω. Further, let

X = x x^H, (3)

where x^H is the conjugate transpose of x. Then, (2) can be rewritten as

y_ω = |⟨x, f_ω⟩|^2 (4)
    = f_ω^H x x^H f_ω (5)
    = tr(f_ω^H X f_ω) (6)
    = ⟨X, f_ω f_ω^H⟩, (7)

where X = x x^H is the rank-1 matrix of the waveform x. Using this simple transform, we can express the quadratic magnitude measurements as linear measurements of X. In essence, the phase retrieval problem can be converted to the problem of reconstructing the rank-1 matrix X in the positive semidefinite (PSD) cone (if X is recovered, then the time-domain vector x can be computed by the eigenvalue decomposition of X) [5]:

min_X rank(X) subject to ⟨X, f_ω f_ω^H⟩ = y_ω for ω ∈ Ω, X ⪰ 0. (8)
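The lifting steps (4)-(7) can be checked numerically; the sketch below uses an arbitrary signal length and frequency, not values from the paper:

```python
import numpy as np

# Verify that the quadratic measurement |<x, f>|^2 equals the linear
# measurement <X, f f^H> of the lifted rank-one matrix X = x x^H.
rng = np.random.default_rng(0)
n = 8
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)  # unknown signal
X = np.outer(x, x.conj())                                  # lifted rank-one PSD matrix

omega = 3                                                  # one sampled frequency
f = np.exp(-2j * np.pi * omega * np.arange(n) / n)         # Fourier vector

quadratic = np.abs(np.vdot(f, x)) ** 2                     # |f^H x|^2
linear = np.real(np.trace(X @ np.outer(f, f.conj())))      # <X, f f^H>
assert np.isclose(quadratic, linear)
assert np.linalg.matrix_rank(X) == 1
```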
Localization in IoT networks: In recent years, the internet of things (IoT) has received much attention for its plethora of applications, such as healthcare, automatic metering, environmental monitoring (temperature, pressure, moisture), and surveillance [6, 7, 2]. Since actions in IoT networks, such as fire alarms, energy transfer, and emergency requests, are made primarily at the data center, the data center should figure out the location information of all devices in the network. In this scheme, called network localization (a.k.a. cooperative localization), each sensor node measures the distances to adjacent nodes and then sends them to the data center. The data center then constructs a map of the sensor nodes using the collected distance information [8]. Due to various reasons, such as the power outage of a sensor node or the limitation of the radio communication range (see Fig. 2), only a small number of distance measurements is available at the data center. Also, in vehicular networks, it is not easy to measure the distances to all adjacent vehicles when a vehicle is located in a dead zone. In the observed Euclidean distance matrix D, the entry d_ij is the pairwise distance between two sensor nodes i and j. Since the rank of a Euclidean distance matrix is at most k + 2 in the k-dimensional Euclidean space (k = 2 or k = 3) [3, 4], the problem of reconstructing D can be well modeled as an LRMC problem.

Image compression and restoration: When there is dirt or scribbles in a two-dimensional image (see Fig. 3), one simple solution is to replace the contaminated pixels with an interpolation of adjacent pixels. A better way is to exploit the intrinsic domination of a few singular values in an image. In fact, one can readily approximate an image by a low-rank matrix without perceptible loss of quality. By using the clean (uncontaminated) pixels as observed entries, the original image can be recovered via low-rank matrix completion.

Massive multiple-input multiple-output (MIMO): By exploiting hundreds of antennas at the base station (BS), massive MIMO can offer a large gain in capacity. In order to optimize the performance gain of massive MIMO systems, the channel state information at the transmitter (CSIT) is required [9]. One way to acquire the CSIT is to let each user directly feed back its own pilot observation to the BS for the joint CSIT estimation of all users [12]. In this setup, the MIMO channel matrix can be reconstructed in two steps: 1) estimating the pilot observation matrix using the least squares (LS) or linear minimum mean square error (LMMSE) estimation and 2) reconstructing the channel matrix from the observation model in which each column of the pilot matrix is the pilot signal from one antenna at the BS [10, 11]. Since the number of resolvable paths is limited in most cases, one can readily assume that the channel matrix is of low rank [12]. In massive MIMO systems, the rank of the channel matrix is often much smaller than its dimension due to the limited number of clusters around the BS. Thus, the problem of recovering the channel matrix at the BS can be solved via the rank minimization problem subject to the linear observation constraint [11].
Other than these, there is a wide variety of applications of LRMC in wireless communications, such as millimeter wave (mmWave) channel estimation [13, 14], topological interference management (TIM) [15, 16, 17, 18], and mobile edge caching in fog radio access networks (Fog-RAN) [19, 20].
The paradigm of LRMC has received much attention ever since the works of Fazel [21], Candès and Recht [22], and Candès and Tao [23]. Over the years, there have been many studies on this topic [5, 57, 48, 49], but it might not be easy to grasp the essentials of LRMC from them. One reason is that many of these works are highly theoretical, relying on random matrix theory, graph theory, manifold analysis, and convex optimization. Another reason is that most of these works are proposals of new LRMC techniques, so it is difficult to catch the general idea and big picture of LRMC from them.
The primary goal of this paper is to provide a contemporary survey on LRMC, a paradigm to recover the unknown entries of a low-rank matrix from partial observations. To offer researchers and practitioners a better view, insight, and understanding of the potentials and limitations of LRMC, we present the early scattered results in a structured and accessible way. Firstly, we classify the state-of-the-art LRMC techniques into two main categories and then explain each category in detail. Secondly, we present issues to be considered when using LRMC techniques. Specifically, we discuss the intrinsic properties required for low-rank matrix recovery and explain how to exploit a special structure, such as the positive semidefinite-based structure, the Euclidean distance-based structure, and the graph structure, in LRMC design. Thirdly, we compare the recovery performance and the computational complexity of LRMC techniques via numerical simulations. We conclude the paper by commenting on the choice of LRMC techniques and providing future research directions.
Recently, there have been a few overview papers on LRMC. An overview of LRMC algorithms and their performance guarantees can be found in [73]. A survey with an emphasis on first-order LRMC techniques together with their computational efficiency is presented in [74]. Our work is distinct from these previous studies in several aspects. Firstly, we categorize the state-of-the-art LRMC techniques into two classes and then explain the details of each class, which can help researchers easily determine which technique can be used for a given problem setup. Secondly, we provide a comprehensive survey of LRMC techniques together with extensive simulation results on the recovery quality and the running time, from which one can easily see the pros and cons of each LRMC technique and gain better insight into the choice of LRMC algorithms. Finally, we discuss how to exploit a special structure of a low-rank matrix in LRMC algorithm design. In particular, we introduce a CNN-based LRMC algorithm that exploits the graph structure of a low-rank matrix.
We briefly summarize the notations used in this paper.

For a vector a, diag(a) is the diagonal matrix formed by a.

For a matrix A, a_i is the i-th column of A.

rank(A) is the rank of A.

A^T is the transpose of A.

For matrices A and B of the same size, ⟨A, B⟩ = tr(A^T B) and A ⊙ B are the inner product and the Hadamard product (or element-wise multiplication) of the two matrices A and B, respectively, where tr(·) denotes the trace operator.

||A||_2, ||A||_*, and ||A||_F stand for the spectral norm (i.e., the largest singular value), the nuclear norm (i.e., the sum of singular values), and the Frobenius norm of A, respectively.

σ_i(A) is the i-th largest singular value of A.

0_{n1×n2} and 1_{n1×n2} are the n1 × n2 matrices with all entries being zero and one, respectively.

I_n is the n × n identity matrix.

If A is a square matrix (i.e., n1 = n2), diag(A) is the vector formed by the diagonal entries of A.

vec(A) is the vectorization of A.
II Basics of Low-Rank Matrix Completion
In this section, we discuss the principle of recovering a low-rank matrix from partial observations. Basically, the desired low-rank matrix M can be recovered by solving the rank minimization problem

min_X rank(X) subject to X_ij = M_ij for (i, j) ∈ Ω, (9)

where Ω is the index set of observed entries (e.g., Ω consists of all index pairs except the entry marked ? in the example in (1)). One can alternatively express the problem using the sampling operator P_Ω. The sampling operation P_Ω(A) of a matrix A is defined as

[P_Ω(A)]_ij = A_ij if (i, j) ∈ Ω, and 0 otherwise.

Using this operator, the problem (9) can be equivalently formulated as

min_X rank(X) subject to P_Ω(X) = P_Ω(M). (10)
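The sampling operator P_Ω can be sketched as a boolean mask; the matrix and index set below are illustrative, not taken from the paper:

```python
import numpy as np

def sample(A, mask):
    """Sampling operator P_Omega: keep observed entries, zero out the rest."""
    return np.where(mask, A, 0.0)

M = np.arange(1.0, 13.0).reshape(3, 4)       # example matrix
mask = np.array([[1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [1, 1, 0, 0]], dtype=bool)  # index set Omega

# A candidate X is feasible for (10) iff it agrees with M on Omega.
X = sample(M, mask)
assert np.array_equal(sample(X, mask), sample(M, mask))
assert X[0, 1] == 0.0  # unobserved entries are zeroed out
```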
A naive way to solve the rank minimization problem (10) is a combinatorial search. Specifically, we first assume that rank(M) = 1. Then, any two columns of M are linearly dependent, and thus we have a system of equations in which each column is a scaled version of another. If the system has no solution under the rank-one assumption, then we move to the next assumption rank(M) = 2, in which case we solve the new system of equations expressing each column as a linear combination of two others. This procedure is repeated until a solution is found. Clearly, the combinatorial search strategy is not feasible for most practical scenarios since its complexity grows exponentially in the problem size [76]. For example, when M is an n × n matrix, the number of systems of equations to be solved grows exponentially in n.
As a cost-effective alternative, various low-rank matrix completion (LRMC) algorithms have been proposed over the years. Roughly speaking, depending on how the rank information is used, the LRMC algorithms can be classified into two main categories: 1) those without the rank information and 2) those exploiting the rank information. In this section, we provide an in-depth discussion of the two categories (see the outline of LRMC algorithms in Fig. 4).
II-A LRMC Algorithms Without the Rank Information
In this subsection, we explain the LRMC algorithms that do not require the rank information of the original low-rank matrix.
II-A1 Nuclear Norm Minimization (NNM)
Since the rank minimization problem (10) is NP-hard [21], it is computationally intractable when the dimension of the matrix is large. One common trick to avoid the computational issue is to replace the nonconvex objective function with its convex surrogate, converting the combinatorial search problem into a convex optimization problem. There are two clear advantages in solving a convex optimization problem: 1) a locally optimal solution is globally optimal and 2) there are many efficient polynomial-time convex optimization solvers (e.g., the interior point method [77] and semidefinite programming (SDP) solvers).
In the LRMC problem, the nuclear norm ||X||_*, the sum of the singular values of X, has been widely used as a convex surrogate of rank(X) [22]:

min_X ||X||_* subject to P_Ω(X) = P_Ω(M). (11)
Indeed, it has been shown that the nuclear norm is the convex envelope (the “best” convex approximation) of the rank function on the set {X : ||X||_2 ≤ 1} [21]. (For any function f defined on a convex set C, the convex envelope of f is the largest convex function g such that g(x) ≤ f(x) for all x ∈ C; the convex envelope of rank(X) on {X : ||X||_2 ≤ 1} is the nuclear norm ||X||_* [21].) Note that the relaxation from the rank function to the nuclear norm is conceptually analogous to the relaxation from the l0-norm to the l1-norm in compressed sensing (CS) [39, 40, 41].
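The surrogate can be illustrated numerically: the nuclear norm is the sum of the singular values, and on matrices normalized to unit spectral norm it never exceeds the rank (the test matrix below is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 5))  # rank-2 matrix

s = np.linalg.svd(X, compute_uv=False)
# Nuclear norm = sum of singular values.
assert np.isclose(s.sum(), np.linalg.norm(X, 'nuc'))

# Each singular value is at most the spectral norm s[0], so the nuclear norm
# of X / ||X||_2 lower-bounds the rank, mirroring the envelope property.
assert s.sum() / s[0] <= np.linalg.matrix_rank(X) + 1e-9
```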
Now, a natural question one might ask is whether the NNM problem in (11) offers a solution comparable to the solution of the rank minimization problem in (10). In [22], it has been shown that if the observed entries of a rank-r matrix M are suitably random and the number m of observed entries satisfies

m ≥ C μ n^{6/5} r log n, (12)

where n = max(n1, n2) and μ is the largest coherence of M (see the definition in Subsection III-A2), then M is the unique solution of the NNM problem (11) with overwhelming probability (see Appendix B).

It is worth mentioning that the NNM problem in (11) can also be recast as a semidefinite program (SDP) (see Appendix A):

min_{X, W1, W2} tr(W1) + tr(W2) subject to ⟨A_i, X⟩ = b_i for i = 1, ..., m, and [[W1, X], [X^T, W2]] ⪰ 0, (13)

where {A_i}_{i=1}^{m} is the sequence of linear sampling matrices and {b_i}_{i=1}^{m} are the observed entries. The problem (13) can be solved by off-the-shelf SDP solvers such as SDPT3 [24] and SeDuMi [25] using interior-point methods [26, 27, 28, 31, 30, 29]. The computational complexity of the SDP-based techniques grows polynomially in the matrix dimension [30]. Also, it has been shown that, under suitable conditions, the output of SDP converges to the desired accuracy within a bounded number of iterations [29]. Alternatively, one can reconstruct M by solving an equivalent nonconvex quadratic optimization form of the NNM problem [32]. Note that this approach has a computational benefit since the number of primal variables of NNM is reduced from n1 n2 to r(n1 + n2) (r ≪ min(n1, n2)). Interested readers may refer to [32] for more details.
II-A2 Singular Value Thresholding (SVT)
While the solution of the NNM problem in (11) can be obtained by solving (13), this procedure is computationally burdensome when the size of the matrix is large. As an effort to mitigate the computational burden, the singular value thresholding (SVT) algorithm has been proposed [33]. The key idea of this approach is to put a Frobenius-norm regularization term into the objective function of the NNM problem:

min_X τ ||X||_* + (1/2) ||X||_F^2 subject to P_Ω(X) = P_Ω(M), (14)

where τ > 0 is the regularization parameter. In [33, Theorem 3.1], it has been shown that the solution of the problem (14) converges to the solution of the NNM problem as τ → ∞. (In practice, a large value of τ, e.g., τ = 5n for an n × n low-rank matrix, has been suggested for the fast convergence of SVT. For example, when τ = 5n, it requires 177 iterations to reconstruct a 1000 × 1000 matrix of rank 10 [33].)
Let L(X, Y) be the Lagrangian function associated with (14), i.e.,

L(X, Y) = τ ||X||_* + (1/2) ||X||_F^2 + ⟨Y, P_Ω(M) − P_Ω(X)⟩, (15)

where Y is the dual variable. Let X̂ and Ŷ be the primal and dual optimal solutions. Then, by the strong duality [77], we have

min_X max_Y L(X, Y) = max_Y min_X L(X, Y) = L(X̂, Ŷ). (16)

The SVT algorithm finds X̂ and Ŷ in an iterative fashion. Specifically, starting with Y_0 = 0, SVT updates X_k and Y_k as

X_k = arg min_X L(X, Y_{k−1}), (17a)
Y_k = Y_{k−1} + δ_k P_Ω(M − X_k), (17b)

where {δ_k} is a sequence of positive step sizes. Note that X_k can be expressed as

X_k = arg min_X τ ||X||_* + (1/2) ||X||_F^2 − ⟨Y_{k−1}, P_Ω(X)⟩
    = arg min_X τ ||X||_* + (1/2) ||X − Y_{k−1}||_F^2, (18)

where the first equality holds because the term ⟨Y_{k−1}, P_Ω(M)⟩ does not depend on X and the second because Y_{k−1} vanishes outside of Ω (i.e., Y_{k−1} = P_Ω(Y_{k−1})) by (17b). Due to the inclusion of the nuclear norm, finding the solution of (18) might seem difficult. However, thanks to an intriguing result of Cai et al., we can easily obtain the solution.
Theorem 1 ([33, Theorem 2.1]).
Let A be a matrix whose singular value decomposition (SVD) is A = U Σ V^T with Σ = diag(σ_1, ..., σ_r). Define D_τ(A) = U D_τ(Σ) V^T for τ ≥ 0. Then,

D_τ(A) = arg min_X τ ||X||_* + (1/2) ||X − A||_F^2, (19)

where D_τ(Σ) is the singular value thresholding operator defined as

D_τ(Σ) = diag(max(σ_1 − τ, 0), ..., max(σ_r − τ, 0)). (20)
By Theorem 1, the right-hand side of (18) is D_τ(Y_{k−1}). To conclude, the update equations for X_k and Y_k are given by

X_k = D_τ(Y_{k−1}), (21a)
Y_k = Y_{k−1} + δ_k P_Ω(M − X_k). (21b)

One can notice from (21a) and (21b) that the SVT algorithm is computationally efficient since we only need a truncated SVD and elementary matrix operations in each iteration. Indeed, let r_k be the number of singular values of Y_{k−1} greater than the threshold τ, and suppose r_k converges to the rank of the original matrix, i.e., r_k → r. Then the computational complexity of SVT is O(r n1 n2) per iteration. Note also that the number of iterations to achieve an ε-approximation is O(1/√ε) [33]. (By ε-approximation, we mean ||X̂ − X*||_F^2 ≤ ε, where X̂ is the reconstructed matrix and X* is the optimal solution of SVT.) In Table I, we summarize the SVT algorithm. For details of the stopping criterion of SVT, see [33, Section 5].
Over the years, various SVT-based techniques have been proposed [35, 78, 79]. In [78], an iterative matrix completion algorithm using an SVT-based operator called the proximal operator has been proposed. Similar algorithms inspired by the iterative hard thresholding (IHT) algorithm in CS have also been proposed [35, 79].
Input: observed entries P_Ω(M),
a sequence of positive step sizes {δ_k},
a regularization parameter τ,
and a stopping criterion
Initialize: iteration counter k = 0
and Y_0 = 0
While the stopping criterion is not met, do
k = k + 1
X_k = D_τ(Y_{k−1}) using (20)
Y_k = Y_{k−1} + δ_k P_Ω(M − X_k)
End
Output: X̂ = X_k
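A compact sketch of the SVT iteration (21a)-(21b) on a synthetic problem; the parameter choices follow the common suggestion τ = 5n with a constant step size, but the problem instance and iteration count are illustrative and untuned:

```python
import numpy as np

def svt(M_obs, mask, tau, delta, n_iter=500):
    Y = np.zeros_like(M_obs)
    X = np.zeros_like(M_obs)
    for _ in range(n_iter):
        # (21a): singular value thresholding X_k = D_tau(Y_{k-1})
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        X = (U * np.maximum(s - tau, 0.0)) @ Vt
        # (21b): step along the observed residual
        Y = Y + delta * np.where(mask, M_obs - X, 0.0)
    return X

rng = np.random.default_rng(0)
n, r = 40, 2
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # rank-2 target
mask = rng.random((n, n)) < 0.6                                # ~60% observed
X_hat = svt(np.where(mask, M, 0.0), mask, tau=5 * n, delta=1.2)
rel_err = np.linalg.norm(X_hat - M) / np.linalg.norm(M)
assert rel_err < 0.1
```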
II-A3 Iteratively Reweighted Least Squares (IRLS) Minimization
Yet another simple and computationally efficient way to solve the NNM problem is the IRLS minimization technique [36, 37]. In essence, the NNM problem can be recast as the weighted least squares minimization

min_X ||W^{1/2} X||_F^2 subject to P_Ω(X) = P_Ω(M), (22)

where W = (X X^T)^{−1/2}. It can be shown that (22) is equivalent to the NNM problem (11) since we have [36]

||W^{1/2} X||_F^2 = tr((X X^T)^{1/2}) = ||X||_*. (23)

The key idea of the IRLS technique is to find X and W in an iterative fashion. The update expressions are

X_k = arg min_{P_Ω(X) = P_Ω(M)} ||W_{k−1}^{1/2} X||_F^2, (24a)
W_k = (X_k X_k^T)^{−1/2}. (24b)

Note that the weighted least squares subproblem (24a) can be easily solved by updating each and every column of X_k [36]. In order to compute W_k, we need a matrix inversion in (24b). To avoid ill-behavior (i.e., some of the singular values of X_k approaching zero), an approach using a perturbation of the singular values has been proposed [36, 37]. Similar to SVT, the computational complexity per iteration of the IRLS-based technique is O(r n1 n2). Also, IRLS requires O(log(1/ε)) iterations to achieve an ε-approximation solution. We summarize the IRLS minimization technique in Table II.
Input: a constant γ > 0,
a scaling parameter η > 1,
and a stopping criterion
Initialize: iteration counter k = 0,
a regularizing sequence {γ_k},
and X_0 = P_Ω(M)
While the stopping criterion is not met, do
k = k + 1
X_k: solve the weighted least squares problem (24a)
W_k: compute an SVD-perturbed version of W in (24b) [36]
End
Output: X̂ = X_k
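The identity (23) underlying the IRLS recast can be checked numerically; the test matrix below is arbitrary (note that when X is rank-deficient the weight W = (X X^T)^{−1/2} does not exist, which is why the perturbed version is used in practice):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 7))  # rank-3 matrix

# Matrix square root of the Gram matrix X X^T via its eigendecomposition.
w, V = np.linalg.eigh(X @ X.T)
sqrt_gram = (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

# tr((X X^T)^(1/2)) equals the nuclear norm (sum of singular values) of X.
nuclear = np.linalg.svd(X, compute_uv=False).sum()
assert np.isclose(np.trace(sqrt_gram), nuclear)
```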
II-B LRMC Algorithms Using Rank Information
In many applications, such as localization in IoT networks, recommendation systems, and image restoration, we encounter situations where the rank of the desired matrix is known in advance. As mentioned, the rank of a Euclidean distance matrix in a localization problem is at most k + 2 (k is the dimension of the Euclidean space). In this situation, the LRMC problem can be formulated as a Frobenius norm minimization (FNM) problem:

min_X ||P_Ω(M) − P_Ω(X)||_F^2 subject to rank(X) ≤ r. (25)

Due to the inequality form of the rank constraint, an approach using approximate rank information (e.g., an upper bound of the rank) has been proposed [43]. The FNM problem has two main advantages: 1) the problem is well-posed in the noisy scenario and 2) the cost function is differentiable, so that various gradient-based optimization techniques (e.g., gradient descent, conjugate gradient, Newton methods, and manifold optimization) can be used to solve the problem.
Over the years, various techniques to solve the FNM problem in (25) have been proposed [43, 44, 45, 46, 47, 48, 49, 50, 51, 57]. Performance guarantees of the FNM-based techniques have also been provided [59, 60, 61]. It has been shown that, under suitable conditions on the sampling ratio and the largest coherence of M (see the definition in Subsection III-A2), gradient-based algorithms converge globally to M with high probability [60]. Well-known FNM-based LRMC techniques include greedy techniques [43], alternating projection techniques [45], and optimization over Riemannian manifolds [50]. In this subsection, we explain these techniques in detail.
II-B1 Greedy Techniques
In recent years, greedy algorithms have been widely used for LRMC due to their computational simplicity. In a nutshell, they solve the LRMC problem by making a heuristic decision at each iteration with the hope of finding the right solution in the end.
Let r be the rank of the desired low-rank matrix M, and let M = U Σ V^T be its singular value decomposition, where Σ = diag(σ_1, ..., σ_r). By noting that

M = σ_1 u_1 v_1^T + ... + σ_r u_r v_r^T, (26)

M can be expressed as a linear combination of r rank-one matrices. The main task of greedy techniques is to investigate the atom set A_M = {u_1 v_1^T, ..., u_r v_r^T} of rank-one matrices representing M. Once the atom set is found, the singular values σ_1, ..., σ_r can be computed easily by solving the least squares problem

(σ̂_1, ..., σ̂_r) = arg min_{α_1, ..., α_r} ||P_Ω(M) − P_Ω(α_1 u_1 v_1^T + ... + α_r u_r v_r^T)||_F. (27)

To be specific, let b = vec(P_Ω(M)) and A = [vec(P_Ω(u_1 v_1^T)) ... vec(P_Ω(u_r v_r^T))]. Then, we have (σ̂_1, ..., σ̂_r)^T = A^† b.
One popular greedy technique is atomic decomposition for minimum rank approximation (ADMiRA) [43], which can be viewed as an extension of the compressive sampling matching pursuit (CoSaMP) algorithm in CS [38, 39, 40, 41]. ADMiRA employs a strategy of adding as well as pruning to identify the atom set A_M. In the adding stage, ADMiRA identifies 2r rank-one matrices representing the residual best and then adds these matrices to the pre-chosen atom set. Specifically, if X_i is the output matrix generated in the i-th iteration and A_i is its atom set, then ADMiRA computes the residual R_i = P_Ω(M) − P_Ω(X_i) and then adds the 2r leading principal components of R_i to A_i. In other words, the enlarged atom set Ψ_i is given by

Ψ_i = A_i ∪ {u_1 v_1^T, ..., u_{2r} v_{2r}^T}, (28)

where u_j and v_j are the j-th principal left and right singular vectors of R_i, respectively. Note that Ψ_i contains at most 3r elements. In the pruning stage, ADMiRA refines Ψ_i into a set of r atoms. To be specific, if X̃_i is the best rank-r approximation of the least squares solution X̄_i over the enlarged atom set Ψ_i (X̄_i is computed in a similar way as in (27)), i.e.,

X̃_i = arg min_{rank(X) ≤ r} ||X̄_i − X||_F, (29)

then the refined atom set A_{i+1} is expressed as

A_{i+1} = {ũ_1 ṽ_1^T, ..., ũ_r ṽ_r^T}, (30)

where ũ_j and ṽ_j are the j-th principal left and right singular vectors of X̃_i, respectively. The computational complexity of ADMiRA is mainly due to two operations: the least squares operation in (27) and the SVD-based operation to find the leading atoms of the required matrices (e.g., R_i and X̃_i). First, since (27) involves the pseudo-inverse of A (of size |Ω| × 3r), its computational cost is O(r^2 |Ω|). Second, the computational cost of a truncated SVD finding the 2r leading atoms is O(r n1 n2). Since |Ω| ≤ n1 n2, the computational complexity of ADMiRA per iteration is O(r^2 n1 n2). Also, the number of iterations of ADMiRA to achieve an ε-approximation is O(log(1/ε)) [43]. In Table III, we summarize the ADMiRA algorithm.
Input: observed entries P_Ω(M),
rank r of the desired low-rank matrix,
and a stopping criterion
Initialize: iteration counter i = 0,
X_0 = 0,
and A_0 = ∅
While the stopping criterion is not met, do
(Augment) Ψ_i = A_i ∪ {2r leading principal atoms of P_Ω(M) − P_Ω(X_i)}
X̄_i: least squares over Ψ_i using (27)
(Prune) A_{i+1} = {r leading principal atoms of X̄_i}
(Estimate) X_{i+1}: least squares over A_{i+1} using (27)
i = i + 1
End
Output: X̂ = X_i, Â = A_i
Yet another well-known greedy method is the rank-one matrix pursuit algorithm [44], an extension of the orthogonal matching pursuit algorithm in CS [42]. In this approach, instead of choosing multiple atoms of a matrix, only the atom corresponding to the largest singular value of the residual matrix is chosen at a time.
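A simplified sketch of the rank-one pursuit idea: each iteration selects the leading singular pair of the residual on the observed entries as a new atom and then re-fits all atom weights by least squares on Ω. The data and parameters are illustrative, and the weight re-fit follows one common variant of the algorithm:

```python
import numpy as np

def rank_one_pursuit(M_obs, mask, n_atoms):
    atoms, X = [], np.zeros_like(M_obs)
    for _ in range(n_atoms):
        R = np.where(mask, M_obs - X, 0.0)        # residual on Omega
        U, s, Vt = np.linalg.svd(R, full_matrices=False)
        atoms.append(np.outer(U[:, 0], Vt[0]))    # leading rank-one atom
        # least squares over all chosen atoms, restricted to observed entries
        A = np.column_stack([a[mask] for a in atoms])
        theta, *_ = np.linalg.lstsq(A, M_obs[mask], rcond=None)
        X = sum(t * a for t, a in zip(theta, atoms))
    return X

rng = np.random.default_rng(3)
n, r = 30, 2
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
mask = rng.random((n, n)) < 0.7
X_hat = rank_one_pursuit(np.where(mask, M, 0.0), mask, n_atoms=15)
# The residual on the observed entries shrinks as atoms are added.
fit_err = np.linalg.norm((X_hat - M)[mask]) / np.linalg.norm(M[mask])
assert fit_err < 0.5
```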
II-B2 Alternating Minimization Techniques
Many LRMC algorithms [33, 43] require the computation of a (partial) SVD to obtain the singular values and vectors. As an effort to further reduce the computational burden of the SVD, alternating minimization techniques have been proposed [45, 46, 47]. The basic premise behind the alternating minimization techniques is that a low-rank matrix M of rank r can be factorized into tall and fat matrices, i.e., M = XY, where X ∈ R^{n1×r} and Y ∈ R^{r×n2}. The key idea of this approach is to find X and Y minimizing the residual, defined as the difference between the original matrix and its estimate on the sampling space. In other words, they recover X and Y by solving

min_{X, Y} ||P_Ω(M) − P_Ω(XY)||_F^2. (31)

Power factorization, a simple alternating minimization algorithm, finds the solution to (31) by updating X and Y alternately as [45]

X_{k+1} = arg min_X ||P_Ω(M) − P_Ω(X Y_k)||_F^2, (32a)
Y_{k+1} = arg min_Y ||P_Ω(M) − P_Ω(X_{k+1} Y)||_F^2. (32b)
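A minimal sketch of power factorization: each row of X and each column of Y is obtained by a small least squares fit over the entries observed in that row or column. The data, rank, and iteration count are illustrative:

```python
import numpy as np

def power_factorization(M_obs, mask, r, n_iter=100):
    n1, n2 = M_obs.shape
    rng = np.random.default_rng(0)
    X = rng.standard_normal((n1, r))
    Y = rng.standard_normal((r, n2))
    for _ in range(n_iter):
        # (32a): update each row of X from the observed entries of that row
        for i in range(n1):
            cols = mask[i]
            X[i], *_ = np.linalg.lstsq(Y[:, cols].T, M_obs[i, cols], rcond=None)
        # (32b): update each column of Y from the observed entries of that column
        for j in range(n2):
            rows = mask[:, j]
            Y[:, j], *_ = np.linalg.lstsq(X[rows], M_obs[rows, j], rcond=None)
    return X @ Y

rng = np.random.default_rng(4)
n, r = 30, 2
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
mask = rng.random((n, n)) < 0.5
M_hat = power_factorization(np.where(mask, M, 0.0), mask, r)
assert np.linalg.norm(M_hat - M) / np.linalg.norm(M) < 1e-2
```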
Alternating steepest descent (ASD) is another alternating method to find the solution [46]. The key idea of ASD is to update X and Y by applying the steepest gradient descent method to the objective function f(X, Y) = (1/2)||P_Ω(M) − P_Ω(XY)||_F^2 in (31). Specifically, ASD first computes the gradient of f with respect to X and then updates X along the steepest gradient descent direction:

X_{k+1} = X_k + t_{x_k} d_{x_k}, (33)

where the gradient descent direction and step size are given by

d_{x_k} = (P_Ω(M) − P_Ω(X_k Y_k)) Y_k^T, (34a)
t_{x_k} = ||d_{x_k}||_F^2 / ||P_Ω(d_{x_k} Y_k)||_F^2. (34b)

After updating X, ASD updates Y in a similar way:

Y_{k+1} = Y_k + t_{y_k} d_{y_k}, (35)

where

d_{y_k} = X_{k+1}^T (P_Ω(M) − P_Ω(X_{k+1} Y_k)), (36a)
t_{y_k} = ||d_{y_k}||_F^2 / ||P_Ω(X_{k+1} d_{y_k})||_F^2. (36b)
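The ASD updates can be sketched as follows; the exact line-search step sizes mirror (34b) and (36b), while the problem instance, initialization scale, and iteration count are illustrative:

```python
import numpy as np

def asd(M_obs, mask, r, n_iter=500):
    n1, n2 = M_obs.shape
    rng = np.random.default_rng(0)
    X = 0.1 * rng.standard_normal((n1, r))
    Y = 0.1 * rng.standard_normal((r, n2))
    eps = 1e-12  # guards against division by zero in the step sizes
    for _ in range(n_iter):
        R = np.where(mask, M_obs - X @ Y, 0.0)                               # residual on Omega
        dX = R @ Y.T                                                         # (34a)
        tX = np.sum(dX**2) / (np.sum(np.where(mask, dX @ Y, 0.0)**2) + eps)  # (34b)
        X = X + tX * dX                                                      # (33)
        R = np.where(mask, M_obs - X @ Y, 0.0)
        dY = X.T @ R                                                         # (36a)
        tY = np.sum(dY**2) / (np.sum(np.where(mask, X @ dY, 0.0)**2) + eps)  # (36b)
        Y = Y + tY * dY                                                      # (35)
    return X @ Y

rng = np.random.default_rng(5)
n, r = 30, 2
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
mask = rng.random((n, n)) < 0.5
M_hat = asd(np.where(mask, M, 0.0), mask, r)
assert np.linalg.norm(M_hat - M) / np.linalg.norm(M) < 0.05
```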
The low-rank matrix fitting (LMaFit) algorithm finds the solution in a different way by solving [47]

min_{X, Y, Z} ||XY − Z||_F^2 subject to P_Ω(Z) = P_Ω(M). (37)

With arbitrary inputs X_0 ∈ R^{n1×r} and Y_0 ∈ R^{r×n2} and Z_0 = P_Ω(M), the variables X, Y, and Z are updated in the (k+1)-th iteration as

X_{k+1} = Z_k Y_k^†, (38a)
Y_{k+1} = X_{k+1}^† Z_k, (38b)
Z_{k+1} = X_{k+1} Y_{k+1} + P_Ω(M − X_{k+1} Y_{k+1}), (38c)

where A^† is the Moore–Penrose pseudo-inverse of the matrix A.
The running time of the alternating minimization algorithms is very short for the following reasons: 1) they do not require the SVD computation and 2) the size of the matrices to be (pseudo-)inverted is smaller than that in the greedy algorithms. While the inversion of a huge matrix is required in the greedy algorithms (see (27)), alternating minimization only requires the pseudo-inversion of X and Y (of sizes n1 × r and r × n2, respectively). Indeed, the computational complexity per iteration of this approach is O(r |Ω| + r^2 (n1 + n2)), which is much smaller than that of SVT and ADMiRA when r ≪ min(n1, n2). Also, the number of iterations of ASD and LMaFit to achieve an ε-approximation is O(log(1/ε)) [46, 47]. It has been shown that alternating minimization techniques are simple to implement and also require a small-sized memory [84]. The major drawback of these approaches is that they might converge to a local optimum.
II-B3 Optimization over Smooth Riemannian Manifold
In many applications where the rank of the matrix is known a priori (i.e., rank(M) = r), one can strengthen the constraint of (25) by defining the feasible set M_r as

M_r = {X ∈ R^{n1×n2} : rank(X) = r}.

Note that M_r is not a vector space (if rank(A) = rank(B) = r, then rank(A + B) = r is not necessarily true, and thus A + B does not need to belong to M_r), and thus conventional optimization techniques cannot be used to solve the problem defined over M_r. While this is bad news, a remedy is that M_r is a smooth Riemannian manifold [53, 48]. Roughly speaking, a smooth manifold is a generalization of R^n on which a notion of differentiability exists. For a more rigorous definition, see, e.g., [55, 56]. A smooth manifold equipped with an inner product, often called a Riemannian metric, forms a smooth Riemannian manifold. Since the smooth Riemannian manifold is a differentiable structure equipped with an inner product, one can use all the ingredients needed to solve an optimization problem with a quadratic cost function, such as the Riemannian gradient, Hessian matrix, exponential map, and parallel translation [55]. Therefore, optimization techniques in R^n (e.g., steepest descent, Newton's method, and the conjugate gradient method) can be used to solve (25) over the smooth Riemannian manifold M_r.
In recent years, many efforts have been made to solve the matrix completion problem over smooth Riemannian manifolds. These works are classified by their specific choice of Riemannian manifold structure. One well-known approach is to solve (25) over the Grassmann manifold of orthonormal matrices (the Grassmann manifold is defined as the set of linear subspaces in a vector space [55]) [49]. In this approach, a feasible solution can be expressed as X = UW, where U is an n1 × r orthonormal matrix, and thus solving (25) amounts to finding an orthonormal matrix U satisfying

min_U min_W ||P_Ω(M) − P_Ω(UW)||_F^2. (39)

In [49], an approach to solve (39) over the Grassmann manifold has been proposed.
Recently, it has been shown that the original matrix can be reconstructed by unconstrained optimization over the smooth Riemannian manifold M_r [50]. Often, X ∈ M_r is expressed using the singular value decomposition as

X = U Σ V^T, where U and V are orthonormal and Σ = diag(σ_1, ..., σ_r) with σ_1 ≥ ... ≥ σ_r > 0. (40)

The FNM problem (25) can then be reformulated as an unconstrained optimization over M_r:

min_{X ∈ M_r} ||P_Ω(M) − P_Ω(X)||_F^2. (41)

One can easily obtain closed-form expressions of the required ingredients, such as the tangent spaces, Riemannian metric, Riemannian gradient, and Hessian matrix, in the unconstrained optimization [53, 55, 56]. In fact, the major benefits of the Riemannian optimization-based LRMC techniques are the simplicity of implementation and the fast convergence. Similar to ASD, the computational complexity per iteration of these techniques is O(r |Ω| + r^2 (n1 + n2)), and they require O(log(1/ε)) iterations to achieve an ε-approximation solution [50].
II-B4 Truncated NNM
Truncated NNM is a variation of the NNM-based technique requiring the rank information r. (Although truncated NNM is a variant of NNM, we put it into the second category since it exploits the rank information of the low-rank matrix.) While the NNM technique takes into account all the singular values of the desired matrix, truncated NNM considers only the min(n1, n2) − r smallest singular values [57]. Specifically, truncated NNM finds a solution to

min_X Σ_{i=r+1}^{min(n1,n2)} σ_i(X) subject to P_Ω(X) = P_Ω(M), (42)

where σ_i(X) is the i-th largest singular value of X. Using [57]

Σ_{i=1}^{r} σ_i(X) = max_{U U^T = I, V V^T = I} tr(U X V^T), (43)

where U ∈ R^{r×n1} and V ∈ R^{r×n2}, we have

Σ_{i=r+1}^{min(n1,n2)} σ_i(X) = ||X||_* − max_{U U^T = I, V V^T = I} tr(U X V^T), (44)

and thus the problem (42) can be reformulated as

min_X ||X||_* − max_{U U^T = I, V V^T = I} tr(U X V^T) subject to P_Ω(X) = P_Ω(M). (45)

This problem can be solved in an iterative way. Specifically, starting from X_0 = P_Ω(M), truncated NNM updates X_k by solving [57]

X_k = arg min_X ||X||_* − tr(U_{k−1} X V_{k−1}^T) subject to P_Ω(X) = P_Ω(M), (46)

where U_{k−1} and V_{k−1} are the matrices of the r leading left and right singular vectors of X_{k−1}, respectively. We note that the approach in (46) has two main advantages: 1) the rank information of the desired matrix can be incorporated and 2) various gradient-based techniques, including the alternating direction method of multipliers (ADMM) [80, 81], ADMM with an adaptive penalty (ADMM-AP) [82], and the accelerated proximal gradient line search method (APGL) [83], can be employed. Note also that the dominant operation is the truncated SVD, whose complexity is O(r n1 n2), which is much smaller than that of the NNM technique (see Table V) as long as r ≪ min(n1, n2).