## I Introduction

Matrix completion (e.g., [1, 2, 3]) is a fundamental problem in signal processing and machine learning, which studies the recovery of a low-rank matrix from observations of a subset of its entries. It has attracted much attention from researchers and practitioners, motivated by various real-world applications including recommender systems and the Netflix challenge (see a recent overview in [4]). A popular approach to matrix completion is to find a matrix of minimal rank satisfying the observation constraints. Due to the non-convexity of the rank function, popular approaches are convex relaxation and nuclear norm minimization (see, e.g., [5]). There is a rich literature establishing performance bounds, developing efficient algorithms, and providing performance guarantees. Recently there have also been various new results for non-convex formulations of the matrix completion problem (see, e.g., [6]). Existing conditions ensuring recovery of the minimal rank matrix are usually formulated in terms of missing-at-random entries and under a so-called bounded-coherence assumption (see a survey of other approaches in [4]; we do not aim to give a complete overview of the vast literature). These results typically aim at establishing recovery with high probability.

In addition, there has been much work on low-rank matrix recovery (see, e.g., [7], which studies a related problem: uniqueness conditions for minimum rank matrix recovery from random linear measurements of the true matrix; there each linear measurement is the inner product of a measurement mask matrix with the true matrix, and hence the observations differ from those in matrix completion). With a deterministic pattern of observed entries, a complete characterization of identifiability for matrix completion remains an important yet open question: under what conditions on the pattern is there an (at least locally) unique solution? Recent work [8] provides insight into this problem by studying so-called completable problems and establishing conditions ensuring the existence of at most finitely many rank- matrices that agree with all observed entries. A related work [9] studied this problem when a sparse noise corrupts the entries. The rank estimation problem has been discussed in [10, 11], and the related tensor completion problem in [12]; the goals in these works are different though: they aim to find upper and lower bounds for the true rank, whereas our rank selection test in Section IV determines the most plausible rank from a statistical point of view.

In this paper, we aim to answer the question from a somewhat different point of view and to give a geometric perspective. In particular, we consider the solution of the Minimum Rank Matrix Completion (MRMC) formulation, which leads to a non-convex optimization problem. We address the following questions: (i) given observed entries arranged according to a (deterministic) pattern, what is the minimum rank achievable by solving the MRMC problem? (ii) Under what conditions is there a unique matrix solving the MRMC problem? We give a sufficient condition (which we call the well-posedness condition) for local uniqueness of MRMC solutions, and illustrate how this condition can be verified. We also show that the well-posedness condition is generic, using the concept of characteristic rank. In addition, we consider the convex relaxation and nuclear norm minimization formulations.

Based on our theoretical results, we argue that given observations of an matrix, if the minimal rank is less than , then the corresponding solution is unstable in the sense that an arbitrarily small perturbation of the observed values can make this rank unattainable. On the other hand, if , then almost surely the solution is not (even locally) unique (cf., [13]). This indicates that, except on rare occasions, the MRMC problem cannot simultaneously possess unique and stable solutions. Consequently, what makes sense is to solve the minimum rank problem approximately, and hence to consider low-rank approximation approaches (such as the approach mentioned in [4, 14]) as a better alternative to the MRMC formulation.

We also propose a sequential statistical testing procedure to determine the ‘true’ rank from noisy observed entries. Such a statistical approach can be useful for many existing low-rank matrix completion algorithms that require the matrix rank to be specified in advance, such as the alternating minimization approach to solving the non-convex problem by representing the low-rank matrix as a product of two low-rank matrix factors (see, e.g., [15, 4, 16]).

The paper is organized as follows. In Section II we present the problem set-up and some basic definitions, including the MRMC, LRMA, and convex relaxation formulations. Section III contains the main theoretical results. A statistical test of rank is presented in Section IV. In Section V we present numerical results related to the developed theory. Finally, Section VI concludes the paper. All proofs are deferred to the Appendix.

We use conventional notation. For we denote by the least integer that is greater than or equal to . By we denote the Kronecker product of matrices (vectors) and , and by the column vector obtained by stacking the columns of matrix . We use the following matrix identity for matrices of appropriate order (1) By we denote the linear space of symmetric matrices, and by writing we mean that matrix is positive semidefinite. By we denote the -th largest singular value of matrix . By we denote the identity matrix of dimension .

## II Matrix completion and problem set-up

Consider the problem of recovering an data matrix of low rank from observations of a small number of its entries, denoted , . We assume that and . Here is an index set of cardinality . The low-rank matrix completion problem, or matrix completion problem, aims to infer the missing entries from the available observations , , using a matrix whose rank is as small as possible.

The low-rank matrix completion problem is usually studied under a missing-at-random model, under which necessary and sufficient conditions for perfect recovery of the true matrix are known [17, 18, 19, 20, 21, 22]. Studies of deterministic sampling patterns are relatively rare. These include the finitely rank- completability problem in [8], which gives conditions on a deterministic sampling pattern such that there exist at most finitely many rank- matrices that agree with its observed entries. In this paper, we study a related but different problem, namely, when the matrix can be completed in a unique way given a fixed sampling pattern. This is a fundamental problem related to the identifiability of a low-rank matrix given an observation pattern .

### II-A Definitions

Let us introduce some necessary definitions. Denote by the matrix with the specified entries , , and all other entries equal to zero. Consider , the complement of the index set , and define

This linear space represents the set of matrices that are filled with zeros at the locations of the unobserved entries. Similarly define

By we denote the projection onto the space , i.e., for and for . By this construction, is the affine space of all matrices that satisfy the observation constraints. Note that and the dimension of the linear space is , while .

We say that a property holds for almost every (a.e.) , or almost surely, if the set of matrices for which this property does not hold has Lebesgue measure zero in the space .

### II-B Minimum Rank Matrix Completion (MRMC)

Since the true rank is unknown, a natural approach is to find the minimum rank matrix that is consistent with the observations. This goal can be written as the following optimization problem, referred to as the Minimum Rank Matrix Completion (MRMC) problem,

(2)

In general, the rank minimization problem is non-convex and NP-hard to solve. However, this problem is fundamental to various efficient heuristics derived from it. Broadly, there are two categories of approximation heuristics: (i) approximate the rank function with a surrogate function such as the nuclear norm, or (ii) solve a sequence of rank-constrained problems, such as matrix factorization based methods, which we discuss below. Approach (ii) requires specifying the target rank of the recovered matrix beforehand, for which we present a novel statistical test in Section IV.

### II-C Low Rank Matrix Approximation (LRMA)

Consider the problem

(3)

where is the given data matrix, and is a discrepancy measure between matrices . For example, let with being the Frobenius norm. Define the set of matrices of rank

(4)

Then (3) becomes the least squares problem

(5)

The least squares approach, although natural, is not the only one possible. For example, in the statistical approach to Factor Analysis, the discrepancy function is based on the Maximum Likelihood method and is more involved (e.g., [23]).
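For the Frobenius-norm case (5), a common way to handle the rank constraint is the factorization heuristic mentioned earlier: parametrize the rank- matrix as a product of two factors and alternate least squares updates over the observed entries. The sketch below is a minimal illustration under that parametrization; the function name, the small ridge term, and the random initialization are our choices, not the paper's.

```python
import numpy as np

def als_lrma(M_obs, mask, r, n_iter=100, ridge=1e-8):
    """Alternating least squares sketch for the least squares problem:
    minimize the squared error over observed entries subject to rank <= r.

    M_obs: matrix holding observed values (entries off the mask are ignored);
    mask:  boolean array, True at observed positions.
    """
    rng = np.random.default_rng(0)
    n1, n2 = M_obs.shape
    U = rng.standard_normal((n1, r))
    V = rng.standard_normal((n2, r))
    for _ in range(n_iter):
        # update each row of U from the observed entries in that row
        for i in range(n1):
            cols = mask[i]
            A = V[cols]                     # (#observed in row i) x r
            U[i] = np.linalg.solve(A.T @ A + ridge * np.eye(r),
                                   A.T @ M_obs[i, cols])
        # symmetric update for the rows of V
        for j in range(n2):
            rows = mask[:, j]
            A = U[rows]
            V[j] = np.linalg.solve(A.T @ A + ridge * np.eye(r),
                                   A.T @ M_obs[rows, j])
    return U, V
```

Each subproblem is an ordinary least squares fit, so the objective is monotonically non-increasing; the ridge term only guards against rows or columns with fewer than r observations.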

### II-D SDP formulations: Trace and nuclear norm minimization

An alternative approach to the MRMC problem, which has been studied extensively in the literature, is the convex relaxation formulation (e.g., [1, 5]). Let be the symmetric index set corresponding to the index set , i.e., when , if and only if ; and if , then . By we denote the symmetric index set complement of . Define

and

Define , , a symmetric matrix of the following form that contains the data,

The MRMC problem (2) can be formulated in the following equivalent form

(6)

Minimization in (6) is performed over matrices that are complementary to in the sense of having zero entries at all places corresponding to the specified values , . We consider a more general minimum rank problem of the form (6) in that we allow the index set to be a general symmetric subset of , with a given matrix . Note that and .

As a heuristic, it was suggested in [5] to approximate problem (6) by the following trace minimization problem

(7)

which is equivalent to the following nuclear norm minimization problem

(8)

Problem (7) is a special case of the following general SDP problem (if we introduce a weight matrix ):

(9)

The above formulation is a semidefinite programming (SDP) problem and can be solved efficiently, e.g., using the singular value thresholding algorithm [24]. It has therefore been commonly adopted as an approximation of the minimum rank problem.
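As an illustration, a minimal sketch of a singular value thresholding iteration in the spirit of [24] is given below. This is not the paper's implementation: the threshold tau, step size delta, and fixed iteration count are placeholder choices rather than the tuned values recommended in [24].

```python
import numpy as np

def svt_complete(M_obs, mask, tau=None, delta=1.2, n_iter=500):
    """Singular value thresholding sketch for matrix completion.

    M_obs: matrix with observed entries (zeros elsewhere);
    mask:  boolean array, True at observed positions.
    Iterates dual ascent on the observation constraints, shrinking
    singular values of the dual iterate by tau at each step.
    """
    n1, n2 = M_obs.shape
    if tau is None:
        tau = 5 * np.sqrt(n1 * n2)   # a common heuristic default
    Z = np.zeros_like(M_obs)         # dual variable
    X = np.zeros_like(M_obs)
    for _ in range(n_iter):
        # soft-threshold the singular values of the dual iterate
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        X = U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt
        # gradient step enforcing agreement with the observed entries
        Z = Z + delta * mask * (M_obs - X)
    return X
```

Each iteration costs one SVD; the soft-thresholding step keeps the iterate low-rank, which is what makes the scheme attractive for the nuclear-norm relaxation.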

## III Main theoretical results

To gain insight into the identifiability issue of matrix completion, we aim to answer the following two related questions: (i) what is the achievable minimum rank (the optimal value of problem (2)), and (ii) whether the minimum rank matrix, i.e., the optimal solution to (2), is unique for a given problem set-up. These results also shed light on the tradeoffs in the theoretical properties of other matrix completion formulations, including the LRMA and SDP formulations, compared with the original MRMC formulation.

We show that given observations of an matrix: (i) if the minimal rank is less than , then the corresponding solution is unstable: an arbitrarily small perturbation of the observed values can make this rank unattainable; (ii) if , then almost surely the solution is not (even locally) unique (cf., [13]). This indicates that, except on rare occasions, the MRMC problem cannot simultaneously possess unique and stable solutions. Consequently, LRMA approaches (also used in [4, 14]) could be a better alternative to the MRMC formulation. Moreover, we argue that the nuclear norm minimization approach is not (asymptotically) statistically efficient (Section III-H).

### III-A Rank reducibility

We denote by the optimal value of problem (2). That is, is the minimal rank of an matrix with prescribed elements , . Clearly, depends on the index set and values . A natural question is what values of can be attained. Recall that (2) is a non-convex problem and may have multiple solutions.

In a certain generic sense it is possible to give a lower bound for the minimal rank . Let us consider the intersection of the set of low-rank matrices and the affine space of matrices satisfying the observation constraints. Define the (affine) mapping as

As pointed out before, the image of the mapping defines the space of feasible points of the MRMC problem (2). It is well known that is a smooth, , manifold with

(10)

It is said that the mapping intersects transversally if for every either , or and the following condition holds

(11)

where and denotes the tangent space to at (we give explicit formulas for the tangent space in equations (18) and (19) below).

By using a classical result of differential geometry, it is possible to show that for almost every (a.e.) , , the mapping intersects transversally (this holds for every ) (see [13] for a discussion of this result). The transversality condition (11) means that the linear spaces and together span the whole space . Of course this cannot happen if the sum of their dimensions is less than the dimension of . Therefore the transversality condition (11) implies the following dimensionality condition

(12)

In turn, the above condition (12) can be written as

(13)

or equivalently , where

(14)

That is, if , then the transversality condition (11) cannot hold and hence for a.e. it follows that for all .

Now if intersects transversally at (i.e., condition (11) holds), then the intersection forms a smooth manifold near the point . When , this manifold has dimension greater than zero, and hence the corresponding rank solution is not (locally) unique. This leads to the following result (for a formal discussion we refer to [13]).

###### Theorem III.1 (Generic lower bound and non-uniqueness of solutions).

It follows from part (i) of Theorem III.1 that for a.e. . Generically (i.e., almost surely) the following lower bound for the minimal rank holds

(16)

and (2) may have a unique optimal solution only when . Of course such equality can happen only if is an integer. As Example III.1 below shows, for any integer satisfying (16) there exists an index set such that the corresponding MRMC problem attains the minimal rank for a.e. . In particular, this shows that the lower bound (16) is tight. For a square matrix , it follows that

(17)

For and small we can approximate . For example, for and we have , and hence the bound (16) becomes .
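The dimension count above can be turned into a small numerical helper. We emphasize that the closed form used here — the smallest integer r with r(n1 + n2 − r) ≥ m — is our reconstruction of the bound (16) from the dimensionality condition (12), using the manifold dimension in (10) and m observed entries; the function name and variable names are our own.

```python
import math

def generic_rank_lower_bound(n1, n2, m):
    """Smallest integer r with r * (n1 + n2 - r) >= m.

    Reconstruction of the generic lower bound: the dimension count
    dim(rank-r manifold) + dim(null space of observations) >= n1 * n2
    reduces to r * (n1 + n2 - r) >= m; take the smaller quadratic root.
    """
    s = n1 + n2
    disc = s * s - 4 * m
    if disc < 0:
        # m exceeds r * (s - r) for every r: the count cannot be met
        raise ValueError("m larger than the maximum of r * (n1 + n2 - r)")
    return math.ceil((s - math.sqrt(disc)) / 2.0)
```

For a 1000-by-1000 matrix with 100000 observed entries this gives a lower bound of 52, matching the quadratic count: 51 * 1949 = 99399 < 100000 while 52 * 1948 = 101296 ≥ 100000.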

###### Example III.1 (Tightness of the lower bound for ).

For consider a data matrix of the following form. Here, the three sub-matrices , , , of the respective orders , and , represent the observed entry values. Cardinality of the corresponding index set is , i.e., here . Suppose that the matrix is nonsingular, i.e., its rows are linearly independent. Then any row of matrix can be represented as a (unique) linear combination of the rows of matrix . It follows that the corresponding MRMC problem has a (unique) solution of rank . In other words, the rank of the completed matrix will be equal to (the rank of the sub-matrix ), and there will be a unique matrix that achieves this rank. Now suppose that some of the entries of the matrices and are not observed, and hence the cardinality of the respective index set is less than , and thus . In that case the respective minimal rank is still , provided matrix is nonsingular, although the corresponding optimal solutions are not unique. In particular, if , i.e., only the entries of matrix are observed, then and the minimum rank is .

### III-B Uniqueness of solutions of the MRMC problem

Following Theorem III.1, for a given matrix and the corresponding minimal rank , the question is whether the corresponding solution of rank is unique. Although the set of such matrices is “thin” (in the sense that it has Lebesgue measure zero), this question of uniqueness is important, in particular for the statistical inference of rank (discussed in Section IV). Available results, based on the so-called Restricted Isometry Property (RIP) for low-rank matrix recovery from linear observations and on the coherence property for low-rank matrix completion, assert that for certain probabilistic (Gaussian) models such uniqueness holds with high probability. However, for a given matrix it can be difficult to verify whether the solution is unique (some sufficient conditions for such uniqueness are given in [8, Theorem 2]; we comment on this below).

Let us consider the following concept of local uniqueness of solutions.

###### Definition III.1.

We say that an matrix is a locally unique solution of problem (2) if and there is a neighborhood of such that for any , .

Note that rank is a lower semicontinuous function of a matrix, i.e., if is a sequence of matrices converging to matrix , then . Therefore local uniqueness of actually implies the existence of a neighborhood such that for all , , i.e., at least locally problem (2) does not have optimal solutions different from . Definition III.1 is closely related to the finitely rank- completability condition introduced in [8], which requires that the MRMC problem has a finite number of rank solutions. Of course, if problem (2) has a solution of rank that is not locally unique, then the finitely rank- completability condition cannot hold.

We now introduce some constructions associated with the manifold of matrices of rank . There are several equivalent ways to represent the tangent space to the manifold at . One way is

(18)

In an equivalent form this tangent space can be written as

(19)

where is an matrix of rank such that (referred to as a left side complement of ), and is an matrix of rank such that (referred to as a right side complement of ). We also use the linear space of matrices orthogonal (normal) to at , denoted . A matrix is orthogonal to at if and only if for all . By (18) this means that

Since and the matrices and are arbitrary, it follows that the normal space can be written as

(20)

###### Definition III.2 (Well-posedness condition).

We say that a matrix is well-posed for problem (2) if and the following condition holds

(21)

Condition (21) (illustrated in Figure 1) is natural and has a simple geometric interpretation. Intuitively, it means that the null space of the observation operator does not contain a non-trivial matrix lying in the tangent space of the low-rank matrix manifold. Hence there cannot be local deviations from the optimal solution that still satisfy the measurement constraints. This motivates introducing the well-posedness condition as a guarantee that a matrix is a locally unique solution. Note that this differs from the so-called null space property [25] and the descent cone condition [4], which are for recovering sparse vectors: the geometry there concerns sparse vectors, whereas here we deal with the manifold formed by low-rank matrices.

Now we can give sufficient conditions for local uniqueness:

###### Theorem III.2 (Sufficient conditions for local uniqueness).

###### Remark III.1.

Suppose that condition (21) does not hold, i.e., there exists a nonzero matrix . This means that there is a curve starting at and tangential to , i.e., and . Of course, if moreover for all near , then the solution is not locally unique. Although this is not guaranteed, i.e., the sufficient condition (21) may not be necessary for local uniqueness of the solution , violation of this condition implies that the solution is unstable in the sense that for some matrices close to , the distance is of order . In that sense, the well-posedness condition is necessary for local stability of solutions.

### III-C Verifiable form of the well-posedness condition

Below we present an equivalent form of the well-posedness condition that can be verified algebraically. By Theorem III.2 we have that if the matrix is well-posed, then is a locally unique solution of problem (2). Note that condition (21) implies that . That is, condition (21) implies that , or equivalently . By Theorem III.1 we have that if , then the corresponding optimal solution almost surely cannot be locally unique. Note that since the space is orthogonal to the space , by duality arguments condition (21) is equivalent to the following condition

(22)

By using formula (19) it is also possible to write condition (21) in the following form

(23)

where is a left side complement of and is a right side complement of . Recall that the column vector of matrix corresponding to component of vector is , where is the -th column of matrix and is the -th row of matrix . Condition (23) means that the column vectors , , are linearly independent. We thus obtain the following verifiable condition for checking well-posedness of a given solution:

###### Theorem III.3 (Equivalent condition of well-posedness).

Matrix satisfies condition (21) if and only if for any left side complement and right side complement of , the column vectors , , are linearly independent.

A consequence of the theorem is that if is well-posed, then necessarily , since the vectors have dimension . Since , this is equivalent to . That is, well-posedness cannot hold if . This is not surprising in view of the discussion in Section III-A.
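Theorem III.3 can be checked numerically. The sketch below takes a candidate completion Y of rank r and a boolean observation mask, forms left and right side complements from the SVD of Y, builds one Kronecker-product vector per unobserved position, and tests linear independence by a rank computation. The function name, the vectorization order, and the tolerance are our choices; they assume the complements are taken as orthonormal bases of the left and right null spaces of Y.

```python
import numpy as np

def is_well_posed(Y, mask, r, tol=1e-9):
    """Numerical check of the independence condition of Theorem III.3.

    Y:    candidate completion, assumed of rank r;
    mask: boolean array, True at observed entries.
    Returns True iff the Kronecker vectors built from the rows of the
    left/right complements, one per unobserved entry, are independent.
    """
    n1, n2 = Y.shape
    U, s, Vt = np.linalg.svd(Y)
    F = U[:, r:]        # left side complement:  F^T Y = 0, shape n1 x (n1-r)
    G = Vt[r:, :].T     # right side complement: Y G = 0,   shape n2 x (n2-r)
    unobs = [(i, j) for i in range(n1) for j in range(n2) if not mask[i, j]]
    if len(unobs) > (n1 - r) * (n2 - r):
        return False    # more vectors than the ambient dimension
    if not unobs:
        return True     # fully observed: trivially well-posed
    # one column vec(f_i g_j^T) = kron(g_j, f_i) per unobserved (i, j)
    K = np.column_stack([np.kron(G[j], F[i]) for (i, j) in unobs])
    return np.linalg.matrix_rank(K, tol=tol) == len(unobs)
```

On the 2-by-2 all-ones example discussed in Section III-E (three entries observed, one missing) the check returns True, while the reducible diagonal pattern fails it, consistent with Theorem III.5.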

Theorem III.3 also implies the following necessary condition for well-posedness of in terms of the pattern of the index set , which is related to the completability condition in [8] that each row and each column has at least observations. If the matrix is well-posed for problem (2), then each row and each column of contains at least elements of the index set . Indeed, suppose that row contains fewer than elements of . This means that the set has cardinality greater than . Let be a left side complement of and a right side complement of . Since the rows of are of dimension , the vectors , , are then linearly dependent, i.e., for some , not all zero. Then

(24)

This contradicts the requirement that the vectors , , be linearly independent. Similar arguments apply to the columns of matrix . This necessary condition for well-posedness is not surprising: if a row contains fewer than elements of , then this row is not uniquely defined in the corresponding rank solution (cf., [8]). However, although necessary, the condition that each row and each column contain at least elements of the index set is not sufficient to ensure well-posedness, as shown by Theorem III.5 below. Note that by definition the matrices and are of full rank.

### III-D Generic nature of the well-posedness condition

In a certain sense the well-posedness condition is generic, as we explain below. Denote by and the respective sets of matrices of rank . Consider the set viewed as a subset of , and the mapping defined as

Note that the sets and are open and connected, hence the set is open and connected, and the components of the mapping are polynomial functions.

Let be the Jacobian of the mapping . That is, is the matrix of partial derivatives of , taken with respect to a specified ordering of the components of the corresponding matrices. Let us consider the following concept associated with rank and index set (cf., [26]).

###### Definition III.3.

We refer to

(25)

as the characteristic rank of the mapping , and we say that is a regular point of if . We say that is regular if is regular for some .

Since is linear, the Jacobian is the same for all , i.e., for any and . Hence if a point is regular for some , then it is regular for any . Therefore regularity is actually a property of points . Since for , and the dimension of the manifold is , it follows that , where

(26)

###### Theorem III.4.

The following holds. (i) Almost every point is regular. (ii) The set of regular points forms an open subset of . (iii) For any regular point , the corresponding matrix satisfies the well-posedness condition (21) if and only if the characteristic rank is equal to . (iv) If and a point is regular, then for any in a neighborhood of there exists such that .

The significance of Theorem III.4 is that it shows that for given rank and index set , either , in which case a.e. satisfies the well-posedness condition (21), or , in which case condition (21) fails for all and generically rank solutions are not locally unique.

We have seen that a necessary condition for is that each row and each column of the considered matrix contains at least observed entries. Another necessary condition is that the index set be irreducible (see Theorem III.5). Whether these two conditions are sufficient for remains an open question, but the numerical experiments reported in Section V indicate that, in a certain probabilistic sense, the chance of encountering a not well-posed solution is negligible when is slightly less than .
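The characteristic rank can be estimated numerically at a random point, since by Theorem III.4 almost every point is regular. The sketch below assumes the mapping in question is the factorized parametrization (U, V) ↦ observed entries of U V^T, which is our reading of the construction above; the relevant comparison of the resulting Jacobian rank is then with the manifold dimension r(n1 + n2 − r).

```python
import numpy as np

def characteristic_rank(mask, r, seed=0, tol=1e-9):
    """Rank of the Jacobian of (U, V) -> observed entries of U V^T,
    evaluated at a random (hence, generically regular) point.

    mask: boolean array marking the observed pattern; r: target rank.
    """
    rng = np.random.default_rng(seed)
    n1, n2 = mask.shape
    U = rng.standard_normal((n1, r))
    V = rng.standard_normal((n2, r))
    obs = [(i, j) for i in range(n1) for j in range(n2) if mask[i, j]]
    # one Jacobian row per observed entry (i, j):
    #   d/dU[i, :] = V[j, :]   and   d/dV[j, :] = U[i, :]
    J = np.zeros((len(obs), (n1 + n2) * r))
    for row, (i, j) in enumerate(obs):
        J[row, i * r:(i + 1) * r] = V[j]
        J[row, n1 * r + j * r: n1 * r + (j + 1) * r] = U[i]
    return np.linalg.matrix_rank(J, tol=tol)
```

For a fully observed 3-by-3 pattern with r = 1 this returns 5, which equals r(n1 + n2 − r): the parametrization has (n1 + n2)r coordinates but an r-by-r scaling redundancy.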

### III-E Global uniqueness of solutions for special cases

In some rather special cases it is possible to give verifiable conditions for global uniqueness of minimum rank solutions. The following conditions are straightforward extensions of well known conditions in Factor Analysis (cf., [27, Theorem 5.1]).

###### Assumption III.1.

Suppose that: (i) for a given index , there exist index sets and such that , , and and , (ii) the submatrix of corresponding to rows and columns is nonsingular.

For example, for , part (i) of the above assumption means the existence of indexes and such that .

###### Proposition III.1.

Suppose that Assumption III.1 holds for an index . Then the minimum rank , and for any matrix such that it follows that .

Clearly part (ii) of Assumption III.1 implies that . The other assertion of the above proposition follows by observing that the submatrix of corresponding to rows and columns has rank , and hence zero determinant, and applying the Schur complement for the element . Note that, provided part (i) holds, part (ii) is generic in the sense that it holds for a.e. .

If Assumption III.1 holds for every , then uniqueness of the solution follows. This is closely related to [8, Theorem 2], but is not the same. It is assumed in [8] that every column of has observed entries. For example, consider a matrix with 3 observed entries, . The only unobserved entry, corresponding to the index , satisfies Assumption III.1, and the rank one matrix with all entries equal to 1 is the unique solution of the MRMC problem. On the other hand, the first column of the matrix has only one observed entry.
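For the rank one case, the argument behind Proposition III.1 is constructive: the vanishing 2-by-2 minor determines the unobserved entry from three observed ones. A minimal sketch (the helper name and index arguments are hypothetical placeholders for the row/column pairs in part (i) of Assumption III.1):

```python
def fill_rank_one_entry(Y, i, j, i2, j2):
    """Infer unobserved entry (i, j) of a rank one matrix from a 2x2 minor.

    Rank one forces det [[Y[i][j],  Y[i][j2]],
                         [Y[i2][j], Y[i2][j2]]] = 0,
    hence Y[i][j] = Y[i][j2] * Y[i2][j] / Y[i2][j2]  (Y[i2][j2] nonzero,
    which is the nonsingularity in part (ii) of the assumption).
    """
    return Y[i][j2] * Y[i2][j] / Y[i2][j2]
```

In the 2-by-2 all-ones example above, the three observed ones force the fourth entry to be 1, recovering the unique rank one completion.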

###### Remark III.2.

This result was observed in a much earlier paper by Wilson and Worcester [28], where an example was constructed of two symmetric matrices of rank 3 with the same off-diagonal and different diagonal elements. If we define the index set as , then this can be viewed as an example of two different locally unique solutions of rank 3. Note that here and . That is, and generically (almost surely) the rank cannot be reduced below . We discuss this example further in Section V.

### III-F Identifiability

Our results can also be used to determine whether an observation pattern is identifiable. First note that uniqueness of the minimum rank solution is invariant with respect to permutations of rows and columns of the matrix . This motivates the following definition.

###### Definition III.4.

We say that the index set is reducible if, by permutations of rows and columns, the set can be represented as the union of two disjoint sets and for some and . Otherwise we say that is irreducible.

Reducibility of the index set means that, by permutations of rows and columns, the matrix can be represented in the block diagonal form

(27)

where matrices and are of order and , respectively, with observed entries , , and , . Some entries of matrices and can also be zero if the corresponding entries of matrix are zeros.

###### Theorem III.5 (Reducible index set).

If the index set is reducible, then any minimum rank solution is not locally (and hence not globally) unique.

As was shown in Theorem III.2, if is not locally unique, then it cannot be well-posed. Therefore, if the index set is reducible, then no minimum rank solution is well-posed. Of course, even if is reducible, it can still happen that each row and column contains at least elements of the index set . That is, the condition of having elements of the index set in each row and column is not sufficient to ensure well-posedness.

###### Remark III.3.

Reducibility/irreducibility of the index set can be verified in the following way. Consider the undirected graph with vertex set , and an edge between two vertices iff or . Then is irreducible iff has exactly one connected component. A connected component of is a subgraph in which any two vertices are connected by a path, and which is connected to no additional vertices in the supergraph . There are algorithms of running time that can find every vertex reachable from a given vertex of , and hence determine a connected component of , e.g., the well-known breadth-first search algorithm [29, Section 22.2]. Note that the number of vertices in is , which can be much smaller than .
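The connected-component test of this remark can be sketched directly with breadth-first search. The vertex labels and dict-based adjacency below are our choices; note that under this test a row or column with no observed entry appears as an isolated vertex and makes the pattern reducible.

```python
from collections import deque

def is_irreducible(omega, n1, n2):
    """Check irreducibility of the index set omega via graph connectivity.

    Vertices: rows 0..n1-1 and columns 0..n2-1 of the matrix; an edge
    joins row i and column j iff (i, j) is in omega. The pattern is
    irreducible iff this bipartite graph has a single connected component.
    """
    adj = {('r', i): [] for i in range(n1)}
    adj.update({('c', j): [] for j in range(n2)})
    for (i, j) in omega:
        adj[('r', i)].append(('c', j))
        adj[('c', j)].append(('r', i))
    # breadth-first search from the first row vertex
    start = ('r', 0)
    seen = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == n1 + n2
```

The running time is linear in the number of vertices and observed entries, in line with the complexity noted in the remark.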

### III-G Uniqueness of rank one solutions

In this section we discuss uniqueness of rank one solutions of the MRMC problem (2). We show that, when the minimum rank is one, irreducibility of is sufficient for global uniqueness. We assume that all , , and that every row and every column of the matrix has at least one element . Let be a rank one solution of problem (2), i.e., there are nonzero column vectors and such that with .

Recall that permutations of the components of vector correspond to permutations of the rows of the respective rank one matrix, and permutations of the components of vector correspond to permutations of its columns. It was shown in Theorem III.5 that if the index set is reducible, then the solution cannot be locally unique. For rank one solutions the converse also holds.

###### Theorem III.6 (Global uniqueness for rank one solution).

Suppose that is irreducible, for all , and every row and every column of the matrix has at least one element , . Then any rank one solution is globally unique.
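As an illustration of the constructive side of this result, a rank one completion can be propagated along the bipartite graph of Remark III.3: fixing the scale of one factor determines all remaining factor entries whenever the pattern is irreducible and the observed entries are nonzero. A minimal sketch (the data layout, normalization, and names are our choices):

```python
from collections import deque

def complete_rank_one(obs, n1, n2):
    """Complete a rank one matrix from observed nonzero entries.

    obs: dict {(i, j): value}, all values nonzero; assumes the pattern is
    irreducible, so every factor entry u_i, v_j is reached (Theorem III.6).
    Fixes the scaling ambiguity by u_0 = 1 and propagates val = u_i * v_j
    along the bipartite row/column graph by breadth-first search.
    """
    adj = {}
    for (i, j), val in obs.items():
        adj.setdefault(('r', i), []).append((('c', j), val))
        adj.setdefault(('c', j), []).append((('r', i), val))
    u = {0: 1.0}
    v = {}
    queue = deque([('r', 0)])
    while queue:
        node = queue.popleft()
        for nbr, val in adj.get(node, []):
            kind, idx = nbr
            if kind == 'c' and idx not in v:
                v[idx] = val / u[node[1]]   # val = u_i * v_j
                queue.append(nbr)
            elif kind == 'r' and idx not in u:
                u[idx] = val / v[node[1]]
                queue.append(nbr)
    return [[u[i] * v[j] for j in range(n2)] for i in range(n1)]
```

The normalization u_0 = 1 reflects the scaling invariance of the factorization: the completed matrix itself is unique even though the factors are only determined up to a common scalar.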

### III-H Semidefinite relaxations

Consider the trace minimization problem (9) (which can be viewed as a generalized version of the nuclear norm minimization problem), and assume that the matrix is positive definite. The (Lagrangian) dual of problem (9) is the problem

(28)

For , with , problem (28) can be written (note that for ) as

(29)

We have the following uniqueness result for the SDP approach, which is a consequence of [30, Theorem 5.2] and [13, Proposition 8] (we also provide a justification in the appendix):

###### Theorem III.7.

However, we have the following observation, which is a consequence of [13, Theorem 2]:

###### Remark III.4.

Consider the minimum trace (MT) problem (7). Suppose that the matrix is observed with errors: , where is a random matrix such that converges in distribution to a random matrix whose entries have zero means and finite positive second order moments (we discuss a similar model for the MRMC in Section III-I below). Let and be optimal solutions of the MT problems of the form (7) for the matrices and , respectively. Then under mild regularity conditions