# Binary component decomposition Part II: The asymmetric case

This paper studies the problem of decomposing a low-rank matrix into a factor with binary entries, either from {± 1} or from {0,1}, and an unconstrained factor. The research answers fundamental questions about the existence and uniqueness of these decompositions. It also leads to tractable factorization algorithms that succeed under a mild deterministic condition. This work builds on a companion paper that addresses the related problem of decomposing a low-rank positive-semidefinite matrix into symmetric binary factors.

## 1. Motivation

Constrained matrix decompositions are among the basic methods for unsupervised data analysis. These techniques play a role in many scientific and engineering fields, ranging from environmental engineering [PT94] and neuroscience [OF96] to signal processing [Com94] and statistics [ZHT06]. Constrained factorizations are powerful tools for identifying latent structure in a matrix; they also support data compression, summarization, and visualization.

The literature contains a number of frameworks [TB99, CDS02, Tro04, Sre04, Wit10, Jag11, Bac13, Ude15, BE16, Bru17, HV19] for thinking about constrained matrix factorization and for developing algorithms that pursue these factorizations. Nevertheless, we still lack theory that fully justifies these approaches. For instance, researchers have only attained a partial understanding of which factorization models are identifiable and which ones we can compute provably using efficient algorithms.

The purpose of this paper and its companion [KT19] is to develop foundational results on factorization models that we call binary component decompositions. In these models, one (or both) of the factors takes values in the set $\{\pm 1\}$ or in the set $\{0,1\}$. Binary component decompositions are appropriate when the latent factors reflect an exclusive choice. From a mathematical perspective, these constrained factorizations also happen to be among the easiest ones to understand.

In this second paper, we consider the problem of factorizing a rectangular matrix into a binary factor and an unconstrained matrix of weights. We develop results on existence, uniqueness, tractable computation, and robustness to gross errors. Our analysis builds heavily on the work in the companion paper [KT19], which treats the problem of decomposing a positive-semidefinite matrix into symmetric binary factors.

### 1.1. Notation

We rely on standard notation from linear algebra and optimization. Scalars are written with lowercase Roman or Greek letters; lowercase bold letters denote (column) vectors; uppercase bold letters denote matrices. We reserve calligraphic letters for sets. The symbol $\lesssim$ suppresses universal constants.

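Throughout, $m$ and $n$ are natural numbers. We work in the real linear spaces $\mathbb{R}^m$ and $\mathbb{R}^n$ equipped with the standard inner product $\langle \cdot, \cdot \rangle$ and the associated norm topology. The standard basis vector $\mathbf{e}_i$ has a one in the $i$th coordinate and zeros elsewhere, while $\mathbf{e}$ is the vector of ones; the dimension of these vectors depends on context. The map ${}^{\mathsf{t}}$ transposes a vector or matrix. The binary operator $\odot$ is the Schur (i.e., componentwise) product of vectors. The closed and open probability simplices are the sets

$$\Delta_r = \Big\{ \boldsymbol{\tau} \in \mathbb{R}^r : \tau_i \geq 0 \text{ and } \sum_{i=1}^{r} \tau_i = 1 \Big\} \quad\text{and}\quad \Delta_r^{+} = \Big\{ \boldsymbol{\tau} \in \mathbb{R}^r : \tau_i > 0 \text{ and } \sum_{i=1}^{r} \tau_i = 1 \Big\}.$$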

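We write $\mathbb{H}_n$ for the linear space of $n \times n$ symmetric real matrices. The symbol $\mathbf{I}$ denotes the identity matrix, and $\mathbf{e}\mathbf{e}^{\mathsf{t}}$ is the matrix of ones; their dimensions are determined by the context. The dagger ${}^{\dagger}$ refers to the Moore–Penrose pseudoinverse. A positive-semidefinite (psd) matrix is a symmetric matrix $\mathbf{X}$ that satisfies $\mathbf{u}^{\mathsf{t}} \mathbf{X} \mathbf{u} \geq 0$ for all vectors $\mathbf{u}$ with compatible dimension. The statement $\mathbf{X} \succcurlyeq \mathbf{Y}$ means that $\mathbf{X} - \mathbf{Y}$ is psd, and $\mathbf{X} \succ \mathbf{Y}$ means that $\mathbf{X} - \mathbf{Y}$ is strictly positive definite, i.e., $\mathbf{u}^{\mathsf{t}} (\mathbf{X} - \mathbf{Y}) \mathbf{u} > 0$ for all nonzero vectors $\mathbf{u}$ with compatible dimension.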

## 2. Sign component decomposition and binary component decomposition

We begin with a short discussion of the singular-value decomposition and its properties (Section 2.1). Afterward, we introduce the two factorizations that we treat in this paper, the sign component decomposition (Section 2.2) and the binary component decomposition (Section 2.3). We present our main results on situations where these factorizations are uniquely determined and when they can be computed using efficient algorithms. An outline of the rest of the paper appears in Section 2.5.

### 2.1. The singular-value decomposition

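We begin with the singular-value decomposition (SVD), the royal emperor among all matrix factorizations. Let $\mathbf{B} \in \mathbb{R}^{n \times m}$ be a rectangular matrix. For some natural number $r$, we can decompose this matrix as

$$\mathbf{B} = \sum_{i=1}^{r} \sigma_i \mathbf{u}_i \mathbf{v}_i^{\mathsf{t}}. \tag{2.1}$$

In this expression, $\{\mathbf{u}_1, \dots, \mathbf{u}_r\} \subset \mathbb{R}^n$ and $\{\mathbf{v}_1, \dots, \mathbf{v}_r\} \subset \mathbb{R}^m$ are orthonormal families of left and right singular vectors associated with the positive singular values $\sigma_1 \geq \dots \geq \sigma_r > 0$. We can also convert the decomposition (2.1) into a matrix factorization:

$$\mathbf{B} = \mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^{\mathsf{t}} \quad\text{where}\quad \mathbf{U} = [\mathbf{u}_1 \ \dots \ \mathbf{u}_r], \quad \boldsymbol{\Sigma} = \operatorname{diag}(\sigma_1, \dots, \sigma_r), \quad \mathbf{V} = [\mathbf{v}_1 \ \dots \ \mathbf{v}_r]. \tag{2.2}$$

The matrices $\mathbf{U}$ and $\mathbf{V}$ are orthonormal; that is, $\mathbf{U}^{\mathsf{t}} \mathbf{U} = \mathbf{I}$ and $\mathbf{V}^{\mathsf{t}} \mathbf{V} = \mathbf{I}$.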

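The singular-value decomposition is intimately connected to the problem of finding a best low-rank approximation of a matrix [Mir60]. Indeed, for any unitarily invariant norm $\|\cdot\|$,

$$\min_{\operatorname{rank} \mathbf{L} = k} \|\mathbf{B} - \mathbf{L}\| = \Big\| \mathbf{B} - \sum_{i=1}^{k} \sigma_i \mathbf{u}_i \mathbf{v}_i^{\mathsf{t}} \Big\| \quad\text{for each } k = 1, \dots, r.$$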

This variational property has a wide range of consequences, both theoretical and applied.
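The variational property can be checked numerically. The following is a minimal sketch using NumPy; the test matrix, the random seed, and the helper name `truncate` are illustrative assumptions, not part of the paper. It truncates the SVD at rank $k$ and compares two unitarily invariant norms of the residual with the values predicted by the discarded singular values.

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((6, 4))  # arbitrary test matrix (illustrative only)

# Thin SVD: B = U @ diag(sigma) @ Vt, with sigma sorted in decreasing order.
U, sigma, Vt = np.linalg.svd(B, full_matrices=False)

def truncate(k):
    """Return the rank-k truncation sum_{i<=k} sigma_i u_i v_i^t (hypothetical helper)."""
    return U[:, :k] @ np.diag(sigma[:k]) @ Vt[:k, :]

k = 2
L_k = truncate(k)

# Two unitarily invariant norms of the residual B - L_k.
spectral_err = np.linalg.norm(B - L_k, 2)
frobenius_err = np.linalg.norm(B - L_k, "fro")

# The residual norms depend only on the discarded singular values.
print(spectral_err, sigma[k])
print(frobenius_err, np.sqrt(np.sum(sigma[k:] ** 2)))
```

The sketch only evaluates the truncated SVD; it does not search over all rank-$k$ matrices, so it illustrates the right-hand side of the identity rather than proving optimality.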

The singular-value decomposition also holds a distinguished place in statistics because of its connection with principal component analysis [Jol02]. Given a data matrix $\mathbf{B} \in \mathbb{R}^{n \times m}$ with standardized rows (a vector is standardized if its entries sum to zero and its Euclidean norm equals one), we can perform a singular-value decomposition to express $\mathbf{B} = \mathbf{U} \mathbf{W}^{\mathsf{t}}$, where $\mathbf{W} = \mathbf{V} \boldsymbol{\Sigma}$. In this setting, the left singular vectors are called principal components, the directions in which the columns of $\mathbf{B}$ exhibit the most variability. The entries of the matrix $\mathbf{W}$ are called weights or loadings; they are the coefficients with which we combine the principal components to express the original data points.

On the positive side of the ledger, the singular-value decomposition (2.1)–(2.2) always exists, and it is uniquely determined when the (nonzero) singular values are distinct. Moreover, we can compute the singular-value decomposition, up to a fixed (high) accuracy, by means of highly refined algorithms, in polynomial time.

On the negative side, we cannot impose constraints on the singular vectors to enforce prior knowledge about the data. Furthermore, we generally cannot assign an interpretation or meaning to the singular vectors without committing the sin of reification. Moreover, the orthogonality of singular vectors may not be an appropriate constraint in applications. Structured matrix factorizations are designed to address one or more of these shortcomings.

### 2.2. Sign component decomposition

In this project, we consider matrix factorization models where one of the factors is required to take binary values. In this section, we treat the case where the entries of the binary factor are limited to the set $\{\pm 1\}$. In Section 2.3, we turn to the case where the entries are drawn from the set $\{0,1\}$.

#### 2.2.1. The decomposition

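As before, assume that $\mathbf{B} \in \mathbb{R}^{n \times m}$ is a rectangular matrix. We seek a decomposition of the form

$$\mathbf{B} = \mathbf{S} \mathbf{W}^{\mathsf{t}} \quad\text{where}\quad \mathbf{S} \in \{\pm 1\}^{n \times r} \quad\text{and}\quad \mathbf{W} \in \mathbb{R}^{m \times r}. \tag{2.3}$$

This factorization can also be written in vector notation as

$$\mathbf{B} = \sum_{i=1}^{r} \mathbf{s}_i \mathbf{w}_i^{\mathsf{t}} \quad\text{where}\quad \mathbf{s}_i \in \{\pm 1\}^{n} \quad\text{and}\quad \mathbf{w}_i \in \mathbb{R}^{m}. \tag{2.4}$$

We call (2.3)–(2.4) an (asymmetric) sign component decomposition of the matrix $\mathbf{B}$. The left factor $\mathbf{S}$ is called the sign component; its columns $\mathbf{s}_i$ are also called sign components. The right factor $\mathbf{W}$ is unconstrained; its entries are called weights or loadings. See Figure 2.1 for an illustration.

It is not hard to show that each matrix admits a plethora of distinct sign component decompositions (2.3) where the inner dimension is $r = n$; see Proposition 4.1. It is more interesting to consider a low-rank matrix $\mathbf{B}$ and to search for minimal decompositions, those where the inner dimension $r$ of the factorization (2.3) equals the rank of $\mathbf{B}$.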

###### Remark 2.1 (Matrix sign function).

The sign component decomposition must not be confused with the matrix sign function, which is a spectral computation related to the polar factorization [Hig08, Chap. 5].

#### 2.2.2. Schur independence

The sign component decomposition (2.3)–(2.4) has a combinatorial quality, which suggests that it might be hard to find. Remarkably, there is a large class of matrices for which we can tractably compute a minimal sign component decomposition. The core requirement is that the sign components must be somewhat different. The following definition [LP96, KT19] encapsulates this idea.

###### Definition 2.2 (Schur independence of sign vectors).

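A set $\{\mathbf{s}_1, \dots, \mathbf{s}_r\} \subseteq \{\pm 1\}^n$ of sign vectors is Schur independent when the set

$$\{\mathbf{e}\} \cup \{\mathbf{s}_i \odot \mathbf{s}_j : 1 \leq i < j \leq r\} \subseteq \mathbb{R}^n$$

is linearly independent. By extension, we also say that the sign matrix $\mathbf{S} = [\mathbf{s}_1 \ \dots \ \mathbf{s}_r] \in \{\pm 1\}^{n \times r}$ is Schur independent when its columns form a Schur independent set.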
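A direct way to test the definition is a rank computation: stack the vector of ones together with all pairwise Schur products $\mathbf{s}_i \odot \mathbf{s}_j$ ($i < j$) as columns and check whether the resulting matrix has full column rank. The sketch below does this with NumPy. The function name `is_schur_independent` and the two example matrices are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from itertools import combinations

def is_schur_independent(S):
    """Rank test for Schur independence of the columns of a sign matrix S (n x r).

    Hypothetical helper: checks linear independence of
    {e} together with {s_i * s_j : i < j}.
    """
    S = np.asarray(S, dtype=float)
    n, r = S.shape
    family = [np.ones(n)]
    family += [S[:, i] * S[:, j] for i, j in combinations(range(r), 2)]
    M = np.column_stack(family)
    return bool(np.linalg.matrix_rank(M) == M.shape[1])

# Three sign vectors in dimension 4: the family {e, s1*s2, s1*s3, s2*s3}
# consists of the four columns of a 4 x 4 Hadamard matrix.
S_good = np.array([[1,  1,  1],
                   [1,  1, -1],
                   [1, -1,  1],
                   [1, -1, -1]])

# Four sign vectors in dimension 4: the family has 1 + 6 = 7 members,
# which cannot be linearly independent in R^4.
S_bad = np.array([[1,  1,  1,  1],
                  [1,  1, -1, -1],
                  [1, -1,  1, -1],
                  [1, -1, -1,  1]])

print(is_schur_independent(S_good), is_schur_independent(S_bad))
```

The second example shows the counting obstruction: a Schur independent family of $r$ sign vectors needs $1 + r(r-1)/2$ linearly independent vectors in $\mathbb{R}^n$.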

Let us summarize the basic properties of Schur independent sets [LP96, Tro18, KT19].

###### Fact 2.3 (Schur independence).

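Assume that the set $\{\mathbf{s}_1, \dots, \mathbf{s}_r\} \subseteq \{\pm 1\}^n$ of sign vectors is Schur independent. We have the following consequences.

1. The family $\{\mathbf{s}_1, \dots, \mathbf{s}_r\}$ is linearly independent.

2. Each subset of $\{\mathbf{s}_1, \dots, \mathbf{s}_r\}$ is Schur independent.

3. For any choice $\boldsymbol{\xi} \in \{\pm 1\}^r$ of signs, the set $\{\xi_1 \mathbf{s}_1, \dots, \xi_r \mathbf{s}_r\}$ remains Schur independent.

4. The cardinality $r$ of the set satisfies $r \leq \tfrac{1}{2}\big(1 + \sqrt{8n - 7}\big)$.

5. We can determine whether or not $\{\mathbf{s}_1, \dots, \mathbf{s}_r\}$ is Schur independent in polynomial time.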

Schur independence is best understood as a kind of “general position” property for sign vectors. Roughly speaking, almost all collections of sign vectors are Schur independent, provided that the cardinality meets the bound stated in Fact 2.3(4). This intuition is quantified in the paper [Tro18].

#### 2.2.3. Computation

The main result of this paper is an algorithm for computing the minimal asymmetric sign component decomposition of a low-rank matrix. This algorithm succeeds precisely when the sign component is Schur independent. Moreover, this condition is sufficient to ensure that the sign component decomposition is essentially unique.

###### Theorem I (Sign component decomposition).

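Let $\mathbf{B} \in \mathbb{R}^{n \times m}$ be a matrix that admits a sign component decomposition $\mathbf{B} = \mathbf{S} \mathbf{W}^{\mathsf{t}}$ where

1. The sign matrix $\mathbf{S} \in \{\pm 1\}^{n \times r}$ is Schur independent;

2. The weight matrix $\mathbf{W} \in \mathbb{R}^{m \times r}$ has full column rank.

Then the minimal sign component decomposition (with inner dimension $r$) is determined up to simultaneous sign flips and permutations of the columns of the factors. Algorithm 1 computes this decomposition in time polynomial in the dimensions $m$ and $n$.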

The uniqueness claim is a consequence of Theorem 4.4, while the computational claim follows from Theorem 5.1.

Theorem I identifies a rich set of factorizable matrices for which exact identification is always tractable and essentially unique. Moreover, existing denoising techniques allow us to compute the factorization in the presence of gross errors; see Section 7. Small perturbations appear more challenging; we will study this problem in future work.

It is surprising that the exact sign component decomposition is tractable. Most existing approaches to structured matrix factorization only produce approximations, and many of these approaches lack rigorous guarantees. The companion paper [KT19, Sec. 8] contains a discussion of the related work.

### 2.3. Binary component decomposition

The asymmetric sign component decomposition also serves as a primitive that allows us to compute other discrete matrix factorizations. In this section, we turn to the problem of producing a decomposition where one component takes values in the set $\{0,1\}$.

#### 2.3.1. The decomposition

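Suppose that $\mathbf{C} \in \mathbb{R}^{n \times m}$ is a rectangular matrix. We consider a decomposition of the form

$$\mathbf{C} = \mathbf{Z} \mathbf{W}^{\mathsf{t}} \quad\text{where}\quad \mathbf{Z} \in \{0,1\}^{n \times r} \quad\text{and}\quad \mathbf{W} \in \mathbb{R}^{m \times r}. \tag{2.5}$$

The vector formulation of this decomposition is

$$\mathbf{C} = \sum_{i=1}^{r} \mathbf{z}_i \mathbf{w}_i^{\mathsf{t}} \quad\text{where}\quad \mathbf{z}_i \in \{0,1\}^{n} \quad\text{and}\quad \mathbf{w}_i \in \mathbb{R}^{m}. \tag{2.6}$$

We refer to (2.5)–(2.6) as an (asymmetric) binary component decomposition of the matrix $\mathbf{C}$. The left factor $\mathbf{Z}$ is called the binary component, and its columns $\mathbf{z}_i$ are also called binary components. The right factor $\mathbf{W}$ is unconstrained; we refer to it as a weight matrix.

Every matrix admits a superabundance of distinct binary component decompositions (2.5) where the inner dimension $r = n$. We focus on the case where the matrix $\mathbf{C}$ has low rank, and the factorization is minimal; that is, the inner dimension $r$ in (2.5) equals the rank of $\mathbf{C}$.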

#### 2.3.2. Schur independence

We can reduce the problem of computing a binary component decomposition to the problem of computing a sign component decomposition.

To do so, we first observe that there is an affine map that places the binary vectors and sign vectors in one-to-one correspondence:

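$$F : \{0,1\}^n \to \{\pm 1\}^n \quad\text{where}\quad F : \mathbf{z} \mapsto 2\mathbf{z} - \mathbf{e} \quad\text{and}\quad F^{-1} : \mathbf{s} \mapsto \tfrac{1}{2}(\mathbf{s} + \mathbf{e}). \tag{2.7}$$

We can extend the map $F$ to a matrix by applying it to each column. This correspondence suggests that there should also be a concept of Schur independence for binary vectors. Here is the notion that suits our purposes.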

###### Definition 2.4 (Schur independence of binary vectors).

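A set $\{\mathbf{z}_1, \dots, \mathbf{z}_r\} \subseteq \{0,1\}^n$ of binary vectors is Schur independent when the set

$$\{\mathbf{e}\} \cup \{\mathbf{z}_i : 1 \leq i \leq r\} \cup \{\mathbf{z}_i \odot \mathbf{z}_j : 1 \leq i < j \leq r\} \subseteq \mathbb{R}^n$$

is linearly independent. By extension, we say that a binary matrix $\mathbf{Z} = [\mathbf{z}_1 \ \dots \ \mathbf{z}_r] \in \{0,1\}^{n \times r}$ is Schur independent when its columns compose a Schur independent set.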

The following result [KT19, Prop. 6.3] describes the precise connection between the two flavors of Schur independence.

###### Fact 2.5 (Kueng & Tropp).

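The binary matrix $\mathbf{Z} \in \{0,1\}^{n \times r}$ is Schur independent if and only if the sign matrix $[\, F(\mathbf{Z}) \ \ \mathbf{e} \,] \in \{\pm 1\}^{n \times (r+1)}$ is Schur independent.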
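The correspondence between binary and sign vectors, and between the two notions of Schur independence, can be explored numerically. The sketch below is a minimal illustration under stated assumptions: the helper names are hypothetical, and the sign matrix used in the comparison is the augmented matrix $[\,F(\mathbf{Z}) \ \ \mathbf{e}\,]$, which is our reading of the correspondence (the two definitions then involve the same number of vectors, $1 + r + r(r-1)/2$, and the two families span the same subspace).

```python
import numpy as np
from itertools import combinations

def full_column_rank(vectors):
    M = np.column_stack(vectors)
    return bool(np.linalg.matrix_rank(M) == M.shape[1])

def sign_schur_independent(S):
    """{e} and {s_i * s_j : i < j} are linearly independent."""
    n, r = S.shape
    family = [np.ones(n)]
    family += [S[:, i] * S[:, j] for i, j in combinations(range(r), 2)]
    return full_column_rank(family)

def binary_schur_independent(Z):
    """{e}, {z_i}, and {z_i * z_j : i < j} are linearly independent."""
    n, r = Z.shape
    family = [np.ones(n)] + [Z[:, i] for i in range(r)]
    family += [Z[:, i] * Z[:, j] for i, j in combinations(range(r), 2)]
    return full_column_rank(family)

def F(Z):
    """Affine map (2.7), applied entrywise: {0,1} -> {-1,+1}."""
    return 2 * Z - 1

def F_inv(S):
    """Inverse map: {-1,+1} -> {0,1}."""
    return (S + 1) / 2

def augmented(Z):
    """Sign matrix [F(Z), e] (assumed form of the correspondence)."""
    return np.column_stack([F(Z), np.ones(Z.shape[0])])

Z_good = np.array([[1, 1], [1, 0], [0, 1], [0, 0]], dtype=float)
Z_bad = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)  # z1 + z2 = e

for Z in (Z_good, Z_bad):
    print(binary_schur_independent(Z), sign_schur_independent(augmented(Z)))
```

In both examples the two tests agree, as expected from the identity $(2\mathbf{z}_i - \mathbf{e}) \odot (2\mathbf{z}_j - \mathbf{e}) = 4\, \mathbf{z}_i \odot \mathbf{z}_j - 2\mathbf{z}_i - 2\mathbf{z}_j + \mathbf{e}$.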

#### 2.3.3. Computation

With these definitions at hand, we can state our main result on binary component decompositions.

###### Theorem II (Binary component decomposition).

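Let $\mathbf{C} \in \mathbb{R}^{n \times m}$ be a matrix that admits a binary component decomposition $\mathbf{C} = \mathbf{Z} \mathbf{W}^{\mathsf{t}}$ where

1. The binary matrix $\mathbf{Z} \in \{0,1\}^{n \times r}$ is Schur independent;

2. The weight matrix $\mathbf{W} \in \mathbb{R}^{m \times r}$ has full column rank.

Then the minimal binary component decomposition (with inner dimension $r$) is determined up to simultaneous permutation of the columns of the factors. Algorithm 2 computes the decomposition in time polynomial in the dimensions $m$ and $n$.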

The uniqueness claim is established in Theorem 6.4, and the computational claim appears in Theorem 6.5.

### 2.4. The planted sign basis problem

Theorem I and Theorem II allow us to solve some interesting combinatorial problems in linear algebra.

###### Problem 2.6 (Planted sign basis).

Let $\mathcal{L} \subseteq \mathbb{R}^n$ be an $r$-dimensional subspace that admits a sign basis:

$$\mathcal{L} = \operatorname{span}\{\mathbf{s}_1, \dots, \mathbf{s}_r\} \quad\text{where each } \mathbf{s}_i \in \{\pm 1\}^n.$$

Given the subspace $\mathcal{L}$, find a sign basis for the subspace.

To clarify, we can assume that the problem data is a matrix $\mathbf{B} \in \mathbb{R}^{n \times m}$ whose range equals the $r$-dimensional subspace $\mathcal{L}$. We must output a set of sign vectors that generates the subspace. The brute force approach may require us to sift through exponentially many families of sign vectors. Is it possible to solve the problem more efficiently?

Let us outline a solution for Problem 2.6 in the case where $\mathcal{L}$ has a sign basis $\{\mathbf{s}_1, \dots, \mathbf{s}_r\}$ that is Schur independent. This is a rather mild deterministic condition, provided that the dimension $r$ of the subspace meets the bound in Fact 2.3(4). The hypothesis also guarantees that the basis is determined up to permutation and sign flips, per Theorem I.

Here is how we solve the problem. Let $\mathbf{B} \in \mathbb{R}^{n \times m}$ be a matrix whose range coincides with the subspace $\mathcal{L}$. A Schur independent set is linearly independent, so we can write the matrix in the form $\mathbf{B} = \mathbf{S} \mathbf{W}^{\mathsf{t}}$, where $\mathbf{S} = [\mathbf{s}_1 \ \dots \ \mathbf{s}_r]$ and the weight matrix $\mathbf{W} \in \mathbb{R}^{m \times r}$ has full column rank. As a consequence, we can apply Algorithm 1 to the matrix $\mathbf{B}$ to obtain a sign component decomposition $\mathbf{B} = \tilde{\mathbf{S}} \tilde{\mathbf{W}}^{\mathsf{t}}$. Theorem I ensures that the columns of $\tilde{\mathbf{S}}$ coincide with the columns of $\mathbf{S}$ up to sign flips and permutations. In other words, the columns of $\tilde{\mathbf{S}}$ compose the (unique) sign basis that generates $\mathcal{L}$. In summary, we can solve Problem 2.6 for any subspace that is spanned by a Schur independent family of sign vectors.

A similar procedure, using Algorithm 2, allows us to solve a variant of Problem 2.6 where we seek a planted binary basis for a subspace. Indeed, if a subspace is generated by a Schur independent family of binary vectors, then we can identify the basis up to permutation.

We continue with a discussion about symmetric sign component decompositions in Section 3. In Section 4, we develop basic results about existence and uniqueness of asymmetric sign component decompositions. Section 5 explains how to compute a sign component decomposition. We turn to binary component decomposition in Section 6. Finally, in Section 7, we state some results on robustness of sign component decomposition, which we prove in the appendices. For a discussion of related work, see the companion paper [KT19, Sec. 8].

## 3. Symmetric sign component decomposition

This section contains a summary of the principal results from the companion paper [KT19]. These results play a core role in our study of asymmetric factorizations.

### 3.1. Signed permutations

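Matrix factorizations are usually not fully determined because they are invariant under some group of symmetries. For example, consider the decomposition of a psd matrix $\mathbf{A}$ as the outer product of two symmetric factors:

$$\mathbf{A} = \mathbf{B} \mathbf{B}^{\mathsf{t}} = (\mathbf{B}\mathbf{Q})(\mathbf{B}\mathbf{Q})^{\mathsf{t}} \quad\text{for each orthogonal } \mathbf{Q}.$$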

Each of the factorizations on the right is equally valid, because there is no constraint that forbids rotations.

For binary component decompositions, permutations compose the relevant symmetry group.

###### Definition 3.1 (Permutation).

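A permutation on $r$ letters is an element $\pi$ of the symmetric group on $r$ letters. A permutation acts on $\mathbb{R}^r$ via the linear map determined by $\mathbf{e}_j \mapsto \mathbf{e}_{\pi(j)}$. This linear map can be represented by the permutation matrix $\boldsymbol{\Pi} \in \mathbb{R}^{r \times r}$ whose entries take the form $[\boldsymbol{\Pi}]_{ij} = 1$ where $i = \pi(j)$ and are zero otherwise. A permutation matrix is orthogonal: $\boldsymbol{\Pi}^{-1} = \boldsymbol{\Pi}^{\mathsf{t}}$.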

For sign component decompositions, the signed permutations make up the relevant symmetry group.

###### Definition 3.2 (Signed permutation).

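A signed permutation on $r$ letters is a pair $(\pi, \boldsymbol{\xi})$ consisting of a permutation $\pi$ on $r$ letters and a sign vector $\boldsymbol{\xi} \in \{\pm 1\}^r$. The signed permutation acts on $\mathbb{R}^r$ via the linear map determined by $\mathbf{e}_j \mapsto \xi_j \mathbf{e}_{\pi(j)}$. This linear map can also be represented by the signed permutation matrix $\boldsymbol{\Pi} \in \mathbb{R}^{r \times r}$ whose entries satisfy $[\boldsymbol{\Pi}]_{ij} = \xi_j$ when $i = \pi(j)$ and are otherwise zero. Each signed permutation matrix is orthogonal.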

### 3.2. Symmetric sign component decomposition

In the companion paper [KT19], we explored the problem of computing a (symmetric) sign component decomposition of a correlation matrix. This research provides the foundation for the asymmetric sign component decomposition. Let us take a moment to present the principal definitions and results from the associated work.

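Let $\mathbf{A} \in \mathbb{H}_n$ be a correlation matrix; that is, $\mathbf{A}$ is psd with all diagonal entries equal to one. We say that $\mathbf{A}$ has a symmetric sign component decomposition when

$$\mathbf{A} = \mathbf{S} \operatorname{diag}(\boldsymbol{\tau}) \mathbf{S}^{\mathsf{t}} \quad\text{where}\quad \mathbf{S} \in \{\pm 1\}^{n \times r} \text{ and } \boldsymbol{\tau} \in \Delta_r^{+}. \tag{3.1}$$

In vector form,

$$\mathbf{A} = \sum_{i=1}^{r} \tau_i \mathbf{s}_i \mathbf{s}_i^{\mathsf{t}} \quad\text{where}\quad \mathbf{s}_i \in \{\pm 1\}^n \quad\text{and}\quad (\tau_1, \dots, \tau_r) \in \Delta_r^{+}.$$

The sign matrix $\mathbf{S}$ is called the sign component, while the positive diagonal matrix, $\operatorname{diag}(\boldsymbol{\tau})$, is a list of convex coefficients. Not all correlation matrices admit a symmetric sign component decomposition, nor does the factorization need to be uniquely determined; see [KT19] for a full discussion.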

The situation improves markedly when the sign component is Schur independent. In this case, the sign component decomposition is essentially unique, and we can compute it by means of an efficient algorithm [KT19, Thm. I].

###### Fact 3.3 (Kueng & Tropp).

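Let $\mathbf{A} \in \mathbb{H}_n$ be a correlation matrix that admits a sign component decomposition:

$$\mathbf{A} = \mathbf{S} \operatorname{diag}(\boldsymbol{\tau}) \mathbf{S}^{\mathsf{t}} \quad\text{where}\quad \mathbf{S} \in \{\pm 1\}^{n \times r} \text{ is Schur independent and } \boldsymbol{\tau} \in \Delta_r^{+}.$$

Then the sign component decomposition of $\mathbf{A}$ is determined up to signed permutation. Moreover, with probability one, Algorithm 3 computes the sign component decomposition. That is, the output is a pair $(\tilde{\mathbf{S}}, \tilde{\boldsymbol{\tau}})$ where the sign matrix $\tilde{\mathbf{S}} = \mathbf{S} \boldsymbol{\Pi}$ and the convex coefficients $\tilde{\boldsymbol{\tau}}$ are the correspondingly permuted entries of $\boldsymbol{\tau}$, for a signed permutation matrix $\boldsymbol{\Pi}$.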

A major ingredient in the proof of Fact 3.3 is a characterization of the set of correlation matrices that are generated by a Schur independent family of sign vectors [KT19, Thm. 3.6].

###### Fact 3.4 (Kueng & Tropp).

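Suppose that $\mathbf{S} \in \{\pm 1\}^{n \times r}$ is a Schur independent sign matrix, and let $\mathbf{P} \in \mathbb{H}_n$ be the orthogonal projector onto $\operatorname{range}(\mathbf{S})$. Then

$$\{\mathbf{S} \operatorname{diag}(\boldsymbol{\tau}) \mathbf{S}^{\mathsf{t}} : \boldsymbol{\tau} \in \Delta_r\} = \{\mathbf{X} \in \mathbb{H}_n : \operatorname{trace}(\mathbf{P}\mathbf{X}) = n \text{ and } \operatorname{diag}(\mathbf{X}) = \mathbf{e} \text{ and } \mathbf{X} \succcurlyeq \mathbf{0}\}. \tag{3.2}$$

Fact 3.4 is a powerful tool for working with sign component decompositions. Indeed, we can compute the projector $\mathbf{P}$ onto the range of a Schur independent sign matrix $\mathbf{S}$ directly from any particular correlation matrix $\mathbf{A} = \mathbf{S} \operatorname{diag}(\boldsymbol{\tau}) \mathbf{S}^{\mathsf{t}}$ with $\boldsymbol{\tau} \in \Delta_r^{+}$. As a consequence, the identity (3.2) provides an alternative representation for the set of all correlation matrices with sign component $\mathbf{S}$, which allows us to optimize over this set. Fact 3.4 also plays a critical role in our method for computing an asymmetric sign component decomposition.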

## 4. Existence and uniqueness of the asymmetric sign component decomposition

In this section, we begin our investigation of the asymmetric sign component decomposition. We lay out some of the basic questions, and we start to deliver the answers.

### 4.1. Questions

This paper addresses four fundamental problems raised by the definition (2.3)–(2.4) of the asymmetric sign component decomposition:

1. Existence: Which matrices admit a sign component decomposition?

2. Uniqueness: When is the sign component decomposition unique, modulo symmetries?

3. Computation: How can we find a sign component decomposition in polynomial time?

4. Robustness: How can we find a sign component decomposition from a noisy observation?

This section treats the structural questions about existence and uniqueness of the sign component decomposition, and Section 5 explains how we can compute the factorization. Last, Section 7 describes some situations where we can extract a sign component decomposition from imperfect data.

### 4.2. Existence

We quickly dispatch the first question, which concerns the existence of asymmetric sign component decompositions.

###### Proposition 4.1 (Sign component decomposition: Existence).

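Every matrix $\mathbf{B} \in \mathbb{R}^{n \times m}$ admits a sign component decomposition (2.3) with inner dimension $r = n$.

###### Proof.

Let $\mathbf{S} \in \{\pm 1\}^{n \times n}$ be a nonsingular matrix of signs. Define the second factor $\mathbf{W} = (\mathbf{S}^{-1} \mathbf{B})^{\mathsf{t}}$. ∎

As an aside, we remark that nonsingular sign matrices are ubiquitous. Indeed, a uniformly random element of $\{\pm 1\}^{n \times n}$ is nonsingular with exceedingly high probability [Tik18].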
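The existence argument is constructive and easy to reproduce. The following sketch, assuming NumPy and an arbitrary random test matrix (the dimensions, seed, and retry loop are illustrative choices, not from the paper), draws a nonsingular sign matrix and solves for the weights so that the product reproduces the input.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 5, 3
B = rng.standard_normal((n, m))  # arbitrary target matrix (illustrative only)

# Draw uniformly random n x n sign matrices until one is nonsingular.
while True:
    S = rng.choice([-1.0, 1.0], size=(n, n))
    if np.linalg.matrix_rank(S) == n:
        break

# Second factor: W^t = S^{-1} B, so that B = S W^t with inner dimension n.
W = np.linalg.solve(S, B).T

print(np.allclose(S @ W.T, B))
```

A different nonsingular sign matrix yields a different decomposition of the same matrix, which is why the factorization with inner dimension $n$ is far from unique.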

Proposition 4.1 ensures that every matrix has an exorbitant number of sign component decompositions. Therefore, we need to burden the factorization with extra conditions before it is determined uniquely. We intend to focus on minimal factorizations, where the target matrix has rank $r$, and the number of sign components coincides with the rank.

### 4.3. Symmetries

Like many other matrix factorizations, the sign component decomposition has some symmetries that we can never resolve. Before we can turn to the question of uniqueness, we need to discuss invariants of the factorization.

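Signed permutations preserve the sign component decomposition (2.3)–(2.4) in the following sense. Suppose that $\mathbf{B} \in \mathbb{R}^{n \times m}$ has the sign component decomposition

$$\mathbf{B} = \mathbf{S} \mathbf{W}^{\mathsf{t}} = [\mathbf{s}_1 \ \dots \ \mathbf{s}_r] \, [\mathbf{w}_1 \ \dots \ \mathbf{w}_r]^{\mathsf{t}}.$$

For a signed permutation $(\pi, \boldsymbol{\xi})$ on $r$ letters with associated signed permutation matrix $\boldsymbol{\Pi}$, we have

$$\mathbf{B} = \mathbf{S} \boldsymbol{\Pi} \boldsymbol{\Pi}^{-1} \mathbf{W}^{\mathsf{t}} = (\mathbf{S} \boldsymbol{\Pi})(\mathbf{W} \boldsymbol{\Pi})^{\mathsf{t}} = [\xi_1 \mathbf{s}_{\pi(1)} \ \dots \ \xi_r \mathbf{s}_{\pi(r)}] \, [\xi_1 \mathbf{w}_{\pi(1)} \ \dots \ \xi_r \mathbf{w}_{\pi(r)}]^{\mathsf{t}}.$$

Observe that $\mathbf{S} \boldsymbol{\Pi}$ remains a sign matrix. Therefore, $\mathbf{B} = \mathbf{S} \mathbf{W}^{\mathsf{t}}$ and $\mathbf{B} = (\mathbf{S} \boldsymbol{\Pi})(\mathbf{W} \boldsymbol{\Pi})^{\mathsf{t}}$ are both sign component decompositions of $\mathbf{B}$.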
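The invariance under signed permutations can be verified on a small example. In the sketch below (NumPy; the helper `signed_permutation_matrix`, the zero-based indexing of the permutation, and the random factors are assumptions made for illustration), column $j$ of the signed permutation matrix equals $\xi_j \mathbf{e}_{\pi(j)}$, so right-multiplication permutes and flips the columns of both factors simultaneously.

```python
import numpy as np

def signed_permutation_matrix(pi, xi):
    """Matrix whose j-th column is xi[j] * e_{pi[j]} (hypothetical helper, 0-based)."""
    r = len(pi)
    Pi = np.zeros((r, r))
    for j in range(r):
        Pi[pi[j], j] = xi[j]
    return Pi

rng = np.random.default_rng(2)
n, m, r = 6, 4, 3
S = rng.choice([-1.0, 1.0], size=(n, r))  # sign component
W = rng.standard_normal((m, r))           # weights
B = S @ W.T

pi = [2, 0, 1]          # permutation of the column indices
xi = [1.0, -1.0, -1.0]  # sign flips
Pi = signed_permutation_matrix(pi, xi)

S_new, W_new = S @ Pi, W @ Pi

print(np.allclose(Pi.T @ Pi, np.eye(r)))   # orthogonality
print(np.all(np.abs(S_new) == 1.0))        # still a sign matrix
print(np.allclose(S_new @ W_new.T, B))     # same product
```

Because the signed permutation matrix is orthogonal, the sign flips and the reordering cancel in the product, while both factors keep their structure.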

We have no cause to prefer one of the sign component decompositions induced by a signed permutation over the others. Thus, it is appropriate to treat them all as equivalent.

###### Definition 4.2 (Sign component decomposition: Equivalence).

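Suppose that $\mathbf{B} = \mathbf{S} \mathbf{W}^{\mathsf{t}}$ and $\mathbf{B} = \tilde{\mathbf{S}} \tilde{\mathbf{W}}^{\mathsf{t}}$ are two sign component decompositions (2.3) with the same inner dimension $r$. We say that the decompositions are equivalent if there is a signed permutation matrix $\boldsymbol{\Pi}$ for which $\tilde{\mathbf{S}} = \mathbf{S} \boldsymbol{\Pi}$ and $\tilde{\mathbf{W}} = \mathbf{W} \boldsymbol{\Pi}$.

Alternatively, consider two sign component decompositions $\mathbf{B} = \sum_{i=1}^{r} \mathbf{s}_i \mathbf{w}_i^{\mathsf{t}}$ and $\mathbf{B} = \sum_{i=1}^{r} \tilde{\mathbf{s}}_i \tilde{\mathbf{w}}_i^{\mathsf{t}}$ with the same number $r$ of terms. The decompositions are equivalent if there is a signed permutation $(\pi, \boldsymbol{\xi})$ on $r$ letters for which $\tilde{\mathbf{s}}_i = \xi_i \mathbf{s}_{\pi(i)}$ and $\tilde{\mathbf{w}}_i = \xi_i \mathbf{w}_{\pi(i)}$ for each $i$.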

### 4.4. The role of Schur independence

As we have just seen, signed permutations preserve the class of sign component decompositions of a given matrix. Meanwhile, the proof of Proposition 4.1 warns us that we can sometimes map one sign component decomposition to an inequivalent decomposition via an invertible transformation. Remarkably, we can preclude the latter phenomenon by narrowing our attention to Schur independent sign matrices. In this case, signed permutations are the only invertible transformations that respect the sign structure.

###### Proposition 4.3 (Schur independence: Transformations).

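Let $\mathbf{S} \in \{\pm 1\}^{n \times r}$ be a Schur independent sign matrix, and let $\mathbf{Q} \in \mathbb{R}^{r \times r}$ be an invertible matrix. Then $\mathbf{S}\mathbf{Q}$ is a sign matrix if and only if $\mathbf{Q}$ is a signed permutation.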

###### Proof.

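If $\mathbf{Q}$ is a signed permutation, then it is immediate that $\mathbf{S}\mathbf{Q}$ is a sign matrix. The reverse implication is the more interesting fact.

Introduce notation for the columns of the matrices under discussion:

$$\mathbf{S} = [\mathbf{s}_1 \ \dots \ \mathbf{s}_r], \quad \mathbf{Q} = [\mathbf{q}_1 \ \dots \ \mathbf{q}_r], \quad \mathbf{S}\mathbf{Q} = [\tilde{\mathbf{s}}_1 \ \dots \ \tilde{\mathbf{s}}_r].$$

For each index $k$, the $k$th column of the matrix $\mathbf{S}\mathbf{Q}$ satisfies

$$\tilde{\mathbf{s}}_k = \mathbf{S} \mathbf{q}_k = \sum_{i=1}^{r} \langle \mathbf{e}_i, \mathbf{q}_k \rangle \, \mathbf{s}_i.$$

By assumption, $\tilde{\mathbf{s}}_k$ is a sign vector, so

$$\mathbf{e} = \tilde{\mathbf{s}}_k \odot \tilde{\mathbf{s}}_k = \sum_{i,j=1}^{r} \langle \mathbf{e}_i, \mathbf{q}_k \rangle \langle \mathbf{e}_j, \mathbf{q}_k \rangle (\mathbf{s}_i \odot \mathbf{s}_j) = \Big( \sum_{i=1}^{r} \langle \mathbf{e}_i, \mathbf{q}_k \rangle^2 \Big) \mathbf{e} + 2 \sum_{i<j} \langle \mathbf{e}_i, \mathbf{q}_k \rangle \langle \mathbf{e}_j, \mathbf{q}_k \rangle (\mathbf{s}_i \odot \mathbf{s}_j).$$

The last identity uses the fact that $\mathbf{s}_i \odot \mathbf{s}_i = \mathbf{e}$ for each sign vector. Schur independence of the matrix $\mathbf{S}$ ensures that the family $\{\mathbf{e}\} \cup \{\mathbf{s}_i \odot \mathbf{s}_j : i < j\}$ is linearly independent. As a consequence,

$$\sum_{i=1}^{r} \langle \mathbf{e}_i, \mathbf{q}_k \rangle^2 = 1 \quad\text{and}\quad \langle \mathbf{e}_i, \mathbf{q}_k \rangle \langle \mathbf{e}_j, \mathbf{q}_k \rangle = 0 \quad\text{when } i \neq j.$$

Since $\mathbf{q}_k$ solves this quadratic system, it must be a signed standard basis vector: $\mathbf{q}_k = \xi_k \mathbf{e}_{\pi(k)}$ for a sign $\xi_k \in \{\pm 1\}$ and an index $\pi(k) \in \{1, \dots, r\}$. Since the matrix $\mathbf{Q}$ is invertible, it must be the case that $\pi$ is a permutation on $r$ letters. It follows that $\mathbf{Q}$ is a signed permutation. ∎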

### 4.5. Uniqueness

With this preparation, we can delineate circumstances where the (minimal) sign component decomposition of a low-rank matrix is unique up to equivalence.

###### Theorem 4.4 (Sign component decomposition: Uniqueness).

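Consider a matrix $\mathbf{B} \in \mathbb{R}^{n \times m}$ that admits a sign component decomposition $\mathbf{B} = \mathbf{S} \mathbf{W}^{\mathsf{t}}$. Assume that

1. The sign matrix $\mathbf{S} \in \{\pm 1\}^{n \times r}$ is Schur independent;

2. The weight matrix $\mathbf{W} \in \mathbb{R}^{m \times r}$ has full column rank.

Then all minimal sign component decompositions of $\mathbf{B}$ (with inner dimension $r$) are equivalent.

###### Proof.

The sign matrix $\mathbf{S}$ has full column rank because it is Schur independent (Fact 2.3(1)), while the weight matrix $\mathbf{W}$ has full column rank by assumption. We discover that the matrix $\mathbf{B}$ has rank $r$. Therefore, every sign component decomposition of $\mathbf{B}$ has inner dimension at least $r$, and the distinguished decomposition $\mathbf{B} = \mathbf{S} \mathbf{W}^{\mathsf{t}}$ has the minimal inner dimension.

Suppose that $\mathbf{B} = \tilde{\mathbf{S}} \tilde{\mathbf{W}}^{\mathsf{t}}$ is another sign component decomposition with inner dimension $r$. Since $\mathbf{B}$ has rank $r$, both factors $\tilde{\mathbf{S}}$ and $\tilde{\mathbf{W}}$ must have full column rank. In particular, $\mathbf{S}$ and $\tilde{\mathbf{S}}$ have the same range as $\mathbf{B}$. As a consequence, there is an invertible transformation $\mathbf{Q} \in \mathbb{R}^{r \times r}$ for which $\tilde{\mathbf{S}} = \mathbf{S}\mathbf{Q}$. Since $\mathbf{S}$ is a Schur independent sign matrix and $\tilde{\mathbf{S}}$ is a sign matrix, Proposition 4.3 forces $\mathbf{Q}$ to be a signed permutation. Now, we have the chain of identities

$$\mathbf{S} \mathbf{W}^{\mathsf{t}} = \mathbf{B} = \tilde{\mathbf{S}} \tilde{\mathbf{W}}^{\mathsf{t}} = \mathbf{S} \mathbf{Q} \tilde{\mathbf{W}}^{\mathsf{t}}.$$

Since the matrix $\mathbf{S}$ has full column rank, we can cancel to see that $\mathbf{W}^{\mathsf{t}} = \mathbf{Q} \tilde{\mathbf{W}}^{\mathsf{t}}$. The signed permutation is orthogonal, so it follows that $\tilde{\mathbf{W}} = \mathbf{W} \mathbf{Q}$.

To summarize, we have been given two sign component decompositions with inner dimension $r$. We have shown that they are related by $\tilde{\mathbf{S}} = \mathbf{S}\mathbf{Q}$ and $\tilde{\mathbf{W}} = \mathbf{W}\mathbf{Q}$ for a signed permutation $\mathbf{Q}$. Therefore, the two decompositions are equivalent. ∎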

Theorem 4.4 describes conditions under which the minimal sign component decomposition of a matrix is uniquely determined. It is natural to demand that both the left and the right factors have full column rank. The geometry of the factorization problem dictates the stronger requirement that the sign matrix is Schur independent. As we have discussed, most families of sign vectors are Schur independent, so this condition holds for a rich class of matrices.

## 5. Computation of the asymmetric sign component decomposition

In this section, we derive and justify Algorithm 1, which computes the asymmetric sign component decomposition of a matrix whose sign component is Schur independent. We establish the following result.

###### Theorem 5.1 (Sign component decomposition: Computation).

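Consider a matrix $\mathbf{B} \in \mathbb{R}^{n \times m}$ that admits a sign component decomposition $\mathbf{B} = \mathbf{S} \mathbf{W}^{\mathsf{t}}$. Assume that

1. The sign matrix $\mathbf{S} \in \{\pm 1\}^{n \times r}$ is Schur independent;

2. The weight matrix $\mathbf{W} \in \mathbb{R}^{m \times r}$ has full column rank.

Then, with probability one, Algorithm 1 identifies the minimal sign component decomposition, up to signed permutation. That is, the output is a pair $(\tilde{\mathbf{S}}, \tilde{\mathbf{W}})$ where $\tilde{\mathbf{S}} = \mathbf{S} \boldsymbol{\Pi}$ and $\tilde{\mathbf{W}} = \mathbf{W} \boldsymbol{\Pi}$ for a signed permutation $\boldsymbol{\Pi}$.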

We prove Theorem 5.1 below in Section 5.2.

### 5.1. Factorization and semidefinite programming

Although constrained matrix factorization is viewed as a challenging problem, certain aspects are simpler than they appear. In particular, we can expose properties of the components of a matrix factorization by means of a semidefinite constraint.

###### Fact 5.2 (Factorization constraint).

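Let $\mathbf{B} \in \mathbb{R}^{n \times m}$ be a matrix. The semidefinite relation

$$\begin{bmatrix} \mathbf{X} & \mathbf{B} \\ \mathbf{B}^{\mathsf{t}} & \mathbf{Y} \end{bmatrix} \succcurlyeq \mathbf{0} \tag{5.1}$$

enforces a factorization of $\mathbf{B}$ in the following sense.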

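1. If $\mathbf{B} = \mathbf{U} \mathbf{V}^{\mathsf{t}}$, then (5.1) holds when $\mathbf{X} = \mathbf{U} \mathbf{U}^{\mathsf{t}}$ and $\mathbf{Y} = \mathbf{V} \mathbf{V}^{\mathsf{t}}$.

2. If (5.1) holds, then we can decompose $\mathbf{B} = \mathbf{U} \mathbf{V}^{\mathsf{t}}$ into factors $\mathbf{U} \in \mathbb{R}^{n \times r}$ and $\mathbf{V} \in \mathbb{R}^{m \times r}$ that satisfy $\mathbf{X} = \mathbf{U} \mathbf{U}^{\mathsf{t}}$ and $\mathbf{Y} = \mathbf{V} \mathbf{V}^{\mathsf{t}}$. The inner dimension meets the bound $r \leq m + n$.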

We omit the easy proof, because we do not use this result directly.
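The forward direction of the factorization constraint is easy to confirm numerically. The sketch below (NumPy, random factors chosen for illustration) assumes the natural choice $\mathbf{X} = \mathbf{U}\mathbf{U}^{\mathsf{t}}$ and $\mathbf{Y} = \mathbf{V}\mathbf{V}^{\mathsf{t}}$ for a factorization $\mathbf{B} = \mathbf{U}\mathbf{V}^{\mathsf{t}}$; the factor names are our own labels. Under this choice the block matrix is a Gram matrix, hence psd.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, r = 5, 4, 2
U = rng.standard_normal((n, r))
V = rng.standard_normal((m, r))
B = U @ V.T

# Assumed choice of the diagonal blocks.
X, Y = U @ U.T, V @ V.T

# Block matrix from the semidefinite relation (5.1).
M = np.block([[X, B], [B.T, Y]])

# M is the Gram matrix of the stacked factor [U; V], so it is psd with rank r.
G = np.vstack([U, V])
eigenvalues = np.linalg.eigvalsh(M)

print(eigenvalues.min() >= -1e-10, np.linalg.matrix_rank(M))
```

Reading the identity $\mathbf{M} = \mathbf{G}\mathbf{G}^{\mathsf{t}}$ in reverse gives the intuition for the converse direction: any psd block matrix can be factored, and the blocks of the factor provide a factorization of the off-diagonal block.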

The factorization constraint (5.1) does not give us direct access to the factors $\mathbf{U}$ and $\mathbf{V}$. Nevertheless, we can place restrictions on the variables $\mathbf{X}$ and $\mathbf{Y}$ to limit the possible values that the factors $\mathbf{U}$ and $\mathbf{V}$ can take. If the conditions are strong enough, it is sometimes possible to determine the factors completely, modulo symmetries.

###### Example 5.3 (From SVD to eigenvalue decomposition).

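Let $\mathbf{B} \in \mathbb{R}^{n \times m}$ be a matrix. Consider the semidefinite program

$$\underset{\mathbf{X} \in \mathbb{H}_n, \, \mathbf{Y} \in \mathbb{H}_m}{\text{minimize}} \quad \tfrac{1}{2}\big(\operatorname{trace}(\mathbf{X}) + \operatorname{trace}(\mathbf{Y})\big) \quad\text{subject to}\quad \begin{bmatrix} \mathbf{X} & \mathbf{B} \\ \mathbf{B}^{\mathsf{t}} & \mathbf{Y} \end{bmatrix} \succcurlyeq \mathbf{0}.$$

Every minimizer takes the form $\mathbf{X}_\star = \mathbf{U} \boldsymbol{\Sigma} \mathbf{U}^{\mathsf{t}}$ and $\mathbf{Y}_\star = \mathbf{V} \boldsymbol{\Sigma} \mathbf{V}^{\mathsf{t}}$ where $\mathbf{B} = \mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^{\mathsf{t}}$ is a singular value decomposition. We can find the left and right singular vectors of $\mathbf{B}$ by computing the eigenvalue decompositions of $\mathbf{X}_\star$ and $\mathbf{Y}_\star$. As a side note, the minimal value of the optimization problem is the Schatten 1-norm (i.e., the sum of singular values) of the matrix $\mathbf{B}$.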
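The claims in this example can be sanity-checked without an SDP solver. The sketch below (NumPy; the test matrix and variable names are illustrative assumptions) builds the candidate pair $\mathbf{X} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{U}^{\mathsf{t}}$, $\mathbf{Y} = \mathbf{V}\boldsymbol{\Sigma}\mathbf{V}^{\mathsf{t}}$ from an SVD, confirms that it satisfies the semidefinite constraint, and compares its objective value with the Schatten 1-norm. It verifies feasibility and the objective value of this candidate only; it does not itself certify optimality.

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.standard_normal((5, 3))  # arbitrary test matrix (illustrative only)

U, sigma, Vt = np.linalg.svd(B, full_matrices=False)

# Candidate pair built from the SVD.
X = U @ np.diag(sigma) @ U.T
Y = Vt.T @ np.diag(sigma) @ Vt

M = np.block([[X, B], [B.T, Y]])
min_eig = np.linalg.eigvalsh(M).min()

objective = 0.5 * (np.trace(X) + np.trace(Y))
nuclear_norm = np.linalg.norm(B, "nuc")  # sum of singular values

# The nonzero eigenvalues of X are the singular values of B.
top_eigs_X = np.sort(np.linalg.eigvalsh(X))[::-1][: len(sigma)]

print(min_eig >= -1e-10, objective, nuclear_norm)
```

The eigenvectors of `X` associated with its nonzero eigenvalues span the same space as the left singular vectors, which is the sense in which an eigenvalue decomposition recovers the SVD factors.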

As we will see, a more elaborate version of the procedure in Example 5.3 allows us to compute an asymmetric sign component decomposition. To develop this approach, we require ingredients (Fact 3.3 and Fact 3.4) from our work on symmetric sign component decomposition.

### 5.2. Overview of algorithm and proof of Theorem 5.1

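Given an input matrix $\mathbf{B} = \mathbf{S} \mathbf{W}^{\mathsf{t}}$ with a Schur independent sign component $\mathbf{S}$, our aim is to find the (unknown) asymmetric sign component decomposition. We reduce this challenge to the solved problem of computing a symmetric sign component decomposition of a correlation matrix. In this section, we outline the procedure, along with the proof of Theorem 5.1. Algorithm 1 encapsulates the computations, and some details of the argument are postponed to the next sections.

The first step is to construct a correlation matrix $\mathbf{X}_\star$ whose symmetric sign component decomposition has the same sign factor as the input matrix $\mathbf{B}$. To that end, construct the orthogonal projector $\mathbf{P} \in \mathbb{H}_n$ onto the range of $\mathbf{B}$. Then solve the semidefinite program (SDP)

$$\begin{aligned} \underset{\mathbf{X} \in \mathbb{H}_n, \, \mathbf{Y} \in \mathbb{H}_m}{\text{minimize}} \quad & \operatorname{trace}(\mathbf{Y}) \\ \text{subject to} \quad & \operatorname{trace}(\mathbf{P}\mathbf{X}) = n \text{ and } \operatorname{diag}(\mathbf{X}) = \mathbf{e}; \\ & \begin{bmatrix} \mathbf{X} & \mathbf{B} \\ \mathbf{B}^{\mathsf{t}} & \mathbf{Y} \end{bmatrix} \succcurlyeq \mathbf{0}. \end{aligned} \tag{5.2}$$

Fact 5.2 shows that the semidefinite constraint in (5.2) links the variables $\mathbf{X}$ and $\mathbf{Y}$ to a factorization of $\mathbf{B}$. Meanwhile, courtesy of Fact 3.4, the equality constraints in (5.2) force the variable $\mathbf{X}$ to be a correlation matrix whose range equals the range of $\mathbf{S}$. The following proposition packages these claims.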

###### Proposition 5.4 (Factorization SDP).

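Instate the assumptions of Theorem 5.1. Let $(X_\star, Y_\star)$ be the unique minimizer of the optimization problem (5.2). Then $X_\star = S \operatorname{diag}(\tau_\star) S^{t}$, where $S$ is the sign component of $B$ and $\tau_\star \in \Delta_r$ has strictly positive entries.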

We prove a more detailed version of Proposition 5.4 below in Section 5.4.

The next step is to extract the sign component of the correlation matrix $X_\star$ that solves (5.2). According to Proposition 5.4, the correlation matrix $X_\star$ meets the requirements of Fact 3.3. Therefore, we can invoke Algorithm 3, the symmetric sign component decomposition method, to obtain a factorization

$$X_\star = \tilde{S} \operatorname{diag}(\tilde{\tau}) \tilde{S}^{t} \quad \text{where} \quad \tilde{S} = S \Pi \ \text{ for a signed permutation } \Pi.$$

We cannot resolve the signed permutation, but the computed sign component $\tilde{S}$ is equivalent with the designated sign component $S$.

To complete the sign component decomposition, it remains to determine the weight matrix. We may do so by solving the linear system

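$$\text{find} \quad \tilde{W} \in \mathbb{R}^{m \times r} \quad \text{subject to} \quad B = \tilde{S} \tilde{W}^{t}.$$

The solution exists because $B = SW^{t} = \tilde{S} (W \Pi)^{t}$. The solution is unique because $\tilde{S}$ has Schur independent columns, and so its columns are also linearly independent (Fact 2.3(1)).

The pair $(\tilde{S}, \tilde{W})$ yields a sign component decomposition of the matrix $B$ that is equivalent with the specified decomposition $B = SW^{t}$. This observation completes the proof of Theorem 5.1.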
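The weight-recovery step can be sketched in a few lines. The following is a minimal illustration, not the paper's Algorithm 1: it assumes the sign factor $\tilde{S}$ has already been computed, and it solves $B = \tilde{S}\tilde{W}^{t}$ through the normal equations with exact rational arithmetic. The helper names (`times_transpose`, `apply_signed_permutation`, `solve_weights`) are hypothetical and introduced only for this example.

```python
from fractions import Fraction


def times_transpose(A, B):
    """Return the product A B^t for matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row_a, row_b)) for row_b in B] for row_a in A]


def apply_signed_permutation(M, perm, signs):
    """Return M Pi, where column j of the result is signs[j] * (column perm[j] of M)."""
    return [[signs[j] * row[perm[j]] for j in range(len(perm))] for row in M]


def solve_weights(S_tilde, B):
    """Solve B = S_tilde W^t for W (m x r), assuming S_tilde has independent columns."""
    n, r, m = len(S_tilde), len(S_tilde[0]), len(B[0])
    # Normal equations: (S^t S) X = S^t B, where X = W^t has shape r x m.
    gram = [[Fraction(sum(S_tilde[i][a] * S_tilde[i][b] for i in range(n)))
             for b in range(r)] for a in range(r)]
    rhs = [[Fraction(sum(S_tilde[i][a] * B[i][j] for i in range(n)))
            for j in range(m)] for a in range(r)]
    aug = [gram[a] + rhs[a] for a in range(r)]
    # Gauss-Jordan elimination; next() raises StopIteration if the Gram matrix is singular.
    for col in range(r):
        pivot = next(row for row in range(col, r) if aug[row][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for row in range(r):
            if row != col and aug[row][col] != 0:
                factor = aug[row][col]
                aug[row] = [v - factor * p for v, p in zip(aug[row], aug[col])]
    X = [row[r:] for row in aug]
    # Transpose X (r x m) to obtain W (m x r).
    return [[X[k][j] for k in range(r)] for j in range(m)]


# Illustrative data: S is n x r with entries +-1, W is m x r.
S = [[1, 1], [1, -1], [-1, 1]]
W = [[2, 0], [1, 3]]
B = times_transpose(S, W)

# Pretend the symmetric method returned S up to a signed permutation.
S_tilde = apply_signed_permutation(S, perm=[1, 0], signs=[-1, 1])
W_tilde = solve_weights(S_tilde, B)
print(W_tilde)
```

In this toy run the recovered weights coincide with $W\Pi$ for the same signed permutation, so the pair $(\tilde{S}, \tilde{W})$ reproduces $B$ exactly. A floating-point least-squares routine would be the usual choice at scale; exact fractions are used here only to keep the check free of rounding.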

### 5.3. Positive-semidefinite matrices

It remains to establish Proposition 5.4. The argument depends on core properties of psd matrices, which we collect here. For references, see [Bha97, Bha07].

###### Fact 5.5 (Conjugation rule).

Conjugation respects the semidefinite order in the following sense.

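1. If $A \preccurlyeq H$, then $K A K^{t} \preccurlyeq K H K^{t}$ for each matrix $K$ with compatible dimensions.

2. If $K$ has full column rank and $K A K^{t} \preccurlyeq K H K^{t}$, then $A \preccurlyeq H$.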

###### Fact 5.6 (Schur complements).

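Assume that $X$ is a (strictly) positive-definite matrix. Then

$$\begin{bmatrix} X & K \\ K^{t} & Y \end{bmatrix} \succcurlyeq 0 \quad \text{if and only if} \quad Y \succcurlyeq K^{t} X^{-1} K.$$

Related results hold when $X$ is merely psd.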

###### Fact 5.7 (Trace is monotone).

Let $A$ and $H$ be psd matrices that satisfy $A \preccurlyeq H$. Then $\operatorname{trace}(A) \leq \operatorname{trace}(H)$, and equality holds precisely when $A = H$.

### 5.4. The Factorization SDP

We are now prepared to prove Proposition 5.4, which describes the solution of the factorization SDP (5.2). The proposition follows instantly from a more precise lemma.

###### Lemma 5.8 (Factorization SDP).

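Instate the assumptions of Theorem 5.1. Construct the orthogonal projector $P$ onto the range of $B$. Write $w_1, \dots, w_r$ for the columns of $W$, and define the positive-definite diagonal matrix

$$D = \operatorname{diag}\big( \|w_1\|_{\ell_2}, \dots, \|w_r\|_{\ell_2} \big).$$

Then the unique solution to the semidefinite optimization problem (5.2) is the pair

$$X_\star = (\operatorname{trace} D)^{-1} S D S^{t} \quad \text{and} \quad Y_\star = (\operatorname{trace} D)\, W D^{-1} W^{t}.$$

###### Proof.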

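Recall that $B = SW^{t}$ for a Schur independent sign matrix $S \in \{\pm 1\}^{n \times r}$ and a matrix $W \in \mathbb{R}^{m \times r}$ with full column rank.

First, we argue that a feasible point of the factorization SDP (5.2) must be a correlation matrix of the form

$$X = S \operatorname{diag}(\tau) S^{t} \quad \text{for } \tau \in \Delta_r. \tag{5.3}$$

Indeed, the block matrix constraint in (5.2) ensures that $X \succcurlyeq 0$, and the constraint $\operatorname{diag}(X) = e$ makes $X$ a correlation matrix. At the same time, since the matrix $W$ has full column rank,

$$\operatorname{range}(P) = \operatorname{range}(B) = \operatorname{range}(SW^{t}) = \operatorname{range}(S).$$

Fact 3.4 shows that the constraint $\operatorname{trace}(PX) = n$ isolates the family (5.3) of correlation matrices. This establishes the claim.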

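Next, substitute the expression (5.3) into the block matrix constraint in (5.2) and use the condition $B = SW^{t}$ to factorize:

$$\begin{bmatrix} X & B \\ B^{t} & Y \end{bmatrix} = \begin{bmatrix} S & 0 \\ 0 & I \end{bmatrix} \begin{bmatrix} \operatorname{diag}(\tau) & W^{t} \\ W & Y \end{bmatrix} \begin{bmatrix} S & 0 \\ 0 & I \end{bmatrix}^{t} \succcurlyeq 0.$$

Since $S$ is Schur independent, it has full column rank (Fact 2.3(1)), and so does the block-diagonal matrix in the last display. Therefore, the conjugation rule (Fact 5.5) implies that the psd constraint in the last display is equivalent with the condition

$$\begin{bmatrix} \operatorname{diag}(\tau) & W^{t} \\ W & Y \end{bmatrix} \succcurlyeq 0. \tag{5.4}$$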

Now, we can recognize that $\operatorname{diag}(\tau)$ is a strictly positive-definite matrix. Indeed, owing to (5.4), the relation $\tau_k = 0$ would imply that the corresponding column $w_k$ of the weight matrix $W$ equals zero, but this is impossible because $W$ has full column rank.

Apply the Schur complement rule (Fact 5.6) to the matrix (5.4) to confirm that

$$Y \succcurlyeq W \operatorname{diag}(\tau)^{-1} W^{t}.$$

The objective function, $\operatorname{trace}(Y)$, of the semidefinite program (5.2) is strictly monotone with respect to the semidefinite order (Fact 5.7). The variable $Y$ is otherwise unconstrained, so for each fixed $\tau$ the SDP achieves its minimum if and only if

$$Y = W \operatorname{diag}(\tau)^{-1} W^{t}.$$

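It remains to determine the vector $\tau \in \Delta_r$ that minimizes the trace of $Y$.

To that end, write $w_k$ for the $k$th column of $W$ and calculate that

$$\operatorname{trace}\big( W \operatorname{diag}(\tau)^{-1} W^{t} \big) = \sum_{k=1}^{r} \frac{\|w_k\|_{\ell_2}^2}{\tau_k} = \Big( \sum_{k=1}^{r} \frac{\|w_k\|_{\ell_2}^2}{\tau_k} \Big) \Big( \sum_{k=1}^{r} \tau_k \Big) \geq \Big( \sum_{k=1}^{r} \|w_k\|_{\ell_2} \Big)^2.$$

The second relation holds because the entries of $\tau$ sum to one, and the inequality is Cauchy-Schwarz. Equality holds if and only if the quantities $\|w_k\|_{\ell_2} / \tau_k$ are identical for all indices $k$. Since $\tau \in \Delta_r$ has unit sum, we may conclude that the minimizer has coordinates

$$(\tau_\star)_k = \|w_k\|_{\ell_2} \Big( \sum_{i=1}^{r} \|w_i\|_{\ell_2} \Big)^{-1} \quad \text{for each index } k.$$

In summary, we have shown that the unique matrices that optimize (5.2) take the form

$$X_\star = S \operatorname{diag}(\tau_\star) S^{t} \quad \text{and} \quad Y_\star = W \operatorname{diag}(\tau_\star)^{-1} W^{t}.$$

Identify the diagonal matrix $D$ from the statement, noting that $\operatorname{diag}(\tau_\star) = (\operatorname{trace} D)^{-1} D$, to complete the proof. ∎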
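The closed form for the minimizing weights is easy to check numerically. The sketch below is an illustration under stated assumptions, not part of the paper's algorithms: it takes a small weight matrix $W$ (stored as a list of rows), computes $\tau_\star$ from the column norms, and evaluates the objective $\operatorname{trace}(W \operatorname{diag}(\tau)^{-1} W^{t}) = \sum_k \|w_k\|_{\ell_2}^2 / \tau_k$. The function names are hypothetical.

```python
import math


def column_norms(W):
    """Euclidean norms of the columns w_1, ..., w_r of an m x r matrix W."""
    r = len(W[0])
    return [math.sqrt(sum(row[k] ** 2 for row in W)) for k in range(r)]


def optimal_tau(W):
    """Closed-form minimizer: tau_k = ||w_k|| / sum_i ||w_i||."""
    norms = column_norms(W)
    total = sum(norms)
    return [nk / total for nk in norms]


def trace_objective(W, tau):
    """trace(W diag(tau)^{-1} W^t) = sum_k ||w_k||^2 / tau_k, for strictly positive tau."""
    return sum(nk ** 2 / tk for nk, tk in zip(column_norms(W), tau))


# Illustrative weights with column norms 5 and 1.
W = [[3.0, 0.0], [4.0, 0.0], [0.0, 1.0]]
tau_star = optimal_tau(W)
print("tau_star:", tau_star)
print("objective at tau_star:", trace_objective(W, tau_star))
print("objective at uniform tau:", trace_objective(W, [0.5, 0.5]))
```

At $\tau_\star$ the objective agrees, up to rounding, with $\big(\sum_k \|w_k\|_{\ell_2}\big)^2$, which is the lower bound in the Cauchy-Schwarz step; any other point of the simplex, such as the uniform vector, gives a strictly larger value.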

## 6. Asymmetric binary component decomposition

In this section, we develop a procedure (Algorithm 2) for computing an asymmetric binary component decomposition (2.5)–(2.6). We prove Theorem II, which states that the algorithm succeeds under a Schur independence condition. Our approach reduces the problem of computing a binary component decomposition to the problem of computing a sign component decomposition of a related matrix.

### 6.1. Correspondence between binary vectors and sign vectors

As we have discussed, there is a one-to-one correspondence between sign vectors and binary vectors (2.7). The correspondence between asymmetric sign component decompositions and binary component decompositions, however, is more subtle because they are invariant under different transformations. Indeed, for a sign vector $s$ and a weight vector $w$, the matrix $sw^{t}$ does not change if we flip the sign of both $s$ and $w$. On the other hand, for a nonzero binary vector $z$ and a nonzero weight vector $w$, the matrix $zw^{t}$ completely determines the vectors $z$ and $w$.

### 6.2. Reducing binary component decomposition to sign component decomposition

Given a matrix $C$ that has a binary component decomposition, we can apply a simple transformation to construct a related matrix $B$ that admits a sign component decomposition.

###### Proposition 6.1 (Binary component decomposition: Reduction).

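Consider a matrix $C \in \mathbb{R}^{n \times m}$ that has a binary component decomposition

$$C = Z W^{t} \quad \text{where} \quad Z \in \{0, 1\}^{n \times r} \ \text{ and } \ W \in \mathbb{R}^{m \times r}.$$

Construct the matrix

$$B = F(C) = 2C - E \in \mathbb{R}^{n \times m}.$$

Then $B$ admits a sign component decomposition with inner dimension $r + 1$:

$$B = \begin{bmatrix} S & e \end{bmatrix} \begin{bmatrix} W & We - e \end{bmatrix}^{t} \quad \text{where} \quad S = F(Z). \tag{6.1}$$

Recall that $E$ is a matrix of ones with appropriate dimensions.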
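The identity (6.1) can be checked on a small instance. The sketch below is an illustration only: the matrices $Z$ and $W$ are made-up data, and the helper names are hypothetical. It forms $C = ZW^{t}$, applies the map $F(C) = 2C - E$ entrywise, and compares the result with the product of the augmented factors $[S \ \ e]$ and $[W \ \ We - e]$.

```python
def times_transpose(A, B):
    """Return the product A B^t for matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row_a, row_b)) for row_b in B] for row_a in A]


def affine_map(M):
    """Entrywise map F(M) = 2M - E, which sends 0/1 entries to -1/+1."""
    return [[2 * entry - 1 for entry in row] for row in M]


def augmented_sign_factors(Z, W):
    """Build [S e] (n x (r+1)) and [W, We - e] (m x (r+1)) as in (6.1)."""
    S_aug = [row + [1] for row in affine_map(Z)]
    W_aug = [row + [sum(row) - 1] for row in W]
    return S_aug, W_aug


# Illustrative data: Z is n x r binary, W is m x r real.
Z = [[1, 0], [0, 1], [1, 1]]
W = [[2, -1], [0, 3]]

C = times_transpose(Z, W)           # binary component decomposition C = Z W^t
B = affine_map(C)                   # B = 2C - E
S_aug, W_aug = augmented_sign_factors(Z, W)
print(B == times_transpose(S_aug, W_aug))
```

The extra column $e$ in the sign factor absorbs the constant shift introduced by $F$, which is why the inner dimension grows from $r$ to $r + 1$.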

###### Proof.

The result follows from a straightforward calculation:

 B