Stability and complexity of mixed discriminants

06/13/2018
by   Alexander Barvinok, et al.

We show that the mixed discriminant of n positive semidefinite n × n real symmetric matrices can be approximated within a relative error ϵ > 0 in quasi-polynomial n^{O(ln n - ln ϵ)} time, provided the distance of each matrix to the identity matrix in the operator norm does not exceed some absolute constant γ_0 > 0. We then deduce a similar result for the mixed discriminant of doubly stochastic n-tuples of matrices from the Marcus-Spielman-Srivastava bound on the roots of the mixed characteristic polynomial.


1. Introduction and main results

Mixed discriminants were introduced by A.D. Alexandrov [Al38] in his work on mixed volumes and what was later called “the Alexandrov-Fenchel inequality”. Mixed discriminants generalize permanents and also found independent applications in problems of combinatorial counting, see, for example, Chapter 5 of [BR97], as well as in determinantal point processes [C+17], [KT12]. Recently, they made a spectacular appearance in the “mixed characteristic polynomial” introduced by Marcus, Spielman and Srivastava in their solution of the Kadison-Singer problem [M+15]. Over the years, the problem of computing or approximating mixed discriminants efficiently attracted some attention [GS02], [Gu05], [CP16].

In this paper, we establish some stability properties of mixed discriminants (the absence of zeros in certain complex domains) and, as a corollary, construct efficient algorithms to approximate the mixed discriminant of some sets of matrices. For example, we show that the mixed discriminant of positive semidefinite matrices can be approximated within a relative error ϵ > 0 in quasi-polynomial time, provided the distance of each matrix to the identity matrix in the operator norm does not exceed some absolute constant γ_0 > 0. We also consider the case when all matrices have rank 2, shown to be #P-hard by Gurvits [Gu05], and provide a quasi-polynomial approximation algorithm in a particular situation.

(1.1) Definitions and properties

Let A_1, …, A_n be an n-tuple of n × n complex matrices. The mixed discriminant D(A_1, …, A_n) is defined by

D(A_1, …, A_n) = ∂^n/(∂t_1 ⋯ ∂t_n) det(t_1 A_1 + ⋯ + t_n A_n).

The determinant in the right hand side is a homogeneous polynomial of degree n in the complex variables t_1, …, t_n and D(A_1, …, A_n) is the coefficient of the monomial t_1 ⋯ t_n. It is not hard to see that D(A_1, …, A_n) is a polynomial of degree n in the entries of A_1, …, A_n: assuming that A_k = (a_{ij}^{(k)}) for k = 1, …, n, we have

(1.1.1)   D(A_1, …, A_n) = ∑_{σ, τ ∈ S_n} (sgn σ)(sgn τ) ∏_{k=1}^{n} a_{σ(k) τ(k)}^{(k)},

where S_n is the symmetric group of permutations of {1, …, n}. It follows from (1.1.1) that D(A_1, …, A_n) is linear in each argument A_k and symmetric under permutations of A_1, …, A_n, see, for example, Section 4.5 of [Ba16].
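As a quick illustration, the permutation formula (1.1.1) can be evaluated by brute force for tiny n. The sketch below is our own code (the function names are hypothetical, not from the paper) and is usable only for very small n, since it sums over all pairs of permutations; it checks, for instance, that D(I, I) = 2 and that D(A, A) = 2 det A for 2 × 2 matrices.

```python
from itertools import permutations

def perm_sign(p):
    # Sign of a permutation given as a tuple of 0-based values.
    p, s = list(p), 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            s = -s
    return s

def mixed_discriminant(mats):
    # D(A_1, ..., A_n) via the double sum over permutations in (1.1.1).
    n = len(mats)
    total = 0
    for sigma in permutations(range(n)):
        for tau in permutations(range(n)):
            term = perm_sign(sigma) * perm_sign(tau)
            for k in range(n):
                term *= mats[k][sigma[k]][tau[k]]
            total += term
    return total

I2 = [[1, 0], [0, 1]]
assert mixed_discriminant([I2, I2]) == 2            # det(t1 I + t2 I) = (t1 + t2)^2
A = [[1, 2], [3, 4]]
assert mixed_discriminant([A, A]) == 2 * (1 * 4 - 2 * 3)   # D(A, A) = 2 det A
```

The same function can be used to spot-check the linearity and symmetry of D in its arguments on small random examples.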

Mixed discriminants appear to be the most useful when the matrices Q_1, …, Q_n are positive semidefinite real symmetric (or complex Hermitian), in which case D(Q_1, …, Q_n) ≥ 0. For a real n-vector x = (x_1, …, x_n), let x ⊗ x denote the n × n matrix with the (i, j)-th entry equal to x_i x_j. It is then not hard to see that

(1.1.2)   D(x_1 ⊗ x_1, …, x_n ⊗ x_n) = (det X)²,

where X is the n × n matrix with columns x_1, …, x_n, see, for example, Section 4.5 of [Ba16]. Various applications of mixed discriminants are based on (1.1.2). Suppose that S_1, …, S_n ⊂ ℝ^n are finite sets of vectors. Let us define

Q_k = ∑_{x ∈ S_k} x ⊗ x   for k = 1, …, n.

From the linearity of D in each argument, we obtain

(1.1.3)   D(Q_1, …, Q_n) = ∑_{x_1 ∈ S_1, …, x_n ∈ S_n} (det [x_1, …, x_n])².
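Identity (1.1.2) is easy to verify symbolically. The sketch below (our own code, not from the paper) computes D directly from the definition, as the coefficient of t_1 ⋯ t_n in det(t_1 A_1 + ⋯ + t_n A_n), and compares it with (det X)² for the rank-one matrices x_k ⊗ x_k.

```python
import sympy as sp

def mixed_discriminant(mats):
    # Coefficient of t_1 ... t_n in det(t_1 A_1 + ... + t_n A_n).
    n = len(mats)
    t = sp.symbols(f"t0:{n}")
    M = sp.zeros(n, n)
    for k in range(n):
        M += t[k] * sp.Matrix(mats[k])
    c = sp.expand(M.det())
    for tk in t:
        c = c.coeff(tk, 1)   # peel off one variable at a time
    return c

xs = [sp.Matrix([1, 2, 0]), sp.Matrix([0, 1, 1]), sp.Matrix([1, 0, 3])]
rank_ones = [x * x.T for x in xs]       # the matrices x_k (tensor) x_k
X = sp.Matrix.hstack(*xs)               # matrix with columns x_1, ..., x_n
assert mixed_discriminant(rank_ones) == X.det() ** 2
```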

One combinatorial application of (1.1.3) is as follows: given a connected graph G with n + 1 vertices, color the edges of G in n colors. Then the number of spanning trees containing exactly one edge of each color is naturally expressed as a mixed discriminant. More generally, this extends to counting “rainbow bases” in regular matroids with colored elements, cf. Chapter 5 of [BR97]. Another application of (1.1.3) is in determinantal point processes [C+17].

The mixed discriminant of (positive semidefinite) matrices generalizes the permanent of a (non-negative) matrix. Namely, given n × n diagonal matrices Q_1, …, Q_n, we consider the n × n matrix A = (a_{ij}) whose k-th row is the diagonal of Q_k. It is easy to see that

D(Q_1, …, Q_n) = per A,

where the permanent of A is defined by

per A = ∑_{σ ∈ S_n} ∏_{k=1}^{n} a_{k σ(k)}.

We note that if Q_1, …, Q_n are positive semidefinite then A is a non-negative matrix and that the permanent of any non-negative square matrix can be interpreted as the mixed discriminant of positive semidefinite (diagonal) matrices.
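This reduction of permanents to mixed discriminants can be checked numerically as well. In the sketch below (our own code, with hypothetical names) the k-th diagonal matrix carries the k-th row of A on its diagonal.

```python
import sympy as sp
from math import prod
from itertools import permutations

def mixed_discriminant(mats):
    # Coefficient of t_1 ... t_n in det(t_1 Q_1 + ... + t_n Q_n).
    n = len(mats)
    t = sp.symbols(f"t0:{n}")
    M = sp.zeros(n, n)
    for k in range(n):
        M += t[k] * sp.Matrix(mats[k])
    c = sp.expand(M.det())
    for tk in t:
        c = c.coeff(tk, 1)
    return c

def permanent(rows):
    n = len(rows)
    return sum(prod(rows[k][s[k]] for k in range(n)) for s in permutations(range(n)))

rows = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
Qs = [sp.diag(*rows[k]) for k in range(3)]   # Q_k has the k-th row of A on its diagonal
assert mixed_discriminant(Qs) == permanent(rows) == 463
```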

In their solution of the Kadison-Singer problem, Marcus, Spielman and Srivastava defined the mixed characteristic polynomial of n × n matrices Q_1, …, Q_n by

μ[Q_1, …, Q_n](x) = (∏_{k=1}^{n} (1 − ∂/∂z_k)) det(x I + z_1 Q_1 + ⋯ + z_n Q_n) |_{z_1 = ⋯ = z_n = 0}

[M+15], see also [MS17]. The coefficients of μ[Q_1, …, Q_n] can be easily expressed as mixed discriminants, see Section 5.1. Finding efficiently the partition in Weaver's reformulation of the Kadison-Singer conjecture (the existence of such a partition is proven in [M+15]) reduces to bounding the roots of the mixed characteristic polynomial, which in turn makes computing the coefficients of the polynomial (which are expressed as mixed discriminants) of interest. It follows from our results that for any non-negative integer k and any ϵ > 0, fixed in advance, one can approximate the coefficient of x^k in quasi-polynomial time, provided Q_1 + ⋯ + Q_n = I and the distance from each matrix n Q_k to the identity matrix I in the operator norm does not exceed an absolute constant.
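For concreteness, the defining formula of the mixed characteristic polynomial (as in [M+15]) can be transcribed symbolically. The code below is our own illustrative sketch: it applies the operator ∏_k (1 − ∂/∂z_k) to det(x I + ∑_k z_k Q_k) and then sets z = 0. For the doubly stochastic tuple Q_k = e_k ⊗ e_k one gets (x − 1)^n.

```python
import sympy as sp

def mixed_char_poly(Qs):
    # mu[Q_1,...,Q_n](x) = prod_k (1 - d/dz_k) det(x I + sum_k z_k Q_k) at z = 0.
    n = Qs[0].rows
    x = sp.Symbol("x")
    z = sp.symbols(f"z0:{len(Qs)}")
    M = x * sp.eye(n)
    for k, Q in enumerate(Qs):
        M += z[k] * Q
    f = M.det()
    for zk in z:
        f = f - sp.diff(f, zk)      # apply (1 - d/dz_k)
    return sp.expand(f.subs({zk: 0 for zk in z}))

x = sp.Symbol("x")
Qs = [sp.Matrix([[1, 0], [0, 0]]), sp.Matrix([[0, 0], [0, 1]])]
assert mixed_char_poly(Qs) == sp.expand((x - 1) ** 2)
```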

Let B = (b_{ij}) be an n × n complex matrix. For a set S ⊂ {1, …, n}, let B_S be the |S| × |S| submatrix of B consisting of the entries b_{ij} with i, j ∈ S. Gurvits [Gu05] noticed that computing the mixed discriminant of n positive semidefinite matrices of rank 2 reduces to computing the sum

(1.1.5)   ∑_{S ⊂ {1, …, n}} det² B_S,

where for S = ∅ the corresponding term is equal to 1. Indeed, applying a linear transformation if necessary, we assume that for k = 1, …, n we have Q_k = e_k ⊗ e_k + u_k ⊗ u_k, where e_1, …, e_n is the standard basis of ℝ^n and u_1, …, u_n are some vectors in ℝ^n. From the linearity of the mixed discriminant in each argument and (1.1.2), it follows that

D(Q_1, …, Q_n) = ∑_{S ⊂ {1, …, n}} det² B_S,

where B is the n × n matrix whose k-th column is the coordinate vector of u_k, see [Gu05] for details. Computing a more general expression

∑_{S ⊂ {1, …, n}} det^m B_S

for an integer m ≥ 2 is of interest in discrete determinantal point processes [KT12].
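Our reading of the rank-2 reduction, with B the coordinate matrix of the vectors u_1, …, u_n and squared determinants of principal submatrices, can be sanity-checked symbolically. The sketch below (our own code) compares the mixed discriminant of the tuple Q_k = e_k ⊗ e_k + u_k ⊗ u_k with the subset sum.

```python
import sympy as sp
from itertools import combinations

def mixed_discriminant(mats):
    # Coefficient of t_1 ... t_n in det(t_1 Q_1 + ... + t_n Q_n).
    n = len(mats)
    t = sp.symbols(f"t0:{n}")
    M = sp.zeros(n, n)
    for k in range(n):
        M += t[k] * sp.Matrix(mats[k])
    c = sp.expand(M.det())
    for tk in t:
        c = c.coeff(tk, 1)
    return c

def subset_det_sq_sum(B):
    # Sum over subsets S of det^2 of the principal submatrix B_S;
    # the empty set contributes 1.
    n = B.rows
    total = 1
    for k in range(1, n + 1):
        for S in combinations(range(n), k):
            total += B[list(S), list(S)].det() ** 2
    return total

n = 3
us = [sp.Matrix([1, 1, 0]), sp.Matrix([0, 2, 1]), sp.Matrix([1, 0, 1])]
B = sp.Matrix.hstack(*us)                     # k-th column is u_k
e = sp.eye(n)
Qs = [e[:, k] * e[:, k].T + us[k] * us[k].T for k in range(n)]
assert mixed_discriminant(Qs) == subset_det_sq_sum(B) == 25
```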

As a ramification of our approach, we present a quasi-polynomial algorithm of complexity n^{O(ln n - ln ϵ)} approximating (1.1.5) within relative error ϵ > 0 provided ‖B‖ ≤ γ for any γ < 1, fixed in advance, where ‖·‖ is the operator norm.

(1.2) Computational complexity

Since mixed discriminants generalize permanents, they are at least as hard to compute exactly or to approximate as permanents. Moreover, it appears that mixed discriminants are substantially harder to deal with than permanents. It is shown in [Gu05] that it is a #P-hard problem to compute D(Q_1, …, Q_n) even when rank Q_k ≤ 2 for k = 1, …, n. In particular, computing (1.1.5) for a positive definite matrix B is a #P-hard problem. In contrast, the permanent of a matrix with at most 2 non-zero entries in each row is trivial to compute. A Monte Carlo Markov Chain algorithm of Jerrum, Sinclair and Vigoda [J+04] approximates the permanent of a non-negative matrix in randomized polynomial time. Nothing similar is known or even conjectured to work for the mixed discriminant of positive semidefinite matrices. A randomized polynomial time algorithm from [Ba99] approximates the mixed discriminant of n positive semidefinite matrices within a multiplicative factor of c^n, where c > 1 is an absolute constant expressed via the Euler constant γ. A deterministic polynomial time algorithm of [GS02] approximates the mixed discriminant of n positive semidefinite matrices within a multiplicative factor of e^n. As is shown in Section 4.6 of [Ba16], for any δ ≥ 1, fixed in advance, the scaling algorithm of [GS02] approximates the mixed discriminant within a multiplicative factor of c(δ)^n for some constant c(δ) < e depending on δ alone, provided the largest eigenvalue of each matrix Q_k is within a factor of δ of its smallest eigenvalue.

A combinatorial algorithm of [CP16] computes the mixed discriminant exactly in polynomial time for some class of matrices (of bounded tree width).

Our first result establishes the absence of complex zeros of the mixed discriminant if all matrices lie sufficiently close to the identity matrix. In what follows, ‖·‖ denotes the operator norm of a matrix, which in the case of a real symmetric matrix is the largest absolute value of an eigenvalue of the matrix.

(1.3) Theorem

There is an absolute constant γ_0 > 0 (an explicit value is computed in the proof) such that if Q_1, …, Q_n are n × n real symmetric matrices satisfying

‖Q_k − I‖ ≤ γ_0   for k = 1, …, n,

then for all complex z_1, …, z_n with |z_k| ≤ 1 for k = 1, …, n, we have

D(I + z_1 (Q_1 − I), …, I + z_n (Q_n − I)) ≠ 0.

We note that under the conditions of the theorem, the mixed discriminant is not confined to any particular sector of the complex plane (in other words, the reasons for the mixed discriminant to be non-zero are not quite straightforward). For example, if Q_1 = ⋯ = Q_n = zQ for a single matrix Q and a complex number z, then D(zQ, …, zQ) = z^n D(Q, …, Q) rotates n times around the origin as z ranges over the unit circle.

Applying the interpolation technique, see [Ba16], [PR17], we deduce that the mixed discriminant can be efficiently approximated if the matrices are close to the identity I in the operator norm. Let Q_1, …, Q_n be matrices satisfying the conditions of Theorem 1.3. Since D ≠ 0 in the simply connected domain (polydisc) of the n-tuples (I + z_1 (Q_1 − I), …, I + z_n (Q_n − I)) with |z_k| ≤ 1, we can choose a branch of ln D in that domain. It turns out that the logarithm of the mixed discriminant can be efficiently approximated by a low (logarithmic) degree polynomial.

(1.4) Theorem

For any 0 < ϵ < 1 and any positive integer n, there is a polynomial p = p_{n, ϵ}

in the entries of n × n real symmetric matrices Q_1, …, Q_n such that deg p = O(ln n - ln ϵ) and

|ln D(Q_1, …, Q_n) − p(Q_1, …, Q_n)| ≤ ϵ,

provided

‖Q_k − I‖ ≤ γ_0   for k = 1, …, n,

where γ_0 is the constant in Theorem 1.3.

We show that the polynomial p can be computed in quasi-polynomial n^{O(ln n - ln ϵ)} time, where the implicit constant in the “O” notation is absolute. In other words, Theorem 1.4 implies that the mixed discriminant of positive definite matrices can be approximated within a relative error ϵ in quasi-polynomial time provided for each matrix Q_k, the ratio of any two eigenvalues does not exceed the constant (1 + γ_0)/(1 − γ_0). We note that the mixed discriminant of such n-tuples can vary within a multiplicative factor exponentially large in n.

Theorem 1.4 shows that the mixed discriminant can be efficiently approximated in some open domain in the space of n-tuples of symmetric matrices. A standard argument shows that unless #P-hard problems can be solved in quasi-polynomial time, the mixed discriminant cannot be computed exactly in any open domain in quasi-polynomial time: if such a domain existed, we could compute the mixed discriminant exactly at any n-tuple as follows: we choose a line through the desired n-tuple and an n-tuple in the domain; since the restriction of the mixed discriminant onto a line is a polynomial of degree at most n, we could compute it by interpolation from its values at n + 1 points in the domain.
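The interpolation step in this argument is classical Lagrange interpolation: a univariate polynomial of degree at most n is determined by its values at n + 1 points. A minimal sketch (our own code, exact over the rationals, with a low-degree stand-in for the restriction of the mixed discriminant to a line):

```python
from fractions import Fraction

def lagrange_eval(pts, x):
    # Evaluate at x the unique polynomial of degree <= len(pts) - 1
    # passing through the points pts = [(x_i, y_i)].
    total = Fraction(0)
    for i, (xi, yi) in enumerate(pts):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(pts):
            if j != i:
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total

# a degree-3 stand-in for the restriction of the mixed discriminant to a line
p = lambda t: 2 * t**3 - t + 5
pts = [(t, p(t)) for t in range(4)]     # deg + 1 = 4 sample points
assert lagrange_eval(pts, 10) == p(10) == 1995
```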

We deduce from the Marcus-Spielman-Srivastava bound on the roots of the mixed characteristic polynomial [MS17] the following stability result for mixed discriminants.

(1.5) Theorem

Let τ_0 be the positive real solution of a certain explicit equation (see Section 5). Suppose that Q_1, …, Q_n are positive semidefinite matrices such that

∑_{k=1}^{n} Q_k = I   and   tr Q_k = 1   for k = 1, …, n.

Then

D(Z_1, …, Z_n) ≠ 0   whenever   Z_k = (1 − z)(I/n) + z Q_k   for k = 1, …, n

and a complex z with |z| ≤ τ_0.

As before, the interpolation argument produces the following algorithmic corollary.

(1.6) Theorem

For any 0 < τ < τ_0, where τ_0 is the constant of Theorem 1.5, there is a constant c = c(τ) > 0, and for any 0 < ϵ < 1 and any positive integer n there is a polynomial p

in the entries of n × n real symmetric matrices Q_1, …, Q_n such that deg p ≤ c (ln n - ln ϵ) and

|ln D(Z_1, …, Z_n) − p(Q_1, …, Q_n)| ≤ ϵ,   where Z_k = (1 − τ)(I/n) + τ Q_k for k = 1, …, n,

provided Q_1, …, Q_n are positive semidefinite matrices such that

(1.6.1)   ∑_{k=1}^{n} Q_k = I   and   tr Q_k = 1   for k = 1, …, n.

Again, the polynomial is constructed in quasi-polynomial time.

Some remarks are in order. An n-tuple (Q_1, …, Q_n) of positive semidefinite matrices satisfying (1.6.1) is called doubly stochastic. Gurvits and Samorodnitsky [GS02] proved that an n-tuple (A_1, …, A_n) of positive definite matrices can be scaled (efficiently, in polynomial time) to a doubly stochastic n-tuple, that is, one can find an invertible n × n matrix T, a doubly stochastic n-tuple (B_1, …, B_n), and positive real numbers λ_1, …, λ_n such that A_k = λ_k T B_k T^∗ for k = 1, …, n, see also Section 4.5 of [Ba16] for an exposition. Then we have

D(A_1, …, A_n) = λ_1 ⋯ λ_n |det T|² D(B_1, …, B_n),

and hence computing the mixed discriminant for any n-tuple of positive semidefinite matrices reduces to that for a doubly stochastic n-tuple. The n-tuple (I/n, …, I/n) naturally plays the role of the “center” of the set of all doubly stochastic n-tuples. Let us contract the convex body of all doubly stochastic n-tuples towards its center with a constant coefficient τ, 0 < τ < τ_0. Theorem 1.6 implies that the mixed discriminants of all contracted n-tuples are efficiently (in quasi-polynomial time) approximable. In other words, there is a “core” of the convex body of doubly stochastic n-tuples, where the mixed discriminant is efficiently approximable, and that core is just a scaled copy (with a constant, small but positive, scaling coefficient) of the whole body.
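A minimal numerical sketch of such a scaling, assuming an alternating iteration in the spirit of [GS02]: normalize each trace to 1, then conjugate by S^{-1/2} with S = ∑_k Q_k to push the sum toward I. This is our own illustrative code, not the algorithm or the analysis of [GS02]; convergence is only checked numerically on a random positive definite tuple.

```python
import numpy as np

def scale_to_doubly_stochastic(Qs, iters=500):
    # Alternate the two normalizations defining a doubly stochastic n-tuple:
    # tr Q_k = 1 for each k, and sum_k Q_k = I.
    Qs = [np.array(Q, dtype=float) for Q in Qs]
    for _ in range(iters):
        Qs = [Q / np.trace(Q) for Q in Qs]              # enforce tr Q_k = 1
        S = sum(Qs)
        w, V = np.linalg.eigh(S)
        S_inv_half = V @ np.diag(w ** -0.5) @ V.T       # S^{-1/2}
        Qs = [S_inv_half @ Q @ S_inv_half for Q in Qs]  # now sum_k Q_k = I
    return Qs

rng = np.random.default_rng(0)
n = 4
Qs = []
for _ in range(n):
    M = rng.standard_normal((n, n))
    Qs.append(M @ M.T + 0.1 * np.eye(n))                # random positive definite
B = scale_to_doubly_stochastic(Qs)
assert np.allclose(sum(B), np.eye(n))                   # sum condition
assert all(abs(np.trace(Q) - 1.0) < 1e-4 for Q in B)    # traces converge to 1
```

Accumulating the conjugating matrices and trace factors along the iteration assembles the scaling matrix and scalars whose existence [GS02] proves.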

Finally, we address the problem of computing (1.1.5). First, we prove the following stability result.

(1.7) Theorem

For an n × n complex matrix B and a set S ⊂ {1, …, n}, let B_S be the |S| × |S| submatrix of B consisting of the entries b_{ij} with i, j ∈ S. For an integer m ≥ 2, we define a polynomial

p_B(z) = ∑_{S ⊂ {1, …, n}} z^{|S|} det^m B_S,

with constant term 1 corresponding to S = ∅. If ‖B‖ ≤ 1 then

p_B(z) ≠ 0   whenever   |z| < 1.

Consequently, by interpolation we obtain the following result.

(1.8) Theorem

For any 0 < γ < 1 and any integer m ≥ 2 there is a constant c = c(γ, m) > 0 and for any 0 < ϵ < 1 and any positive integer n there is a polynomial q

in the entries of an n × n complex matrix B such that deg q ≤ c (ln n - ln ϵ) and

|ln p_B(1) − q(B)| ≤ ϵ,

provided B is an n × n matrix such that ‖B‖ ≤ γ.

The polynomial is constructed in quasi-polynomial time.

We prove Theorem 1.3 in Sections 2 and 3. We prove Theorem 1.4 in Section 4. In Section 5, we prove Theorems 1.5 and 1.6 and in Section 6, we prove Theorems 1.7 and 1.8.

2. Preliminaries

(2.1) From matrices to quadratic forms

Let ⟨·, ·⟩ be the standard scalar product in ℝ^n. With an n × n real symmetric matrix Q we associate a quadratic form q: ℝ^n ⟶ ℝ,

q(x) = ⟨Q x, x⟩   for x ∈ ℝ^n.

Given quadratic forms q_1, …, q_n, we define their mixed discriminant by

D(q_1, …, q_n) = D(Q_1, …, Q_n),

where Q_k is the matrix of q_k. This definition does not depend on the choice of an orthonormal basis in ℝ^n (as long as the scalar product remains fixed): if we change the basis, the matrices change as

Q_k ⟼ U^T Q_k U   for k = 1, …, n

and some orthogonal matrix U, and hence the mixed discriminant does not change.

The advantage of working with quadratic forms is that it allows us to define the mixed discriminant of the restriction of the forms onto a subspace. Namely, if q_1, …, q_k are quadratic forms and L ⊂ ℝ^n is a subspace with dim L = k, we make L into a Euclidean space with the scalar product inherited from ℝ^n and define the mixed discriminant D(q_1|_L, …, q_k|_L) for the restrictions q_1|_L, …, q_k|_L.

We will use the following simple lemma.

(2.2) Lemma

Let q_1, …, q_n be quadratic forms and suppose that

q_n(x) = ∑_{i=1}^{m} λ_i ⟨u_i, x⟩²,

where λ_1, …, λ_m are real numbers and u_1, …, u_m are unit vectors. Then

D(q_1, …, q_n) = ∑_{i=1}^{m} λ_i D(q_1|_{u_i^⊥}, …, q_{n−1}|_{u_i^⊥}),

where u_i^⊥ is the orthogonal complement to u_i.

Proof

This is Lemma 4.6.3 from [Ba16]. We give its proof here for completeness. By the linearity of the mixed discriminant in each argument, it suffices to check the formula when q_n(x) = ⟨u, x⟩², where u is a unit vector. Let Q_1, …, Q_{n−1} be the matrices of q_1, …, q_{n−1} in an orthonormal basis e_1, …, e_n, where e_n = u is the n-th basis vector, and hence the matrix of q_n is the matrix E_{nn} where the (n, n)-th entry is 1 and all other entries are 0.

It follows from (1.1.1) that

D(Q_1, …, Q_{n−1}, E_{nn}) = D(Q_1', …, Q_{n−1}'),

where Q_k' is the upper left (n−1) × (n−1) submatrix of Q_k. We observe that Q_k' is the matrix of the restriction q_k|_{u^⊥}. ∎

(2.3) Comparing two restrictions

Let q_1, …, q_{n−1} be quadratic forms and let u, v be unit vectors (we assume that u ≠ ±v). We would like to compare D(q_1|_{u^⊥}, …, q_{n−1}|_{u^⊥}) and D(q_1|_{v^⊥}, …, q_{n−1}|_{v^⊥}). Let L = u^⊥ ∩ v^⊥, so L is a subspace of codimension 2. Let us identify u^⊥ and v^⊥ with ℝ^{n−1} as Euclidean spaces (we want to preserve the scalar product but do not worry about bases) in such a way that L gets identified with the same hyperplane under both identifications. Hence the quadratic forms q_1|_{u^⊥}, …, q_{n−1}|_{u^⊥} get identified with some quadratic forms φ_1, …, φ_{n−1} and the quadratic forms q_1|_{v^⊥}, …, q_{n−1}|_{v^⊥} get identified with some quadratic forms ψ_1, …, ψ_{n−1}.

We have

D(q_1|_{u^⊥}, …, q_{n−1}|_{u^⊥}) = D(φ_1, …, φ_{n−1})   and   D(q_1|_{v^⊥}, …, q_{n−1}|_{v^⊥}) = D(ψ_1, …, ψ_{n−1}).

Besides,

(2.3.1)   φ_k(x) = ψ_k(x)   for all x ∈ L and all k = 1, …, n−1.

Let us denote

η_k = φ_k − ψ_k   for k = 1, …, n−1.

Hence η_1, …, η_{n−1} are quadratic forms and by (2.3.1) we have η_k(x) = 0 for all x ∈ L. It follows then that

η_k = ℓ λ_k   for k = 1, …, n−1,

where ℓ is a fixed linear form vanishing on L and λ_1, …, λ_{n−1} are some linear forms.

(2.4) Lemma

Suppose that 3 ≤ m ≤ n. Let ℓ, λ_1, …, λ_m be linear forms. For k = 1, …, m, let η_k = ℓ λ_k be quadratic forms and let q_{m+1}, …, q_n be some other quadratic forms. Then

D(η_1, …, η_m, q_{m+1}, …, q_n) = 0.

Proof

Since the restriction of a linear form onto a subspace is a linear form on the subspace, repeatedly applying Lemma 2.2, we reduce the general case to the case of m = n, in which case the mixed discriminant in question is just D(η_1, …, η_n). On the other hand, for all real t_1, …, t_n we have

t_1 η_1(x) + ⋯ + t_n η_n(x) = ℓ(x) (t_1 λ_1(x) + ⋯ + t_n λ_n(x)),

and hence the quadratic form t_1 η_1 + ⋯ + t_n η_n has rank at most 2 < n, so that det(t_1 H_1 + ⋯ + t_n H_n) = 0 identically, where H_k is the matrix of η_k. It follows by Definition 1.1 that D(η_1, …, η_n) = 0. ∎

(2.5) Corollary

Suppose that n ≥ 2 and let q_1, …, q_{n−1} be quadratic forms. Let u, v be unit vectors such that u ≠ ±v and for k = 1, …, n−1, let us define quadratic forms φ_k and ψ_k as in Section 2.3. Let η_k = φ_k − ψ_k for k = 1, …, n−1. Then

D(q_1|_{u^⊥}, …, q_{n−1}|_{u^⊥}) = D(ψ_1 + η_1, …, ψ_{n−1} + η_{n−1}).

Moreover,

D(ψ_1 + η_1, …, ψ_{n−1} + η_{n−1}) = D(ψ_1, …, ψ_{n−1}) + ∑_{i=1}^{n−1} D(ψ_1, …, ψ_{i−1}, η_i, ψ_{i+1}, …, ψ_{n−1}) + ∑_{1 ≤ i < j ≤ n−1} D(ψ_1, …, η_i, …, η_j, …, ψ_{n−1}).

If n = 2, the second sum in the right hand side is empty.

Proof

Since η_k = ℓ λ_k for k = 1, …, n−1, the proof follows by the linearity of the mixed discriminant in each argument and by Lemma 2.4. ∎

Finally, we will need a simple estimate.

(2.6) Lemma

Let u and w be complex numbers such that u, w ≠ 0 and the angle between u and w does not exceed α for some 0 ≤ α < π. Then

|u + w| ≥ (|u| + |w|) cos(α/2).

Proof

We write u = r e^{iθ} and w = ρ e^{iω} for some real θ and ω such that |θ| ≤ α/2 and |ω| ≤ α/2 (rotating u and w simultaneously, if necessary), where r = |u| and ρ = |w|. Then

|u + w| ≥ Re(u + w) = r cos θ + ρ cos ω ≥ (r + ρ) cos(α/2). ∎

3. Proof of Theorem 1.3

We prove Theorem 1.3 by induction on n. Following Section 2, we associate with the (now complex) matrices Q_1, …, Q_n the (now complex-valued) quadratic forms q_1, …, q_n, where q_k(x) = ⟨Q_k x, x⟩ is the quadratic form with matrix Q_k. If L is a subspace then the restriction of q_k onto L is just the quadratic form with matrix Q_k|_L, where Q_k|_L is the restriction (compression) of Q_k onto L. The induction is based on the following two lemmas.

(3.1) Lemma

Let us fix such that . Let , , be quadratic forms and let be complex numbers such that . Let us define

and suppose that the following conditions hold:

Then for any unit vector , we have

Proof

We have

q_n(x) = ∑_{i=1}^{n} λ_i ⟨u_i, x⟩²,

where u_1, …, u_n are the orthonormal eigenvectors of the matrix of q_n and λ_1, …, λ_n are the corresponding eigenvalues. In particular, from condition (2) of the lemma, we have

Since

from Lemma 2.2, we obtain by the linearity of the mixed discriminant

Let us choose a unit vector . Then

for some such that .

Combining (3.1.1) and (3.1.2), we get

From Lemma 2.6,

and hence

The proof then follows from (3.1.3). ∎

(3.2) Lemma

Let us fix such that

Let , , be quadratic forms and let be complex numbers such that . Let us define

and suppose that the following conditions hold:

Then for any two unit vectors , we have

for some such that

Proof

As in Section 2.3, let us construct the quadratic forms φ_k and ψ_k for k = 1, …, n−1 and the corresponding forms η_k = φ_k − ψ_k. Clearly,

and

Let

From condition (2) of the lemma, we have

From Corollary 2.5,

and

If n = 2 then the second sum is absent in the right hand side of (3.2.2).

We can write

η_k(x) = α_k ⟨w_k, x⟩² + β_k ⟨w_k', x⟩²,

where α_k and β_k are the non-trivial eigenvalues of η_k with the corresponding unit eigenvectors w_k and w_k'. By (3.2.1) we have

Applying Lemma 2.2, we obtain

where the final inequality follows by condition (1) of the lemma. For we just have

Similarly, if n ≥ 3, for every pair of indices we obtain

where each of the four terms in the right hand side is 0 if the corresponding intersection of subspaces fails to be (n−2)-dimensional. Hence we get