1. Introduction and main results
(1.1) Linear equations in 0-1 vectors
Let be an real matrix and let be a real -vector, where . As is well-known, the problem of deciding whether there is a solution to the system of linear equations in 0-1 variables
is NP-hard, while counting all such solutions is a #P-hard problem. Motivated by the general difficulty of the problem and inspired by ideas from statistical physics, we suggest a way of “smoothed counting”, which in some non-trivial cases turns out to be computationally feasible, at least in theory, and gives some information about “near-solutions” that satisfy the equations within a certain error.
Let us fix some , , interpreted as “weights” of the equations in (1.1.1). Suppose further that are independent Bernoulli random variables, so that
for some . Our goal is to compute the expectation
Hence every solution to (1.1.1) is counted in (1.1.3) with weight 1, while any other 0-1 vector is counted with a weight that is exponentially small in the number of violated constraints and the “severity” of violations. Clearly, (1.1.3) is always an upper bound on the probability that is a solution to (1.1.1), and for larger we get sharper bounds.
The choice of probabilities is motivated by the specifics of the problem. For example, if we pick for all
then the probability distribution concentrates around vectors satisfying, so we zoom in on the solutions of (1.1.1) having approximately coordinates equal to 1. In Sections 1.4 and 1.5 we give examples of some reasonable choices of the probability distribution in (1.1.2).
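For small instances, the expectation (1.1.3) can be evaluated by direct enumeration over the cube. The sketch below assumes the natural quadratic penalty exp(-Σ_i γ_i(⟨a_i, x⟩ − b_i)²), which is one reading of the weighting described above (the exact displayed formula is not reproduced here); the function name is ours.

```python
import itertools
import math

def smoothed_count(A, b, gamma, p):
    """Brute-force evaluation of the expectation (1.1.3):
    E exp(-sum_i gamma_i * (<a_i, x> - b_i)^2), where the coordinates
    x_j are independent Bernoulli with P(x_j = 1) = p[j].
    Exponential in n; intended only as a reference implementation."""
    m, n = len(A), len(A[0])
    total = 0.0
    for x in itertools.product((0, 1), repeat=n):
        # probability of the 0-1 vector x under the product measure
        prob = math.prod(p[j] if x[j] else 1 - p[j] for j in range(n))
        # weighted sum of squared violations of the equations Ax = b
        penalty = sum(gamma[i] * (sum(A[i][j] * x[j] for j in range(n)) - b[i]) ** 2
                      for i in range(m))
        total += prob * math.exp(-penalty)
    return total
```

As the weights γ_i grow, each violating vector is suppressed and the value approaches the probability that x is an exact solution, consistent with the upper-bound remark above.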
Given matrix , vector and weights , we consider the polynomial
in complex variables , where we agree that . Hence the expected value (1.1.3) is written as
To compute the value of at a particular point
we use the interpolation method, see Section 1.3 for details. For that method to work, one should show that
for all in some connected open set containing points and . We establish a sufficient condition for
for all in a polydisc
We prove the following main result.
Suppose that each column of the matrix contains not more than non-zero entries for some integer . Given real numbers for , we define
as long as
In Section 1.4, we give an example of the values of the parameters allowed by the theorem in the case of smoothed counting of perfect matchings in hypergraphs, where we have for all . We also note that if all , the constraints simplify to for and
The immediate conclusion from Theorem 1.2 is the following algorithmic result.
Let us fix a and let be as in Theorem 1.2. Then for any given such that
and any , the value of
can be approximated within relative error in time, where the implicit constant in the “” notation depends on only. For that, we define a univariate polynomial
Thus , we need to approximate , and by Theorem 1.2 we have
As discussed in [Ba16], Section 2.2, under these conditions, one can approximate in within relative error in from the values of the derivatives
where we agree that . From (1.1.4), we have
The direct enumeration of all 0-1 vectors with takes time and since , we get the complexity. Here we assume that for any , the computation of the expression
takes unit time. In the bit model of computation, the complexity of the algorithm acquires an additional factor of
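The core step of the interpolation method can be sketched as follows: given the low-order coefficients of a univariate polynomial g with no zeros in a disc of radius greater than 1, one computes the Taylor coefficients of log g at 0 from the identity g·(log g)′ = g′ and approximates log g(1) by a truncated Taylor series. This is a generic illustration of the approach discussed in [Ba16], Section 2.2; the function names are ours.

```python
import math

def log_taylor_coeffs(c, m):
    """Taylor coefficients f_1, ..., f_m of f(z) = log g(z) at z = 0,
    where g(z) = sum_k c[k] z^k and c[0] != 0.  From g f' = g':
    k f_k c_0 = k c_k - sum_{j=1}^{k-1} j f_j c_{k-j}."""
    f = [0.0] * (m + 1)
    for k in range(1, m + 1):
        s = k * (c[k] if k < len(c) else 0.0)
        s -= sum(j * f[j] * c[k - j] for j in range(1, k) if k - j < len(c))
        f[k] = s / (k * c[0])
    return f

def approx_log_g_at_1(c, m):
    """Approximate log g(1) by the degree-m Taylor polynomial of log g.
    Accurate when g has no zeros in a disc of radius > 1, as the
    zero-freeness results of the type of Theorem 1.2 guarantee."""
    f = log_taylor_coeffs(c, m)
    return math.log(c[0]) + sum(f[1:])
```

For example, with g(z) = (1 + 0.3z)³ the truncated series converges rapidly to 3·log(1.3), reflecting the quasi-polynomial trade-off between the number of derivatives used and the relative error.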
Suppose that we are in a situation where Theorem 1.2 allows us to approximate the expectation (1.1.3) efficiently. If, additionally, for all and , then for any , we can compute efficiently a particular 0-1 vector such that
This is done by a standard application of the method of conditional expectations, see, for example, Chapter 5 of [MR95]. Indeed, the algorithm allows us to compute any conditional expectation defined by any set of constraints of the type or . Imposing a condition of this type reduces to computing the value of , where is a submatrix of obtained by deleting columns corresponding to the prescribed values of and where for we have for (here we use that for all and ). Hence Theorem 1.2 guarantees that the value of can still be efficiently approximated. Successively testing the conditions or , and choosing each time the one with the larger conditional expectation, we compute the desired vector , while increasing the complexity roughly by a factor of .
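The rounding procedure above can be sketched as follows. The conditional-expectation oracle is implemented here by direct enumeration (in the paper it would be replaced by the interpolation algorithm of Section 1.3), and a quadratic penalty is assumed for illustration; all names are ours.

```python
import itertools
import math

def cond_expectation(A, b, gamma, p, fixed):
    """Conditional expectation of exp(-sum_i gamma_i (<a_i,x> - b_i)^2)
    given x_j = fixed[j] for the indices j in the dict `fixed`,
    computed by enumerating the free coordinates (illustration only)."""
    m, n = len(A), len(A[0])
    free = [j for j in range(n) if j not in fixed]
    total = 0.0
    for bits in itertools.product((0, 1), repeat=len(free)):
        x = dict(fixed)
        x.update(zip(free, bits))
        prob = math.prod(p[j] if x[j] else 1 - p[j] for j in free)
        pen = sum(gamma[i] * (sum(A[i][j] * x[j] for j in range(n)) - b[i]) ** 2
                  for i in range(m))
        total += prob * math.exp(-pen)
    return total

def round_by_conditioning(A, b, gamma, p):
    """Fix the coordinates one at a time, each time keeping the value
    with the larger conditional expectation (method of conditional
    expectations)."""
    fixed = {}
    for j in range(len(A[0])):
        e0 = cond_expectation(A, b, gamma, p, {**fixed, j: 0})
        e1 = cond_expectation(A, b, gamma, p, {**fixed, j: 1})
        fixed[j] = 1 if e1 > e0 else 0
    return [fixed[j] for j in range(len(A[0]))]
```

By construction the returned vector achieves a penalized weight at least as large as the unconditional expectation, which is the guarantee used above.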
(1.4) Example: perfect matchings in hypergraphs
Let be a -hypergraph with set of vertices and set of edges. Thus the edges of are some subsets such that . A perfect matching in is a collection of edges such that every vertex belongs to exactly one edge from . As is well-known, to decide whether contains a perfect matching is an NP-hard problem if and to count all perfect matchings is a #P-hard problem if , cf. Problem SP2 in [A+99] and Chapter 17 of [AB09].
For each edge we introduce a 0-1 variable . Then the solutions of the system of equations
are in one-to-one correspondence with perfect matchings in : given a solution we select those edges for which . We note that every column of the coefficient matrix contains not more than non-zero entries (which are all 1’s). The right hand side of the system is the vector of all 1’s, , where for all .
Suppose now that the hypergraph is -uniform, that is, for all , and -regular for some , that is, each vertex is contained in exactly edges , which is the case in many symmetric hypergraphs, such as Latin squares and cubes, see [LL13], [LL14], [Ke18], [Po18]. Then and each perfect matching contains exactly edges. Let be a random collection of edges, where each edge is picked into independently at random with probability
so that the expected number of selected edges is exactly .
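A tiny worked example: the 4-cycle, viewed as a 2-uniform, 2-regular hypergraph, has exactly two perfect matchings, and picking each edge independently with probability 1/d = 1/2 makes the expected number of selected edges equal to n/k = 2. The sketch below evaluates the smoothed count by enumeration, assuming the quadratic vertex penalty exp(-γ Σ_v (δ_Y(v) − 1)²) as one reading of (1.4.2); the names are ours.

```python
import itertools
import math

# The 4-cycle C4 as a 2-uniform, 2-regular hypergraph on vertices 0..3.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
n, d = 4, 2
p = 1 / d  # each edge picked independently with probability 1/d

def smoothed_matching_count(gamma):
    """E exp(-gamma * sum_v (delta_Y(v) - 1)^2), where delta_Y(v) is the
    number of edges of the random collection Y containing v.
    (Quadratic penalty assumed here for illustration.)"""
    total = 0.0
    for picks in itertools.product((0, 1), repeat=len(edges)):
        prob = math.prod(p if s else 1 - p for s in picks)
        deg = [0] * n
        for e, s in zip(edges, picks):
            if s:
                for v in e:
                    deg[v] += 1
        total += prob * math.exp(-gamma * sum((dv - 1) ** 2 for dv in deg))
    return total
```

As γ grows, the value approaches the probability 2 · (1/2)⁴ = 1/8 that the random collection is one of the two perfect matchings of the 4-cycle.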
For a collection of edges and a vertex , let be the number of edges that contain . If we choose for some , then the expectation (1.1.3) reads as
Computing (1.4.2) reduces to computing
To apply Theorem 1.2, we fix some and choose
for some sufficiently small absolute constant , we can make sure that the conditions of Theorem 1.2 are satisfied with
and hence we can approximate (1.4.2) within any prescribed relative error in quasi-polynomial time provided
for some absolute constant . Moreover, as we discussed in Section 1.3, for any , we can find in quasi-polynomial time a particular collection of edges such that
The dependence of the allowable in (1.4.2) on is likely to be optimal, or very close to optimal. Indeed, if we could have allowed, for example, for some fixed in (1.4.2), we would have been able to approximate (1.4.2) efficiently with any , and hence compute the probability of selecting a perfect matching with an arbitrary precision. Given a hypergraph and an integer , let us construct the hypergraph as follows. We have and the vertices of are the “clones” of the vertices of , so that each vertex of has clones in . Each edge corresponds to a unique edge such that and consists of the clones of each vertex in . We assign the probabilities . Thus if is a -uniform hypergraph then is -uniform, and if is -regular then is also -regular. On the other hand, for a collection of edges of and the corresponding collection , we have
Hence if we could choose in (1.4.2), by applying our algorithm to the hypergraph instead of , we would have computed (1.4.2) for with , and we could have achieved an arbitrarily large by choosing large enough.
(1.5) The maximum entropy distribution
Let be the polytope
We define the entropy function
and for , with the standard agreement that at or the corresponding terms are .
Suppose that the polytope has a non-empty relative interior, that is, contains a point where for .
One reasonable choice for the probabilities in (1.1.2) is the maximum entropy distribution obtained as the solution to the optimization problem:
This is a convex optimization problem, for which efficient algorithms are available [NN94]. Let be a vector of independent Bernoulli random variables defined by (1.1.2), where is the optimal solution in (1.5.1). Then . Moreover, for every point , we have
and hence we get a bound on the number of 0-1 points in :
In this case, our “smoothed counting” provides an improvement, frequently by a factor exponential in .
For example, the distribution (1.4.1) in Section 1.4 is clearly the maximum entropy distribution, and for large we get an improvement by about a factor of , compared to the maximum entropy bound.
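A concrete instance of the entropy bound: for the polytope cut out by the single equation x_1 + … + x_n = k, symmetry gives the maximum entropy point p_j = k/n, and the bound on the number of 0-1 points reads binom(n, k) ≤ e^{H(p)}. The check below uses our own helper names.

```python
import math

def entropy(p):
    """H(p) = sum_j (-p_j ln p_j - (1 - p_j) ln(1 - p_j)), in nats,
    with the agreement that 0 ln 0 = 0."""
    h = 0.0
    for q in p:
        for t in (q, 1 - q):
            if t > 0:
                h -= t * math.log(t)
    return h

# Polytope {x in [0,1]^n : x_1 + ... + x_n = k}: by symmetry the
# maximum entropy point is p_j = k/n, and the entropy bound gives
# binomial(n, k) <= exp(H(p)).
n, k = 10, 3
p = [k / n] * n
```

Here binom(10, 3) = 120 while e^{H(p)} ≈ 450, so the entropy bound overshoots by a constant factor; for growing n with k/n fixed the gap grows polynomially, which is the kind of slack the smoothed count recovers.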
We prove Theorem 1.2 in Section 2. In Section 3, we make some concluding remarks regarding smoothed counting of integer points and connections to the earlier work [BR19].
2. Proof of Theorem 1.2
We start by establishing a zero-free region in what may be considered as a Fourier dual functional. In what follows, we denote the imaginary unit by , so as to use for indices.
For and let be real numbers and let be complex numbers. Suppose that
and some and that
and some integer , so that the matrix has at most non-zero entries in each column.
Before we embark on the proof of Proposition 2.1, we do some preparations.
Let be the discrete cube of all vectors
where for . Let be a set of indices and let us fix some for all . The set
is called a face of the cube. The indices are fixed indices of and indices are its free indices. We define the dimension by , the cardinality of the set of free indices. Thus a face of dimension consists of points. The cube itself is a face of dimension , while every vertex is a face of dimension 0.
For a function and a face , we define
Suppose that is a free index of and let and be the faces of defined by the constraint and respectively. Then
Furthermore, if and and if the angle between non-zero complex numbers and , considered as vectors in , does not exceed for some , we have
cf. Lemma 3.6.3 of [Ba16]. The inequality (2.2.1) is easily obtained by bounding the length of from below by the length of its orthogonal projection onto the bisector of the angle between and .
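The projection argument behind (2.2.1) can be checked numerically: if the angle between non-zero complex numbers a and b is θ, then projecting both onto the bisector of that angle gives |a + b| ≥ cos(θ/2)(|a| + |b|). The following sketch (our own code, not from [Ba16]) verifies this on random inputs.

```python
import cmath
import math
import random

def check_angle_inequality(trials=1000):
    """If the angle between non-zero a, b (as vectors in R^2) is theta
    with 0 <= theta < pi, then |a + b| >= cos(theta/2) * (|a| + |b|):
    both a and b project onto the bisector with length |.|*cos(theta/2),
    and |a + b| dominates the projection of a + b."""
    random.seed(0)
    for _ in range(trials):
        a = cmath.rect(random.uniform(0.1, 2), random.uniform(0, 2 * math.pi))
        theta = random.uniform(0, math.pi - 0.01)
        # b makes angle exactly theta with a
        b = cmath.rect(random.uniform(0.1, 2), cmath.phase(a) + theta)
        assert abs(a + b) >= math.cos(theta / 2) * (abs(a) + abs(b)) - 1e-12
    return True
```

This is exactly the inequality iterated in (2.2.2) below: each sum over a face loses at most a factor of cos(θ/2) per split into subfaces.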
More generally, suppose that for every face , every free index of and the corresponding faces and of , we have that , and the angle between the two non-zero complex numbers does not exceed . Let be a set of some free indices of . For an assignment of signs, let be the face of obtained by fixing the coordinates with to . Then
and iterating (2.2.1) we obtain
Finally, we will use the inequality
which can be obtained as follows. Since is a convex function on the interval , we have
on the interval. Integrating, we obtain
which is equivalent to (2.2.3).
(2.3) Proof of Proposition 2.1
For a given complex vector , satisfying the conditions of the theorem, we consider the function defined by
for . To simplify the notation somewhat, for a face , we denote just by .
We prove by induction on the following statement.
(2.3.1) Let be a face of dimension . Then . Moreover, if and if is a free index of then for the faces the angle between complex numbers and , considered as vectors in , does not exceed
We obtain the desired result when is the whole cube.
Since for , the statement (2.3.1) holds if , that is, if is a vertex of the cube.
Suppose now that (2.3.1) holds for all faces of dimension and lower. Let be a face of dimension . Since by the induction hypothesis on the polydisc of vectors , satisfying
we can choose a branch of the function
For , let us introduce a function defined by
for . Hence we have
where we use as a shorthand for . Our goal is to bound
which will allow us to bound the angle by which rotates as changes inside the polydisc (2.3.2).
If then by (2.3.3), we have
For an index , let
Hence . Suppose first that . For an assignment of signs, let be the face of obtained by fixing for all . Applying the induction hypothesis to and its faces, by (2.2.2) we get
On the other hand, the function is constant on every face , and hence from (2.3.3), we obtain
Therefore, by (2.3.4), we obtain the bound
If then is constant on and from (2.3.3) and (2.3.4) we get
so (2.3.5) holds as well.
Now we are ready to complete the induction step. Let be a face of dimension . Let be a free index of and let be the faces obtained by fixing and respectively. Then and by the induction hypothesis, we have and . We need to prove that the angle between and does not exceed . To this end, we note that
Applying (2.3.5) with , we conclude that the angle between and does not exceed
Using (2.2.3), we obtain
and hence the angle between and does not exceed
which completes the proof. ∎
The following corollary can be considered as a “Fourier dual” statement to Proposition 2.1.
For and , let and be real numbers and let be complex numbers. Suppose that
and some integer and that
by Proposition 2.1. ∎
Next, we take a limit in Corollary 2.4.
For and , let and be real numbers and let be complex numbers. Suppose that
and some integer and that