Smoothed counting of 0-1 points in polyhedra

03/09/2021
by Alexander Barvinok, et al.

Given a system of linear equations ℓ_i(x)=β_i in an n-vector x of 0-1 variables, we compute the expectation of exp{- ∑_i γ_i (ℓ_i(x) - β_i)^2}, where x is a vector of independent Bernoulli random variables and γ_i >0 are constants. The algorithm runs in quasi-polynomial n^O(ln n) time under some sparseness condition on the matrix of the system. The result is based on the absence of the zeros of the analytic continuation of the expectation for complex probabilities, which can also be interpreted as the absence of a phase transition in the Ising model with a sufficiently strong external field. As an example, we consider the problem of "smoothed counting" of perfect matchings in hypergraphs.


1. Introduction and main results

(1.1) Linear equations in 0-1 vectors

Let be an real matrix and let be a real -vector, where . As is well known, the problem of deciding whether there is a solution to the system of linear equations in 0-1 variables

(1.1.1)   ℓ_i(x) = β_i for all i,

where x is an n-vector of 0-1 variables, is NP-hard, while counting all such solutions is a #P-hard problem. Motivated by the general difficulty of the problem and inspired by ideas from statistical physics, we suggest a way of “smoothed counting”, which in some non-trivial cases turns out to be computationally feasible, at least in theory, and gives some information about “near-solutions” that satisfy the equations within a certain error.

Let us fix some γ_i > 0, interpreted as “weights” of the equations in (1.1.1). Suppose further that x is a vector of independent Bernoulli random variables, so that

for some . Our goal is to compute the expectation

(1.1.3)   E exp{- ∑_i γ_i (ℓ_i(x) - β_i)^2}.

Hence every solution to (1.1.1) is counted in (1.1.3) with weight 1, while any other 0-1 vector is counted with a weight that is exponentially small in the number of violated constraints and the “severity” of the violations. Clearly, (1.1.3) is always an upper bound on the probability that x is a solution to (1.1.1), and for larger γ_i we get sharper bounds.
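
For concreteness, here is a minimal brute-force sketch that evaluates (1.1.3) by enumerating all 2^n points of the cube; it is exponential in n and serves only to fix the definition, since the algorithm of Theorem 1.2 avoids this enumeration. The names A, beta, gamma and p are illustrative: A holds the coefficients of the linear forms ℓ_i, and p the Bernoulli probabilities of (1.1.2).

```python
import itertools
import math

def smoothed_count(A, beta, gamma, p):
    """Brute-force value of E exp{-sum_i gamma_i (l_i(x) - beta_i)^2}, where
    x is a vector of independent Bernoulli(p_j) 0-1 variables and
    l_i(x) = sum_j A[i][j] * x[j].  Exponential in n; for illustration only."""
    m, n = len(A), len(A[0])
    total = 0.0
    for x in itertools.product((0, 1), repeat=n):
        prob = math.prod(p[j] if x[j] else 1.0 - p[j] for j in range(n))
        penalty = sum(gamma[i] * (sum(A[i][j] * x[j] for j in range(n)) - beta[i]) ** 2
                      for i in range(m))
        total += prob * math.exp(-penalty)
    return total

# One equation xi_1 + xi_2 = 1 with weight 2 and p_1 = p_2 = 1/2: the two solutions
# contribute 1/2 in total, the two violating points contribute e^{-2}/2 in total.
print(smoothed_count(A=[[1, 1]], beta=[1], gamma=[2.0], p=[0.5, 0.5]))
```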

The choice of probabilities is motivated by the specifics of the problem. For example, if we pick for all

then the probability distribution concentrates around vectors satisfying

, so we zoom in on the solutions of (1.1.1) having approximately coordinates equal to 1. In Sections 1.4 and 1.5 we give examples of some reasonable choices of the probability distribution in (1.1.2).

Given matrix , vector and weights , we consider the polynomial

in complex variables , where we agree that . Hence the expected value (1.1.3) is written as

To compute the value of at a particular point

we use the interpolation method; see Section 1.3 for details. For that method to work, one should show that

for all in some connected open set containing points and . We establish a sufficient condition for

for all in a polydisc

We prove the following main result.

(1.2) Theorem

Suppose that each column of the matrix contains not more than non-zero entries for some integer . Given real numbers for , we define

Suppose that

and that

Then

as long as

In Section 1.4, we give an example of the values of the parameters allowed by the theorem in the case of smoothed counting of perfect matchings in hypergraphs, where we have for all . We also note that if all , the constraints simplify to for and

The immediate conclusion from Theorem 1.2 is the following algorithmic result.

(1.3) Computing

Let us fix a and let be as in Theorem 1.2. Then for any given such that

and any , the value of

can be approximated within relative error in time, where the implicit constant in the “O” notation depends on only. For that, we define a univariate polynomial

Thus , we need to approximate and by Theorem 1.2 we have

As discussed in [Ba16], Section 2.2, under these conditions, one can approximate in within relative error in from the values of the derivatives

where we agree that . From (1.1.4), we have

while

The direct enumeration of all 0-1 vectors with takes time and since , we get the complexity. Here we assume that for any , the computation of the expression

takes unit time. In the bit model of computation, the complexity of the algorithm acquires an additional factor of

Suppose that we are in a situation where Theorem 1.2 allows us to approximate the expectation (1.1.3) efficiently. If, additionally, for all and , then for any , we can compute efficiently a particular 0-1 vector such that

This is done by the standard application of the method of conditional expectations; see, for example, Chapter 5 of [MR95]. Indeed, the algorithm allows us to compute any conditional expectation defined by any set of constraints of the type or . Imposing a condition of this type reduces to computing the value of , where is a submatrix of obtained by deleting columns corresponding to the prescribed values of and where for we have for (here we use that for all and ). Hence Theorem 1.2 guarantees that the value of can still be efficiently approximated. Successively testing the conditions or , and choosing each time the one with the larger conditional expectation, we compute the desired vector , while increasing the complexity roughly by a factor of .
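
A minimal sketch of this rounding step follows. It assumes a hypothetical oracle smoothed_expectation(fixed) returning (an approximation of) the expectation (1.1.3) conditioned on the 0-1 values prescribed in fixed; in the paper this role is played by the interpolation algorithm applied to the reduced system, and both the oracle name and its interface here are illustrative.

```python
def round_by_conditional_expectations(n, smoothed_expectation):
    """Greedily fix the coordinates one by one, each time keeping the value (0 or 1)
    with the larger conditional expectation; cf. [MR95], Chapter 5.
    The overall cost is roughly 2n calls to the oracle."""
    fixed = {}
    for j in range(n):
        e0 = smoothed_expectation({**fixed, j: 0})
        e1 = smoothed_expectation({**fixed, j: 1})
        fixed[j] = 0 if e0 >= e1 else 1
    return [fixed[j] for j in range(n)]
```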

(1.4) Example: perfect matchings in hypergraphs

Let be a -hypergraph with set of vertices and set of edges. Thus the edges of are some subsets such that . A perfect matching in is a collection of edges such that every vertex belongs to exactly one edge from . As is well known, to decide whether contains a perfect matching is an NP-hard problem if , and to count all perfect matchings is a #P-hard problem if , cf. Problem SP2 in [A+99] and Chapter 17 of [AB09].

For each edge we introduce a 0-1 variable . Then the solutions of the system of equations

are in one-to-one correspondence with perfect matchings in : given a solution we select those edges for which . We note that every column of the coefficient matrix contains not more than non-zero entries (which are all 1’s). The right hand side of the system is the vector of all 1’s, , where for all .

Suppose now that the hypergraph is -uniform, that is, for all , and -regular for some , that is, each vertex is contained in exactly edges , which is the case in many symmetric hypergraphs, such as Latin squares and cubes, see [LL13], [LL14], [Ke18], [Po18]. Then and each perfect matching contains exactly edges. Let be a random collection of edges, where each edge is picked into independently at random with probability

so that the expected number of selected edges is exactly .
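
As an illustration of this setup (with hypothetical helper names), the sketch below builds the vertex-edge incidence system whose 0-1 solutions are exactly the perfect matchings; for a k-uniform hypergraph every column of the coefficient matrix has exactly k ones, and for a d-regular hypergraph the uniform choice p_e = 1/d makes the expected number of chosen edges equal to the size of a perfect matching, which appears to be the choice (1.4.1). The output can be fed to the brute-force evaluator sketched in Section 1.1.

```python
def matching_system(vertices, edges):
    """For every vertex v, one equation  sum_{e : v in e} x_e = 1.
    Rows are indexed by vertices, columns by edges."""
    A = [[1 if v in e else 0 for e in edges] for v in vertices]
    beta = [1] * len(vertices)
    return A, beta

# A 2-uniform, 2-regular example: the 4-cycle, so each p_e = 1/d = 1/2.
vertices = [0, 1, 2, 3]
edges = [{0, 1}, {1, 2}, {2, 3}, {3, 0}]
A, beta = matching_system(vertices, edges)
```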

For a collection of edges and a vertex , let be the number of edges that contain . If we choose for some , then the expectation (1.1.3) reads as

Computing (1.4.2) reduces to computing

To apply Theorem 1.2, we fix some and choose

By choosing

and some sufficiently small absolute constant , we can make sure that the conditions of Theorem 1.2 are satisfied with

and hence we can approximate (1.4.2) within any prescribed relative error in quasi-polynomial time provided

for some absolute constant . Moreover, as we discussed in Section 1.3, for any , we can find in quasi-polynomial time a particular collection of edges such that

The dependence of the allowable in (1.4.2) on is likely to be optimal, or very close to optimal. Indeed, if we could have allowed, for example, for some fixed in (1.4.2), we would have been able to approximate (1.4.2) efficiently with any , and hence compute the probability of selecting a perfect matching to arbitrary precision. Given a hypergraph and an integer , let us construct the hypergraph as follows. We have and the vertices of are the “clones” of the vertices of , so that each vertex of has clones in . Each edge corresponds to a unique edge such that and consists of the clones of each vertex in . We assign the probabilities . Thus if is a -uniform hypergraph then is -uniform, and if is -regular then is also -regular. On the other hand, for a collection of edges of and the corresponding collection , we have

Hence if we could choose in (1.4.2), by applying our algorithm to the hypergraph instead of , we would have computed (1.4.2) for with , and we could have achieved an arbitrarily large by choosing large enough.

(1.5) The maximum entropy distribution

Let be the polytope

We define the entropy function

and for , with the standard agreement that at or the corresponding terms are .

Suppose that the polytope has a non-empty relative interior, that is, contains a point where for .

One reasonable choice for the probabilities in (1.1.2) is the maximum entropy distribution obtained as the solution to the optimization problem:

This is a convex optimization problem, for which efficient algorithms are available [NN94]. Let be a vector of independent Bernoulli random variables defined by (1.1.2), where is the optimal solution in (1.5.1). Then . Moreover, for every point , we have

and hence we get a bound on the number of 0-1 points in :

see [BH10] for details. This bound turns out to be of interest in some situations, see, for example, [PP20].
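
A hedged sketch of computing the maximum entropy point with a general-purpose solver (standing in for the dedicated interior-point methods of [NN94]) follows. The entropy is written here in its standard form for a vector of independent Bernoulli variables, which is the form the discussion above appears to assume, with the terms at 0 and 1 interpreted as 0; the function name and interface are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

def max_entropy_point(A, beta):
    """Maximize H(x) = sum_j [x_j ln(1/x_j) + (1 - x_j) ln(1/(1 - x_j))]
    over the polytope {x : A x = beta, 0 <= x <= 1}; the optimizer supplies
    the probabilities p_j used in (1.1.2)."""
    A, beta = np.asarray(A, float), np.asarray(beta, float)
    n = A.shape[1]

    def neg_entropy(x):
        x = np.clip(x, 1e-12, 1 - 1e-12)   # keep the logarithms finite
        return float(np.sum(x * np.log(x) + (1 - x) * np.log(1 - x)))

    res = minimize(neg_entropy, x0=np.full(n, 0.5), method="SLSQP",
                   bounds=[(0.0, 1.0)] * n,
                   constraints=[{"type": "eq", "fun": lambda x: A @ x - beta}])
    return res.x
```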

In this case, our “smoothed counting” provides an improvement

frequently by a factor exponential in .

For example, the distribution (1.4.1) in Section 1.4 is clearly the maximum entropy distribution, and for large we get an improvement by a factor of about , compared to the maximum entropy bound.

We prove Theorem 1.2 in Section 2. In Section 3, we make some concluding remarks regarding smoothed counting of integer points and connections to the earlier work [BR19].

2. Proof of Theorem 1.2

We start by establishing a zero-free region in what may be considered as a Fourier dual functional. In what follows, we denote the imaginary unit by , so as to use for indices.

(2.1) Proposition

For and , let be real numbers and let be complex numbers. Suppose that

for some and that

for some integer , so that the matrix has at most non-zero entries in each column.

If

Then

Before we embark on the proof of Proposition 2.1, we do some preparations.

(2.2) Preliminaries

Let be the discrete cube of all vectors

where for . Let be a set of indices and let us fix some for all . The set

is called a face of the cube. The indices are fixed indices of and indices are its free indices. We define the dimension by , the cardinality of the set of free indices. Thus a face of dimension consists of points. The cube itself is a face of dimension , while every vertex is a face of dimension 0.
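
A small sketch (in terms of hypothetical helpers, with 0-1 coordinates used for concreteness) of the combinatorics just described: a face is specified by fixing the values of some coordinates, its free coordinates range over all choices, and summing a function over the face splits into the sums over the two subfaces obtained by fixing any free index.

```python
import itertools

def face_points(n, fixed):
    """Points of the face of the discrete cube obtained by fixing the coordinates
    in `fixed` (index -> value); the remaining indices are free, so the face
    has 2^(number of free indices) points."""
    free = [j for j in range(n) if j not in fixed]
    for choice in itertools.product((0, 1), repeat=len(free)):
        point = dict(fixed, **dict(zip(free, choice)))
        yield tuple(point[j] for j in range(n))

def face_sum(f, n, fixed):
    """Sum of f over a face; fixing a free index j splits this into the sums over
    the two subfaces with that coordinate equal to 0 and to 1."""
    return sum(f(x) for x in face_points(n, fixed))
```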

For a function and a face , we define

Suppose that is a free index of and let and be the faces of defined by the constraint and respectively. Then

Furthermore, if and and if the angle between non-zero complex numbers and , considered as vectors in , does not exceed for some , we have

cf. Lemma 3.6.3 of [Ba16]. The inequality (2.2.1) is easily obtained by bounding the length of from below by the length of its orthogonal projection onto the bisector of the angle between and .
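
The inequality consistent with the bisector argument just described (cf. Lemma 3.6.3 of [Ba16]) reads, presumably up to notation, as follows:

```latex
% For non-zero a, b \in \mathbb{C} whose angle, viewed as vectors in \mathbb{R}^2,
% does not exceed \theta for some 0 \le \theta < \pi:
|a + b| \;\ge\; \bigl(|a| + |b|\bigr)\cos\tfrac{\theta}{2},
% since the orthogonal projections of a and b onto the bisector of the angle
% between them have lengths at least |a|\cos(\theta/2) and |b|\cos(\theta/2).
```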

More generally, suppose that for every face , every free index of and the corresponding faces and of , we have that , and the angle between the two non-zero complex numbers does not exceed . Let be a set of some free indices of . For an assignment of signs , let be the face of obtained by fixing the coordinates with to . Then

and iterating (2.2.1) we obtain

Finally, we will use the inequality

which can be obtained as follows. Since is a convex function on the interval , we have

on the interval. Integrating, we obtain

which is equivalent to (2.2.3).

(2.3) Proof of Proposition 2.1

For a given complex vector , satisfying the conditions of the theorem, we consider the function defined by

for . To simplify the notation somewhat, for a face , we denote just by .

We prove by induction on the following statement.

(2.3.1) Let be a face of dimension . Then . Moreover, if and if is a free index of then for the faces the angle between complex numbers and , considered as vectors in , does not exceed

We obtain the desired result when is the whole cube.

Since for , the statement (2.3.1) holds if , and hence is a vertex of the cube.

Suppose now that (2.3.1) holds for all faces of dimension and lower. Let be a face of dimension . Since by the induction hypothesis on the polydisc of vectors , satisfying

we can choose a branch of the function

For , let us introduce a function defined by

for . Hence we have

and

Therefore,

where we use as a shorthand for . Our goal is to bound

which will allow us to bound the angle by which rotates as changes inside the polydisc (2.3.2).

If then by (2.3.3), we have

For an index , let

Hence . Suppose first that . For an assignment of signs, let be the face of obtained by fixing for all . Applying the induction hypothesis to and its faces, by (2.2.2) we get

On the other hand, the function is constant on every face , and hence from (2.3.3), we obtain

Therefore, by (2.3.4), we obtain the bound

If then is constant on and from (2.3.3) and (2.3.4) we get

so (2.3.5) holds as well.

Now we are ready to complete the induction step. Let be a face of dimension . Let be a free index of and let be the faces obtained by fixing and respectively. Then and by the induction hypothesis, we have and . We need to prove that the angle between and does not exceed . To this end, we note that

Applying (2.3.5) with , we conclude that the angle between and does not exceed

Using (2.2.3), we obtain

and hence the angle between and does not exceed

which completes the proof. ∎

The following corollary can be considered as a “Fourier dual” statement to Proposition 2.1.

(2.4) Corollary

For and , let and be real numbers and let be complex numbers. Suppose that

for some integer and that

Then

provided

Proof

We have

Consequently,

by Proposition 2.1. ∎

Next, we take a limit in Corollary 2.4.

(2.5) Corollary

For and , let and be real numbers and let be complex numbers. Suppose that

for some integer and that

Then

provided