 # On the maximal minimal cube lengths in distinct DNF tautologies

Inspired by a recent article by Anthony Zaleski and Doron Zeilberger, we investigate the question of determining the largest $k$ for which there exist boolean formulas in disjunctive normal form (DNF) with $n$ variables, none of whose conjunctions are "parallel", and such that all of them have at least $k$ literals. Using a SAT solver, we answer some of the questions they left open. We also determine the corresponding numbers for DNFs obeying certain symmetries.


## 1. Problem Statement

We consider boolean formulas with variables $x_1,\dots,x_n$. A literal is a variable or a negated variable, e.g., $x_1$ or $\bar{x}_1$. A cube is a conjunction of literals, e.g., $x_1\wedge\bar{x}_2$. The length of a cube is the number of distinct literals appearing in it. A formula in disjunctive normal form (DNF) is a disjunction of cubes, e.g., $x_1\vee(\bar{x}_1\wedge x_2)$. Such a DNF is called a tautology if it evaluates to true for all assignments of the variables. For example, $x_1\vee x_2\vee(\bar{x}_1\wedge\bar{x}_2)$ is a tautology. It consists of two cubes of length 1 and one cube of length 2.
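The definitions above can be made concrete with a minimal sketch. The representation is an assumption made for illustration only: a literal is a nonzero integer (`+i` for the variable with index `i`, `-i` for its negation), a cube is a `frozenset` of literals, and a DNF is a list of cubes. The example DNF is one possible formula with two cubes of length 1 and one cube of length 2.

```python
from itertools import product

def cube_holds(cube, assignment):
    """True if every literal of the cube is satisfied; assignment maps index -> bool."""
    return all(assignment[abs(lit)] == (lit > 0) for lit in cube)

def is_tautology(dnf, n):
    """Brute-force check over all 2**n assignments of the n variables."""
    for values in product([False, True], repeat=n):
        assignment = dict(zip(range(1, n + 1), values))
        if not any(cube_holds(cube, assignment) for cube in dnf):
            return False
    return True

# Illustrative DNF: x1 or x2 or (not x1 and not x2).
example = [frozenset({1}), frozenset({2}), frozenset({-1, -2})]
print(is_tautology(example, 2))
print(sorted(len(cube) for cube in example))
```

The brute-force loop is only practical for small `n`, which is one reason a SAT solver is the natural tool for the search itself.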

Inspired by work of Erdős [3], Zaleski and Zeilberger [8] have recently considered DNFs in which all cubes have distinct supports. The support of a cube is the set of variables occurring in it. For example, the support of the cube $x_1$ is the singleton set $\{x_1\}$, the support of the cube $x_2$ is the singleton set $\{x_2\}$, while the support of the cube $\bar{x}_1\wedge\bar{x}_2$ is the set $\{x_1,x_2\}$. This implies that the DNF $x_1\vee x_2\vee(\bar{x}_1\wedge\bar{x}_2)$ has distinct supports. On the other hand, the Hamlet question $x_1\vee\bar{x}_1$ does not have distinct supports. They call DNFs with distinct supports distinct DNFs. Inspired by a study of covering systems, Zaleski and Zeilberger want to know, for any given $n$, what is the largest $k$ such that there is a distinct DNF tautology with $n$ variables consisting only of cubes of length at least $k$.
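The notion of support can be sketched in a few lines. As an assumption for illustration, a cube is a `frozenset` of signed integers (`+i` for a variable, `-i` for its negation); the function names are hypothetical.

```python
def support(cube):
    """Set of variable indices occurring in the cube, ignoring signs."""
    return frozenset(abs(lit) for lit in cube)

def has_distinct_supports(dnf):
    """True if no two cubes of the DNF share the same support."""
    supports = [support(cube) for cube in dnf]
    return len(supports) == len(set(supports))

# A DNF with supports {1}, {2}, {1, 2}.
distinct_example = [frozenset({1}), frozenset({2}), frozenset({-1, -2})]
# "To be or not to be": both cubes have support {1}.
hamlet = [frozenset({1}), frozenset({-1})]

print(has_distinct_supports(distinct_example))
print(has_distinct_supports(hamlet))
```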

Using a greedy algorithm, they searched for distinct DNF tautologies with a prescribed number of variables and a prescribed minimal cube length. The largest minimal cube lengths for which they found formulas are as follows:

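| $n$ | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $k$ | 0 | 1 | 1 | 2 | 3 | 4 | 4 | 5 | 6 | 6? | 7 | 8 | 9 | 9? |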

These are only lower bounds for the optimal values of $k$. However, by a density argument it can be shown that the optimal $k$ must satisfy the inequality $\sum_{j=k}^{n}\binom{n}{j}2^{-j}\geq 1$, which gives rise to upper bounds. The numbers given in the table above turn out to match the upper bounds except for $n=10$ and $n=14$ (indicated by question marks), where they are off by one.
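The upper bounds can be recomputed with a short sketch. It assumes the density inequality takes the form $\sum_{j=k}^{\ell}\binom{n}{j}2^{-j}\geq 1$: a cube of length $j$ covers a fraction $2^{-j}$ of all assignments, and a distinct DNF has at most one cube per support. The function name and the optional `max_len` parameter (an upper bound on cube length, defaulting to `n`) are hypothetical.

```python
from math import comb

def density_upper_bound(n, max_len=None):
    """Largest k with sum_{j=k}^{max_len} C(n, j) / 2**j >= 1 (assumed form of the bound)."""
    if max_len is None:
        max_len = n
    total = 0  # integer value of sum_j C(n, j) * 2**(max_len - j)
    for k in range(max_len, -1, -1):
        total += comb(n, k) * 2 ** (max_len - k)
        if total >= 2 ** max_len:
            return k
    return None  # not reached: the term j = 0 alone already gives 2**max_len

print([density_upper_bound(n) for n in range(1, 15)])
```

Under this assumption the computed bounds agree with the table of lower bounds except at $n=10$ and $n=14$, where they are larger by one.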

As a variant of the problem, Zaleski and Zeilberger also wanted to know, for any given $n$, what is the largest $k$ such that there is a distinct DNF tautology with $n$ variables consisting only of cubes of length exactly $k$. In this case, the density argument implies that such a $k$ must satisfy $\binom{n}{k}2^{-k}\geq 1$, which again gives an upper bound. With their greedy approach, they determined the following lower bounds. Again, mismatches with the upper bound are indicated by a question mark.

It is clear that there is no solution for and , so in this case the upper bound is too pessimistic and is the right value.

For and the computations reported in the present paper imply that the values and are also correct. We were not able to confirm the entry for $n=10$ in the first table with about one year of computation time. We did not attempt to confirm the entries for $n=14$ in the first or in the second table.

We add two refinements to the problem. First, we introduce an additional parameter $\ell$ which bounds the lengths of the cubes from above. For any particular choice of $n$ and $\ell$, we want to know the largest $k$ such that there is a distinct DNF tautology with $n$ variables consisting only of cubes of length at least $k$ and at most $\ell$. The special case $\ell=n$ corresponds to the first variant of Zaleski and Zeilberger and the special case $\ell=k$ corresponds to the second variant. We think that the intermediate cases are also of interest.

Our second refinement concerns symmetries. Letting permutations $\pi\in S_n$ act on the indices of the variables, we say that a DNF is invariant under a certain subgroup $G$ of $S_n$ if every $\pi\in G$ maps the DNF to itself. For example, the DNF $(x_1\wedge\bar{x}_2)\vee(x_2\wedge\bar{x}_3)\vee(x_3\wedge\bar{x}_1)$ is invariant under the cyclic group $C_3$. For the groups $C_n$, $D_n$, $A_n$ and $S_n$, and for various choices of $n$ and $\ell$, we have determined the largest $k$ such that there is a distinct DNF tautology with $n$ variables, consisting of cubes of lengths at least $k$ and at most $\ell$, that is invariant under the given group.
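Invariance under a group can be tested on its generators alone, since every group element is a product of generators. The following sketch assumes cubes are `frozenset`s of signed integers and permutations are dictionaries from variable index to variable index; the names and the example DNF are chosen for illustration.

```python
def permute_cube(cube, perm):
    """Image of a cube when perm (dict index -> index) acts on the variable indices."""
    return frozenset((1 if lit > 0 else -1) * perm[abs(lit)] for lit in cube)

def is_invariant(dnf, generators):
    """True if every generator maps the set of cubes of the DNF onto itself."""
    cubes = set(dnf)
    return all({permute_cube(c, g) for c in cubes} == cubes for g in generators)

# (x1 and not x2) or (x2 and not x3) or (x3 and not x1)
dnf = [frozenset({1, -2}), frozenset({2, -3}), frozenset({3, -1})]
rotation = {1: 2, 2: 3, 3: 1}        # generates the cyclic group on 3 indices
transposition = {1: 2, 2: 1, 3: 3}   # swaps the first two indices

print(is_invariant(dnf, [rotation]))
print(is_invariant(dnf, [rotation, transposition]))
```

The rotation together with the transposition generates the full symmetric group on three indices, so the second call checks invariance under that larger group.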

## 2. SAT Encoding

Our results were obtained with the help of a SAT solver [5, 6], using a rather straightforward encoding of the problem. For each cube, we introduced one boolean variable that indicates whether or not this cube is going to be a part of the DNF we are looking for. Note that this creates $\sum_{j=k}^{\ell}\binom{n}{j}2^{j}$ variables, a quantity that grows quickly when $n$ or $\ell$ increases. For example, in the case $n=10$ and $k=7$, where we were unable to complete the computation, we were dealing with 33024 variables.
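The number of cube variables is easy to recompute. The sketch below assumes one variable per cube, with $\binom{n}{j}$ supports of size $j$ and $2^j$ sign patterns per support; the function name and the parameter `max_len` (the upper bound on cube length) are hypothetical. With 10 variables, minimal length 7, and no upper bound on the length, it reproduces the figure 33024 quoted above.

```python
from math import comb

def num_cube_variables(n, k, max_len):
    """Number of cubes over n variables with length between k and max_len."""
    return sum(comb(n, j) * 2 ** j for j in range(k, max_len + 1))

print(num_cube_variables(10, 7, 10))  # 33024
```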

In order to enforce that the DNF is a tautology, we specify for every assignment a clause saying that at least one of the cubes that becomes true under this assignment must be selected. In order to enforce that the DNF be distinct, we have to specify clauses which encode the requirement that for every possible support, at most one of the cubes having this support can be selected. There are many ways to encode a constraint of the form “at most one”, and their pros and cons are discussed extensively in the literature [2, 4]. For our purpose, the so-called binary encoding seemed to work well.
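The two families of clauses can be sketched as follows. This is not the authors' script, only a minimal illustration under stated assumptions: cubes are tuples of signed integers, SAT variables are numbered from 1 as in DIMACS, and the at-most-one constraint uses the binary encoding, in which each candidate is tied to the bit pattern of its position via auxiliary variables. All function names are hypothetical.

```python
from itertools import combinations, product

def enumerate_cubes(n, k, max_len):
    """All cubes over x_1..x_n with k <= length <= max_len, as tuples of signed ints."""
    cubes = []
    for size in range(k, max_len + 1):
        for supp in combinations(range(1, n + 1), size):
            for signs in product([1, -1], repeat=size):
                cubes.append(tuple(s * v for s, v in zip(signs, supp)))
    return cubes

def at_most_one_binary(xs, first_aux):
    """Binary encoding of 'at most one of xs'; returns (clauses, number of aux variables)."""
    if len(xs) < 2:
        return [], 0
    nbits = (len(xs) - 1).bit_length()
    bits = range(first_aux, first_aux + nbits)
    # Selecting the i-th candidate forces the auxiliary bits to spell out i in binary.
    clauses = [[-x, b if (i >> j) & 1 else -b]
               for i, x in enumerate(xs) for j, b in enumerate(bits)]
    return clauses, nbits

def encode(n, k, max_len):
    """Return (cube -> variable map, clauses, total number of variables)."""
    cubes = enumerate_cubes(n, k, max_len)
    var = {cube: i + 1 for i, cube in enumerate(cubes)}
    clauses = []
    # Tautology: every assignment must be covered by at least one selected cube.
    for values in product([False, True], repeat=n):
        clauses.append([var[c] for c in cubes
                        if all(values[abs(lit) - 1] == (lit > 0) for lit in c)])
    # Distinctness: at most one selected cube per support.
    by_support = {}
    for c in cubes:
        by_support.setdefault(tuple(abs(lit) for lit in c), []).append(var[c])
    next_var = len(cubes) + 1
    for group in by_support.values():
        amo, used = at_most_one_binary(group, next_var)
        clauses.extend(amo)
        next_var += used
    return var, clauses, next_var - 1

var, clauses, num_vars = encode(2, 1, 2)
print(len(var), num_vars, len(clauses))
```

Two candidates at different positions differ in at least one bit, so selecting both would force some auxiliary variable to be both true and false. The binary encoding needs only about $\log_2 m$ auxiliary variables and $m\log_2 m$ binary clauses for $m$ candidates, compared with $\binom{m}{2}$ clauses for the pairwise encoding.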

Finally, in order to enforce invariance under a certain group, we chose a set of generators and added for each cube $c$ and each generator $g$ a clause that says “if $c$ is selected, then also $g(c)$”.
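The invariance clauses can be sketched as implications between cube variables. The sketch assumes cubes are tuples of signed integers sorted by variable index, `var` maps each cube to its SAT variable, and a generator is a dictionary from index to index. The generator sets shown (a rotation for the cyclic group, a rotation plus a reflection for the dihedral group) are standard choices, not necessarily the ones used for the reported computations.

```python
def apply_permutation(cube, perm):
    """Image of a cube under perm, returned in canonical order (sorted by index)."""
    image = ((1 if lit > 0 else -1) * perm[abs(lit)] for lit in cube)
    return tuple(sorted(image, key=abs))

def cyclic_generators(n):
    """One generator of the cyclic group: the rotation i -> i + 1 (mod n)."""
    return [{i: i % n + 1 for i in range(1, n + 1)}]

def dihedral_generators(n):
    """Rotation plus the reflection i -> n + 1 - i."""
    return cyclic_generators(n) + [{i: n + 1 - i for i in range(1, n + 1)}]

def invariance_clauses(var, generators):
    """For each cube c and generator g, the clause (not v_c) or v_{g(c)}."""
    clauses = []
    for cube, v in var.items():
        for g in generators:
            image = apply_permutation(cube, g)
            if image != cube:  # a cube fixed by g would only yield a trivial clause
                clauses.append([-v, var[image]])
    return clauses

# Cube variables for the six cubes of length 1 over three variables.
var = {(1,): 1, (-1,): 2, (2,): 3, (-2,): 4, (3,): 5, (-3,): 6}
print(invariance_clauses(var, cyclic_generators(3)))
```

Implications along generators suffice because a generator has finite order: following the implication around its cycle forces all cubes in the same orbit to be selected together.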

The encoding as described so far is sufficient for proving existence or non-existence of a distinct DNF tautology for any prescribed $n$, $k$, $\ell$, and any prescribed group. In order to speed up the computations in practice, we may add some further constraints. One idea is to add clauses which forbid selecting two cubes where one is strictly contained in the other. This is clearly a valid restriction, because when there is a solution that has two cubes that are contained in one another, we can discard the smaller one (the one covering fewer assignments) from it and obtain another solution. However, it turns out that this particular idea floods the formula with too many additional clauses and slows down the computation rather than speeding it up.
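The growth of these extra clauses can be observed on a tiny instance. The sketch assumes cubes are `frozenset`s of signed integers, so that `c < d` means the literals of `c` form a strict subset of the literals of `d`; in that case `d` covers only assignments already covered by `c`. Function names are hypothetical.

```python
from itertools import combinations, product

def enumerate_cubes(n, k, max_len):
    """All cubes over x_1..x_n with k <= length <= max_len, as frozensets of signed ints."""
    cubes = []
    for size in range(k, max_len + 1):
        for supp in combinations(range(1, n + 1), size):
            for signs in product([1, -1], repeat=size):
                cubes.append(frozenset(s * v for s, v in zip(signs, supp)))
    return cubes

def containment_clauses(cubes):
    """Clauses (not v_c) or (not v_d) for every pair with one literal set strictly inside the other."""
    var = {cube: i + 1 for i, cube in enumerate(cubes)}
    return [[-var[c], -var[d]]
            for c, d in combinations(cubes, 2)
            if c < d or d < c]

cubes = enumerate_cubes(3, 1, 3)
print(len(cubes), len(containment_clauses(cubes)))  # 26 72
```

Already for three variables there are almost three times as many containment clauses as cube variables, and the ratio grows with the number of variables and the range of admitted lengths.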

It is more efficient to break the symmetry of the problem, a standard technique in the context of SAT solving [7]. Clearly, when there is a distinct DNF for certain $n$, $k$, $\ell$ and a certain group, then permuting all the variables in some way will yield another solution. Also replacing a certain variable by its negation (and canceling double negation introduced by that) turns a solution into a new solution. Since we dropped the idea of forbidding cubes that are contained in other cubes, we can restrict the search to a solution containing a cube of length $k$, and because we are free to permute and negate variables, we may assume this cube to be $x_1\wedge\dots\wedge x_k$.

Adding the variable for this cube as a unit clause to the formula allows for an appreciable amount of simplification (called unit propagation in SAT jargon). We are left with the freedom to permute the variables $x_1,\dots,x_k$ and the variables $x_{k+1},\dots,x_n$. By the first, it is fair to enforce an assumption that the variables are indexed in such a way that when a cube with support is selected, there is some such that

appear negated in it and the remaining variables do not. This assumption may still leave some degrees of freedom, which can be used to make a similar restriction as to which cubes with support

may be selected. The freedom to permute the variables $x_{k+1},\dots,x_n$ is exploited by restricting the search to DNFs such that for every , the cube is only selected when is also selected.

## 3. Results

We have written a Python script that produces the SAT instances described in the previous section, and we have used Biere’s award-winning SAT solver Treengeling [1] to solve them. The results are summarized in the following tables, in which $n$ increases towards the right and the upper length bound $\ell$ grows downwards. Entries with $\ell>n$ are left blank because they are equivalent to $\ell=n$.
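A generator script of this kind typically ends by writing the clauses in the DIMACS CNF format, which SAT solvers commonly read. The following sketch shows only that last step; it is an illustration, not the authors' script, and the function name is hypothetical. Clauses are lists of nonzero integers, with negative numbers denoting negated variables.

```python
def to_dimacs(clauses, num_vars):
    """Serialize clauses as DIMACS CNF text: a header line, then one clause per line ending in 0."""
    lines = [f"p cnf {num_vars} {len(clauses)}"]
    lines += [" ".join(map(str, clause)) + " 0" for clause in clauses]
    return "\n".join(lines) + "\n"

print(to_dimacs([[1, -2], [2]], 2), end="")
```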

By the density argument, the maximal $k$ for a particular choice of $n$ and $\ell$ must satisfy the inequality $\sum_{j=k}^{\ell}\binom{n}{j}2^{-j}\geq 1$. In the following table, an entry is boxed when it does not match this bound. For the entries marked with a question mark, we have not been able to prove that the $k$ we found is really optimal, but the long and unsuccessful search is at least some indication that the bound is not reached in these cases. For , the SAT solver is able to show that distinct DNF tautologies with , , , respectively, do not exist, although their existence would not be in conflict with the density bound.

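| $\ell$ / $n$ | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|
| 2 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| 3 |   | 1 | 2 | 2 | 3 | 3 | 3 | 3 | 3 |
| 4 |   |   | 2 | 3 | 3 | 4 | 4 | 4 | 4 |
| 5 |   |   |   | 3 | 4 | 4 | 5 | 5 | 5 |
| 6 |   |   |   |   | 4 | 4 | 5 | 5 | 6 |
| 7 |   |   |   |   |   | 4 | 5 | 6 | 6 |
| 8 |   |   |   |   |   |   | 5 | 6 | 6 |
| 9 |   |   |   |   |   |   |   | 6 | 6 |
| 10 |   |   |   |   |   |   |   |   | 6 |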

The next two tables contain our results about distinct DNF tautologies invariant under certain groups. We have investigated the cyclic group $C_n$, the dihedral group $D_n$, the alternating group $A_n$, and the full symmetric group $S_n$. The first table lists the numbers for $C_n$ and $D_n$, which turn out to be identical. Boxed entries highlight the differences from the previous table. The question marks refer to the search for , which for three entries did not terminate in a reasonable amount of time. Interestingly, it follows from the previous table that the entry for is correct, but while the SAT solver was able to prove this in the (seemingly harder) case without invariance constraints, it did not succeed with the constraints for . The computations for all entries terminated in presence of the constraints for .

2 3 4 5 6 7 8 9 10
2 1 1 1 1 1 1 1 1 1
3 1 2 2 2 3 3 3 3
4 2 2 3 3 4 4 4
5 2 3 4 4 5 5
6 3 4 5 5 6
7 4 5 6 6
8 5 6 6
9 6 6
10 6
2 3 4 5 6 7 8 9 10 11 12 13 14
2 1 1 1 1 1 1 1 1 1 1 1 1 1
3 1 2 2 2 2 2 2 2 2 2 2 2
4 2 2 3 3 3 3 3 3 3 3 3
5 2 3 3 4 4 4 4 4 4 4
6 3 3 4 4 5 5 5 5 5
7 3 4 4 5 5 6 6 6
8 4 4 5 5 6 6 7
9 4 5 5 6 6 7
10 5 5 6 6 7
11 5 6 6 7
12 6 6 7
13 6 7
14 7

The second table lists the numbers for $A_n$ and $S_n$, which also turn out to be the same. For these groups, the invariance constraints make the problem easier, so that we were able to cover slightly larger values of $n$ and $\ell$. All given numbers have been proved to be optimal. Note that a regular pattern emerges: we seem to have the formula $k=\min(\lfloor n/2\rfloor,\ell-1)$.

## References

- [1] Armin Biere. CaDiCaL, Lingeling, Plingeling, Treengeling, YalSAT Entering the SAT Competition 2017. Proceedings of SAT Competition 2017 - Solver and Benchmark Descriptions, 2017.
- [2] Jiangchao Chen. A new SAT encoding of the at-most-one constraint. Proc. Constraint Modelling and Reformulation, 2010.
- [3] Paul Erdős. On integers of the form $2^k+p$ and some related problems. Summa Brasil. Math., 1950.
- [4] Alan M. Frisch and Paul A. Giannaros. SAT encodings of the at-most-k constraint: some old, some new, some fast, some slow. Proc. of the Tenth Int. Workshop of Constraint Modelling and Reformulation, 2010.
- [5] Armin Biere, Marijn Heule, and Hans van Maaren, editors. Handbook of Satisfiability. Vol. 185. IOS Press, 2009.
- [6] Donald E. Knuth. Satisfiability. The Art of Computer Programming, Volume 4, Fascicle 6, Addison-Wesley, 2015.
- [7] Karem A. Sakallah. Symmetry and Satisfiability. Handbook of Satisfiability, 2009.
- [8] Anthony Zaleski and Doron Zeilberger. Boolean Function Analogs of Covering Systems. https://arxiv.org/abs/1801.05097