The proof search problem for a given proof system asks, given a tautology, to find an approximately shortest proof of it. Clearly, the computational complexity of such problems is of fundamental importance for automated theorem proving. In particular, among the proof systems for propositional logic, Resolution deserves special attention since most modern implementations of satisfiability solvers are based on it.
Following , we say that the proof search problem for Resolution is solvable in polynomial time, or more succinctly, that Resolution is automatizable, if there is an algorithm that, given a contradictory CNF formula $F$ as input, outputs a Resolution refutation of $F$ in time polynomial in $r+s$, where $r$ is the size of $F$, and $s$ is the length of a shortest Resolution refutation of $F$. It is clear that the concept of automatizability applies not only to Resolution but to any refutation or proof system, and one can ask for automating algorithms that run in quasi-polynomial time, subexponential time, etc.

Footnote 1: The time of the automating algorithm is not measured in $r$ but in $r+s$ because $s$ can be much larger than $r$. We use $r+s$, and not just $s$, because a Resolution refutation need not use all clauses in $F$, but the algorithm should be given the opportunity to at least read all of $F$.
In this paper we show that Resolution is not automatizable unless $\mathrm{P} = \mathrm{NP}$. The assumption is clearly optimal since $\mathrm{P} = \mathrm{NP}$ implies that it is. To prove our result we give a direct and efficient reduction from 3-SAT, the satisfiability problem for 3-CNF formulas. The reduction is so efficient that it also rules out quasi-polynomial and subexponential time automating algorithms for Resolution under the corresponding hardness assumptions. More precisely, let QP and SUBEXP denote the classes of problems that are decidable in quasi-polynomial time $2^{(\log n)^{O(1)}}$, and in subexponential time $2^{n^{o(1)}}$, respectively. Then our main result reads:
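Theorem 1.
1. Resolution is not automatizable in subexponential time unless $\mathrm{NP} \subseteq \mathrm{SUBEXP}$.
2. Resolution is not automatizable in quasi-polynomial time unless $\mathrm{NP} \subseteq \mathrm{QP}$.
3. Resolution is not automatizable in polynomial time unless $\mathrm{NP} \subseteq \mathrm{P}$.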
That Resolution is not automatizable in polynomial time has been known under a stronger assumption from parameterized complexity theory, using a more contrived reduction; we review the literature below. The first two statements in Theorem 1 give the first evidence that Resolution is not automatizable in quasi-polynomial or subexponential time. As in the third statement, their assumptions are also optimal in that $\mathrm{NP} \subseteq \mathrm{QP}$ and $\mathrm{NP} \subseteq \mathrm{SUBEXP}$ imply that Resolution can be automated in quasi-polynomial and subexponential time, respectively.
History of the problem
The complexity of the proof search problem has been extensively investigated. Krajíček and Pudlák  showed that Extended Frege systems (see Footnote 2) are not automatizable assuming RSA is secure against . Subsequently, Bonet et al. showed this for Frege  and bounded depth Frege systems  assuming the Diffie-Hellman key exchange is secure against polynomial or, respectively, subexponential size circuits.

Footnote 2: We refer to the textbook [25, Chapter 4] for a definition of this and the following systems. All notions needed to state and prove our results are defined later.
In fact, these results rule out feasible interpolation, an influential concept introduced to proof complexity by Krajíček [24, 26]. We refer to [29, Chapters 17, 18] for an account. If a system with feasible interpolation has short refutations of the contradictions that state that a pair of NP problems are not disjoint, then the pair can be separated by small circuits. Hence, feasible interpolation can be ruled out by finding short proofs of the disjointness of an NP pair that is hard to separate. Such hardness assumptions turn up naturally in cryptography, which explains the type of assumptions that were used in the results above.
The failure of feasible interpolation for a natural system implies (cf. [4, Theorem 3]) that the system is not even weakly automatizable, where weak automatizability means being polynomially simulated (see ) by an automatizable system. Hence, the above results left open whether weak proof systems, in particular those having feasible interpolation such as Resolution , were (weakly) automatizable. We refer to  for a survey, and from now on focus on Resolution.
Pudlák showed [33, Corollary 2] that the weak automatizability of a proof system is equivalent to the (polynomial time) separability of its so-called canonical NP-pair . This is, informally, the feasibility of distinguishing between satisfiable formulas and those with short refutations. Hence, to rule it out it suffices to reduce some inseparable disjoint NP pair to it. Atserias and Maneva  found pairs associated with two-player games to be useful in this respect. The two NP sets collect the games won by the respective players, and separation means deciding the game. Following [5, 22], Beckmann et al.  showed that Resolution is not weakly automatizable unless parity games are decidable in polynomial time. Note, however, that this might well be the case; in fact, parity games are decidable in quasi-polynomial time .
Moreover, some non-trivial automating algorithms are known. Beame and Pitassi  observed that treelike Resolution is automatizable in quasi-polynomial time. For general Resolution there is an algorithm that, when given a 3-CNF formula with variables that has a Resolution refutation of length at most , computes a refutation in time . This follows from the size-width trade-off of Ben-Sasson and Wigderson . Indeed, it is trivial to find a refutation of width at most in time if there is one (and, in general, time is necessary ).
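The bounded-width search mentioned above can be illustrated by a small saturation procedure. The following is a minimal sketch under stated assumptions, not the algorithm of any cited paper: the clause encoding (pairs of variable name and sign) and the function names are hypothetical, initial clauses are kept regardless of their width, and weakening steps are omitted because they are not needed to decide whether the empty clause is derivable.

```python
from typing import FrozenSet, Iterable, Tuple

# Hypothetical encoding: a literal is (variable name, sign), sign 1 = positive, 0 = negated.
Literal = Tuple[str, int]
Clause = FrozenSet[Literal]


def is_tautological(clause: Clause) -> bool:
    """A clause is tautological if it contains a variable and its negation."""
    return any((var, 1 - sign) in clause for var, sign in clause)


def bounded_width_refutable(cnf: Iterable[Clause], w: int) -> bool:
    """Saturate the CNF under resolution, keeping only resolvents with at most w literals.

    Returns True iff the empty clause is derived this way.
    """
    known = set(cnf)          # initial clauses are kept whatever their width (assumption)
    frontier = list(known)    # clauses whose resolvents have not been explored yet
    while frontier:
        c = frontier.pop()
        for d in list(known):
            # Try both orientations: the positive literal in one clause, the negative in the other.
            for pos, neg in ((c, d), (d, c)):
                for var, sign in pos:
                    if sign != 1 or (var, 0) not in neg:
                        continue
                    res = (pos - {(var, 1)}) | (neg - {(var, 0)})
                    if len(res) > w or is_tautological(res) or res in known:
                        continue
                    known.add(res)
                    frontier.append(res)
    return frozenset() in known
```

Since only clauses with at most w literals are ever added, the number of stored clauses is bounded by the number of such clauses over the given variables, which is where a running time exponential only in the width comes from.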
However, the automatizability of Resolution is unlikely. First, Buss et al.  showed, assuming only , that automatization is not possible in linear time. In fact, they proved more. They considered the optimization problem of finding, given a contradictory CNF, a Resolution refutation that is as short as possible. They reduced to it the optimization problem MMCSA of finding, given a monotone circuit, a satisfying assignment that has Hamming weight as small as possible. Known PCP theorems imply that this problem is not approximable with ratio , so the same holds for finding short Resolution refutations. This argument can be adapted to many other refutation systems (see ).
But the main convincing evidence that Resolution is not automatizable, before the result of this paper, was achieved by Alekhnovich and Razborov . By a different and ingenious reduction they showed that if Resolution, or even treelike Resolution, were automatizable, then the parameterized decision version of MMCSA would have, in the terminology of parameterized complexity theory (see [16, Proposition 5]), an fpt algorithm with constant approximation ratio. Now, the same paper  also established “the first nontrivial parameterized inapproximability result” [19, p.9] by further deriving a randomized fpt algorithm for the parameterized decision version of MMCSA, a well-known W[P]-complete problem (see e.g. [20, Theorem 3.14]). The randomized fpt algorithm has subsequently been derandomized by Eickmeyer et al. , hence Resolution is not automatizable unless .
Since this W[P]-hardness result applies not only to Resolution but even to treelike Resolution, which is automatizable in quasi-polynomial time, Alekhnovich and Razborov stated that the “main problem left open” [1, Section 5] is whether general Resolution is automatizable in quasi-polynomial time. We consider Theorem 1 an answer to this question.
The direct approach for proving the NP-hardness of the proof search problem is of course the following: find a polynomial time computable function that maps satisfiable instances of 3-SAT to CNF formulas with short Resolution refutations, and unsatisfiable instances of 3-SAT to CNF formulas without polynomially longer Resolution refutations. This is exactly what we do, and the construction is very explicit.
An idea of how such a map could be defined is implicit in . Pudlák [33, Theorem 2] maps a formula to , for some suitable for his context, where is a CNF formula that describes, in a natural way, a Resolution refutation of of length . He used this function to show that the canonical pair of Resolution is symmetric. In particular, he showed that, if is satisfiable, then has a short Resolution refutation. This refutation proceeds naturally by using a satisfying assignment for as a guide to find a true literal in each line of the alleged refutation, line by line one after another, until it gets stuck at the final empty clause. Conversely, we would like to show that, if is unsatisfiable, then is hard for Resolution. Intuitively, this should be the case: refuting means proving a lower bound and “our experience rather suggests that proving lower bounds is difficult” – this is what Pudlák [33, Section 3] states about a similar formula for strong proof systems.
However, even after considerable time and effort, we failed to prove a Resolution length lower bound for . We bypass the issue by considering a harder version of . The harder is obtained from relativizing seen as the propositional encoding of a first order formula with a built-in linear order, following the general relativization technique of Dantchev and Riis . The linear order is necessary for Pudlák’s upper bound to still go through. On the other hand, a random restriction argument in the style of  reduces a length lower bound for to a certain width lower bound for . The bulk of the current work is in establishing this width lower bound for , when is unsatisfiable. It is proved using a variant of the conditions from , a particular formalization of a Prover-Adversary argument as e.g. in ; the wording is meant to point out some analogy with forcing conditions . This is not straightforward. The main obstacle overcome by our variant is the presence of the built-in linear order of . In fact, Dantchev and Riis [18, Section 5] point out explicitly that their arguments fail in the presence of a built-in linear order.
The proof size estimation problem
Besides being computable in polynomial time, the map that takes to (for some suitable ) is indeed a very efficient gap-creating reduction: the upper bound is polynomial when is satisfiable, and the lower bound is (weakly) exponential when is unsatisfiable. If for a CNF formula we write for the length of a shortest Resolution refutation of , then we show:
Theorem 2. There are reals and a polynomial time computable function that maps every 3-CNF formula to a CNF formula such that, for being the size of :
1. if is satisfiable, then ;
2. if is unsatisfiable, then .
Moreover, and can be chosen arbitrarily close to and , respectively, which means that it is NP-hard to approximate the minimal Resolution refutation length to within .
The problem of computing minimal proof lengths also has a long history. For first-order logic, the problem dates back to Gödel’s famous letter to von Neumann; we refer to  for a historical discussion, to  for a proof of Gödel’s claim in the letter, and to  for some more recent results. In propositional logic, the problem has been shown to be NP-hard for a particular Frege system by Buss , and for Resolution by Iwama . Alekhnovich et al.  showed that the minimal Resolution refutation length cannot be approximated to within any fixed polynomial unless : for every there are functions and , computable in non-uniform polynomial time, such that for every CNF formula of sufficiently large size we have either or depending on the satisfiability . This falls short of ruling out automatizability because has exponential growth. Earlier, Iwama  found uniformly computable such functions with polynomially bounded but his gap was only versus , so it also falls short of ruling out automatizability.
In Section 2 we introduce some notation and basic terminology from propositional logic. Section 3 presents Resolution refutations as finite structures. Section 4 is devoted to the formula REF and proves the width lower bound when the input CNF is unsatisfiable (Lemma 4). Section 5 discusses the relativized formula RREF, the refutation length upper bound when the input CNF is satisfiable (Lemma 11), and the refutation length lower bound when it is unsatisfiable (Lemma 10). Theorems 2 and 1 are derived from these lemmas in Section 6. In Section 7 we discuss some open issues. Finally, for ease of reference, in Appendix A we give the detailed lists of clauses for the formulas REF and RREF.
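For $n \in \mathbb{N}$ we let $[n] := \{1, \ldots, n\}$ and understand that $[0] = \emptyset$. A partial function from a set $A$ to a set $B$ is a function $f$ with domain included in $A$ and image included in $B$. We view partial functions from $A$ to $B$ as sets of ordered pairs. For any set $C$, the restriction of $f$ to $C$ is $f \cap (C \times B)$. The restriction of $f$ with image $C$ is $f \cap (A \times C)$.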
We fix some notation for propositional logic. Let be a set of propositional variables that take truth values in , where denotes false and denotes true. A literal is a variable or its negation , also denoted . We also write for and for . A clause is a set of literals, that we write as a disjunction of its elements. A clause is non-tautological if it does not contain both a variable and its negation. The size of a clause is the number of literals in it. A CNF formula, or CNF, is a set of clauses, that we write as a conjunction of its elements. A -CNF, where , is a CNF in which all clauses have size at most . The size of a CNF is the sum of the sizes of its clauses.
An assignment, or restriction, is a partial map from the set of variables to . If is an assignment and is a literal, then satisfies if and ; it falsifies if and . If is a clause, then satisfies if it satisfies some literal of ; it falsifies if it falsifies every literal of . The restriction of by , denoted , is if satisfies and if falsifies ; if neither satisfies nor falsifies , then is the clause obtained from by removing all the falsified literals of , i.e., . If is a CNF, then is the CNF that contains for those which are neither satisfied nor falsified by , and that contains the empty clause if some is falsified by .
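The restriction operation just defined can be made concrete with a short sketch. The encoding below is an assumption made for illustration only (literals as pairs of a variable name and a sign, assignments as dictionaries, the integers 1 and 0 standing for the constants returned when a clause is satisfied or falsified); the function names are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple, Union

# Hypothetical encoding: a literal is (variable name, sign), sign 1 = positive, 0 = negated.
Literal = Tuple[str, int]
Clause = FrozenSet[Literal]
Assignment = Dict[str, int]   # partial map from variables to {0, 1}


def restrict_clause(clause: Clause, alpha: Assignment) -> Union[int, Clause]:
    """Return 1 if alpha satisfies the clause, 0 if alpha falsifies it,
    and otherwise the clause with all falsified literals removed."""
    remaining = set()
    for var, sign in clause:
        if var in alpha:
            if alpha[var] == sign:
                return 1          # some literal is satisfied
            # otherwise the literal is falsified and is dropped
        else:
            remaining.add((var, sign))
    if not remaining:
        return 0                  # every literal is falsified
    return frozenset(remaining)


def restrict_cnf(cnf: Iterable[Clause], alpha: Assignment) -> Set[Clause]:
    """Restrict every clause; satisfied clauses disappear, a falsified one yields the empty clause."""
    result: Set[Clause] = set()
    for clause in cnf:
        restricted = restrict_clause(clause, alpha)
        if isinstance(restricted, int):
            if restricted == 0:
                result.add(frozenset())
        else:
            result.add(restricted)
    return result
```

For instance, restricting the clauses (x or not y), (y or z) and (not x) by the assignment that sets x to 1 leaves (y or z) together with the empty clause.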
A clause is a weakening of clause if . A clause is a resolvent of clauses and if there is a variable such that and , and ; we then speak of the resolvent of and on , that we denote by . We also say that is obtained from and by a cut on .
Let be a CNF. A Resolution proof from is a sequence of non-tautological clauses, where and, for all , it holds that is a weakening of a clause in , or there are such that is a weakening of a resolvent of and . The length of the proof is ; each is a line. A Resolution refutation of is a proof from that ends with the empty clause, i.e., . We let denote the minimal such that has a Resolution refutation of length ; if is satisfiable we let . For a sequence of clauses let be obtained from by removing 1’s and replacing 0’s by the empty clause. It is clear that if is a Resolution refutation of of length , then is a Resolution refutation of of length at most .
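To make the definition of a Resolution refutation concrete, here is a minimal checker sketch. It assumes the standard reading of the definitions above: a weakening of a clause is any superset of it, and the resolvent of two clauses on a variable removes the positive literal from the first and the negative literal from the second. The encoding and all names are hypothetical and chosen only for this illustration.

```python
from typing import FrozenSet, Iterable, Optional, Sequence, Tuple

# Hypothetical encoding: a literal is (variable name, sign), sign 1 = positive, 0 = negated.
Literal = Tuple[str, int]
Clause = FrozenSet[Literal]


def is_tautological(clause: Clause) -> bool:
    """A clause is tautological if it contains a variable and its negation."""
    return any((var, 1 - sign) in clause for var, sign in clause)


def resolvent(c: Clause, d: Clause, var: str) -> Optional[Clause]:
    """Resolvent of c and d on var, or None if var is not positive in c and negative in d."""
    if (var, 1) not in c or (var, 0) not in d:
        return None
    return (c - {(var, 1)}) | (d - {(var, 0)})


def is_refutation(cnf: Iterable[Clause], proof: Sequence[Clause]) -> bool:
    """Check that proof is a Resolution refutation of cnf in the sense sketched above."""
    clauses = list(cnf)
    if not proof or proof[-1] != frozenset():
        return False                      # a refutation must end with the empty clause
    for i, line in enumerate(proof):
        if is_tautological(line):
            return False
        if any(c <= line for c in clauses):
            continue                      # weakening of an initial clause
        earlier = proof[:i]
        justified = False
        for c in earlier:
            for d in earlier:
                for var, sign in c:
                    if sign != 1:
                        continue
                    res = resolvent(c, d, var)
                    if res is not None and res <= line:
                        justified = True  # weakening of a resolvent of earlier lines
        if not justified:
            return False
    return True
```

The checker tries every ordered pair of earlier lines, so it is quadratic in the proof length per line; this is meant to mirror the definition, not to be efficient.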
3 Refutations as structures
For this section we fix a CNF with variables and clauses . We view Resolution refutations of of length as finite structures with a ternary relation and four unary functions :
The meaning of is that the literal is in . For each exactly one of or is non-zero. The meaning of is that is a weakening of the resolvent of and on , where and , and and . The meaning of is that is a weakening of the clause of . Formally, a structure of type (1) is a refutation of of length if the following hold for all , , , and :
(R1) or , but not both;
(R2) if , then both and ;
(R4a) if and , then ;
(R4b) if and , then ;
(R5a) if , , and , then ;
(R5b) if , , and , then ;
(R6) if and appears in , then ;
In words, (R1) determines, for every line , whether it is a weakening of an initial clause, i.e., , or a weakening of a resolvent, i.e., . In the first case by (R6). In the second case, by (R4) and (R5), with (R2) and (R3) ensuring that and are earlier lines in the sequence. Finally, (R7) ensures no is tautological, and (R8) ensures is empty.
We give an example that will play a crucial role in the proof of the width lower bound.
Example 3. We use to denote the full-tree Resolution refutation of . It has length
and its clauses are arranged in the form of a full binary tree of height with internal nodes and leaves. This tree has one node at level for every that is labelled by the clause
that is, the unique clause in these variables falsified by the assignment that maps to . In particular, the root of the tree is labelled by the empty clause and, for and , the clause that labels node is the resolvent of the clauses and that label the children nodes and on the variable , i.e., . Since is unsatisfiable, every clause that labels a leaf is a weakening of some clause of .
To view this refutation as a structure of type (1) we have to identify the nodes with numbers in . We first identify the leaves, i.e., the nodes with , with the numbers , then we identify the nodes on level , i.e., the nodes with , with the numbers in and so on, with the root getting .
Let for . We set if , and if . We set if , and if and is, say, smallest such that is a weakening of . We set if . If we set and . Finally, if and only if and .
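The full-tree refutation of this example can be generated mechanically. The sketch below is an illustration under assumptions, not the paper's formal encoding as a structure: literals are pairs of a variable name and a sign, the lines are listed with the leaves first and the root last (as in the numbering described above), and the function name is hypothetical. A full binary tree whose height equals the number n of variables has 2^(n+1) - 1 nodes, which is the length of the list returned.

```python
from itertools import product
from typing import FrozenSet, Iterable, List, Sequence, Tuple

# Hypothetical encoding: a literal is (variable name, sign), sign 1 = positive, 0 = negated.
Literal = Tuple[str, int]
Clause = FrozenSet[Literal]


def full_tree_refutation(cnf: Iterable[Clause], variables: Sequence[str]) -> List[Clause]:
    """List the clauses of the full-tree refutation, leaves first and root last.

    The node for a 0/1 string of length i is labelled by the unique clause in the
    first i variables that is falsified by the corresponding assignment.
    Raises ValueError if some leaf is not a weakening of a clause of cnf,
    which happens exactly when the assignment of that leaf satisfies cnf.
    """
    clauses = list(cnf)
    n = len(variables)
    proof: List[Clause] = []
    for level in range(n, -1, -1):            # level n = leaves, level 0 = root
        for bits in product((0, 1), repeat=level):
            # Setting variable i to bits[i] falsifies the literal with sign 1 - bits[i].
            proof.append(frozenset((variables[i], 1 - bits[i]) for i in range(level)))
    for leaf in proof[: 2 ** n]:
        if not any(c <= leaf for c in clauses):
            raise ValueError("the CNF is satisfiable, so it has no full-tree refutation")
    return proof
```

Each internal node is the resolvent of its two children on the next variable, since the two child clauses differ only in the sign of that variable.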
4 Non-relativized formula REF
Given a CNF with variables and non-tautological clauses , and a natural number , we describe a CNF formula that is satisfiable if and only if has a refutation of length . Its variables are:
for , , indicating that .
for , indicating that .
for , indicating that .
for , indicating that .
for , indicating that .
Clearly, any assignment to these variables describes a ternary relation and binary relations , , and . The clauses of are listed in Table 4 of Appendix A. This set of clauses is satisfied precisely by those assignments that describe refutations of of length . Conversely, given a structure as in (1) the associated assignment satisfies if and only if is a refutation of of length ; this assignment maps variables , , , , and to 1 or 0 depending on whether, respectively, , , , , and or not.
The index is mentioned in the variables
Observe that if , then is not mentioned in or . The index-width of a clause is the number of indices mentioned by some variable occurring in the clause. Observe that all clauses of have index-width at most two. The index-width of a Resolution refutation is the maximum index-width of its clauses.
Lemma 4. For all integers with and every unsatisfiable CNF with variables, every Resolution refutation of has index-width at least .
Fix an unsatisfiable CNF with variables and clauses. For this proof let denote the formula and let denote the formula , where is the length of the full-tree Resolution refutation of from Example 3, which exists for because it is unsatisfiable. Let be the assignment associated to .
Let be an integer such that and note that since and . We partition into intervals where
In the notation of Example 3, is the set of many nodes at the top levels of the full binary tree. For , the -th block is the set of nodes at level of the full binary tree. In particular, is the set of leaves.
Likewise, we partition into intervals where
Observe that ; let be the bijection defined by so that for all it holds that
Observe that for all :
with the second following from and .
Let be the collection of partial functions such that:
|(H3)||if , then ,|
|(H4)||if with , then .|
In words, condition (H4) says that preserves membership in matching intervals, and (H3) says that the 0-intervals are kept intact through the fixed bijection . Preserving the intervals has the following important consequence:
Claim 5. For every and the following hold:
if , then ,
if , then .
Property 1 follows from (H1) and (H2). To prove 2 we distinguish several cases: If , then by (H4), hence and by (H2), which is smaller than . If for some , then by (H4), hence , and by (H4) again, which is smaller than . If , then first note that by (H3). We distinguish the cases whether or not. In case , we have . Since , by (2) we have . In case , we have , so by (H4), which is smaller than . The proof of 3 is analogous to that of 2. ∎
For a set , let
A condition is a pair , where and are functions in , such that
We say a condition extends if , i.e., extends as a function. Observe, since ,
We define a partial truth assignment that sets the variables of as follows. Note that if , , and are variables of , then , , and are variables of which are evaluated by . The assignment is defined precisely on the variables of that mention some . For such it maps
to , for all and ;
to , for all ;
to , for all ;
to or indicating whether , for all ;
to or indicating whether , for all .
Note that and belong to for every , so is defined in the last two cases.
Clearly, if a condition extends , then . For , the restriction of to , denoted , is the pair where is the restriction of to , and is the restriction of with image .
Claim 6. If is a condition and , then is a condition and .
The requirement that and belong to is obviously satisfied since (H1)-(H4) are preserved by restrictions to subsets that contain . (C1) and (C2) are clear, so is a condition. The inclusion is clear since extends . ∎
Claim 7. If is a condition with and , then there exists a condition that extends and such that .
It is clear that . Write and . We have to find such that . Assume at least one of is not in . Then it is distinct from (i.e., ), say it is in . If , we find as the pre-images of under . Otherwise and we choose such that is injective. This can be done because by (3) or (4), so by (5)
It is clear that . ∎
Claim 8. If is a condition and is a clause of , then .
Let , write and assume is defined on all variables of . Then is defined on all indices mentioned by . We distinguish by cases according to the type (A1)-(A21) of .
In case is of type (A1), i.e., for some , then equals and this is 1 because is a clause of . Case (A2) is similar.
In case (A3), ( and) satisfies for . Note , so is well-defined. Hence . Case (A4) is similar.
In case (A5), equals and this is 1 because is a clause of . Case (A6) is similar.
In case (A7), or is distinct from and then, respectively, or is falsified by . Hence . Case (A8) is similar.
In case (A9), equals . But this is 1 since is a clause of . Case (A10) is similar.
In case (A11), note implies , so by Claim 5 (1). Then is a leaf and . Hence , so . Case (A12) is similar.
In case (A13), note implies . But by (C1) and Claim 5 (2). Case (A14) is similar.
In case (A15), implies and . Hence (by (C1)) and . Further, implies and . Hence falsifies the clause of , a contradiction. Cases (A16)-(A18) are similar.
In case (A19), implies that falsifies the clause of , a contradiction. Case (A20) is similar.
In case (A21), implies and falsifies the . But this is a clause of since by (H3) – contradiction.
This finishes the proof of Claim 8. ∎
We are ready to finish the proof of the lemma. Let be the set of conditions with . Assume that there exists a Resolution refutation of of index-width smaller than . Let where and note that , so . The assignment is empty and falsifies the empty clause, the last clause of the refutation. Let be the earliest clause in the refutation such that for some condition . In particular, is defined on all variables of . By Claim 8, is not a weakening of a clause from . Hence, is obtained by a cut of earlier clauses and on some variable. Let be the index mentioned by this variable. Choose according to Claim 7. Then is defined on all variables in , , and and extends the partial assignment , so falsifies . By soundness it falsifies or , say, it falsifies . Let be the restriction of to the indices mentioned in . Then falsifies and by Claim 6. This contradicts the choice of . ∎
The width lower bound in the previous lemma does not have much to do with Resolution; a more general version can be formulated using the notions of semantic refutations and Poizat width from . The notion of a Poizat tree is straightforwardly adapted to the many-sorted structures coding refutations. Define index Poizat width like Poizat width but using the index height of a Poizat tree: the maximum over its branches of the number of indices from appearing in queries of the branch. With this notion, the conclusion of the above lemma can be strengthened to: every semantic refutation of contains a formula of index Poizat width at least .
5 Relativized formula RREF
Given a CNF formula with variables and clauses, and a natural number , we define the CNF formula as follows. We again write for the variables and for the clauses of