First-order automated theorem provers, commonly based on refinements and extensions of resolution and superposition calculi [Vampire, EProver, Spass, spassT, Beagle, cruanes2015extending, prover9-mace4], have recently achieved a high degree of maturity. Proof production is a key feature that has been gaining importance, as proofs are crucial for applications that require certification of a prover’s answers or that extract additional information from proofs (e.g. unsat cores, interpolants, instances of quantified variables). Nevertheless, proof production is non-trivial [SchulzAPPA], and the most efficient provers do not necessarily generate the shortest proofs. One reason for this is that efficient resolution provers use refinements that restrict the application of inference rules. Although fewer clauses are generated and the search space is reduced, refinements may exclude short proofs whose inferences do not satisfy the restriction.
Longer and larger proofs take more time to check, may consume more memory during proof checking, and occupy more storage space. They may also have a larger unsat core (if more input clauses are used in the proof) and a larger Herbrand sequent (if more variables are instantiated) [B10, B12, B16, ResolutionHerbrand, Reis]. For these technical reasons, it is worth pursuing efficient algorithms that compress proofs after they have been found. Furthermore, the problem of proof compression is closely related to Hilbert’s 24th Problem [Hilbert24Problem], which asks for criteria to judge the simplicity of proofs. Proof length is arguably one possible criterion for some applications.
For propositional resolution proofs, such as those typically generated by SAT- and SMT-solvers, there is a wide variety of proof compression techniques. Algebraic properties of the resolution operation that are potentially useful for compression were investigated in [bwp10]. Compression algorithms based on rearranging and sharing chains of resolution inferences have been developed in [Amjad07] and [Sinz]. Cotton [CottonSplit] proposed an algorithm that compresses a refutation by repeatedly splitting it into a proof of a heuristically chosen literal $\ell$ and a proof of its complement $\overline{\ell}$, and then resolving them to form a new refutation. The algorithm of [RedRec] searches for locally redundant subproofs that can be rewritten into subproofs of stronger clauses and with fewer resolution steps. Bar-Ilan et al. [RP08] and Fontaine et al. [LURPI] described a linear-time proof compression algorithm based on partial regularization, which removes an inference when it is redundant in the sense that its pivot literal already occurs as the pivot of another inference on every path from that inference to the root of the proof.
In contrast, although proof output has been a concern in first-order automated reasoning for a longer time than in propositional SAT-solving, there has been much less work on simplifying first-order proofs. For tree-like sequent calculus proofs, algorithms based on cut-introduction [BrunoLPAR, Hetzl] have been proposed. However, converting a DAG-like resolution or superposition proof, as usually generated by current provers, into a tree-like sequent calculus proof may increase the size of the proof. For arbitrary proofs in the Thousands of Problems for Theorem Provers (TPTP) [TPTP] format (including DAG-like first-order resolution proofs), there is an algorithm [LPARCzech] that looks for terms that occur often in any Thousands of Solutions from Theorem Provers (TSTP) [TPTP] proof and abbreviates them.
The work reported in this paper is part of a new trend that aims at lifting successful propositional proof compression algorithms to first-order logic. Our first target was the propositional LowerUnits (LU) algorithm [LURPI], which delays resolution steps with unit clauses, and we lifted it to a new algorithm that we called GreedyLinearFirstOrderLowerUnits (GFOLU) [GFOLU]. Here we continue this line of research by lifting the RecyclePivotsWithIntersection (RPI) algorithm [LURPI], which improves the RecyclePivots (RP) algorithm [RP08] by detecting nodes that can be regularized even when they have multiple children.
Section 2 introduces the well-known first-order resolution calculus with notations that are suitable for describing and manipulating proofs as first-class objects. Section 3 summarizes the propositional RPI algorithm. Section 4 discusses the challenges that arise in the first-order case (mainly due to unification), which are not present in the propositional case, and concludes with conditions useful for first-order regularization. Section 5 describes an algorithm that overcomes these challenges. Section 6 presents experimental results obtained by applying this algorithm, and its combinations with GFOLU, on hundreds of proofs generated with the SPASS theorem prover on TPTP benchmarks [TPTP] and on randomly generated proofs. Section 7 concludes the paper.
It is important to emphasize that this paper targets proofs in a pure first-order resolution calculus (with resolution and factoring rules only), without refinements or extensions, and without equality rules. As most state-of-the-art resolution-based provers use variations and extensions of this pure calculus and there exists no common proof format, the presented algorithm cannot be directly applied to the proofs generated by most provers, and even SPASS had to be specially configured to disable its extensions in order to generate pure resolution proofs for our experiments. By targeting the pure first-order resolution calculus, we address the common theoretical basis for the calculi of various provers. In the Conclusion (Section 7), we briefly discuss what could be done to tackle common variations and extensions, such as splitting and equality reasoning. Nevertheless, they remain topics for future research beyond the scope of this paper.
2 The Resolution Calculus
As usual, our language has infinitely many variable symbols, constant symbols, function symbols of every arity and predicate symbols of every arity. A term is any variable, constant or the application of an $n$-ary function symbol to $n$ terms. An atomic formula (atom) is the application of an $n$-ary predicate symbol to $n$ terms. A literal is an atom or the negation of an atom. The complement of a literal $\ell$ is denoted $\overline{\ell}$ (i.e. for any atom $p$, $\overline{p} = \neg p$ and $\overline{\neg p} = p$). The underlying atom of a literal $\ell$ is denoted $|\ell|$ (i.e. for any atom $p$, $|p| = p$ and $|\neg p| = p$). A clause is a multiset of literals. $\bot$ denotes the empty clause. A unit clause is a clause with a single literal. Sequent notation is used for clauses (i.e. $p_1, \ldots, p_n \vdash q_1, \ldots, q_m$ denotes the clause $\{\neg p_1, \ldots, \neg p_n, q_1, \ldots, q_m\}$). $\mathrm{Var}(t)$ (resp. $\mathrm{Var}(\ell)$, $\mathrm{Var}(\Gamma)$) denotes the set of variables in the term $t$ (resp. in the literal $\ell$ and in the clause $\Gamma$). A substitution $\{x_1 \backslash t_1, x_2 \backslash t_2, \ldots\}$ is a mapping from variables $\{x_1, x_2, \ldots\}$ to, respectively, terms $\{t_1, t_2, \ldots\}$. The application of a substitution $\sigma$ to a term $t$, a literal $\ell$ or a clause $\Gamma$ results in, respectively, the term $t\sigma$, the literal $\ell\sigma$ or the clause $\Gamma\sigma$, obtained from $t$, $\ell$ and $\Gamma$ by replacing all occurrences of the variables in $\sigma$ by the corresponding terms in $\sigma$. A literal $\ell$ matches another literal $\ell'$ if there is a substitution $\sigma$ such that $\ell\sigma = \ell'$. A unifier of a set of literals is a substitution that makes all literals in the set equal. We will use $X \sqsubseteq Y$ to denote that $X$ subsumes $Y$, when there exists a substitution $\sigma$ such that $X\sigma \subseteq Y$.
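The notions of substitution application and matching defined above can be illustrated with a minimal Python sketch. The encoding is an assumption made only for this illustration and is not taken from the prototype: variables are capitalized strings, constants are lowercase strings, compound terms and atoms are tuples whose first component is the symbol, and a negative literal is a tuple with the hypothetical symbol `"~"`. The function names `apply`, `match`, `complement` and `atom` are likewise hypothetical.

```python
def is_var(t):
    # Assumed convention: variables are strings starting with an uppercase letter.
    return isinstance(t, str) and t[:1].isupper()

def apply(t, sigma):
    """Apply the substitution sigma (a dict from variables to terms) to t."""
    if is_var(t):
        return sigma.get(t, t)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply(a, sigma) for a in t[1:])
    return t  # constant

def match(pattern, target, sigma=None):
    """Return sigma with apply(pattern, sigma) == target, or None if none exists."""
    sigma = dict(sigma or {})
    if is_var(pattern):
        if pattern in sigma:
            return sigma if sigma[pattern] == target else None
        sigma[pattern] = target
        return sigma
    if isinstance(pattern, tuple) and isinstance(target, tuple):
        if pattern[0] != target[0] or len(pattern) != len(target):
            return None
        for p, t in zip(pattern[1:], target[1:]):
            sigma = match(p, t, sigma)
            if sigma is None:
                return None
        return sigma
    return sigma if pattern == target else None

def complement(lit):
    # The complement removes or adds the negation symbol.
    return lit[1] if lit[0] == "~" else ("~", lit)

def atom(lit):
    # The underlying atom of a literal.
    return lit[1] if lit[0] == "~" else lit

sigma = match(("p", "X", ("f", "Y")), ("p", "a", ("f", "b")))
```

Matching is one-directional: only variables of the first literal are instantiated, which is what distinguishes it from unification.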
The resolution calculus used in this paper has the following inference rules:
Definition 2.1 (Resolution).
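\[ \frac{\Gamma_L \cup \{\ell_L\} \qquad \Gamma_R \cup \{\ell_R\}}{\Gamma_L\sigma_L \cup \Gamma_R\sigma_R} \] where $\sigma_L$ and $\sigma_R$ are substitutions such that $\ell_L\sigma_L = \overline{\ell_R}\sigma_R$. The literals $\ell_L$ and $\ell_R$ are resolved literals, whereas $\ell_L\sigma_L$ and $\ell_R\sigma_R$ are its instantiated resolved literals. The pivot is the underlying atom of its instantiated resolved literals (i.e. $|\ell_L\sigma_L|$ or, equivalently, $|\ell_R\sigma_R|$).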
Definition 2.2 (Factoring).
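\[ \frac{\Gamma \cup \{\ell_1, \ldots, \ell_n\}}{\Gamma\sigma \cup \{\ell\}} \] where $\sigma$ is a unifier of $\{\ell_1, \ldots, \ell_n\}$ and $\ell = \ell_i\sigma$ for any $i \in \{1, \ldots, n\}$.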
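To make the resolution rule concrete, the following Python sketch performs a single resolution step. It is only an illustration under stated assumptions: terms are nested tuples with capitalized strings as variables, negation is encoded with the hypothetical symbol `"~"`, clauses are lists (standing for multisets), and the two premises are assumed to be variable-disjoint so that one most general unifier can play the role of both substitutions of Definition 2.1. The names `unify`, `subst` and `resolution` are hypothetical.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, s):
    # Follow variable bindings until an unbound variable or a non-variable is reached.
    while is_var(t) and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    t = walk(t, s)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, s) for a in t[1:])

def unify(a, b, s=None):
    """Return a most general unifier of a and b extending s, or None."""
    s = {} if s is None else s
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return None if occurs(a, b, s) else {**s, a: b}
    if is_var(b):
        return unify(b, a, s)
    if isinstance(a, tuple) and isinstance(b, tuple) and a[0] == b[0] and len(a) == len(b):
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def subst(t, s):
    # Fully apply the (triangular) substitution s to t.
    t = walk(t, s)
    if isinstance(t, tuple):
        return (t[0],) + tuple(subst(a, s) for a in t[1:])
    return t

def complement(lit):
    return lit[1] if lit[0] == "~" else ("~", lit)

def resolution(left, l_lit, right, r_lit):
    """Resolve clause `left` on l_lit against clause `right` on r_lit."""
    sigma = unify(l_lit, complement(r_lit))
    if sigma is None:
        return None
    rest_left, rest_right = list(left), list(right)
    rest_left.remove(l_lit)    # remove one occurrence of each resolved literal
    rest_right.remove(r_lit)
    return [subst(x, sigma) for x in rest_left + rest_right]

resolvent = resolution([("p", "X"), ("q", "X")], ("p", "X"),
                       [("~", ("p", "a"))], ("~", ("p", "a")))
```

In this example the resolved literals are `p(X)` and the negation of `p(a)`, and the pivot is the atom `p(a)`.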
A resolution proof is a directed acyclic graph of clauses where the edges correspond to the inference rules of resolution and factoring, as explained in detail in Definition 2.3. A resolution refutation is a resolution proof with root .
Definition 2.3 (First-Order Resolution Proof).
A directed acyclic graph , where is a set of nodes and is a set of edges labeled by literals and substitutions (i.e. , where is the set of all literals and is the set of all substitutions, and denotes an edge from node to node labeled by the literal and the substitution ), is a proof of a clause iff it is inductively constructible according to the following cases:
Axiom: If is a clause, denotes some proof , where is a new (axiom) node.
Resolution (referred to as “binary resolution” elsewhere, with the understanding that “binary” refers to the number of resolved literals, rather than the number of premises of the inference rule): If is a proof and is a proof , where and satisfy the requirements of Definition 2.1, then denotes a proof s.t.
where is a new (resolution) node and denotes the root node of .
Factoring: If is a proof such that satisfies the requirements of Definition 2.2, then denotes a proof s.t.
where is a new (factoring) node, and denotes the root node of . ∎
An example first-order resolution proof is shown below.
: : : : : : :
The nodes , , and are axioms. Node is obtained by resolution on and where , , and . The node is obtained by a factoring on with . The node is the result of resolution on and with , , . Lastly, the conclusion node is the result of a resolution of and , where , , , and . The directed acyclic graph representation of the proof (with edge labels omitted) is shown in Figure 1.
This section explains RecyclePivotsWithIntersection (RPI) [LURPI], which aims to compress irregular propositional proofs. It can be seen as a simple but significant modification of the RecyclePivots (RP) algorithm described in [RP08], from which it derives its name. Although in the worst case full regularization can increase the proof length exponentially [Tseitin], these algorithms show that many irregular proofs can have their length decreased if a careful partial regularization is performed.
We write to denote a proof-context with a single placeholder replaced by the subproof . We say that a proof of the form is irregular.
Consider an irregular proof and assume, without loss of generality, that and , as in the proof of below. The proof of can be written as , or where is the sub-proof of .
: : : :
: Then, if is replaced by within the proof-context , the clause subsumes the clause , because even though the literal of is propagated down, it gets resolved against the literal of later on below in the proof. More precisely, even though it might be the case that while , it is necessarily the case that and . In this case, the proof can be regularized as follows.
Although the remarks above suggest that it is safe to replace by within the proof-context , this is not always the case. If a node in has a child in , then the literal might be propagated down to the root of the proof, and hence, the clause might not subsume the clause . Therefore, it is only safe to do the replacement if the literal gets resolved in all paths from to the root or if it already occurs in the root clause of the original proof .
These observations lead to the idea of traversing the proof in a bottom-up manner, storing for every node a set of safe literals that get resolved in all paths below it in the proof (or that already occurred in the root clause of the original proof). Moreover, if one of the node’s resolved literals belongs to the set of safe literals, then it is possible to regularize the node by replacing it by one of its parents (cf. Algorithm 1).
Regularizing a node replaces it by one of its parents, more precisely by the parent whose clause contains the safe resolved literal. After regularization, all nodes below the regularized node may have to be fixed. However, since the regularization is done with a bottom-up traversal, and only nodes below the regularized node need to be fixed, it is again possible to postpone fixing and do it with only a single traversal afterwards. Therefore, instead of replacing the irregular node by one of its parents immediately, its other parent is marked as deletedNode, as shown in Algorithm 2. Only later, during fixing, is the irregular node actually replaced by its surviving parent (i.e. the parent that is not marked as deletedNode).
The set of safe literals of a node $\eta$ can be computed from the set of safe literals of its children (cf. Algorithm 3). In the case when $\eta$ has a single child $\varsigma$, the safe literals of $\eta$ are simply the safe literals of $\varsigma$ together with the resolved literal $\ell$ of $\varsigma$ belonging to $\eta$ ($\ell$ is safe for $\eta$, because whenever $\ell$ is propagated down the proof through $\varsigma$, $\ell$ gets resolved in $\varsigma$). It is important to note, however, that if $\varsigma$ has been marked as regularized, it will eventually be replaced by $\eta$, and hence $\ell$ should not be added to the safe literals of $\eta$. In this case, the safe literals of $\eta$ should be exactly the same as the safe literals of $\varsigma$. When $\eta$ has several children, the safe literals of $\eta$ w.r.t. a child $\varsigma_i$ contain literals that are safe on all paths that go from $\eta$ through $\varsigma_i$ to the root. For a literal to be safe for all paths from $\eta$ to the root, it should therefore be in the intersection of the sets of safe literals w.r.t. each child.
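The bottom-up computation of safe literals described above can be sketched in Python as follows. This is an illustrative sketch and not a transcription of Algorithms 1 to 3: the `Node` class, the encoding of propositional literals as strings such as `"p"` and `"-p"`, and the function names are assumptions made for the example. Each node stores its incoming edges as pairs of a parent and the resolved literal belonging to that parent.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)  # nodes are compared and hashed by identity
class Node:
    name: str
    clause: frozenset
    premises: list = field(default_factory=list)  # [(parent, resolved literal of parent)]

def bottom_up(root):
    """Return the nodes reachable from root, each child before its parents."""
    order, seen = [], set()
    def visit(n):
        if n in seen:
            return
        seen.add(n)
        for parent, _ in n.premises:
            visit(parent)
        order.append(n)
    visit(root)
    return order[::-1]

def mark_regularizable(root):
    order = bottom_up(root)
    children = {}
    for n in order:
        for parent, lit in n.premises:
            children.setdefault(parent, []).append((n, lit))
    safe, regularized = {}, set()
    for n in order:
        if n is root:
            safe[n] = set(n.clause)
        else:
            # Safe literals w.r.t. each child; a regularized child contributes
            # its own safe literals only, without the resolved literal.
            per_child = [safe[c] if c in regularized else safe[c] | {lit}
                         for c, lit in children[n]]
            safe[n] = set.intersection(*per_child)
        if any(lit in safe[n] for _, lit in n.premises):
            regularized.add(n)
    return safe, regularized

# A small irregular refutation: the pivot p is resolved in n3 and again at the root.
x1 = Node("x1", frozenset({"p", "a"}))
x2 = Node("x2", frozenset({"-p", "a"}))
n3 = Node("n3", frozenset({"a"}), [(x1, "p"), (x2, "-p")])
x3 = Node("x3", frozenset({"-a", "p"}))
n4 = Node("n4", frozenset({"p"}), [(n3, "a"), (x3, "-a")])
x4 = Node("x4", frozenset({"-p"}))
root = Node("root", frozenset(), [(n4, "p"), (x4, "-p")])

safe, regularized = mark_regularizable(root)
```

In this example only `n3` is marked: its resolved literal `p` is safe, so `n3` could be replaced by its parent `x1`, after which resolving with `x3` and `x4` still yields the empty clause. The fixing traversal that performs the replacement is not shown.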
The RP and the RPI algorithms differ from each other mainly in the computation of the safe literals of a node that has many children. While RPI returns the intersection as shown in Algorithm 3, RP returns the empty set (cf. Algorithm 4). Additionally, while in RPI the safe literals of the root node contain all the literals of the root clause, in RP the root node is always assigned an empty set of literals. (Of course, this makes a difference only when the proof is not a refutation.)
Note that during a traversal of the proof, the lines from 5 to 10 in Algorithm 3 are executed as many times as the number of edges in the proof. Since every node has at most two parents, the number of edges is at most twice the number of nodes. Therefore, during a traversal of a proof with $n$ nodes, lines from 5 to 10 are executed at most $2n$ times, and the algorithm remains linear.
In our prototype implementation, the sets of safe literals are instances of Scala’s mutable.HashSet class. Being mutable, new elements can be added efficiently. And being HashSets, membership checking is done in constant time in the average case, and set intersection (line 12) can be done in $O(ks)$, where $k$ is the number of sets and $s$ is the size of the smallest set.
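The stated bound for intersecting hash sets can be obtained by iterating over the smallest set and testing membership in the others. The Python sketch below illustrates this idea; it is not the prototype's Scala code, and the function name `intersect_all` is hypothetical.

```python
def intersect_all(sets):
    """Intersect hash sets by scanning only the smallest one.

    Each element of the smallest set is looked up in every set, which takes
    constant time per lookup on average, giving the product of the number of
    sets and the size of the smallest set.
    """
    if not sets:
        return set()
    smallest = min(sets, key=len)
    return {x for x in smallest if all(x in s for s in sets)}

result = intersect_all([{"p", "q", "r"}, {"q", "r"}, {"r", "s", "q", "t"}])
```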
When applied to the proof shown in Figure 2(a), the algorithm assigns and as the safe literals of, respectively, and . The safe literals of w.r.t. its children and are respectively and , and hence the safe literals of are (the intersection of and ). Since the right resolved literal of () belongs to ’s safe literals, is correctly detected as a redundant node and hence regularized: is replaced by its right parent . The resulting proof is shown in Figure 2(b).
4 Lifting to First-Order
In this section, we describe challenges that have to be overcome in order to successfully adapt RPI to the first-order case. The first example illustrates the need to take unification into account. The other two examples discuss complex issues that can arise when unification is taken into account in a naive way.
Consider the following proof . When computed as in the propositional case, the safe literals for are .
: : : : : : :
As neither of ’s resolved literals is syntactically equal to a safe literal, the propositional algorithm would not change . However, ’s left resolved literal is unifiable with the safe literal . Regularizing , by deleting the edge between and and replacing by , leads to further deletion of (because it is not resolvable with ) and finally to the much shorter proof below.
: : :
Unlike in the propositional case, where a resolved literal must be syntactically equal to a safe literal for regularization to be possible, the example above suggests that, in the first-order case, it might suffice that the resolved literal be unifiable with a safe literal. However, there are cases, as shown in the example below, where mere unifiability is not enough and greater care is needed.
The node appears to be a candidate for regularization when the safe literals are computed as in the propositional case and unification is considered naïvely. Note that , and the resolved literal is unifiable with the safe literal ,
: : : : : : :
However, if we attempt to regularize the proof, the same series of actions as in Example 4.1 would require resolution between and , which is not possible.
One way to prevent the problem depicted above would be to require the resolved literal not only to be unifiable with a safe literal but to subsume it. A weaker (and better) requirement is possible; it needs a slight modification of the concept of safe literals, taking into account the unifications that occur on the paths from a node to the root.
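The set of safe literals for a node $\eta$ in a proof $\psi$ with root clause $\Gamma$, denoted $\mathcal{S}(\eta)$, is such that $\ell \in \mathcal{S}(\eta)$ if and only if $\ell \in \Gamma$ or for all paths from $\eta$ to the root of $\psi$ there is an edge $\eta_1 \xrightarrow{\ell', \sigma} \eta_2$ with $\ell'\sigma = \ell$.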
As in the propositional case, safe literals can be computed in a bottom-up traversal of the proof. Initially, at the root, the safe literals are exactly the literals that occur in the root clause. As we go up, the set of safe literals of a parent node $\eta'$ of $\eta$, where $\eta' \xrightarrow{\ell, \sigma} \eta$, is set to $\mathcal{S}(\eta) \cup \{\ell\sigma\}$. Note that we apply the substitution to the resolved literal before adding it to the set of safe literals (cf. Algorithm 3, lines 8 and 10). In other words, in the first-order case, the set of safe literals has to be a set of instantiated resolved literals.
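The step of instantiating a resolved literal before adding it to the safe literals can be sketched in Python as follows. The sketch covers only a parent with a single child, and its encoding is assumed for illustration: terms are nested tuples, variables are capitalized strings, and the function names are hypothetical.

```python
def is_var(t):
    # Assumed convention: variables are strings starting with an uppercase letter.
    return isinstance(t, str) and t[:1].isupper()

def apply(t, sigma):
    """Apply the substitution sigma (a dict from variables to terms) to t."""
    if is_var(t):
        return sigma.get(t, t)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply(a, sigma) for a in t[1:])
    return t

def parent_safe_literals(child_safe, resolved_literal, sigma):
    """Safe literals of a parent w.r.t. one child: the child's safe literals
    together with the *instantiated* resolved literal on the edge."""
    return set(child_safe) | {apply(resolved_literal, sigma)}

child_safe = {("q", "a")}
parent_safe = parent_safe_literals(child_safe, ("p", "X", ("f", "Y")), {"X": "a", "Y": "b"})
```

Here the edge carries the resolved literal `p(X, f(Y))` and a substitution sending `X` to `a` and `Y` to `b`, so the literal added to the parent's safe literals is `p(a, f(b))` rather than the uninstantiated one.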
In the case of Example 4.2, computing safe literals as defined above would result in , where clearly the pivot in is not safe. A generalization of this requirement is formalized below.
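Let $\eta$ be a node with safe literals $\mathcal{S}(\eta)$ and parents $\eta_1$ and $\eta_2$, assuming without loss of generality, $\eta_1 \xrightarrow{\ell_1, \sigma_1} \eta$. The node $\eta$ is said to be pre-regularizable in the proof $\psi$ if $\ell_1\sigma_1$ matches a safe literal $\ell^* \in \mathcal{S}(\eta)$.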
This property states that a node is pre-regularizable if an instantiated resolved literal matches a safe literal. The notion of pre-regularizability can be thought of as a necessary condition for recycling the node.
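The pre-regularizability check amounts to testing whether some instantiated resolved literal matches some safe literal. A minimal Python sketch is given below, under the same assumed encoding as before (nested tuples for terms and literals, capitalized strings for variables); the function names are hypothetical, and only the matching test is shown, not the remainder of the algorithm.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def match(pattern, target, sigma=None):
    """Return sigma instantiating pattern to target, or None if none exists."""
    sigma = dict(sigma or {})
    if is_var(pattern):
        if pattern in sigma:
            return sigma if sigma[pattern] == target else None
        sigma[pattern] = target
        return sigma
    if isinstance(pattern, tuple) and isinstance(target, tuple):
        if pattern[0] != target[0] or len(pattern) != len(target):
            return None
        for p, t in zip(pattern[1:], target[1:]):
            sigma = match(p, t, sigma)
            if sigma is None:
                return None
        return sigma
    return sigma if pattern == target else None

def is_pre_regularizable(instantiated_resolved_literals, safe_literals):
    """True if some instantiated resolved literal matches some safe literal."""
    return any(match(lit, s) is not None
               for lit in instantiated_resolved_literals
               for s in safe_literals)

general = is_pre_regularizable([("p", "X")], {("p", "a")})
specific = is_pre_regularizable([("p", "a")], {("p", "X")})
```

The second call shows that unifiability alone does not suffice for this test: `p(a)` and `p(X)` are unifiable, but `p(a)` cannot be instantiated to `p(X)`, so it does not match.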
Pre-regularizability alone is not sufficient. Consider the proof in Figure 3. After collecting the safe literals, . ’s pivot matches the safe literal . Attempting to regularize would lead to the removal of , the replacement of by and the removal of (because does not contain the pivot required by ), with also being replaced by . Then resolution between and results in , which cannot be resolved with , as shown below.
: : : : : ??
’s literal , which would be resolved with ’s literal, was changed to due to the resolution between and .
Thus we additionally require that the following condition be satisfied.
Let be pre-regularizable, with safe literals and parents and , with clauses and respectively, assuming without loss of generality that such that matches a safe literal . The node is said to be strongly regularizable in if .
This condition ensures that the remainder of the proof does not expect a variable in to be unified to different values simultaneously. This property is not necessary in the propositional case, as the literals of the replacement node would not change lower in the proof.
The notion of strongly regularizable can be thought of as a sufficient condition.
Let be a proof with root clause and be a node in . Let and be the root of . If is strongly regularizable, then .
By definition of strong regularizability, is such that there is a node with clause and such that and matches a safe literal and .
Firstly, in , has been replaced by . Since , by definition of , every literal in either subsumes a single literal that occurs as a pivot on every path from to the root in (and hence on every new path from to the root in ) or subsumes literals ,…, in . In the former case, is resolved away in the construction of (by contracting the descendants of with the pivots in each path). In the latter case, the literal () in is a descendant of through a path and the substitution is the composition of all substitutions on this path. When is replaced by , two things may happen to . If the path does not go through , remains unchanged (i.e. unless the path ceases to exist in ). If the path goes through , the literal is changed to , where is such that .
Secondly, when is replaced by , the edge from ’s other parent to ceases to exist in . Consequently, any literal in that is a descendant of a literal in the clause of through a path via will not belong to .
Thirdly, a literal from that descends neither from nor from either remains unchanged in or, if the path to the node from which it descends ceases to exist in the construction of , does not belong to at all.
Therefore, by the three facts above, , and hence . ∎
As the name suggests, strong regularizability is stronger than necessary. In some cases, nodes may be regularizable even if they are not strongly regularizable. A weaker condition (conjectured to be sufficient) is presented below. This alternative relies on knowledge of how literals are changed after the deletion of a node in a proof (and it is inspired by the post-deletion unifiability condition described in [GFOLU]). However, since weak regularizability is more complicated to check, it is not as suitable for implementation as strong regularizability.
Let be a pre-regularizable node with parents and , assuming without loss of generality that such that is unifiable with some . For each safe literal , let be a node on the path from to the root of the proof such that is the pivot of . Let be the set of all resolved literals such that , , and , for some nodes and and unifier ; if no such node exists, define . The node is said to be weakly regularizable in if, for all , all elements in are unifiable, where is the literal in that used to be in (because of the removal of , may differ from ) and is the set of literals in that used to be the literals of in .
This condition requires the ability to determine the underlying (uninstantiated) literal for each safe literal of a weakly regularizable node . To achieve this, one could store safe literals as a pair , rather than as an instantiated literal , although this is not necessary for the previous conditions.
Note further that there is always at least one node as assumed in the definition for any safe literal which was not contained in the root clause of the proof: the node which resulted in being a safe literal for the path from to the root of the proof. Furthermore, it does not matter which node is used. To see this, consider some node with the same pivot . Consider arbitrary nodes and such that and where . Now consider arbitrary nodes and such that and where . Since the pivots for and are equal, we must have that and , and thus . This shows that it does not matter which we use; the instantiated resolved literals will always be equal implying that both of the resolved literals and will be contained in both and .
Informally, a node is weakly regularizable in a proof if it can be replaced by one of its parents , such that for each , can still be used as a pivot in order to complete the proof. Weakly regularizable nodes differ from strongly regularizable nodes in that they do not require the entire parent replacing the resolution to be simultaneously matched to a subset of ; instead, checking them requires knowledge of how literals will be instantiated after the removal of and from the proof.
This example illustrates a case where a node is weakly regularizable but not strongly regularizable. Table 1 shows the sets , and for the nodes in the proof below. Observe that is pre-regularizable, since is unifiable with . In fact, is the only pre-regularizable node in the proof, and thus the sets for all . In the proof below, note that is not strongly regularizable: there is no unifier such that . : : : : : : : :
We show that is weakly regularizable, and that can be removed. Recalling that is pre-regularizable, observe that is unifiable. Consider the following proof of : : : : : : : : : Now observe that for each we have the following, showing that is weakly regularizable:
: which is unifiable with
: which is (trivially) unifiable with
: which is unifiable with
: which is unifiable with
If a node with parents and is pre-regularizable and strongly regularizable in , then is also weakly regularizable in .
FirstOrderRecyclePivotsWithIntersection (FORPI) (cf. Algorithm 5) is a first-order generalization of the propositional RPI. FORPI traverses the proof in a bottom-up manner, storing for every node a set of safe literals. The set of safe literals for a node is computed from the set of safe literals of its children (cf. Algorithm 7), similarly to the propositional case, but additionally applying unifiers to the resolved literals (cf. Example 4.2). If one of the node’s resolved literals matches a literal in the set of safe literals, then it may be possible to regularize the node by replacing it by one of its parents.