Tableau methods (see for instance  or ) have always played a crucial role in the development of new techniques for automated theorem proving. They are easy to comprehend and implement, well-adapted to interactive theorem proving, and, therefore, normally form the basis of the first proof procedure for any newly defined logic . Nonetheless, they cannot compete with resolution-based calculi in terms of either efficiency or deductive power (i.e. proof length, see for instance ). This is partly due to the ability of resolution-based methods to generate lemmas and to simulate atomic cuts in a feasible way. (Recall that the cut rule expands a tableau by adding two branches labelled with a formula and with its negation, respectively; intuitively, the formula can be viewed as a lemma. A cut is atomic if this formula is atomic.) There have been attempts to integrate some restricted forms of cut into tableau methods, improving both efficiency and proof size (see for instance [10, 6]). But, for more general forms of cuts, it is difficult to decide whether an application of the cut rule is useful or not, thus the rule is not really applicable during proof search. Instead, cuts may be introduced after the proof is generated, to make it more compact by introducing lemmas and fusing recurring patterns [8, 9].
In this paper, rather than trying to integrate cuts into the tableau calculus, we devise a new tableau procedure in which a proof compression similar to the compressive power of a -cut is achieved by employing a shared representation of literals. Formal definitions will be given later, but we now provide a simple example to illustrate our ideas. Consider the schema of clause sets: . A closed tableau can be constructed by adding copies of the clauses and (for ) and unifying all variables and with . One gets a tableau of size . To make the proof more compact, we may merge the inferences applied for each , since each of these constants is handled in the same way. This can be done by first applying the cut rule on the formula . The branch corresponding to can be closed by using the first clause. In the branch corresponding to a constant is generated by skolemization and the branch can be closed by unifying and with . This yields a tableau of size . Since it is hard to guess in advance whether such an application of the cut rule will be useful or not, we investigate another solution allowing the same proof compression. We represent the disjunction by a single literal , together with a set of substitutions . Intuitively, this literal states that holds for some term , and the given set of substitutions specifies the possible values of . In the following, we call such variables abstraction variables. The clauses are kept as compact as possible by grouping all literals with the same heads and in some cases inferences may be performed uniformly regardless of the value of . In our example we get a tableau of size by unifying and with . This tableau may be viewed as a compact representation of an ordinary tableau, obtained by making copies of the tree, with .
If we find out that an inference is applicable only for some specific value(s) of (e.g., if one wants to close a branch by unifying with a clause ), then one may “separate” the literal by isolating some substitution (or sets of substitutions) before proceeding with the inference.
In this paper, we formalize these ideas into a tableau calculus called -tableau. Basic inference rules are devised to construct -tableaux and a strategy is provided to apply these rules efficiently, keeping the tableau as compact as possible. We prove that the procedure is sound and refutationally complete and that it may reduce the size of the proofs by an exponential factor. Our approach may be combined with all the usual refinements of the tableau procedure.
We briefly review the usual definitions (we refer to, e.g.,  for details). Terms, atoms and clauses are built as usual over a (finite) set of function symbols (including constants, i.e. nullary function symbols), an (infinite and countable) set of variables and a (finite) set of predicate symbols . The set of variables occurring in an expression (term, atom or clause) is denoted by . For readability, a term is sometimes written . Ordinary (clausal) tableaux are trees labelled by literals and built by applying Expansion and Closure rules: the Expansion rule expands a leaf with children labelled by literals , where a copy of occurs in the clause set at hand, and the Closure rule closes a branch by unifying the atoms of two complementary literals. A substitution is a function (with finite domain) mapping variables to terms. A substitution mapping to (for ) is written . The identity substitution (for ) is denoted by . The image of an expression by a substitution is defined inductively as usual and written .
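The inductive definition of the image of a term by a substitution can be sketched as follows. This is only an illustration under an assumed encoding that is not part of the paper: a variable is a Python string, a compound term is a tuple whose first component is the function symbol, and a constant is a 1-tuple. The names `apply` and `variables` are hypothetical.

```python
# Assumed encoding (not from the paper): a variable is a str, a compound term
# is a tuple (function_symbol, arg_1, ..., arg_n), a constant is e.g. ("a",).

def apply(term, sigma):
    """Return the image of `term` under the substitution `sigma` (a dict)."""
    if isinstance(term, str):              # a variable: replace it if bound
        return sigma.get(term, term)
    head, *args = term                     # a compound term: recurse on arguments
    return (head, *(apply(a, sigma) for a in args))


def variables(term):
    """Return the set of variables occurring in `term`."""
    if isinstance(term, str):
        return {term}
    return set().union(*(variables(a) for a in term[1:]))


# f(x, a) under the substitution {x -> g(y)} becomes f(g(y), a).
image = apply(("f", "x", ("a",)), {"x": ("g", "y")})
```

Substitutions are represented as dictionaries, so the identity substitution is simply the empty dictionary.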
3 A Shared Representation of Literals
We introduce the notion of an -literal, that is, a compact representation of a disjunction of ordinary literals with the same shape. The interest of this representation is that it will allow us to perform similar inferences in parallel on all these literals. We assume that is partitioned into two (infinite) sets and . The variables in are ordinary variables. They may be either universally quantified variables in clauses, or rigid variables in tableaux. The variables in are called abstraction variables. These are not variables in the standard sense, but can rather be seen as placeholders for a term that may take different values in different literals or branches. These variables will make it possible to share inferences applied on different literals. The set of ordinary variables (resp. abstraction variables) that occur in a term is denoted by (resp. ). A renaming is an injective substitution such that and .
Definition 1 (Syntax of -Clauses)
An -literal is either or a triple , where:
is either a predicate symbol or the negation of a predicate symbol ,
is an -tuple of terms, where is the arity of ,
and is a set of substitutions with the same domain (by convention is empty if ) and such that .
An -clause is a set of -literals, often written as a disjunction.
With a slight abuse of words, we will call the set in the above definition the domain of (denoted by ). The semantics of -clauses is defined by associating each -literal with an ordinary clause (or ):
Definition 2 (Semantics of -Clauses)
For every -literal , we denote by the formula defined as follows (with the convention that empty disjunctions are equivalent to ):
For every -clause , we denote by the clause . For every set of -clauses , we denote by the formula (in conjunctive normal form) .
We write iff (up to the usual properties of and : associativity, commutativity and idempotence).
Let be a unary predicate, be a binary predicate, be a constant, be unary functions, be an ordinary variable, and be abstraction variables. The triples and
are -literals, and
The common shape is shared between the two literals in the second clause.
Observe that if then , i.e. denotes an empty clause. Moreover, any ordinary literal may be encoded as an -literal where the set of substitutions is a singleton, e.g., . Also, an -literal is always equivalent to .
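The semantics of Definition 2 can be illustrated by a small sketch that expands a triple into the ordinary literals it stands for, one per substitution. The encoding is an assumption made for this illustration only: a triple is a Python tuple `(head, terms, thetas)`, variables are strings, compound terms are tuples, and `expand` is a hypothetical name.

```python
# Assumed encoding (not from the paper): a shared literal is a triple
# (head, terms, thetas), where `thetas` is a list of substitutions (dicts).

def apply(term, sigma):
    """Image of a term under a substitution; variables are str, terms are tuples."""
    if isinstance(term, str):
        return sigma.get(term, term)
    return (term[0], *(apply(a, sigma) for a in term[1:]))


def expand(head, terms, thetas):
    """Return the ordinary literals denoted by the triple, as a list read as a
    disjunction. An empty list stands for the empty clause."""
    instances = [(head, tuple(apply(t, th) for t in terms)) for th in thetas]
    # Disjunction is idempotent, so duplicates are dropped (order preserved).
    return list(dict.fromkeys(instances))


# p(alpha) with substitutions {alpha -> a} and {alpha -> b} denotes p(a) or p(b).
clause = expand("p", ("alpha",), [{"alpha": ("a",)}, {"alpha": ("b",)}])
```

With a singleton substitution set the expansion yields exactly one ordinary literal, which matches the encoding of ordinary literals mentioned above.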
The application of a substitution to an -literal is defined as follows:
where denotes the restriction of to the variables not occurring in .
Let and . Then:
Let and . Then .
Let be an -literal. If , then one of the following conditions holds:
The identity is the only substitution with empty domain, whence the result.
A given ordinary clause may be represented by many different -clauses, for instance may be represented as or , or even . In practice it is preferable to start with a representation in which useless literals are deleted and in which the remaining literals are grouped when possible. This motivates the following:
An -clause is in bundled normal form (short: ) if it satisfies the following conditions.
For every -literal , .
If then .
For all distinct literals , is distinct from .
An -clause set is in if all -clauses of are in .
be -literals. The -clause is not in while the -clauses and are in .
An -clause is well-formed if for all distinct literals and in , .
Consider the two -clauses and of Example 3. is well-formed, is not well-formed. By renaming, can be transformed into .
It is clear that every -clause can be transformed into an equivalent well-formed -clause by renaming. In the following, we shall implicitly assume that all the considered -clauses are well-formed.
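The renaming step can be sketched as follows, under the assumption (suggested by the later statement that distinct literals in a tableau have disjoint domains) that well-formedness requires the domains of distinct literals of a clause to be pairwise disjoint. The triple encoding `(head, terms, thetas)`, the helper names, and the naming scheme for fresh variables are hypothetical; in particular, the sketch assumes the generated names do not clash with existing ones.

```python
from itertools import combinations, count

# Assumed encoding (not from the paper): a shared literal is a triple
# (head, terms, thetas); all substitutions in `thetas` share one domain.

def domain(literal):
    """Domain of a shared literal: the common domain of its substitutions."""
    _, _, thetas = literal
    return set(thetas[0]) if thetas else set()


def is_well_formed(clause):
    """Assumed reading: the domains of distinct literals are pairwise disjoint."""
    return all(domain(l1).isdisjoint(domain(l2))
               for l1, l2 in combinations(clause, 2))


def rename(term, ren):
    """Apply a variable renaming to a term (variables are str, terms are tuples)."""
    if isinstance(term, str):
        return ren.get(term, term)
    return (term[0], *(rename(a, ren) for a in term[1:]))


def rename_apart(clause):
    """Rename the abstraction variables of each literal with fresh names."""
    fresh = count()
    result = []
    for literal in clause:
        head, terms, thetas = literal
        ren = {v: f"{v}_{next(fresh)}" for v in sorted(domain(literal))}
        new_terms = tuple(rename(t, ren) for t in terms)
        new_thetas = [{ren[v]: t for v, t in th.items()} for th in thetas]
        result.append((head, new_terms, new_thetas))
    return result
```

The images of the substitutions are left untouched, which is adequate only if they do not mention the renamed variables.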
Let be a formula in conjunctive normal form. Then there is an -clause set in such that .
Let be a clause of and let be the pairwise different predicate symbols or negated predicate symbols occurring in . For each symbol , we collect all term tuples such that (for ). It is clear that all the tuples for have the same length , where is the arity of the predicate symbol of . We define the -literal where is a tuple of fresh abstraction variables and is the set of substitutions . Then we can define the -clause . It is easy to check that and that is in . By applying this method to every clause of , we eventually get a set of -clauses in such that .
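The construction used in this proof can be sketched in a few lines: literals are grouped by their head, a tuple of fresh abstraction variables is created for each head, and each collected term tuple gives rise to one substitution. The encoding of ordinary literals as pairs `(head, args)`, with a leading `~` marking negation, and the variable names `alpha_i_j` are assumptions of this sketch.

```python
from collections import defaultdict

# Assumed encoding (not from the paper): an ordinary literal is a pair
# (head, args), where head is e.g. "p" or "~p" and args is a tuple of terms.

def bundle(clause):
    """Group the literals of an ordinary clause by head and return a list of
    triples (head, abstraction_variables, substitutions)."""
    groups = defaultdict(list)
    for head, args in clause:
        if args not in groups[head]:       # ignore repeated literals
            groups[head].append(args)
    bundled = []
    for i, (head, tuples) in enumerate(groups.items()):
        arity = len(tuples[0])
        # Hypothetical naming scheme for the fresh abstraction variables.
        alphas = tuple(f"alpha_{i}_{j}" for j in range(arity))
        thetas = [dict(zip(alphas, args)) for args in tuples]
        bundled.append((head, alphas, thetas))
    return bundled


# p(a) or p(b) or not q(x) is grouped into two shared literals.
result = bundle([("p", (("a",),)), ("p", (("b",),)), ("~q", ("x",))])
```

Each head occurs in exactly one resulting triple, so literals with the same head are always grouped.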
Consider the clause of Example 3. It can be written in as , where denotes the following set of substitutions:
4 A Tableaux Calculus for -Clauses
In this section, we devise a tableaux calculus for refuting sets of -clauses. This calculus is defined by a set of inference rules, that, given an existing tableau , allow one to:
Expand a branch with new children, by introducing a new copy of an -clause of the set at hand.
Instantiate some of the (rigid) variables occurring in the tableau.
Separate shared literals inside an -clause, so that different inferences can be applied on each of the corresponding branches. The rule can be applied on nodes that are not leaves.
The first two steps (Expansion and Instantiation) are standard, but the third (Separation) is original.
Definition 5 (Pre-Tableau)
A pre-tableau is a tree where vertices are labelled by -literals or by . We call the direct successors of a node its children. The root is the (unique) node that is not a child of any node in and a leaf is a node with no child. A path is a sequence of nodes such that is a child of for . Furthermore, we call the initial node of and the last node of . A branch is a path such that the initial node is the root and the last node is a leaf. With a slight abuse of words we say that a branch contains an -literal if it contains a node labelled by .
The descendants of a node are inductively defined as and the descendants of the children of . The subtree of root in is the subtree consisting of all the descendants of , as they appear in .
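The tree notions of Definition 5 translate directly into a small data structure. The following sketch uses a hypothetical `Node` class whose labels are arbitrary Python values; it enumerates the branches (paths from the root to a leaf) and the descendants of a node, where a node counts among its own descendants, as in the definition above.

```python
class Node:
    """A tree vertex with a label and an ordered list of children."""

    def __init__(self, label, children=None):
        self.label = label
        self.children = list(children or [])


def branches(node, prefix=()):
    """Yield every path from `node` down to a leaf, as a tuple of nodes."""
    path = prefix + (node,)
    if not node.children:                  # a leaf ends the branch
        yield path
    else:
        for child in node.children:
            yield from branches(child, path)


def descendants(node):
    """Return the node itself followed by the descendants of its children."""
    result = [node]
    for child in node.children:
        result.extend(descendants(child))
    return result


# A root with two children; the first child has one child of its own.
c = Node("c")
a = Node("a", [c])
b = Node("b")
root = Node("root", [a, b])
```

Calling `branches(root)` on this tree yields the two branches root, a, c and root, b.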
If is a non-leaf node with exactly children labelled by -literals respectively, then the formula associated with is defined as: .
We say that an -literal (resp. a node labelled by ) introduces an abstraction variable if .
Let be a pre-tableau and be a substitution. Then denotes the result of applying to all -literals labelling the nodes of .
Definition 7 (Tableau for a Set of -Clauses)
An -tableau for a set of -clauses is a pre-tableau built inductively by applying the rules Expansion, Instantiation and Separation to an initial tableau containing only one node, labelled by (also called the initial -literal).
In the following, the word “tableau” always refers to an -tableau, unless specified otherwise (we use the expression “ordinary tableau” for standard ones).
The rules are defined as follows (in each case, denotes a previously constructed tableau for a set of -clauses ).
Let be a leaf of , and be an element of not containing . Let be a copy of where all variables that occur also in are renamed such that share no variable with (note that both ordinary and abstraction variables are renamed). The pre-tableau constructed by adding a new child labelled by to for each is a tableau for .
Let be a term and such that for all nodes , if is labelled by an -literal containing and introduces an abstraction variable , then is a proper descendant of . Then is a tableau for .
Observe that if contains no abstraction variables then the condition always holds, since no node satisfying the above property exists. In practice, the Instantiation rule should of course not be applied with arbitrary variable and term. Unification will be used instead to find the most general instantiations closing a branch. A formal definition will be given later (see Definition 11).
be two tableaux for some set of -clauses . The pre-tableau is not a tableau, because is substituted by a term containing an abstraction variable , and occurs above the literal introducing . On the other hand, is a tableau.
The rule is illustrated in Figure 3. Let be a non-leaf node of . Let be a child of , labelled by . Let be an instance of , with and . Let be the set of substitutions such that there exists a substitution with and every variable in is an abstraction variable not occurring in , and let . Assume that . We define the new literal . The Separation rule is defined as follows:
We apply the substitution to . (Due to the above conditions, the variables in only occur in the subtree of root , hence only affects this subtree.)
We replace the label of by .
We add a new child to the node , labelled by a literal .
Observe that if then hence the third step may be omitted, since the added branch is unsatisfiable anyway. The rule does not apply if is empty.
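The simplest use of the Separation rule isolates a single substitution from the set attached to a literal. The sketch below covers only this special case and only the effect on the labels: the separated literal becomes an instance carrying the identity substitution alone, and the remaining substitutions, if any, are kept in a literal for a new sibling node. It is not an implementation of the general rule; the triple encoding `(head, terms, thetas)` and the name `isolate` are assumptions, and the propagation of the substitution to the subtree is left out.

```python
# Assumed encoding (not from the paper): a shared literal is a triple
# (head, terms, thetas); variables are str, compound terms are tuples.

def apply(term, sigma):
    """Image of a term under a substitution given as a dict."""
    if isinstance(term, str):
        return sigma.get(term, term)
    return (term[0], *(apply(a, sigma) for a in term[1:]))


def isolate(literal, theta):
    """Split `literal` into the instance selected by `theta` and the rest.

    Returns (isolated, remaining); `remaining` is None when no substitution
    is left, in which case no new sibling node is needed."""
    head, terms, thetas = literal
    if theta not in thetas:
        raise ValueError("theta must be one of the literal's substitutions")
    # The isolated literal only carries the identity substitution.
    isolated = (head, tuple(apply(t, theta) for t in terms), [{}])
    rest = [th for th in thetas if th != theta]
    remaining = (head, terms, rest) if rest else None
    return isolated, remaining


lit = ("p", ("alpha",), [{"alpha": ("a",)}, {"alpha": ("b",)}])
isolated, remaining = isolate(lit, {"alpha": ("a",)})
```

Applying `isolate` repeatedly until `remaining` is `None` turns a shared literal into ordinary ones, one per substitution.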
be -literals and be an -clause set in . Applying three times the Expansion rule, we can derive the tableau
Figure (tableau tree, node labels omitted): [ [ [ [ ] ] [ ] ] ]
Now, we apply two times the Separation rule. First we choose of Figure 3 to be and to be , where the substitution is (hence we get ). Afterwards, we choose analogously (which is the result of the first application) and , with the substitution . Both times, the tuple of the Separation rule is (with in both cases). This leads to the tableau:
Figure (tableau tree, node labels omitted): [ [ [ [ ] ] [ ] ] ]
Afterwards, we again apply the Separation rule, to modify the node labelled with where and . We abbreviate by .
Figure (tableau tree, node labels omitted): [ [ [ [ ] ] [ ] [ ] ] ]
After some further applications of the Separation and Expansion rules, we are able to construct the following tableau by applying the Instantiation rule with the substitutions .
Figure (tableau tree, node labels omitted): [ [ [ [ ] ] [ [ ] ] [ ] ] ]
Let be a pre-tableau or a tableau. A branch of is closed if it contains or two nodes labelled by literals such that , , for some predicate symbol . The (pre)-tableau is closed iff all branches of are closed.
The final tableau of Example 7 contains three branches, i.e.
All of them are closed and so the tableau is closed. Observe that the inferences closing the branches corresponding to the literals and in are shared in the constructed tableau (both branches are closed by introducing suitable instances of ), whereas the literal is handled separately (by instantiating by and using yet another instance of ).
Let be a tableau. If and are distinct nodes in , labelled by the -literals and respectively, then and have disjoint domains.
This is immediate since the -clauses are well-formed and since variables are renamed before all applications of the Expansion rule, so that the considered -clause shares no variable with the tableau. Afterwards, none of the construction rules can affect the domain of the substitutions. For the Separation rule, the condition on in the definition of the rule guarantees that the new -literal introduces variables that are distinct from those in . The detailed proof is by an easy induction on tableaux.
Let be a tableau for a set of -clauses . For every non-leaf node in , the formula associated with (as defined in Definition 5) is an instance of a formula , where is a renaming of an -clause in .
It suffices to show that all the construction rules preserve the desired property.
Expansion. The property immediately holds for the nodes on which the rule is applied, by definition of the rule. The other nodes are not affected.
Instantiation. By definition, the formula associated with a node in the final tableau is an instance of the formula associated with in the initial one. Thus the property holds.
Separation. The nodes occurring outside of the subtree of root are not affected. By definition, the formulas associated with the descendants of in the new tableau are instances of formulas associated with nodes of the initial tableau. Thus it only remains to consider the node . The formula associated with in the final tableau is obtained from that of the initial one by removing the formula corresponding to an -literal and replacing it by . Since , we have (since by definition of the Separation rule). Thus and the proof is completed.
Let be a tableau for . Let be a variable introduced in a node and assume that occurs in an -literal labelling a node . Then is a descendant of .
By an easy induction on tableaux. The property is preserved by any application of the Expansion rule because the variables in the considered -clause are renamed by fresh variables; it is preserved by the Instantiation rule due to the conditions associated with the rule; and the Separation rule only instantiates abstraction variables.
Let be a closed tableau for and let be the tableau after applying once the Separation rule to a node of with a child labelled with . Then there is at most one branch in that is not closed. Moreover, this branch necessarily contains the node labelled by (see Figure 3). In particular, if is empty then is and is closed.
Let be a branch in and let be the substitution that we apply to in the Separation rule (see Figure 3). Assume that does not contain . Then there is a branch in where for every literal (with ), either and , where replaced during the application of the Separation rule, or . Since is closed, is closed, i.e. there are two -literals such that and . If , then (by Proposition 2) , thus and . Consequently is closed. Otherwise, one of the literals or is , say , and . Then, by definition of the Separation rule, we have . By Proposition 2, , thus , hence is closed.
Theorem 5.1 (Soundness)
If a set of -clauses admits a closed tableau then is unsatisfiable.
We prove soundness by transforming into a closed tableau that contains only -literals where the substitution set is . Due to Proposition 3, the resulting tableau then corresponds to an ordinary tableau, i.e. a tableau constructed only by the Expansion rule and the Instantiation rule. The -literals could be replaced by usual literals. The soundness of ordinary tableau then implies the statement.
We transform the tableau by an iterative procedure. We always take the topmost node labelled with an -literal where (if there is more than one topmost literal we can arbitrarily choose one). Then we consider a substitution and we apply the Separation rule with the tuple . We have and if , then , hence there is no substitution with , such that only contains fresh variables. Consequently, the rule splits into a singleton and . The literal gets replaced by and we add the node labelled with (note that cannot be the root of the tableau, because the root is always labelled by ). By Lemma 2, there is at most one non-closed branch, i.e. the branch ending with the node is the only open branch. We consider a copy of the subtree of root in , renaming all variables introduced in by fresh variables. We replace the root node of by and replace the subtree of root in the tableau by . It is easy to check that the obtained tableau is a closed tableau for . Furthermore, the length of the branches does not increase, the number of non-empty substitutions occurring in the -literals does not increase, and it decreases strictly in and . This implies that the multiset of multisets of natural numbers, where is a branch in is strictly decreasing according to the multiset extension of the usual ordering. Since this ordering is well-founded, the process eventually terminates, and after a finite number of applications of this procedure we get a tableau only containing nodes labelled with -literals whose substitution set is equal to . This finishes the proof.
As a by-product of the proof, we get that the size of the minimal ordinary tableau for a clause set is bounded exponentially by the size of any closed tableau for . Indeed, we constructed an ordinary tableau from an -tableau in which every branch in is replaced by (at most) branches, where is the maximal number of substitutions in . In Section 7, we shall prove that this bound is precise, i.e. that our tableau calculus allows exponential reduction of proof size w.r.t. ordinary (cut-free) tableaux.
Proving completeness of -tableau is actually a trivial task, since one could always apply the Separation rule in a systematic way on all -literals to transform them into ordinary literals (as is done in the proof of Theorem 5.1), and then get the desired result by completeness of ordinary tableaux. However, this strategy would not be of practical use. Instead, we shall devise a strategy that keeps the -tableau as compact as possible and at the same time allows one to “simulate” any application of the ordinary expansion rules. In this strategy, the Separation rule is applied on demand, i.e. only when it is necessary to close a branch. No hypothesis is assumed on the application of the ordinary expansion rules, therefore the proposed strategy is “orthogonal” to the usual refinements of ordinary tableaux, for instance connection tableaux or hyper-tableaux . (Connection tableaux can be seen as ordinary tableaux in which any application of the Expansion rule must be followed by the closure of a branch, using one of the newly added literals and the previous literal in the branch. Hyper-tableaux may be viewed in our framework as ordinary tableaux in which the Expansion rule must be followed by the closure of all the newly added branches containing negative literals.) Thus our approach can be combined with any refutationally complete tableau procedure .
The main idea behind simulating a strategy is to perform the same steps as in an ordinary tableau, while keeping -clauses as compressed as possible. If the ordinary tableau procedure expands the tableau by a clause, we expand the tableau with the corresponding -clause, and if a branch is closed in the ordinary tableau, then the corresponding branch is closed in the -tableau. This last step is not trivial: given two ordinary literals and , the ordinary tableau might compute the most general unifier (mgu) of and . But in the presented formalism, the two literals might not appear as such, i.e. there are no literals and . In general, there are only -literals and such that and , where and denote the compositions of the substitutions occurring in the -literals in the considered branch. Note also that, although and are unifiable, the Instantiation rule cannot always be applied to unify them and close the branch. Indeed, the domain of the mgu may contain abstraction variables, whereas the Instantiation rule only handles universal variables. For showing completeness, it would suffice to apply the Separation rule on each ancestor of and involved in the definition of or , to create a branch where the literals and appear explicitly. However, we would thereby lose much of the benefit of the formalism. Instead, we shall introduce a strategy that uses the Separation rule only if this is necessary for making the unification of and feasible (by means of the Instantiation rule). Such applications of the Separation rule may be seen as preliminary steps for the Instantiation rule. This follows the maxim of staying as general as possible, because a more general proof might be more compact.
In the formalisation of the Instantiation rule we ensured soundness by allowing abstraction variables only to occur in descendants of the literal that introduced the variable. This has a drawback for our strategy: the unification process that we try to simulate can ask for an application of the Instantiation rule which would violate this condition for abstraction variables if we followed the procedure of the previous paragraph. We thus have to add further applications of the Separation rule to ensure that this condition is fulfilled.
Let be a path in a tableau where is the initial node of and each node (with ) is labelled by .
An abstraction substitution for is a substitution with , for .
A conflict in a branch is a pair with , and are dual and and are unifiable. A conflict is -realizable if is an abstraction substitution for such that and are unifiable.
In practice, we do not have to check that a conflict is realizable (this would be costly since we have to consider exponentially many substitutions).
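Detecting a conflict only requires checking that two heads are dual and that the two term tuples are unifiable; realizability is not tested. A minimal sketch follows, using a textbook unification procedure with occurs check. The encoding (variables as strings, compound terms as tuples, a leading `~` for negated heads) and all function names are assumptions of this illustration, and the returned unifier is in triangular form, meaning that bindings may mention other bound variables.

```python
# Assumed encoding (not from the paper): variables are str, compound terms are
# tuples (symbol, args...), heads are strings with "~" marking negation.

def walk(term, subst):
    """Follow variable bindings until an unbound variable or a compound term."""
    while isinstance(term, str) and term in subst:
        term = subst[term]
    return term


def occurs(var, term, subst):
    """Occurs check: does `var` occur in `term` modulo `subst`?"""
    term = walk(term, subst)
    if isinstance(term, str):
        return var == term
    return any(occurs(var, a, subst) for a in term[1:])


def unify(s, t, subst=None):
    """Return a (triangular) unifier of s and t extending subst, or None."""
    subst = {} if subst is None else subst
    a, b = walk(s, subst), walk(t, subst)
    if a == b:
        return subst
    if isinstance(a, str):
        return None if occurs(a, b, subst) else {**subst, a: b}
    if isinstance(b, str):
        return None if occurs(b, a, subst) else {**subst, b: a}
    if a[0] != b[0] or len(a) != len(b):
        return None
    for x, y in zip(a[1:], b[1:]):
        subst = unify(x, y, subst)
        if subst is None:
            return None
    return subst


def conflict(lit1, lit2):
    """Return a unifier of the term tuples if the heads are dual, else None."""
    (h1, t1, _), (h2, t2, _) = lit1, lit2
    if not (h1 == "~" + h2 or h2 == "~" + h1):
        return None
    return unify(("tuple", *t1), ("tuple", *t2))
```

The substitution sets of the two literals are ignored here, in line with the remark that realizability need not be checked in practice.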
If is a conflict then and are necessarily unifiable, with some mgu . As mentioned before, this does not mean that a branch with conflict can be closed. Moreover, according to the restriction on the Instantiation rule, a variable cannot be instantiated by a term containing an abstraction variable , if occurs in some ancestor of the literal introducing in the tableau. This motivates the following:
A variable is blocking for a conflict , where if or occurs in a term , where and occurs in a literal labelling an ancestor of the node introducing .
Finally, we introduce a specific application of the Separation rule which allows one either to “isolate” some literals in order to ensure that they have a specific “shape” (as specified by a substitution), or to eliminate abstraction variables completely if needed.
If is a substitution, we denote by