1 Introduction
For the efficiency of organizing proof search during saturation-based first-order theorem proving, simplification rules are of critical importance. Simplification rules are inference rules that do not add new formulas to the search space, but simplify formulas by deleting (redundant) clauses from the search space. As such, simplification rules reduce the size of the search space and are crucial in making automated reasoning efficient.
When reasoning about properties of first-order logic with equality, one of the most common simplification rules is demodulation [10] for rewriting (and hence simplifying) formulas using unit equalities $l \simeq r$, where $l, r$ are terms and $\simeq$ denotes equality. As a special case of superposition, demodulation is implemented in first-order provers such as E [13], Spass [20] and Vampire [10]. Recent applications of superposition-based reasoning, for example to program analysis and verification [5], however demand new and efficient extensions of demodulation to reason about and simplify upon conditional equalities $C \rightarrow l \simeq r$, where $C$ is a first-order formula. Such conditional equalities may, for example, encode software properties expressed in a guarded command language, with $C$ denoting a guard (such as a loop condition) and $l \simeq r$ encoding equational properties over program variables. We illustrate the need for generalized versions of demodulation in the following example.
Example 1
Consider the following formulas expressed in the first-order theory of integer linear arithmetic:
(1) 
Here, is an implicitly universally quantified logical variable of integer sort, and is an integer-valued constant. First-order reasoners will first clausify formulas (1), deriving:
(2) 
By applying demodulation over (2), the formula is rewritten (assuming that is simpler/smaller than ) using the unit equality , yielding the clause . That is, is derived from (1) by one application of demodulation.
Let us now consider a slightly modified version of (1), as below:
(3) 
whose clausal representation is given by:
(4) 
In this paper we propose a generalized version of demodulation, called subsumption demodulation, which allows rewriting terms and simplifying formulas using conditional equalities, such as in (3). To do so, we extend demodulation with subsumption, that is, with deciding whether (an instance of a) clause is a submultiset of a clause . This way, subsumption demodulation can be applied to non-unit clauses and is not restricted to have at least one premise clause that is a unit equality. We show that subsumption demodulation is a simplification rule of the superposition framework (Section 4), allowing for example to derive the clause (5) from (3) in one inference step. By properly adjusting clause indexing and multi-literal matching in first-order theorem provers, we provide an efficient implementation of subsumption demodulation in Vampire (Section 5) and evaluate our work against state-of-the-art reasoners, including E [13], Spass [20], CVC4 [3] and Z3 [7] (Section 6).
Related work.
While several approaches generalize demodulation in superposition-based theorem proving, we argue that subsumption demodulation improves on existing methods in terms of applicability and/or efficiency. The AVATAR architecture of first-order provers [18] splits general clauses into components with disjoint sets of variables, potentially enabling demodulation inferences whenever some of these components become unit equalities. Example 1 demonstrates that subsumption demodulation applies in situations where AVATAR does not: in each clause of (4), all literals share the variable and hence none of the clauses from (4) can be split using AVATAR. That is, AVATAR would not generate unit equalities from (4), and therefore cannot apply demodulation over (4) to derive (5).
The local rewriting approach of [19] requires rewriting equality literals to be maximal (w.r.t. the clause ordering) in clauses. However, following [10], for efficiency reasons we consider equality literals to be “smaller” than non-equality literals. In particular, the equality literals of clauses (4) are “smaller” than the non-equality literals, thus preventing the application of local rewriting in Example 1.
We further note that the contextual rewriting rule of [1] is more general than our rule of subsumption demodulation. Yet, efficiently automating contextual rewriting is extremely challenging, while subsumption demodulation requires no radical changes in the existing machinery of superposition provers (see Section 5).
To the best of our knowledge, except Spass [20], no other state-of-the-art superposition prover implements variants of conditional rewriting. Subterm contextual rewriting [21] is a refined notion of contextual rewriting and is implemented in Spass. A major difference of subterm contextual rewriting when compared to subsumption demodulation is that in subsumption demodulation the discovery of the substitution is driven by the side conditions, whereas in subterm contextual rewriting the side conditions are evaluated by checking the validity of certain implications by means of a reduction calculus. This reduction calculus recursively applies another restriction of contextual rewriting called recursive contextual ground rewriting, among other standard reduction rules. While subterm contextual rewriting is more general, we believe that the benefit of subsumption demodulation comes with its relatively easy and efficient integration within existing superposition reasoners, as evidenced also in Section 6.
Local contextual rewriting [9] is another refinement of contextual rewriting implemented in Spass. In our experiments it performed similarly to subterm contextual rewriting.
Contributions.
Summarizing, this paper brings the following contributions.

To improve reasoning in the presence of conditional equalities, we introduce the new inference rule subsumption demodulation, which generalizes demodulation to non-unit equalities by combining demodulation and subsumption (Section 4).

Subsumption demodulation does not require radical changes to the underlying superposition calculus. We implemented subsumption demodulation in the first-order theorem prover Vampire, by extending Vampire with a new clause index and adapting its multi-literal matching component (Section 5).

We compared our work against state-of-the-art reasoners, using the TPTP and SMT-LIB benchmark repositories. Our experiments show that subsumption demodulation in Vampire can solve 11 first-order problems that could so far not be solved by any other state-of-the-art provers, including Vampire, E, Spass, CVC4 and Z3 (Section 6).
2 Preliminaries
For simplicity, in what follows we consider standard first-order logic with equality, where equality is denoted by . We support all standard boolean connectives and quantifiers in the language. Throughout the paper, we denote terms by , variables by , constants by , function symbols by and predicate symbols by , all possibly with indices. Further, we denote literals by and clauses by , again possibly with indices. We write to denote the formula . A literal is called an equality literal. We consider clauses as multisets of literals and denote by the subset relation among multisets. A clause that consists of exactly one equality literal is called a unit equality.
An expression is a term, literal, or clause. We write to mean an expression with a particular occurrence of a term . A substitution, denoted by , is any finite mapping of the form , where . Applying a substitution to an expression yields another expression, denoted by , by simultaneously replacing each by in . We say that is an instance of . A unifier of two expressions and is a substitution such that . If two expressions have a unifier, they also have a most general unifier (mgu). A match of expression to expression is a substitution such that . Note that any match is a unifier (assuming the sets of variables in and are disjoint), but not vice versa, as illustrated below.
Example 2
Let and be the clauses and , respectively. The only possible match of to is . On the other hand, the only possible match of to is . As and are not the same, there is no match of to . Note however that and can be unified; for example, using .
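The difference between matching and unification can be made concrete with a small sketch. The Python code below is illustrative only: the term encoding (variables as strings, all other terms as tuples headed by their symbol) and the function name `match` are assumptions of this sketch, and the sample terms are hypothetical rather than the clauses of Example 2.

```python
def match(pattern, target, subst=None):
    """Return a substitution s with pattern*s == target, or None if none exists.

    Encoding (an assumption of this sketch): a variable is a str, any other
    term is a tuple (symbol, arg1, ..., argn); a constant is a 1-tuple.
    Variables of `target` are never instantiated: matching is one-sided.
    """
    subst = dict(subst or {})          # copy, so callers can backtrack safely
    if isinstance(pattern, str):       # pattern is a variable
        if pattern in subst:
            return subst if subst[pattern] == target else None
        subst[pattern] = target
        return subst
    if isinstance(target, str):        # a non-variable never matches a variable
        return None
    if pattern[0] != target[0] or len(pattern) != len(target):
        return None
    for p_arg, t_arg in zip(pattern[1:], target[1:]):
        subst = match(p_arg, t_arg, subst)
        if subst is None:
            return None
    return subst


c, d = ("c",), ("d",)
print(match(("p", "x", c), ("p", d, c)))    # x is bound to d
print(match(("p", "x", c), ("p", d, "y")))  # None: c cannot be matched to variable y
```

The second pair of terms has no match in this direction, although it is unifiable (bind x to d and y to c), which is the asymmetry between the two notions.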
Superposition inference system.
We assume basic knowledge in first-order theorem proving and superposition reasoning [2, 11]. We adopt the notations and the inference system of superposition from [10]. We recall that first-order provers perform inferences on clauses using inference rules, where an inference is usually written as: with . The clauses are called the premises and is the conclusion of the inference above. An inference is sound if its conclusion is a logical consequence of its premises. An inference rule is a set of inferences and an inference system is a set of inference rules. An inference system is sound if all its inference rules are sound.
Modern first-order theorem provers implement the superposition inference system for first-order logic with equality. This inference system is parametrized by a simplification ordering over terms and a literal selection function over clauses. In what follows, we denote by a simplification ordering over terms, that is, is a well-founded partial ordering satisfying the following three conditions:

stability under substitutions: if , then ;

monotonicity: if , then ;

subterm property: whenever is a proper subterm of .
The simplification ordering on terms can be extended to a simplification ordering on literals and clauses, using a multiset extension of orderings. For simplicity, the extension of to literals and clauses will also be denoted by . Whenever , we say that is bigger than and is smaller than w.r.t. . We say that an equality literal is oriented, if or . The literal extension of asserts that negative literals are always bigger than their positive counterparts. Moreover, if , where and are positive, then . Finally, equality literals are set to be smaller than any literal using a predicate different than .
A selection function selects at least one literal in every non-empty clause. In what follows, selected literals in clauses will be underlined: when writing , we mean that (at least) is selected in . In what follows, we assume that selection functions are well-behaved w.r.t. : either a negative literal is selected or all maximal literals w.r.t. are selected.
In the sequel, we fix a simplification ordering and a well-behaved selection function and consider the superposition inference system, denoted by Sup, parametrized by these two ingredients. The inference system Sup for first-order logic with equality consists of the inference rules of Figure 1, and it is both sound and refutationally complete. That is, if a set of clauses is unsatisfiable, then the empty clause (that is, the always false formula) is derivable from in Sup.

Figure 1. The superposition calculus Sup: Resolution and Factoring; Superposition; Equality Resolution and Equality Factoring.
3 Superposition-based Proof Search
We now overview the main ingredients in organizing proof search within first-order provers, using the superposition calculus. For details, we refer to [2, 11, 10].
Superposition-based provers use saturation algorithms: applying all possible inferences of Sup in a certain order to the clauses in the search space until (i) no more inferences can be applied or (ii) the empty clause has been derived. A simple implementation of a saturation algorithm would however be very inefficient, as applications of all possible inferences will quickly blow up the search space.
Saturation algorithms can however be made efficient by exploiting a powerful concept of redundancy: deleting so-called redundant clauses from the search space while preserving completeness of Sup. A clause in a set of clauses (i.e. in the search space) is redundant in , if there exist clauses in , such that and . That is, a clause is redundant in if it is a logical consequence of clauses that are smaller than w.r.t. . It is known that redundant clauses can be removed from the search space without affecting completeness of superposition-based proof search. For this reason, saturation-based theorem provers, such as E, Spass and Vampire, not only generate new clauses but also delete redundant clauses during proof search by using both generating and simplifying inferences.
Simplification rules. A simplifying inference is an inference in which one premise becomes redundant after the addition of the conclusion to the search space, and hence can be deleted. In what follows, we will denote deleted clauses by drawing a line through them and refer to simplifying inferences as simplification rules. The premise that becomes redundant is called the main premise, whereas other premises are called side premises of the simplification rule. Intuitively, a simplification rule simplifies its main premise to its conclusion by using additional knowledge from its side premises. Inferences that are not simplifying are called generating, as they generate and add a new clause to the search space.
In saturation-based proof search, we distinguish between forward and backward simplifications. During forward simplification, a newly derived clause is simplified using previously derived clauses as side clauses. Conversely, during backward simplification a newly derived clause is used as side clause to simplify previously derived clauses.
Demodulation. One example of a simplification rule is demodulation, also called rewriting by unit equalities. Demodulation is the following inference rule:
where , and , for some substitution .
It is easy to see that demodulation is a simplification rule. Moreover, demodulation is a special case of a superposition inference where one premise of the inference is deleted. However, unlike a superposition inference, demodulation is not restricted to selected literals.
Example 3
Consider the clauses and . Let be the substitution . By the subterm property of , we have . Further, as equality literals are smaller than nonequality literals, we have . We thus apply demodulation and is simplified into the clause :
Deletion rules. Even when simplification rules are in use, deleting more/other redundant clauses is still useful to keep the search space small. For this reason, in addition to simplifying and generating rules, theorem provers also use deletion rules: a deletion rule checks whether clauses in the search space are redundant due to the presence of other clauses in the search space, and removes redundant clauses from the search space.
Given clauses and , we say subsumes if there is some substitution such that is a submultiset of , that is . Subsumption is the deletion rule that removes subsumed clauses from the search space.
Example 4
Let and be clauses in the search space. Using , it is easy to see that subsumes , and hence is deleted from the search space. ∎
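Subsumption amounts to multi-literal matching: every literal of the subsuming clause must be mapped, under one common substitution, to a distinct literal of the subsumed clause. The following Python sketch illustrates this with a plain backtracking search. It is not how Vampire implements subsumption; the literal encoding (a pair of a polarity flag and an atom), the function names, and the sample clauses are assumptions of this sketch, and symmetry of equality literals is ignored for brevity.

```python
def match(pattern, target, subst):
    """One-sided matching; variables are str, other terms are tuples (symbol, args...)."""
    subst = dict(subst)                       # copy so that backtracking is safe
    if isinstance(pattern, str):
        if subst.setdefault(pattern, target) != target:
            return None
        return subst
    if isinstance(target, str) or pattern[0] != target[0] or len(pattern) != len(target):
        return None
    for p_arg, t_arg in zip(pattern[1:], target[1:]):
        subst = match(p_arg, t_arg, subst)
        if subst is None:
            return None
    return subst


def subsumes(c, d, subst=None, used=frozenset()):
    """Return a substitution s such that c*s is a submultiset of d, or None.

    Clauses are lists of literals (positive, atom). `used` records which
    literals of d are already taken, which enforces the multiset condition.
    """
    subst = {} if subst is None else subst
    if not c:
        return subst
    (positive, atom), rest = c[0], c[1:]
    for i, (d_positive, d_atom) in enumerate(d):
        if i in used or positive != d_positive:
            continue
        extended = match(atom, d_atom, subst)
        if extended is not None:
            result = subsumes(rest, d, extended, used | {i})
            if result is not None:
                return result
    return None


a, b = ("a",), ("b",)
c_clause = [(True, ("p", "x")), (True, ("q", "x", "y"))]
d_clause = [(True, ("q", a, b)), (True, ("p", a)), (False, ("r", b))]
print(subsumes(c_clause, d_clause))   # binds x to a and y to b
```

Tracking the used literal positions is what distinguishes the multiset view from a set view: the hypothetical clause p(x) ∨ p(y) does not subsume the unit clause p(a) in this sketch.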
4 Subsumption Demodulation
In this section we introduce a new simplification rule, called subsumption demodulation, by extending demodulation to a simplification rule over conditional equalities. We do so by combining demodulation with subsumption checks to find simplifying applications of rewriting by non-unit (and hence conditional) equalities.
4.1 Subsumption Demodulation for Conditional Rewriting
Our rule of subsumption demodulation is defined below.
Definition 1 (Subsumption Demodulation)
Subsumption demodulation is the inference rule:
(5) 
where:

,

,

, and

.
We call the equality in the left premise of (5) the rewriting equality of subsumption demodulation.
It is easy to see that if and are valid, then also holds. We thus conclude:
Theorem 4.1 (Soundness)
Subsumption demodulation is sound.
Detecting possible applications of subsumption demodulation involves (i) selecting one equality of the side clause as rewriting equality and (ii) matching each of the remaining literals, denoted in (5), to some literal in the main clause. Step (i) is similar to finding unit equalities in demodulation, whereas step (ii) reduces to showing that subsumes parts of the main premise. Informally speaking, subsumption demodulation combines demodulation and subsumption, as discussed in Section 5. Note that in step (ii), matching allows any instantiation of to via substitution ; yet, we do not unify the side and main premises of subsumption demodulation, as illustrated later in Example 7. Furthermore, we need to find a term in the unmatched part of the main premise, such that can be rewritten according to the rewriting equality into .
As the ordering is partial, the conditions of Definition 1 must in general be checked a posteriori, that is, after a candidate substitution has been fixed; if a check fails, the substitution is revised. Note however that if in the rewriting equality, then for any substitution, so checking the ordering a priori helps, as illustrated in the following example.
Example 5
Let us consider the following two clauses:
By the subterm property of , we conclude that . Hence, the rewriting equality, as well as any instance of it, is oriented.
Let be the substitution . Due to the previous paragraph, we know . As equality literals are smaller than non-equality ones, we also conclude . Thus, we have and we can apply subsumption demodulation to and , deriving clause .
We note that demodulation cannot derive from and , as there is no unit equality. ∎
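The interplay of the steps described above (selecting a rewriting equality, matching the remaining side literals, rewriting in the unmatched part, and checking the ordering) can be sketched in Python as follows. This is a deliberately simplified illustration and not the Vampire implementation. Its assumptions: the term and literal encoding; the stand-in ordering `greater`, which is only the weight comparison of a Knuth-Bendix ordering with unit weights; and a conservative redundancy test that succeeds only when some unmatched literal of the main premise is a non-equality literal, relying on the convention that equality literals are smaller than non-equality ones. The clauses at the end are hypothetical.

```python
from collections import Counter

def is_var(t):
    return isinstance(t, str)

def match(pattern, target, subst):
    subst = dict(subst)
    if is_var(pattern):
        return subst if subst.setdefault(pattern, target) == target else None
    if is_var(target) or pattern[0] != target[0] or len(pattern) != len(target):
        return None
    for p, t in zip(pattern[1:], target[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst

def apply(t, subst):
    return subst.get(t, t) if is_var(t) else (t[0],) + tuple(apply(a, subst) for a in t[1:])

def size(t):
    return 1 if is_var(t) else 1 + sum(size(a) for a in t[1:])

def var_counts(t):
    if is_var(t):
        return Counter([t])
    total = Counter()
    for arg in t[1:]:
        total += var_counts(arg)
    return total

def greater(s, t):
    """Stand-in simplification ordering (assumption): larger and no extra variables."""
    vs, vt = var_counts(s), var_counts(t)
    return size(s) > size(t) and all(vs[v] >= n for v, n in vt.items())

def rewrites(t, lhs, rhs, subst):
    """Yield t with one subterm lhs*s replaced by rhs*s, where s extends subst and lhs*s > rhs*s."""
    s = match(lhs, t, subst)
    if s is not None and greater(apply(lhs, s), apply(rhs, s)):
        yield apply(rhs, s)
    if not is_var(t):
        for i in range(1, len(t)):
            for new_arg in rewrites(t[i], lhs, rhs, subst):
                yield t[:i] + (new_arg,) + t[i + 1:]

def matchings(lits, main, subst, used):
    """Enumerate all ways to map `lits` to distinct literals of `main`."""
    if not lits:
        yield subst, used
        return
    (positive, atom), rest = lits[0], lits[1:]
    for i, (m_positive, m_atom) in enumerate(main):
        if i not in used and positive == m_positive:
            s = match(atom, m_atom, subst)
            if s is not None:
                yield from matchings(rest, main, s, used | {i})

def subsumption_demodulation(side, main):
    """Return the simplified main premise, or None if this sketch finds no application."""
    for k, (positive, atom) in enumerate(side):
        if not (positive and atom[0] == "="):
            continue                                  # not a candidate rewriting equality
        rest = side[:k] + side[k + 1:]
        for lhs, rhs in ((atom[1], atom[2]), (atom[2], atom[1])):
            for subst, used in matchings(rest, main, {}, frozenset()):
                unmatched = [i for i in range(len(main)) if i not in used]
                if all(main[i][1][0] == "=" for i in unmatched):
                    continue                          # conservative redundancy test failed
                for i in unmatched:
                    sign, m_atom = main[i]
                    for j in range(1, len(m_atom)):   # rewrite only inside argument terms
                        for new_arg in rewrites(m_atom[j], lhs, rhs, subst):
                            new_atom = m_atom[:j] + (new_arg,) + m_atom[j + 1:]
                            return main[:i] + [(sign, new_atom)] + main[i + 1:]
    return None

a = ("a",)
side = [(True, ("=", ("f", ("g", "x")), ("g", "x"))), (True, ("q", "x"))]
main = [(True, ("p", ("f", ("g", a)))), (True, ("q", a))]
print(subsumption_demodulation(side, main))   # p(f(g(a))) is rewritten to p(g(a))
```

Note that the substitution is only ever applied to the side premise; variables of the main premise are treated like constants, in line with the remark that the premises are matched and not unified.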
Example 5 highlights limitations of demodulation when compared to subsumption demodulation. We next illustrate different possible applications of subsumption demodulation using a fixed side premise and different main premises.
Example 6
Consider the clause . Only the first literal is a positive equality and as such eligible as rewriting equality. Note that and are incomparable w.r.t. due to occurrences of different variables, and hence whether depends on the chosen substitution .
(1) Consider the clause as the main premise. With the substitution , we have as due to the subterm property of , enabling a possible application of subsumption demodulation over and .
(2) Consider now as the main premise and the substitution . We have , as . The instance of the rewriting equality is oriented differently in this case than in the previous one, enabling a possible application of subsumption demodulation over and .
(3) On the other hand, using the clause as the main premise, the only substitution we can use is . The corresponding instance of the rewriting equality is then , which cannot be oriented in general. Hence, subsumption demodulation cannot be applied in this case, even though we can find the matching term in . ∎
As mentioned before, the substitution appearing in subsumption demodulation can only be used to instantiate the side premise, but not for unifying side and main premises, as we would not obtain a simplification rule.
Example 7
Consider the clauses:
As we cannot match to (although we could match to ), subsumption demodulation is not applicable with premises and . ∎
4.2 Simplification using Subsumption Demodulation
Note that in the special case where is the empty clause in (5), subsumption demodulation reduces to demodulation and hence it is a simplification rule. We next show that this is the case in general:
Theorem 4.2 (Simplification rule)
Subsumption demodulation is a simplification rule and we have:
where:

,

,

, and

.
Proof
Because of the second condition of the definition of subsumption demodulation, is clearly a logical consequence of and . Moreover, from the fourth condition, we trivially have . It thus remains to show that is smaller than w.r.t. . As , the monotonicity property of asserts that , and hence . This concludes that is redundant w.r.t. the conclusion and leftmost premise of subsumption demodulation. ∎
4.3 Refining Redundancy
The fourth condition defining subsumption demodulation in Definition 1 is needed to ensure that the main premise of subsumption demodulation becomes redundant. However, comparing clauses w.r.t. the ordering is computationally expensive and, as it turns out, not necessary for subsumption demodulation. Following the notation of Definition 1, let such that . By properties of multiset orderings, the condition is equivalent to , as the literals in occur on both sides of . This means that, to ensure the redundancy of the main premise of subsumption demodulation, we only need to ensure that there is a literal from such that this literal is bigger than the rewriting equality.
Theorem 4.3 (Refining redundancy)
The following two conditions are equivalent:
As mentioned in Section 4.1, application of subsumption demodulation involves checking that an ordering condition between premises holds (side condition 4 in Definition 1). Theorem 4.3 asserts that we only need to find a literal in that is bigger than the rewriting equality in order to ensure that the ordering condition is fulfilled. In the next section we show that by reusing and properly changing the underlying machinery of first-order provers for demodulation and subsumption, subsumption demodulation can efficiently be implemented in superposition-based proof search.
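The multiset property behind this refinement can be checked on a small scale. The Python sketch below implements the standard multiset extension of a strict ordering by first cancelling common elements; integers stand in for literals, and the function name and the sample multisets are assumptions of this sketch.

```python
from collections import Counter
from operator import gt

def multiset_greater(m, n, greater):
    """Multiset extension of the strict ordering `greater`.

    After cancelling the elements common to both multisets, m is bigger
    than n iff something is left in m and every element left in n is
    dominated by some element left in m.
    """
    m, n = Counter(m), Counter(n)
    common = m & n                                   # multiset intersection
    m_rest = list((m - common).elements())
    n_rest = list((n - common).elements())
    return bool(m_rest) and all(any(greater(x, y) for x in m_rest) for y in n_rest)

# Integers stand in for literals: 3 and 2 play the matched literals that occur
# in both premises, 7 an unmatched literal, and 5 the rewriting equality.
print(multiset_greater([7, 3, 2], [5, 3, 2], gt))    # True
print(multiset_greater([7], [5], gt))                # True: same answer after cancelling
```

Because the shared elements cancel, only the unmatched literals of the main premise have to be compared against the instance of the rewriting equality.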
5 Subsumption Demodulation in Vampire
We implemented subsumption demodulation in the first-order theorem prover Vampire. Our implementation consists of about 5000 lines of C++ code and is available at:
As for any simplification rule, we implemented the forward and backward versions of subsumption demodulation separately. Our new Vampire options controlling subsumption demodulation are fsd and bsd, both with possible values on and off, to respectively enable forward and backward subsumption demodulation.
As discussed in Section 4, subsumption demodulation uses reasoning based on a combination of demodulation and subsumption. Algorithm 1 details our implementation for forward subsumption demodulation. In a nutshell, given a clause as main premise, (forward) subsumption demodulation in Vampire consists of the following main steps:

Prune candidate clauses by checking the conditions of subsumption demodulation (lines 1–1 of Algorithm 1), in particular selecting a rewriting equality and matching the remaining literals of the side premise to literals of the main premise. After this, prune further by performing a posteriori checks for orienting the rewriting equality , and checking the redundancy of the given main premise . To do so, we revised multi-literal matching and redundancy checking in Vampire (see later).
Our implementation of backward subsumption demodulation requires only a few changes to Algorithm 1: (i) we use the input clause as side premise of backward subsumption demodulation and (ii) we retrieve candidate clauses as potential main premises of subsumption demodulation. Additionally, (iii) instead of returning a single simplified clause , we record a replacement clause for each candidate clause where a simplification was possible.
Clause indexing for subsumption demodulation.
We build upon the indexing approach [14] used for subsumption in Vampire: the subsumption index in Vampire stores and retrieves candidate clauses for subsumption. Each clause is indexed by exactly one of its literals. In principle, any literal of the clause can be chosen. In order to reduce the number of retrieved candidates, the best literal is chosen in the sense that the chosen literal maximizes a certain heuristic (e.g. maximal weight). Since the subsumption index is not a perfect index (i.e., it may retrieve non-subsumed clauses), additional checks on the retrieved clauses are performed.
Using the subsumption index of Vampire as the clause index for forward subsumption demodulation would however omit retrieving clauses (side premises) in which the rewriting equality is chosen as key for the index, omitting this way a possible application of subsumption demodulation. Hence, we need a new clause index in which the best literal can be adjusted to be the rewriting equality. To address this issue, we added a new clause index, called the forward subsumption demodulation index (FSD index), to Vampire, as follows: we index potential side premises either by their best literal (according to the heuristic), the second best literal, or both. If the best literal in a clause is a positive equality (i.e. a candidate rewriting equality) but the second best is not, is indexed by the second best literal, and vice versa. If both the best and second best literal are positive equalities, is indexed by both of them. Furthermore, because the FSD index is exclusively used by forward subsumption demodulation, this index only needs to keep track of clauses that contain at least one positive equality.
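The key-selection policy of the FSD index described above can be summarized in a few lines of Python. This is a hedged sketch, not Vampire code: the literal encoding, the use of term size as the ranking heuristic, and the function names are assumptions. Two cases are not spelled out in the text and are likewise assumptions here: clauses in which neither of the two best literals is a positive equality are indexed by their best literal, and unit clauses are skipped.

```python
def size(t):
    """Number of symbols in a term; variables are str, other terms are tuples."""
    return 1 if isinstance(t, str) else 1 + sum(size(a) for a in t[1:])

def is_positive_equality(lit):
    positive, atom = lit
    return positive and atom[0] == "="

def fsd_index_keys(clause):
    """Return the literals under which a potential side premise is indexed."""
    if len(clause) < 2 or not any(map(is_positive_equality, clause)):
        return []                      # no rewriting equality plus side condition
    ranked = sorted(clause, key=lambda lit: size(lit[1]), reverse=True)
    best, second = ranked[0], ranked[1]
    best_eq, second_eq = is_positive_equality(best), is_positive_equality(second)
    if best_eq and second_eq:
        return [best, second]          # either one may serve as rewriting equality
    if best_eq:
        return [second]                # keep the key out of the rewriting equality
    return [best]                      # best literal is not an equality (or neither is)

eq = (True, ("=", ("f", ("g", "x")), "x"))
guard = (False, ("q", "x"))
print(fsd_index_keys([eq, guard]))     # indexed by the guard literal only
```

Indexing by a literal other than the rewriting equality matters because the index key is the literal that gets matched against the main premise, whereas the rewriting equality itself is not matched against a whole literal.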
In the backward case, we can in fact reuse Vampire’s index for backward subsumption. Instead we need to query the index by the best literal, the second best literal, or both (as described in the previous paragraph).
Multiliteral matching.
Similarly to the subsumption index, our new subsumption demodulation index is not a perfect index, that is, it performs imperfect filtering for retrieving clauses. Therefore, additional post-checks are required on the retrieved clauses. In our work, we devised a multi-literal matching approach to:
– choose the rewriting equality among the literals of the side premise , and
– check whether the remaining literals of can be uniformly instantiated to the literals of the main premise of subsumption demodulation.
There are multiple ways to organize this process. A simple approach is to (i) first pick any equality of a side premise as the rewriting equality of subsumption demodulation, and then (ii) invoke the existing multi-literal matching machinery of Vampire to match the remaining literals of with a subset of literals of . For the latter step (ii), the task is to find a substitution such that becomes a submultiset of the given clause . If the choice of the rewriting equality in step (i) turns out to be wrong, we backtrack. In our work, we revised the existing multi-literal matching machinery of Vampire to a new multi-literal matching approach for subsumption demodulation, by using the steps (i)-(ii) and interleaving equality selection with matching.
We note that the substitution in step (ii) above is built in two stages: first we get a partial substitution from multi-literal matching and then (possibly) extend to by matching term instances of the rewriting equality with terms of .
Example 9
Let be the clause . Assume that our (FSD) clause index retrieves the clause from the search space (line 1 of Algorithm 1). We then invoke our multi-literal matcher (line 1 of Algorithm 1), which matches the literal of to the literal of and selects the equality literal of as the rewriting equality for subsumption demodulation over and . The matcher returns the choice of rewriting equality and the partial substitution . We arrive at the final substitution only when we match the instance , that is , of the left-hand side of the rewriting equality to the literal of . Using , subsumption demodulation over and will derive , after ensuring that becomes redundant (line 1 of Algorithm 1). ∎
We further note that multi-literal matching is an NP-complete problem. Our multi-literal matching problems may have more than one solution, with possibly only some (or none) of them leading to successful applications of subsumption demodulation. In our implementation, we examine all solutions retrieved by multi-literal matching. We also experimented with limiting the number of matches examined after multi-literal matching but did not observe relevant improvements. Yet, our implementation in Vampire also supports an additional option allowing the user to specify an upper bound on how many solutions of multi-literal matching should be examined.
Redundancy checking.
To ensure redundancy of the main premise after the subsumption demodulation inference, we need to check two properties. First, the instance of the rewriting equality must be oriented. This is a simple ordering check. Second, the main premise must be larger than the side premise . Thanks to Theorem 4.3, this latter condition is reduced to finding a literal among the unmatched part of the main premise that is bigger than the instance of the rewriting equality .
Example 10
In case of Example 9, the rewriting equality is oriented and hence is also oriented. Next, the literal is bigger than , and hence is redundant w.r.t. and . ∎
6 Experiments
We evaluated our implementation of subsumption demodulation in Vampire on the examples of the TPTP [16] and SMT-LIB [4] repositories. All our experiments were carried out on the StarExec cluster [15].
Benchmark setup. From the 22,686 problems in the TPTP benchmark set, Vampire can parse 18,232 problems. Out of these problems, we only used those problems that involve equalities, as subsumption demodulation is only applicable in the presence of (at least one) equality. As such, we used 13,924 TPTP problems in our experiments.
On the other hand, when using the SMT-LIB repository, we chose the benchmarks from categories LIA, UF, UFDT, UFDTLIA, and UFLIA, as these benchmarks involve reasoning with both theories and quantifiers and the background theories are the theories that Vampire supports. These are 22,951 SMT-LIB problems in total, of which 22,833 problems remain after removing those where equality does not occur.
Comparative experiments with Vampire. As a first experimental study, we compared the performance of subsumption demodulation in Vampire for different values of fsd and bsd, that is, by using forward (FSD) and/or backward (BSD) subsumption demodulation. To this end, we evaluated subsumption demodulation using the CASC and SMT-COMP schedules of Vampire’s portfolio mode. In order to test subsumption demodulation with the portfolio mode, we added the options fsd and/or bsd to all strategies of Vampire. While the resulting strategy schedules could potentially be further improved, it allowed us to test FSD/BSD with a variety of strategies.
Table 1 (TPTP problems)

| Configuration | Total | Solved | New (SAT+UNSAT) |
|---|---|---|---|
| Vampire | 13,924 | 9,923 | – |
| Vampire, with FSD | 13,924 | 9,757 | 20 (3+17) |
| Vampire, with BSD | 13,924 | 9,797 | 14 (2+12) |
| Vampire, with FSD and BSD | 13,924 | 9,734 | 30 (6+24) |
Table 2 (SMT-LIB problems)

| Configuration | Total | Solved | New (SAT+UNSAT) |
|---|---|---|---|
| Vampire | 22,833 | 13,705 | – |
| Vampire, with FSD | 22,833 | 13,620 | 55 (1+54) |
| Vampire, with BSD | 22,833 | 13,632 | 48 (0+48) |
| Vampire, with FSD and BSD | 22,833 | 13,607 | 76 (0+76) |
Our results are summarized in Tables 1 and 2. The first column of these tables lists the Vampire version and configuration, where Vampire refers to Vampire in its portfolio mode (version 4.4). Lines 2-4 of these tables use our new Vampire, that is, our implementation of subsumption demodulation in Vampire. The column “Solved” reports, respectively, the total number of TPTP and SMT-LIB problems solved by the considered Vampire configurations. Column “New” lists, respectively, the number of TPTP and SMT-LIB problems solved by the version with subsumption demodulation but not by the portfolio version of Vampire. This column also indicates in parentheses how many of the solved problems were satisfiable/unsatisfiable.
While in total the portfolio mode of Vampire can solve more problems, we note that this comes as no surprise, as the portfolio mode of Vampire is highly tuned using the existing Vampire options. In our experiments, we were interested to see whether subsumption demodulation in Vampire can solve problems that cannot be solved by the portfolio mode of Vampire. The columns “New” of Tables 1 and 2 give practical evidence of the impact of subsumption demodulation: there are 30 new TPTP problems and 76 SMT-LIB problems that the portfolio version of Vampire cannot solve, but forward and backward subsumption demodulation in Vampire can. (The list of these new problems is available at https://gist.github.com/JakobR/605a7b7db0101259052e137ade54b32c.)
New problems solved only by subsumption demodulation. Building upon our results from Tables 1 and 2, we analysed how many new problems subsumption demodulation in Vampire can solve when compared to other state-of-the-art reasoners. To this end, we evaluated our work against the superposition provers E (version 2.4) and Spass (version 3.9), as well as the SMT solvers CVC4 (version 1.7) and Z3 (version 4.8.7). We note however that, when using our 30 new problems from Table 1, we could not compare our results against Z3, as Z3 does not natively parse TPTP. On the other hand, when using our 76 new problems from Table 2, we only compared against CVC4 and Z3, as E and Spass do not support the SMT-LIB syntax.
Table 3 summarizes our findings. First, 11 of our 30 “new” TPTP problems can only be solved using forward and backward subsumption demodulation in Vampire; none of the other systems were able to solve these problems.
Second, while all our 76 “new” SMT-LIB problems can also be solved by CVC4 and Z3 together, we note that out of these 76 problems there are 10 problems that CVC4 cannot solve, and similarly 27 problems that Z3 cannot solve.
Solver/configuration                      TPTP problems  SMT-LIB problems
-------------------------------------------------------------------------
Baseline: Vampire, with FSD and BSD       30             76
E with autoschedule                       14             –
Spass (default)                           4              –
Spass (local contextual rewriting)        6              –
Spass (subterm contextual rewriting)      5              –
CVC4 (default)                            7              66
Z3 (default)                              –              49
Only solved by Vampire, with FSD and BSD  11             0
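The last row of Table 3 and the per-solver gaps quoted in the text follow from the same kind of set reasoning: a problem counts as solved only by Vampire if no other system solves it, and the number a solver misses is the size of the new-problem set minus what that solver covers. A small sketch under the assumption that each solver's results are given as a set of problem names (the function `compare` and all names below are hypothetical):

```python
def compare(new_problems, others):
    """Compare a set of new problems against other solvers.

    `others` maps a solver name to the set of problems it solved.
    Returns (unique, missed): the problems no other solver handles,
    and for each solver the number of new problems it fails on.
    """
    solved_by_any = set().union(*others.values())
    unique = new_problems - solved_by_any
    missed = {name: len(new_problems - solved) for name, solved in others.items()}
    return unique, missed


# Hypothetical toy data, not taken from the experiments.
new_problems = {"a", "b", "c", "d"}
others = {"solver_x": {"a", "b"}, "solver_y": {"b", "c", "z"}}

unique, missed = compare(new_problems, others)
print(sorted(unique), missed)
```

With the figures of Table 3, the same arithmetic gives 76 - 66 = 10 SMT-LIB problems missed by CVC4 and 76 - 49 = 27 missed by Z3, while the union of both covers all 76.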
Comparative experiments without AVATAR. Finally, we investigated the effect of subsumption demodulation in Vampire without AVATAR [18]. We used the default mode of Vampire (that is, without using a portfolio approach) and turned off the AVATAR setting. While this configuration solves fewer problems than the portfolio mode of Vampire, so far Vampire is the only superposition-based theorem prover implementing AVATAR. Hence, evaluating subsumption demodulation in Vampire without AVATAR is more relevant to other reasoners. Further, as AVATAR may often split non-unit clauses into unit clauses, it may potentially simulate applications of subsumption demodulation using demodulation. Table 4 shows that this is indeed the case: with both fsd and bsd enabled, subsumption demodulation in Vampire can prove 190 TPTP problems and 173 SMT-LIB examples that the default Vampire without AVATAR cannot solve. Again, the column “New” denotes the number of problems solved by the respective configuration but not by the default mode of Vampire without AVATAR.
                           TPTP problems                    SMT-LIB problems
Configuration              Total   Solved  New (SAT+UNSAT)  Total   Solved  New (SAT+UNSAT)
-------------------------------------------------------------------------------------------
Vampire                    13,924  6,601   –                22,833  9,608   –
Vampire, with FSD          13,924  6,539   152 (13+139)     22,833  9,597   134 (1+133)
Vampire, with BSD          13,924  6,471   112 (12+100)     22,833  9,541   87 (0+87)
Vampire, with FSD and BSD  13,924  6,510   190 (15+175)     22,833  9,581   173 (1+172)
7 Conclusion
We introduced the simplifying inference rule subsumption demodulation to improve support for reasoning with conditional equalities in superposition-based first-order theorem proving. Subsumption demodulation revises existing machinery of superposition provers and can therefore be efficiently integrated in superposition reasoning. Our implementation in Vampire shows that subsumption demodulation solves many new examples that existing provers, including first-order and SMT solvers, cannot handle. Future work includes the design of more sophisticated approaches for selecting rewriting equalities and improving the imperfect filtering of clause indexes.
Acknowledgements.
This work was funded by the ERC Starting Grant 2014 SYMCAR 639270, the ERC Proof of Concept Grant 2018 SYMELS 842066, the Wallenberg Academy Fellowship 2014 TheProSE, and the Austrian FWF research project W1255-N23.
References
 [1] Bachmair, L., Ganzinger, H.: Rewrite-Based Equational Theorem Proving with Selection and Simplification. J. Log. Comput. 4(3), 217–247 (1994)
 [2] Bachmair, L., Ganzinger, H., McAllester, D.A., Lynch, C.: Resolution Theorem Proving. In: Handbook of Automated Reasoning, pp. 19–99 (2001)
 [3] Barrett, C., Conway, C.L., Deters, M., Hadarean, L., Jovanović, D., King, T., Reynolds, A., Tinelli, C.: CVC4. In: International Conference on Computer Aided Verification. pp. 171–177. Springer (2011)
 [4] Barrett, C., Fontaine, P., Tinelli, C.: The Satisfiability Modulo Theories Library (SMT-LIB). www.SMT-LIB.org (2016)
 [5] Barthe, G., Eilers, R., Georgiou, P., Gleiss, B., Kovács, L., Maffei, M.: Verifying Relational Properties using Trace Logic. In: Proc. of FMCAD. pp. 170–178 (2019)
 [6] Bjørner, N., Gurfinkel, A., McMillan, K.L., Rybalchenko, A.: Horn clause solvers for program verification. In: Fields of Logic and Computation II. pp. 24–51 (2015)
 [7] De Moura, L., Bjørner, N.: Z3: An efficient SMT solver. In: International conference on Tools and Algorithms for the Construction and Analysis of Systems. pp. 337–340. Springer (2008)
 [8] Ganzinger, H., Hagen, G., Nieuwenhuis, R., Oliveras, A., Tinelli, C.: DPLL(T): Fast Decision Procedures. In: Proc. of CAV. pp. 175–188 (2004)
 [9] Hillenbrand, T., Piskac, R., Waldmann, U., Weidenbach, C.: From search to computation: Redundancy criteria and simplification at work. In: Voronkov, A., Weidenbach, C. (eds.) Programming Logics: Essays in Memory of Harald Ganzinger, pp. 169–193. Springer Berlin Heidelberg, Berlin, Heidelberg (2013)
 [10] Kovács, L., Voronkov, A.: First-order theorem proving and Vampire. In: International Conference on Computer Aided Verification. pp. 1–35. Springer (2013)
 [11] Nieuwenhuis, R., Rubio, A.: Paramodulation-Based Theorem Proving. In: Handbook of Automated Reasoning, pp. 371–443 (2001)
 [12] Reynolds, A., Woo, M., Barrett, C.W., Brumley, D., Liang, T., Tinelli, C.: Scaling Up DPLL(T) String Solvers Using Context-Dependent Simplification. In: Proc. of CAV. pp. 453–474 (2017)
 [13] Schulz, S., Cruanes, S., Vukmirovic, P.: Faster, higher, stronger: E 2.3. In: Proc. of CADE. pp. 495–507 (2019)
 [14] Sekar, R., Ramakrishnan, I.V., Voronkov, A.: Term indexing. In: Robinson, J.A., Voronkov, A. (eds.) Handbook of Automated Reasoning, pp. 1853–1964. Elsevier Science Publishers B. V. (2001)
 [15] Stump, A., Sutcliffe, G., Tinelli, C.: StarExec: A Cross-Community Infrastructure for Logic Solving. In: Proc. of IJCAR. pp. 367–373 (2014)
 [16] Sutcliffe, G.: The TPTP Problem Library and Associated Infrastructure. From CNF to TH0, TPTP v6.4.0. Journal of Automated Reasoning 59(4), 483–502 (Feb 2017)
 [17] Tange, O.: GNU Parallel 2018. Ole Tange (Mar 2018)
 [18] Voronkov, A.: AVATAR: the architecture for first-order theorem provers. In: Proc. of CAV. pp. 696–710 (2014)
 [19] Weidenbach, C.: Combining Superposition, Sorts and Splitting. In: Handbook of Automated Reasoning, pp. 1965–2013 (2001)
 [20] Weidenbach, C., Dimova, D., Fietzke, A., Kumar, R., Suda, M., Wischnewski, P.: SPASS version 3.5. In: Proc. of CADE. pp. 140–145 (2009)
 [21] Weidenbach, C., Wischnewski, P.: Contextual Rewriting in SPASS. In: Proc. of PAAR (2008)