1 Introduction
A hammer [7] for an interactive theorem prover (ITP) [23] typically translates an ITP goal into a formalism used by an Automated Theorem Prover (ATP). Since the most successful ATPs have so far been first-order, the focus has been on first-order translations. There is also interest in ATPs working in richer formalisms, such as monomorphic and polymorphic, typed first-order and higher-order logics. The TPTP formats have been adopted for this work, viz. TF0 [47], TF1 [8], TH0 [5], and TH1 [27]. An interesting related task is to create a (grand) unified large-theory benchmark that will allow fair comparison of such ATP systems, their combination and integration with premise selectors and machine learners [1] across the different formalisms. As a step towards creating such benchmarks we present two families of translations from the language of HOL4 [42] to the various TPTP formats. We have implemented these translations and plan to use them as the first (grand) unified benchmarks, generalizing existing benchmarks such as the CakeML export [33] that was used in the LTB division of the CASC-J9 ATP competition [48].
The rest of the paper is structured as follows. Section 2 introduces notation and the HOL syntax. Section 3 introduces the problems – the HOL4 standard library. Section 4 introduces the first family of translations, and Section 5 introduces the second family of translations. Section 6 discusses and compares the translations on an example, and Section 7 evaluates the translations using existing ATPs. Section 8 describes the updated CASC LTB division setup. Related work is discussed in Section 9.
2 Preliminaries
Since our work is based on the HOL4 standard library, it is necessary to start with brief comments about the syntax and notion of proof in HOL4. More detailed information is in [42, 20]. HOL4, like several other ITPs (e.g., Isabelle/HOL [37], HOL Light [21], ProofPower [29]), is based on an extension of Church’s simple type theory [13] that includes prefix polymorphism and type definitions [20]. HOL4 includes a type of propositions, a type of individuals and a type of functions from a type to a type . Parentheses are omitted, with associating to the right. In addition, there are type variables and defined types. At each point in the development of the HOL4 library there is a finite set of (previously defined) constants , and a finite set of (previously defined) type constructors giving a type for types . For simplicity we consider the signature to be fixed. To be more precise, we would need to speak of the set of types and terms relative to the current signature.
Terms are generated from constants and variables using application and abstractions in the expected way, for terms and . Parentheses are omitted, with application associating to the left. Binders have scope as far to the right as possible, consistent with parentheses. Multiple binders over the same type can be written in a combined form, e.g., means .
Constants may be polymorphic. There are two primitive polymorphic logical constants: is polymorphic with type and is polymorphic with type , where is a type variable. When terms are defined, such constants are used with a fixed type for written as a superscript. New polymorphic constants can be defined within a HOL4 theory.
Aside from and , implication of type is primitive. From these primitive logical constants it is possible to define , and as well as polymorphic operators and . The usual notation is used for these logical connectives, so that the binder notation is written as , using the same binding conventions as for abstractions. Similarly, is written as .
Terms of type are called propositions and we use and to range over propositions. A sequent is a pair where is a finite set of propositions and is a proposition. There is a notion of HOL4 provability for sequents. While our translations map HOL4 sequents to TPTP formulas, it is not our intention to mirror HOL4 provability in the target format. The intention, roughly speaking, is to gain information about when a HOL4 theorem is a consequence of previous HOL4 theorems, in some logic weaker than HOL4. Further discussion is beyond the scope of this paper.
In the simplest case, each translation will translate HOL4 types and HOL4 terms (including propositions) to terms in the target format. A type, term or sequent with no type variables is called monomorphic. As an optimization, some of the translations will translate some monomorphic HOL4 types to types in the target language. To have common notation, for a HOL4 type we write for translated as a term and for translated as a type. Another optimization is to translate HOL4 propositions (and sequents) to the level of formulas in the target language.
3 Problem Set: The HOL4 Standard Library
The current HOL4 standard library contains 15733 formulas: 8 axioms, 2294 definitions, and 13431 theorems. If most of the formulas were monomorphic and fell into a natural first-order fragment of HOL4, then there would be a natural translation into the FOF format. However, many formulas are either polymorphic or higher-order (or both), as Table 1 shows. Note that the numbers are not cumulative, e.g., the 2232 monomorphic first-order formulas do not include the 1632 unisorted first-order formulas, which could also be processed by an ATP that can handle the monomorphic types. The problem set consists of 12140 theorems proven in the HOL4 standard library, in the context of a finite set of dependencies used in the HOL4 proof [17]. (The remaining 1291 theorems were not included due to dependencies being erased during the build of the HOL4 library.)
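Table 1: Number of HOL4 formulas in each fragment (with the matching TPTP format).

| | First-order | Higher-order | Combined |
|---|---|---|---|
| Unisorted | 1632 (FOF) | 0 | 1632 |
| Monomorphic | 2232 (TF0) | 3683 (TH0) | 5915 |
| Polymorphic | 1536 (TF1) | 6650 (TH1) | 8186 |
| Combined | 5400 | 10333 | 15733 |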
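The counts in Table 1 can be cross-checked with a few lines of arithmetic. The following is only a sanity-check sketch; the variable names are illustrative and the numbers are copied from the text and the table.

```python
# Counts copied from Table 1: (first-order, higher-order) per row.
table1 = {
    "unisorted": (1632, 0),
    "monomorphic": (2232, 3683),
    "polymorphic": (1536, 6650),
}

first_order_total = sum(fo for fo, _ in table1.values())
higher_order_total = sum(ho for _, ho in table1.values())

# Library size as stated in the text: axioms + definitions + theorems.
library_total = 8 + 2294 + 13431

# Theorems in the library that are not part of the 12140-problem set.
excluded_theorems = 13431 - 12140

print(first_order_total, higher_order_total, library_total, excluded_theorems)
```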
4 The First Family of Translations
There already exist many families of translations for HOL to the TPTP format, usually developed for hammers [28, 36]. We have adopted and adapted them: (i) We have made the translations more local, by making them independent for each formula, i.e., unaffected by other formulae in the problem. (ii) We have made more problems provable (in principle) by introducing additional axioms and relying on an embedding of polymorphic types instead of relying on heuristic monomorphization. (iii) For the TH0 and TF0 formats, we make use of their polysorted nature by expressing the type of monomorphic constants directly with the available TF0 and TH0 types. The translations are described in the order TH1, TF1, FOF, TF0, TH0, as translations to the later formats take advantage of translation techniques used for earlier formats.
4.1 Translating to TH1
The TH1 format is a language that is strictly more expressive than HOL4. Therefore HOL4 formulas can be represented in TH1 with minimal effort. This produces the TH1I collection of ATP problems.
Alignment of Logical Constructions.
The TPTP format contains a set of reserved objects that have implicit definitions. Thus, the HOL4 objects are mapped to their TPTP counterparts in a natural way. The boolean type of HOL4 is mapped to the defined TPTP type $o. The arrow type operator is mapped to the TPTP arrow >. All other type operators are declared to take types and give a type, using the TPTP “type of types” $tType. For example, the type operator has type $tType > $tType.
The TPTP logical connectives are used at the top level of the translated formula whenever possible, but the corresponding HOL4 constants are used when necessary. Equivalences relating HOL4 logical constants to TPTP connectives are included.
Explicit type arguments.
In HOL4, a constant carries a type . This type is an instance of type that was given to when was created. By matching with , a type substitution can be inferred. Ordering the type variables in the domain of , the implicit type arguments of are deduced. Making the quantification of type variables and the type arguments explicit is required in the TH1 format. The effect that this requirement has on a constant declaration and a formula is shown in Example 1.
Example 1
(Explicit type arguments and type quantifications)
HOL4  TH1  

Type of  
Formula 
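The matching step that recovers the implicit type arguments of a constant can be sketched as follows. This is a minimal illustration under assumptions, not the HOL4 implementation: the type representation (strings for type variables, `(operator, [arguments])` tuples otherwise), the function names `match_type` and `type_arguments`, and the alphabetical ordering of type variables are all hypothetical choices.

```python
# A type variable is a str such as "'a"; an applied type operator is a
# tuple (operator_name, [argument types]), e.g. ("fun", ["'a", ("bool", [])]).
# This representation is an assumption made for the sketch.

def match_type(general, instance, subst=None):
    """Return a substitution {type variable: type} mapping `general`
    to `instance`, or None if `instance` is not an instance of `general`."""
    subst = {} if subst is None else subst
    if isinstance(general, str):                  # type variable
        if general in subst:
            return subst if subst[general] == instance else None
        subst[general] = instance
        return subst
    if isinstance(instance, str):                 # operator against a variable
        return None
    (op_g, args_g), (op_i, args_i) = general, instance
    if op_g != op_i or len(args_g) != len(args_i):
        return None
    for g, i in zip(args_g, args_i):
        if match_type(g, i, subst) is None:
            return None
    return subst


def type_arguments(general, instance):
    """Explicit type arguments, ordered by type variable name (assumed order)."""
    subst = match_type(general, instance)
    if subst is None:
        raise ValueError("instance does not match the declared type")
    return [subst[v] for v in sorted(subst)]
```

For a constant declared with type `'a -> 'a` and used at type `num -> num`, `type_arguments` returns the single argument `num`, which is what would be written explicitly in the TH1 formula.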
4.2 Translating to TF1
To create the TF1I collection of ATP problems all the higher-order features of the HOL4 problems have to be eliminated. We do this in a sequence of steps.
Lambda-lifting and Boolean-lifting.
One of the higher-order features is the presence of lambda-abstraction, so the translation starts by rewriting the formula using lambda-lifting [15, 36]. Before applying lambda-lifting, the translation is optimized by finding other ways to rewrite lambda-abstractions. The extensionality property is used to add extra arguments to lambdas appearing on either side of an equality, and then beta-reduce the formula. The lambda-lifting then creates a constant for the leftmost outermost lambda-abstractions appearing at the term level. This constant replaces the lambda-abstraction in the formula. A definition is given for , which may involve some variable capture (see Example 2). This procedure is repeated until all the atoms are free of lambda-abstractions. The definitions are part of the background theory and are considered to be axioms in the TF1 problem even if they were created from the conjecture.
Example 2
(Lambda-lifting with variable capture)
Original formula:  
Definition:  
New formula: 
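The lambda-lifting procedure described above can be illustrated with a small sketch. It assumes an untyped term representation (`("var", x)`, `("const", c)`, `("app", f, a)`, `("lam", x, body)`) and fresh constant names `f1`, `f2`, ...; both are hypothetical, and types and the extensionality-based preprocessing are ignored.

```python
import itertools


def fresh_names():
    """Generator of fresh constant names f1, f2, ... (hypothetical naming)."""
    return (f"f{i}" for i in itertools.count(1))


def free_vars(t):
    """Free variables of a term, in order of first occurrence."""
    kind = t[0]
    if kind == "var":
        return [t[1]]
    if kind == "const":
        return []
    if kind == "app":
        left = free_vars(t[1])
        return left + [v for v in free_vars(t[2]) if v not in left]
    return [v for v in free_vars(t[2]) if v != t[1]]   # "lam"


def lift(t, defs, fresh):
    """Replace each lambda-abstraction by a fresh constant applied to the
    abstraction's free variables; definitions (lhs, rhs) go into `defs`."""
    kind = t[0]
    if kind in ("var", "const"):
        return t
    if kind == "app":
        return ("app", lift(t[1], defs, fresh), lift(t[2], defs, fresh))
    # Lambda-abstraction: name the outermost one first, then lift its body.
    head = ("const", next(fresh))
    for v in free_vars(t):                  # captured variables become arguments
        head = ("app", head, ("var", v))
    body = lift(t[2], defs, fresh)
    defs.append((("app", head, ("var", t[1])), body))
    return head
```

Lifting `P (λx. g x y)` yields `P (f1 y)` together with the definition `f1 y x = g x y`, where the free variable `y` of the abstraction has become an argument of the new constant.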
A similar method can be applied to move some logical constants from the term level to the formula level (see Example 3). This optimization is not applied in our second family of translations.
Example 3
(Boolean-lifting)
Original formula:  
Definition:  
New formula: 
To allow the ATPs to create their own encoded lambda-abstractions, axioms for the combinators S, K and I are added to every problem. (These axioms are omitted in our second family of translations. In the TH0II versions of the problem, combinators are not needed since all simply typed λ-calculus terms are already representable. In the TF0II and FOFII versions only an axiom for I and a partially applied axiom for K are included.) Combinator axioms often hinder the proof search and they are not needed for proving most problems.
Apply operator and Arity equations.
As functions cannot be passed as arguments in first-order logic, an explicit apply operator is used to apply a function to an argument. This way, all objects (constants and variables) have arity zero except for the apply operator which has arity two. The HOL4 functional extensionality axiom is added to all problems, as it expresses the main property of the apply operator:
This axiom also demonstrates how the higher-order variables and become first-order variables after the introduction of .
To limit the number of apply operators in a formula, versions of each constant are defined for all its possible arities, in terms of the zero-arity version. These constants are used in the translated formula (see Example 4).
Example 4
(Using constants with their arity)
Original formula:  
Arity equations:  
New formula: 
If the return type of a constant is a type variable then some of its instances can expect an arbitrarily large number of arguments. In the case where the number of arguments of an instance exceeds the number of arguments of the primitive constant, variants of this constant are not created for this arity. Instead, the apply operator is used to reduce the number of arguments to . For example, the term is not translated to but instead to .
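The interplay between arity versions of constants and the apply operator can be sketched as follows. This is an illustration under assumptions: the operator name `app`, the naming scheme `c_n` for the arity-`n` version of a constant `c`, and the `arity` table giving the number of arguments of each primitive constant are hypothetical, and the input is assumed to be free of lambda-abstractions.

```python
def spine(t):
    """Split a curried application into its head and its argument list."""
    args = []
    while t[0] == "app":
        args.append(t[2])
        t = t[1]
    return t, args[::-1]


def to_first_order(t, arity):
    """Render a lambda-free term in first-order syntax.

    `arity` maps each constant to the number of arguments of its primitive
    (most general) version; surplus arguments go through `app`."""
    head, args = spine(t)
    args = [to_first_order(a, arity) for a in args]
    kind, name = head
    if kind == "const":
        n = min(len(args), arity[name])
        out = f"{name}_{n}"
        if n:
            out += "(" + ", ".join(args[:n]) + ")"
    else:
        n, out = 0, name                    # variables always have arity zero
    for a in args[n:]:
        out = f"app({out}, {a})"
    return out


def arity_equation(name, n):
    """Equation defining the arity-n version via the zero-arity version."""
    xs = [f"X{i}" for i in range(1, n + 1)]
    rhs = f"{name}_0"
    for x in xs:
        rhs = f"app({rhs}, {x})"
    return f"{name}_{n}({', '.join(xs)}) = {rhs}"
```

With a primitive arity of one for an identity-like constant `I`, the term `I I x` is rendered as `app(I_1(I_0), x)` rather than with a two-argument variant of `I`, matching the restriction described above.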
TF1 types.
As a final step, type arguments and type quantifications are added as in Section 4.1. Moreover, the boolean type of HOL4 is replaced by $o at the formula level, and by at the term level (because $o is not allowed at the term level in a first-order formula). This causes a mismatch between the type of the atom and the type of the logical connective $o. Therefore an additional operator is applied on top of every atom. The following properties of and are added to every translated problem (written here in the first-order style for function application):
In a similar manner, the TPTP arrow type cannot be used whenever a function appears as an argument. In this case, the type constructor is used, as illustrated by the following constant declaration:
4.3 Translating to FOF
The translation to FOF (which produces the FOFI collection of ATP problems) follows exactly the same path as the translation to TF1 except that the types are encoded as first-order terms. To represent the fact that a first-order term has type , the tagging function introduced by Hurd [24] is used: every term of type is replaced by . Going from the type to the term effectively transforms type variables into term variables, and type operators into first-order functions and constants. Type arguments are not necessary anymore as the tags contain enough information. In practice, the tagging function prevents terms of different types from unifying, and allows instantiation of type variables (see Example 5).
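The tagging scheme can be sketched in a few lines. The sketch assumes typed first-order terms given as `(name, type, [arguments])`, reuses the tuple representation of types from a type operator name and its arguments, and uses the hypothetical function symbol `tag` for the tagging function; none of these names come from the actual export.

```python
def type_to_term(ty):
    """Encode a type as a first-order term: type variables become term
    variables (upper case), type operators become functions or constants."""
    if isinstance(ty, str):                       # type variable such as "'a"
        return ty.lstrip("'").upper()
    op, args = ty
    if not args:
        return op
    return f"{op}({', '.join(type_to_term(a) for a in args)})"


def tag(term):
    """Wrap every subterm in a tag carrying its encoded type."""
    name, ty, args = term
    inner = name if not args else f"{name}({', '.join(tag(a) for a in args)})"
    return f"tag({type_to_term(ty)}, {inner})"
```

Because the encoded type is the first argument of every tag, two tagged terms can only unify when their types unify, and a variable tag such as `tag(A, x)` can be instantiated to `tag(num, x)` by ordinary first-order unification.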
Example 5
(Type instantiation)
TF1  FOF 

4.4 Translating to TF0
An easy way to translate HOL4 formulas to TF0 (to produce the TF0I collection of ATP problems) is to take the translation to FOF and inject it into TF0.
Trivial Injection from FOF to TF0.
The first step is to give types to all the constants and variables appearing in the FOF formula. A naive implementation would be to give the type to symbols with arity . However, since it is known that the first argument comes from the universe of non-empty types, and the second argument comes from the universe of untyped terms, an explicit distinction can be made. The type of is defined to be , with being the universe of non-empty types, being the universe of untyped terms, and being the universe of typed terms. After this translation a type operator (or type variable) with arguments has type and a function (or term variable) with arguments has type . The type of is . Declaring the type of all these objects achieves a trivial translation from FOF with tags to TF0.
Using Special Types.
To take full advantage of the polysorted nature of TF0, a constant is declared for every constant , with arity and monomorphic type . The type of is declared to be , where , and are basic types. A basic type constructs a single type from a monomorphic type, e.g., for , for . The basic types are our special types and are declared using $tType. Thanks to these new constants monomorphic formulas can be expressed in a natural way, without type encodings in the formula. Nevertheless, an ATP should still be able to perform a type instantiation if necessary. That is why we relate the monomorphic representation with its tagged counterpart.
If a term has a basic type then it lives in the monomorphic world, whereas as a term of type it belongs to the tagged world. All monomorphic terms (constructed from monomorphic variables and constants) can be expressed in the monomorphic world. To relate the two representations of the same HOL4 term an “injection” and a “surjection” are defined for each basic type . The constants and are expected to respect the following properties, which are included as axioms in the translated problems:
Whenever is an instance of a polymorphic function , the following equation is included in the TF0 problem, which relates the two representatives:
Effect on defined operators.
The operator is treated in the same way as every other constant. In particular, a different version of is created for each monomorphic type. The type of becomes , and the projection is used to transfer atoms from the tagged world to the monomorphic world.
If the presence of the predicate and the inclusion of additional equations are ignored, our translation of a HOL4 first-order monomorphic formula using special types to TF0 is simply the identity transformation.
4.5 Translating to TH0
Translating from HOL4 to TH0 (to produce the TH0I collection of ATP problems) is achieved in a way similar to the translation to TF0. The HOL4 formulas are first translated to FOF, and then trivially injected into TH0. Special types are used for basic types extracted from monomorphic types. The set of higher-order basic types is slightly different from the first-order ones, where we recursively remove arrow types until a non-arrow constructor is found. In the higher-order setting a single monomorphic constant can be used to replace all arity versions of : . Another benefit of the expressivity of TH0 is that the basic type can be replaced by , and the predicate can be omitted. The effect of the previous steps is illustrated in Example 6.
Example 6
(Translations of )
In this example has type
where is the special type corresponding to .
TF0  TH0 

In order to have the same shallowness result in TH0 as for TF0, it would be necessary to replace monomorphic constants created by the lifting procedure by their lambda-abstractions. We chose to keep the definitions for the lifted constants as they allow some term-level logical operators to be pushed to the formula level.
5 The Second Family of Translations
The second family of translations into TH0, TF0, and FOF is semantically motivated: we make use of constructors known to be definable in set theory. Types and terms are translated to sets, where types must translate to non-empty sets. The translation may optionally use other special types for monomorphic types in the HOL4 source. In the TH0 case the built-in type $o can be used for the HOL4 type . In the first-order cases HOL4 terms of type are sometimes translated to terms, and sometimes to formulas, depending on how the HOL4 term is used. In the TF0 case a separate type of booleans is declared, which is used as the type of terms translated from HOL4 terms of type . In the FOF case this approach is not possible, as all terms have the same type (intuitively representing sets). The other main difference between the translation to TH0 and the translations to the first-order languages is that the first-order translations make use of lambda lifting [15, 36]. As a result of the translations we obtain three new collections of ATP problems: TH0II, TF0II and FOFII.
5.1 Translating to TH0
The base types (for propositions, written as $o in TH0) and (for individuals, written as $i in TH0) are built into TH0. In addition a base type is declared. The translation treats elements of type as sets, and elements of type as nonempty sets. The basic constants used in the ATP problems are as follows:

is used for a fixed two element set.

is used for a fixed nonempty set corresponding to HOL4’s type of individuals.

is used to construct the function space of two sets.

corresponds to the membership relation on sets, where the second set is known to be nonempty. We will write for i The term is written as , and the term is written as .

corresponds to set theory level application. (represented as a set).

is used to build set bounded abstractions as sets.

is a predicate that indicates whether an element of is true.

is an injection of into , essentially translating false to a set and true to a different set.
The basic axioms included in each ATP problem are:
 :

.
 :

.
 :

.
 :

.
 :

.
 :

 :

.
If is interpreted using a model of ZFC and using a copy of the nonempty sets in this model, then the constants above can be interpreted in an obvious way so as to make the basic axioms true.
Given this theory, a basic translation from HOL4 to TH0 can be informally described as follows. Each HOL4 type (including type variables) is mapped to a term of type . HOL4 type variables (constants) are mapped to TH0 variables (constants) of type . For the remaining cases , , and are used. Each HOL4 term is mapped to a TH0 term of type , for which the context is always known. The invariant can be maintained by including the hypothesis whenever is a variable or a constant. The and constants are used to handle HOL4 applications and abstractions. The axioms and ensure the invariant is maintained. Finally HOL4 propositions (which may quantify over type variables) are translated to TH0 propositions in an obvious way, using to go from to and to go from to when necessary. As an added heuristic, the translation makes use of TH0 connectives and quantifiers as deeply as possible, delaying the use of when possible.
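The type-level part of this translation, together with the membership guard on quantified variables, can be sketched as follows. All identifiers in the generated TH0 text (`arr` for the function-space constructor, `mem` for membership, `set` for the declared base type of sets) are placeholder names assumed for the sketch, as is the tuple representation of HOL4 types.

```python
def type_as_set(ty):
    """Translate a HOL4 type to a TH0 term denoting a non-empty set.

    Type variables become TH0 variables, function types use the
    function-space constructor, other type operators become constants."""
    if isinstance(ty, str):                       # type variable such as "'a"
        return ty.lstrip("'").upper()
    op, args = ty
    if op == "fun":                               # function space of two sets
        dom, cod = (type_as_set(a) for a in args)
        return f"(arr @ {dom} @ {cod})"
    if not args:                                  # nullary type constants
        return op
    return "(" + " @ ".join([op] + [type_as_set(a) for a in args]) + ")"


def guarded_forall(var, ty, body):
    """Quantify over all sets, guarded by membership in the translated type."""
    guard = f"(mem @ {var} @ {type_as_set(ty)})"
    return f"(! [{var}: set] : ({guard} => {body}))"
```

The guard is what maintains the invariant that a translated variable denotes an element of the set its HOL4 type translates to; special types, discussed next, remove the need for such guards on monomorphic types.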
Using Special Types.
As with the first family of translations, the second family optimizes by using special types for HOL4 types with no type variables, e.g., and . Unlike the first family, special types are not used for monomorphic function types. As a result, it is not necessary to consider alternative operators. A basic monomorphic type is a monomorphic type which is not of the form . If special types are used, then for each basic monomorphic type occurring in a proposition a corresponding TH0 type is declared, mappings and axioms relating to the type of sets are declared, and the type is used to translate terms of the type and quantifiers over the type when possible. For example, if a basic monomorphic type (e.g., ) occurs in a HOL4 proposition, then in addition to translating as a term we also declare a TH0 type , and along with axioms and .
One obvious basic monomorphic type is . In the case of we do not declare a new type, but use the TH0 type $o. That is, denotes $o. Note that is already declared. Additionally, is used as shorthand for , which has the desired type .
Suppose a HOL4 constant has type , where are basic monomorphic types with corresponding TH0 types . Instead of translating a term simply as a term of type , each is translated to a term of type , and a first order constant is used to translate to the term of type . In such a case an equation relating to is also included. Since the translation may return a term of type or , where is a basic monomorphic type, and are used to obtain a term of type or when one is required. If a quantifier ranges over a monomorphic type , a quantifier over type is used, instead of using a quantifier over type and using to guard the quantifier.
5.2 Translating to TF0
There are two main modifications to the translation to TH0 when targeting TF0. Firstly, propositions cannot be treated as special kinds of terms in TF0. In order to deal with this is treated like other special types by declaring a new type and functions and along with corresponding axioms as above. Note that unlike the TH0 case, differs from . In TF0 is a unary predicate on , and is a function from to . In the TF0 version of the axioms and , is replaced with . Secondly, the background theory cannot include the higher-order operator. Therefore the operator is omitted, and lambda lifting is used to translate (most) HOL4 abstractions. The two higher-order axioms and are also omitted.
In the TH0 case, the background axioms are enough to infer the following (internal) propositional extensionality principle
from the corresponding extensionality principle valid in TH0. This is no longer the case in TF0, so propositional extensionality is added as an axiom.
There are two special cases where lambda lifting can be avoided: identity and constant functions. For this purpose a new unary function on sets and a new binary function on sets are added. Two new basic axioms are added to the ATP problem for these functions:
 Id:

.
 Const:

.
A HOL4 term is translated as . For a HOL4 term , where is not free in , is translated to a first-order term of type , and the term is translated to . If there is already a function defined for (with the same variable names), then that function is reused. Otherwise, lambda lifting of proceeds as follows. Let be type variables occurring in and be the free variables occurring in . Assume the translation of as a first-order term with of type corresponding to the variable . (Note that this may have involved some lambda lifting.) Let be a new ary function returning sets. If special types are not being used, then each argument of is a set. If special types are used, then each argument is a set unless it corresponds to , where is a monomorphic type in which case the argument has type . The following axioms about are added to the ATP problem:
 :

.
 :

.
In these axioms the preconditions that each must be in if has type have been elided (otherwise special types are being used, is monomorphic, has type and no guard is required).
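The case distinction used when translating an abstraction (identity, constant function, or general lambda lifting with reuse) can be sketched as follows. The function symbols `I` and `K`, the names `f1`, `f2`, ... for lifted functions, and the term representation are assumptions of the sketch; type variables, special types and the generated axioms are left out, and the body is assumed to contain no further abstractions.

```python
def free_vars(t):
    """Free variables of an abstraction-free term."""
    kind = t[0]
    if kind == "var":
        return {t[1]}
    if kind == "const":
        return set()
    return free_vars(t[1]) | free_vars(t[2])      # ("ap", function, argument)


def show(t):
    """Render an abstraction-free term in first-order syntax."""
    if t[0] in ("var", "const"):
        return t[1]
    return f"ap({show(t[1])}, {show(t[2])})"


def translate_abstraction(var, domain, body, lifted):
    """Translate the abstraction of `var` over `body`, where `var` ranges
    over the set `domain`. `lifted` caches functions that were already
    introduced, so that an identical abstraction reuses its function."""
    if body == ("var", var):                      # identity function
        return f"I({domain})"
    if var not in free_vars(body):                # constant function
        return f"K({domain}, {show(body)})"
    key = (var, domain, body)
    if key not in lifted:
        lifted[key] = f"f{len(lifted) + 1}"       # new lifted function
    params = sorted(free_vars(body) - {var})
    return lifted[key] + (f"({', '.join(params)})" if params else "")
```

Only the third case introduces a new function symbol (and, in the real translation, its defining axioms), which is why the two special cases keep the generated problems smaller.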
5.3 Translating to FOF
In order to translate to FOF all terms must be translated to the same type, effectively the type . This requires omission of any special treatment of monomorphic types, and instead all HOL4 terms must be translated to terms of type . The type of nonempty sets must also be omitted. Instead, is used wherever was used in the TF0 setting, and quantifiers that were over are guarded by a new nonemptiness predicate . Aside from these changes, the translation proceeds using lambda lifting as in the TF0 case.
6 Case Study
A very simple HOL4 theorem is , where is defined to be . Informally the proof is clear: expand the definition of and perform two reductions. However, proving the various translated versions of the problem ranges from trivial to challenging.
The first family of translations makes use of a preprocessing step that changes the definition of from to
Note that this step makes the definition of the same (up to conversion) as the theorem. Even if further encodings are applied to obtain a first-order problem, the axiom will still be the same as the conjecture. Consequently all versions resulting from the first family of translations are trivially provable.
The TH0II version has conjecture
and the axiom (corresponding to the definition of )
The axiom defining combined with the basic axiom is enough to prove the theorem. However, the TH0II version also includes all the other basic axioms along with internal versions of the logical constants for universal quantification and equality. The extra axioms make the problem hard for ATP systems, but if only the necessary axioms are provided the problem is easy. In TF0II and FOFII the conjecture is the same as in the TH0II version, but the definition of is split into two functions declared when lambda lifting:
and
All the first-order versions of this problem are easy for current ATP systems.
7 Results
Since we are working in a large ITP library with a natural order of the problems, each translation can generate two versions of each problem. The bushy (small) version contains only the (translated) library facts that were needed for the ITP proof of each theorem. The chainy (large) version contains all the facts that precede the theorem in the library order, i.e., the real task faced by hammer systems. Chainy problems typically include thousands of axioms, requiring the use of premise selection algorithms [1] as a frontend in the ATP systems. Thus, in order to maintain the focus on ATP system performance, the results of running the ATP systems on the bushy problems are presented here.
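The difference between the two versions amounts to how the axiom set of a problem is chosen. A minimal sketch, assuming the library is available as an ordered list of fact names and the proof dependencies as a mapping (both hypothetical data structures, as are the fact names):

```python
def bushy_axioms(theorem, dependencies):
    """Facts that were used in the ITP proof of `theorem`."""
    return sorted(dependencies[theorem])


def chainy_axioms(theorem, library_order):
    """All facts that precede `theorem` in the library order."""
    return library_order[: library_order.index(theorem)]


# Toy library with made-up fact names, for illustration only.
library_order = ["ax1", "def1", "thm1", "thm2", "thm3"]
dependencies = {"thm3": {"def1", "thm1"}}

print(bushy_axioms("thm3", dependencies))     # the small problem
print(chainy_axioms("thm3", library_order))   # the large problem
```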
We have run a total of 19 ATPs on the 12140 problems in each of the bushy problem sets, according to the ATPs’ support for the various TPTP formats. The ATPs supporting TH1 were HOLyHammer 0.21 [28] and Leo-III 1.3 [45, 44]. The ATPs supporting TH0 were agsyHOL 1.0 [34], cocATP 0.2.0, LEO-II 1.7.0 [4], Leo-III 1.3, Satallax 3.3 [11], and Zipperposition 1.4 [14]. The ATPs supporting TF1 were Leo-III 1.3 and Zipperposition 1.4. The ATPs supporting TF0 were Beagle 0.9.47 [3], CVC4 1.6 [2], E 2.2 [41], iProverModulo 2.50.1 [12], Leo-III 1.3, Princess 170717 [39, 40], Vampire 4.3 [31], and Zipperposition 1.4. The ATPs supporting FOF were CSE_E 1.0 [51], E 2.2, iProver 2.8 [30], Metis 2.4 [25], Prover9 1109a [35], SPASS 3.9 [50], and Vampire 4.3. In each case we ran the ATP with a CPU time limit of 60s per problem. Table 2 summarizes the results.
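Table 2: Number of bushy problems solved by each ATP system in each problem collection.

| System | TH1I | TH0I | TH0II | TF1I | TF0I | TF0II | FOFI | FOFII | Union |
|---|---|---|---|---|---|---|---|---|---|
| agsyHOL | | 1374 | 1187 | | | | | | 1605 |
| Beagle | | | | | 2007 | 2047 | | | 2531 |
| cocATP | | 899 | 599 | | | | | | 1000 |
| CSE_E | | | | | | | 4251 | 3102 | 4480 |
| CVC4 | | | | | 4851 | 3991 | | | 5252 |
| E | | | | | 4277 | 3622 | 4618 | 3844 | 5118 |
| HOLyHammer | 5059 | | | | | | | | 5059 |
| iProver | | | | | | | 2778 | 2894 | 3355 |
| iProverModulo | | | | | 2435 | 1639 | | | 2699 |
| LEO-II | | 2579 | 1923 | | | | | | 3213 |
| Leo-III | 6668 | 5018 | 3485 | 3458 | 4032 | 3421 | | | 7062 |
| Metis | | | | | | | 2353 | 474 | 2356 |
| Princess | | | | | 3646 | 2138 | | | 3849 |
| Prover9 | | | | | | | 2894 | 1742 | 3128 |
| Satallax | | 2207 | 1292 | | | | | | 2494 |
| SPASS | | | | | | | 2850 | 3349 | 3821 |
| Vampire | | | | | 4837 | 4693 | 4008 | 4928 | 5929 |
| Zipperposition | | 2252 | 2161 | 3771 | 3099 | 2576 | | | 4203 |
| Union | 6824 | 5209 | 3771 | 4608 | 5732 | 5073 | 5165 | 5108 | 7377 |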
Of the 12140 problems 7377 (60.7%) were solved by some ATP in one of the representations. The TacticToe [18, 19] prover built into HOL4 has been tested as a baseline comparison, and it (re)proves 5327 of 8855 chainy versions of the problems (60%). TacticToe is a machine-learning guided prover that searches for a tactical proof by selecting suitable tactics and theorems learned from human-written tactical proofs. By design, this system works in the chainy setting. In total 8827 (73%) of the 12140 problems can be proved by either TacticToe or one of the ATPs using one of the representations.
8 GRUNGE as CASC LTB Division
The CADE ATP System Competition (CASC) [46] is the annual evaluation of fully automatic, classical logic Automated Theorem Proving (ATP) systems – the world championship for such systems. CASC is divided into divisions according to problem and system characteristics. Each competition division uses problems that have certain logical, language, and syntactic characteristics, so that the systems that compete in the division are, in principle, able to attempt all the problems in the division. For example, the First-Order Form (FOF) division uses problems in full first-order logic, with each problem having axioms and a conjecture to be proved.
While most of the CASC divisions present the problems to the ATP systems one at a time, with an individual CPU or wall clock time limit per problem, the Large Theory Batch (LTB) division presents the problems in batches, with an overall wall clock time limit on the batch. As the name also suggests, the problems in each batch come from a “large theory”, which typically has many functors and predicates, and many axioms of which only a few are required for the proof of a theorem. The problems in a batch typically have a common core set of axioms used by all problems, and each problem typically has additional axioms that are specific to the problem. The batch presentation allows the ATP systems to load and preprocess the common core set of axioms just once, and to share logical and control results between proof searches. Each batch is accompanied by a set of training problems and their solutions, taken from the same source as the competition problems. The training data can be used for ATP system tuning during (typically at the start of) the competition.
In CASC-J9, the most recent edition of the competition [48], the LTB division used first-order form problems exported from CakeML [32]. At the time there was growing interest in an LTB division for typed higher-order problems, and it was soon evident that a multi-format LTB division, in which each problem is encoded in several of the available TPTP languages, would add a valuable dimension to CASC. For CASC-27 each problem will be presented in multiple formats: TH1, TH0, TF1, TF0, and FOF. The work described in this paper will provide the problem set. Systems will be able to attempt whichever versions they want, and a solution to any version will constitute a solution to the problem. There are two ways that core ATP systems can attempt a particular form of a problem: If the system is able to handle the form natively then that is the obvious first approach. The alternative is to translate the problem to another “lower” form, either internally, or using existing translation tools available in, e.g., Isabelle [37] or Why3 [16]. For example, Leo-III [43] would be able to handle all the formats natively, while E [41] would need to translate TH1, TH0, and TF1 to TF0 or FOF, which it can handle natively.
The batch presentation of problems in the LTB division provides interesting opportunities for ATP systems, including making multiple attempts on problems and learning search heuristics from proofs found. The multi-format LTB division extends these possibilities, by allowing multiple attempts on problems by virtue of the multiple formats available, and learning from proofs found in one format to improve performance on problems in another format. The latter is especially interesting, with little known research effort in this direction, and is complicated by differences in symbol naming between the various exports from HOL4.
9 Related Work
The HOL4 library already has translations for SMT solvers such as Yices [49], Z3 [9] and Beagle. A link to first-order ATPs is also available thanks to exports [17] of HOL4 theories to the HOL(y)Hammer framework [28]. Another notable project that facilitates the export of HOL4 theories is OpenTheory [26]. The general approach for higher-order to first-order translations is laid out in Hurd [24]. An evaluation of the effect of different translations on ATP-provability was performed in [36]. A further study shows the potential improvements provided by the use of supercombinators [15]. In our work, the use of lambda-lifting (or combinators) is not necessary in TH0-II thanks to the use of the higher-order operator . This is similar to using higher-order abstract syntax to model syntax with binders [38].
A method for encoding polymorphic types as terms through type tags (as in our first translation) or type guards (as in our second translation) is described in [6]. Translations [22, 45] from a polymorphic logic to a monomorphic poly-sorted logic without encoding typically rely on heuristic instantiations of type variables. However, heuristics may miss useful instantiations, and make the translation less modular (i.e., context-dependent). Our translations to TH0 and TF0 try to get the best of both worlds by using a type encoding for polymorphic types and special types for basic monomorphic types.
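To make the difference between type tags and type guards concrete, the following sketch applies both encodings, in the general style described in [6], to one universally quantified statement. It is an illustration only: the symbols `ti` and `has_type`, the function `rev`, the helper functions, and the string-based output are hypothetical and are not taken from the translations in this paper.

```python
def tag(type_term, term):
    """Type-tag encoding: wrap a term together with a term for its type."""
    return f"ti({type_term},{term})"


def guard(type_term, var, body):
    """Type-guard encoding: restrict a universal quantifier with a predicate."""
    return f"![{var}]: (has_type({type_term},{var}) => {body})"


# Source statement (typed): for all X of type list(A), rev(rev(X)) = X.
type_term = "list(A)"

# Tags: every term carries its type, so the quantifier needs no restriction.
x = tag(type_term, "X")
inner = tag(type_term, f"rev({x})")
outer = tag(type_term, f"rev({inner})")
tagged = f"![A,X]: ({outer} = {x})"

# Guards: terms stay untouched; the quantifier is restricted instead.
guarded = "![A]: " + guard(type_term, "X", "rev(rev(X)) = X")
```

The tagged formula grows with the depth of the term, whereas the guarded formula keeps terms small but adds a hypothesis per bound variable. A guard-based encoding additionally needs typing axioms for the function symbols, which this sketch omits.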
10 Conclusion
This work has defined, evaluated and compared two families of translations of the HOL4 logic to a number of ATP formalisms, and described a new unified large-theory ATP benchmark (GRUNGE) based on them. The first family is designed to play to the strengths of the calculi of most ATP systems, while the second family is based on more straightforward semantics rooted in set theory. The case study shows how different the translated problems may be, even in a simple example. A number of methods and optimizations have been used; however, it is clear that these translations can be further optimized and that different encodings favour different provers. Out of 12140 HOL4 theorems, the ATP systems we evaluated could solve 7377 problems in one or more of the formats. The TacticToe system, which works directly in the HOL4 formalism and uses HOL4 tactics, could solve 5327 problems. Together the total number of problems solved is 8827. Leo-III was the strongest system in the higher-order representations. In the first-order representations the strongest systems were Zipperpin, CVC4, E and Vampire. We are also planning to pre-release a part of both the bushy and chainy versions of the problems, to allow system developers to develop and tune their systems on them before the 2019 CASC LTB competition.
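The overlap between the two approaches is not stated explicitly, but it follows from the three counts above by inclusion-exclusion. The short calculation below derives it, assuming that the combined figure of 8827 is the size of the union of the two solved sets; the variable names and the derived figures are ours and are not reported results.

```python
total = 12140            # HOL4 theorems in the benchmark
atp_solved = 7377        # solved by some ATP system in at least one format
tactictoe_solved = 5327  # solved by TacticToe
union_solved = 8827      # solved by either approach (assumed to be the union)

# Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|
both = atp_solved + tactictoe_solved - union_solved
only_atp = atp_solved - both
only_tactictoe = tactictoe_solved - both
unsolved = total - union_solved


def pct(n):
    """Share of all theorems, as a percentage rounded to one decimal."""
    return round(100 * n / total, 1)


print(both, only_atp, only_tactictoe, unsolved)
print(pct(atp_solved), pct(tactictoe_solved), pct(union_solved))
```

Under that assumption, 3877 problems are solved by both approaches, 3500 only by the ATP systems, and 1450 only by TacticToe, so the two approaches are complementary: the union covers about 72.7% of the theorems, against about 60.8% for the ATP systems alone.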
References

[1] Alama, J., Heskes, T., Kühlwein, D., Tsivtsivadze, E., Urban, J.: Premise selection for mathematics by corpus analysis and kernel methods. J. Autom. Reasoning 52(2), 191–213 (2014). https://doi.org/10.1007/s10817-013-9286-5
[2] Barrett, C., Conway, C.L., Deters, M., Hadarean, L., Jovanović, D., King, T., Reynolds, A., Tinelli, C.: CVC4. In: Gopalakrishnan, G., Qadeer, S. (eds.) Conference on Computer Aided Verification (CAV). LNCS, vol. 6806, pp. 171–177. Springer (2011), https://doi.org/10.1007/978-3-642-22110-1_14
[3] Baumgartner, P., Waldmann, U.: Hierarchic superposition with weak abstraction. In: Bonacina, M.P. (ed.) Conference on Automated Deduction (CADE). Lecture Notes in Computer Science, vol. 7898, pp. 39–57. Springer (2013), http://dx.doi.org/10.1007/978-3-642-38574-2_3
[4] Benzmüller, C., Paulson, L., Theiss, F., Fietzke, A.: LEO-II - A Cooperative Automatic Theorem Prover for Higher-Order Logic. In: Baumgartner, P., Armando, A., Dowek, G. (eds.) Proceedings of the 4th International Joint Conference on Automated Reasoning. pp. 162–170. No. 5195 in Lecture Notes in Artificial Intelligence, Springer-Verlag (2008)
[5] Benzmüller, C., Rabe, F., Sutcliffe, G.: THF0 – the core of the TPTP language for classical higher-order logic. In: Armando, A., Baumgartner, P., Dowek, G. (eds.) Automated Reasoning, 4th International Joint Conference, IJCAR 2008, Sydney, Australia, August 12-15, 2008, Proceedings. LNCS, vol. 5195, pp. 491–506. Springer (2008), http://christoph-benzmueller.de/papers/C25.pdf
[6] Blanchette, J.C., Böhme, S., Popescu, A., Smallbone, N.: Encoding monomorphic and polymorphic types. In: Piterman, N., Smolka, S.A. (eds.) Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS). LNCS, vol. 7795, pp. 493–507. Springer (2013), https://doi.org/10.1007/978-3-642-36742-7_34
[7] Blanchette, J.C., Kaliszyk, C., Paulson, L.C., Urban, J.: Hammering towards QED. J. Formalized Reasoning 9(1), 101–148 (2016). https://doi.org/10.6092/issn.1972-5787/4593
[8] Blanchette, J.C., Paskevich, A.: TFF1: The TPTP typed first-order form with rank-1 polymorphism. In: Bonacina, M.P. (ed.) CADE. LNCS, vol. 7898, pp. 414–420. Springer (2013)
[9] Böhme, S., Weber, T.: Fast LCF-style proof reconstruction for Z3. In: Kaufmann, M., Paulson, L.C. (eds.) Conference on Interactive Theorem Proving (ITP). LNCS, vol. 6172, pp. 179–194. Springer (2010), http://dx.doi.org/10.1007/978-3-642-14052-5_14
[10] Bonichon, R., Delahaye, D., Doligez, D.: Zenon: An extensible automated theorem prover producing checkable proofs. In: Dershowitz, N., Voronkov, A. (eds.) Logic for Programming, Artificial Intelligence, and Reasoning, 14th International Conference, LPAR 2007, Yerevan, Armenia, October 15-19, 2007, Proceedings. Lecture Notes in Computer Science, vol. 4790, pp. 151–165. Springer (2007)
[11] Brown, C.E.: Satallax: An automatic higher-order prover. In: Gramlich, B., Miller, D., Sattler, U. (eds.) IJCAR. LNCS, vol. 7364, pp. 111–117. Springer (2012)
[12] Burel, G.: Experimenting with deduction modulo. In: Sofronie-Stokkermans, V., Bjørner, N. (eds.) CADE 2011. Lecture Notes in Artificial Intelligence, vol. 6803, pp. 162–176. Springer (2011)
[13] Church, A.: A formulation of the simple theory of types. J. Symbolic Logic 5, 56–68 (1940)
[14] Cruanes, S.: Extending Superposition with Integer Arithmetic, Structural Induction, and Beyond. (Extensions de la Superposition pour l’Arithmétique Linéaire Entière, l’Induction Structurelle, et bien plus encore). Ph.D. thesis, École Polytechnique, Palaiseau, France (2015), https://tel.archives-ouvertes.fr/tel-01223502
[15] Czajka, L.: Improving automation in interactive theorem provers by efficient encoding of lambda-abstractions. In: Avigad, J., Chlipala, A. (eds.) Proceedings of the 5th ACM SIGPLAN Conference on Certified Programs and Proofs, Saint Petersburg, FL, USA, January 20-22, 2016. pp. 49–57. ACM (2016). https://doi.org/10.1145/2854065.2854069
[16] Filliatre, J.C., Paskevich, A.: Why3 - Where Programs Meet Provers. In: Felleisen, M., Gardner, P. (eds.) Proceedings of the 22nd European Symposium on Programming. pp. 125–128. No. 7792 in Lecture Notes in Computer Science, Springer (2013)
[17] Gauthier, T., Kaliszyk, C.: Premise selection and external provers for HOL4. In: Certified Programs and Proofs (CPP’15). LNCS, Springer (2015). https://doi.org/10.1145/2676724.2693173
[18] Gauthier, T., Kaliszyk, C., Urban, J.: TacticToe: Learning to reason with HOL4 tactics. In: Eiter, T., Sands, D. (eds.) LPAR-21, 21st International Conference on Logic for Programming, Artificial Intelligence and Reasoning, Maun, Botswana, May 7-12, 2017. EPiC Series in Computing, vol. 46, pp. 125–143. EasyChair (2017), http://www.easychair.org/publications/paper/340355
[19] Gauthier, T., Kaliszyk, C., Urban, J., Kumar, R., Norrish, M.: Learning to prove with tactics. CoRR (2018), http://arxiv.org/abs/1804.00596
[20] Gordon, M.J.C., Melham, T.F. (eds.): Introduction to HOL: A theorem proving environment for higher order logic. Cambridge University Press (1993), http://www.cs.ox.ac.uk/tom.melham/pub/Gordon-1993-ITH.html
[21] Harrison, J.: HOL Light: A tutorial introduction. In: Srivas, M.K., Camilleri, A.J. (eds.) FMCAD. LNCS, vol. 1166, pp. 265–269. Springer (1996)
[22] Harrison, J.: Optimizing proof search in model elimination. In: McRobbie, M., Slaney, J. (eds.) Conference on Automated Deduction (CADE). pp. 313–327. No. 1104 in LNAI, Springer (1996), https://doi.org/10.1007/3-540-61511-3_97
[23] Harrison, J., Urban, J., Wiedijk, F.: History of interactive theorem proving. In: Siekmann, J.H. (ed.) Computational Logic, Handbook of the History of Logic, vol. 9, pp. 135–214. Elsevier (2014). https://doi.org/10.1016/B978-0-444-51624-4.50004-6
[24] Hurd, J.: First-order proof tactics in higher-order logic theorem provers. Design and Application of Strategies/Tactics in Higher Order Logics, number NASA/CP-2003-212448 in NASA Technical Reports pp. 56–68 (2003)
[25] Hurd, J.: System description: The Metis proof tactic. In: Christoph Benzmueller, John Harrison, C.S. (ed.) Workshop on Empirically Successful Automated Reasoning in Higher-Order Logic (ESHOL). pp. 103–104 (2005), https://arxiv.org/pdf/cs/0601042
[26] Hurd, J.: The OpenTheory standard theory library. In: Bobaru, M.G., Havelund, K., Holzmann, G.J., Joshi, R. (eds.) NASA Formal Methods. LNCS, vol. 6617, pp. 177–191. Springer (2011), http://dx.doi.org/10.1007/978-3-642-20398-5_14
[27] Kaliszyk, C., Sutcliffe, G., Rabe, F.: TH1: The TPTP Typed Higher-Order Form with Rank-1 Polymorphism. In: Fontaine, P., Schulz, S., Urban, J. (eds.) Proceedings of the 5th Workshop on Practical Aspects of Automated Reasoning. pp. 41–55. No. 1635 in CEUR Workshop Proceedings (2016)
[28] Kaliszyk, C., Urban, J.: Learning-assisted automated reasoning with Flyspeck. J. Autom. Reasoning 53(2), 173–213 (2014). https://doi.org/10.1007/s10817-014-9303-3
[29] King, D., Arthan, R., Winnersh, I.: Development of practical verification tools. ICL Systems Journal 11, 106–122 (1996)
[30] Korovin, K.: iProver - an instantiation-based theorem prover for first-order logic (system description). In: Armando, A., Baumgartner, P., Dowek, G. (eds.) Automated Reasoning, 4th International Joint Conference, IJCAR 2008, Sydney, Australia, August 12-15, 2008, Proceedings. Lecture Notes in Computer Science, vol. 5195, pp. 292–298. Springer (2008)
[31] Kovács, L., Voronkov, A.: First-order theorem proving and Vampire. In: Sharygina, N., Veith, H. (eds.) CAV. LNCS, vol. 8044, pp. 1–35. Springer (2013)
[32] Kumar, R., Myreen, M., Norrish, M., Owens, S.: CakeML: A Verified Implementation of ML. In: Sewell, P. (ed.) Proceedings of the 41st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. pp. 179–191. ACM Press (2014)
[33] Kumar, R., Myreen, M.O., Norrish, M., Owens, S.: CakeML: a verified implementation of ML. In: Jagannathan, S., Sewell, P. (eds.) The 41st Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL’14, San Diego, CA, USA, January 20-21, 2014. pp. 179–192. ACM (2014). https://doi.org/10.1145/2535838.2535841
[34] Lindblad, F.: A focused sequent calculus for higher-order logic. In: Demri, S., Kapur, D., Weidenbach, C. (eds.) Automated Reasoning - 7th International Joint Conference, IJCAR 2014, Held as Part of the Vienna Summer of Logic, VSL 2014, Vienna, Austria, July 19-22, 2014. Proceedings. Lecture Notes in Computer Science, vol. 8562, pp. 61–75. Springer (2014). https://doi.org/10.1007/978-3-319-08587-6_5
[35] McCune, W.: Prover9 and Mace4 (2005–2010), http://www.cs.unm.edu/~mccune/prover9/
[36] Meng, J., Paulson, L.C.: Translating higher-order clauses to first-order clauses. J. Autom. Reasoning 40(1), 35–60 (2008)
[37] Nipkow, T., Paulson, L., Wenzel, M.: Isabelle/HOL: A Proof Assistant for Higher-Order Logic. No. 2283 in Lecture Notes in Computer Science, Springer-Verlag (2002)
[38] Pfenning, F., Elliot, C.: Higher-order abstract syntax. In: Proceedings of the ACM SIGPLAN 1988 Conference on Programming Language Design and Implementation. pp. 199–208. PLDI ’88, ACM, New York, NY, USA (1988). https://doi.org/10.1145/53990.54010
[39] Rümmer, P.: A Constraint Sequent Calculus for First-Order Logic with Linear Integer Arithmetic. In: Cervesato, I., Veith, H., Voronkov, A. (eds.) Proceedings of the 15th International Conference on Logic for Programming, Artificial Intelligence, and Reasoning. pp. 274–289. No. 5330 in Lecture Notes in Artificial Intelligence, Springer-Verlag (2008)
[40] Rümmer, P.: E-Matching with Free Variables. In: Bjørner, N., Voronkov, A. (eds.) Proceedings of the 18th International Conference on Logic for Programming, Artificial Intelligence, and Reasoning. pp. 359–374. No. 7180 in Lecture Notes in Artificial Intelligence, Springer-Verlag (2012)
[41] Schulz, S.: System description: E 1.8. In: McMillan, K.L., Middeldorp, A., Voronkov, A. (eds.) LPAR. LNCS, vol. 8312, pp. 735–743. Springer (2013). https://doi.org/10.1007/978-3-642-45221-5_49
[42] Slind, K., Norrish, M.: A brief overview of HOL4. In: Mohamed, O.A., Muñoz, C.A., Tahar, S. (eds.) Theorem Proving in Higher Order Logics, 21st International Conference, TPHOLs 2008, Montreal, Canada, August 18-21, 2008. Proceedings. LNCS, vol. 5170, pp. 28–32. Springer (2008)
[43] Steen, A., Benzmüller, C.: The Higher-Order Prover Leo-III. In: Galmiche, D., Schulz, S., Sebastiani, R. (eds.) Proceedings of the 8th International Joint Conference on Automated Reasoning. pp. 108–116. No. 10900 in Lecture Notes in Artificial Intelligence (2018)
[44] Steen, A., Benzmüller, C.: The higher-order prover Leo-III. In: Galmiche, D., Schulz, S., Sebastiani, R. (eds.) Automated Reasoning. IJCAR 2018. LNCS, vol. 10900, pp. 108–116. Springer, Cham (2018), http://christoph-benzmueller.de/papers/C70.pdf
[45] Steen, A., Wisniewski, M., Benzmüller, C.: Going polymorphic - TH1 reasoning for Leo-III. In: Eiter, T., Sands, D., Sutcliffe, G., Voronkov, A. (eds.) IWIL@LPAR 2017 Workshop and LPAR-21 Short Presentations, Maun, Botswana, May 7-12, 2017. Kalpa Publications in Computing, vol. 1. EasyChair (2017), http://www.easychair.org/publications/paper/346851
[46] Sutcliffe, G.: The CADE ATP System Competition - CASC. AI Magazine 37(2), 99–101 (2016)
[47] Sutcliffe, G., Schulz, S., Claessen, K., Baumgartner, P.: The TPTP Typed First-order Form with Arithmetic. In: Bjørner, N., Voronkov, A. (eds.) Proceedings of the 18th International Conference on Logic for Programming, Artificial Intelligence, and Reasoning. pp. 406–419. No. 7180 in Lecture Notes in Artificial Intelligence, Springer-Verlag (2012)
[48] Sutcliffe, G.: The 9th IJCAR automated theorem proving system competition - CASC-J9. AI Commun. 31(6), 495–507 (2018). https://doi.org/10.3233/AIC-180773
[49] Weber, T.: SMT solvers: new oracles for the HOL theorem prover. International Journal on Software Tools for Technology Transfer 13(5), 419–429 (2011), http://dx.doi.org/10.1007/s10009-011-0188-8
[50] Weidenbach, C., Dimova, D., Fietzke, A., Kumar, R., Suda, M., Wischnewski, P.: SPASS Version 3.5. In: Schmidt, R.A. (ed.) CADE. LNCS, vol. 5663, pp. 140–145. Springer (2009)
[51] Xu, Y., Liu, J., Chen, S., Zhong, X., He, X.: Contradiction Separation Based Dynamic Multi-clause Synergized Automated Deduction. Information Sciences 462, 93–113 (2018)