 # The Epsilon Calculus with Equality and Herbrand Complexity

Hilbert's epsilon calculus is an extension of elementary or predicate calculus by a term-forming operator ε and initial formulas involving such terms. The fundamental results about the epsilon calculus are the so-called epsilon theorems, which are proven by means of the epsilon elimination method, a procedure that transforms a proof in epsilon calculus into a proof in elementary or predicate calculus by eliminating those initial formulas. One remarkable consequence is a proof of Herbrand's theorem due to Bernays and Hilbert, which comes as a corollary of the extended first epsilon theorem. The contribution of this paper is an analysis of upper and lower bounds on the length of Herbrand disjunctions in the extended first epsilon theorem for epsilon calculus with equality. We also show that the complexity analysis for Herbrand's theorem with equality is a straightforward consequence of the one for the extended first epsilon theorem without equality due to Moser and Zach.

## 1 Introduction

Hilbert’s epsilon calculus is an extension of predicate calculus by the $\varepsilon$-operator, which forms from a formula $A(x)$ a term $\varepsilon x A(x)$. This operator is governed by the following two initial formulas. One is the critical formula

 A(t)→A(εxA(x))

where $t$ is an arbitrary term, and the other is the $\varepsilon$-equality formula

$$\vec{u} = \vec{v} \to \varepsilon x B(x, \vec{u}) = \varepsilon x B(x, \vec{v})$$

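where $\vec{u}$ and $\vec{v}$ are sequences of terms of the same length $n$, $\vec{u} = \vec{v}$ stands for the conjunction of the componentwise equations $u_i = v_i$, $n$ is an arbitrary positive natural number, and the proper subterms of $\varepsilon x B(x, \vec{a})$ are only the variables $\vec{a}$. Pure epsilon calculus is an extension of elementary calculus by the $\varepsilon$-operator and the critical formula. The $\varepsilon$-operator is expressive enough to encode the existential and universal quantifiers, so that they are definable as $\exists x A(x) := A(\varepsilon x A(x))$ and $\forall x A(x) := A(\varepsilon x \neg A(x))$ within the epsilon calculus.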

The epsilon calculus was originally developed in the context of Hilbert’s program. Early work in proof theory (before Gentzen) concentrated on the epsilon calculus, the $\varepsilon$-elimination method, and the $\varepsilon$-substitution method, and those results were obtained by Bernays [HB39] (see also [Zac03, Zac04, MZ06]), Ackermann [Ack25, Ack40] (see also [Mos06]), and von Neumann [vN27]. The first correct proof of Herbrand’s theorem was given by means of the epsilon calculus [Bus94]. The theorem is commonly stated in the following form, which is less general than the original: if there is a proof of a prenex existential formula $\exists \vec{x} E(\vec{x})$ for quantifier-free $E(\vec{x})$ in predicate calculus, then there is a proof of $\bigvee_{i=0}^{n} E(\vec{t}_i)$ for some terms $\vec{t}_0, \ldots, \vec{t}_n$ in elementary calculus. The epsilon calculus is of independent and lasting interest, however, and a study from a computational and proof-theoretic point of view is particularly worthwhile.

In the course of proving the epsilon theorems and Herbrand’s theorem, the $\varepsilon$-elimination method is used to transform a proof in epsilon calculus into a proof that is free of the initial formulas mentioned above. Assume there is a proof of $E(\vec{s})$ in pure epsilon calculus, where $\vec{s}$ is a finite sequence of terms possibly with occurrences of $\varepsilon$-terms. Then the $\varepsilon$-elimination method generates another proof of a disjunction $\bigvee_{i=0}^{n} E(\vec{t}_i)$ in elementary calculus, where the $\vec{t}_i$ are terms without the $\varepsilon$-operator. This disjunction is a so-called Herbrand disjunction for the formula $E(\vec{s})$, and the aim of this paper is to analyse the Herbrand complexity, which is the length of the shortest Herbrand disjunction for the original formula. This paper extends the Herbrand complexity analysis by Moser and Zach [MZ06]. Their result tells us that the Herbrand complexity of a formula is based on a proof measure that speaks only about the first-order counterpart of a proof of that formula. While they have dealt with systems of epsilon calculus without the $\varepsilon$-equality formula, we target epsilon calculus with the $\varepsilon$-equality formula and study upper and lower bounds on the Herbrand complexity for that system. Our contribution is divided into two parts. The first is a complexity analysis for Herbrand’s theorem in first-order logic with equality. In this case, we can avoid relying on epsilon calculus with the $\varepsilon$-equality formula, hence the result by Moser and Zach is directly applicable. The second is an analysis of upper and lower bounds for the extended first epsilon theorem with the $\varepsilon$-equality formula, where the upper bound analysis depends on a measure concerning the structure of critical formulas as well as the measure for first-order ingredients of a proof.

Hilbert’s epsilon calculus is primarily a classical formalism, and we will restrict our attention to classical first-order logic. For non-classical approaches to epsilon calculus, see the work of Bell [Bel93a, Bel93b], DeVidi [DeV95], Fitting [Fit75], Mostowski [Mos63], and Shirai [Shi71]. Our study is also motivated by the recent renewed interest in the epsilon calculus and the $\varepsilon$-substitution method in, e.g., the work of Arai [Ara03, Ara05], Avigad [Avi02], Baaz et al. [BLL18], and Mints et al. [MT99, Min03]. The epsilon calculus also allows the incorporation of choice construction into logic [BG00]. The treatment of eigenvariables in the context of unsound proofs and its relation to the epsilon calculus is studied by Aguilera and Baaz [AB16]. On the semantics of epsilon calculus, see the work of Zach [Zac17].

The rest of this paper is organized as follows. Section 2 describes the syntax of epsilon calculus without the $\varepsilon$-equality formula. Section 3 shows the embedding lemma, which states that predicate calculus can be embedded into pure epsilon calculus without the $\varepsilon$-equality formula; a complexity analysis of Herbrand’s theorem for a prenex existential formula comes as a simple consequence of the lemma. In Section 4, the system is extended by the $\varepsilon$-equality formula, which makes the identity schema available within the system. Section 5 clarifies the subtlety of the complexity analysis of a system with equality through Yukami’s trick [Yuk84]. In Section 6 we review the first and second epsilon theorems following the proof by Bernays. Sections 7 and 8 are devoted to analysing the upper and the lower bounds, respectively, where Section 7 describes our complexity analysis for the extended first epsilon theorem and the upper bound of the Herbrand complexity. Section 9 concludes this paper.

## 2 Epsilon Calculus

We start by defining the terms and formulas of our logic. As a convention we assume $x, y$ range over a set of bound variables, $a, b$ over a set of free variables, $f, g$ over a set of function symbols, and $P$ over a set of predicate symbols. The symbol $=$ is reserved for the equality predicate. Each function symbol and predicate symbol has an arity, and these sets of symbols are pairwise disjoint. We abbreviate $t_0, \ldots, t_{n-1}$ as $\vec{t}$ and let $|\vec{t}|$ denote its length $n$. We define terms, formulas, and free variable occurrences. Notice the difference between free variables and free variable occurrences.

###### Definition 2.1 (Term and formula).

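Raw terms $t$ and raw formulas $A$, $B$ are simultaneously defined as follows.

$$t ::= x \mid a \mid f\vec{t} \mid \varepsilon x A \qquad\qquad A, B ::= P\vec{t} \mid t = t' \mid \neg A \mid A \to B \mid A \land B \mid A \lor B \mid \exists x A \mid \forall x A$$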

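Sets of free variable occurrences $\mathrm{FV}(t)$ and $\mathrm{FV}(A)$ are simultaneously defined, assuming $z$ ranges over variables, $\circ \in \{\to, \land, \lor\}$, and $Q \in \{\exists, \forall\}$.

$$\mathrm{FV}(z) := \{z\}, \qquad \mathrm{FV}(f\vec{t}) := \mathrm{FV}(P\vec{t}) := \bigcup_{i < |\vec{t}|} \mathrm{FV}(t_i), \qquad \mathrm{FV}(\neg A) := \mathrm{FV}(A),$$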

A raw term is a semiterm if , and is a term if . A raw formula is a semiformula if , and is a formula if . A (semi)formula and a (semi)term are quantifier free in case neither $\exists$ nor $\forall$ occurs in them.

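We abbreviate $\exists x_0 \cdots \exists x_{n-1} A$ as $\exists x_0 \ldots x_{n-1} A$ and also as $\exists \vec{x} A$, and the same convention applies to $\forall$. Terms of the form $\varepsilon x A$ are called $\varepsilon$-terms.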

###### Definition 2.2 (Substitution).

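Assume $\circ \in \{\to, \land, \lor\}$ and $Q \in \{\exists, \forall\}$. For (semi)terms $s, t$, (semi)formulas $A, B$, and variables $w, z$, the substitution $\{z/s\}$ is defined as follows.

$$\begin{aligned}
w\{z/s\} &:= s \ \text{ if } w \equiv z, \qquad w\{z/s\} := w \ \text{ if } w \not\equiv z, \\
(f\vec{t})\{z/s\} &:= f(\vec{t}\{z/s\}), \qquad (P\vec{t})\{z/s\} := P(\vec{t}\{z/s\}), \\
(\neg A)\{z/s\} &:= \neg(A\{z/s\}), \qquad (A \circ B)\{z/s\} := A\{z/s\} \circ B\{z/s\}, \\
(QxA)\{z/s\} &:= QxA \ \text{ and } \ (\varepsilon x A)\{z/s\} := \varepsilon x A \quad \text{if } x \equiv z, \\
(QxA)\{z/s\} &:= Qx' A\{x/x'\}\{z/s\} \ \text{ and } \ (\varepsilon x A)\{z/s\} := \varepsilon x'(A\{x/x'\}\{z/s\}) \quad \text{o.w.,}
\end{aligned}$$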

where $\vec{t}\{z/s\}$ applies the substitution componentwise and $x'$ is fresh.
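The clauses of Definition 2.2 can be sketched as a small recursive function. The tuple encoding of terms and formulas, the helper `fresh`, and the function name `subst` are assumptions made for this illustration and do not come from the paper; equality is treated as an ordinary binary predicate named `"="`. As in the definition, a binder is always renamed to a fresh variable unless it binds the substituted variable itself.

```python
import itertools

# Hypothetical encoding (an assumption of this sketch, not the paper's notation):
#   ("var", x)             variable, bound or free
#   ("fn", f, args)        function application f(args), args a tuple
#   ("pred", P, args)      atomic formula P(args); equality can use P = "="
#   ("not", A)             negation
#   ("bin", op, A, B)      op in {"->", "and", "or"}
#   ("q", Q, x, A)         Q in {"exists", "forall"}
#   ("eps", x, A)          epsilon term

_counter = itertools.count()

def fresh(x):
    """Return a new variable name; assumes input names never contain '#'."""
    return f"{x}#{next(_counter)}"

def subst(e, z, s):
    """Compute e{z/s}: replace the free occurrences of variable z in e by s."""
    tag = e[0]
    if tag == "var":
        return s if e[1] == z else e
    if tag in ("fn", "pred"):
        return (tag, e[1], tuple(subst(t, z, s) for t in e[2]))
    if tag == "not":
        return ("not", subst(e[1], z, s))
    if tag == "bin":
        return ("bin", e[1], subst(e[2], z, s), subst(e[3], z, s))
    if tag in ("q", "eps"):
        *head, x, body = e          # head is ["q", Q] or ["eps"]
        if x == z:                  # z is bound here, nothing to replace
            return e
        x2 = fresh(x)               # rename the binder, then substitute
        renamed = subst(body, x, ("var", x2))
        return (*head, x2, subst(renamed, z, s))
    raise ValueError(f"unexpected node {tag!r}")

# Substituting x for a under a binder of x must not capture the new x.
example = ("eps", "x", ("pred", "P", (("var", "x"), ("var", "a"))))
print(subst(example, "a", ("var", "x")))
```

The unconditional renaming mirrors the last clause of the definition; an implementation that renames only when capture would actually occur is a common optimisation but is not what the definition states.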

We can write $A(a)$ for a formula $A$ with a free variable $a$, and then $A\{a/t\}$ is abbreviated as $A(t)$. This notation is extended through the vector notation and the simultaneous substitution. We employ the same for terms.

###### Definition 2.3 (α-equivalence).

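We define the $\alpha$-equivalence $\equiv_\alpha$ for (semi)terms and (semi)formulas as follows.

$$\begin{aligned}
& x \equiv_\alpha x, \qquad a \equiv_\alpha a, \qquad f\vec{s} \equiv_\alpha f\vec{t} := \bigwedge_{i < |\vec{s}|} s_i \equiv_\alpha t_i, \\
& P\vec{s} \equiv_\alpha P\vec{t} := \bigwedge_{i < |\vec{s}|} s_i \equiv_\alpha t_i, \qquad \neg A \equiv_\alpha \neg B := A \equiv_\alpha B, \\
& A \circ B \equiv_\alpha A' \circ B' := A \equiv_\alpha A' \text{ and } B \equiv_\alpha B' \text{ for } \circ \in \{\to, \land, \lor\}, \\
& QxA(x) \equiv_\alpha QyB(y) := \varepsilon x A(x) \equiv_\alpha \varepsilon y B(y) := A(z) \equiv_\alpha B(z) \text{ for a fresh } z.
\end{aligned}$$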

We also define the term substitution for (semi)terms through the $\alpha$-equivalence instead of the equality on variables, and the simultaneous substitution.
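The binder clause of Definition 2.3 compares two bodies after instantiating both bound variables with the same fresh variable. A minimal sketch of that check follows; the tuple encoding and the names `rename` and `alpha_eq` are hypothetical choices for this illustration, not notation from the paper.

```python
import itertools

# Hypothetical tuple encoding (an assumption of this sketch):
#   ("var", x), ("fn", f, args), ("pred", P, args), ("not", A),
#   ("bin", op, A, B), ("q", Q, x, A), ("eps", x, A)

_counter = itertools.count()

def rename(e, x, z):
    """Replace the free occurrences of variable x in e by the fresh variable z."""
    tag = e[0]
    if tag == "var":
        return ("var", z) if e[1] == x else e
    if tag in ("fn", "pred"):
        return (tag, e[1], tuple(rename(t, x, z) for t in e[2]))
    if tag == "not":
        return ("not", rename(e[1], x, z))
    if tag == "bin":
        return ("bin", e[1], rename(e[2], x, z), rename(e[3], x, z))
    *head, y, body = e              # binder node: ("q", Q, y, A) or ("eps", y, A)
    if y == x:                      # x is shadowed below this binder
        return e
    return (*head, y, rename(body, x, z))   # z is fresh, so no capture

def alpha_eq(e1, e2):
    """Decide alpha-equivalence of two (semi)terms or (semi)formulas."""
    if e1[0] != e2[0]:
        return False
    tag = e1[0]
    if tag == "var":
        return e1[1] == e2[1]
    if tag in ("fn", "pred"):
        return (e1[1] == e2[1] and len(e1[2]) == len(e2[2])
                and all(alpha_eq(s, t) for s, t in zip(e1[2], e2[2])))
    if tag == "not":
        return alpha_eq(e1[1], e2[1])
    if tag == "bin":
        return (e1[1] == e2[1] and alpha_eq(e1[2], e2[2])
                and alpha_eq(e1[3], e2[3]))
    *h1, x, a = e1
    *h2, y, b = e2
    if h1 != h2:                    # different quantifier kinds
        return False
    z = f"z#{next(_counter)}"       # fresh; assumes input names contain no '#'
    return alpha_eq(rename(a, x, z), rename(b, y, z))

def P(s, t):
    return ("pred", "P", (("var", s), ("var", t)))

print(alpha_eq(("eps", "x", P("x", "a")), ("eps", "y", P("y", "a"))))
```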

###### Definition 2.4 (Set induced by vector).

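For any vector $\vec{t}$, the set induced by $\vec{t}$ is the set of its components. We say a list of vectors is a split of $\vec{t}$ if their induced sets are pairwise disjoint and together cover the set induced by $\vec{t}$.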

###### Definition 2.5 (Equality).

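The following formulas are referred to as the equality axioms.

$$t = t, \qquad s = t \to t = s, \qquad s = t \to t = u \to s = u, \qquad \vec{s} = \vec{t} \to P\vec{s} \to P\vec{t}, \qquad \vec{s} = \vec{t} \to f\vec{s} = f\vec{t}.$$
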
###### Definition 2.6 (Elementary calculus and predicate calculus).

The system of elementary calculus is denoted by $\mathrm{EC}$, where its initial formulas are propositional tautologies and its inference rule is modus ponens, given as follows.

$$\frac{\Gamma \vdash A \qquad \Gamma \vdash A \to B}{\Gamma \vdash B}$$

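The system of first-order predicate calculus is denoted by $\mathrm{PC}$, where the initial formulas are propositional tautologies and the following formulas $(\forall^{-})$ and $(\exists^{+})$.

$$\forall x A(x) \to A(t) \quad (\forall^{-}) \qquad\qquad A(t) \to \exists x A(x) \quad (\exists^{+})$$

The inference rules of $\mathrm{PC}$ are modus ponens and the following $(\forall^{+})$ and $(\exists^{-})$, where the eigenvariable $a$ may not occur in any formula in the axioms $\Gamma$.

$$\frac{\Gamma \vdash A \to B(a)}{\Gamma \vdash A \to \forall x B(x)}\ (\forall^{+}) \qquad\qquad \frac{\Gamma \vdash A(a) \to B}{\Gamma \vdash \exists x A(x) \to B}\ (\exists^{-})$$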

$\mathrm{EC}$ and $\mathrm{PC}$ extended by the equality axioms as initial formulas are called $\mathrm{EC}^{=}$ and $\mathrm{PC}^{=}$, respectively. We alternatively say and for them.

###### Definition 2.7 (Epsilon calculus).

Let a formula of the form

 A(t)→A(εxA(x)),

where $t$ is an arbitrary term and $A(a)$ is a formula containing $a$, be a critical formula, and we define the systems $\mathrm{EC}_\varepsilon$ and $\mathrm{PC}_\varepsilon$ by extending $\mathrm{EC}$ and $\mathrm{PC}$ by taking such critical formulas as initial formulas. We say $\varepsilon x A(x)$ is the critical $\varepsilon$-term of the critical formula and the critical formula belongs to $\varepsilon x A(x)$.

###### Definition 2.8 (Proof).

Let $S$ be a system which consists of initial formulas and inference rules, and assume a set $\Gamma$ of formulas which we call axioms. A list of formulas is a proof in $S$ from $\Gamma$ if each formula is an initial formula of $S$, a formula in $\Gamma$, or a consequence of an inference rule of $S$ referring to preceding formulas in the proof. We write $\Gamma \vdash_S A$ if and only if the formula $A$ is the last formula of a proof in system $S$ from $\Gamma$. We omit $\Gamma$ if it is empty and $S$ if there is no confusion. An inference rule consists of one consequence and assumptions, and may be displayed using a horizontal line.

###### Definition 2.9 (Languages).

Let the language be formulas and terms in Definition 2.1 and the language be without the universal and existential quantifiers. We denote by and the sublanguages of and without the -operator, respectively. Also, and are the sublanguages of and without the equality symbol, respectively. Finally, and are the sublanguages of and without the equality symbol, respectively.

We give two examples of $\mathrm{EC}_\varepsilon$-proofs. The formulas in these examples are meant to be $\varepsilon$-calculus versions of the independence of premise and the drinker’s formula. See also Example 3.2 and Example 3.3.

###### Example 2.10.

Consider the following formula in .

 (A→B(εxB(x)))→A→B(εx(A→B(x))). (1)

This formula (1) is an instance of the critical formula, hence a proof of (1) is given as follows.

$$(A \to B(\varepsilon x B(x))) \to A \to B(\varepsilon x (A \to B(x))) \qquad \text{critical formula}$$

###### Example 2.11.

Consider the following formula in .

 A(εx(A(x)→A(εyA(y))))→A(εyA(y)). (2)

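An $\mathrm{EC}_\varepsilon$-proof of this formula (2) is given as follows.

$$\begin{array}{ll}
(A(\varepsilon y A(y)) \to A(\varepsilon y A(y))) \to A(\varepsilon x (A(x) \to A(\varepsilon y A(y)))) \to A(\varepsilon y A(y)) & \text{critical formula} \\
A(\varepsilon y A(y)) \to A(\varepsilon y A(y)) & \text{propositional tautology} \\
A(\varepsilon x (A(x) \to A(\varepsilon y A(y)))) \to A(\varepsilon y A(y)) & \text{modus ponens}
\end{array}$$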
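The three-line proof of Example 2.11 can be replayed by a small checker in the spirit of Definition 2.8. This is only a sketch under stated assumptions: the atoms `p` and `q` are hypothetical abbreviations for A(εyA(y)) and A(εx(A(x)→A(εyA(y)))), and critical formulas are handed to the checker as a given set rather than recognised syntactically. The function names are illustrative.

```python
from itertools import product

# Formulas: an atom is a string; compound formulas are tuples
# ("not", A), ("->", A, B), ("and", A, B), ("or", A, B).

def atoms(f):
    """Collect the atoms occurring in a formula."""
    if isinstance(f, str):
        return {f}
    return set().union(*(atoms(g) for g in f[1:]))

def evaluate(f, v):
    """Evaluate a formula under the truth assignment v (dict atom -> bool)."""
    if isinstance(f, str):
        return v[f]
    if f[0] == "not":
        return not evaluate(f[1], v)
    a, b = evaluate(f[1], v), evaluate(f[2], v)
    if f[0] == "->":
        return (not a) or b
    if f[0] == "and":
        return a and b
    if f[0] == "or":
        return a or b
    raise ValueError(f"unknown connective {f[0]!r}")

def is_tautology(f):
    """Truth-table check over all assignments to the atoms of f."""
    names = sorted(atoms(f))
    return all(evaluate(f, dict(zip(names, vals)))
               for vals in product([False, True], repeat=len(names)))

def check_proof(lines, critical_formulas):
    """Each line must be a supplied critical formula, a propositional
    tautology, or follow from two earlier lines by modus ponens."""
    for i, f in enumerate(lines):
        earlier = lines[:i]
        by_mp = any(("->", a, f) in earlier for a in earlier)
        if not (f in critical_formulas or is_tautology(f) or by_mp):
            return False
    return True

# p abbreviates A(eps y A(y)); q abbreviates A(eps x (A(x) -> A(eps y A(y)))).
critical = ("->", ("->", "p", "p"), ("->", "q", "p"))
tautology = ("->", "p", "p")
goal = ("->", "q", "p")
print(check_proof([critical, tautology, goal], {critical}))
```

With opaque atoms the first line is not itself a tautology, which is why it has to be justified as a critical formula; dropping it from the supplied set makes the check fail.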

We conclude this section with the following basic results.

###### Theorem 2.12 (Deduction theorem).

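Assume $A$ is a closed formula. $\Gamma, A \vdash B$ iff $\Gamma \vdash A \to B$ in $\mathrm{EC}_\varepsilon$ and in $\mathrm{PC}_\varepsilon$.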

###### Lemma 2.13 (Identity schema).

For any formula and terms in , holds for .

###### Proof.

By induction on the size of . ∎

Note that the above identity schema is not available in and , if the language is extended to and , respectively. In Section 4, we deal with epsilon calculus with the -equality formula, within which the identity schema is recovered for and .

## 3 Embedding Lemma

Hilbert introduced the epsilon operator to encode quantifiers, so that predicate calculus reduces to elementary calculus extended with the critical formula. This section describes this encoding of $\mathrm{PC}$ within $\mathrm{EC}_\varepsilon$. The idea is to define the quantifiers by the $\varepsilon$-operator as follows, and to apply the definitions recursively.

$$\exists x A(x) := A(\varepsilon x A(x)), \qquad \forall x A(x) := A(\varepsilon x \neg A(x)).$$

###### Definition 3.1 (ε-translation).

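For a (semi)term $t$ and a (semi)formula $A$ we define their $\varepsilon$-translations $t^\varepsilon$ and $A^\varepsilon$. Let $\vec{t}^\varepsilon$ stand for $t_0^\varepsilon, \ldots, t_{n-1}^\varepsilon$.

$$\begin{aligned}
& x^\varepsilon := x, \qquad a^\varepsilon := a, \qquad (f\vec{t})^\varepsilon := f\vec{t}^\varepsilon, \qquad (\varepsilon x A)^\varepsilon := \varepsilon x A^\varepsilon, \qquad (P\vec{t})^\varepsilon := P\vec{t}^\varepsilon, \\
& (A \to B)^\varepsilon := A^\varepsilon \to B^\varepsilon, \qquad (A \land B)^\varepsilon := A^\varepsilon \land B^\varepsilon, \qquad (A \lor B)^\varepsilon := A^\varepsilon \lor B^\varepsilon, \\
& (\neg A)^\varepsilon := \neg A^\varepsilon, \qquad (\exists x A(x))^\varepsilon := A^\varepsilon(\varepsilon x A^\varepsilon(x)), \qquad (\forall x A(x))^\varepsilon := A^\varepsilon(\varepsilon x \neg A^\varepsilon(x)).
\end{aligned}$$
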
###### Example 3.2.

Here is the formula of independence of premise in ,

 (A→∃xB(x))→∃x.A→B(x),

whose $\varepsilon$-translation is the formula (1) in Example 2.10.

###### Example 3.3.

Here is the drinker’s formula in ,

 ∃xA(x)→∀yA(y),

whose $\varepsilon$-translation is the formula (2) in Example 2.11.
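The ε-translation of Definition 3.1 is a straightforward recursion, and Example 3.2 can be used to exercise it. The sketch below assumes a hypothetical tuple encoding and, to keep the substitution naive, that all bound variables of the input carry distinct names that differ from its free variables; neither the encoding nor the function names come from the paper.

```python
# Hypothetical tuple encoding (an assumption of this sketch):
#   ("var", x), ("fn", f, args), ("pred", P, args), ("not", A),
#   ("bin", op, A, B), ("q", Q, x, A) with Q in {"exists", "forall"},
#   ("eps", x, A)

def subst(e, x, s):
    """Naive substitution e{x/s} on quantifier-free input. It is only safe
    under the assumption that bound variable names are pairwise distinct."""
    tag = e[0]
    if tag == "var":
        return s if e[1] == x else e
    if tag in ("fn", "pred"):
        return (tag, e[1], tuple(subst(t, x, s) for t in e[2]))
    if tag == "not":
        return ("not", subst(e[1], x, s))
    if tag == "bin":
        return ("bin", e[1], subst(e[2], x, s), subst(e[3], x, s))
    if tag == "eps":
        return e if e[1] == x else ("eps", e[1], subst(e[2], x, s))
    raise ValueError(f"unexpected node {tag!r}")

def eps_translate(e):
    """Replace every quantifier by the corresponding epsilon term."""
    tag = e[0]
    if tag == "var":
        return e
    if tag in ("fn", "pred"):
        return (tag, e[1], tuple(eps_translate(t) for t in e[2]))
    if tag == "not":
        return ("not", eps_translate(e[1]))
    if tag == "bin":
        return ("bin", e[1], eps_translate(e[2]), eps_translate(e[3]))
    if tag == "eps":
        return ("eps", e[1], eps_translate(e[2]))
    if tag == "q":
        _, quant, x, body = e
        b = eps_translate(body)
        # exists uses eps x. A, forall uses eps x. not A
        witness = ("eps", x, b if quant == "exists" else ("not", b))
        return subst(b, x, witness)
    raise ValueError(f"unexpected node {tag!r}")

# Independence of premise from Example 3.2: (A -> exists x B(x)) -> exists x (A -> B(x))
A = ("pred", "A", ())
Bx = ("pred", "B", (("var", "x"),))
ip = ("bin", "->",
      ("bin", "->", A, ("q", "exists", "x", Bx)),
      ("q", "exists", "x", ("bin", "->", A, Bx)))
print(eps_translate(ip))
```

The translated formula has the shape of formula (1): the left premise mentions εxB(x) and the right conclusion mentions εx(A→B(x)).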

###### Remark 3.4.

The above two examples also show that the $\varepsilon$-translation of a formula which is not provable in intuitionistic logic can be provable in $\mathrm{EC}_\varepsilon$ without using any classical propositional tautology.

###### Definition 3.5 (Regular proof).

A proof is regular if each eigenvariable in the proof is used by at most one application of $(\forall^{+})$ or $(\exists^{-})$.

###### Definition 3.6 (Proof size).

The size of a proof is the length of the list.

If there is a proof, a regular one is always available, and its size is polynomially bounded in the size of the original non-regular proof. This fact follows from the following theorem due to Krajíček [Kra94].

###### Theorem 3.7.

Let and be the size of the smallest sequence-proof and tree-proof of a provable first-order formula in the Hilbert style calculus, respectively. Then there exists a polynomial such that for every provable first-order formula .

In the rest of this paper we implicitly assume the regularity of proofs.

###### Definition 3.8 (Critical count).

For a proof , we let be the number of critical formulas, , and in .

###### Lemma 3.9 (Embedding).

Assume for a formula , then for some with .

###### Proof.

We refer to the formula at line in the proof by . By induction on the length . If it is trivial. We prove the case , making case analysis how the formula at the line is derived. In case it comes by modus ponens using and , which is of the form , for , by the induction hypotheses there are proofs and concluding and , respectively, hence by modus ponens. In case is derived by , is of the form for . As , it suffices to substitute for throughout the proof of which is due to the induction hypothesis. Here we assumed the regularity of the proof. In case is derived by , is of the form and for . As , it suffices to use modus ponens with a critical formula and , which comes by induction hypothesis. In case is by , is of the form and hence we prove , whose contrapositive is a critical formula. In case is by , is of the form and hence we prove that is immediate as it is a critical formula. The rest is the axioms. The rest is the cases for propositional tautologies and , which are all trivial. ∎

###### Theorem 3.10 (Herbrand’s theorem).

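Assume $\exists \vec{x} E(\vec{x})$ is a prenex existential formula in the language of $\mathrm{PC}^{=}$, namely, $E(\vec{x})$ is quantifier free, and

$$\mathrm{PC}^{=} \vdash \exists \vec{x} E(\vec{x}).$$

Then there are $\varepsilon$-free terms $\vec{t}_i$ for $0 \le i \le n$ such that

$$\mathrm{EC}^{=} \vdash \bigvee_{i=0}^{n} E(\vec{t}_i).$$
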
###### Proof.

Assume is the -proof of . By means of Lemma 3.9, there is an -proof of , which is namely for some -terms , then the conclusion follows from extended first epsilon theorem for (cf. Theorem 16 in [MZ06]) with being propositional tautologies. ∎

## 4 Epsilon Calculus with the ε-Equality Formula

Epsilon calculus with equality was originally introduced by Hilbert [HB39]. Assuming $\varepsilon x A(x, a)$ is an $\varepsilon$-matrix, he formulated the $\varepsilon$-equality formula as follows.

 u=v→εxA(x,u)=εxA(x,v) (3)

In this section we adopt a variant of the $\varepsilon$-equality formula, given as follows in vector notation.

 →u=→v→εxA(x;→u)=εxA(x;→v) (4)

Then we define our system of epsilon calculus with equality by adding (4) as an initial formula. The $\varepsilon$-elimination method and the proofs of the epsilon theorems for this system can be simpler than the ones for the original system by Hilbert. While the notion of closures is crucial in Hilbert and Bernays’ work, we do not need this notion in our system. Moreover, concerning the hyperexponential part of the upper bound analysis of the Herbrand complexity, our result for this system is better than the one for the system with (3), as will be shown in Section 7.3.

###### Definition 4.1 (ε-matrix and semicolon notation).

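An $\varepsilon$-term $e$ is an $\varepsilon$-matrix iff each proper subterm of $e$ is a free variable and each free variable in $e$ occurs exactly once. The $\varepsilon$-matrix and its immediate subsemiformula can be denoted as $\varepsilon x A(x; \vec{a})$ and as $A(x; \vec{a})$, respectively, if and only if $\varepsilon x A(x; \vec{a})$ is an $\varepsilon$-matrix with its free variables $\vec{a}$. We call the free variables of an $\varepsilon$-matrix its parameters.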

Conventionally, we let range over -matrices, possibly with its parameters explicitly denoted as . For any -term, its -matrix is uniquely determined modulo free variable names. If is a critical -term, the -matrix of is called a critical -matrix.

###### Definition 4.2 (Arity of ε-matrix).

For an -matrix , we define its arity to be . Let be the maximal arity of critical -matrices of rank in , and be .

###### Lemma 4.3.

If is an -term, then for some -matrix and .

###### Proof.

Let be all the immediate subterms of and be fresh variables, so that . Then, is the -matrix . ∎
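The construction in this proof, abstracting subterms by fresh variables, can be sketched as follows. The sketch reads "immediate subterms" as the maximal proper subterms in which no bound variable of the surrounding ε-term occurs free; that reading, the tuple encoding, the parameter names `p0, p1, ...`, and the function names are all assumptions of this illustration. Each occurrence receives its own parameter, so every parameter occurs exactly once, as Definition 4.1 requires.

```python
import itertools

# Hypothetical encoding of quantifier-free (semi)terms and (semi)formulas:
#   ("var", x), ("fn", f, args), ("pred", P, args), ("not", A),
#   ("bin", op, A, B), ("eps", x, A)

def free_vars(e, bound=frozenset()):
    """Names of variables occurring free in e (outside the given binders)."""
    tag = e[0]
    if tag == "var":
        return set() if e[1] in bound else {e[1]}
    if tag in ("fn", "pred"):
        return set().union(*(free_vars(t, bound) for t in e[2]))
    if tag == "not":
        return free_vars(e[1], bound)
    if tag == "bin":
        return free_vars(e[2], bound) | free_vars(e[3], bound)
    if tag == "eps":
        return free_vars(e[2], bound | {e[1]})
    raise ValueError(f"unexpected node {tag!r}")

def eps_matrix(e):
    """Split an epsilon term e into (matrix, params, args) such that
    substituting args for params in matrix gives back e."""
    counter = itertools.count()
    params, args = [], []

    def walk(node, scope):
        tag = node[0]
        if tag in ("var", "fn", "eps") and not (free_vars(node) & scope):
            # maximal subterm without bound-variable occurrences: abstract it
            p = f"p{next(counter)}"     # assumed not to clash with input names
            params.append(p)
            args.append(node)
            return ("var", p)
        if tag == "var":
            return node                 # occurrence of a bound variable
        if tag in ("fn", "pred"):
            return (tag, node[1], tuple(walk(t, scope) for t in node[2]))
        if tag == "not":
            return ("not", walk(node[1], scope))
        if tag == "bin":
            return ("bin", node[1], walk(node[2], scope), walk(node[3], scope))
        if tag == "eps":
            return ("eps", node[1], walk(node[2], scope | {node[1]}))
        raise ValueError(f"unexpected node {tag!r}")

    x, body = e[1], e[2]
    return ("eps", x, walk(body, {x})), params, args

# eps x. P(x, f(a), eps y. Q(y))
inner = ("eps", "y", ("pred", "Q", (("var", "y"),)))
fa = ("fn", "f", (("var", "a"),))
term = ("eps", "x", ("pred", "P", (("var", "x"), fa, inner)))
matrix, params, args = eps_matrix(term)
print(matrix, params, args)
```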

The epsilon calculus with equality by Hilbert and Bernays also employs the $\varepsilon$-equality formula as an initial formula.

###### Definition 4.4 (Epsilon calculus with the ε-equality formula).

Let and be and extended with the following additional initial formula, respectively,

 →u=→v→εxA(x;→u)=εxA(x;→v),

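where $\vec{u}$ and $\vec{v}$ are term vectors of the same length as the parameters of the $\varepsilon$-matrix $\varepsilon x A(x; \vec{a})$. A formula of this form is an $\varepsilon$-equality formula, where $\varepsilon x A(x; \vec{u})$ and $\varepsilon x A(x; \vec{v})$ are called the critical $\varepsilon$-terms of the $\varepsilon$-equality formula. We also say that the $\varepsilon$-equality formula belongs to $\varepsilon x A(x; \vec{u})$ and to $\varepsilon x A(x; \vec{v})$.

According to the semicolon notation, the $\varepsilon$-equality formula always belongs to critical $\varepsilon$-terms which were formed by applying substitutions to an $\varepsilon$-matrix $\varepsilon x A(x; \vec{a})$. The next section details this constraint from the perspective of complexity analysis. Due to the $\varepsilon$-equality formula, the identity schema is available in the extended systems.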

###### Lemma 4.5 (Identity schema).

Let a formula and terms be in , then . The same holds for in .

###### Proof.

By induction. ∎

We further define means of measuring the complexity of terms and proofs, which are used in the next sections to study procedures for eliminating critical $\varepsilon$-terms. The rank counts the depth of nesting of $\varepsilon$-semiterms, while the degree counts the depth of nesting of $\varepsilon$-terms. Here we suppose that $\max \emptyset = 0$.

###### Definition 4.6 (Rank).

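We define the rank $\mathrm{rk}(t)$ for a (semi)term $t$.

$$\mathrm{rk}(a) := \mathrm{rk}(x) := 0, \qquad \mathrm{rk}(f\vec{t}) := \max\{\mathrm{rk}(t_i) \mid i < |\vec{t}|\}, \qquad \mathrm{rk}(\varepsilon x A(x)) := \max\{\mathrm{rk}(t) \mid t \text{ subordinates } \varepsilon x A(x)\} + 1,$$

where $t$ subordinates $\varepsilon x A(x)$ iff $x \in \mathrm{FV}(t)$ and $t$ is a subsemiterm of $A(x)$. We define $\mathrm{rk}(\pi)$ to be $\max\{\mathrm{rk}(e_1), \ldots, \mathrm{rk}(e_k)\}$, where $e_1, \ldots, e_k$ are the critical $\varepsilon$-terms in $\pi$.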

The rank is stable against substitutions.

###### Lemma 4.7.

For any terms and , .

###### Proof.

Comparing with , nothing new is subordinating in due to the substitution of for , hence it is obvious from Definition 4.6. ∎

###### Lemma 4.8.

For an -matrix , .

###### Proof.

By induction on the construction of . ∎

###### Definition 4.9 (Degree).

For a (semi)term $t$, we define its degree $\deg(t)$.

$$\deg(a) := \deg(x) := 0, \qquad \deg(f\vec{t}) := \max\{\deg(t_i) \mid i < |\vec{t}|\}, \qquad \deg(\varepsilon x A(x)) := \max\{\deg(t) \mid t \text{ is a subterm of } A(x)\} + 1.$$

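The contrast between rank and degree is easiest to see on two small ε-terms. The sketch below is an illustration under explicit assumptions: a hypothetical tuple encoding of quantifier-free expressions, "t subordinates εxA(x)" read as "t is a subsemiterm of A(x) in which x occurs free", "subterm of A(x)" read as an occurrence in which no enclosing bound variable occurs free, and max ∅ = 0. The function names are not from the paper.

```python
# Hypothetical encoding: ("var", x), ("fn", f, args), ("pred", P, args),
# ("not", A), ("bin", op, A, B), ("eps", x, A); quantifier-free input only.

def free_vars(e, bound=frozenset()):
    """Names of variables occurring free in e."""
    tag = e[0]
    if tag == "var":
        return set() if e[1] in bound else {e[1]}
    if tag in ("fn", "pred"):
        return set().union(*(free_vars(t, bound) for t in e[2]))
    if tag == "not":
        return free_vars(e[1], bound)
    if tag == "bin":
        return free_vars(e[2], bound) | free_vars(e[3], bound)
    return free_vars(e[2], bound | {e[1]})          # "eps"

def sub_semiterms(e, scope=frozenset()):
    """Yield (t, scope) for every (semi)term occurrence t inside e, where
    scope is the set of variables bound between e and t."""
    tag = e[0]
    if tag in ("var", "fn", "eps"):
        yield e, scope
    if tag in ("fn", "pred"):
        for t in e[2]:
            yield from sub_semiterms(t, scope)
    elif tag == "not":
        yield from sub_semiterms(e[1], scope)
    elif tag == "bin":
        yield from sub_semiterms(e[2], scope)
        yield from sub_semiterms(e[3], scope)
    elif tag == "eps":
        yield from sub_semiterms(e[2], scope | {e[1]})

def rank(t):
    """Nesting depth through subordinate semiterms (those mentioning x)."""
    if t[0] == "var":
        return 0
    if t[0] == "fn":
        return max((rank(s) for s in t[2]), default=0)
    x, body = t[1], t[2]
    subs = [s for s, scope in sub_semiterms(body)
            if x in free_vars(s) and x not in scope]
    return max((rank(s) for s in subs), default=0) + 1

def degree(t):
    """Nesting depth through proper subterms (no bound variable free)."""
    if t[0] == "var":
        return 0
    if t[0] == "fn":
        return max((degree(s) for s in t[2]), default=0)
    x, body = t[1], t[2]
    subs = [s for s, scope in sub_semiterms(body, frozenset({x}))
            if not (free_vars(s) & scope)]
    return max((degree(s) for s in subs), default=0) + 1

def A(t):
    return ("pred", "A", (t,))

eyA = ("eps", "y", A(("var", "y")))
# eps x (A(x) -> A(eps y A(y))): the inner term does not depend on x
e1 = ("eps", "x", ("bin", "->", A(("var", "x")), A(eyA)))
# eps x P(x, eps y Q(x, y)): the inner semiterm depends on x
e2 = ("eps", "x", ("pred", "P", (("var", "x"),
      ("eps", "y", ("pred", "Q", (("var", "x"), ("var", "y")))))))
print(rank(e1), degree(e1), rank(e2), degree(e2))
```

Under these readings, the first term nests a closed ε-term and so gains degree but not rank, while the second nests an ε-semiterm that depends on the outer bound variable and so gains rank but not degree.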
###### Definition 4.10 (Maximal critical ε-term).

Let the maximal critical $\varepsilon$-terms of a proof $\pi$ be the set of critical $\varepsilon$-terms of the greatest degree among the set of critical $\varepsilon$-terms of the greatest rank in $\pi$.

We conclude this section by defining measures for the proof complexity based on critical $\varepsilon$-terms, $\varepsilon$-matrices, critical formulas and $\varepsilon$-equality formulas.

###### Definition 4.11 (Order).

For a proof , the number of distinct critical -terms of rank in is denoted by which we call the order, and the number of distinct -matrices is defined in the same manner and denoted by which we call the matrix order.

###### Definition 4.12 (Width).

Define and by the number of distinct critical formulas belonging to in and of distinct -equality formulas belonging to in , respectively. The width is defined to be . Let be critical -terms in , then the maximal width is defined to be .

In order to measure the number of $\varepsilon$-equality formulas, we replace the notion of the critical count in Definition 3.8 by the following one.

###### Definition 4.13 (Critical count).

Assume $\pi$ is a proof in or . The critical count of $\pi$ is defined to be the sum of the numbers of critical formulas, $\varepsilon$-equality formulas, , and in $\pi$. We let and be the numbers of critical formulas and of $\varepsilon$-equality formulas in $\pi$, respectively.

## 5 Yukami’s Trick

In this short section, we clarify the need for the restriction to $\varepsilon$-matrices in the definition of $\varepsilon$-equality axioms, cf. (3) and (4) (see also Definition 4.4).

For the sake of the argument we assume, for the duration of this section only, that the restriction to $\varepsilon$-matrices is dropped. We focus on the above formulation of $\varepsilon$-equality axioms, using vector notation, as expressed in (4). However, the argument given below is equally valid for Hilbert’s original definition (if we drop the restriction to $\varepsilon$-matrices).

We will employ Yukami’s trick [Yuk84] together with folklore results in structural proof theory [Bus98, Pud98]. For additional insight into the proof-theoretic strength of applications of the identity schema, see [BF01].

###### Theorem 5.1 (cf. [Yuk84]).

Using two instances of the following restricted scheme of identity

 t=0→g(t)=g(0) (5)

we can uniformly derive , from (i) , (ii) , and (iii) .

###### Proof.

Let where is fully indicated. Let , where in refers only to the innermost occuring term . The following equalities can be easily derived (employing in addition suitable instance of the transitivity axiom (ii) and axiom (i))

 0n+A(0n−1+⋯+(02+0)) = 0n−1+(0n−2+⋯+(0+0)) = 0n−1+(0n−2+⋯+(02+0))A

if we employ the instances of (5)

 0+0=0→r1(0+0)=r1(0)

and

 0+0=0→r2[0+0]=r2

Hence we have derived . Eventually, to obtain the desired result, we apply axiom (iii), as is nothing else than ( is indicated by above).

Note that the derivation is uniform for any : while for any the proof slightly differs, the number of steps and in particular the critical count is constant. ∎

###### Remark 5.2.

Using induction one can derive (5) uniformly from (i) and (ii) . That is, Yukami’s trick is available in any suitably rich arithmetical theory.

The next result clarifies that the restricted identity axioms employed in Yukami’s trick are uniformly derivable if no additional restriction on the form of the $\varepsilon$-terms is enforced in (4). Consider the extension of the $\varepsilon$-calculus with the following axioms to cover $\varepsilon$-equality:

$$\vec{u} = \vec{v} \to \varepsilon x A(x; \vec{u}) = \varepsilon x A(x; \vec{v}),$$

where $\varepsilon x A(x; \vec{u})$ and $\varepsilon x A(x; \vec{v})$ denote (arbitrary) $\varepsilon$-terms.

###### Lemma 5.3.

The following identity schema, generalising (5), is derivable in this extension:

$$s = t \to g(s) = g(t),$$

where $g(a)$ is an arbitrary term.

###### Proof.

Let . Consider the following two critical axioms:

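$$g(s) = g(s) \to \varepsilon x (x = g(s)) = g(s) \qquad\qquad g(t) = g(t) \to \varepsilon x (x = g(t)) = g(t)$$

Thus one derives (i) $\varepsilon x (x = g(s)) = g(s)$ as well as (ii) $\varepsilon x (x = g(t)) = g(t)$. We exploit the following $\varepsilon$-equality axiom:

$$s = t \to \varepsilon x (x = g(s)) = \varepsilon x (x = g(t)). \tag{6}$$

Assuming $s = t$, we can thus derive $\varepsilon x (x = g(s)) = \varepsilon x (x = g(t))$. Due to (i), (ii), and the equality axioms, we thus obtain $g(s) = g(t)$, as claimed. It is important to emphasise that the $\varepsilon$-term employed in (6) is not an $\varepsilon$-matrix. ∎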

Before we can employ Yukami’s trick and the above lemma, we need some preparatory definitions and results.

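Let $T$ be a theory. We say $T$ admits Herbrand’s theorem if whenever $T \vdash \exists \vec{x} E(\vec{x})$, with $E$ quantifier-free, then there exists a finite sequence of terms $\vec{t}_0, \ldots, \vec{t}_n$ such that $T \vdash \bigvee_{i=0}^{n} E(\vec{t}_i)$.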

Let be axiomatised by purely universal formulas. Then it is well-known that admits Herbrand’s theorem, cf. [Bus98]. Due to Theorem 3.10 we can even conclude the existence of a function such that , where denotes the critical count of the proof of .

The next result improves upon this, in the sense that we also bound the term complexity of the sequence of terms in the critical count. Let denote the depth of any term , defined in the usual way. Further let denote the formula complexity of a formula in the language of . A variant of the following result is due to Krajíček and Pudlák (see [KP88]).

###### Theorem 5.4.

Suppose is a universal theory such that so that the underlying equational theory of (if any), has positive unification type. Then there exists a primitive recursive function and a finite sequence of terms such that , where .

###### Proof.

Wlog. we assume that is axiomatised by quantifier-free formulas. As is provable in , there exists a conjunction of (quantifier-free) axioms in such that . By the above, we conclude the existence of terms and a primitive recursive function