Applicable Mathematics in a Minimal Computational Theory of Sets

by Arnon Avron, et al.
Tel Aviv University
Cornell University

In previous papers on this project, a general static logical framework for formalizing and mechanizing set theories of different strength was suggested, and the power of some predicatively acceptable theories in that framework was explored. In this work we first improve that framework by enriching it with means for coherently extending its theories by definitions, without destroying its static nature or violating any of the principles on which it is based. We then investigate, within the enriched framework, the power of the minimal (predicatively acceptable) theory in it that proves the existence of infinite sets. We show that this theory is computational, in the sense that every element of its minimal transitive model is denoted by some of its closed terms. (That model happens to be the second universe in Jensen's hierarchy.) We then show that already this minimal theory suffices for developing very large portions (if not all) of scientifically applicable mathematics. This requires treating the collection of real numbers as a proper class, that is, a unary predicate which can be introduced in the theory by the static extension method described in the first part of the paper.






1. Introduction

Formalized mathematics and mathematical knowledge management (MKM) are extremely fruitful and quickly expanding fields of research at the intersection of mathematics and computer science (see, e.g., [3, 13, 33]). The declared goal of these fields is to develop computerized systems that effectively represent all important mathematical knowledge and techniques, while conforming to the highest standards of mathematical rigor. At present there is no general agreement on what the best framework for this task should be. However, since most mathematicians view set theory as the basic foundation of mathematics, formalized set theories should certainly be taken as one of the most natural choices. (Footnote 1: Already in [14] it was argued that “a main asset gained from Set theory is the ability to base reasoning on just a handful of axiom schemes which, in addition to being conceptually simple (even though surprisingly expressive), lend themselves to good automated support”. More recently, H. Friedman wrote (in a message on FOM on Sep 14, 2015): “I envision a large system and various important weaker subsystems. Since so much math can be done in systems much weaker than ZFC, this should be reflected in the choice of Gold Standards. There should be a few major Gold Standards ranging from Finite Set Theory to full blown ZFC”.) (Footnote 2: Notable set-based automated provers are Mizar [43], Metamath [37], and Referee (aka AetnaNova) [41, 15].)

In [6, 7] a logical framework for developing and mechanizing set theories was introduced. Its key properties are that it is based on the usual (type-free) set theoretic language and makes extensive use of abstract set terms. Such terms are extensively used of course in all modern texts in all areas of mathematics (including set theory itself). Therefore their availability is indispensable for the purpose of mechanizing real mathematical practice and for automated or interactive theorem proving in set theories. Accordingly, most of the computerized systems for set theories indeed allow dynamic ways of introducing abstract set terms. The great advantage of the framework of [6, 7] is that it does so in a static way, so the task of verifying that a given term or formula in it is well-formed is decidable, easily mechanizable, and completely separated from any task connected with proving theorems (like finding proofs or checking validity of given ones). Furthermore, this framework enables the use of different logics and set theories of different strength. This modularity of the system has been exploited in [8], where a hierarchy of set theories for formalizing different levels of mathematics within this framework was presented.

The current paper is mainly devoted to one very basic theory from the above-mentioned hierarchy, and to its minimal model. The latter is shown to be the universe $J_2$ in Jensen’s hierarchy [32]. Both the theory and $J_2$ are computational (in a precise sense defined below). With the help of the formal framework of [6, 7, 8] they can therefore be used to make explicit the potential computational content of set theories (first suggested and partially demonstrated in [14]). Here we show that they also suffice for developing large portions of scientifically applicable mathematics [23], especially analysis. (Footnote 3: The thesis that $J_2$ is sufficient for core mathematics was first put forward in [46].) In [21, 22, 23] it was forcefully argued by Feferman that scientifically applicable mathematics, the mathematics that is actually indispensable to present-day natural science, can be developed using only predicatively acceptable mathematics. We provide here further support for this claim, using a much simpler framework and a far weaker theory than those employed by Feferman.

The restriction to a minimal framework has of course its price. Not all of the standard mathematical structures can be treated as elements of $J_2$. (The real line is a case in point.) Hence we have to handle such objects in a different manner. To do this, we first enrich the framework used in [6, 7, 8] with means for coherently extending theories in it by definitions, without destroying its static nature or violating any of the principles on which it is based. (This step is a very important improvement of the framework in its own right.) This makes it possible to introduce the collection of real numbers in the theory as a proper class, that is: a legal defined unary predicate to which no closed term of the theory corresponds. (Classes are introduced here into the formal framework of [6, 7, 8] for the first time.)

The paper is organized as follows: In Section 2 we review the formal framework and the way various standard set theoretical notions have been introduced in it. We also define in this section the notions of computational theory and universe, and describe the computational theories which are minimal within the framework (as well as the corresponding minimal universes). Section 3 is dedicated to the introduction of standard extensions by definitions of the framework, done in a static way. The notion of a class is then introduced as a particular case, and is used for handling global relations and functions in the system. In Section 4 we introduce the natural numbers in the system. Unlike in [8], this is done here using an absolute characterization of the property of being a natural number, and without any appeal to $\in$-induction. In Section 5 we turn to real analysis, and demonstrate how it can be developed in our minimal computational framework, although the reals are a proper class in it. This includes the introduction of the real line and real functions, as well as formulating and proving classical results concerning these notions. (Footnote 4: A few of the claims in Section 5 have counterparts in [8]. However, the models used in that paper are based on universes which are more extensive than the minimal one which is studied here. Hence the development and proofs there were much simpler. In particular: there was no need in [8] to use proper classes, as is essential here. Another, less crucial but still important, difference is that unlike in [8], the use of $\in$-induction is completely avoided in the present paper.) Section 6 concludes with directions for future continuation of the work.

2. The Formal System and its Minimal Model

2.1. Preliminaries: the Framework and the Main Formal System

To avoid confusion, the parentheses are used in our formal languages, for constructing abstract set terms in them, while in the meta-language we use the ordinary . (Footnote 5: To be extremely precise, we should have also used different notations in the formal languages and in the meta-language for and , as well as for many other standard symbols which are used below. However, for readability we shall not do so, and trust the reader to deduce the correct use from the context.) We use the letters for collections; for finite sets of variables; and for variables in the formal language. denotes the set of free variables of , and denotes the result of simultaneously substituting for in . When the identity of and is clear from the context, we just write instead of .

One of the foundational questions in set theory is which formulas should be excluded from defining sets by an abstract term of the form in order to avoid the paradoxes of naive set theory. Various set theories provide different answers to this question, which are usually based on semantical considerations (such as the limitation of size doctrine [25, 28]). Such an approach is not very useful for the purpose of mechanization. In this work we use instead the general syntactic methodology of safety relations developed in [6, 7]. A safety relation is a syntactic relation between formulas and sets of variables. The addition of a safety relation to a logical system allows to use in it statically defined abstract set term of the form , provided that is safe with respect to . Intuitively, a statement of the form “ is safe with respect to ”, where , has the meaning that for every “accepted” sets , the collection is an “accepted” set, which is constructed from the previously “accepted” sets (see discussion below for further details).

Let be a finite set of constants. The language and the associated safety relation are simultaneously defined as follows:

  • Terms:

    • Every variable is a term.

    • Every is a term (taken to be a constant).

    • If is a variable and is a formula such that , then is a term ().

  • Formulas:

    • If are terms, then , are atomic formulas.

    • If are formulas and is a variable, then , are formulas. (Footnote 6: Our official language does not include and . However, since the theory studied in this paper is based on classical logic, we take here as an abbreviation for .)

  • The safety relation :

    • If is an atomic formula, then .

    • If is a term such that , and , then .

    • If , then .

    • If and , then .

    • If , and or , then .

    • If and , then .
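As an illustration of why a relation of this kind can be checked statically, the following Python sketch computes, for a given formula, every set of variables the formula is safe with respect to, and uses the result to decide whether an abstract term is legal. The symbolic parts of the clauses above are not legible in this text, so the rules coded here are only an assumed reading of them (atomic formulas are safe w.r.t. the empty set; x = t, t = x and x ∈ t are safe w.r.t. {x} when x is not free in t; negation preserves safety w.r.t. the empty set; and so on). The tuple encoding and all function names are hypothetical.

```python
from itertools import product

# Terms:    ("var", name) | ("const", name) | ("abs", x, phi)   i.e. {x | phi}
# Formulas: ("eq", s, t) | ("in", s, t) | ("not", p)
#           | ("and", p, q) | ("or", p, q) | ("ex", x, p)

def fv(e):
    """Free variables of a term or formula."""
    tag = e[0]
    if tag == "var":
        return frozenset({e[1]})
    if tag == "const":
        return frozenset()
    if tag in ("abs", "ex"):
        return fv(e[2]) - {e[1]}
    if tag == "not":
        return fv(e[1])
    return fv(e[1]) | fv(e[2])          # eq, in, and, or

def safe_sets(phi):
    """All variable sets X such that phi is safe w.r.t. X (assumed clauses)."""
    tag, empty = phi[0], frozenset()
    if tag in ("eq", "in"):
        s, t = phi[1], phi[2]
        out = {empty}                   # every atomic formula: safe w.r.t. {}
        if s[0] == "var" and s[1] not in fv(t):
            out.add(frozenset({s[1]}))  # x = t  or  x in t, x not free in t
        if tag == "eq" and t[0] == "var" and t[1] not in fv(s):
            out.add(frozenset({t[1]}))  # t = x, x not free in t
        return out
    if tag == "not":
        return {empty} if empty in safe_sets(phi[1]) else set()
    if tag == "or":
        return safe_sets(phi[1]) & safe_sets(phi[2])
    if tag == "and":
        p, q = phi[1], phi[2]
        return {X | Y for X, Y in product(safe_sets(p), safe_sets(q))
                if not (Y & fv(p)) or not (X & fv(q))}
    if tag == "ex":
        y = phi[1]
        return {X - {y} for X in safe_sets(phi[2]) if y in X}
    raise ValueError(tag)

def is_legal_abstraction(x, phi):
    """{x | phi} is accepted as a term iff phi is safe w.r.t. {x}."""
    return frozenset({x}) in safe_sets(phi)

x, y, z = ("var", "x"), ("var", "y"), ("var", "z")
union_body = ("ex", "y", ("and", ("in", y, z), ("in", x, y)))  # union of z
complement_body = ("not", ("in", x, z))                        # "everything not in z"
```

Under these assumed rules, `is_legal_abstraction("x", union_body)` accepts the usual term for the union of z, while `is_legal_abstraction("x", complement_body)` rejects the complement. The check is a plain recursion on syntax and never consults a prover.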

Notation. We take the usual definition of in terms of , according to which . An RST-theory (Footnote 7: ‘RST’ stands for Rudimentary Set Theory. See Theorem 2.2 below.) is a classical first-order system with a variable-binding term operator ([19]), in a language of the form , which includes the following axioms:

  • Extensionality:   

  • Comprehension Schema:   

[7] The following notations are available (i.e. they can be introduced as abbreviations and their basic properties are provable) in every RST-theory:

  • .

  • , where is fresh.

  • . .

  • .

  • , provided and .

  • , where is fresh and .

  • , where are fresh.

  • , where is fresh.

  • , where is fresh.

  • , where are fresh.

  • , where are fresh.

  • , provided . (Footnote 8: Due to the Extensionality Axiom, if , then the term above for denotes if there is no set which satisfies , and it denotes the union of all the sets which satisfy otherwise. In particular: this term has the property that if there is exactly one set which satisfies , then denotes this unique set since . Note that the definition of taken here is simpler than the definition used in [7], which was (where some caution was taken so that the term is always well defined).)

  • , provided .

  • , ( fresh).

  • , ( fresh).

[6] There are formulas, and in such that:

  1. , and for .

  2. , and for .

  3. is provable in every -theory.

  1. is the minimal RST-theory. In other words: is the theory in whose axioms are those given in Definition 2.1. (Footnote 9: can be shown to be equivalent to Gandy’s basic set theory [27].)

  2. is the -theory in in which the following axioms are added to those given in Definition 2.1:


  • In [6] it was suggested that the computationally meaningful instances of the Comprehension Axiom are those which determine the collections they define in an absolute way, independently of any “surrounding universe”. In the context of set theory, a formula is “computable” w.r.t. if the collection is completely and uniquely determined by the identity of the parameters , and the identity of other objects referred to in the formula (all of which are well-determined beforehand). Note that is computable for iff it is absolute in the usual sense of set theory. In order to translate this idea into an exact, syntactic definition, the safety relation is used. Thus, in an RST-theory only those formulas which are safe with respect to are allowed in the Comprehension Schema. It is not difficult to see that the safety relation used in an RST-theory indeed possesses the above property. (Footnote 10: Recently it was shown [10] that up to logical equivalence, and as long as we restrict ourselves to the basic first-order language, the converse holds as well. It is not known yet whether this is true also in the presence of abstract set terms.) Thus the formula should be safe w.r.t. (but not w.r.t. ), since if the identity of is computationally acceptable as a set, then any of its elements must be previously accepted as a set, and . Another example is given by the clause for negation. The intuitive meaning of is the complement (with respect to some universe) of , which is not in general computationally accepted. However, if is absolute, then so is its negation.

  • and differ from the systems and used in [8] with respect to the use of $\in$-induction. In principle, $\in$-induction does not seem to be in any conflict with the notion of a computational theory, since it only imposes further restrictions on the collection of acceptable sets. Accordingly, it was indeed adopted and used in [6, 7, 8]. Nevertheless, in order not to impose unnecessary constraints on our general framework, and in particular to allow the development in it of set theories which adopt the anti-foundation axiom AFA, $\in$-induction is not included in and .

  • It is not difficult to prove that , the set of all hereditary finite sets, is a model of . (In fact, it is the minimal one.) It follows that the set of the natural numbers is not definable as a set in . To solve this problem, the special constant was added in , together with appropriate axioms. (These axioms replace in the usual infinity axiom of .) The intended interpretation of the new constant is , and the axioms for it ensure (as far as it is possible on the first-order level) that is indeed to be interpreted as this collection. In particular, we have:

[8] The following are provable in :

  1. .

  2. , for .

  3. , for .

  1. An important feature of -theories is that their two axioms directly lead (and are equivalent) to the set-theoretical and reduction rules (see [6]).

  2. While the formal language allows the use of set terms, it also provides a mechanizable static check of their validity due to the syntactic safety relation. To obtain decidable syntax, logically equivalent formulas are not taken to be safe w.r.t. the same set of variables. However, if is provable in some RST-theory, then so is . Therefore for such we might freely write in what follows instead of . (Footnote 11: Further discussion on decidability issues for safety-based languages can be found in [5].)

  3. It is easy to verify that the system is a proper subsystem of ZF. While the latter is not an RST-theory, in [6] it was shown that it can be obtained from the former by adding the following clauses to the definition of its safety relation:

    • Separation: for every formula .

    • Powerset: if .

    • Replacement: , provided , and .

  4. Unlike in this paper, in general the framework for set theories just reviewed is not confined to the first-order level or to classical logic. Thus in [7] it was used together with ancestral logic ([35, 38, 44, 4, 17]). (This involves adding a special clause to the definition of that treats the operation of transitive closure.) Intuitionistic versions have been investigated too.

  5. A safety relation like presents a difficult challenge for mechanized logical frameworks of the Edinburgh LF’s type ([29]). First, it is a strictly syntactic relation between formulas and variables, whose direct implementation requires the use of meta-variables for the variables of the object language — something which is particularly difficult to handle in this type of logical frameworks ([9]). Second, does not have a fixed arity like all judgements in the Edinburgh LF do: it is actually a relation between formulas and finite sets of object-level variables. Therefore it seems that current logical frameworks should be significantly extended and refined in order to be able to handle the syntactic framework for set theories that was proposed in [6] (and is used here).

2.2. The Minimal Model

We next recall the definition of rudimentary functions (for more on this topic see [20, 30]). (Footnote 12: To be precise, the definition we take here is given in the Basis Lemma in [20]. It was shown there that this definition is equivalent to the standard definition of rudimentary functions.) Rudimentary functions are just the functions obtained by omitting the recursion schema from the usual list of schemata for primitive recursive set functions. Every rudimentary function is a composition of the following functions:

  • where

  1. A function is called -rudimentary if it can be generated by composition of the functions in Definition 12, and the following constant function:

    • (the set of hereditary finite sets).

  2. An -universe (universe in short) is a transitive collection of sets that is closed under -rudimentary functions.
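To make the set of hereditarily finite sets and the transitivity requirement concrete, here is a small Python sketch. It relies on the standard fact that the hereditarily finite sets are exactly the members of the finite levels V_0 = ∅, V_{n+1} = P(V_n) of the cumulative hierarchy; the helper names `powerset`, `V` and `is_transitive` are hypothetical, and Python `frozenset` values stand in for sets.

```python
from itertools import combinations

def powerset(s):
    """All subsets of the finite set s, each as a frozenset."""
    items = list(s)
    return frozenset(frozenset(c)
                     for r in range(len(items) + 1)
                     for c in combinations(items, r))

def V(n):
    """n-th finite level of the cumulative hierarchy: V_0 = {}, V_{k+1} = P(V_k)."""
    level = frozenset()
    for _ in range(n):
        level = powerset(level)
    return level

def is_transitive(u):
    """A collection is transitive if every element of it is also a subset of it."""
    return all(x <= u for x in u)

sizes = [len(V(n)) for n in range(5)]   # cardinalities of V_0 .. V_4
```

Each finite level is transitive and is contained in the next one, so their union (the set of all hereditarily finite sets) is transitive as well. No single finite level is closed under operations such as pairing, which is why closure is demanded of the universe as a whole.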

Terminology. In what follows, we do not distinguish between a universe and the structure for with domain and an interpretation function that assigns the obvious interpretations to the symbols , , and to . We denote by the -variant of which assigns to . If are two vectors of the same length we abbreviate by . We denote by any assignment which assigns to each the element . (Footnote 13: As long as we apply to expressions whose set of free variables is contained in the exact assignment does not matter.)

Let be a universe, an assignment in . For any term and formula of , we recursively define a collection and a truth value (respectively) by:

  • for a variable.

  • iff  ;  iff

  • iff

  • iff

  • iff

  • iff

Given and , we say that the term defines the collection .

From Theorem 2.2 below it follows that is an element of (and it denotes the value in that the term gets under ), and denotes the truth value of the formula under and . In case is a closed term or a closed formula, we denote by the value of in , and at times we omit the superscript and simply write .
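The recursive evaluation of terms and formulas can be sketched on a finite toy universe. The Python below is only an illustration under stated assumptions: an abstract term {x | φ} is read as the collection of those elements a of the universe for which φ holds when x is assigned a, quantifiers range over the universe, and finite levels of the cumulative hierarchy stand in for a universe (they are transitive, but not closed under the required functions). The tuple encoding and all names are hypothetical.

```python
from itertools import combinations

def powerset(s):
    items = list(s)
    return frozenset(frozenset(c)
                     for r in range(len(items) + 1)
                     for c in combinations(items, r))

def V(n):
    """Finite level V_n, used here as a toy stand-in for a universe."""
    level = frozenset()
    for _ in range(n):
        level = powerset(level)
    return level

def term_value(t, W, v):
    """Value of term t in W under assignment v (a dict from names to sets)."""
    if t[0] == "var":
        return v[t[1]]
    if t[0] == "abs":                       # {x | phi}
        x, phi = t[1], t[2]
        return frozenset(a for a in W if holds(phi, W, {**v, x: a}))
    raise ValueError(t[0])

def holds(phi, W, v):
    """Truth value of formula phi in W under assignment v."""
    tag = phi[0]
    if tag == "eq":
        return term_value(phi[1], W, v) == term_value(phi[2], W, v)
    if tag == "in":
        return term_value(phi[1], W, v) in term_value(phi[2], W, v)
    if tag == "not":
        return not holds(phi[1], W, v)
    if tag == "and":
        return holds(phi[1], W, v) and holds(phi[2], W, v)
    if tag == "or":
        return holds(phi[1], W, v) or holds(phi[2], W, v)
    if tag == "ex":
        return any(holds(phi[2], W, {**v, phi[1]: a}) for a in W)
    raise ValueError(tag)

x, y, z = ("var", "x"), ("var", "y"), ("var", "z")
union_z = ("abs", "x", ("ex", "y", ("and", ("in", y, z), ("in", x, y))))
empty = frozenset()
one = frozenset({empty})
two = frozenset({empty, one})
```

With z assigned `two`, evaluating `union_z` in `V(3)` and in `V(4)` yields the same set `one`. This mirrors, on a toy scale, the idea that the value of a legal term depends only on its parameters and not on the surrounding universe.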

The following theorem is a slight generalization of a theorem proved in [7].

  1. If is an -ary -rudimentary function, then there is a formula of s.t.:

  2. If is a formula of such that:

    then there exists a -rudimentary function such that:

  3. If is a term of such that , then there exists a -rudimentary function such that for every .


The corresponding theorem in [7] establishes the connection between without the constant and rudimentary functions. Thus, the only modification required here is the treatment of the new function in (1), and the treatment of the constant in (2) and (3) (which are then incorporated in the original proof that was carried out by induction). For (1), it is easy to verify that . For (2) and (3) (which are proved by simultaneous induction on the structure of terms and formulas), the case for the constant is immediate from the definition of -rudimentary functions. ∎

Let be a universe, and an assignment in .

  • For a term of , .

  • For a formula of :

    • If and , then:


    • If and , then for any :


The proof is straightforward using Theorem 2.2. Claims (1) and (2a) are immediate. For (2b) let be a formula s.t. and . Using Theorem 2.2 we get that defines a -rudimentary predicate (i.e. one whose characteristic function is -rudimentary). Define:

is a -rudimentary function (see Lemma 1.1 in [20]). Now, define:

is also -rudimentary since

Now, the fact that is a universe entails that for every assignment in and every , , and so . ∎

Let be a universe. Then, is a model of .


The Extensionality axiom is clearly satisfied in any universe. Theorem 2.2 entails that the interpretation of any term is an element of the universe. This immediately implies that the other axioms are satisfied in any universe. It is also straightforward to verify that the interpretation of as satisfies the three axioms for . ∎

2.3. Computational Theories and Universes

Computations within a set of objects require concrete representations of these objects. Accordingly, we call a theory computational if its set of closed terms induces in a natural way a minimal model of the theory, and it enables the key properties of these elements to be provable within it. Next we provide a more formal definition for the case of set theories which are defined within our general framework. Note that from a Platonist point of view, the set of closed terms of such a theory induces some subset of the cumulative universe of sets , as well as some subset of any transitive model of .

  1. A theory in the above framework is called computational if the set it induces is a transitive model of , and the identity of is absolute in the sense that for any transitive model of (implying that is a minimal transitive model of ).

  2. A set is called computational if it is for some computational theory .

The most basic computational theories are and , which are the two minimal theories in the hierarchy of systems developed in [8]. This fact, as well as the corresponding computational universes, are described in the following three results from [8].

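Let $J_1$ and $J_2$ be the first two elements in Jensen’s hierarchy [32]. (Footnote 14: $J_1$ is the set of hereditarily finite sets and $J_2 = \mathrm{Rud}(J_1)$, where $\mathrm{Rud}(x)$ denotes the smallest set $y$ such that $x \subseteq y$, $x \in y$, and $y$ is closed under application of all rudimentary functions.)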

  1. is a model of .

  2. with the interpretation of as is a model of .


The first claim is trivial. The second claim follows from Corollary 2.2, since is clearly a universe.∎

  1. iff there is a closed term of s.t. .

  2. iff there is a closed term of such that .


We prove the second item, leaving the easy proof of the first to the reader. Theorem 2.2 entails the right-to-left implication. The converse is proved by induction, using Lemma 2.1. Clearly, and . Now, suppose that for there are closed terms and such that and . We show that there are closed terms for any of the results of applications of to and .

The two theories are computational, and $J_1$ and $J_2$ are their computational models.

Now $J_1$, the minimal computational set, is the set of hereditarily finite sets. Its use captures the standard data structures used in computer science, like strings and lists. However, in order to be able to capture computational structures with infinite objects, we have to move to the second theory, whose computational universe, $J_2$, seems to be the minimal universe that suffices for this purpose. It still allows for a very concrete, computationally-oriented interpretation, and it is appropriate for mechanical manipulations and interactive theorem proving. As noted in the introduction, the main goal of this paper is to show that this theory and its corresponding universe are sufficiently rich for a systematic development of (great parts of) applicable mathematics.
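The remark that hereditarily finite sets capture data structures such as strings and lists can be illustrated by one possible encoding. The Python sketch below uses Kuratowski ordered pairs, von Neumann numerals, and lists built as nested pairs ending in the empty set; this particular encoding is an assumption made for illustration and is not taken from the paper, and all names are hypothetical.

```python
EMPTY = frozenset()

def pair(a, b):
    """Kuratowski ordered pair <a, b> = {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})

def unpair(p):
    """Recover (a, b) from a Kuratowski pair."""
    small, large = min(p, key=len), max(p, key=len)
    (a,) = small
    rest = large - small
    b = next(iter(rest)) if rest else a     # <a, a> collapses to {{a}}
    return a, b

def nat(n):
    """Von Neumann numeral: 0 = {}, n + 1 = n U {n}."""
    s = EMPTY
    for _ in range(n):
        s = s | frozenset({s})
    return s

def encode_list(items):
    """[x1, ..., xk] as <x1, <x2, ... <xk, {}> ... >>."""
    out = EMPTY
    for item in reversed(items):
        out = pair(item, out)
    return out

def decode_list(code):
    items = []
    while code != EMPTY:
        head, code = unpair(code)
        items.append(head)
    return items

def encode_word(word):
    """A lowercase word as the list of its letter indices (a = 0, b = 1, ...)."""
    return encode_list([nat(ord(c) - ord("a")) for c in word])
```

For example, `decode_list(encode_word("cab"))` returns the numerals for 2, 0 and 1 in that order, so the encoded word can be recovered from a single hereditarily finite set.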

3. Classes and Static Extensions by Definitions

When working in a minimal computational universe such as $J_2$ (as done in the next section), many of the standard mathematical objects (such as the real line and real functions) are only available in our framework as proper classes. Thus, in order to be able to formalize standard theorems regarding such objects we must enrich our language to include them. However, introducing classes into our framework is a part of the more general method of extensions by definitions, which is an essential part of every mathematical research and its presentation. There are two principles that govern this process in our framework. First, the static nature of our framework demands that conservatively expanding the language of a given theory should be reduced to the use of abbreviations. Second, since the introduction of new predicates and function symbols creates new atomic formulas and terms, one should be careful that the basic conditions concerning the underlying safety relation are preserved. Thus only formulas that are safe with respect to the empty set of variables can be used for defining new predicate symbols.

We start with the problem of introducing new predicate symbols. Since -ary predicates can be reduced in the framework of set theory to unary predicates, we focus on the introduction of new unary predicates. In standard practice such extensions are carried out by introducing a new unary predicate symbol and either treating as an abbreviation for for some formula and variable , or (what is more practical) adding as an axiom to the (current version of the base) theory, obtaining by this a conservative theory in the extended language. However, in the set theoretical framework it is possible and frequently more convenient to uniformly use class terms, rather than introduce a new predicate symbol each time. Thus, instead of writing “” one uses an appropriate class term and writes “”. Whatever approach is chosen – in order to respect the definition of a safety relation, class terms should be restricted so that “” is safe w.r.t. . Accordingly, we extend our language by incorporating class terms, which are objects of the form , where . The use of these terms is done in the standard way. In particular, (where is free for in ) is equivalent to (and may be taken as an abbreviation for) . It should be emphasized that a class term is not a valid term in the language, but only a definable predicate. Thus the addition of the new notation does not enhance the expressive power of languages like , but only increases the ease of using them.

Further standard abbreviations (see [34]) are:

  • is an abbreviation for .

  • and stand for .

  • is an abbreviation for .

  • is an abbreviation for .

  • is an abbreviation for .

Note that these formulas are merely abbreviations for formulas which are not necessarily atomic (even though, also happens to be safe with respect to ).

A further conservative extension of the language that we shall use incorporates free class variables, , and free function variables, , into (as in free-variable second-order logic [44]). These variables stand for arbitrary class or function terms (the latter is defined in Def. 16), and they may only appear as free variables, never to be quantified. We allow occurrences of such variables inside a formula in a class term or a function term. One may think of a formula with such variables as a schema, where the variables play the role of “place holders”, and whose substitution instances abbreviate official formulas of the language. (See Example 5.2.) In effect, a formula with free class variable can be intuitively interpreted as “for any given class , holds”. Thus, a free-variable formulation has the flavor of a universal formula. Therefore, this addition allows statements about all potential classes and all potential functions.

Let be a universe, an assignment in , and let . Define:

Given and , we again say that the class term on the left defines here the collection on the right (even though it might not be an element of ).

Let be a collection of elements in a universe .

  • is a -set (in ) if there is a closed term that defines it. (See Definition 2.2.) If is a -set, denotes some closed term that defines it.

  • is a -class (in ) if there is a closed class term that defines it. If is a -class, denotes some closed class term that defines it.

Note that by Corollary 2.2, if is a -set in then . The following holds for every universe :

  1. Every -set is a -class.

  2. The intersection of a -class with a -set is a -set.

  3. Every -class that is contained in a -set is a -set.

  1. If is a -set, then . Hence (see [6]) . This implies that is a class term which defines , and so is a -class.

  2. Let be a -class and be a -set. Then, can be defined by the term . Since and we get that . Hence is a -set.

  3. Follows from (2), since if then . ∎
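The difference between a class and a set, and the content of item (2), can be pictured with a toy model in which a class is a membership test and a set is a concrete finite object. The Python sketch below is an illustration only: the predicate `at_most_one_element` plays the role of a class term whose defining formula uses only bounded quantifiers, finite levels V_n stand in for sets of a universe, and all names are hypothetical.

```python
from itertools import combinations

def powerset(s):
    items = list(s)
    return frozenset(frozenset(c)
                     for r in range(len(items) + 1)
                     for c in combinations(items, r))

def V(n):
    """Finite level V_n of the cumulative hierarchy."""
    level = frozenset()
    for _ in range(n):
        level = powerset(level)
    return level

def at_most_one_element(a):
    """Class given by a predicate: any two elements of a are equal."""
    return all(u == w for u in a for w in a)

def restrict(cls, s):
    """Intersection of a class (a predicate) with a set: {x | x in s and cls(x)}."""
    return frozenset(a for a in s if cls(a))

empty = frozenset()
one = frozenset({empty})
small_in_V3 = restrict(at_most_one_element, V(3))
```

The predicate itself is never materialised as a collection; only its restriction to a given set is. Membership of a particular element in the class is decided by inspecting that element alone, while the extent of the class grows with the surrounding collection (3 members inside `V(3)`, 5 inside `V(4)`).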

A semantic counterpart of our notion of a -class was used in [46], and is called there an -class. It is defined as a definable subset of whose intersection with any element of is in . The second condition in this definition seems somewhat ad hoc. More importantly, it is unclear how it can be checked in general, and what kind of set theory is needed to establish that certain collections are -classes. In contrast, the definition of a -class used here is motivated by, and based on, purely syntactical considerations. It is also a simplification of the notion of -class, as by Prop. 3(2) every -class is an -class. (Footnote 15: Two other ideas that appear in the sequel were adopted from [46]: treating the collection of reals as a proper class, and the use of codes for handling certain classes. It should nevertheless be emphasized that the framework in [46] is exclusively based on semantical considerations, and it is unclear how it can be turned into a formal (and suitable for mechanization) theory like or .)

The following holds for every universe :

  • Let be a -set. If and , then is a -set.

  • If , then is a -set.

  • is defined by .

  • is defined by , where is fresh. ∎

For every -ary -rudimentary function there is a term with s.t. for any , returns the -set .


It is easy to see that if are -sets, and is a formula such that and , then is a -set. Therefore the proposition easily follows from Theorem 2.2.∎

If are -classes (in a universe ), so are , , , , and .


  • .

  • .

  • . (See Lemma 2.1.)

  • .

  • . (See footnote 6.) ∎

For a class term we denote by the class term . Note that for any assignment in and class term , is equal to , i.e., the intersection of the power set of and . This demonstrates the main difference between set terms and class terms. The interpretation of set terms is absolute, whereas the interpretation of class terms might not be (though membership in the interpretation of a class term is absolute). Let be a universe. A -relation (in ) from a -class to a -class is a -class s.t. . A -relation is called small if it is a -set (of ).

Next we extend our framework by the introduction of new function symbols. This poses a new difficulty. While new relation symbols are commonly introduced in a static way, new function symbols are usually introduced dynamically: a new function symbol is made available after appropriate existence and uniqueness theorems have been proved. However, one of the main guiding principles of our framework is that its languages should be treated exclusively in a static way. Thus function symbols, too, are introduced here only as abbreviations for definable operations on sets. (Footnote 16: In this paper, as in standard mathematical textbooks, the term “function” is used both for collections of ordered pairs and for set-theoretical operations (such as ).)

Let be a universe. (The various definitions should be taken with respect to .)

  • For a closed class term and a term of , is a function term which is an abbreviation for