A formalization of forcing and the unprovability of the continuum hypothesis

by Jesse Michael Han, et al.

We describe a formalization of forcing using Boolean-valued models in the Lean 3 theorem prover, including the fundamental theorem of forcing and a deep embedding of first-order logic with a Boolean-valued soundness theorem. As an application of our framework, we specialize our construction to the Boolean algebra of regular opens of the Cantor space 2^(ω₂ × ω) and formally verify the failure of the continuum hypothesis in the resulting model.





The continuum hypothesis (CH) states that there are no sets strictly larger than the countable natural numbers and strictly smaller than the uncountable real numbers. It was introduced by Cantor [7] in 1878 and was the very first problem on Hilbert’s list of twenty-three outstanding problems in mathematics. Gödel [14] proved in 1938 that the continuum hypothesis is consistent with ZFC, and later conjectured that it is independent of ZFC, i.e. neither provable nor disprovable from the ZFC axioms. In 1963, Paul Cohen developed forcing [10, 11], which allowed him to prove the consistency of the negation of the continuum hypothesis and thereby complete the independence proof. For this work, which marked the beginning of modern set theory, he was awarded a Fields medal, the only one ever awarded for work in mathematical logic.

In this paper we discuss the formalization of a Boolean-valued model of set theory where the continuum hypothesis fails. The work we describe is part of the Flypitch project, which aims to formalize the independence of the continuum hypothesis. Our results mark a major milestone towards that goal.

Our formalization is written in the Lean 3 theorem prover. Lean is an interactive proof assistant under active development at Microsoft Research [12, 41]. It implements the Calculus of Inductive Constructions and has a similar metatheory to Coq, adding definitional proof irrelevance, quotient types, and a noncomputable choice principle. Our formalization makes as much use of the expressiveness of Lean’s dependent type theory as possible, using constructions which are impossible or unwieldy to encode in HOL, much less ZF: Lean’s ordinals and cardinals, which are defined as equivalence classes of well-ordered types, live one universe level up and play a crucial role in the forcing argument; the models of set theory we construct require as input an entire universe of types; our encoding of first-order logic uses parametrized inductive types to equate type-correctness with well-formedness, eliminating the need for separate well-formedness proofs.

The method of forcing with Boolean-valued models was developed by Solovay and Scott in ’65-’66 [35, 37] as a simplification of Cohen’s method. Some of these simplifications were incorporated by Shoenfield [40] into a general theory of forcing using partial orders, and it is in this form that forcing is usually practiced. While both approaches have essentially the same mathematical content (see e.g. [26, 23, 28]), there are several reasons why we chose Boolean-valued models for our formalization:

  • Modularity. The theory of forcing with Boolean-valued models cleanly splits into several components (a general theory of Boolean-valued semantics for first-order logic, a library for calculations inside complete Boolean algebras, the construction of Boolean-valued models of set theory, and the specifics of the forcing argument itself) which could be formalized in parallel and then recombined.

  • Directness. For the purposes of an independence proof, the Boolean-valued soundness theorem eliminates the need to produce a two-valued model. This approach also bypasses any requirement for the reflection theorem/Löwenheim-Skolem theorems, Mostowski collapse, countable transitive models, or genericity considerations for filters.

  • Novelty and reusability. As far as we were able to tell, the Boolean-valued approach to forcing has never been formalized. Furthermore, while for the purposes of an independence proof, forcing with Boolean-valued models and forcing with countable transitive models accomplish the same thing, a general library for Boolean-valued semantics of a deeply embedded logic could be used for formal verification applications outside of set theory, e.g. to formalize the Boolean-valued semantics of stochastic λ-calculus [38, 4].

  • Amenability to structural induction. As with Coq, Lean is able to encode extremely complex objects and reason about their specifications using inductive types. However, the user must be careful to choose the encoding so that properties they wish to reason about are accessible by structural induction, which is the most natural mode of reasoning in the proof assistant. After observing (1) that the Aczel-Werner encoding of ZFC as an inductive type is essentially a special case of the recursive name construction from forcing (cf. Section 3), and (2) that the automatically-generated induction principle for that inductive type is ∈-induction, it is easy to see that this encoding can be modified to produce a Boolean-valued model of set theory where, again, ∈-induction comes for free.

We briefly outline the rest of the paper. In Section 1 we outline the method of Boolean-valued models and sketch the forcing argument. Section 2 discusses a deep embedding of first-order logic, including a proof system and the Boolean-valued soundness theorem. Section 3 discusses our construction of Boolean-valued models of set theory. Section 4 describes the formalization of the forcing argument and the construction of a suitable Boolean algebra for forcing ¬CH. Section 5 describes the formalization of some transfinite combinatorics. We conclude with a reflection on our formalization and an indication of future work.

1 Outline of the proof

ZFC is a collection of first-order sentences in the language of a single binary relation ∈, used to axiomatize set theory. The continuum hypothesis can be written in this fashion as a first-order sentence CH. A proof of CH is a finite list of deductions starting from ZFC and ending at CH. The soundness theorem says that provability implies truth in every model, i.e. if ZFC ⊢ CH, then CH interpreted in any model of ZFC is true. Taking the contrapositive, we can demonstrate the unprovability of CH (equivalently, the consistency of its negation) by exhibiting a single model of ZFC where CH is not true.

A model of a first-order theory T in a language L is in particular a way of assigning true or false in a coherent way to sentences in L. Modulo provable equivalence, the sentences form a Boolean algebra and “coherent” means the assignment is a Boolean algebra homomorphism (so ∨ becomes join, ∀ becomes infimum, etc.) into 2 = {true, false}. The soundness theorem ensures that this homomorphism v sends a proof φ ⊢ ψ to an inequality v(φ) ≤ v(ψ). The algebra 2 may be replaced by any complete Boolean algebra 𝔹, where the top and bottom elements ⊤ and ⊥ take the place of true and false. It is straightforward to extend this analogy to a 𝔹-valued semantics for first-order logic, and in this generality, the soundness theorem now says that for any such 𝔹, if ZFC ⊢ CH, then for any 𝔹-valued structure where all the axioms of ZFC have truth-value ⊤, CH does also. Then as before, to demonstrate the consistency of the negation of CH it suffices to find just one 𝔹 and a single 𝔹-valued model of ZFC where CH is not “true”.
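The clauses of such a 𝔹-valued semantics can be written out explicitly. The display below is a sketch in standard textbook notation; the bracket ⟦·⟧ for truth-values and the name M for the carrier of the structure are notational assumptions, not names taken from the formalization.

```latex
% requires amsmath, amssymb, stmaryrd
\begin{align*}
\llbracket \varphi \lor \psi \rrbracket &= \llbracket \varphi \rrbracket \sqcup \llbracket \psi \rrbracket, &
\llbracket \varphi \land \psi \rrbracket &= \llbracket \varphi \rrbracket \sqcap \llbracket \psi \rrbracket, \\
\llbracket \lnot \varphi \rrbracket &= \lnot \llbracket \varphi \rrbracket, &
\llbracket \forall x\, \varphi(x) \rrbracket &= \bigsqcap_{a \in M} \llbracket \varphi(a) \rrbracket, \\
\mathrm{ZFC} \vdash \mathrm{CH}
  &\implies \llbracket \mathrm{CH} \rrbracket = \top
  &&\text{whenever } \llbracket \sigma \rrbracket = \top \text{ for every axiom } \sigma \in \mathrm{ZFC}.
\end{align*}
```

Read contrapositively, the last line is the form in which soundness is used: a single 𝔹-valued model of ZFC with ⟦CH⟧ ≠ ⊤ rules out a proof of CH.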

This is where forcing comes in. Given a universe V of set theory containing a Boolean algebra 𝔹, one constructs in analogy to the cumulative hierarchy a new 𝔹-valued universe V^𝔹 of set theory, where the powerset operation is replaced by taking functions into 𝔹. Thus, the structure of 𝔹 informs the decisions made by V^𝔹 about what subsets, hence functions, exist among the members of V^𝔹; the real challenge lies in selecting a suitable 𝔹 and reasoning about how its structure affects the structure of V^𝔹. While V^𝔹 may vary wildly depending on the choice of 𝔹, the original universe V always embeds into V^𝔹 via an operation x ↦ x̌, and while the passage of x to x̌ may not always preserve its original properties, properties which are definable with only bounded quantification are preserved; in particular, V^𝔹 thinks ℕ̌ is ℕ.

To force the negation of the continuum hypothesis, we use the Boolean algebra 𝔹 := RO(2^(ℵ₂ × ℕ)) of regular opens of the Cantor space 2^(ℵ₂ × ℕ). For each ν ∈ ℵ₂, we associate the 𝔹-valued characteristic function χ_ν : ℕ → 𝔹 given by χ_ν(n) := {f | f(ν, n) = 1}. This induces what V^𝔹 thinks is a new subset χ̃_ν ⊆ ℕ, called a Cohen real, and furthermore, simultaneously performing this construction on all ν induces what V^𝔹 thinks is a function from ℵ₂̌ to 𝒫(ℕ). After showing that V^𝔹 thinks this function is injective, to finish the proof it suffices to show that x ↦ x̌ preserves cardinal inequalities, as then we will have squeezed ℵ₁̌ properly between ℕ and 𝒫(ℕ). This is really the technical heart of the matter, and relies on a combinatorial property of 𝔹 called the countable chain condition (CCC), the proof of which requires a detailed combinatorial analysis of the basis of the product topology for 2^(ℵ₂ × ℕ); we handle this with a general result in transfinite combinatorics called the Δ-system lemma.
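Schematically, the forcing argument aims at the following chain of statements inside the Boolean-valued universe. This is only a restatement of the steps described above; the explicit formula for χ_ν is the standard one for Cohen forcing and is an assumption about notation rather than a quotation from the formalization.

```latex
% requires amsmath, amssymb
\begin{align*}
\chi_\nu(n) &:= \{\, f \in 2^{\aleph_2 \times \mathbb{N}} \mid f(\nu, n) = 1 \,\} \in \mathbb{B}
  && (\nu \in \aleph_2,\ n \in \mathbb{N}), \\
\nu &\mapsto \tilde{\chi}_\nu
  && \text{(an injection } \check{\aleph}_2 \to \mathcal{P}(\mathbb{N}) \text{ according to } V^{\mathbb{B}}), \\
\mathbb{N} &< \check{\aleph}_1 < \check{\aleph}_2 \le \mathcal{P}(\mathbb{N})
  && \text{(cardinal inequalities preserved by CCC)}.
\end{align*}
```

The first two lines only use the shape of the Cantor space; the strict inequalities in the last line are where the countable chain condition is needed.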

So far we have mentioned nothing about how this argument, which is wholly set-theoretic, is to be interpreted inside type theory. To do this, it was important to separate the mathematical content from the metamathematical content of the argument. While our objective is only to produce a model of ZFC satisfying certain properties, traditional presentations of forcing are careful to stay within the foundations of ZFC, emphasizing that all arguments may be performed internal to a model of ZFC, etc., and it is not immediately clear what parts of the argument use that set-theoretic foundation in an essential way and require modification in the passage to type theory. Our formalization clarifies some of these questions.

Finally, when working with Boolean-valued models, it is profitable to keep in mind the following analogy, developed by Scott in [35]. A ready supply of complete Boolean algebras 𝔹 is obtained by taking the measure algebra of a probability space and quotienting by the ideal of events of measure zero. Let M be a 𝔹-valued structure. A unary 𝔹-valued predicate φ on M assigns an event φ(m) to every element m of M, whose measure we can think of as being the probability that φ(m) is true. Specializing to the language of set theory, we can attach to every x in M an “indicator function” y ↦ (y ∈ x), which assigns to every y a probability that it is actually a member of x. Thus, by virtue of extensionality, we may think of the elements of a 𝔹-valued model of ZFC as being “set-valued random variables”, or “random sets”; see [35] and [28] for details. (In this analogy, given a universe of random sets, the purpose of the generic filter or ultrafilter in forcing is then to simultaneously evaluate the outcomes of the random variables, collapsing them into an ordinary universe of sets.)


Our strategy for constructing a Boolean-valued model in which CH fails is a synthesis of the proofs in the textbooks of Bell ([5], Chapter 2) and Manin ([27], Chapter 8). For the Δ-system lemma, we follow Kunen ([26], Chapters 1 and 5).

Viewing the formalization

The code blocks in this paper were taken directly from our formalization, but for the sake of formatting and readability, we sometimes omit or modify universe levels, type ascriptions, and casts. We refer the interested reader to our repository (https://github.com/flypitch/flypitch), which contains a guide on compiling and navigating the source files of the project. In particular, there is a summary file summary.lean containing #print statements of important definitions and duplicated proofs of the main theorems.

2 First-order logic

The starting point for first-order logic is a language of relation and function symbols. We represent a language as a pair of ℕ-indexed families of types, each of which is to be thought of as the collection of relation (resp. function) symbols stratified by arity:

structure Language : Type (u+1) :=
(functions : ℕ → Type u) (relations : ℕ → Type u)

2.1 (Pre)terms, (pre)formulas

The main novelty of our implementation of first-order logic is the use of partially applied terms and formulas, encoded in a parametrized inductive type where the parameter measures the difference between the arity and the number of applications. The benefit of this is that it is impossible to produce an ill-formed term or formula, because type-correctness is equivalent to well-formedness. This eliminates the need for separate well-formedness proofs.

Fix a language L. We define the type of preterms as follows:

inductive preterm : ℕ → Type u
| var {} : ∀ (k : ℕ), preterm 0
| func : ∀ {l : ℕ} (f : L.functions l), preterm l
| app : ∀ {l : ℕ} (t : preterm (l + 1)) (s : preterm 0), preterm l

We use de Bruijn indices to avoid variable shadowing. A member of preterm n is a partially applied term: once applied to n terms, it becomes a term. Every element of preterm L 0 is a well-formed term. We use this encoding to avoid mutual or nested inductive types, since those are inconvenient to work with in Lean.
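To see how partial application tracks arity, here is a minimal Lean 3 sketch that restates the two definitions above in self-contained form and builds one term. The names plus_functions, L_plus, plus_sym and x0_plus_x1 are hypothetical and do not come from the formalization.

```lean
universe u

structure Language : Type (u+1) :=
(functions : ℕ → Type u) (relations : ℕ → Type u)

inductive preterm (L : Language.{u}) : ℕ → Type u
| var {} : ∀ (k : ℕ), preterm 0
| func : ∀ {l : ℕ} (f : L.functions l), preterm l
| app : ∀ {l : ℕ} (t : preterm (l + 1)) (s : preterm 0), preterm l

-- Hypothetical example language: one binary function symbol, no relations.
inductive plus_functions : ℕ → Type
| plus : plus_functions 2

def L_plus : Language.{0} := ⟨plus_functions, λ _, empty⟩

-- The bare symbol still awaits two arguments, so it lives in `preterm L_plus 2`.
def plus_sym : preterm L_plus 2 := preterm.func plus_functions.plus

-- Applying it to the variables with de Bruijn indices 0 and 1 gives a term.
def x0_plus_x1 : preterm L_plus 0 :=
preterm.app (preterm.app plus_sym (preterm.var 0)) (preterm.var 1)
```

Under this encoding, a term with too few or too many arguments is a type error rather than something to be excluded by a separate well-formedness predicate.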

The type of preformulas is defined similarly:

inductive preformula : ℕ → Type u
| falsum {} : preformula 0                               -- notation `⊥`
| equal (t₁ t₂ : term L) : preformula 0                  -- notation `≃`
| rel {l : ℕ} (R : L.relations l) : preformula l
| apprel {l : ℕ} (f : preformula (l + 1)) (t : term L) : preformula l
| imp (f₁ f₂ : preformula 0) : preformula 0              -- notation `⟹`
| all (f : preformula 0) : preformula 0                  -- notation `∀'`
-- ¬ f := f ⟹ ⊥, notation `∼f`
-- ∃ f := ∼ ∀' ∼ f, notation `∃' f`

A member of preformula n is a partially applied formula. If applied to n terms, it becomes a formula. Implication is the only binary connective. Since we use classical logic, we can define the other connectives from implication and falsum. Similarly, universal quantification is our only quantifier.
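For reference, the derived connectives can be spelled out as follows. The first row restates the two abbreviations listed with the inductive type; the second row gives the usual classical definitions of conjunction and disjunction, which are an assumption here and may differ in form from the ones used in the formalization.

```latex
% requires amsmath
\begin{align*}
\lnot \varphi &:= \varphi \Rightarrow \bot, &
\exists x\, \varphi &:= \lnot \forall x\, \lnot \varphi, \\
\varphi \land \psi &:= \lnot(\varphi \Rightarrow \lnot \psi), &
\varphi \lor \psi &:= \lnot \varphi \Rightarrow \psi .
\end{align*}
```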

Our proof system is a natural deduction calculus, and all rules are motivated to work well with backwards-reasoning:

inductive prf : set (formula L) → formula L → Type u
| axm     {Γ A} (h : A ∈ Γ) : prf Γ A
| impI    {Γ} {A B} (h : prf (insert A Γ) B) : prf Γ (A ⟹ B)
| impE    {Γ} (A) {B} (h₁ : prf Γ (A ⟹ B)) (h₂ : prf Γ A) : prf Γ B
| falsumE {Γ} {A} (h : prf (insert ∼A Γ) ⊥) : prf Γ A
| allI    {Γ A} (h : prf (lift_formula1 '' Γ) A) : prf Γ (∀' A)
| allE₂   {Γ} A t (h : prf Γ (∀' A)) : prf Γ (A[t // 0])
| ref     (Γ t) : prf Γ (t ≃ t)
| subst₂  {Γ} (s t f) (h₁ : prf Γ (s ≃ t)) (h₂ : prf Γ (f[s // 0])) :
          prf Γ (f[t // 0])

A member of prf Γ A is a proof tree encoding a derivation of A from Γ. Note that prf is Type- instead of Prop-valued, so different members of prf Γ A are not definitionally equal.

2.2 Completeness

As part of our formalization of first-order logic, we completed a verification of the Gödel completeness theorem. Although our present development of forcing did not require it, we anticipate that it will be useful later, e.g. to prove the downward Löwenheim-Skolem theorem for extracting countable transitive models. Like soundness, it also serves as a proof-of-concept and stress-test of our chosen encoding of first-order logic.

For our formalization, we chose the Henkin-style approach of constructing a canonical term model. In order to perform the argument, which normally involves modifying the language “in place” to iteratively add new constant symbols, we had to adapt it to type theory. Since our languages are represented by pairs of indexed types instead of sets, we cannot really modify them in-place with new constant symbols. Instead, at each step of the construction, we must construct an entirely new language in which the previous one embeds, and in the limit we must compute a directed colimit of types instead of a union. This construction induces similar constructions on terms and formulas, and completing the argument requires reasoning with all of them. As a result of our design decisions, only a few arguments required anything more than straightforward case-analysis and structural induction. The final statement makes no restrictions on the cardinality of the language:

  theorem completeness {L : Language} (T : Theory L) (ψ : sentence L) : T ⊢' ψ ↔ T ⊨ ψ

2.3 Boolean-valued semantics for first-order logic

A complete Boolean algebra is a type 𝔹 equipped with the structure of a Boolean algebra and additionally operations Inf and Sup (which we write as ⨅ and ⨆) returning the infimum and supremum of an arbitrary collection of members of 𝔹. We use ⊓, ⊔, ⟹, ⊤, and ⊥ to denote meet, join, material implication, and top/bottom elements. For more details on complete Boolean algebras, we refer the reader to the textbook of Halmos-Givant [13].
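Since material implication does most of the work in what follows, it may help to record its definition and two consequences. These are standard Boolean algebra facts, stated here as a reminder rather than as lemmas quoted from the formalization.

```latex
% requires amsmath
\begin{align*}
a \Rightarrow b &:= \lnot a \sqcup b, \\
a \sqcap b \le c &\iff a \le (b \Rightarrow c), \\
a \sqcap (a \Rightarrow b) &= (a \sqcap \lnot a) \sqcup (a \sqcap b) = a \sqcap b \le b .
\end{align*}
```

The second line is the Boolean form of the deduction theorem, and the third is modus ponens.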

Definition.

Fix a language L and a complete Boolean algebra 𝔹. A 𝔹-valued structure is an instance of the following structure:

structure bStructure :=
(carrier : Type u)
(fun_map : ∀{n}, L.functions n → vector carrier n → carrier)
(rel_map : ∀{n}, L.relations n → vector carrier n → 𝔹)
(eq : carrier → carrier → 𝔹)
(eq_refl : ∀ x, eq x x = ⊤)
(eq_symm : ∀ x y, eq x y = eq y x)
(eq_trans : ∀{x} y {z}, eq x y ⊓ eq y z ≤ eq x z)
(fun_congr : ∀{n} (f : L.functions n) (x y : vector carrier n),
  ⨅(map2 eq x y) ≤ eq (fun_map f x) (fun_map f y))
(rel_congr : ∀{n} (R : L.relations n) (x y : vector carrier n),
  ⨅(map2 eq x y) ⊓ rel_map R x ≤ rel_map R y)

Above, “⨅(map2 eq x y)” means “the infimum of the list whose i-th entry is eq applied to x[i] and y[i]”.

Note that Boolean-valued equality is not really an equivalence relation, but “𝔹 thinks it is”. One complication which then arises in Boolean-valued semantics is keeping track of the congruence lemmas for formulas. However, as part of the proof of the soundness theorem shows, once these extensionality proofs are provided for the basic symbols in the language, they extend by structural induction to all formulas.
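As an illustration of how congruence extends by structural induction, here is the implication case worked out by hand. The notation ⟦·⟧ for truth-values and the abbreviation e are assumptions made for this sketch; variable assignments are suppressed.

```latex
% requires amsmath, amssymb, stmaryrd
\begin{align*}
&\text{Claim: } e \sqcap \llbracket \varphi(x) \rrbracket \le \llbracket \varphi(y) \rrbracket
  \quad\text{where } e := \llbracket x = y \rrbracket = \llbracket y = x \rrbracket . \\
&\text{Case } \varphi = (\psi_1 \Rightarrow \psi_2), \text{ with induction hypotheses }
  e \sqcap \llbracket \psi_1(y) \rrbracket \le \llbracket \psi_1(x) \rrbracket
  \text{ and } e \sqcap \llbracket \psi_2(x) \rrbracket \le \llbracket \psi_2(y) \rrbracket : \\
&\quad e \sqcap \bigl(\llbracket \psi_1(x) \rrbracket \Rightarrow \llbracket \psi_2(x) \rrbracket\bigr) \sqcap \llbracket \psi_1(y) \rrbracket
  \;\le\; e \sqcap \bigl(\llbracket \psi_1(x) \rrbracket \Rightarrow \llbracket \psi_2(x) \rrbracket\bigr) \sqcap \llbracket \psi_1(x) \rrbracket \\
&\quad \;\le\; e \sqcap \llbracket \psi_2(x) \rrbracket
  \;\le\; \llbracket \psi_2(y) \rrbracket .
\end{align*}
```

By the deduction theorem this gives e ⊓ ⟦φ(x)⟧ ≤ (⟦ψ₁(y)⟧ ⟹ ⟦ψ₂(y)⟧) = ⟦φ(y)⟧. Note that the hypothesis for ψ₁ is used with x and y swapped, which is where symmetry of Boolean-valued equality enters.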

2.4 The soundness theorem

A soundness theorem says that a proof tree may be replayed to produce an actual proof in the object of truth-values. When the object of truth-values is Prop, this says that a proof tree compiles to a proof term. When the object of truth-values is a Boolean algebra, this says that the proof tree becomes an internal implication from the interpretation of the context to the interpretation of the conclusion:

  lemma boolean_soundness {Γ : set (formula L)} {A : formula L}
  (H : Γ ⊢ A) : ∀ M, (⨅γ ∈ Γ, M[γ]) ≤ M[A]

Of course, we also formalized the ordinary soundness theorem. As a result of our design decisions, the proofs of both the ordinary and Boolean-valued soundness theorems were straightforward structural inductions.
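To indicate why the induction is straightforward, here are the two implication cases of the Boolean-valued soundness proof written out by hand. This is a sketch in textbook notation (⟦·⟧ for the interpretation in a fixed structure, variable assignments suppressed), not a transcription of the Lean proof.

```latex
% requires amsmath, amssymb, stmaryrd
\begin{align*}
\textsf{impI:}\quad
  & \Bigl(\bigsqcap_{\gamma \in \Gamma} \llbracket \gamma \rrbracket\Bigr) \sqcap \llbracket A \rrbracket \le \llbracket B \rrbracket
  \;\Longrightarrow\;
  \bigsqcap_{\gamma \in \Gamma} \llbracket \gamma \rrbracket \le \bigl(\llbracket A \rrbracket \Rightarrow \llbracket B \rrbracket\bigr) = \llbracket A \Rightarrow B \rrbracket, \\
\textsf{impE:}\quad
  & \bigsqcap_{\gamma \in \Gamma} \llbracket \gamma \rrbracket \le \bigl(\llbracket A \rrbracket \Rightarrow \llbracket B \rrbracket\bigr) \sqcap \llbracket A \rrbracket \le \llbracket B \rrbracket .
\end{align*}
```

In the first case the hypothesis is the induction hypothesis for the context insert A Γ, whose infimum is the infimum over Γ met with ⟦A⟧; the conclusion is then exactly the deduction theorem. The second case is modus ponens.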

3 Constructing Boolean-valued models of set theory

Throughout this section, we fix a universe level u and a complete Boolean algebra 𝔹 : Type u.

In set theory (see e.g. Jech [23] or Bell [5]), Boolean-valued models are obtained by imitating the construction of the von Neumann cumulative hierarchy via a transfinite recursion where iterations of the powerset operation (taking functions into 2 = {true, false}) are replaced by iterations of the “𝔹-valued powerset operation” (taking functions into 𝔹).

Since this construction by transfinite recursion does not easily translate into type theory, our construction of Boolean-valued models of set theory is instead a variation on a well-known encoding originally due to Aczel [1, 3, 2]. This encoding was adapted by Werner [42] to encode ZFC into Coq, whose metatheory is close to that of Lean. Werner’s construction was implemented in Lean’s mathlib by Carneiro, as part of [9]. In this approach, one takes a universe of types Type u as the starting point and then imitates the cumulative hierarchy by constructing the inductive type

inductive pSet : Type (u+1)
| mk (α : Type u) (A : α → pSet) : pSet

The Aczel-Werner encoding is closely related to the recursive definition of names, which is used in forcing to construct forcing extensions:

Definition.

Let ℙ be a partial order (which one thinks of as a collection of forcing conditions). A ℙ-name is a collection of pairs (y, p) where y is a ℙ-name and p ∈ ℙ.

If ℙ consists of only one element, then a ℙ-name is specified by essentially the same information as a member of the inductive type pSet above. Conversely, specializing ℙ to an arbitrary complete Boolean algebra 𝔹, we generalize the definition of pSet.mk so that elements are recursively assigned Boolean truth-values:

inductive bSet (𝔹 : Type u) [complete_boolean_algebra 𝔹] : Type (u+1)
| mk (α : Type u) (A : α → bSet) (B : α → 𝔹) : bSet

Thus bSet 𝔹 is the type of 𝔹-names, and will be the underlying type of our Boolean-valued model of set theory. For convenience, if x : bSet 𝔹 and x := ⟨α, A, B⟩, we put x.type := α, x.func := A, x.bval := B.

3.1 Boolean-valued equality and membership

In pSet, equivalence of sets is defined by structural recursion as follows: two sets x and y are equivalent if and only if for every w ∈ x, there exists a w' ∈ y such that w is equivalent to w', and vice-versa. Analogously, by translating quantifiers and connectives into operations on 𝔹, Boolean-valued equality is defined in the same way:

def bv_eq : ∀ (x y : bSet 𝔹), 𝔹
| ⟨α, A, B⟩ ⟨α', A', B'⟩ :=
             (⨅a : α, B a ⟹ ⨆a', B' a' ⊓ bv_eq (A a) (A' a')) ⊓
               (⨅a' : α', B' a' ⟹ ⨆a, B a ⊓ bv_eq (A a) (A' a'))

We abbreviate bv_eq with the infix operator =ᴮ. With equality in place, it is easy to define membership by translating “x is a member of y if and only if there exists a w indexed by the type of y such that x = w.” As with equality, we denote 𝔹-valued membership by ∈ᴮ.

def mem : bSet 𝔹 → bSet 𝔹 → 𝔹
| a ⟨α', A', B'⟩ := ⨆a', B' a' ⊓ a =ᴮ A' a'
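A small hand computation shows how these definitions behave. Take the empty name ∅ (indexed by the empty type) and the name x := ⟨{∗}, ∗ ↦ ∅, ∗ ↦ b⟩ for a fixed b : 𝔹, i.e. “∅ belongs to x with truth-value b”. The name x and the bracket notation ⟦·⟧ are assumptions made for this example; the computation uses only that an empty infimum is ⊤ and an empty supremum is ⊥.

```latex
% requires amsmath, amssymb, stmaryrd
\begin{align*}
\llbracket \emptyset = \emptyset \rrbracket
  &= \Bigl(\bigsqcap_{a \in \emptyset} \cdots\Bigr) \sqcap \Bigl(\bigsqcap_{a' \in \emptyset} \cdots\Bigr)
   = \top \sqcap \top = \top, \\
\llbracket \emptyset \in x \rrbracket
  &= \bigsqcup_{a' \in \{*\}} \bigl( b \sqcap \llbracket \emptyset = \emptyset \rrbracket \bigr)
   = b \sqcap \top = b, \\
\llbracket x = \emptyset \rrbracket
  &= \Bigl( b \Rightarrow \bigsqcup_{a' \in \emptyset} \cdots \Bigr) \sqcap \Bigl(\bigsqcap_{a' \in \emptyset} \cdots\Bigr)
   = (b \Rightarrow \bot) \sqcap \top = \lnot b .
\end{align*}
```

So x is empty exactly to the extent that b fails, which matches the reading of b as the truth-value of “∅ ∈ x”.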

3.2 Automation and metaprogramming for reasoning in 𝔹

As Scott stresses in [36], “A main point … is that the well-known algebraic characterizations of [complete Heyting algebras] and [complete Boolean algebras] exactly mimic the rules of deduction in the respective logics.” Indeed, that is really why the Boolean-valued soundness theorem is true. One thinks of the symbol ≤ in an inequality of Boolean truth-values as a turnstile in a proof state: the conjunctands on the left as a list of assumptions in context, and the quantity on the right as the goal. For example, given a b : 𝔹, the identity (a ⟹ b) ⊓ a ≤ b could be proven by unfolding the definition of material implication, but it is really just modus ponens; similarly, given an indexed family a : I → 𝔹, (⨆i, a i) ≤ b ↔ ∀ i, a i ≤ b is just ∃-elimination.

Difficulties arise when the statements to be proved become only slightly more complicated. Consider the following example, which should be “by assumption”:

  ∀ a b c d e f g : 𝔹, (d ⊓ e) ⊓ (f ⊓ g ⊓ ((b ⊓ a) ⊓ c)) ≤ a

or slightly less trivially, the following example where the goal is attainable by “just applying a hypothesis to an assumption”

  ∀ a b c d : 𝔹, (a ⟹ b) ⊓ c ⊓ (d ⊓ a) ≤ b

There are three ways to deal with goals like these, which approximately describe the evolution of our approach. First, one can try using the basic lemmas in mathlib, using the simplifier to normalize expressions, and performing clever rewrites with the deduction theorem. (The deduction theorem in a Boolean algebra says that for all a, b, and c, a ⊓ b ≤ c if and only if a ≤ b ⟹ c.) Second, one can take the LCF-style approach and expand the library of lemmas with increasingly sophisticated derived inference rules. Third, one can make the following observation:

Lemma (Yoneda lemma for posets). Let (P, ≤) be a partially ordered set and let a, b ∈ P. Then a ≤ b if and only if ∀ Γ ∈ P, Γ ≤ a → Γ ≤ b.

This is a consequence of the Yoneda lemma for partially ordered sets, and its proof is utterly trivial. However, one side of the equivalence is much easier for Lean to reason with. Take the example which should have been “by assumption”. The following proof, in which the user navigates down the binary tree of nested ⊓s, will work:

example {a b c d e f g : 𝔹} : (d ⊓ e) ⊓ (f ⊓ g ⊓ ((b ⊓ a) ⊓ c)) ≤ a :=
by {apply inf_le_right_of_le, apply inf_le_right_of_le,
    apply inf_le_left_of_le, apply inf_le_right_of_le, refl}

But if we use the right-hand side of the Yoneda lemma for posets instead, then after some preprocessing, assumption will literally work:

example {a b c d e f g : 𝔹} : (d ⊓ e) ⊓ (f ⊓ g ⊓ ((b ⊓ a) ⊓ c)) ≤ a :=
by {tidy_context, assumption}
-- tidy_context applies `poset_yoneda`, introduces a hypothesis `H`,
-- uses simp at H to convert ⊓s to ∧s, and automatically splits
/- Goal state before `assumption`:
H_right_right_left_left : Γ ≤ b,
H_right_right_left_right : Γ ≤ a
⊢ Γ ≤ a -/
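The equivalence behind this preprocessing step is short enough to check directly. The following Lean 3 sketch proves both directions for an arbitrary partial order; the lemma name poset_yoneda_iff is hypothetical, and the import assumes a Lean 3 mathlib installation (in some Lean 3 versions partial_order is already available from core).

```lean
import order.basic

-- Hypothetical name; a sketch of the poset Yoneda equivalence.
lemma poset_yoneda_iff {P : Type*} [partial_order P] {a b : P} :
  a ≤ b ↔ ∀ Γ : P, Γ ≤ a → Γ ≤ b :=
⟨λ h Γ hΓ, le_trans hΓ h,   -- forward: compose Γ ≤ a with a ≤ b
 λ H, H a (le_refl a)⟩       -- backward: instantiate Γ := a
```

The backward direction is the one a tactic would apply to a goal a ≤ b: it replaces the goal by one with a fresh context Γ and a hypothesis Γ ≤ a that can then be split along the nested infima.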

A key feature of Lean is that it is its own metalanguage, allowing for seamless in-line definitions of custom tactics. This feature was an invaluable asset, as it allowed the rapid development of a custom tactic library for simulating natural-deduction style proofs inside 𝔹 after applying the Yoneda lemma for posets. Boolean-valued versions of natural deduction rules like ∧/∨-elimination, instantiation of existentials, implication introduction, and even basic automation were easy to write. The result is that the user is able to pretend, with absolute rigor, that they are simply writing proofs in first-order logic while calculations in the complete Boolean algebra are being performed under the hood.

One use-case where automation is crucial is context-specialization. For example, suppose that after preprocessing with poset_yoneda, the goal is Γ ≤ a ⟹ b, and one would like to “introduce the implication”, adding Γ ≤ a to context and reducing the goal to Γ ≤ b. This is impossible as stated. Rather, the deduction theorem lets us rewrite the goal to Γ ⊓ a ≤ b, and now we may add Γ ⊓ a ≤ a. So we may introduce the implication after all, but at the cost of specializing the context Γ to the smaller context Γ' := Γ ⊓ a. But now, in order for the user to continue the pretense that they are merely doing first-order logic, this change of variables must be propagated to the rest of the assumptions which may still be of the form Γ ≤ _, which is extremely tedious to do by hand, but easy to automate.
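Written as a chain of inequalities, the bookkeeping that such a tactic has to perform is the following. This is a hand-written summary of the steps just described, not output from the tactic library.

```latex
% requires amsmath
\begin{align*}
\Gamma \le (a \Rightarrow b) &\iff \Gamma \sqcap a \le b && \text{(deduction theorem)} \\
\Gamma' := \Gamma \sqcap a &\;\Longrightarrow\; \Gamma' \le a \ \text{ and } \ \Gamma' \le \Gamma \\
\Gamma \le c &\;\Longrightarrow\; \Gamma' \le c && \text{(transitivity, for every hypothesis } \Gamma \le c\text{)}
\end{align*}
```

After the last step every hypothesis and the goal mention only Γ', so the proof state again looks like an ordinary natural-deduction context.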

3.3 The fundamental theorem of forcing

The fundamental theorem of forcing for Boolean-valued models [17] states that for any complete Boolean algebra 𝔹, V^𝔹 is a Boolean-valued model of ZFC. Since, in type theory, a type universe Type u takes the place of the standard universe V, the analogous statement in our setting is that for every complete Boolean algebra 𝔹, bSet 𝔹 is a Boolean-valued model of ZFC.

Bell [5] gives an extremely detailed account of the verification of the axioms, and we faithfully followed his presentation for this part of the formalization. Most of it is routine. We describe some aspects of bSet 𝔹 which are revealed by this verification.


Definition.

From the definitions of pSet and bSet, one immediately sees that there is a canonical map check : pSet → bSet 𝔹, defined by

def check : pSet → bSet 𝔹
| ⟨α,A⟩ := ⟨α, λ a, check (A a), λ a, ⊤⟩

We call members of the image of check check-names (this terminology is standard, cf. [17, 28]), after the usual diacritic notation x̌ for check (x : pSet). These are also known as canonical names, as they are the canonical representation of standard two-valued sets inside a Boolean-valued model of set theory. (We were pleased to discover Lean’s support for custom notation allowed us to declare the Unicode modifier character U+030C as a postfix operator for check.)

The axiom of infinity

ω : bSet 𝔹 is (pSet.omega)̌. pSet.omega is defined in pSet to be the collection of all finite von Neumann ordinals, which are defined by induction on ℕ. While it is easy to show ω satisfies the axiom of infinity

def axiom_of_infinity_spec (u : bSet 𝔹) : 𝔹 :=
  (∅ ∈ᴮ u) ⊓ (⨅i_x, ⨆i_y, (u.func i_x ∈ᴮ u.func i_y))

it can furthermore be shown to satisfy the universal property of ω, which says that ω is a subset of any set which contains ∅ and is closed under the successor operation x ↦ x ∪ {x}.

The axiom of powerset

Definition.

Fix a 𝔹-valued set x = ⟨α, A, B⟩. Let χ : α → 𝔹 be a function. The subset of x associated to χ is a 𝔹-valued set defined as follows:

def set_of_indicator {x} (χ : x.type → 𝔹) := ⟨x.type, x.func, χ⟩

The powerset 𝒫(x) of x is defined to be the following 𝔹-valued set, whose underlying type is the type of all functions x.type → 𝔹:

def bv_powerset (u : bSet 𝔹) : bSet 𝔹 :=
⟨u.type → 𝔹, λ f, set_of_indicator f, λ f, set_of_indicator f ⊆ᴮ u⟩

The axiom of choice

Following Bell, we verified Zorn’s lemma, which is provably equivalent over ZF to the axiom of choice. As is the case with pSet, establishing the axiom of choice requires the use of a choice principle from the metatheory. This was the most involved part of our verification of the fundamental theorem of forcing, and relies on the technical tool of mixtures, which allow sequences of 𝔹-valued sets to be “averaged” into new ones, and the maximum principle, which allows existentially quantified statements to be instantiated without changing their truth-value.

The smallness of 𝔹

We end this section by remarking that the “smallness” of 𝔹 (or more precisely, the fact that 𝔹 lives in the same universe of types out of which bSet 𝔹 is being built) is essential in making bSet 𝔹 a model of ZFC. It is required for extracting the witness needed for the maximum principle, and is also required to even define the powerset operation, because the underlying type of the powerset is the function type of all maps into 𝔹.

4 Forcing

4.1 Representing Lean’s ordinals inside pSet and bSet

The treatment of ordinals in mathlib associates a class of ordinals to every type universe, defined as isomorphism classes of well-ordered types, and includes interfaces for both well-founded and transfinite recursion. Lean’s ordinals may be represented inside pSet by defining a map ordinal.mk : ordinal → pSet via transfinite recursion; it is nothing more than the von Neumann definition of ordinals. In pseudocode,

def ordinal.mk : ordinal → pSet
| 0 := ∅
| succ ξ := pSet.succ (ordinal.mk ξ)  -- (mk ξ ∪ {mk ξ})
| is_limit ξ := ⋃ η < ξ, (ordinal.mk η)
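Unfolding this recursion at the first few ordinals recovers the familiar von Neumann ordinals. The display below writes mk for ordinal.mk and is a hand computation from the pseudocode, not output from Lean.

```latex
% requires amsmath
\begin{align*}
\mathrm{mk}(0) &= \emptyset, \qquad
\mathrm{mk}(1) = \{\emptyset\}, \qquad
\mathrm{mk}(2) = \{\emptyset, \{\emptyset\}\}, \\
\mathrm{mk}(\xi + 1) &= \mathrm{mk}(\xi) \cup \{\mathrm{mk}(\xi)\}, \\
\mathrm{mk}(\omega) &= \bigcup_{n < \omega} \mathrm{mk}(n) = \{\mathrm{mk}(0), \mathrm{mk}(1), \mathrm{mk}(2), \dots\}.
\end{align*}
```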

Composing by check (subsection 3.3) yields a map check ∘ ordinal.mk : ordinal → bSet 𝔹. (We could just as well have defined ordinal.mk' : ordinal → bSet 𝔹 analogously to ordinal.mk without reference to check, such that ordinal.mk' = check ∘ ordinal.mk; the point is that there is a link between the metatheory’s notion of size and order with that of the forcing extension.)

Cardinals in Lean are defined separately from ordinals as bijective equivalence classes of types, but are canonically represented by ordinals which are not bijective with any predecessor. We let aleph : ordinal → ordinal index these representatives. For the rest of this section, unadorned alephs (e.g. “ℵ₂”) will mean either an ordinal of the form aleph ξ or a choice of representative from the isomorphism class of well-ordered types, and checked alephs (e.g. “ℵ₂̌”) will mean the check ∘ ordinal.mk of that ordinal.

4.2 The Cohen poset and the regular open algebra

Forcing with partial orders and forcing with complete Boolean algebras are related by the fact that every poset of forcing conditions can be embedded into a complete Boolean algebra as a dense suborder. This will be the case for our forcing argument: our Boolean algebra is the algebra of regular opens on (we identify this space with the subsets of ), and the poset of forcing condition embeds in this Boolean algebra as a dense suborder.

Definition.

The Cohen poset for adding ℵ₂-many Cohen reals is the collection of all finite partial functions ℵ₂ × ℕ → 2, ordered by reverse inclusion.

In the formalization, the Cohen poset is represented as a structure with three fields:

structure 𝒞 : Type :=
  (ins : finset (ℵ₂.type × ℕ))
  (out : finset (ℵ₂.type × ℕ))
  (H : ins ∩ out = ∅)

That is, we identify a finite partial function f with the triple ⟨f.ins, f.out, f.H⟩, where f.ins is the preimage of {1}, f.out is the preimage of {0}, and f.H ensures well-definedness. While a condition f is usually defined as a finite partial function, we found that in practice f is only needed to give a finite partial specification of a subset of ℵ₂ × ℕ (i.e. a finite set f.ins which must be in the subset, and a finite set f.out which must not be in the subset), and chose this representation to make that information immediately accessible.

Definition.

Let X be a topological space, and for any open set U, let U^⊥ denote the complement of the closure of U. The regular open algebra of a topological space X, written RO(X), is the collection of all open sets U such that U = (U^⊥)^⊥, equipped with the structure of a complete Boolean algebra, with x ⊓ y := x ∩ y, x ⊔ y := ((x ∪ y)^⊥)^⊥, ¬x := x^⊥, and ⨆ x_i := ((⋃ x_i)^⊥)^⊥.

The Boolean algebra which we will use for forcing is RO(2^(ℵ₂ × ℕ)). Unless stated otherwise, for the rest of this section, we put 𝔹 := RO(2^(ℵ₂ × ℕ)).

Definition.

We define the canonical embedding of the Cohen poset into 𝔹 as follows:

def ι : 𝒞 → 𝔹 := λ p, {S | p.ins ⊆ S ∧ p.out ⊆ - S}

That is, we send each c : 𝒞 to all the subsets which satisfy the specification given by c. This is a clopen set, hence regular. Crucially, this embedding is dense:

lemma 𝒞_dense {b : 𝔹} (H : ⊥ < b) : ∃ p : 𝒞, ι p ≤ b

Recalling that ≤ in 𝔹 is subset-inclusion, we see that this is essentially because the image of ι is the standard basis for the product topology. Our chosen encoding of the Cohen poset also made it easier to perform this identification when formalizing this proof.
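The relationship between reverse inclusion on conditions and subset-inclusion on their images under ι can be illustrated in a simplified setting. The sketch below is not the formalization's code: it replaces finset by list and ℵ₂.type × ℕ by an arbitrary type X so that it depends only on the Lean 3 core library, and the names spec, sat, extends_spec and sat_mono are hypothetical.

```lean
/-- Toy finite specification of a subset of `X` (lists stand in for `finset`). -/
structure spec (X : Type) :=
(ins : list X)
(out : list X)
(H : ∀ x : X, x ∈ ins → x ∉ out)

variable {X : Type}

/-- The subsets of `X` satisfying the specification `p`; plays the role of `ι p`. -/
def sat (p : spec X) : set (set X) :=
{S | (∀ x : X, x ∈ p.ins → x ∈ S) ∧ (∀ x : X, x ∈ p.out → x ∉ S)}

/-- `p` extends `q`: every constraint imposed by `q` is also imposed by `p`. -/
def extends_spec (p q : spec X) : Prop :=
(∀ x : X, x ∈ q.ins → x ∈ p.ins) ∧ (∀ x : X, x ∈ q.out → x ∈ p.out)

/-- A condition with more constraints is satisfied by fewer subsets. -/
lemma sat_mono {p q : spec X} (h : extends_spec p q) : sat p ⊆ sat q :=
begin
  intros S hS,
  cases hS with h_in h_out,
  cases h with hi ho,
  exact ⟨λ x hx, h_in x (hi x hx), λ x hx, h_out x (ho x hx)⟩
end
```

This is the order-preservation half of what makes the embedding usable: a condition that extends another (and is therefore smaller under reverse inclusion) is sent to a smaller set. Density is the harder half and relies on the topology.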

4.3 Adding ℵ₂-many distinct Cohen reals

As we saw in subsection 3.3, for any 𝔹-valued set x, characteristic functions into 𝔹 from the underlying type of x determine 𝔹-valued subsets of x. While the ingredients ℵ₂ and ℕ for 𝔹 are types and thus external to bSet 𝔹, they are represented nonetheless inside bSet 𝔹 by their check-names ℵ₂̌ and ℕ̌, and in fact ℵ₂ is ℵ₂̌.type and ℕ is ℕ̌.type. Given our specific choice of 𝔹, this will allow us to construct an ℵ₂-indexed family of distinct subsets of ℕ̌, which we can then convert into an injective function from ℵ₂̌ to 𝒫(ℕ̌), inside bSet 𝔹.

Definition.

Let X be a type. For any x : X, the collection of all subsets of X which contain x is a regular open of 2^X, called the principal open P_x over x.

Definition.

Let ν < ℵ₂. We associate to ν the 𝔹-valued characteristic function χ_ν : ℕ → 𝔹 defined by χ_ν(n) := P_(ν,n). In light of our previous observations, we see that each χ_ν induces a new 𝔹-valued subset of ℕ̌. We call this subset a Cohen real.

This gives us an ℵ₂-indexed family of Cohen reals. Converting this data into an injective function from ℵ₂̌ to 𝒫(ℕ̌) inside bSet 𝔹 requires some care. One must check that the assignment of Cohen reals to indices ν < ℵ₂ is externally injective, and this is where the characterization of the Cohen poset as a dense subset of 𝔹 (and moving back and forth between this representation and the definition as finite partial functions) comes in. Furthermore, one has to develop machinery similar to that for the powerset operation to convert an external injective function x.type → bSet 𝔹 to a 𝔹-valued set which bSet 𝔹 thinks is an injective function, while maintaining conditions on the intended codomain. Our custom tactics and automation for reasoning inside 𝔹 made this latter task significantly easier than it would have been otherwise. We refer the interested reader to our formalization for details.

4.4 Preservation of cardinal inequalities

So far, we have shown for 𝔹 = RO(2^(ℵ₂ × ℕ)) that bSet 𝔹 thinks ℵ₂̌ is smaller than 𝒫(ℕ̌). Although Lean believes there is a strict inequality of cardinals ℵ₀ < ℵ₁ < ℵ₂, in general we can only deduce that their representations inside bSet 𝔹 are subsets of each other: ℵ₀̌ ⊆ ℵ₁̌ ⊆ ℵ₂̌. To finish negating CH, it suffices to show that bSet 𝔹 thinks ℵ₀̌ is strictly smaller than ℵ₁̌, and that bSet 𝔹 thinks ℵ₁̌ is strictly smaller than ℵ₂̌. That is, for cardinals κ, we want the passage from κ to κ̌ to preserve cardinal inequalities.

Definition.

For our purposes, “X is strictly smaller than Y” means “there exists no function f such that for every y ∈ Y, there exists an x ∈ X such that (x,y) ∈ f”. Thus, “X is strictly smaller than Y” translates to the Boolean truth-value

-(⨆f, (is_func f) ⊓ ⨅y, y ∈ᴮ Y ⟹ ⨆x, x ∈ᴮ X ⊓ (x, y) ∈ᴮ f).

We abbreviate this with “X ≺ Y”.
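For comparison, the two-valued analogue of this definition can be stated directly in Lean's own logic, where the simplest instance is Cantor's theorem. The sketch below is an illustration under simplifying assumptions, not part of the formalization: it uses Lean functions X → Y in place of functional relations inside bSet 𝔹, the names strictly_smaller and cantor are hypothetical, and only the Lean 3 core library is assumed.

```lean
/-- Two-valued analogue: there is no function from `X` onto `Y`. -/
def strictly_smaller (X Y : Type) : Prop :=
¬ ∃ f : X → Y, ∀ y : Y, ∃ x : X, f x = y

/-- Cantor's theorem: every type is strictly smaller than its type of predicates. -/
theorem cantor (X : Type) : strictly_smaller X (X → Prop) :=
begin
  intro h,
  cases h with f hf,
  -- the diagonal predicate must be hit by some `x`
  cases hf (λ a, ¬ f a a) with x hx,
  have e : f x x = (¬ f x x) := congr_fun hx x,
  have hn : ¬ f x x := λ h', (eq.mp e h') h',
  exact hn (eq.mpr e hn)
end
```

In bSet 𝔹 the same statement becomes a Boolean truth-value rather than a proposition, which is why the existential and universal quantifiers above turn into the suprema and infima of the displayed formula.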

The condition on an arbitrary 𝔹 which ensures the preservation of cardinal inequalities is the countable chain condition.

Definition.

We say that 𝔹 has the countable chain condition (CCC) if every antichain 𝒜 : I → 𝔹 (i.e. an indexed collection of elements 𝒜 = {a_i} such that a_i ⊓ a_j = ⊥ whenever i ≠ j) has a countable image.

We sketch the argument that CCC implies the preservation of cardinal inequalities. The proof is by contraposition. Let κ₁ and κ₂ be cardinals such that κ₁ < κ₂, and suppose that κ₁̌ is not strictly smaller than κ₂̌. Then there exists some f : bSet 𝔹 and some Γ > ⊥ such that Γ ≤ (is_func f) ⊓ ⨅y, y ∈ᴮ κ₂̌ ⟹ ⨆x, x ∈ᴮ κ₁̌ ⊓ (x,y) ∈ᴮ f. Then one can show:

lemma AE_of_check_larger_than_check :
  ∀ β < κ₂, ∃ η < κ₁, ⊥ < (is_func f) ⊓ (η̌, β̌) ∈ᴮ f

The name of this lemma emphasizes what has happened here: given this f and the assumption that it satisfies some ∀-∃ formula inside bSet 𝔹, we are able to extract, by virtue of κ₁̌ and κ₂̌ being check-names, a ∀-∃ statement in the metatheory. Using Lean’s choice principle, we can then convert this ∀-∃ statement into a function g : κ₂ → κ₁, such that for every β < κ₂, ⊥ < (is_func f) ⊓ (g(β)̌, β̌) ∈ᴮ f. Since κ₁ < κ₂, it follows from the infinite pigeonhole principle that there exists some η < κ₁ such that the preimage g⁻¹({η}) is uncountable. Define 𝒜 : g⁻¹({η}) → 𝔹 by 𝒜(β) := (is_func f) ⊓ (g(β)̌, β̌) ∈ᴮ f. This is an uncountable antichain because if β₁ ≠ β₂, then, since g(β₁) = g(β₂) = η, the well-definedness part of is_func f ensures that 𝒜(β₁) ⊓ 𝒜(β₂) ≤ β₁̌ =ᴮ β₂̌, and the truth-value β₁̌ =ᴮ β₂̌ is ⊥.

Thus, conditional on showing that 𝔹 has the CCC, we now have that cardinal inequalities are preserved in bSet 𝔹. Combining this with the injection ℵ₂̌ ≼ 𝒫(ℕ̌), we obtain:

theorem neg_CH : ⊤ = (ℕ ≺ (ℵ₁)̌ ⊓ (ℵ₁)̌ ≺ (ℵ₂)̌ ⊓ (ℵ₂)̌ ≼ 𝒫(ℕ))

The arguments sketched in subsection 4.3 and subsection 4.4 form the heart of the forcing argument. Their proofs involve taking objects in Type u and bSet 𝔹, constructing corresponding objects on the other side, and reasoning about them in ordinary and 𝔹-valued logic simultaneously to determine cardinalities in bSet 𝔹. We have omitted many details from our discussion, but of course, all the proofs have been formally verified.

4.5 The unprovability of CH

We conclude this section by briefly describing how the previous results may be converted into a formal proof of the unprovability of CH. We work in a conservative expansion of ZFC with an expanded language L_ZFC with symbols for pairing, union, powerset, and ω. We define this expanded theory to be precisely the axioms which were verified in the fundamental theorem of forcing, along with specifications for the new function symbols. CH can then be written as a deeply-embedded sentence (note the use of de Bruijn indices for variables)

def CH : sentence L_ZFC := ¬ ∃’ ∃’ (ω ≺ &1) ⊓ (&1 ≺ &0) ⊓ (&0 ≼ 𝒫(ω))

where ≺ and ≼ are abbreviations with the same meaning as in the previous section. Then proving bSet 𝔹 ⊨ ZFC + ¬CH is a straightforward matter of checking that sentences are interpreted correctly as Boolean truth values which we have already proved to be ⊤. Applying the contrapositive of the Boolean-valued soundness theorem yields the result.
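The role of the de Bruijn indices &1 and &0 in the sentence CH can be seen in a much smaller deep embedding. The following is a sketch only, assuming a toy language rather than the formalization's own encoding of first-order logic; the types tm and fm, their constructors, and CH_toy are hypothetical names, and only the Lean 3 core library is used.

```lean
/-- Terms of a toy language: de Bruijn variables, `ω`, and powerset. -/
inductive tm : Type
| var : ℕ → tm
| omega : tm
| powerset : tm → tm

/-- Formulas of the toy language. `fex φ` binds the variable with index 0 in `φ`
    and shifts the indices of all other free variables down by one. -/
inductive fm : Type
| ssmaller : tm → tm → fm   -- "strictly smaller than"
| smaller  : tm → tm → fm   -- "no larger than"
| fand : fm → fm → fm
| fnot : fm → fm
| fex  : fm → fm

open tm fm

/-- "There are no x, y with ω strictly smaller than x, x strictly smaller than y,
    and y no larger than 𝒫(ω)", with x as index 1 and y as index 0. -/
def CH_toy : fm :=
fnot (fex (fex (fand (ssmaller omega (var 1))
               (fand (ssmaller (var 1) (var 0))
                     (smaller (var 0) (powerset omega))))))
```

No variable names appear: the outer quantifier is referred to by index 1 and the inner one by index 0, which is what makes substitution and comparison of sentences purely structural.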

5 Transfinite combinatorics and the countable chain condition

What remains now is to prove that 𝔹 has the CCC. There are several ways forward; we chose a very general proof using the Δ-system lemma to show more generally that the product of topological spaces satisfies the CCC if every finite subproduct does. Our proof follows Kunen [26].

5.1 The Δ-system lemma

Definition.

A family (A_i)_i of sets is called a Δ-system (or a sunflower or quasi-disjoint) if there is a set r, called the root, such that whenever i ≠ j we have A_i ∩ A_j = r.

def is_delta_system {α ι : Type*} (A : ι → set α) :=
∃(root : set α), ∀{{x y}}, x ≠ y → A x ∩ A y = root
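As a small sanity check of this definition, a pairwise disjoint family is a Δ-system whose root is the empty set. The sketch below restates the definition so that it is self-contained and depends only on the Lean 3 core library; the lemma name delta_system_of_disjoint is hypothetical and does not come from the formalization.

```lean
universes u v

def is_delta_system {α : Type u} {ι : Type v} (A : ι → set α) : Prop :=
∃ (root : set α), ∀ ⦃x y⦄, x ≠ y → A x ∩ A y = root

/-- A pairwise disjoint family is a Δ-system with empty root. -/
lemma delta_system_of_disjoint {α : Type u} {ι : Type v} (A : ι → set α)
  (h : ∀ ⦃x y⦄, x ≠ y → A x ∩ A y = ∅) : is_delta_system A :=
⟨∅, h⟩
```

The interesting direction is the converse kind of statement proved by the lemma itself: from an arbitrary large family of small sets one can always extract a large subfamily whose pairwise intersections all coincide.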

The Δ-system lemma states that if we have an uncountable family of finite sets, there is an uncountable subfamily which forms a Δ-system. In Lean this is formulated as follows (restrict A t is the restriction of the collection A to t).

theorem delta_system_lemma_uncountable {α ι : Type*}
  (A : ι → set α) (h : cardinal.omega < mk ι)
  (h2A : ∀i, finite (A i)) : ∃(t : set ι),
  cardinal.omega < mk t ∧ is_delta_system (restrict A t)

This theorem follows from the following more general statement, taking κ = ℵ₀ and θ = ℵ₁ (for cardinal numbers, the operation c ^< κ, or c^{<κ}, is the supremum of c^ρ for ρ < κ).

theorem delta_system_lemma {α ι : Type u} {κ θ : cardinal}
  (hκ : cardinal.omega ≤ κ) (hκθ : κ < θ) (hθ : is_regular θ)
  (hθ_le : ∀(c < θ), c ^< κ < θ) (A : ι → set α)
  (hA : θ ≤ mk ι) (h2A : ∀i, mk (A i) < κ) :
  ∃(t : set ι), mk t = θ ∧ is_delta_system (restrict A t)

We omit the proof, referring the interested reader to [26] or the formalization.

5.2 𝔹 has the countable chain condition

Definition.

We say that a topological space satisfies the countable chain condition if every family of pairwise disjoint open sets is countable.

We first give a sufficient condition for a product of topological spaces to satisfy the countable chain condition.

Theorem.

If we have a family (X_i)_{i ∈ I} of topological spaces, then the product ∏_{i ∈ I} X_i has the countable chain condition if for every finite J ⊆ I the product ∏_{i ∈ J} X_i has the countable chain condition.

For the proof, suppose we had an uncountable family (U_k)_k of pairwise disjoint open subsets of ∏_{i ∈ I} X_i. By shrinking the U_k, we may assume that each U_k is a basic open set of the form ∏_{i ∈ I} U_{k,i}, where U_{k,i} = X_i for all i outside some finite set F_k ⊆ I. Now the F_k form an uncountable family of finite sets, so by the Δ-system lemma we know that there is an uncountable family of indices K such that (F_k)_{k ∈ K} forms a Δ-system with root R. Now we can take the projections of the U_k onto ∏_{i ∈ R} X_i for k ∈ K. We can show this forms an uncountable disjoint family of opens in ∏_{i ∈ R} X_i: if two of these projections met, then, because the sets F_k overlap only in R, a common point of the corresponding U_k could be assembled coordinate by coordinate. This contradicts the assumption. ∎

With this, the rest of the proof that 𝔹 has the CCC is easy: since every finite product of copies of 2 is a finite topological space, and so satisfies the CCC, it follows that the space 2^(ℵ₂ × ℕ) satisfies the CCC. Also, if a topological space satisfies the CCC then its algebra of regular opens satisfies the CCC, since every antichain of regular opens forms a family of disjoint open sets. Thus, we have shown:

theorem 𝔹_CCC : CCC (regular_opens (set(ℵ₂.type × ℕ)))

6 Related work

First-order logic, soundness, and completeness

There are many existing formalizations of first-order logic. Shankar [39] used a deep embedding of first-order logic to formalize incompleteness theorems. Harrison gives a deeply-embedded implementation of first-order logic in HOL Light [18] and a proof-search style account of the completeness theorem in [19]. Margetson [33] and Schlichtkrull [34] use the same argument for the completeness theorem in Isabelle/HOL, while Berghofer [6] (in Isabelle) and Ilik [22] (in Coq) use canonical term models.

Set theory and forcing

Set theory is a common target for formalization. Notably, a large body of formalized set theory has been completed in Isabelle/ZF, led by Paulson and his collaborators [32, 29, 30]. Most relevantly, this includes a formalization of the relative consistency of the axiom of choice with ZF [31]. Building on this, Gunther, Pagano, and Terraf have begun formalizing the basic ingredients of forcing [15, 16], taking the more conventional approach of generic extensions of countable transitive models.

Our tactic library for Boolean-valued logic was inspired by work of Hudon [21] on Unit-B, using similar techniques to embed a proof language for temporal logic [20]. It was pointed out to the authors that a trick similar to subsection 3.2 had also been successfully applied in the Metamath library [8].

The work we have described in this paper relies heavily on Lean’s mathlib. In particular, the extensive set_theory and ordinal libraries contained nearly everything we needed (including a treatment of cofinalities for the Δ-system lemma), with missing parts easily accessible through existing lemmas. These libraries were originally developed by Carneiro [9], in part to show that Lean proves the existence of infinitely many inaccessible cardinals.

7 Conclusions and future work

Reflections on the proof

As our formalization has shown, for the purposes of a consistency proof, one can perform forcing entirely outside of the set-theoretic foundations in which forcing is usually presented. There is no need to work inside an ambient model of set theory, or to even have a ground model of set theory over which one constructs a forcing extension. Instead, the recursive name construction applied to a universe of types is key. The type universe, with its classical two-valued logic and its own notion of ordinals, takes the place of the standard universe of sets. These external ordinals are then represented in the internal ordinals of the forcing extension by indexing the construction of von Neumann ordinals. With a clever choice of forcing conditions 𝔹, this representation of ordinals will preserve cardinal inequalities and force an uncountable set beneath 𝒫(ℕ̌).

In particular, pSet, being only another special case of the construction which produces bSet 𝔹, is no longer a prerequisite for working with bSet 𝔹, but merely a convenient tool for organizing the check-names—this is the only role it played in the proof. The check-names themselves were actually not necessary either: as we remarked, the canonical map ordinal  bSet 𝔹 can be defined without reference to them. However, since in all of our sources, pSet additionally played the role of the universe of types, and an interface for it was readily available in mathlib, we started our formalization by following the usual arguments, implementing these simplifications as we became aware of them.

Lessons learned

  • Originally, we thought set-theoretic arguments involving transfinite/ordinal induction, which are ubiquitous, would be difficult to implement. In practice, Lean’s tools for well-founded recursion and the comprehensive treatment of ordinals in mathlib made the implementation of such arguments painless.

  • Definitions and lemmas should be stated as generally as possible. This maximizes reusability, minimizes redundancy, and by exposing only the information required to complete the proof, improves the performance of automation.

  • One should invest early in domain-specific automation. The formalization of the fundamental theorem was completed using only the first two strategies outlined in subsection 3.2; the calculations, while tedious, were recorded in our sources and it seemed easier to follow them. If we had followed through on the observations around subsection 3.2 and developed the custom tactic library earlier, we would have saved a significant amount of time.

Towards a formal proof of the independence of the continuum hypothesis

The work we have described in this paper was undertaken as part of the Flypitch project, which aims to produce a formal proof of the independence of the continuum hypothesis. As such, the obvious next goal is a formalization of the consistency of CH. Although it would be possible to do this using Boolean-valued models, we intend to develop the infrastructure necessary to support a proof by forcing with generic extensions, as well as Gödel’s original proof by way of analyzing the constructible universe L.

Although our work includes a formal proof of the unprovability of a version of CH from a version of the ZFC axioms in a conservative extension of the language of ZFC, verifying this is easy. What is more interesting is formalizing the equivalence of various common formulations of CH and ZFC, so that a skeptical user may verify that their preferred version of CH is unprovable from their preferred version of ZFC. This would require formalizations of the conservativity of commonly-used extensions of ZFC, and of the equivalence of the various ways to say that one set is strictly smaller than another. The proof of the completeness theorem already required formalizing nontrivial conservativity statements, which shows that our framework is well-equipped to support such results.

Although the stated goal of our project is to achieve a formal proof of the independence of the continuum hypothesis, we also intend to develop reusable libraries for set theory and mathematical logic. We have completed a formalization of forcing, but are nowhere near completing a library which a set theorist could use to verify their research. Just as, more than 50 years ago, Cohen’s proof marked the beginning of modern research in set theory, a formal proof of the independence of the continuum hypothesis will only mark the beginning of an integration of formal methods into modern research in set theory. This will require robust interfaces for handling the diverse range of forcing arguments and for reasoning about the consistency strengths of various extensions of ZFC, so that, to paraphrase Kanamori [24, 25], deeply-embedded notions of truth and relative consistency become matters of routine manipulation as in algebra. Our work demonstrates that such tasks are well within the scope of modern interactive theorem provers.

8 References


  • [1] Peter Aczel. The type theoretic interpretation of constructive set theory. In Logic colloquium, volume 77, pages 55–66, 1978.
  • [2] Peter Aczel. The type theoretic interpretation of constructive set theory: choice principles. In Studies in Logic and the Foundations of Mathematics, volume 110, pages 1–40. Elsevier, 1982.
  • [3] Peter Aczel. The type theoretic interpretation of constructive set theory: inductive definitions. In Studies in Logic and the Foundations of Mathematics, volume 114, pages 17–49. Elsevier, 1986.
  • [4] Giorgio Bacci, Robert Furber, Dexter Kozen, Radu Mardare, Prakash Panangaden, and Dana Scott. Boolean-valued semantics for the stochastic λ-calculus. In Proceedings of the 33rd Annual ACM/IEEE Symposium on Logic in Computer Science, pages 669–678. ACM, 2018.
  • [5] John L Bell. Set theory: Boolean-valued models and independence proofs, volume 47. Oxford University Press, 2011.
  • [6] Stefan Berghofer. First-order logic according to fitting. Archive of Formal Proofs, August 2007. http://isa-afp.org/entries/FOL-Fitting.html, Formal proof development.
  • [7] Georg Cantor. Ein beitrag zur mannigfaltigkeitslehre. Journal fur die reine und angewandte Mathematik, 84:242–258, 1878.
  • [8] Mario Carneiro. Natural deduction in the metamath proof explorer. http://us.metamath.org/mpeuni/mmnatded.html, 2014. Slides (http://us.metamath.org/ocat/natded.pdf).
  • [9] Mario Carneiro. The type theory of lean. In preparation (https://github.com/digama0/lean-type-theory/releases), 2019.
  • [10] Paul J Cohen. The independence of the continuum hypothesis. Proceedings of the National Academy of Sciences, 50(6):1143–1148, 1963.
  • [11] Paul J Cohen. The independence of the continuum hypothesis, ii. Proceedings of the National Academy of Sciences of the United States of America, 51(1):105, 1964.
  • [12] Leonardo de Moura, Soonho Kong, Jeremy Avigad, Floris Van Doorn, and Jakob von Raumer. The lean theorem prover (system description). In International Conference on Automated Deduction, pages 378–388. Springer, 2015.
  • [13] Steven Givant and Paul Halmos. Introduction to Boolean algebras. Springer Science & Business Media, 2008.
  • [14] Kurt Gödel. The consistency of the axiom of choice and of the generalized continuum-hypothesis. Proceedings of the National Academy of Sciences, 24(12):556–557, 1938.
  • [15] Emmanuel Gunther, Miguel Pagano, and Pedro Sánchez Terraf. First steps towards a formalization of forcing. arXiv preprint arXiv:1807.05174, 2018.
  • [16] Emmanuel Gunther, Miguel Pagano, and Pedro Sánchez Terraf. Mechanization of separation in generic extensions. arXiv preprint arXiv:1901.03313, 2019.
  • [17] Joel David Hamkins and Daniel Evan Seabold. Well-founded boolean ultrapowers as large cardinal embeddings. arXiv preprint arXiv:1206.6075, 2012.
  • [18] John Harrison. Formalizing basic first order model theory. In International Conference on Theorem Proving in Higher Order Logics, pages 153–170. Springer, 1998.
  • [19] John Harrison. Handbook of practical logic and automated reasoning. Cambridge University Press, 2009.
  • [20] Simon Hudon. Temporal logic in unit-b, 2018. https://github.com/unitb/temporal-logic.
  • [21] Simon Hudon, Thai Son Hoang, and Jonathan S. Ostroff. The unit-b method: refinement guided by progress concerns. Software & Systems Modeling, 15:1091–1116, 2015.
  • [22] Danko Ilik. Constructive completeness proofs and delimited control. PhD thesis, Ecole Polytechnique X, 2010.
  • [23] Thomas Jech. Set theory. Springer Science & Business Media, 2013.
  • [24] Akihiro Kanamori. The mathematical development of set theory from cantor to cohen. Bulletin of Symbolic Logic, 2(1):1–71, 1996.
  • [25] Akihiro Kanamori. The higher infinite: large cardinals in set theory from their beginnings. Springer Science & Business Media, 2008.
  • [26] Kenneth Kunen. Set theory an introduction to independence proofs, volume 102. Elsevier, 2014.
  • [27] Yu I Manin. A course in mathematical logic for mathematicians, volume 53. Springer Science & Business Media, 2009.
  • [28] Justin Tatch Moore. The method of forcing. arXiv preprint arXiv:1902.03235, 2019.
  • [29] Lawrence C Paulson. Set theory for verification: I. from foundations to functions. Journal of Automated Reasoning, 11(3):353–389, 1993.
  • [30] Lawrence C Paulson. The reflection theorem: A study in meta-theoretic reasoning. In International Conference on Automated Deduction, pages 377–391. Springer, 2002.
  • [31] Lawrence C Paulson. The relative consistency of the axiom of choice mechanized using isabelle/ zf. LMS Journal of Computation and Mathematics, 6:198–248, 2003.
  • [32] Lawrence C Paulson and Krzysztof Grabczewski. Mechanizing set theory. Journal of Automated Reasoning, 17(3):291–323, 1996.
  • [33] Tom Ridge and James Margetson. A mechanically verified, sound and complete theorem prover for first order logic. In TPHOLs, 2005.
  • [34] Anders Schlichtkrull. Formalization of logic in the isabelle proof assistant. 2018.
  • [35] Dana Scott. A proof of the independence of the continuum hypothesis. Theory of Computing Systems, 1(2):89–111, 1967.
  • [36] Dana Scott. The algebraic interpretation of quantifiers. intuitionistic and classical. Andrzej Mostowski and foundational studies, pages 289–312, 2008.
  • [37] Dana Scott and Robert Solovay. Boolean algebras and forcing. Unpublished manuscript, 1967.
  • [38] Dana S Scott. Stochastic λ-calculi. Journal of Applied Logic, 12(3):369–376, 2014.
  • [39] Natarajan Shankar. Metamathematics, machines and Gödel’s proof, volume 38. Cambridge University Press, 1997.
  • [40] Joseph R Shoenfield. Unramified forcing. In Axiomatic set theory, volume 13, pages 357–381. AMS Providence, RI, 1971.
  • [41] Sebastian Ullrich. Lean 4: A guided preview. https://leanprover.github.io/talks/vu2019.pdf. Slides.
  • [42] Benjamin Werner. Sets in types, types in sets. In International Symposium on Theoretical Aspects of Computer Software, pages 530–546. Springer, 1997.