1 Introduction
The calculus of constructions (CC) is a core theory for dependently typed programming and higher-order constructive logic. Originally introduced in Coquand’s 1985 thesis [4], CC has inspired 25 years of research in programming languages and type theory. Today, extensions of CC form the basis of languages like Coq [17] and Agda [15, 16].
The popularity of CC can be attributed to the combination of its expressiveness and its pleasant metatheoretic properties. Among these properties, one of the most important is strong normalization, which means that there are no infinite reduction sequences from well-typed expressions. This result has two important consequences. First, it implies that CC is consistent as a logic. This makes it an attractive target for the formalization of mathematics. Second, it implies that there is an algorithm to check whether two expressions are convertible. Thus, type checking is decidable and CC provides a practical basis for programming languages.
The strong normalization theorem has traditionally been considered difficult to prove [5, 2]. Coquand’s original proof was found to have at least two errors, but a number of later papers give different, correct proofs [5]. In subsequent years, many authors considered how to extend this result for additional programming constructs like inductive datatypes with recursion, a predicative hierarchy of universes, and large eliminations [18, 9, 11]. Many of these proofs are even more challenging, and several span entire theses.
This document reviews three proofs of strong normalization for CC. Each paper we have chosen proves the theorem by constructing a model of the system in a different domain, and each contributes something novel to the theory of CC and its extensions. The technical details of the models are often complicated and intimidating. Rather than comprehensively verifying and reproducing the proofs, we have focused on painting a clear picture of the beautiful and fascinating mathematical structures that underpin them.
The first proof, originally presented by Geuvers and Nederhof [8] and subsequently popularized by Barendregt [3], models CC in the simpler theory of $F_\omega$. It demonstrates that the strong normalization theorems for CC and $F_\omega$ are equivalent by giving a reduction-preserving translation from the former to the latter. The second, by Geuvers [7], models CC’s types with sets of expressions. The paper demonstrates how the model may be extended to cope with several popular language features, aiming for flexibility. The last proof, from Melliès and Werner [13], uses realizability semantics to consider a large class of type theories, known as the pure type systems, which include CC. The authors’ goal is to prove strong normalization for any pure type system that enjoys a particular kind of realizability model.
Though each paper has a unique focus and models CC in a different semantic system, the overall structures are very similar. After unifying the syntax, the correspondences between certain parts of the proofs are quite striking. Readers are encouraged, for example, to compare the interpretation functions defined in Sections 6.2 and 6.3 with those in Section 5.2. The similarities between the papers speak to the fundamental underlying structure of CC, while their differences illustrate how design choices can push the proof towards varying goals.
The paper is structured as follows: In Sections 2 and 3 we review the definition and basic metatheory of pure type systems and the calculus of constructions. We present the high-level structure of a strong normalization argument in Section 4, then the proofs of Geuvers and Nederhof [8], Geuvers [7] and Melliès and Werner [13] in Sections 5, 6 and 7, respectively. We compare the proofs and conclude in Section 8.
2 Pure Type Systems
The calculus of constructions is one example of a pure type system (PTS). This very general notion, introduced by Berardi and popularized by Barendregt [3], consists of a parameterized lambda calculus which can be instantiated to a variety of well-known type systems. For example, the simply-typed lambda calculus, System F, System $F_\omega$ and CC are all pure type systems. The PTS generalization is convenient because it allows us to simultaneously study the properties of several systems.
A PTS is specified by three parameters. First, the collection of sorts is given by a set $\mathcal{S}$. The typing hierarchy among these sorts is given by a collection of axioms $\mathcal{A}$. Finally, the way product types may be formed is specified by the set of rules $\mathcal{R}$. Figure 1 gives the complete definition of the system.
Choosing to explain CC as a PTS settles several questions of presentation. The terms, types and kinds are collapsed into one grammar. Some authors choose to separate these levels syntactically for easier identification, but we find this version more economical and it is more closely aligned with the three papers under consideration. For the same reasons, we have used $\beta$-conversion in the Conv rule instead of using a separate judgemental equality (as is done, for example, in [2]). Here, $=_\beta$ is the symmetric, transitive, reflexive closure of $\to_\beta$. We do not consider $\eta$-conversion.
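This context also permits a clean and compartmentalized explanation of CC’s features. In most of the systems we consider, the sorts and axioms are given by the sets:
\[ \mathcal{S} = \{\star, \square\} \qquad \mathcal{A} = \{(\star, \square)\} \]
Intuitively, $\star$ classifies types and $\square$ classifies kinds. The lone axiom says that $\star$ is itself a kind. The rule $(\star, \star)$ permits standard function types, whose domain and codomain are both types. The system with only this rule is the simply-typed lambda calculus:
\[ \mathcal{R} = \{(\star, \star)\} \]
The rule $(\square, \star)$ permits functions whose domain is a kind. For example, when the domain is $\star$ these are functions which take types as arguments (i.e., polymorphism). Thus, adding this rule yields System F:
\[ \mathcal{R} = \{(\star, \star), (\square, \star)\} \]
The rule $(\square, \square)$ effectively duplicates STLC at the type level. It allows functions that take and return types. Adding it yields System $F_\omega$, which has type-level computation:
\[ \mathcal{R} = \{(\star, \star), (\square, \star), (\square, \square)\} \]
CC adds dependent types to System $F_\omega$. The rule $(\star, \square)$ permits types to depend on terms by allowing functions which take terms as arguments but return types. Thus, the complete specification of CC is:
\[ \mathcal{S} = \{\star, \square\} \qquad \mathcal{A} = \{(\star, \square)\} \qquad \mathcal{R} = \{(\star, \star), (\square, \star), (\square, \square), (\star, \square)\} \]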
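The four specifications above differ only in their rule sets, which can be made concrete with a small sketch. The names below (`PTS`, `make`, `allows_product`, and the ASCII sort names) are hypothetical and chosen only for illustration, and rules are assumed to be written as pairs of a domain sort and a codomain sort.

```python
from dataclasses import dataclass

# ASCII stand-ins for the two sorts (assumed names, for illustration only).
STAR, BOX = "star", "box"


@dataclass(frozen=True)
class PTS:
    name: str
    sorts: frozenset
    axioms: frozenset  # pairs (s1, s2): the sort s1 is classified by s2
    rules: frozenset   # pairs (s1, s2): products with an s1 domain and s2 codomain


def make(name, rules):
    """Build a PTS over the two sorts with the single axiom (star, box)."""
    return PTS(name, frozenset({STAR, BOX}),
               frozenset({(STAR, BOX)}), frozenset(rules))


# Each system extends the previous one with exactly one rule.
STLC = make("STLC", {(STAR, STAR)})
SYSTEM_F = make("F", STLC.rules | {(BOX, STAR)})
F_OMEGA = make("F-omega", SYSTEM_F.rules | {(BOX, BOX)})
CC = make("CC", F_OMEGA.rules | {(STAR, BOX)})


def allows_product(pts, domain_sort, codomain_sort):
    """True if the system may form a product with these domain/codomain sorts."""
    return (domain_sort, codomain_sort) in pts.rules
```

For instance, `allows_product(CC, STAR, BOX)` holds while `allows_product(F_OMEGA, STAR, BOX)` does not, which is exactly the dependency that Section 5 must erase.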
3 Simple Metatheory
For completeness, we review a few basic metatheoretic results. We will write $\Gamma \vdash A : B$ for the typing judgement of an arbitrary PTS, or when it is clear what system we are discussing, and otherwise will label the turnstile, as in $\Gamma \vdash_{CC} A : B$ for CC’s typing relation in particular.
Theorem 3.1 (Confluence).
If $A \to_\beta^* A_1$ and $A \to_\beta^* A_2$ then there is a $B$ such that $A_1 \to_\beta^* B$ and $A_2 \to_\beta^* B$.
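As a small illustration of confluence (a hypothetical example, not taken from the papers under review; $A$ stands for an arbitrary type annotation and $z$ for a variable), the expression below contains two redexes, and the two possible first steps both lead to the common reduct $z$:

```latex
\[
\begin{array}{ccc}
  & (\lambda x{:}A.\,x)\,((\lambda y{:}A.\,y)\,z) & \\
  \swarrow_\beta & & \searrow_\beta \\
  (\lambda y{:}A.\,y)\,z & & (\lambda x{:}A.\,x)\,z \\
  \searrow_\beta & & \swarrow_\beta \\
  & z &
\end{array}
\]
```

The left path contracts the outer redex first and the right path contracts the inner one.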
The second property, preservation, is proved by induction on typing derivations, using a substitution lemma.
Theorem 3.2 (Preservation).
If $\Gamma \vdash A : B$ and $A \to_\beta A'$ then $\Gamma \vdash A' : B$.
The last property is not usually considered for less expressive lambda calculi because they are presented with separate syntax for terms, types and kinds. The theorem says that CC expressions can still be classified in this way with the typing judgement. It is proved by a straightforward induction on typing derivations.
Theorem 3.3 (Classification).
If $\Gamma \vdash_{CC} A : B$, then exactly one of the following holds:
• $B$ is $\square$. In this case, we call $A$ a kind.
• $\Gamma \vdash_{CC} B : \square$. In this case, we call $A$ a constructor.
• $\Gamma \vdash_{CC} B : \star$. In this case, we call $A$ a term.
When $B$ is $\star$, we will call $A$ a type. This is a special case of the second bullet above. In this document we use the word “expression” to refer to any element of CC’s grammar and reserve the word “term” for the subclass of expressions identified here.
Notice that we need a context to distinguish between constructors and terms, but can identify kinds without one. The ambiguity comes from variables, and some authors avoid it by splitting them into two syntactic classes (typically $x$ for term variables and $\alpha$ for type variables). Distinguishing the variables in this way forces duplication or subtle inaccuracy when discussing binders at different levels. For that reason, we prefer to mix the variables and use a context to identify the terms and constructors.
Finally, we define the central notion considered below:
Definition 3.4.
An expression is called strongly normalizing if there are no infinite reduction sequences beginning at it. We write $SN$ for the collection of all such expressions.
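The definition can be explored on small examples with a bounded search. The following is a minimal sketch under several assumptions: it works on untyped lambda terms in de Bruijn notation, so type annotations (and reductions inside them) are not modeled, and the helper names `shift`, `subst`, `beta`, `reducts` and `longest_reduction` are hypothetical. Because the search is bounded by `fuel`, a `None` result only reports that some reduction sequence is longer than the bound; it does not prove that an infinite sequence exists.

```python
# Terms: ("var", n), ("lam", body), ("app", fn, arg), with de Bruijn indices.

def shift(t, d, cutoff=0):
    """Add d to every variable index that is >= cutoff."""
    if t[0] == "var":
        return ("var", t[1] + d) if t[1] >= cutoff else t
    if t[0] == "lam":
        return ("lam", shift(t[1], d, cutoff + 1))
    return ("app", shift(t[1], d, cutoff), shift(t[2], d, cutoff))


def subst(t, j, s):
    """Replace variable j in t by s."""
    if t[0] == "var":
        return s if t[1] == j else t
    if t[0] == "lam":
        return ("lam", subst(t[1], j + 1, shift(s, 1)))
    return ("app", subst(t[1], j, s), subst(t[2], j, s))


def beta(body, arg):
    """Contract the redex (lam body) arg."""
    return shift(subst(body, 0, shift(arg, 1)), -1)


def reducts(t):
    """All expressions reachable from t in exactly one beta step."""
    if t[0] == "var":
        return []
    if t[0] == "lam":
        return [("lam", r) for r in reducts(t[1])]
    fn, arg = t[1], t[2]
    out = [beta(fn[1], arg)] if fn[0] == "lam" else []
    out += [("app", r, arg) for r in reducts(fn)]
    out += [("app", fn, r) for r in reducts(arg)]
    return out


def longest_reduction(t, fuel):
    """Length of the longest reduction sequence from t, or None if some
    sequence has more than `fuel` steps."""
    best = 0
    for r in reducts(t):
        if fuel == 0:
            return None
        rest = longest_reduction(r, fuel - 1)
        if rest is None:
            return None
        best = max(best, 1 + rest)
    return best


identity = ("lam", ("var", 0))
delta = ("lam", ("app", ("var", 0), ("var", 0)))
omega = ("app", delta, delta)  # reduces to itself in one step
```

Here `longest_reduction(("app", identity, identity), 10)` finds a longest sequence of one step, while `longest_reduction(omega, 50)` exhausts its fuel, since `omega` is the standard example of an expression that is not strongly normalizing.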
4 Structure of the proofs
The three proofs we consider each model CC in a different domain, but they share a similar overall structure. In this section we describe the technique at a high level.
Step 1: Define the interpretations
Each proof begins by defining two interpretations. A “type” interpretation, usually written $\llbracket A \rrbracket$, captures the static meaning of types, kinds and sorts. For example, in the second proof we will model types as sets of expressions so that $\llbracket A \rrbracket$ contains all the terms of type $A$. Then a “term” interpretation is defined to capture the runtime behavior of terms, types and kinds. This is usually written $[a]$. In the example where types are interpreted as sets of expressions, the term interpretation might pick a canonical inhabitant with the right reduction behavior from the set.
Step 2: Relate the interpretations
After defining the term and type interpretations, we prove a theorem that relates them. For example, in the second proof we will show that if $\Gamma \vdash a : A$, then $[a] \in \llbracket A \rrbracket$. This theorem is usually called “soundness”.
Step 3: Declare success
After proving the soundness theorem we observe that one of the interpretations has some important property. This property will mean that strong normalization is a direct consequence of the soundness theorem. In the running example, $\llbracket A \rrbracket$ will turn out to contain only strongly normalizing expressions. Then, since $[a] \in \llbracket A \rrbracket$ and $[a]$ models $a$’s runtime behavior, $a \in SN$.
A clarification about the interpretations
Though we have called one interpretation the “type” interpretation and the other the “term” interpretation, we do not mean that the former is only defined on types and the latter on terms, in the sense of the classification theorem. Rather, the type interpretation is meant to model the static meaning of any expression that can be used to classify other expressions. In each proof it will be defined on all constructors, kinds and sorts of CC. Correspondingly, the term interpretation is meant to model the dynamic behavior of any expression which can take reduction steps. It will be defined on the terms, constructors and kinds of CC.
5 Modeling CC in $F_\omega$
The first proof we consider translates CC expressions to System $F_\omega$ in a way that preserves reduction. System $F_\omega$ is known to be strongly normalizing (see [6] for a detailed proof), so the correctness of this translation will imply that CC is strongly normalizing as well. The idea to prove strong normalization of an expressive type theory by translation to a better-understood system has been used in a variety of contexts. For example, Harper et al. [10] demonstrated that LF is strongly normalizing by giving a reduction-preserving translation to the simply typed lambda calculus. This technique was originally applied to CC by Geuvers and Nederhof [8], and their proof is reproduced in Barendregt [3].
While this development does not have the same focus on extensibility or generality as the latter two, it has at least two advantages. First, the proof is modular. The other two proofs we will see are monolithic in that they must explain the unique features of CC while recapitulating and extending a complicated semantic argument. Here we may focus on the ways in which CC extends $F_\omega$ and can rely on the somewhat simpler semantics of that system. Second, the translation itself is simple and can be verified in Peano arithmetic. Thus, this technique demonstrates that the proof-theoretic complexity of CC’s strong normalization argument is no greater than that of $F_\omega$.
5.1 Intuition for the translation
The calculus of constructions extends System $F_\omega$ with dependency in the form of the rule $(\star, \square)$. This rule permits type-level abstractions which create types but take terms as arguments. The difficulty comes from modeling these functions in System $F_\omega$ without erasing any possible reduction sequences.
To do this, we will translate expressions in two distinct ways. The “type” translation erases the dependencies to create types from CC types. The “term” translation keeps the dependencies to avoid erasing any possible reductions, but lowers type functions to the level of terms. The soundness theorem of our translation will state
We follow Geuvers and Nederhof [8] in exhibiting how these translations handle several examples before specifying them in full detail. Consider a simple example of dependency where is a dependent type function, a type and a term, so that:
The subderivation which checks the type of will need to make use of rule . We must somehow erase this use of dependency so that, in F:
To solve this, we take where is a fixed type variable that is added to the context by . We set and . Now when checking the translated we have a termlevel function rather than one which returns a type.
For our second example, suppose and are types and is a term of type . When translating the application , we must erase to an F type using the type translation . However, this admits the possibility that by erasing dependency we will delete redexes. This is solved by inserting an extra redex which does nothing but provide a spot to hold ’s translation as a term. That is, for some fresh variable
The situation for polymorphism is similar. Consider a constructor with kind (for example, the polymorphic identity function) and a type . In translating the term , we must preserve ’s static meaning as a type without erasing any possible reduction sequences. The solution is to use both translations, again:
=  
= 
The theme of these examples is that the two translations accomplish different tasks. The type translation erases dependencies to make $F_\omega$ types out of CC types. The term translation preserves reduction behavior but lowers CC types to $F_\omega$ terms in order to accommodate the weaker type system. We translate parts of expressions twice so that we can achieve both goals.
5.2 The translation of types and contexts
Now we give the complete definition of the translation functions. We begin by owning up to a slight simplification in the last section. To distinguish term variables from type variables, the translations must be indexed by contexts. Thus, the translation for types becomes , and the translation for terms becomes . The translation for contexts, , remains unindexed.
In addition to these functions we define a further function, which translates CC sorts and kinds to $F_\omega$ kinds:
This function is not indexed by a context because CC kinds may be distinguished without one, by the classification theorem. The reason for the case split in the last clause is that we are erasing dependency.
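The idea of a context-free translation with a case split that erases dependency can be sketched as follows. This is an illustrative reconstruction in the style of the usual presentation of this translation, not a transcription of the definition: it covers only $\star$ and product kinds, omits any clause for the sort $\square$, and every name in it (`Star`, `Var`, `Pi`, `KStar`, `KArrow`, `is_kind`, `translate_kind`) is hypothetical.

```python
from dataclasses import dataclass


# A fragment of CC syntax, just large enough to write down kinds.
@dataclass(frozen=True)
class Star:
    pass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Pi:
    var: str
    dom: object
    cod: object


# Kinds of the target system: star and arrows between kinds.
@dataclass(frozen=True)
class KStar:
    pass


@dataclass(frozen=True)
class KArrow:
    dom: object
    cod: object


def is_kind(e):
    """Kinds are recognizable syntactically: star, or a product ending in a kind."""
    if isinstance(e, Star):
        return True
    if isinstance(e, Pi):
        return is_kind(e.cod)
    return False


def translate_kind(k):
    """Translate a CC kind, dropping every argument whose domain is a type."""
    if isinstance(k, Star):
        return KStar()
    if isinstance(k, Pi) and is_kind(k.cod):
        if is_kind(k.dom):
            # Argument is itself a constructor: keep it as an arrow.
            return KArrow(translate_kind(k.dom), translate_kind(k.cod))
        # Argument is a term: this is the dependency being erased.
        return translate_kind(k.cod)
    raise ValueError("not a kind")


# A type family indexed by terms of a type Nat, and one that also takes a type.
nat_indexed = Pi("x", Var("Nat"), Star())
poly_indexed = Pi("a", Star(), Pi("x", Var("a"), Star()))
```

Under these assumptions `translate_kind(nat_indexed)` is `KStar()` and `translate_kind(poly_indexed)` is `KArrow(KStar(), KStar())`: the term argument disappears, while the type argument survives as an arrow.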
The translation of types from CC to $F_\omega$ follows the examples from the previous section. The domain of this translation is the sorts, kinds, and constructors of CC. We pick a unique type variable and assume it is never used in an input to this function.
This function inserts duplication in product types as we discussed in the examples section. Otherwise, it is straightforward with the intuition that we are erasing dependency. The cases of the type translation that deal with functions take into account the level of the function’s domain (just as we saw with ). This distinction is justified by the classification lemma and is reflected in the substitution lemma for the translation:
Lemma 5.1 ( respects substitution).
Suppose is a kind or constructor in CC. When and , we have:

, if is a kind.

, if is a type.
This lemma can be shown by induction on the typing derivation. It follows that the translation of types preserves conversion:
Lemma 5.2 ( preserves ).
Suppose and are kinds or constructors in CC such that . Then .
Before we can state that the results of are classified by , we must extend the translation to contexts. As mentioned, will add a type variable to the context. There are two additional changes. First, a variable will be added to help provide a canonical inhabitant for each type. Second, for each kind variable which appears in , the translation will add another variable . This last change simply ensures that contexts match up with the translation of product types, where we add an extra argument in the case of kinds as discussed above.
We define the translation of contexts in two parts. First, a function maps each context binding to one or two translated bindings:
The translation of a context simply maps this last function onto each binding and adds and to the front, as mentioned. Suppose , then:
Now the soundness of the translation of types follows straightforwardly by induction on typing derivations.
Lemma 5.3 (Soundness of ).
Suppose is a sort, kind or type of CC such that . Then .
5.3 The translation of terms
As mentioned in the last section, the translation of contexts permits the construction of a canonical inhabitant of each type or kind in F. In particular, for any expression such that , we will define a term of type in the same context. That is, . If , then we may use the term to construct :
Otherwise, is a kind and we define:
The evaluation behavior of these canonical inhabitants is not very important. The chief purpose of is to help in the term translation of product types. The problem is that when is a valid CC type, its translation is not necessarily welltyped in F. The translation handles this by erasing dependency, but must retain all the possible reductions which begin at . Instead of translating it as a product, we use to construct a function whose application to and is welltyped. In particular, will be a valid F expression. Since does not erase the terms from and , this retains all the possible reduction sequences.
We now present the full translation of terms:
Theorem 5.4 (Soundness of ).
If then .
As we have seen with previous soundness theorems, this proof is not conceptually surprising but requires a certain amount of bookkeeping. We show only one interesting case:
Proof.
We go by induction on the structure of the derivation of .

Inversion on yields a subderivation showing either that is a kind that is a type in CC. We will consider each possibility individually. Note that because , in either case we have an induction hypothesis:

Suppose first that . Unfolding the definitions of the translations, we see that we must show:
Here is some variable which doesn’t occur in , , or . By IH and the TApp rule, it will be enough to show:
Recall that will appear in . By applying soundness for to the subderivations of , we find that and are also valid F types in the contexts and , respectively. So by two applications of TPi and a standard weakening lemma for F, we have:
Therefore, by rule TLam, it will be enough to show:
We have already observed that is a valid F type in this context. Thus, by another application of TLam, it is sufficient to show:
Observe that the IH for is close to this (after slightly unfolding the interpretation of the context):
And the result follows by a weakening lemma.

Suppose instead that . After unfolding the translations, we must show:
Here is some fresh variable. By IH and rule TApp, it is enough to show:
As before, is a type and soundness for implies that and are valid types in F as well. The definition of ensures that is an F kind. So by several applications of TPi and weakening for F, we have:
Thus, by three applications of TLam, it is enough to show:
This follows from the IH for , the observation that , and weakening for F. ∎

The soundness of the term translation demonstrates that it preserves the static semantics of CC expressions. We must also show that it preserves their reduction behavior. A lemma describing the way this function interacts with substitutions is needed. The duplication in the first case below mirrors the duplication we have discussed in the translation.
Lemma 5.5 (Substitution for ).
Suppose and .

If is a kind and is a type in CC, then

If is a type and is a term in CC, then
5.4 Strong Normalization
The final step in this proof is to relate reductions from CC expressions with reductions from their translations. The following result says that the term translation does not drop any reduction steps.
Theorem 5.6 ( preserves reduction).
Suppose .
Here, denotes reduction in at least one step.
Proof.
The proof is by induction on the derivation that . The case of beta reduction uses Lemma 5.5. Each congruence case follows quickly by using an inversion lemma on the typing assumption and applying the induction hypothesis. ∎
Strong normalization for CC now follows quickly, using the same result for $F_\omega$.
Theorem 5.7 (Strong normalization).
If $\Gamma \vdash_{CC} A : B$, then $A \in SN$.
Proof.
Assume for a contradiction that there is an infinite reduction sequence starting at :
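By preservation, every expression in this sequence is well-typed in CC. Thus, by Theorem 5.6, their term translations form another infinite sequence of reductions.
But by the soundness of the term translation (Theorem 5.4), the expression at the head of this second sequence is well-typed in $F_\omega$. This is a contradiction because the well-typed terms of $F_\omega$ are strongly normalizing. ∎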
6 Modeling types as sets of expressions
The second proof we consider, from Geuvers [7], will be the most familiar to readers acquainted with the Girard–Tait method of reducibility candidates or saturated sets. The paper places a special emphasis on making the proof easy to extend with additional programming language constructs. To this end, only the metatheory we have introduced so far is required. (In fact, Geuvers requires a little less: he claims preservation isn’t necessary. He still relies on substitution and a strong inversion lemma, though, so our presentation does not deviate too far from his proof.) Several examples of extensions are included, and we consider some after the development for CC itself.
6.1 Basic definitions
We begin with a few definitions and results relating to reduction. Intuition for these ideas is important to understanding the main proof, so we discuss them in some detail.
Definition 6.1.
Any expression of the form is called a base expression. The set of base expressions is denoted . Note that variables are base expressions (i.e., is allowed).
Definition 6.2.
With some expressions we associate another expression, called a key redex.

The expression is its own key redex.

If has a key redex, then has the same key redex.
We denote by the expression obtained by reducing ’s key redex, when it has one. Note that base expressions don’t have key redexes. The intuition behind key redexes is that they can not be avoided. Reducing an expression without reducing its key redex leaves the redex in place. This intuition and the importance of key reduction is captured by the following two lemmas. They are not difficult to prove, but they rely on a few other simple properties of beta reduction.
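Definition 6.2 amounts to walking down the function position of an application spine until an abstraction is found. The following minimal sketch locates a key redex without reducing it; the class names `Var`, `Lam`, `App` and the function `key_redex` are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    var: str
    ty: object    # type annotation on the bound variable
    body: object


@dataclass(frozen=True)
class App:
    fn: object
    arg: object


def key_redex(t):
    """Return the key redex of t, or None if t has none."""
    if isinstance(t, App):
        if isinstance(t.fn, Lam):
            return t              # (lambda x:A. M) N is its own key redex
        return key_redex(t.fn)    # M N has the same key redex as M, if any
    return None                   # variables and abstractions have none


identity = Lam("x", Var("A"), Var("x"))
redex = App(identity, Var("a"))
```

With these definitions `key_redex(App(redex, Var("b")))` returns `redex`, whereas an expression headed by a variable, such as `App(App(Var("f"), redex), Var("b"))`, has no key redex even though it contains a redex in argument position.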
Lemma 6.3.
Suppose has a key redex and without reducing that redex. Then has a key redex, and .
It is helpful to visualize this lemma:

Lemma 6.4.
Suppose and . Then .
Proof.
Suppose for a contradiction that there is an infinite reduction sequence starting at . Since we know and are in , this means the application must betareduce in some finite number of steps. That is, the infinite sequence has a prefix of the form:
Note that this last step reduces a key redex. Thus, by multiple applications of lemma 6.3, we have . This is a contradiction, since but we found an infinite reduction sequence starting at it. ∎
Saturated sets and their closure properties are the key technical device in the interpretation. Originally introduced by Tait, they are closely related to Girard’s candidates of reducibility (for detailed comparisons, see [11] and [6]). The idea is pervasive, and we will see it again in the third proof.
Definition 6.5.
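A set of expressions $X$ is called a saturated set if the following three conditions hold:
• $X \subseteq SN$.
• Every base expression is an element of $X$.
• If $M \in SN$ has a key redex and the result of reducing that key redex is in $X$, then $M \in X$.
The third condition states that saturated sets are “closed under the expansion of key redexes”. We write $SAT$ for the collection of all saturated sets. Note that $SN \in SAT$ and that every saturated set is nonempty.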
Lemma 6.6.
If $\mathcal{X}$ is a nonempty collection of saturated sets, then $\bigcap \mathcal{X}$ is a saturated set.
Definition 6.7.
If $X$ and $Y$ are sets of expressions, define:
\[ X \to Y = \{\, M \mid \forall N \in X.\ M\,N \in Y \,\} \]
It helps to have some intuition for this last definition: An expression $M$ is in $X \to Y$ if whenever $M$ is applied to an expression in $X$, you get an expression in $Y$. Thus, when these sets model types, $X \to Y$ will contain the functions from the first type to the second. The next lemma, that $\to$ preserves saturation, involves the most intricate reasoning about reduction that appears in this proof.
Lemma 6.8.
If $X$ and $Y$ are saturated sets, then so is $X \to Y$.
Proof.
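There are three conditions to verify:
• ($X \to Y \subseteq SN$) Suppose $M \in X \to Y$. Saturated sets are nonempty, so let $N \in X$ be given. Then $M\,N \in Y$, and so $M\,N \in SN$. Thus, $M \in SN$.
• ($X \to Y$ contains every base expression) Let a base expression $M$ be given. For any $N \in X$, since $N \in SN$, $M\,N$ is a base expression. Thus $M\,N \in Y$. So, $M \in X \to Y$.
• ($X \to Y$ is closed under key redex expansion) Suppose $M \in SN$ has a key redex and $M' \in X \to Y$, where $M'$ is the result of reducing that key redex. We must show $M \in X \to Y$, so let $N \in X$ be given. We have $M'\,N \in Y$, and must show $M\,N \in Y$. But $M\,N$ has the same key redex as $M$, and reducing it yields $M'\,N$. Since $Y$ is closed under expansion of key redexes, it is enough to show that $M\,N \in SN$. This follows immediately by lemma 6.4. ∎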
6.2 Interpreting kinds
The interpretation of types comes in two steps. First we define a function on the sorts and kinds of CC. This is, roughly, the type of the main interpretation: if is a kind or type such that , then ’s interpretation will be an element of the set .
By , we mean the collection of all (set-theoretic) functions from to .
Lemma 6.9.
If is a kind, then is nonempty.
As an example, consider the type . Notice that is the collection of all functions from saturated sets to saturated sets. So, when we interpret an expression with this type (say ), we will expect to get a function that takes collections of expressions to other collections of expressions. This makes sense, since it is a function from types to types.
The observant reader will notice that this definition mirrors the one from Section 5.2. Just as there, it indicates that we will ignore dependency in the interpretation of types. This will work because of the limited ways in which CC may use terms in types. For example, CC lacks large eliminations: even though we can encode natural numbers, we cannot define types by pattern matching on them.
6.3 Interpreting types
Because our interpretation is not restricted to closed types, we begin by defining an environment that interprets the variables. Later, we will consider another similar environment for terms.
Definition 6.10.
Given a context such that , a constructor environment for is a function that maps the type variables of to appropriate sets according to . It should satisfy the condition:
We’ll write for this relation and for the constructor environment which maps to and otherwise agrees with .
Finally, we define the interpretation of types when is a sort, kind or type:
By , we mean the set-theoretic function that maps each in to .
This type interpretation is very similar to the one from Section 5.2. Many of the lemmas we will need also mirror results from that section. For example, compare the substitution lemma below with Lemma 5.1.
Lemma 6.11 ( respects substitution).
Suppose and is a kind or constructor. When and , we have:

, if is a kind.

, if is a type.
From this it follows that beta-convertible types have the same interpretation.
Lemma 6.12 ( respects ).
Suppose and , are kinds or constructors such that . Then .
As promised, the range of the interpretation is classified by the function . The proof is by induction on typing derivations. In the conversion case, Lemma 6.12 is used.
Lemma 6.13 (Soundness of ).
If and is a kind or constructor such that , then .
An important consequence of this lemma is that the interpretation of a type is always a saturated set and thus contains only strongly normalizing expressions.
6.4 From the interpretation to Strong Normalization
The key fact about the function is that every CC expression is in the interpretation of its type. Before we can prove this, we need a notion of environment for terms corresponding to for types.
Definition 6.14.
We call a mapping on variables a term environment for with respect to when and:
We write for this relation and for the expression created by simultaneously replacing the variables of with their mappings in . We write for the term environment that sends to and otherwise agrees with .
We show only the trickiest case of the key theorem; the complete proof may be found in the appendix. Though there are a number of details to keep track of, all the cleverness is in the definition of the interpretation; the result here is straightforward by induction.
Theorem 6.15 (Soundness of ).
Suppose and . Then .
Proof.
By induction on the derivation of .

The IH for gives us . Since is a bound variable, we may pick it to be fresh for the domain and range of . There are two subcases: is either or .

Suppose is . Then we must show . So let be given, and observe it is enough to show .
We have . Thus, the IH for gives us . But we also know
