1 Introduction
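Consider the simple types generated by
$$\sigma, \tau \;::=\; \mathtt{N} \;\mid\; \sigma \times \tau \;\mid\; \sigma \to \tau$$
where we take $\to$ to be right-associative and $\times$ to be left-associative, and we think of $\mathtt{N}$ as the type of natural numbers.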
Loosely speaking, we shall be interested in the question: when can a type be encoded in a type ? In other words, for which pairs of types can one provide an ‘encoding’ operation and a ‘decoding’ operation such that ? If such operations exist, one may say in mathematical terminology that is a retract of , with constituting a retraction .
For example, under mild assumptions, we can encode in : take an encoding that maps a pair to the function , and a decoding that maps a function to the pair . However, one would not expect to be able to encode in in a similar fashion.
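To make the shape of such an encoding concrete, here is a minimal Python sketch. It assumes that the example concerns encoding pairs of natural numbers as unary functions on natural numbers; the particular function used (a test for zero) and the names `encode` and `decode` are our own illustrative choices, not taken from the text.

```python
from typing import Callable, Tuple

Nat = int  # stand-in for the base type of natural numbers (assumption)


def encode(pair: Tuple[Nat, Nat]) -> Callable[[Nat], Nat]:
    """Map a pair (m, n) to a function sending 0 to m and everything else to n."""
    m, n = pair
    return lambda x: m if x == 0 else n


def decode(f: Callable[[Nat], Nat]) -> Tuple[Nat, Nat]:
    """Recover a pair by sampling the function at 0 and at 1."""
    return (f(0), f(1))


# Decoding after encoding gives back the original pair.
assert decode(encode((3, 7))) == (3, 7)
```

Only the composite "decode after encode" is required to be the identity. The opposite composite loses information, since `decode` inspects a function at just two arguments.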
To make our question precise, we have to clarify two things:

What do we mean by an ‘operation’ of type or ? One possibility is to take this to mean a closed term of the appropriate type in some simply typed λ-calculus (with product types), taking to be the term . A minimal choice would be the pure simply typed λ-calculus itself; we shall denote this by . A slightly less minimal choice would be the simply typed λ-calculus with constants
subject to the conversion rules
We shall denote this language by . A much more generous choice would be the whole of Plotkin’s PCF (suitably formulated with a single base type and with product types). Even richer languages might also be considered. [Footnote 1: Alternatively, one could construe an ‘operation’ to mean an element of the appropriate type within some model of simply typed λ-calculus. We shall not emphasize this ‘semantic’ point of view in this note, although it is clearly related to the syntactic one, as there are often close relationships between particular calculi and particular models.]

What does the ‘=’ mean in the equation ? For instance, a very strict kind of equality would be computational equality, the congruence on terms generated by the conversion rules of the language in question. We shall write for computational equality on (this is generated by the rule plus the equations , ), and for computational equality on (where the conversion rules for are added). A much looser kind of equality would be some kind of observational equivalence of terms, such as the familiar notion of observational equivalence in PCF; we shall denote this by .
We shall refer to a choice of a simply typed λ-calculus together with an equality relation on its terms as a language theory. Once a language theory has been fixed, it becomes a precise question for which we have ; we may thus write to mean that the language theory makes a retract of . However, one can imagine that the relation might vary according to the language theory chosen: in principle, the richer the language, and the more generous the equality, the easier it might become to construct a retraction . More precisely, suppose we set iff and ; in this situation, it is clear that implies , though the converse need not hold in general. It is thus not initially obvious whether any ‘stable’ or ‘robust’ answer to the question raised at the outset should be expected.
There has been a body of previous work on characterizing in the case , both with respect to equality [1] and, less trivially, with respect to equality [3, 10, 8, 11]. Such questions have also been considered in the presence of multiple base types [9, 11]. Closely related to this is a body of work on characterizing the isomorphism relation between types in pure typed λ-calculi, often with much richer type systems than the one considered here (see [4] for an informative survey). It is also the case that, for languages with the power of or above, particular examples of definable retractions are routinely used in higher-order computability theory: for example, one frequently exploits the fact that every simple type is encodable in a pure type (see [7, Chapter 4]). However, as far as we are aware, there has hitherto been no systematic attempt to map out the encodability relation for all types in languages of this kind.
Our purpose in this paper is to study the question of encodability in the setting of such languages. We will show, in fact, that the relation remains stable across a significant class of language theories ; furthermore, this relation is easy to characterize syntactically and enjoys some very pleasing properties. More specifically, we shall establish the following:

The relation is the same for all language theories with
We henceforth write this relation on types as .

We have if and only if are trivially isomorphic, i.e. iff there is an isomorphism between them generated by canonical isomorphisms of type
(This is a well-known axiomatization for isomorphisms of types built from in pure λ-calculi: see e.g. [4].) It follows, for instance, that can be isomorphic relative to (i.e. there are terms , with , ) only if they are trivially isomorphic.

For any , we have either or . Thus, induces a total ordering (which we also write as ) on types modulo trivial isomorphism.

This total ordering on classes of types is in fact a well-ordering of order type . What is more, there is a close syntactic correspondence between simple types and Cantor normal forms for ordinals below . As we shall see, this correspondence leads to a simple syntactic characterization of showing that this relation is readily decidable.
We note at the outset that this picture may break down if languages more powerful than PCF are admitted, or if equalities more generous than observational equivalence are considered. On the one hand, if we move to a language such as PCF+parallel-or+exists or PCF+catch for which a universal type exists, then many other non-trivial encodings between types will be possible [7]. (In an extension of PCF with higher-order references, even non-trivial isomorphisms between types can appear [2].) On the other hand, if our language is PCF (or even System T) and we work up to equivalence with respect to observing contexts drawn only from System T, we will find that every type actually becomes definably isomorphic to the pure type of the same level. This is an easy consequence of Theorem 4.2.9 of [7], which establishes this fact for extensional total type structures over under mild hypotheses.
In Section 2 we establish the ‘positive’ content of the above results: the existence of an ordinal ranking on types leading to the definition of a total preorder with associated equivalence ; the existence of a trivial isomorphism whenever ; and the existence of a retraction whenever . In Sections 3 and 4 we proceed to the ‘negative’ part, namely the fact that if then no retraction exists, even with respect to . We establish this using the technology of nested sequential procedures for PCF. Since the argument in full generality is quite complex, we first treat the case when is a pure type , then use this to motivate some of the ideas required for the general case.
I am grateful to Dag Normann, both for raising the question of characterizing the encodability relation for all simple types in the setting of , and also for the key insight that deeply nested constituents of types contribute more to their complexity than shallow ones: e.g. is a more complex type than , which is more complex than . This was the idea that led to the ordinal ranking of types as exhibited in Section 2.
2 An ordinal ranking for types
Let us begin by defining the relation on types to be the congruence generated by the ‘trivial’ equivalences mentioned above:
Clearly, each of these generating equivalences corresponds to an isomorphism of types expressible in ; it follows easily that if then . If , we shall say that are trivially isomorphic. We note in particular that for any , and that for any permutation of .
To define and the ordinal ranking of types, we shall work with the subclass of types generated by the grammar
We shall refer to these here as uncurried types (ad hoc terminology). It is easy to see that every is trivially isomorphic to some uncurried type.
The following inductive clauses assign ordinal ranks to certain well-behaved uncurried types:

.

If then .

If then .
We may refer to the types to which a value for is assigned by this inductive definition as canonical types. (We might also add the empty product type and declare that , but this would introduce complications later on which we prefer to avoid.)
Note that a canonical type is in effect a representation of the Cantor normal form for the ordinal . For our purposes, Cantor normal forms will be formal expressions generated inductively by the following clauses (we generate them simultaneously with a valuation mapping them to actual ordinals). Note that we here modify the usual definition so as to exclude 0.

is a Cantor normal form, where .

If is a Cantor normal form then so is , where .

If are Cantor normal forms with , then is a Cantor normal form with .
In practice, we shall sometimes blur the distinction between Cantor normal forms and the ordinals they denote.
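As an illustration of how Cantor normal forms can be handled mechanically, the following Python sketch represents an ordinal below ε₀ by the tuple of exponents in its normal form. This is the standard textbook representation, not the exact variant described above: here the empty tuple denotes 0 (used only so that 1 can be written as ω⁰), whereas the clauses above exclude 0 and take 1 as a basic form. All names are our own.

```python
# An ordinal below epsilon_0 is a tuple (e_0, ..., e_{k-1}) of exponents,
# read as omega^e_0 + ... + omega^e_{k-1}; each e_i is again such a tuple.
ZERO = ()          # empty sum
ONE = (ZERO,)      # omega^0
OMEGA = (ONE,)     # omega^1


def compare(a, b) -> int:
    """Return -1, 0 or 1 according to the order on normal forms a and b."""
    for x, y in zip(a, b):
        c = compare(x, y)
        if c != 0:
            return c
    # One exponent list is a prefix of the other: the longer sum is larger.
    return (len(a) > len(b)) - (len(a) < len(b))


def is_normal(a) -> bool:
    """Check hereditarily that exponents appear in non-increasing order."""
    return all(is_normal(e) for e in a) and all(
        compare(a[i], a[i + 1]) >= 0 for i in range(len(a) - 1)
    )


def add(a, b):
    """Ordinal sum a + b: terms of a below the leading term of b are absorbed."""
    if not b:
        return a
    kept = tuple(e for e in a if compare(e, b[0]) >= 0)
    return kept + b


def omega_pow(a):
    """The ordinal omega^a."""
    return (a,)


assert add(ONE, OMEGA) == OMEGA                 # 1 + omega = omega
assert compare(add(OMEGA, ONE), OMEGA) > 0      # omega + 1 > omega
```

Because the comparison is lexicographic on exponents, with a proper prefix counting as smaller, it coincides with Python's built-in ordering on nested tuples whenever both arguments are in normal form.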
The correspondence between Cantor normal forms and canonical types is now immediate. The well-known fact that every ordinal below has a unique Cantor normal form now gives us:
Proposition 1
For any ordinal , there is a unique canonical type with .
It is also easy to see by induction on type levels that every uncurried type, and hence every type , is isomorphic to a unique canonical type (simply by admitting permutations of products ). This allows us to extend our ranking to all types, and we may now define iff .
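The extension of the ranking to arbitrary types can be sketched in Python as follows. This is our own reading, resting on assumptions that should be checked against the clauses above: the base type has rank 1, a function type into the base type has rank ω raised to the rank of its argument, a product has as rank the sum of its components' ranks in non-increasing order, and types are first normalized using the currying and distribution isomorphisms. The names `N`, `prod`, `arrow`, `rank` and `rank_leq` are hypothetical.

```python
N = "N"  # the base type (name is an assumption)


def prod(s, t):
    return ("prod", s, t)


def arrow(s, t):
    return ("arrow", s, t)


def exponents(t):
    """List the exponents e such that omega^e is a summand of the rank of t.

    Each summand corresponds to one factor of the form (arguments -> N);
    the exponent is the rank of the product of its arguments (() if none).
    """
    if t == N:
        return [()]
    tag, a, b = t
    if tag == "prod":
        return exponents(a) + exponents(b)
    # a -> (args -> N) is treated as (a x args) -> N, for each factor of b.
    ra = rank(a)
    return [tuple(sorted(ra + e, reverse=True)) for e in exponents(b)]


def rank(t):
    """Rank of t as a tuple of exponents in non-increasing order."""
    return tuple(sorted(exponents(t), reverse=True))


def rank_leq(s, t) -> bool:
    """Compare ranks; built-in tuple order agrees with the ordinal order here."""
    return rank(s) <= rank(t)


assert rank(arrow(N, arrow(N, N))) == rank(arrow(prod(N, N), N))
assert rank_leq(prod(N, N), arrow(N, N))
```

Under these assumptions, trivially isomorphic types receive the same rank by construction, and the comparison is evidently decidable.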
It is thus clear that is a total preorder on types, that iff iff , and that is readily decidable. We now work towards showing that if then . This will in fact be easy once we have established a certain way of inductively generating the order relation on ordinals below . Let us say a formal sum is a Cantor sum if the Cantor normal form of is , where is the Cantor normal form of (this amounts to the condition that ). Now let be the binary relation on ordinals generated by the following clauses:

.

implies .

.

for any .

If then , where , are Cantor sums.

If then .
Clearly if then , since also satisfies the above properties. Moreover:
Proposition 2
If then .
Proof: We show by complete induction on that for all we have . For this is trivial by clause 1 above. For the successor case, if the induction claim holds for , then for any we have either , in which case by clause 1, or , in which case by the induction hypothesis and clauses 3 and 2. For limit ordinals, suppose is expressed as a Cantor sum where . If is itself a successor, say , then , so for any , either (in which case clause 1 applies) or for some we have . But by the induction hypotheses for we have , and by clauses 4 and 5 we have . Hence by clause 2 we have as desired.
The remaining case is that is expressed as a Cantor sum where is a limit ordinal. Since , we have , so we may use the induction hypothesis for . Taking any sequence with limit , we have that , so for any we again have either (so that clause 1 applies) or for some . But in the latter case, we have by the induction hypothesis for ; but also by the induction hypothesis for , whence by clauses 5 and 6. Hence again by clause 2.
The following now establishes the existence of the required retractions. Note that with Proposition 2 in hand, only the most trivial manipulations of terms are needed.
Proposition 3
Whenever , we have : that is, there are terms and such that .
Proof: In view of Proposition 2, it suffices to show by induction on the generation of that if then . This is thus just a question of treating each of the six clauses for in turn. For clauses 1 and 2, we use the usual identity and composition of retractions. For clause 3, a retraction is given by the terms and . For clause 4, we note that (writing for the product of copies of ) and . We may thus embed the former in the latter by the mapping
and project the latter to the former by the mapping
It is routine to check that the composition of these is convertible to the identity. For clauses 5 and 6, we use the familiar liftings of a retraction to and .
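The generic constructions on retractions used in this proof (identity, composition, and lifting to product and function types) can be sketched in Python, modelling a retraction as a pair of functions (e, d) with d after e the identity. The sample retraction of pairs into functions and all names are our own illustrative assumptions; the sketch does not reproduce the specific terms used for clauses 3 and 4.

```python
def identity():
    """The trivial retraction of a type into itself."""
    return (lambda x: x, lambda y: y)


def compose(r1, r2):
    """From retractions A into B and B into C, build a retraction A into C."""
    e1, d1 = r1
    e2, d2 = r2
    return (lambda x: e2(e1(x)), lambda z: d1(d2(z)))


def lift_product(r1, r2):
    """From A into B and A' into B', build (A x A') into (B x B')."""
    e1, d1 = r1
    e2, d2 = r2
    return (lambda p: (e1(p[0]), e2(p[1])),
            lambda q: (d1(q[0]), d2(q[1])))


def lift_arrow(r):
    """From A into B, build (A -> R) into (B -> R) for any result type R."""
    e, d = r
    return (lambda f: (lambda b: f(d(b))),
            lambda g: (lambda a: g(e(a))))


# Sample retraction (an assumption for illustration): pairs into functions.
pair_in_fun = (lambda p: (lambda x: p[0] if x == 0 else p[1]),
               lambda f: (f(0), f(1)))

E, D = lift_arrow(pair_in_fun)
f = lambda p: p[0] + 2 * p[1]
assert D(E(f))((3, 4)) == f((3, 4))
```

In each case the decoding undoes the encoding because the component retractions do: for `lift_arrow`, decoding `E(f)` yields the function sending `a` to `f(d(e(a)))`, which is `f(a)`.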
Theorem 4
Whenever , we have , whence for any language theory .
Proof: Immediate from Proposition 3 and the trivial isomorphisms , .
From the above proofs it is also easy to extract an algorithm which, given any types with , constructs terms and that constitute a retraction.
3 A non-encodability result for pure types
It remains to show that if then no retraction can exist even with respect to . In view of the results of Section 2, it will suffice to show that we never have for any : that is, for no type can we have . In this section we shall establish this for the case when is a pure type (where and ); this will introduce many of the key ingredients in a relatively uncluttered form, in preparation for the general case which we treat in Section 4.
We assume that the reader is familiar with the language PCF and the associated notion of observational equivalence, and knows how to set up a version of PCF with product types and the single base type . We shall write for the extension of PCF with an ‘oracle constant’ for every (classical) partial function .
We also assume familiarity with the nested sequential procedure (NSP) model for PCF as presented in [7, Chapter 6] or [6], and with the notation and terminology used there. We write for observational equivalence of NSPs, and for the observational preorder on them. As it stands, the model does not have product types, but this is not an essential limitation. Indeed, it is well-known that any type may be converted to a trivially isomorphic type in curried form (that is, one of the form where each is ×-free) in such a way that any ×-free is its own curried form. For a general type with curried form , one may therefore simply define the set to be the product .
We shall in fact show something a little stronger than the nonexistence of a retraction. The following concepts will be useful:
Definition 5
(i) We say is a pseudoretract of , and write , if there are closed terms and such that , where is the observational preorder on terms.
Equivalently, in terms of sequential procedures, we may say that if respectively have curried forms and , then a pseudoretraction consists of sequential procedures
such that for each we have . We say is a pseudoretract of if such a pseudoretraction exists; the standard theory of sequential procedures implies that this agrees with the definition via terms.
(ii) A pseudoretraction as above is strict if : more formally, if for all we have .
(iii) A pseudoretraction is left-strict with respect to if .
Although we shall not always bother to distinguish between different ways of bracketing complicated product types, it is important to note that the concept of left-strictness is defined relative to a certain way of dividing up the product type on the right-hand side: more specifically, relative to the identification of as the ‘first’ component of the product. If is empty, then of course left-strictness coincides with strictness.
Our goal in this section will be to prove:
Theorem 6
For any , the type is not a pseudoretract of .
This will follow readily from:
Lemma 7
Suppose , and is any sequence of types of level . Then any pseudoretraction must be left-strict with respect to . More formally, given any NSPs
such that , we must have that .
We formulate the lemma in terms of a finite sequence of types rather than just a single type of level so as to cater smoothly for the case , when is simply .
To see that the lemma implies the theorem, suppose we had a pseudoretraction comprised by
This gives rise to a pseudoretraction comprised by
To see that this is non-strict, we note that , but that since whereas . This implies that , contradicting Lemma 7 in the case . (This argument actually shows that if our language were extended with the unit type , then even would not be a retract of .)
The proof of the lemma itself will be modelled largely on the proof of [7, Theorem 7.7.1] (see also [6, Theorem 12] for a slightly improved exposition); we shall refer to this below as the ‘standard proof’. We reason by induction on .
The case is trivial: here we must have since there are no types of level , so it suffices to note that if then for some , hence is not invertible.
Suppose then that where the lemma holds for , and suppose we have
where , . We wish to show that .
Let , so that .
Claim 1: has the syntactic form , where .
Proof (transcribed from standard proof): Clearly does not have the form or , and the only other alternative form is . In that case, however, we would have
contradicting . This establishes Claim 1.
Now let denote the ‘dummy substitution’ .
Claim 2: , or equivalently .
Proof (adapted from standard proof): By the NSP context lemma, it will suffice to show that for any and . (Here and in what follows, the application to should be omitted in the case .) So suppose whereas for some . Take , so that whenever . Then by the context lemma, so we have since is maximal in . By the definition of , it follows that , whence , whereas , contradicting . This completes the proof of Claim 2.
Now consider the head reduction sequence
where . The head on the right-hand side will have some ancestor within ; and since is not free in , this must appear as the head of some application within either or .
Case 1: comes from , say from . Since where , all bound variables within are of level . Let be the list of bound variables of in scope at the critical occurrence of . Then as in the standard proof, by tracking the subterm through the above reduction sequence, we easily see that for some metaterms . So we have
This exhibits as a retract of some finite product of level types , contradicting Theorem 7.7.1 of [7]. Alternatively, it contradicts the induction hypothesis of the present proof, since we can easily extend this to a retraction where (or just in the case ).
Case 2: comes from . Write as , where has type , but all variables bound within are of level . Let be the list of bound variables of that are in scope at the critical occurrence of . Then as above, we have that for some metaterms . So we have
Let denote the product of the types of the . Then , and the above constitutes a pseudoretraction , where is given by , by , and by .
By the induction hypothesis, this pseudoretraction is left-strict w.r.t. : that is, , or more formally . To show that this implies that our original pseudoretraction is left-strict, we require a further argument that did not feature in the standard proof.
Claim 3: has head variable , whence .
Proof: Since somewhere contains the application , is not a constant procedure, and the only other alternative is that has the form (omitting in the case ). By tracking the transformation of to through the head reduction sequence for , we now see that this sequence must have contained reductions
where . Specializing via , we obtain
where . But , so can only be the procedure . Thus the subterm above evaluates to , and so itself must evaluate to some numeral, say . Finally, specializing to , we have , so , contrary to what was established above by the induction hypothesis. (For the case , the references to should of course be deleted.)
4 Non-encodability in the general case
We now wish to generalize the above proof to show that:
Theorem 8
The type is not a pseudoretract of for any .
This will follow from a lemma proved by induction on the ordinal rank of . In the general setting, however, this lemma will need to be formulated somewhat more subtly than Lemma 7. Motivated by the arguments of the previous section, we introduce the following concept:
Definition 9
Suppose and are lists of free types. A quasiretraction consists of terms
such that we have a head reduction sequence
where .
The reader will see that a situation very close to this appeared in the course of the proof of Lemma 7. As we will shortly see, a quasiretraction gives rise to a pseudoretraction where the are in some sense ‘lower’ types. Nevertheless, in the general setting, the existence of a quasiretraction turns out to afford a more suitable induction hypothesis than the existence of such a pseudoretraction, since the former implicitly imposes some useful additional constraints on how these components will behave.
We shall often identify a list of types with the corresponding product type . Thus, if and are any types in curried form, respectively the products of the lists and , then we may refer to a quasiretraction also as a quasiretraction .
In the situation of Definition 9, the head on the right-hand side will originate from some application subterm within . Let be the list of bound variables of in scope at the point of this subterm’s occurrence. These will consist of the top-level bound variables of , of types (we call these the major variables of ), plus possibly some others, say of types (we call these the minor variables). The latter will be associated with applications within that contain the critical occurrence of . (Note that none are associated with applications , since the outer would then prevent the critical from emerging as the head variable.) It follows that if the types are all of level , then the types are all of level ; however, there may be types among that are higher than some among .
We also have in this situation that for some metaterms . Writing for the substitution , it follows that , so that exhibit a pseudoretraction . We call this the associated pseudoretraction of the quasiretraction , and refer to the and as its major and minor components respectively.
Definition 10
We say a quasiretraction is left-strict with respect to if the associated pseudoretraction is left-strict with respect to . In the case that is empty, we may also say simply that such a quasiretraction is strict.
Once again, we note that the notion of a left-strict quasiretraction is defined relative to some identification of the ‘first’ component of the product on the right-hand side.
It will be helpful to know that any quasiretraction can be replaced by an ‘equivalent’ one of a more restricted form. Specifically, we shall say a quasiretraction is simple if contains just a single free occurrence of , and this is at the head of (i.e. has the form ). We then have: