# The encodability hierarchy for PCF types

Working with the simple types over a base type of natural numbers (including product types), we consider the question of when a type σ is encodable as a definable retract of τ: that is, when there are λ-terms e:σ→τ and d:τ→σ with d ∘ e = id. In general, the answer to this question may vary according to both the choice of λ-calculus and the notion of equality considered; however, we shall show that the encodability relation ≼ between types actually remains stable across a large class of languages and equality relations, ranging from a very basic language with infinitely many distinguishable constants 0,1,... (but no arithmetic) considered modulo computational equality, up to the whole of Plotkin's PCF considered modulo observational equivalence. We show that σ≼τ≼σ iff σ≅τ via trivial isomorphisms, and that for any σ,τ we have either σ≼τ or τ≼σ. Furthermore, we show that the induced linear order on isomorphism classes of types is actually a well-ordering of order type ϵ_0, and indeed that there is a close syntactic correspondence between simple types and Cantor normal forms for ordinals below ϵ_0. This means that the relation ≼ is readily decidable, and that terms witnessing a retraction σ◃τ are readily constructible when σ≼τ holds.

## 1 Introduction

Consider the simple types generated by

 σ,τ ::= N ∣ σ→τ ∣ σ×τ

where we take $\to$ to be right-associative and $\times$ to be left-associative, and we think of $N$ as the type of natural numbers.

Loosely speaking, we shall be interested in the question: when can a type $\sigma$ be encoded in a type $\tau$? In other words, for which pairs of types $\sigma,\tau$ can one provide an ‘encoding’ operation $e : \sigma\to\tau$ and a ‘decoding’ operation $d : \tau\to\sigma$ such that $d\circ e = \mathit{id}$? If such operations exist, one may say in mathematical terminology that $\sigma$ is a retract of $\tau$, with $(e,d)$ constituting a retraction $\sigma\lhd\tau$.

For example, under mild assumptions, we can encode $N\times N$ in $N\to N$: take an encoding that maps a pair $(m,n)$ to the function sending $0$ to $m$ and every other argument to $n$, and a decoding that maps a function $f$ to the pair $(f(0),f(1))$. However, one would not expect to be able to encode $N\to N$ in $N\times N$ in a similar fashion.
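As an informal illustration of a retraction of this kind, the following Python sketch models the encoding of pairs of naturals as functions on naturals. Python functions merely stand in for λ-terms here; the names `encode` and `decode`, and the choice of reading the pair back at arguments 0 and 1, are assumptions made for the example rather than anything fixed by the text.

```python
def encode(pair):
    """e : N x N -> (N -> N). Send (m, n) to the function that is m at 0 and n elsewhere."""
    m, n = pair
    return lambda i: m if i == 0 else n


def decode(f):
    """d : (N -> N) -> N x N. Recover a pair from the values of f at 0 and 1."""
    return (f(0), f(1))


# d . e is the identity on pairs ...
assert decode(encode((3, 5))) == (3, 5)

# ... but e . d is not the identity on functions: information is lost.
successor = lambda i: i + 1
assert encode(decode(successor))(2) != successor(2)
```

The second assertion shows why this is only a retraction and not an isomorphism: decoding keeps just two values of the function.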

To make our question precise, we have to clarify two things:

• What do we mean by an ‘operation’ of type $\sigma\to\tau$ or $\tau\to\sigma$? One possibility is to take this to mean a closed term of the appropriate type in some simply-typed $\lambda$-calculus (with product types), taking $d\circ e$ to be the term $\lambda x.\,d(e\,x)$. A minimal choice would be the pure simply-typed $\lambda$-calculus itself; we shall denote this by $L_0$. A slightly less minimal choice would be the simply-typed $\lambda$-calculus with constants

$$\hat{n} : N, \qquad \mathit{if}_n : N\to N\to N\to N \quad \text{for each } n\in\mathbb{N},$$

subject to the conversion rules

$$\mathit{if}_n\,\hat{n}\,P\,Q = P, \qquad \mathit{if}_n\,\hat{m}\,P\,Q = Q \quad \text{for each } m\neq n.$$

We shall denote this language by $L_1$. A much more generous choice would be the whole of Plotkin’s PCF (suitably formulated with a single base type and with product types). Even richer languages might also be considered. (Alternatively, one could construe an ‘operation’ to mean an element of the appropriate type within some model of simply-typed $\lambda$-calculus. We shall not emphasize this ‘semantic’ point of view in this note, although it is clearly related to the syntactic one, as there are often close relationships between particular $\lambda$-calculi and particular models.)

• What does the ‘$=$’ mean in the equation $d\circ e = \mathit{id}$? For instance, a very strict kind of equality would be computational equality, the congruence on $\lambda$-terms generated by the conversion rules of the language in question. We shall write $=_0$ for computational equality on $L_0$ (this is generated by the $\beta$-rule plus the equations for projections from pairs), and $=_1$ for computational equality on $L_1$ (where the conversion rules for $\mathit{if}_n$ are added). A much looser kind of equality would be some kind of observational equivalence of terms, such as the familiar notion of observational equivalence in PCF; we shall denote this by $=_{\mathrm{obs}}$.

We shall refer to a choice of a simply-typed $\lambda$-calculus $L$ together with an equality relation $=$ on its terms as a language theory $(L,=)$. Once a language theory has been fixed, it becomes a precise question for which $\sigma,\tau$ we have $\sigma\lhd\tau$; we may thus write $\sigma\preceq_{(L,=)}\tau$ to mean that the language theory $(L,=)$ makes $\sigma$ a retract of $\tau$. However, one can imagine that the relation $\preceq_{(L,=)}$ might vary according to the language theory chosen: in principle, the richer the language, and the more generous the equality, the easier it might become to construct a retraction $\sigma\lhd\tau$. More precisely, suppose we set $(L,=)\sqsubseteq(L',=')$ iff $L\subseteq L'$ and ${=}\subseteq{='}$; in this situation, it is clear that $\sigma\preceq_{(L,=)}\tau$ implies $\sigma\preceq_{(L',=')}\tau$, though the converse need not hold in general. It is thus not initially obvious whether any ‘stable’ or ‘robust’ answer to the question raised at the outset should be expected.

There has been a body of previous work on characterizing $\preceq_{(L,=)}$ in the case $L = L_0$, both with respect to $\beta$-equality [1] and more non-trivially with respect to $\beta\eta$-equality [3, 10, 8, 11]. Such questions have also been considered in the presence of multiple base types [9, 11]. Closely related to this is a body of work on characterizing the isomorphism relation between types in pure typed $\lambda$-calculi, often with much richer type systems than the one considered here (see [4] for an informative survey). It is also the case that for languages with the power of $L_1$ or above, particular examples of definable retractions are routinely used in higher-order computability theory: for example, one frequently exploits the fact that every simple type is encodable in a pure type (see [7, Chapter 4]). However, as far as we are aware, there has hitherto been no systematic attempt to map out the encodability relation for all types in languages of this kind.

Our purpose in this paper is to study the question of encodability in the setting of such languages. We will show, in fact, that the relation $\preceq_{(L,=)}$ remains stable across a significant class of language theories $(L,=)$; furthermore, this relation is easy to characterize syntactically and enjoys some very pleasing properties. More specifically, we shall establish the following:

1. The relation $\preceq_{(L,=)}$ is the same for all language theories $(L,=)$ with

$$(L_1,=_1) \;\sqsubseteq\; (L,=) \;\sqsubseteq\; (\mathrm{PCF},=_{\mathrm{obs}}).$$

We henceforth write this relation on types as $\preceq$.

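2. We have $\sigma\preceq\tau\preceq\sigma$ if and only if $\sigma,\tau$ are trivially isomorphic, i.e. iff there is an isomorphism between them generated by canonical isomorphisms of type

$$\begin{aligned}
(\rho\times\rho')\to\rho'' &\;\cong\; \rho\to\rho'\to\rho'' \\
\rho\to(\rho'\times\rho'') &\;\cong\; (\rho\to\rho')\times(\rho\to\rho'') \\
\rho\times(\rho'\times\rho'') &\;\cong\; (\rho\times\rho')\times\rho'' \\
\rho\times\rho' &\;\cong\; \rho'\times\rho
\end{aligned}$$

(This is a well-known axiomatization for isomorphisms of types built from $\to$ and $\times$ in pure $\lambda$-calculi: see e.g. [4].) It follows, for instance, that $\sigma,\tau$ can be isomorphic relative to $(\mathrm{PCF},=_{\mathrm{obs}})$ (i.e. there are terms $e:\sigma\to\tau$, $d:\tau\to\sigma$ with $d\circ e =_{\mathrm{obs}} \mathit{id}$, $e\circ d =_{\mathrm{obs}} \mathit{id}$) only if they are trivially isomorphic.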

3. For any $\sigma,\tau$, we have either $\sigma\preceq\tau$ or $\tau\preceq\sigma$. Thus, $\preceq$ induces a total ordering (which we also write as $\preceq$) on types modulo trivial isomorphism.

4. This total ordering on isomorphism classes of types is in fact a well-ordering of order type $\epsilon_0$. What is more, there is a close syntactic correspondence between simple types and Cantor normal forms for ordinals below $\epsilon_0$. As we shall see, this correspondence leads to a simple syntactic characterization of $\preceq$ showing that this relation is readily decidable.
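The four canonical isomorphisms mentioned in item 2 can be modelled informally as follows. This is a sketch in Python with tuples standing in for product types and functions for arrow types; the function names are illustrative and are not taken from the paper.

```python
def curry(f):
    """(rho x rho') -> rho''  becomes  rho -> rho' -> rho''."""
    return lambda x: lambda y: f((x, y))


def uncurry(g):
    """Inverse of curry."""
    return lambda p: g(p[0])(p[1])


def split(f):
    """rho -> (rho' x rho'')  becomes  (rho -> rho') x (rho -> rho'')."""
    return (lambda x: f(x)[0], lambda x: f(x)[1])


def join(fg):
    """Inverse of split."""
    return lambda x: (fg[0](x), fg[1](x))


def assoc_left(t):
    """rho x (rho' x rho'')  becomes  (rho x rho') x rho''."""
    a, (b, c) = t
    return ((a, b), c)


def assoc_right(t):
    """Inverse of assoc_left."""
    (a, b), c = t
    return (a, (b, c))


def swap(p):
    """rho x rho'  becomes  rho' x rho; swap is its own inverse."""
    return (p[1], p[0])


add = lambda p: p[0] + p[1]
assert curry(add)(2)(3) == 5
assert uncurry(curry(add))((2, 3)) == 5
assert assoc_right(assoc_left((1, (2, 3)))) == (1, (2, 3))
```

Each map has a two-sided inverse, which is what distinguishes these trivial isomorphisms from the one-sided retractions studied in the rest of the paper.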

We note at the outset that this picture may break down if languages more powerful than PCF are admitted, or if equalities more generous than observational equivalence are considered. On the one hand, if we move to a language such as PCF+parallel-or+exists or PCF+catch for which a universal type exists, then many other non-trivial encodings between types will be possible [7]. (In an extension of PCF with higher-order references, even non-trivial isomorphisms between types can appear [2].) On the other hand, if our language is PCF (or even System T) and we work up to equivalence with respect to observing contexts drawn only from System T, we will find that every type actually becomes definably isomorphic to the pure type of the same level. This is an easy consequence of Theorem 4.2.9 of [7], which establishes this fact for extensional total type structures over $\mathbb{N}$ under mild hypotheses.

In Section 2 we establish the ‘positive’ content of the above results: the existence of an ordinal ranking on types leading to the definition of a total preorder with associated equivalence ; the existence of a trivial isomorphism whenever ; and the existence of a -retraction whenever . In Sections 3 and 4 we proceed to the ‘negative’ part, namely the fact that if then no retraction exists, even with respect to . We establish this using the technology of nested sequential procedures for PCF. Since the argument in full generality is quite complex, we first treat the case when is a pure type , then use this to motivate some of the ideas required for the general case.

I am grateful to Dag Normann, both for raising the question of characterizing the encodability relation for all simple types in the setting of , and also for the key insight that deeply nested constituents of types contribute more to their complexity than shallow ones: e.g.  is a more complex type than , which is more complex than . This was the idea that led to the ordinal ranking of types as exhibited in Section 2.

## 2 An ordinal ranking for types

Let us begin by defining the relation $\sim$ on types to be the congruence generated by the ‘trivial’ equivalences mentioned above:

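$$\begin{aligned}
(\rho\times\sigma)\to\tau &\;\sim\; \rho\to\sigma\to\tau \\
\rho\to(\sigma\times\tau) &\;\sim\; (\rho\to\sigma)\times(\rho\to\tau) \\
\rho\times(\sigma\times\tau) &\;\sim\; (\rho\times\sigma)\times\tau \\
\sigma\times\tau &\;\sim\; \tau\times\sigma
\end{aligned}$$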

Clearly, each of these generating equivalences corresponds to an isomorphism of types expressible in ; it follows easily that if then . If , we shall say that are trivially isomorphic. We note in particular that for any , and that for any permutation of .

To define and the ordinal ranking of types, we shall work with the subclass of types generated by the grammar

$$\rho \;::=\; N \;\mid\; \theta\to N \qquad\qquad \theta \;::=\; \rho \;\mid\; \theta\times\rho$$

We shall refer to these here as uncurried types (ad hoc terminology). It is easy to see that every type $\sigma$ is trivially isomorphic to some uncurried type.

The following inductive clauses assign ordinal ranks to certain well-behaved uncurried types:

• .

• If then .

• If then .

We may refer to the types to which a value for is assigned by this inductive definition as canonical types. (We might also add the empty product type and declare that , but this would introduce complications later on which we prefer to avoid.)

Note that a canonical type is in effect a representation of the Cantor normal form for the ordinal . For our purposes, Cantor normal forms will be formal expressions generated inductively by the following clauses (we generate them simultaneously with a valuation mapping them to actual ordinals). Note that we here modify the usual definition so as to exclude 0.

• is a Cantor normal form, where .

• If is a Cantor normal form then so is , where .

• If are Cantor normal forms with , then is a Cantor normal form with .

In practice, we shall sometimes blur the distinction between Cantor normal forms and the ordinals they denote.

The correspondence between Cantor normal forms and canonical types is now immediate. The well-known fact that every ordinal below $\epsilon_0$ has a unique Cantor normal form now gives us:

###### Proposition 1

For any ordinal , there is a unique canonical type with .

It is also easy to see by induction on type levels that every uncurried type, and hence every type , is isomorphic to a unique canonical type (simply by admitting permutations of products ). This allows us to extend our ranking to all types, and we may now define iff .
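To make the decidability of the resulting preorder concrete, here is a small Python sketch that computes a rank in Cantor normal form for an arbitrary simple type and compares two ranks. It rests on assumptions that should be read as one plausible rendering of the inductive clauses above, not as the paper's definition: $N$ is given rank $1$, a type $\theta\to N$ is given rank $\omega^{\mathrm{rank}(\theta)}$, and the rank of a product is the sum of the ranks of its factors taken in non-increasing order. The tuple encoding of types and ordinals and all function names are hypothetical.

```python
# Types are "N", ("->", s, t) or ("*", s, t).
# An ordinal below epsilon_0 is a tuple of exponents in non-increasing order,
# read as omega^e_0 + ... + omega^e_{n-1}; each exponent is again such a tuple.
ZERO = ()
ONE = (ZERO,)


def cmp_ord(a, b):
    """Three-way comparison of Cantor normal forms: returns -1, 0 or 1."""
    for x, y in zip(a, b):
        c = cmp_ord(x, y)
        if c != 0:
            return c
    if len(a) == len(b):
        return 0
    return 1 if len(a) > len(b) else -1


def nat_sum(a, b):
    """Merge two non-increasing exponent lists (the natural sum of ordinals)."""
    out, i, j = [], 0, 0
    while i != len(a) and j != len(b):
        if cmp_ord(a[i], b[j]) >= 0:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    return tuple(out) + a[i:] + b[j:]


def rank(t):
    """Assumed ranking: N has rank 1, products add, theta -> N has rank omega^rank(theta)."""
    if t == "N":
        return ONE
    op, s, u = t
    if op == "*":
        return nat_sum(rank(s), rank(u))
    # s -> u: uncurrying turns each summand omega^e of rank(u) into omega^(rank(s) + e).
    rs = rank(s)
    return tuple(nat_sum(rs, e) for e in rank(u))


def encodable(s, t):
    """Decide the preorder by comparing ranks."""
    return cmp_ord(rank(s), rank(t)) != 1


N = "N"
pair = ("*", N, N)
fun = ("->", N, N)
assert encodable(pair, fun) and not encodable(fun, pair)
# Currying does not change the rank: N -> N -> N and (N x N) -> N both get omega^2.
assert rank(("->", N, fun)) == rank(("->", pair, N))
```

Under these assumptions $N\times N$ receives rank $2$ and $N\to N$ receives rank $\omega$, so the sketch agrees with the introductory example in which pairs are encodable in functions but not conversely.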

It is thus clear that is a total preorder on types, that iff iff , and that is readily decidable. We now work towards showing that if then . This will in fact be easy once we have established a certain way of inductively generating the order relation on ordinals below . Let us say a formal sum is a Cantor sum if the Cantor normal form of is , where is the Cantor normal form of (this amounts to the condition that ). Now let be the binary relation on ordinals generated by the following clauses:

1. .

2. implies .

3. .

4. for any .

5. If then , where , are Cantor sums.

6. If then .

Clearly if then , since also satisfies the above properties. Moreover:

###### Proposition 2

If then .

Proof: We show by complete induction on that for all we have . For this is trivial by clause 1 above. For the successor case, if the induction claim holds for , then for any we have either , in which case by clause 1, or , in which case by the induction hypothesis and clauses 3 and 2. For limit ordinals, suppose is expressed as a Cantor sum where . If is itself a successor, say , then , so for any , either (in which case clause 1 applies) or for some we have . But by the induction hypotheses for we have , and by clauses 4 and 5 we have . Hence by clause 2 we have as desired.

The remaining case is that is expressed as a Cantor sum where is a limit ordinal. Since , we have , so we may use the induction hypothesis for . Taking any sequence with limit , we have that , so for any we again have either (so that clause 1 applies) or for some . But in the latter case, we have by the induction hypothesis for ; but also by the induction hypothesis for , whence by clauses 5 and 6. Hence again by clause 2.

The following now establishes the existence of the required retractions. Note that with Proposition 2 in hand, only the most trivial manipulations of $\lambda$-terms are needed.

###### Proposition 3

Whenever , we have : that is, there are terms and such that .

Proof: In view of Proposition 2, it suffices to show by induction on the generation of that if then . This is thus just a question of treating each of the six clauses for in turn. For clauses 1 and 2, we use the usual identity and composition of retractions. For clause 3, a retraction is given by the terms and . For clause 4, we note that (writing for the product of copies of ) and . We may thus embed the former in the latter by the mapping

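$$\langle f_0,\ldots,f_{k-1}\rangle \;\mapsto\; \lambda\langle x,z\rangle.\ \mathit{if}_0\,z\,(f_0\,x)\,(\mathit{if}_1\,z\,(f_1\,x)\,(\cdots(\mathit{if}_{k-1}\,z\,(f_{k-1}\,x)\,\hat{0})\cdots))$$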

and project the latter to the former by the mapping

$$g \;\mapsto\; \langle \lambda x.\,g\langle x,\hat{0}\rangle,\ \ldots,\ \lambda x.\,g\langle x,\widehat{k-1}\rangle\rangle.$$

It is routine to check that the composition of these is -convertible to the identity. For clauses 5 and 6, we use the familiar liftings of a retraction to and .
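The embedding and projection used for clause 4 can be mimicked directly in Python. In this sketch a chain of equality tests plays the role of the nested $\mathit{if}_i$ constants, pairs are Python tuples, and the names `embed` and `project` are illustrative only.

```python
def embed(fs):
    """Send a k-tuple of functions to one function on pairs (x, z), selecting f_z when z is below k."""
    def g(xz):
        x, z = xz
        for i, f in enumerate(fs):
            if z == i:          # plays the role of if_i z (f_i x) (...)
                return f(x)
        return 0                # the default numeral 0 at the end of the chain
    return g


def project(g, k):
    """Recover the k-tuple by fixing the second argument to 0, ..., k-1."""
    return tuple((lambda x, i=i: g((x, i))) for i in range(k))


fs = (lambda x: x + 1, lambda x: 2 * x, lambda x: 7)
back = project(embed(fs), len(fs))
assert all(back[i](x) == fs[i](x) for i in range(3) for x in range(10))
```

Projecting after embedding returns functions that agree pointwise with the originals, which is the retraction property; embedding after projecting would in general discard the behaviour of a function at second arguments outside $0,\ldots,k-1$.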

###### Theorem 4

Whenever , we have , whence for any language theory .

Proof: Immediate from Proposition 3 and the trivial isomorphisms , .

From the above proofs it is also easy to extract an algorithm which, given any types with , constructs terms and that constitute a -retraction.
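The generic combinators that such an algorithm would assemble can be sketched as follows. A retraction is represented as a pair `(e, d)` of Python functions with `d(e(x)) == x`. The liftings shown assume that the ‘familiar liftings’ act on the first factor of a product and on the argument of a function type in the evident way; the names `compose`, `lift_product`, `lift_arrow` and `pad` are hypothetical.

```python
def compose(r1, r2):
    """Given sigma as a retract of tau and tau as a retract of upsilon, build sigma as a retract of upsilon."""
    (e1, d1), (e2, d2) = r1, r2
    return (lambda x: e2(e1(x)), lambda z: d1(d2(z)))


def lift_product(r):
    """Given sigma as a retract of tau, build (sigma x rho) as a retract of (tau x rho)."""
    e, d = r
    return (lambda p: (e(p[0]), p[1]), lambda q: (d(q[0]), q[1]))


def lift_arrow(r):
    """Given sigma as a retract of tau, build (sigma -> N) as a retract of (tau -> N)."""
    e, d = r
    return (lambda f: (lambda y: f(d(y))), lambda g: (lambda x: g(e(x))))


# A base case in the spirit of clause 3: sigma is a retract of sigma x N.
pad = (lambda x: (x, 0), lambda p: p[0])

e, d = lift_arrow(pad)
f = lambda x: x + 3
assert d(e(f))(4) == f(4)

e2, d2 = compose(pad, lift_product(pad))
assert d2(e2(5)) == 5
```

Note the contravariance in `lift_arrow`: the encoding on function types precomposes with the original decoding, and vice versa.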

## 3 A non-encodability result for pure types

It remains to show that if then no retraction can exist even with respect to . In view of the results of Section 2, it will suffice to show that we never have for any : that is, for no type can we have . In this section we shall establish this for the case when is a pure type (where and ); this will introduce many of the key ingredients in a relatively uncluttered form, in preparation for the general case which we treat in Section 4.

We assume that the reader is familiar with the language PCF and the associated notion of observational equivalence, and knows how to set up a version of PCF with product types and the single base type $N$. We shall write $\mathrm{PCF}^\Omega$ for the extension of PCF with an ‘oracle constant’ for every (classical) partial function $\mathbb{N}\rightharpoonup\mathbb{N}$.

We also assume familiarity with the nested sequential procedure (NSP) model for as presented in [7, Chapter 6] or [6], and with the notation and terminology used there. We write for observational equivalence of NSPs, and for the observational preorder on them. As it stands, the model does not have product types, but this is not an essential limitation. Indeed, it is well-known that any type may be converted to a trivially isomorphic type in curried form — that is, one of the form where each is -free — in such a way that any -free is its own curried form. For a general type with curried form , one may therefore simply define the set to be the product .

We shall in fact show something a little stronger than the non-existence of a retraction. The following concepts will be useful:

###### Definition 5

(i) We say is a pseudo-retract of , and write , if there are closed terms and such that , where is the observational preorder on terms.

Equivalently, in terms of sequential procedures, we may say that if $\sigma,\tau$ respectively have curried forms $\sigma_0\times\cdots\times\sigma_{l-1}$ and $\tau_0\times\cdots\times\tau_{m-1}$, then a pseudo-retraction consists of sequential procedures

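$$z_0^{\sigma_0},\ldots,z_{l-1}^{\sigma_{l-1}} \vdash t_0:\tau_0,\ldots,t_{m-1}:\tau_{m-1}, \qquad x_0^{\tau_0},\ldots,x_{m-1}^{\tau_{m-1}} \vdash r_0:\sigma_0,\ldots,r_{l-1}:\sigma_{l-1}$$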

such that for each we have . We say is a pseudo-retract of if such a pseudo-retraction exists; the standard theory of sequential procedures implies that this agrees with the definition via terms.

(ii) A pseudo-retraction as above is strict if : more formally, if for all we have .

(iii) A pseudo-retraction is left-strict with respect to if .

Although we shall not always bother to distinguish between different ways of bracketing complicated product types, it is important to note that the concept of left-strictness is defined relative to a certain way of dividing up the product type on the right-hand side—more specifically, relative to the identification of as the ‘first’ component of the product. If is empty, then of course left-strictness coincides with strictness.

Our goal in this section will be to prove:

###### Theorem 6

For any , the type is not a pseudo-retract of .

This will follow readily from:

###### Lemma 7

Suppose , and is any sequence of types of level . Then any pseudo-retraction must be left-strict with respect to . More formally, given any NSPs

 zk⊢t:¯¯¯k,      zk⊢ui:ρi  (i

such that , we must have that .

We formulate the lemma in terms of a finite sequence of types rather than just a single type of level so as to cater smoothly for the case , when is simply .

To see that the lemma implies the theorem, suppose we had a pseudo-retraction comprised by

$$z' : \bar{k},\ y' : N \vdash t : \bar{k}, \qquad x' : \bar{k} \vdash p : \bar{k},\ q : N$$

This gives rise to a pseudo-retraction comprised by

$$z' : \bar{k} \vdash t' \equiv t[y\mapsto\lambda.0] : \bar{k}, \qquad x' : \bar{k} \vdash p : \bar{k}$$

To see that this is non-strict, we note that , but that since whereas . This implies that , contradicting Lemma 7 in the case . (This argument actually shows that if our language were extended with the unit type , then even would not be a retract of .)

The proof of the lemma itself will be modelled largely on the proof of [7, Theorem 7.7.1] (see also [6, Theorem 12] for a slightly improved exposition); we shall refer to this below as the ‘standard proof’. We reason by induction on .

The case is trivial: here we must have since there are no types of level , so it suffices to note that if then for some , hence is not invertible.

Suppose then that where the lemma holds for , and suppose we have

$$z^k \vdash t:\bar{k}, \qquad z^k \vdash \vec{u}:\vec{\rho}, \qquad x^k,\ \vec{y}^{\,\vec{\rho}} \vdash r:\bar{k}$$

where , . We wish to show that .

Let , so that .

Claim 1: has the syntactic form , where .

Proof (transcribed from standard proof): Clearly does not have the form or , and the only other alternative form is . In that case, however, we would have

 ≪v[z↦λwk−1.0]≫⋅⊥k−1 = ⊥,

contradicting . This establishes Claim 1.

Now let denote the ‘dummy substitution’ .

Claim 2: , or equivalently .

Proof (adapted from standard proof): By the NSP context lemma, it will suffice to show that for any and . (Here and in what follows, the application to should be omitted in the case .) So suppose whereas for some . Take , so that whenever . Then by the context lemma, so we have since is maximal in . By the definition of , it follows that , whence , whereas , contradicting . This completes the proof of Claim 2.

Now consider the head reduction sequence

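$$z \vdash r[x\mapsto t,\ \vec{y}\mapsto\vec{u}] \;\rightsquigarrow_h^{*}\; \lambda f^{k-1}.\ \mathtt{case}\ z\,P\ \mathtt{of}\ (\cdots),$$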

where . The head on the right hand side will have some ancestor within ; and since is not free in , this must appear as the head of some application within either or .

Case 1: comes from , say from . Since where , all bound variables within are of level . Let be the list of bound variables of in scope at the critical occurrence of . Then as in the standard proof, by tracking the subterm through the above reduction sequence, we easily see that for some meta-terms . So we have

 fk−1 ⊢ p′∗[→y′↦→T′∗] = P∗ ≈ p∗ ⪰ fη.

This exhibits as a retract of some finite product of level types , contradicting Theorem 7.7.1 of [7]. Alternatively, it contradicts the induction hypothesis of the present proof, since we can easily extend this to a retraction where (or just in the case ).

Case 2: comes from . Write as , where has type , but all variables bound within are of level . Let be the list of bound variables of that are in scope at the critical occurrence of . Then as above, we have that for some meta-terms . So we have

 fk−1 ⊢ p′∗[x′↦T′∗,→y′↦→U′∗] = P∗ ≈ p∗ ⪰ fη.

Let denote the product of the types of the . Then , and the above constitutes a pseudo-retraction , where is given by , by , and by .

By the induction hypothesis, this pseudo-retraction is left-strict w.r.t. : that is, , or more formally . To show that this implies that our original pseudo-retraction is left-strict, we require a further argument that did not feature in the standard proof.

Claim 3: has head variable , whence .

Proof: Since somewhere contains the application , is not a constant procedure, and the only other alternative is that has the form (omitting in the case ). By tracking the transformation of to through the head reduction sequence for , we now see that this sequence must have contained reductions

 λf.caset0T′of(⋯) ≡ λf.case(λx′.casex′qof(⋯))T′of(⋯) ⇝h λf.case(caseT′q′of(⋯))of(⋯) ⇝h λf.caseT′q′of(⋯),

where . Specializing via , we obtain

 r[x↦t∗,→y↦→u∗]  ⇝∗h  λf.caseT′∗q′∗of(⋯)

where . But , so can only be the procedure . Thus the subterm above evaluates to , and so itself must evaluate to some numeral, say . Finally, specializing to , we have , so , contrary to what was established above by the induction hypothesis. (For the case , the references to should of course be deleted.)

The second part of the claim follows trivially, giving the desired left-strictness of . This completes the proof of Lemma 7, and hence of Theorem 6.

## 4 Non-encodability in the general case

We now wish to generalize the above proof to show that:

###### Theorem 8

The type is not a pseudo-retract of for any .

This will follow from a lemma proved by induction on the ordinal rank of . In the general setting, however, this lemma will need to be formulated somewhat more subtly than Lemma 7. Motivated by the arguments of the previous section, we introduce the following concept:

###### Definition 9

Suppose and are lists of -free types. A quasi-retraction consists of terms

such that we have a head reduction sequence

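$$z \vdash r[x\mapsto u] \;\rightsquigarrow_h^{*}\; \lambda \vec{f}^{\,\vec{\tau}}.\ \mathtt{case}\ z\,\vec{P}\ \mathtt{of}\ (\cdots)$$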

where .

The reader will see that a situation very close to this appeared in the course of the proof of Lemma 7. As we will shortly see, a quasi-retraction gives rise to a pseudo-retraction where the are in some sense ‘lower’ types. Nevertheless, in the general setting, the existence of a quasi-retraction turns out to afford a more suitable induction hypothesis than the existence of such a pseudo-retraction, since the former implicitly imposes some useful additional constraints on how these components will behave.

We shall often identify a list of types with the corresponding product type . Thus, if and are any types in curried form, respectively the products of the lists and , then we may refer to a quasi-retraction also as a quasi-retraction .

In the situation of Definition 9, the head on the right-hand side will originate from some application subterm within . Let be the list of bound variables of in scope at the point of this subterm’s occurrence. These will consist of the top-level bound variables of , of types (we call these the major variables of ), plus possibly some others, say of types (we call these the minor variables). The latter will be associated with applications within that contain the critical occurrence of . (Note that none are associated with applications , since the outer would then prevent the critical from emerging as the head variable.) It follows that if the types are all of level , then the types are all of level ; however, there may be types among that are higher than some among .

We also have in this situation that for some meta-terms . Writing for the substitution , it follows that , so that exhibit a pseudo-retraction . We call this the associated pseudo-retraction of the quasi-retraction , and refer to the and as its major and minor components respectively.

###### Definition 10

We say a quasi-retraction is left-strict with respect to if the associated pseudo-retraction is left-strict with respect to . In the case that is empty, we may also say simply that such a quasi-retraction is strict.

Once again, we note that the notion of a left-strict quasi-retraction is defined relative to some identification of the ‘first’ component of the product on the right hand side.

It will be helpful to know that any quasi-retraction can be replaced by an ‘equivalent’ one of a more restricted form. Specifically, we shall say a quasi-retraction is simple if contains just a single free occurrence of , and this is at the head of