1 Introduction
In type theory and programming languages, recursive types are types μ X. T where the variable X bound by μ stands for the entire type expression again. The relationship of a recursive type μ X. T to its one-step unrolling [μ X. T / X] T is the basis for the important distinction between iso- and equi-recursive types [Crary+99] (see also Section 20.2 of [pierce02]). With iso-recursive types, the two types are related by constant-time functions unroll and roll which are mutual inverses (composition of these two in any order produces a function that is extensionally the identity function). With equi-recursive types, the recursive type and its one-step unrolling are considered definitionally equal, and unroll and roll are not needed to pass between the two.
Without restrictions, adding recursive types as primitives to an otherwise terminating theory allows typing of diverging terms. For example, let T abbreviate μ X. X → X. Then we see that T is equivalent to T → T, allowing us to assign type T → T to λ x. x x. From that type equivalence, we see that we may also assign type T to this term, allowing us to type the diverging term (λ x. x x) (λ x. x x).
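The divergence can be observed directly in an untyped setting. In this Python sketch (our own illustration, not from the paper), the self-application loops until the interpreter's recursion limit trips, with a RecursionError standing in for true divergence:

```python
# (\x. x x) (\x. x x): each reduction step reproduces the original term.
delta = lambda x: x(x)

try:
    delta(delta)
    diverged = False
except RecursionError:
    diverged = True

assert diverged
```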
Diverging terms usually must be avoided in type theory to retain soundness of the theory as a logic under the Curry-Howard isomorphism [Sorensen06]. The usual restriction on recursive types is to require that to form (alternatively, to introduce or to eliminate) μ X. T, the variable X must occur only positively in T, where the function-type operator preserves polarity in its codomain part and switches polarity in its domain part. For example, X occurs only positively in Y → X, while in X → X it occurs both positively and negatively. Since positivity is a syntactic condition, it is not compositional: if X occurs positively in T and in T' containing also variable Y, this does not mean it will occur positively in [T/Y]T' (the substitution of T for Y in T'). For example, take T to be X and T' to be Y → X.
In search of a compositional restriction for ensuring termination in the presence of recursive types, [matthes98, matthes02] investigated monotone iso-recursive types in a theory that requires evidence of a property of monotonicity equivalent to the following property of a type scheme F (where the center dot indicates application to a type):
∀ X. ∀ Y. (X → Y) → F · X → F · Y
In Matthes’s work, monotone recursive types are an addition to an underlying
type theory, and the resulting system must be analyzed anew for such properties
as subject reduction, confluence, and normalization. In the present paper, we
take a different approach by deriving monotone recursive types within an
existing type theory, the Calculus of Dependent Lambda Eliminations
(CDLE) [stump17, stump18c]. Given any type scheme F
satisfying a form
of monotonicity, we show how to define a type Rec ·F
together with
constant-time functions recRoll
and recUnroll
witnessing the
isomorphism between Rec ·F
and F ·(Rec ·F)
. The definitions are
carried out in Cedille, an implementation of CDLE. The main benefit to this
approach is that the existing metatheoretic results for CDLE – namely,
confluence, logical soundness, and normalization for a class of types that
includes ones defined here – apply, since they hold globally and hence perforce
for the particular derivation of monotone recursive types.
Recursive representations of data in lambda calculi
One important application of recursive types is their use in forming inductive datatypes, especially within a pure type theory where data must be encoded using λ-expressions. The most well-known method of lambda encoding is the Church encoding, or iterative representation, of data, which produces terms typable in unextended System F. The main deficiency of Church-encoded data is that data destructors, such as predecessor for naturals, can take no better than linear time to compute [parigot89, SU99_TypeFixpointsIterationRecursion]. As practical applications of Cedille’s derived recursive types, we derive generically two recursive representations of data (described by [parigot1992, parigot89]), the Parigot encoding and the Scott encoding, for which efficient destructors are known to exist (see [SF16_EfficiencyofLambdaEncodingsinTotalTypeTheory] for discussion of the efficiency of these and other lambda encodings). For both encodings, we also derive a recursion scheme and induction principle. That this can be done for the Scott encoding in CDLE is itself quite a surprising result; it builds on the derivations by [lepigre+19, parigot88] of a strongly normalizing recursor for Scott naturals in, respectively, a Curry-style type theory and a logical framework.
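The efficiency gap can be illustrated on the untyped terms themselves. The following Python sketch (our own rendering; the lambda terms are the standard Church and Scott numerals) contrasts the linear-time Church predecessor with the constant-time Scott predecessor:

```python
# Church naturals: a number is its own iterator, n = \s. \z. s^n z.
c_zero = lambda s: lambda z: z
c_suc = lambda n: lambda s: lambda z: s(n(s)(z))

# Scott naturals: a number is its own case-analysis and carries its
# predecessor directly: zero = \z. \s. z, suc n = \z. \s. s n.
s_zero = lambda z: lambda s: z
s_suc = lambda n: lambda z: lambda s: s(n)

def c_to_int(n):
    return n(lambda x: x + 1)(0)

# Church predecessor: rebuild the whole number with the pair trick,
# doing work linear in the size of the input.
def c_pred(n):
    first, _ = n(lambda p: (p[1], c_suc(p[1])))((c_zero, c_zero))
    return first

# Scott predecessor: a single case analysis, constant time.
s_pred = lambda n: n(s_zero)(lambda m: m)

assert c_to_int(c_pred(c_suc(c_suc(c_zero)))) == 1
assert s_pred(s_suc(s_zero)) is s_zero
```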
Overview of this paper.
We begin the remainder of this paper with a short introduction to CDLE (Section 2), before proceeding to the derivation of monotone recursive types (Section 3). Presentation of the applications of recursive types for deriving inductive datatypes with lambda encodings follows a common structure: Section 4 covers Scott encodings by first giving a concrete derivation of naturals with a weak induction principle, then the fully generic derivation; Section 5 gives a concrete example for Parigot naturals with the expected induction principle, then the fully generic derivation, and some important properties of the generic encoding (proven within Cedille); and Section 6 revisits the Scott encoding, showing concretely how to derive the recursion principle for naturals, then generalizes to the derivation of the standard induction principle for generic Scott-encoded data, and shows that the same properties hold also for this derivation. Finally, Section 7 concludes by discussing related and future work. All code and proofs appearing in listings can be found in full at https://github.com/cedille/cedilledevelopments/tree/master/recursiverepresentationofdata.
2 CDLE, Cedille, and Lambda Encodings
The Calculus of Dependent Lambda Eliminations (CDLE), implemented in the Cedille proof assistant, is a logically consistent constructive type theory based on pure lambda calculus [stump17]. Datatypes supporting an induction principle are derived within CDLE via encodings like the well-known Church encoding. Geuvers proved that such derivations are impossible in pure second-order dependent type theory [geuvers01]. To overcome this fundamental limitation, CDLE extends the Calculus of Constructions with three new type constructs (see below). Using these, induction was first derived for Church-encoded natural numbers [Stu18_FromRealizibilitytoInduction]. Subsequently, derivations were carried out generically, both for the Church encoding and for the less well-known Mendler encoding: given a type scheme satisfying certain properties, the inductive type with its (categorical) constructor, destructor, and induction principle was derived [firsov18b, firsov18].
Because it does not incorporate a datatype subsystem, a core version of Cedille (“Cedille Core”) may be described very concisely, in 20 typing rules, occupying half a page [stump18b]. These have been implemented in less than 1000 lines of Haskell in a checker that comes with Cedille. Cedille itself checks code written in a higher-level language, including support for inductive datatypes and a form of pattern-matching recursion, which elaborates down to Cedille Core.
We recapitulate the core ideas of Cedille. CDLE is an extrinsic (aka Curry-style) type theory, whose terms are exactly those of the pure untyped lambda calculus, with no additional constants or constructs. Cedille has a system of annotations for such terms, which contain sufficient information to type terms algorithmically. But these annotations play no computational role, and are erased both during compilation and by definitional equality. The latter is the congruential extension of equality on erased terms and (at present) just equality on types.
CDLE extends the (Currystyle) Calculus of Constructions (CC) with three constructs: implicit products, primitive heterogeneous equality, and dependent intersection types. Figure 1 shows the typing rules for annotated terms, for these constructs. The erasures of these annotations are given in Figure 2. In more detail, the additional constructs are:
The implicit product type
∀ x: T. T’
of
[miquel01]. This can be thought of as the type for
functions which accept an erased (computationally irrelevant) input x
of type T
, and
produce a result of type T’
. There are term constructs
Λ x. t
for introducing an implicit input x
, and
t -t’
for instantiating such an input with t’
. The
implicit arguments exist just for purposes of typing. They
play no computational role, and indeed definitional equality is
defined only for erased terms (no implicit introductions or eliminations).
When x
is not free in T’
, we allow ∀ x: T. T’
to be written as T ➾ T’
, similarly to writing T ➔ T’
for Π x: T. T’
.
An equality type
{t₁ ≃ t₂}
on untyped terms. The terms
t₁
and t₂
must have no undeclared free variables, but need not
be typable. We introduce this with the term β<t>{t’}
, which proves
{t ≃ t}
and erases to (the erasure of) t’
. If omitted, t’
defaults to λ x. x
. Combined with definitional equality, β
can
be used to prove {t₁ ≃ t₂}
for any equal t₁
and
t₂
whose free variables are all declared in the typing context. By
allowing the term to erase to any closed (in context) term, we
effectively add a top type to the language, since every term proves a true
equation. We dub this the Kleene trick, as one may find the idea in
Kleene’s later definitions of numeric realizability, where any number is
allowed as a realizer for a true atomic formula [kleene65].
We eliminate the equality type by rewriting, using the
construct ρ q - t
.
If the expected type of the expression ρ q - t
is T, and
q
proves {t₁ ≃ t₂}
, then t
is checked against a type
produced by replacing all occurrences of (terms convertible with) t₁
with t₂
.
For convenience, the Cedille tool also implements an enhanced variant of
rewriting invoked by ρ+
where the expected type T
is successively
reduced and, for each reduction, the resulting type has all occurrences of
t₁
replaced by t₂
.
The construct φ q - t₁ {t₂}
casts a term t₂
to type
T
, provided that t₁
has type T
and q
proves {t₁ ≃ t₂}
. The term φ q - t₁ {t₂}
erases to
t₂
. This is similar to the direct computation rule of
NuPRL (see Section 2.2 of [allen+06]).
The dependent intersection type
ι x: T. T’
of
[kopylov03]. This is the type for terms t
which can be
assigned both the type T
and the type [t/x]T’
, the substitution
instance of T’
by t
. In the annotated language, we introduce a
value of ι x: T. T’
by construct [ t, t’ @ x.T]
, where t
has type T
, t’
has type [t/x]T’
, and the
erasure t
is definitionally equal to the erasure t’
. The
annotation @ x.T
serves to specify the desired unsubstitution of the type
of the second component. The T
or [t.1/x]T’
view of a term
t
of type ι x: T. T’
is selected with t.1
and t.2
,
respectively.
We give two of the main metatheoretic results of CDLE. For the full definition of the theory including kinding rules for types, as well as a semantics for types and proofs of the following theorems, see [stump18c]:
Theorem 1 (Logical consistency)
There is no term t such that ⊢ t : ∀ X: ★. X.
Theorem 2 (Call-by-name normalization of functions)
Suppose t is closed, and there exists a closed term t’ which erases to t and whose type is Π x: T. T’ for some T and T’. Then t is call-by-name normalizing.
In the code below, we elide annotations on the introduction forms for equalities and for dependent intersections, as they are inferred by Cedille. Cedille also infers many type arguments to functions; where needed, they are written with center dot.
3 Deriving Recursive Types in Cedille
To derive recursive types in Cedille, we implement a proof of Tarski’s fixed-point theorem for monotone functions over a complete lattice. We recall here just the needed simple corollary of Tarski’s more general result (cf. [lassez82]).
3.1 Tarski’s Theorem
Theorem 3 ([tarski55])
Suppose f is a monotone operation on a complete lattice (L, ≤). Let C = { x ∈ L | f x ≤ x } and q = ⊓ C. Then f q = q.
First prove f q ≤ q. For this, it suffices to prove f q ≤ x for every x ∈ C, since this will imply that f q is a lower bound of C. Since q is the greatest lower bound of C by definition of complete lattice, any other lower bound of C (i.e., f q) must then be less than or equal to q. So assume x ∈ C. We have q ≤ x since q is a lower bound of C. By monotonicity of f, we then obtain f q ≤ f x. Since x ∈ C, we have f x ≤ x, and by transitivity of ≤ we obtain f q ≤ x. From this, we obtain q ≤ f q (hence showing both inequalities and thus equality of f q and q): from f q ≤ q we obtain f (f q) ≤ f q by monotonicity. Thus, f q is in C, and hence q ≤ f q.
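The corollary can be checked concretely on a small finite example (our own sanity check, not from the paper): the powerset of a three-element set, ordered by inclusion, is a complete lattice whose greatest lower bounds are intersections.

```python
from itertools import combinations

# Powerset of {0, 1, 2}, ordered by inclusion; glb = intersection.
U = {0, 1, 2}

def subsets(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

# A monotone operation built from a constant, union, and intersection.
def f(x):
    return frozenset({0}) | (x & {1})

# C = { x | f x <= x } (the f-closed sets); q = glb C = intersection of C.
C = [x for x in subsets(U) if f(x) <= x]
q = frozenset(U)
for x in C:
    q &= x

assert f(q) == q           # q is a fixed point of f, as the theorem predicts
assert q == frozenset({0})
```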
Notice in this proof the prima facie impredicativity: we pick a fixed point of f by reference to a collection C which contains it. We will see that this impredicativity carries over to Cedille. We may also observe that, actually, the above proof applies directly to show the following stronger statement (stronger because it holds with weaker assumptions):
Theorem 4
Suppose f is a monotone operation on a preorder (L, ≤), and that the set C = { x ∈ L | f x ≤ x } has a greatest lower bound q. Then f q ≤ q and q ≤ f q.
We will need this strengthening – that (L, ≤) need not form a complete lattice – to translate the proof to Cedille. But first, we must answer several questions:

how should the ordering ≤ be implemented;

how do we express the idea of a monotone function; and

how should the meet operation ⊓ be implemented?
One possibility for these that is available in System F is to choose functions from A to B as the ordering A ≤ B, and positive type schemes F (having a free variable X, and such that a function from X to Y yields a function from F · X to F · Y) as monotonic functions. This approach, described in e.g. [Wad90_RecursiveTypesforFree], is essentially a generalization of order theory to category theory, and recursive types are defined using the Church encoding. However, recursive types so derived in System F lack the crucial property that roll and unroll are constant-time operations. Before we consider the alternative choices for these used in this paper (Section 3.3), we must first introduce the (derived) notion of a cast in Cedille.
3.2 Casts
A cast from type A to type B is a function from A to B that is provably (intensionally) equal to λ x. x (cf. [breitner+16], and [firsov18b] for the related notion of “identity functions”). With types playing the role of elements of the preorder, existence of a cast from type A to type B will play the role of the ordering A ≤ B in the proof above. Let us now walk through the definitions given in Figure 3.
This first definition from Figure 3 makes Cast ·A ·B
the
type of terms c
which are both functions from A
to B
and
also witness the fact that they are equal to the identity function. Thanks to
the Kleene trick any term can witness a true equality, so requiring that
c
witness that it is equal to λ x. x does not restrict the terms
that can be casts.
In intrinsic type theory, there would not be much more to say:
identity functions cannot map from A to B unless A and B are convertible
types. But in an extrinsic type theory like CDLE, there are many nontrivial
casts. For example, (assuming types List
and Bool
) we may map from
∀ A: ★. List ·A
to List ·Bool
using the function
λ l. l ·Bool
. This function erases to λ l. l
, and hence is indeed
a cast. For another example, we may cast from ι x: A. B
to A
using
the function λ x. x.1
. This function also erases to λ x. x
and
hence is also a cast.
Next from Figure 3: if we have a Cast ·A ·B
, the
eliminator elimCast
allows us to convert something of type
A
to something of type B
. This may seem unsurprising,
since something of type Cast ·A ·B
is a function from A
to B
. So of course one can turn an A
into a B
,
just by applying that function.
But this is not how the definition of elimCast
works. The cast
itself is an erased input to elimCast
; (using the erasedargument arrow),
so elimCast
cannot simply apply that function
to turn an A
into a B
. Instead, we use the φ
construct (strong
direct computation). The term φ (ρ c.2 - β) - (c.1 a) { a }
in
the body of the definition of elimCast
erases to a
. But it
has type B
, the same type as c.1 a
, because we can prove
{c.1 a ≃ a}
given that c
equals λ x. x
.
This proof is the first subterm of φ
(i.e., ρ c.2 - β
).
Note also that elimCast
itself erases to λ a. a
, because
the φ
term erases to a
, and the Λ
abstractions
are all erased.
Next from Figure 3: intrCast
takes in a function f
from
A
to B
, together with a proof that this function is extensionally
the identity (expressed by Π a: A. {f a ≃ a}
). These arguments are both
erased. Given these, intrCast
produces a cast from A
to B
as
follows. The cast has two parts, introduced with the squarebracket notation for
dependent intersections:

a function from
A
to B
, and 
a proof that this function equals
λ x. x
.
One would think that the proof (e
in the code) that
f
is extensionally the identity should be incorporated in the second
part. The trick is to incorporate it in the first: the function we write from
A
to B
is
λ x. φ (e x) - (f x) { x }
This function takes in x
of type A
and just returns it,
using the proof e
in the φ
term to show that f x
, which has
type B
as desired, equals just x
. This function erases to
λ x. x
, and is thus trivially shown by β
in the second component
of the proof to be intensionally the identity. As an aside, recall that by
default β
erases to λ x. x
, so the two components of the
squarebracket term indeed have the same erasure as required. So, even though
Cedille lacks function extensionality, we may still define casts extensionally.
Finally, we may compose casts, and every type has an identity cast
(castTrans
and castRefl
in Figure 3). Thus, we may
think of Cast
as a partial order on types, and it is with respect to this
order that we may express monotonicity. Furthermore, Cast harmonizes
with the notion of preorder: for any types A and B, there can
exist at most one Cast ·A ·B, just as in a preorder there is at most
one way in which A ≤ B.
3.3 Translating the proof of Theorem 4 to Cedille
Figure 4 shows the translation of the proof of
Theorem 4 to Cedille, deriving monotone recursive
types. Cedille’s
module system allows us to parametrize the module shown in
Figure 4 by the type scheme F
. Monotonicity of F
is expressed
with respect to the partial order induced
by Cast
:
Mono = ∀ X: ★. ∀ Y: ★. Cast ·X ·Y ➾ Cast ·(F ·X) ·(F ·Y).
As noted in Section 3.1, it is enough to require that the set of
F-closed sets has a greatest lower bound. Semantically, the meaning of an
impredicative quantification ∀ X: ★. T
is the intersection of the
meanings (under different assignments of meanings to the variable X
) of
the body. Such an intersection functions as the greatest lower bound, as we will
see. The definition of Rec
in Figure 4 thus expresses the
intersection of the set of all F-closed types X. This Rec corresponds to q in the proof of Theorem 4. Semantically, we are taking the intersection of all those sets X which are F-closed. So, the greatest lower bound of the set of all F-closed elements becomes the intersection of all F-closed types, where X’s being F-closed means there is a cast from F ·X to X. We require just an erased argument of type Cast ·(F ·X) ·X. By making the argument erased, we express the idea that we are taking the intersection of sets satisfying a property (being F-closed).
Next from Figure 4: recCast implements the fact that if X is F-closed, then Rec ·F is less than or equal to X; it corresponds to the first part of the proof above, that q ≤ x for any x ∈ C. The function recRoll implements the part of the proof that establishes f q ≤ q. The function recUnroll implements the second part, that q ≤ f q. It is there that the impredicativity
noted above appears. In recCast, casting the Rec ·F argument
d to the type X involves instantiating the type argument of
d to X; in recUnroll
, the chosen instantiation is
F ·(Rec ·F). This would not be possible in predicative type theory.
Since elimCast
erases (as noted in Section 3.2) to
λ a. a
, it is not hard to confirm by inspection what Cedille indeed
checks, that recRoll
and recUnroll
both erase to λ x. x
(proved using syntax _ for anonymous definitions), and are thus
constanttime operations. This makes the proofs recIso1
and
recIso2
trivial.
4 Scott encoding
As a first application of monotone recursive types in Cedille – and as a warm-up for the more general derivations to come – we show how to derive Scott-encoded natural numbers supporting a weak form of induction, where by “weak” we mean that the inductive hypothesis is only available as an erased argument. In contrast to the Church encoding, which identifies datatypes with their associated iteration scheme, is derivable in System F, and suffers from linear-time destructors (such as predecessor for naturals [parigot89]), the Scott encoding supports constant-time destructors by identifying datatypes with their case scheme; however, it is not known how to express the type of Scott-encoded data in System F ([SU99_TypeFixpointsIterationRecursion] points towards a negative result). The Scott encoding was first described in unpublished lecture notes by Dana Scott [Sco62_ASystemofFunctionalAbstraction], and appears also in [parigot89, parigot1992] wherein it is referred to as a “recursive representation” of data, and Scott-encoded naturals are referred to as “stacks.”
We illustrate the encoding with a concrete example: Scott-encoded naturals are defined in the untyped λ-calculus by the following constructors:
zero = λ z. λ s. z
suc = λ n. λ z. λ s. s n
In System F extended with recursive types, the type
μ N. ∀ X. X → (N → X) → X
can be given to Scott naturals. The preceding section provides the type-level fixpoint operator
Rec
that allows stating this type in Cedille. This, along with the
definitions of several operations on, and weak induction principle for,
Scottencoded naturals is given in Figures 5 and
6 which we now describe in detail. In Section
6, we show that this weak form of induction can be used to
define a recursor and standard induction principle for Scott naturals.
4.1 Scottencoded naturals, concretely
NatF, zeroF, and sucF.
The scheme NatF is the usual impredicative encoding of the signature
functor for naturals. Terms zeroF
and sucF
are its constructors,
quantifying over the parameter N
; it is easy to confirm using the rules
of Figure 2 that these erase to the untyped constructors for
Scott naturals given above.
WkInductiveNatF and NatFI.
Next, and following a similar recipe to
[Stu18_FromRealizibilitytoInduction] for deriving inductive types in
Cedille, we define predicate WkInductiveNatF
(parameterized by a type
N
) over terms of type NatF ·N
. WkInductiveNatF ·N n
says
that, to prove P n
(for any property P
over NatF ·N
) for
the given n, it suffices to show certain cases for zeroF
and
sucF
. The case for sucF
is somewhat tricky: it says that for any
m
whose type is the intersection of types N
and NatF ·N
, we
may assume a proof that P
holds of m
(viewed as type
NatF ·N
) when showing P
holds of the successor of m
(viewed
as type N
). We are justified ex post facto in assuming that
m
has this intersection type by our future choice for the instantiation
of N
to Nat
,
defined further below in the figure. Notice also that the inductive hypothesis P m.2
is
erased, as functions defined over Scott naturals are only given direct
(computationally relevant) access to the predecessor, not to previously computed results.
Finally, type
scheme NatFI
is formed as the intersection type of terms x
of type
NatF ·N
with proofs WkInductiveNatF ·N x
that x
is (weakly) inductive.
monoNatF and monoNatFI.
Term monoNatF is a proof that the type scheme NatF is
monotonic (Mono, defined in Section 3.3).
Given types X and Y and a cast c between them, the
goal is to form a cast between NatF ·X and NatF ·Y, which in
turn is done by invoking intrCast with a function of type NatF
·X ➔ NatF ·Y, and a proof that this is equal to λ x. x
. That the
functional argument has the desired type is straightforward to see, so consider its erasure,
which is λ n. n.
The bound occurrence of n in this erased term
can be expanded to λ z. λ s. n z s (the abstraction
over type Z is erased), the
bound occurrence of s expanded to λ r. s r, and
finally the coercion elimCast c may be inserted, as it does not change the
equivalence class of the erasure of the term.
So, the second argument to intrCast does indeed prove that the first is
extensionally equal to the identity (since it is intensionally so).
The definition of monoNatFI is more complex, as there are now, in
parts of the definition of the type NatFI ·X, bound occurrences of x: NatF ·X,
which must be coerced to type NatF ·Y (where again X and
Y are arbitrary types with a cast between them).
Since NatFI is defined as a dependent intersection, the body of the
first argument to intrCast is dependent intersection introduction,
where both components must be equal to the bound variable n.
That this is true for the first component is easy to verify given the erasure of
elimCast and of dependent intersection projection.
The second component again sees n expanded (the type argument
to n.2 is erased, and so is abstraction over P), and the bound variable
s is expanded. The bound variable r has type
ι x: X. NatF ·X, easily coerced to type ι x: Y. NatF ·Y.
Finally, the type argument to n.2 is the kindcoercion of predicate
P: NatF ·Y ➔ ★ to a predicate of kind NatF ·X ➔ ★ (type Y
occurs
contravariantly in the kind of P
), ensuring the whole expression is welltyped.
The process of deriving monotonicity proofs such as monoNatF and monoNatFI is rather mechanical once the general idea is understood, so we omit such definitions from the remaining code listings.
Nat, zero, suc, and caseNat.
In Figure 6, the type Nat
is defined as a fixpoint of
type scheme NatFI
, with
its associated rolling and unrolling operations, constructors, and predecessor
function. If we consider now the assumed type of the predecessor in the
successor case of WkInductiveNatF
, we may confirm that a term of type
Nat
also has type NatF ·Nat
(as unrollNat
is defined by a
cast, and every NatFI ·Nat
is also a NatF ·Nat
). Concerning the
constructors: for zero
it is easy to see that the two components of the
intersection have the same erasure, as required; in the definition of
sucFI
, the λ-bound s
is given one relevant argument [ n ,
(unrollNat n).1 ] (definitionally equal to n
by erasure and the fact that
unrollNat is defined by a cast) and one irrelevant argument (a
recursively computed proof of P n
), which gives us again that both
components of the intersection defining suc
have the same erasure.
Lastly, we define the case-scheme caseNat for Scott naturals, and the predecessor pred in terms of it. Notice that pred enjoys the expected runtime behavior: pred (suc x) reduces to x in a constant number of steps for any x (we use the name _ for anonymous proofs):
_ : Π x: Nat. { pred (suc x) ≃ x } = λ x. β .
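This behavior is easy to observe on the untyped terms themselves. The following Python sketch (our own illustration) runs the erasures of the Scott constructors and predecessor, with an opaque object standing in for an arbitrary term x:

```python
# Erasures of the Scott constructors and predecessor.
s_zero = lambda z: lambda s: z
s_suc = lambda n: lambda z: lambda s: s(n)
pred = lambda n: n(s_zero)(lambda m: m)

opaque = object()  # stands for an arbitrary term x
assert pred(s_suc(opaque)) is opaque  # the very same object back, no traversal
assert pred(s_zero) is s_zero
```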
LiftNat and wkInductionNat.
To derive the weak induction principle for Scott naturals, we first define the
typelevel function LiftNat
that transforms predicates over Nat
to related predicates over NatF ·Nat
.
We require this because the proof
principle WkInductiveNatF ·Nat associated with each Nat only
supports proof of properties over NatF ·Nat
.
Furthermore, rollNat
will not give us this predicate transformation, as
it can only be used to convert terms of type NatFI ·Nat
(and not terms of
type NatF ·Nat
) to Nat
.
So, LiftNat ·P x
is the
type of witnesses that, when given some m
of type Nat
that is
equal to x
, proves P
holds for x
, where x
has been
cast to the type of m
using φ
(see Figure 1 for the
typing and erasure of φ).
In effect, the definition of LiftNat leverages the Currystyle typing
of our setting to condition our proof on the assumption that x
also
has type Nat.
The type given for wkInductionNat
is the expected type of an induction
principle for naturals, except that in the successor case the inductive
hypothesis P m
is given as an erased argument. In the body, we unroll the type
of n
, select the view of it as a proof of
WkInductiveNatF ·Nat (unrollNat n).1
, and use this to prove
LiftNat ·P
by cases. In the base case, assumption z
suffices, as
its type is P zero
, convertible (by the erasure of φ) with the expected
type P (φ eq - m {zeroF})
. In the successor case, we use s
to
prove P (suc m.1)
(again convertible with the expected type), with the
second (erased) argument a recursively computed proof that P
holds for
m
. Finally, we must discharge the obligations introduced by
LiftNat
itself, so we provide some term of type Nat
(n
) and
a proof that it is equal to (unrollNat n).1
(provable by reflexivity:
unrollNat
erases to λ x. x
, and n.1
erases to n
).
With wkInductionNat so defined, we have the pleasing result that the
computational content of this proof principle for Scott naturals is precisely
that of the case-scheme:
_ : { wkInductionNat ≃ caseNat } = β .
Example
We can use wkInductionNat to prove that the single-step rebuilding of a natural n by constructors is equal to n:
_ : Π n: Nat. { n ≃ n zero suc } = λ n. wkInductionNat n ·(λ x: Nat. { x ≃ x zero suc }) β (λ m. Λ pf. β) .
Notice that the inductive hypothesis pf goes unused, as the predecessor is not itself recursively rebuilt with constructors. The question arises whether including an erased inductive hypothesis adds any power over simple “proof by cases,” or more generally whether anything nontrivial can be computed from Scott-encoded data in Cedille. We return to this question in Section 6, answering in the affirmative in the form of a derivation of a recursion scheme and standard induction principle for them.
4.2 Scott-encoded data, generically
In this section we derive Scott-encoded data with a weak induction principle. This derivation is generic, in the sense that it works for any functor F. We begin with a general description of the iteration scheme and case-scheme for datatypes. An arbitrary inductive datatype S can be understood as the least fixpoint of a signature functor F, with generic constructor in : F ·S ➔ S (for example, constructors zero and suc of Figure 5 can be merged together into a single constructor inNat : NatFI ·Nat ➔ Nat). What separates inductive datatypes from the notion of (monotone) recursive types is that the latter need not be the least fixpoint. Within a type theory, this additional property translates to the existence of an iterator fold for S satisfying the following typing and reduction rule (with fmap the usual lifting operation of functions to F that respects identity and composition of functions):
fold : ∀ X: ★. (F ·X ➔ X) ➔ S ➔ X
fold a (in d) ⇝ a (fmap (fold a) d)
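The iterator can be sketched in Python (our own untyped rendering, with the naturals' signature functor F X = 1 + X represented as None for "zero" and a tagged pair for "successor"; the names fold, fmap, and in_ are ours):

```python
def fmap(g):
    """Lift g : X -> Y to F X -> F Y, for F X = 1 + X."""
    return lambda d: None if d is None else ("suc", g(d[1]))

def in_(d):
    """Generic constructor in : F S -> S; here the fixpoint is the identity."""
    return d

def fold(a):
    """Iterator: fold a (in d) computes a (fmap (fold a) d)."""
    def go(s):
        return a(fmap(go)(s))
    return go

# An algebra F int -> int computing the numeric value of a natural.
to_int = fold(lambda d: 0 if d is None else 1 + d[1])

three = in_(("suc", in_(("suc", in_(("suc", in_(None)))))))
assert to_int(three) == 3
```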
In category theory, this is captured by the notion of initial algebras. An F-algebra is an object X (e.g., a type) together with a morphism a : F X → X (e.g., a function), where F is again understood to be a functor. The algebra (S, in) is said to be initial when for every algebra (X, a) there is a unique morphism fold a : S → X such that (fold a) ∘ in = a ∘ F (fold a), or equivalently that the following diagram commutes:
The iteration scheme (both its typing and computation law) for data in type theory is expressed in category theory as the guarantee of the existence of fold a, and the induction principle is expressed as the uniqueness of fold a (cf. [JR11_Anintroductiontoalgebraandinduction] for further discussion on this correspondence).
The case-scheme for a datatype S in type theory is a function case (call this the discriminator for S) satisfying the following typing and reduction rule:
case : ∀ X: ★. (F ·S ➔ X) ➔ S ➔ X
case a (in d) ⇝ a d
The case-scheme on its own is not a common subject of study in the categorical semantics of datatypes, so we invent some terminology. Call an algebra (S, in) discriminative if for any morphism a : F S → X there exists a unique morphism h : S → X such that h ∘ in = a; equivalently, that the following diagram commutes:
We are unaware of any standard nomenclature for h, so we call it the krisimorphism (from the Greek κρίσις, meaning judgment, decision).
Using the iteration scheme for data, the typing rule for the case-scheme can be satisfied by assigning case a = a ∘ fold (fmap in), iteratively rebuilding data with the constructor in. In category theory, the equality case a ∘ in = a holds as a consequence of initiality, but in type theory this definition of the case-scheme does not satisfy the desired reduction rule. This is, in fact, a more general statement of the problem of linear runtime for computing predecessor for Church-encoded naturals.
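To make this concrete, here is an assumed Python rendering (same representation as in the earlier untyped sketch: F X = 1 + X as None / ("suc", x)) of the case-scheme defined via iteration; the counter shows that the whole argument is rebuilt, which is why a predecessor defined this way does linear work:

```python
def fmap(g):
    """Lift g : X -> Y to F X -> F Y, for F X = 1 + X."""
    return lambda d: None if d is None else ("suc", g(d[1]))

def in_(d):
    """Generic constructor in : F S -> S; the fixpoint is the identity here."""
    return d

def fold(a):
    """Iterator: fold a (in d) computes a (fmap (fold a) d)."""
    def go(s):
        return a(fmap(go)(s))
    return go

calls = [0]
def counting_in(d):
    """The constructor in, instrumented to count how often it is used."""
    calls[0] += 1
    return in_(d)

# case a = a . fold (fmap in): the argument is rebuilt before analysis.
rebuild = fold(lambda d: fmap(counting_in)(d))
def case(a):
    return lambda s: a(rebuild(s))

pred = case(lambda d: in_(None) if d is None else d[1])
three = in_(("suc", in_(("suc", in_(("suc", in_(None)))))))
assert pred(three) == ("suc", ("suc", None))
assert calls[0] == 3  # the whole number was rebuilt: linear work
```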
For the derivation in this section, we use a modification of the case-scheme discussed above. This modification is due to subtle issues of alignment in Cedille – that is, ensuring that certain expressions are definitionally equal (after erasure) to each other. We describe this modification categorically: let W X denote the unitary product of an object X, with W f the unitary product of a morphism f (for all such X and f). It is clearly an equivalent condition to say that (S, in) is discriminative iff for any morphism a : W (F S) → X there exists a unique morphism h : S → X such that h ∘ in = a ∘ wrap, as W Y ≅ Y for every object Y. To give informal intuition, in Cedille the unitary product provides space to “sneak in” an erased inductive hypothesis when defining the (weak) induction principle for datatype S.
Our generic derivation of Scott-encoded data is defined directly in terms of discriminative algebras, resulting in an efficient case-scheme. In particular, we make essential use of our derived recursive types to define in terms of triples (where ). The remainder of this section gives some preliminary definitions and details the generic derivation. Proofs of essential properties (in particular, that the discriminative algebra we define is also an initial algebra) are postponed until Section 6, wherein we derive the (non-weak) induction principle for datatype S.
4.2.1 WkSigma, Wrap, and Unit
In Figure 7 we define the unitary product type Wrap in terms of the more general WkSigma (in code listings, <..> denotes an omitted definition). Type WkSigma is analogous to the impredicative encoding of a dependent pair, except that its second component is erased, and so for example is suitable for tupling together subdata with erased inductive hypotheses. Its constructor intrWkSigma takes an erased second argument, its eliminator elimWkSigma expects as an argument a function whose second argument is erased, and while the first projection wkproj1 is easily definable, its second projection cannot be defined. For proofs, indWkSigma is an induction principle stating that, to prove a property P for some WkSigma ·A ·B, it suffices to show that the property holds of those weak pairs constructed using intrWkSigma; etaWkSigma proves that rebuilding a weak pair with its constructor reproduces the original pair. Type Wrap, then, is defined by setting the second type argument to the constant function returning Unit, the type with a single element (Figure 8). The induction principles for WkSigma and Unit can be derived in Cedille following the methods of [Stu18_FromRealizibilitytoInduction], and the respective extensionality principles (etaWkSigma and etaUnit) follow from these.
4.2.2 Functors
We define Functor and the associated functor identity and composition laws in Figure 9. Additionally, we define an optional property FmapExt, where FmapExt fmap expresses a kind of parametricity property of fmap. Precisely, it states that if functions f and g are extensionally equal, so too are fmap f and fmap g. This condition is required for showing in Section 5.2.3 that the recursive algebra we shall derive is unique.
Notice also that our definition of the identity law FmapId has a by-now familiar extrinsic twist: the domain X and codomain Y of the lifted function c need not be convertible types in order for the constraint {c x ≃ x} (for every x: X) to be satisfied. Phrasing the identity law in this way allows us to derive a useful lemma monoFunctor (Figure 10), which shows that every type scheme that is a Functor is monotonic (Mono, defined in Section 3.3), yielding the utility function fcast (which is definitionally equal to λ x. x):
4.2.3 Definition of generic datatype S
Figures 11 and 12 give the definition of type S of generic Scott-encoded data and some operators. We walk through these figures in detail.
Type families AlgS and SF
are our first steps to defining the Scott encoding generically. AlgS corresponds to the family of triples we shall informally call Scott-style pseudo-algebras (with ). The definition of SF is similar to the standard definition of the least fixpoint of F in polymorphic λ-calculi, but defined in terms of AlgS instead of usual algebras. Term monoSF is a proof that SF is monotonic.
PreS, PrfS, and preIn.
Type family PreS is a “pre-definition” of the type of Scott-encoded data. As with the concrete derivation of Scott naturals, ex post facto the definition of PreS is justified by the coming definition of datatype S, from which there will be a cast to the type PreS ·S. For any type S and predicate P: SF ·S ➔ ★, PrfS ·S ·P is the type of weak pairs of some x of type PreS ·S and proofs that P holds for x. Definition preIn is similarly a “pre-definition” of the morphism component of a discriminative algebra; from the definition alone, it is clear that some preCase could be defined satisfying the desired computation law for the modified case-scheme (though not yet the typing law). Monotonicity for PreS and PrfS is given by monoPreS and monoPrfS (definitions omitted) – the latter requires the extensionality principle for WkSigma to show that eliminating a weak pair with its constructor rebuilds the original weak pair.
WkPrfAlgS, WkInductiveS, and WkIF.
In Figure 12, WkPrfAlgS is a Scott-style variant of the notion of a proof algebra, which itself was first described by [firsov18] as a dependently typed version of an algebra. For any type S and property P over SF ·S, WkPrfAlgS ·S ·P takes some xs in which all subdata (of type PreS ·S) are tupled together (using PrfS) with erased proofs that they satisfy P, and must return a proof that P holds for the value constructed (by preIn) from xs, after removing the WkSigma wrapping.
Weak inductivity predicate WkInductiveS ·S x is the property of some type S and x: SF ·S that, for all properties P over SF ·S, to show that P holds of x it suffices to give a weak proof algebra of type WkPrfAlgS ·S ·P. WkIF is, finally, the type scheme whose fixpoint is the datatype S we wish to derive; it is defined as a family (over a type S) of the intersection type of terms x of type SF ·S which themselves satisfy WkInductiveS ·S x. Term monoWkIF proves that this type scheme is monotonic.
S, rollS, unrollS, and toPreS.
Type S is the type of generic Scott-encoded data, defined as a fixpoint of WkIF. Functions rollS and unrollS are the fixpoint rolling and unrolling operations for this type; because they are defined using casts, both are definitionally equal to λ x. x. This point is essential, as it allows us to define the cast toPreS from S to PreS ·S. In particular, for any x of type S, (unrollS x).1 has type SF ·S and is definitionally equal to x.
case, in, and out.
We can now define case, the discriminator for datatype S. In the body of the definition, (unrollS x).1 has type SF ·S (convertible with ∀ X: ★. AlgS ·S ·X ➔ X) and is given a suitable argument a.
For generic constructor in, the first component preIn (fcast toPreS xs) of the introduced intersection is straightforward. The second component requires a proof of WkInductiveS (preIn xs) (for clarity we omit the inserted type coercion fcast toPreS in the following discussion). So, after introducing P: SF ·S ➔ ★ and a: WkPrfAlgS ·S ·P we rewrite the expected type,
P (preIn xs) 
using the functor identity and composition laws to introduce additional wrapping and unwrapping, producing
P (preIn (fmap unwrap (fmap wrap xs))) 
Now we can invoke a on xs to produce a term of this expected type by first using fmap to produce, from the S subdata in xs, terms of type PrfS ·S ·P using intrWkSigma, where in particular the proofs of P are recursively computed, but erased. Because of this, and because wrap and intrWkSigma are definitionally equal, the two components of this intersection are indeed equal, and furthermore in is definitionally equal to preIn.
With the definitions of in and case, we have that both the expected typing and computation laws for our modified case-scheme hold by definition (in Section 6, we show that the typing and computation laws for the usual case-scheme hold by the functor laws). The definition of destructor out is relatively straightforward, using case and providing it a function that simply unwraps all subdata.
LiftS and wkInduction.
As with the concrete derivation of Scott naturals, before defining the weak form of induction for the Scott encoding we first define a type-level function LiftS that lifts properties over S to properties over SF ·S, as the proof principle of a term of type S works only for the latter. LiftS ·P x is the type of functions which, given an erased m: S and an erased proof eq that m is equal to x, return a proof that P holds of x after casting this to the type of m. Then, wkInduction ·P a x proves the expected P x by invoking the proof principle of x (after unrolling it to type WkIF ·S) to prove LiftS ·P, providing the proof algebra a, the term x of type S, and a proof that it is equal to (unrollS x).1; the given type of the body, P (φ β - x {(unrollS x).1}), is convertible with the expected type.
We describe the properties that hold of our generic derivation of Scott-encoded data in Section 6, where we derive their recursion and (full) induction principles.
5 Parigot encodings
In this section we derive inductive Parigot-encoded naturals, showing with a concrete example in Section 5.1 the main techniques used for the generic derivation of inductive Parigot-encoded data in Section 5.2. The Parigot encoding, first described by [parigot88, parigot1992] for naturals and later for datatypes generally in [Ge14_ChurchScottEncoding] (wherein it is called the Church-Scott encoding), combines the approaches of the Church and Scott encodings to allow functions over data both access to previously computed results and efficient (under call-by-name operational semantics) access to immediate subdata. For example, in the untyped λ-calculus Parigot-encoded naturals are constructed as follows:
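As a hedged stand-in for the untyped λ-terms, one common presentation of Parigot numerals can be mirrored in Python (the rendering below is ours, not necessarily the paper's exact term syntax): a numeral is applied to a base case and a step function that receives both the predecessor and the recursively computed result.

```python
# Parigot numeral: apply to a base case x and a step f that receives BOTH the
# predecessor numeral and the recursively computed result.
pzero = lambda x, f: x

def psuc(n):
    # hypothetical rendering: the successor stores n and re-runs the recursion;
    # note Python is call-by-value, so the recursive result is computed eagerly
    return lambda x, f: f(n, n(x, f))

def rec_nat(x, f, n):
    # the recursion scheme is just application of the numeral itself
    return n(x, f)

def to_int(n):
    return rec_nat(0, lambda pred, res: res + 1, n)
```

Note how each successor stores its predecessor twice (once directly, once inside the re-run recursion), which is the source of the size blow-up discussed next.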
In System F extended with recursive types, the type can be given to Parigot naturals. The advantages of Parigot-encoding data are offset by a steep increase in the size required to represent them, with the encoding of a natural number n requiring space exponential in n. Another deficit is that the type given above is not precise enough, as it admits terms formed by nonsense constructors such as .
As with Scott-encoded data, the recursive types derived in Section 3 allow us to state the type of Parigot naturals in Cedille. However, the approach taken for our derivation of them differs from the derivation of Scott naturals: we will “bake in” to the definition of the type scheme the data's reflection law, i.e. that recursively rebuilding numbers with their constructors reproduces the same number. One consequence of this baking-in is that it rules out nonsense constructors such as , leaving only the desired and . To accomplish this, we find it convenient to use the Kleene trick (Section 2) to define a type Top of all (well-scoped) terms of the untyped λ-calculus; this is so that we may reason directly about the computational behavior of terms before they could otherwise be defined. The definition of Top is given in Figure 13.
5.1 Parigot-encoded naturals, concretely
Figures 14 and 15 give the derivation of Parigot naturals supporting an induction principle. We describe this in detail.
zeroU, sucU, recNatU, and reflectNatU.
Definition recNatU gives the untyped recursion principle for Parigot numerals, where the bound x is interpreted as the base case, f as the inductive case taking both the previously computed results and a predecessor value directly, and n as the numeral to recurse over. Following this are the untyped constructors zeroU and sucU. Term reflectNatU is the untyped version of a function that recursively rebuilds Parigot numerals with their constructors.
NatF and Nat.
We can now define NatF, the type scheme whose fixpoint is the type of “inherently reflective” Parigot naturals. It is defined by a dependent intersection as the type of terms n which have the expected type, and for which rebuilding with constructors produces the same n (thanks to the Kleene trick, if this equation holds then it is trivial for n to be a proof of it). We have that NatF is monotonic by monoNatF, which allows us to define the type of Parigot-encoded data Nat as a fixpoint of NatF, along with its rolling and unrolling operations (as they are defined by casts, both are equal to λ x. x).
zero and suc.
Next, we define the constructors zero and suc for Parigot naturals. The former is defined by rolling a term of type NatF ·Nat, for which the second component β{zeroU} proves that {reflectNatU zero ≃ zero}, as this equality holds by β-equivalence. For the latter: in the first component of the introduced intersection we compute the second argument to the λ-bound s by unrolling the predecessor n and projecting out the view of it as having type ∀ X: ★. X ➔ (Nat ➔ X ➔ X) ➔ X; in the second component, we prove that the reflection law also holds for the first component (which is definitionally equal to sucU n) by rewriting with the proof that it holds for n.
recNat and pred.
The recursion principle for Parigot naturals is given by recNat, by simply invoking the first component of the (unrolled) number argument n on the base and step cases for recursion, x and f. Notice also that recNat is definitionally equal to recNatU. We use recNat to define pred, an efficient (under call-by-name operational semantics) predecessor which acts in the successor case by discarding the previously computed result and returning the previous number m directly. This is witnessed by the following definitionally true equality:
_ ◂ Π x: Nat. { pred (suc x) ≃ x } = λ x. β.
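The efficiency claim depends on call-by-name evaluation; in the following Python sketch (our simulation, with names of our choosing) we thunk the recursive result so that the predecessor never forces it:

```python
forces = {"n": 0}

pzero = lambda x, f: x

def psuc(n):
    def numeral(x, f):
        def res():                  # recursive result as a thunk (call-by-name)
            forces["n"] += 1
            return n(x, f)
        return f(n, res)
    return numeral

def pred(n):
    # successor case returns the stored predecessor m; the thunk is discarded
    return n(pzero, lambda m, res: m)

def to_int(n):
    return n(0, lambda m, res: res() + 1)   # forces one thunk per successor
```

Taking the predecessor performs no recursive computation at all: the `forces` counter stays at zero until some consumer (like `to_int`) actually demands the recursive results.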
InductiveNat and NatI.
We now define InductiveNat (Figure 15), a predicate over Nat, with InductiveNat x stating that for all properties P: Nat ➔ ★, to show P x it suffices to show P zero and to show P (suc m) assuming some m: Nat and a proof of P m. Then, NatI is the type of naturals which are also themselves proofs that they satisfy InductiveNat. From this we can define the inductive variants of the constructors for Parigot naturals, zeroI and sucI (for the latter, in the second component of the introduced intersection we must show InductiveNat (suc n.1); after abstracting over assumptions P, z: P zero, and s: Π m: Nat. P m ➔ P (suc m), this goal is discharged by invoking s and giving as its second argument n.2 z s of type P n.1). For both definitions, the first and second components are definitionally equal.
reflectNat, toNatI, and inductionNat.
The purpose of baking in the reflection law in the definition of NatF is now realized in the definition of reflectNat, which recursively rebuilds its argument with the inductive variants of the constructors. Since reflectNat is definitionally equal to reflectNatU, we can define the cast toNatI from Nat to NatI by using the proof associated with each n: Nat that {reflectNatU n ≃ n}, and by using the full power of intrCast to produce a function intensionally equal to the identity from one that is only extensionally equal to it. Finally, we define inductionNat, the induction principle for Parigot naturals, by simply casting the given Nat to NatI. Pleasingly, the computational content of inductionNat is precisely that of recNat:
_ ◂ { inductionNat ≃ recNat } = β.
Examples
For our Parigot naturals, we can define iterative functions such as addition (add) and recursive functions such as a summation of numbers (sumFrom), and prove by induction that zero is a right identity of addition (addZRight), shown below:
add ◂ Nat ➔ Nat ➔ Nat = λ m. λ n. recNat n (λ _. suc) m.
sumFrom ◂ Nat ➔ Nat = recNat zero (λ m. λ s. add (suc m) s).
addZRight ◂ Π n: Nat. {add n zero ≃ n} = inductionNat (λ x: Nat. {add x zero ≃ x}) (λ m. λ ih. ρ+ ih - β).
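The example programs have direct untyped analogues. The following Python sketch (using the hypothetical rendering of Parigot numerals introduced earlier; `of_int` and `to_int` are conversion helpers of our own) mirrors add and sumFrom; the induction proof addZRight has no computational analogue beyond the identity.

```python
pzero = lambda x, f: x
psuc = lambda n: (lambda x, f: f(n, n(x, f)))

def rec_nat(x, f, n):
    return n(x, f)

def add(m, n):
    # iterative: the step ignores the predecessor, using only the result r
    return rec_nat(n, lambda _m, r: psuc(r), m)

def sum_from(n):
    # genuinely recursive: the step uses the predecessor m directly
    return rec_nat(pzero, lambda m, s: add(psuc(m), s), n)

def of_int(k):
    return pzero if k == 0 else psuc(of_int(k - 1))

def to_int(n):
    return rec_nat(0, lambda _m, r: r + 1, n)
```

For instance, `sum_from` applied to the numeral for 3 computes 3 + 2 + 1 = 6.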
5.2 Parigotencoded data, generically
In this section we derive inductive Parigot-encoded data generically for a signature functor . The Parigot encoding identifies datatypes with their recursion scheme, which allows functions defined over data to access both the previously computed result (as in the iteration scheme given by the Church encoding) and all immediate subdata (as in the case scheme given by the Scott encoding). In type theory, that a datatype supports a recursion scheme translates to the existence of a recursor for satisfying a certain typing and reduction rule: assuming is the usual functorial lifting of a function to , the product type of types and introduced with when has type and has type , and is the generic constructor of , the typing and reduction rules are:
Independently, [Ge92_InductiveCoinductiveTypesIterationRecursion] and [Me91_PredicativeUniversesPrimitiveRecursion] defined recursive algebras to give a categorical semantics for the recursion scheme for datatypes (see Section 4.2 for a brief discussion of algebras and initial algebras). An algebra is recursive if for any morphism there exists a morphism such that (where is the productforming morphism of the identity and ). This is depicted visually by the following commuting diagram:
If is an initial algebra, and in the underlying category all pairs of objects have products, then it is also recursive, with a paramorphism [Mee92_Paramorphisms] uniquely defined (up to isomorphism) by the catamorphism (iteration). However, and as with the case scheme, in type theory this definition of the recursion scheme suffers from the same inefficiency that plagues the predecessor for Church-encoded naturals.
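The inefficiency of obtaining the paramorphism from the catamorphism can be seen in a small Python sketch (the functor, list representation, and names are our own): the standard construction runs the catamorphism at a carrier of (subterm, result) pairs, rebuilding the subterm at every node.

```python
# signature functor for integer lists: F X = None | (hd, X); mu F = nested tuples

def cata(alg, t):
    # iteration: the algebra sees only recursively computed results
    if t is None:
        return alg(None)
    hd, tl = t
    return alg((hd, cata(alg, tl)))

def para_via_cata(palg, t):
    # paramorphism from the catamorphism: carrier is (subterm, result) pairs,
    # so the subterm is rebuilt at every node -- an O(n) tax on e.g. "tail"
    def alg(x):
        if x is None:
            return (None, palg(None))
        hd, (sub, res) = x
        return ((hd, sub), palg((hd, (sub, res))))
    return cata(alg, t)[1]

# "tail" as a paramorphism: return the stored subterm, discard the result
tail_alg = lambda x: None if x is None else x[1][0]
```

A direct recursive-algebra implementation would instead hand the stored subterm over in constant time, which is what the derivation below achieves.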
As with the Scott encoding, our generic derivation of the Parigot encoding avoids this problem by being defined directly in terms of recursive algebras, using recursive types to define in terms of triples , resulting in efficient data accessors under call-by-name operational semantics. We bake in to our encoding the paramorphic reflection law (where is the second projection for any product ). In the generic development, the right-hand side of this equation describes a function reflectP which rebuilds data of type P to a type IP supporting an induction principle; that the equation holds allows us to define a cast toIP from P to IP. The remainder of this section gives some preliminary definitions, details the generic derivation, and outlines properties of the data so derived, including a proof that the morphism is unique (that is, any other morphism making the diagram commute is extensionally equal to ), from which it is an easy corollary that is an initial algebra.
5.2.1 Sigma and Pair
Product type Pair and dependent product types Sigma (Figure 16) are derivable in Cedille via encodings. We do not describe this here as the approach is essentially the same as in [Stu18_FromRealizibilitytoInduction]. Instead, we simply give the type and kind signatures of the definitions we use, with <..> indicating omitted definitions.
5.2.2 Definition of generic datatype P
Figures 17 and 18 give the definition of generic Parigotencoded datatype P and the essential operations recursion, in, and out. We walk through these in detail.
recU, inU, and reflectU.
These express the untyped versions of the future definitions rec, in, and reflectP, respectively, and are best understood when we explain those. For now, it suffices to say that recU expresses the computational content of the recursion scheme for our data, inU is the datatype's generic constructor (equivalently, the morphism component of the recursive algebra), and reflectU recursively rebuilds data with its (generic) constructor inU.
AlgP, PF, and P.
Definition AlgP is a type family corresponding to the triples (with ) by which we shall define recursive algebras, and which we informally dub Parigot-style pseudo-algebras. PF is the type scheme whose fixpoint is the carrier of the recursive algebra. It is similar to the standard definition of the fixpoint of F in polymorphic λ-calculi, with two important differences: first, it uses AlgP instead of usual algebras; second, we use a dependent intersection (and the Kleene trick) to bake in the reflection law. Term monoPF proves that PF is monotonic, which entitles us to define our datatype P as its fixpoint, along with operations rollP and unrollP (both of which are definitionally equal to λ x. x) using the recursive types derived in Section 3.
rec, in, and out.
Function rec is the datatype's recursion scheme, mapping a Parigot-style pseudo-algebra to a function computing X from P. Its definition is straightforward, as the first component of the intersection which defines PF ·P (which we get by unrolling x) is a function which, given some AlgP ·P ·X (for any X), returns something of type X. Notice also that rec and recU are definitionally equal (the syntax _ is used to give an anonymous proof):
_ ◂ { rec ≃ recU } = β.
The definition of in is more involved, and so is broken into three parts. The first term inP1 computes from some xs: F ·P an expression whose type is the first part of the intersection type defining PF ·P. Its definition comes directly from the left-then-bottom path of the commuting diagram of the categorical definition of a recursive algebra. Given some type X and term alg: AlgP ·P ·X, we first fmap a function over xs that tuples together all subdata (x) with recursively computed results (rec alg x), producing an expression of type F ·(Pair ·P ·X) which is then given as an argument to alg.
The second definition reflectionInP1 proves that an instance of the reflection law holds for data constructed from inP1, as required for the second component of the intersection type defining PF ·P. The equation to be proved is
Recalling the definition of reflectU, we see that the left-hand side of this equation is equivalent to an (untyped) expression containing the composition of fmap snd with fmap (λ x. mkpair x (reflectU x)). We invoke the functor composition law to rewrite this as a single mapping of λ x. snd (mkpair x (reflectU x)), and reducing this transforms the goal into proving
Now, observe that all subdata of xs are “inherently reflective” (see the definition of PF), meaning that mapping reflectU over xs should be (extensionally) an identity operation. We finish the proof by using this fact together with the functor identity law.
Finally, to define in we combine these two definitions via intersection to form an expression of type PF ·P and use rollP to get an expression of type P, where in the second component of the introduced intersection we use the Kleene trick again to allow inP1 xs to have the type {reflectU (inP1 xs) ≃ inP1 xs}. Notice also that our definition of in is equivalent (after erasure) to inU:
_ ◂ { in ≃ inU } = β.
The last definition out is straightforward, computed by recursion over some term of type P: the Parigot-style algebra we give takes some expression of type F ·(Pair ·P ·(F ·P)) and simply selects the first component of the tupled results (the immediate subdata), discarding the second (the previously computed results). Under a call-by-name operational semantics, the discarded computation will not be evaluated, ensuring that out incurs no unnecessary runtime penalty.
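A Python rendering of this idea (again our own, with thunks simulating call-by-name, instantiated at the naturals signature F X = None | X) shows that the discarded results are never evaluated:

```python
def fmap(f, v):                  # F X = None | X: the signature functor of Nat
    return None if v is None else f(v)

forced = {"n": 0}

def inP(xs):
    # generic constructor: pair each subdatum with a thunked recursive result
    def data(alg):
        def pair(x):
            def res():
                forced["n"] += 1
                return x(alg)
            return (x, res)
        return alg(fmap(pair, xs))
    return data

def rec(alg, d):
    return d(alg)

def out(d):
    # keep the stored subdata, discard (never force) the recursive results
    return rec(lambda v: fmap(lambda pr: pr[0], v), d)

def to_int(d):
    return rec(lambda v: 0 if v is None else v[1]() + 1, d)
```

Applying `out` leaves the `forced` counter untouched: only consumers that demand the recursive result (like `to_int`) pay for it.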
PrfAlgP, Inductive, and IP.
In Figure 18 we define, respectively, Parigot-style proof algebras, an inductivity predicate Inductive for terms of type P, and the type of terms x which both have type P and themselves realize the predicate Inductive x. The notion of proof algebras was first described by [firsov18] as a dependently typed version of an algebra. For the Parigot-style version of this, we have as carrier P, a property Q over P, and the inductive hypothesis xs of type F ·(Sigma ·P ·Q) (wherein all subdata are paired together with proofs that Q holds of them). From this, it must be shown that Q holds of the data constructed by in from xs (after projecting out just the subdata of xs). Type Inductive x is a property saying that, for some particular x: P, in order to show Q x (for any property Q) it suffices to show PrfAlgP ·Q. Type IP is the type (formed by intersection) of data of type P that proves it is itself inductive.
inIP.
The constructor inIP for type IP is also defined in three parts, one for each side of the intersection defining IP (inIP1 and inIP2, resp.), and one (inIP) defined by combining these. Definition inIP1 constructs an element of P from xs: F ·IP by simply casting xs to type F ·P and invoking in. For inIP2, we must show that every P constructed by inIP1 is also inductive. After introducing arguments xs, Q, and alg (the proof algebra), the goal is to prove Q (inIP1 xs).
We start by rewriting by the functor identity law to introduce an additional mapping over xs which first tuples together subdata (coerced to type P) with recursively computed results, then immediately selects just the subdata. The rewritten type is now:
Q (in (fmap (λ x: IP. proj1 (mksigma x.1 (x.2 alg))) xs)) 
Next, we separate this into two mappings over xs using the functor composition law:
Q (in (fmap proj1 (fmap (λ x: IP. mksigma x.1 (x.2 alg)) xs))) 
and this is the type of the expression alg (fmap (λ x: IP. mksigma x.1 (x.2 alg)) xs) given in inIP2. Again, we note that inIP is definitionally equal to in, as are inIP1 and inIP2:
_ ◂ { in ≃ inIP } = β.
reflectP and induction.
The derivation of the induction principle for P concludes in three short steps. First, we define reflectP, a function constructing an element of IP from data of type F ·(Pair ·P ·IP) by recursively rebuilding with constructor inIP. Since its erasure is equal to reflectU, we may use it to define a cast toIP from P to IP using the baked-in reflection law of P. Then, the definition of induction is simple: take the x of type P, cast it to IP, select the view of this as a proof of Inductive (unrollP x).1, and give as argument to this the proof algebra alg. This leaves us with the pleasing result that the computational behavior of induction is precisely that of rec:
_ ◂ { induction ≃ rec } = β.
5.2.3 Properties of P
Our generically derived inductive Parigot-encoded data satisfies the expected cancellation law, reflection law, Lambek's lemma, and (conditional) uniqueness of the universal mapping property of the recursive algebra , and closed terms of type P are call-by-name normalizing. Each of these has been proven within Cedille, shown in Figure 19.
– Normalization is shown by norm by appealing to Theorem 2: there exists a cast from P to some function type, so any closed annotated term t of type P is normalizing under a call-by-name operational semantics.
– The cancellation law proves that the diagram describing recursive algebras at the beginning of this section commutes, giving the computation of rec over data constructed by in.
– The reflection law has been discussed already in the derivation. As it is built into the datatype, its proof is trivial.
– Lambek's lemma states that out and in are mutual inverses; lambek1 holds by the functor laws alone, whereas lambek2 additionally requires the induction principle (in the proof the induction hypothesis is not used, merely dependent case analysis).
The second-to-last proof unique shows uniqueness of the universal mapping property of the recursive algebra : that is, for any Parigot-style algebra a with carrier X, if there is some other morphism h: P ➔ X which makes the following diagram commute:
then h is (extensionally) equal to rec a. Proof of this fact requires an additional condition FmapExt ·F fmap: that is, that invoking fmap with extensionally equal functions produces extensionally equal functions (see Section 4.2.2, Figure 9, for the definition). Having this, it is easy to show under the same condition that is an initial algebra: we simply define the iterator (catamorphism) fold in terms of the recursor (paramorphism) rec and appeal to unique.
5.2.4 Example: Parigot-encoded lists
We conclude the discussion of Parigot-encoded data by instantiating the generic derivation of Section 5.2 to define Parigot-encoded lists. In doing so, we show that the expected induction principle for lists is derivable from the generic induction principle, and that the additional parametricity condition FmapExt required for showing initiality in Section 5.2.3 can be satisfied by simple datatypes.
We begin with a brief description of Sum, the coproduct type in Cedille. Type Sum ·A ·B represents the disjoint union of types A and B, which can be formed either from a term of type A using in1 or from a term of type B using in2. The induction principle indSum states that, in order to prove that a property Q: Sum ·A ·B ➔ ★ holds for some x: Sum ·A ·B, it suffices to show that Q holds of the coproduct constructed with either in1 or in2, given any argument suitable for these. The definitions of Figure 20 can be derived within Cedille using standard techniques (cf. [Stu18_FromRealizibilitytoInduction]) so their definitions are omitted (indicated by <..>).
Figure 21 defines ListF, the signature functor for lists, in the standard way as the coproduct of the Unit type (Figure 8) and the product (Figure 16) of A and L, where A is the type of elements of the list and L is the stand-in for recursive occurrences of the datatype. The definitions of the functorial lifting of a function ListFmap and proofs that this respects identity and composition (resp. ListFmapId and ListFmapCompose) are omitted.
To show the additional constraint ListFmapExt on ListFmap, we assume two functions f and g of type X ➔ Y (for any X and Y), and a proof ext that for every x: X, we have {f x ≃ g x}. We further assume an arbitrary l: ListF ·A ·X, and must show that {ListFmap f l ≃ ListFmap g l}. This proof obligation is discharged with indSum: in the case the list is empty, there is no subdata of type X to invoke f or g on, and the proof is trivial; otherwise, we further invoke indPair, the induction principle for products, to reveal the head hd: A and tail tl: X of the list and appeal to ext to show that {in2 (mkpair hd (f tl)) ≃ in2 (mkpair hd (g tl))}.
Finally, we define the type List of Parigotencoded lists in Figure 22. The module in which it is defined takes a type parameter A for the type of elements of the list, and imports the generic derivation (qualified with module name P) with the definitions of Figure 21 instantiated with this parameter. Thus, List is defined directly as P.P (where the signature functor is ListF ·A). Constructors nil and cons are defined in terms of the generic constructor P.in (which expects as an argument some ListF ·A ·List), and the standard induction principle indList is defined (omitted) in terms of the generic P.induction as well as the induction principles for Sum and Pair.
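To make the instantiation concrete, here is a hypothetical Python analogue (all names and the tuple representation are ours) of Parigot-encoded lists over the signature ListF ·A ·L = Unit + A × L, with the generic constructor specialised to this functor:

```python
def list_fmap(f, v):
    # ListF A L = None (nil case) | (hd, tl): lift f over the tail position
    return None if v is None else (v[0], f(v[1]))

def in_list(xs):
    # generic constructor specialised to ListF: pair tails with their results
    # (eagerly here; call-by-name would thunk the recursive result)
    def data(alg):
        return alg(list_fmap(lambda tl: (tl, tl(alg)), xs))
    return data

nil = in_list(None)

def cons(hd, tl):
    return in_list((hd, tl))

def rec_list(alg, l):
    # the recursion scheme is just application to a Parigot-style algebra
    return l(alg)

def to_py(l):
    # the algebra receives (hd, (tail_data, tail_result)); use the result
    return rec_list(lambda v: [] if v is None else [v[0]] + v[1][1], l)
```

An algebra can instead use the stored tail datum directly, which is how the efficient `out` (and hence `tail`) avoids re-traversing the list.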
6 Recursive Scott encoding
The induction principle wkInductionNat we showed for Scott naturals in Section 4.1 is weak. In a proof of some property using this principle, in the case that the natural number in question is of the form suc m, the inductive hypothesis P m is erased and so cannot assist in computing a proof of P (suc m). This situation reflects the usual criticism of the Scott encoding: that it is not inherently iterative. There does not appear to be any way, for example, to define a recursor over Scott naturals of type:
∀ X: ★. X ➔ (Nat ➔ X ➔ X) ➔ Nat ➔ X
Amazingly, in some settings this deficit of the Scott encoding is only apparent. [parigot88] showed how to derive using “meta-reasoning” a strongly normalizing recursor for Scott-encoded naturals with a similar type to the one above. More recently, [lepigre+19] also showed how the same recursor can be given the above type in a Curry-style type system featuring a sophisticated form of subtyping utilizing “circular but well-founded” typing derivations.
Knowing this, we revisit our earlier question from the end of Section 4.1: does access to an erased inductive hypothesis add any power over mere proof by cases? We answer in the affirmative by showing how to type the recursor for Scott naturals in Cedille using wkInductionNat together with an ingenious type definition used by [lepigre+19] (therein attributed to Parigot) for the same purpose.
The main idea behind this derivation of the recursor for Scott naturals is noticing that the untyped lambda terms encoding zero and successor for Scott naturals may be expanded in such a way that the interpretations for these constructors – the “base” and “step” cases of a function computed over a Scott natural – may be passed copies of themselves:
A usage based on this understanding should: ensure is a constant function ignoring its first two arguments and returning the intended result for the base case; and ensure is a function taking as arguments the predecessor and another copy each of and for making recursive calls invoked by .
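The expansion can be played out in untyped terms; the following Python sketch (our rendering of the expanded Scott constructors, with names of our choosing; the static typing is of course the hard part that NatR solves) shows the constructors passing copies of the cases to themselves, yielding a recursor with no meta-level recursion:

```python
# expanded Scott constructors: the selected case receives copies of both cases
zeroR = lambda z, s: z(z, s)

def sucR(n):
    return lambda z, s: s(n, z, s)

def rec(x, f, n):
    # base case Z ignores the copies; step case S uses them to recurse on the
    # predecessor p by re-applying p to the same two cases
    Z = lambda z, s: x
    S = lambda p, z, s: f(p, p(z, s))
    return n(Z, S)

def of_int(k):
    return zeroR if k == 0 else sucR(of_int(k - 1))
```

Note that `rec` itself is not recursive: all recursion happens through the numeral handing `Z` and `S` back to themselves, exactly the self-application the expanded constructors make possible.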
The type that supports this intended usage for Scott naturals is far from obvious, but can be defined in (unextended, impredicative) System F. This is given as NatR in Figure 23, which is itself a minor generalization of the one presented by [lepigre+19]. We describe the figures in detail in Section 6.1; the definitions rely upon previous ones given in Section 4.1, Figures 5 and 6. In Section 6.2, we generalize this type definition (and thus the entire derivation) in two orthogonal ways, making it (1) generic, working for any datatype signature functor ; and (2) dependent, transforming the recursor into the standard induction principle.
6.1 Recursor for Scott naturals, concretely
NatRZ, NatRS, and NatR.
The type definition of NatR is rather tricky, so we endeavor to provide some intuition for its construction. Compared to the foregoing discussion, the Scott naturals that NatR classifies have been expanded once more so that they may take themselves as arguments (at type Nat). NatR can be seen as a supertype of Nat, a fact we shall soon demonstrate by deriving a proof of Cast ·Nat ·NatR. It relies on two additional definitions: NatRZ (a type family of “base cases” for recursion) and NatRS (a type family of “step cases”). In these two definitions, quantification over Z and S is used to hide recursive references to NatRZ and NatRS, respectively. The intended use of a term of type NatR is:
- it was produced as a type coercion of some term of type Nat;
- its two arguments of type NatRZ ·X are copies of the same “base case” term;
- its two arguments of type NatRS ·X are copies of the same “step case” term; and
- its Nat argument is the coerced term itself.
The functions zeroR and sucR give an operational understanding of the subset of terms of type NatR that are also Scott naturals. In zeroR (which by contraction and erasure is equal to zero) we take two copies each of the base (z1 and z2, both of type NatRZ ·X) and step cases (s1 and s2, both of type NatRS ·X) and apply z1 to the second copy of each argument; for the recursor we shall define, z1 will always be instantiated as a constant function ignoring its arguments. In the definition of sucR (which also contracts and erases to suc), the λ-bound s1 expects first a term of type NatRZ ·X ➔ NatRS ·X ➔ NatRZ ·X ➔ NatRS ·X ➔ Nat ➔ X (which is the type of the given n ·X), and it is also given the secondary copies z2 and s2; in defining the recursor, s1 and s2 (resp. z1 and z2) will always be instantiated with the same term, so in effect this gives s1 a way to make recursive calls (via z2 and s2) at each predecessor by passing down z2 and s2 as arguments to the predecessor n ·X, potentially to be further duplicated.
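In an untyped model, the operational reading of these two constructors can be sketched as follows (a Python rendering of our own, with curried lambdas; the names mirror Figure 23 but the code is only illustrative):

```python
# Expanded Scott constructors: a natural now takes two copies of the
# base case (z1, z2), two copies of the step case (s1, s2), and a
# final Nat argument (a copy of the number itself).
zeroR = lambda z1: lambda s1: lambda z2: lambda s2: lambda n: z1(z2)(s2)(n)

def sucR(p):
    # s1 is handed the predecessor p together with the second copies
    # z2 and s2 (for making recursive calls) and the Nat copy m.
    return (lambda z1: lambda s1: lambda z2: lambda s2: lambda m:
            s1(p)(z2)(s2)(m))
```

η-contracting the unused second copies recovers λ z1. λ s1. z1 and λ z1. λ s1. s1 p, the ordinary Scott constructors, matching the claim that zeroR and sucR contract and erase to zero and suc.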
toNatR’ and toNatR.
We now prove NatR is a supertype of Nat. The conversion function toNatR’ takes some number n and produces a term that both has type NatR and proves itself equal to n; recall that the Kleene trick (Section 2) allows any term to be a witness for a true equation. The conversion function is defined using the weak induction principle for Scott naturals: in the base case the goal is ι x: NatR. {x ≃ zero}, readily proven by [ zeroR , β{zero} ]; in the successor case, the goal is to prove ι x: NatR. {x ≃ suc m}. In the first component of the intersection, we make use of the erased induction hypothesis r (whose type is ι x: NatR. {x ≃ m}) in order to cast m to the type NatR of r.1 using φ and equation r.2, then apply sucR to this, resulting in an expression whose erasure is definitionally equal to the erasure of suc m. With toNatR’ we may readily define toNatR, the cast from Nat to NatR. In its definition, for the second argument of intrCast we assume some n and must prove {(toNatR’ n).1 ≃ n}, which is given directly by (toNatR’ n).2.
recNatBase, recNatStep, and recNat.
We can now define the recursor recNat for Scott naturals. The helper function recNatBase takes some x, a result for the base case, ignores its other three arguments, and simply returns x. The function recNatStep takes a base case x and a function f of type Nat ➔ X ➔ X, and must produce an expression of type NatRS ·X, which is itself a polymorphic function type. The quantified type variables Z and S in NatRS ·X hide resp. the occurrences of NatRZ ·X and NatRS ·X in the types of n (Z ➔ S ➔ Z ➔ S ➔ Nat ➔ X), z (Z), and s (S); the last argument m of type Nat is intended always to be instantiated as the successor of n. We invoke f on the predecessor of m and the recursively computed result produced by invoking n on z, s, and pred m.
Finally, we put these definitions together in recNat. In its body, we cast the natural argument n to the type NatR, and for arguments provide it two copies each of recNatBase x and recNatStep x f, and a copy of itself. Notice, for example, that if n is nonzero then the first recNatStep x f argument will be given a copy of itself (referred to by the λ-bound s in recNatStep). The recursor recNat satisfies the desired computation laws recNatCompZ and recNatCompS by definition (though only by β-equivalence, and not β-reduction alone).
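The whole derivation can be exercised end to end in the same untyped model (a Python sketch of our own; erased casts vanish at runtime and so are omitted, and num, to_int, and add are hypothetical helper names, not Cedille definitions):

```python
# Untyped model of the recursor for expanded Scott naturals.
zeroR = lambda z1: lambda s1: lambda z2: lambda s2: lambda n: z1(z2)(s2)(n)
sucR = lambda p: (lambda z1: lambda s1: lambda z2: lambda s2: lambda m:
                  s1(p)(z2)(s2)(m))

# Constant-time predecessor by Scott-style case analysis.
pred = lambda m: m(zeroR)(lambda p: p)

# recNatBase x ignores its other three arguments and returns x.
recNatBase = lambda x: lambda z: lambda s: lambda m: x

# recNatStep invokes f on the predecessor of m and on the result the
# predecessor n computes when handed fresh copies of the base and
# step cases (x is unused here; it is kept to mirror recNatStep x f).
recNatStep = lambda x: lambda f: (
    lambda n: lambda z: lambda s: lambda m:
        f(pred(m))(n(z)(s)(z)(s)(pred(m))))

def recNat(x, f, m):
    b, st = recNatBase(x), recNatStep(x)(f)
    # Two copies each of the base and step case, plus m itself.
    return m(b)(st)(b)(st)(m)

def num(k):
    # Build the numeral sucR^k(zeroR).
    n = zeroR
    for _ in range(k):
        n = sucR(n)
    return n

to_int = lambda m: recNat(0, lambda p: lambda r: r + 1, m)
add = lambda m: lambda n: recNat(n, lambda p: lambda r: sucR(r), m)
```

Here add mirrors the iterative definition of addition in the running example: it rebuilds m successors on top of n, making recursive calls only through the duplicated step case.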
Example
As with the Parigot naturals defined in Section 5.1, we can define for recursive Scott naturals iterative functions such as addition (add) and recursive functions such as a summation of numbers (sumFrom):
add : Nat ➔ Nat ➔ Nat = λ m. λ n. recNat n (λ p. λ s. suc s) m.
sumFrom : Nat ➔ Nat = λ m. recNat zero (λ p. λ s. suc (add p s)) m.
6.2 Full induction for Scott-encoded data, generically
In this section, we generalize the technique used to derive a recursor for Scott naturals in the previous section in two orthogonal ways: making it generic in a functor F, and making it dependent in order to support an induction principle. The code listing is given in Figures 24 and 25, which we walk through in detail.
IndS and PrfAlg’.
As we did for Scott naturals in Section 6.1, the first step towards defining a recursion principle for the type S is to define a family of types capturing the notion of a datatype taking two copies of the interpretations for its constructors. This is done with the definition IndS, whose first parameter P is a property over S and whose second parameter Y shall always be instantiated with PrfAlg’ ·P, a proof-algebra variant which recursively refers to IndS itself. Comparing to Figure 23, Y should be understood as an algebraic “grouping together” of the quantified variables Z and S for the base and step cases of recursion on naturals (in PrfAlg’ we requantify over Y). As the goal is to find an appropriate instantiation for Y so that every x: S can be cast to the type Y ➔ Y ➔ P x, we make use of this ex ante observation to define IndS ·P ·Y as a dependent intersection of these two types.
The definition of PrfAlg’ (which corresponds to the type families NatRZ and NatRS together in Figure 23) describes a family of functions parameterized by some property P over S; PrfAlg’ ·P quantifies over Y (hiding recursive occurrences of PrfAlg’ ·P), assumes some collection of subdata xs of type F ·(Wrap ·(IndS ·P ·Y)), takes an additional Y argument (to be given to subdata for further recursive calls), and returns a proof that P holds for the data constructed with in of xs (after unwrapping and projecting out the view of the subdata as having type S with unwrapFIndS).
InductiveS and I.
With predicate InductiveS, we commit to the instantiation of PrfAlg’ ·P (generalizing over P) for the parameter Y of IndS. With datatype I, we make the now-expected step of forming a dependent intersection type of terms x: S which also prove themselves InductiveS; it is clear that I is isomorphic to the type ∀ P: S ➔ ★. IndS ·P ·(PrfAlg’ ·P). Finally, it is easy to define the cast fromI converting terms of type I to S.
Constructor inI.
Next, we define the generic constructor inI for recursive Scott-encoded data. Given some collection xs of type F ·I (easily cast to the type F ·S), in the second component of the intersection defining inI we must show InductiveS (in xs). To do this, for any P we assume a1 and a2 of type PrfAlg’ ·P, and now the goal is to show P (in xs). Using the functor identity law, we rewrite this to
P (in (fmap (λ x. unwrap (wrap x)) xs)) 
Then, using the functor composition law this is further transformed to
P (in (fmap unwrap (fmap wrap xs))) 
convertible with the type of the given expression a1 (fmap (wrapIndS ·P) xs) a2:
P (in (unwrapFIndS (fmap wrapIndS xs))) 
In essence, we are exploiting definitional equality to exchange wrap and unwrap with the versions mentioned in the type of a1. These versions commit the proof principle of the second component of I to proving property P (a1 expects to work with subdata of type Wrap ·(IndS ·P ·(PrfAlg’ ·P))). The last argument to a1 is a2; we shall always instantiate these with the same term, meaning a1 is given the capability of making recursive calls to itself via a2. Finally, notice that inI is definitionally equal to in.
toI’ and toI.
We now establish that I is also a supertype of S (fromI shows that it is a subtype, so the two types classify precisely the same set of terms). The conversion function toI’ (analogous to toNatR’ in Figure 23) uses the weak induction principle of S (and the Kleene trick, cf. Section 2) to produce, from some x: S, a term of type I that is equal to x. Within the body of the proof, eq proves that x: S is equal to in (fmap unwrap xs), and the collection xs has subdata tupled together with erased proofs of the (lifted, see Figure 12) property that these are equal to terms of type I. The local definition mkI constructs from each such weak pair a term of type I (using Cedille’s local-binding construct) – and, furthermore, is definitionally equal to unwrap (Figure 7), as it erases to λ p. elimWkSigma p (λ s. s). Given the underlying z: PreS ·S and erased proof ih: LiftS ·(λ x: S. ι y: I. {y ≃ x}) z, we show
- z and ih must be equal (eqz); and
- z must be InductiveS (ind, erasing to ih),
which together allows us to form a term of type I, where in the second component of the introduced intersection we use φ (Figure 1) to cast z to the type of ind via eqz.
This makes the definition of the cast toI easy, as toI’ returns in a single term the components required for each argument of intrCast.
PrfAlg, fromPrfAlg, and induction.
Next, we show that we can convert any instance of the more mundane Parigot-style proof algebra PrfAlg (described in Section 5.2.2) to a PrfAlg’. This is done with fromPrfAlg. First, we assume a: PrfAlg ·P, a type Y, a subdata collection xs: F ·(Wrap ·(IndS ·P ·Y)), and y: Y (our handle for making recursive calls with the PrfAlg’ ·P we are defining). The goal is now to prove:
P (in (unwrapFIndS xs)) 
which is convertible with the type:
P (in (fmap (λ x. proj1 (repackIndS y x)) xs)) 
Helper function repackIndS converts between the wrapping and unwrapping done by PrfAlg (which uses Sigma) and PrfAlg’ (which uses Wrap), tupling together each subcomponent with the recursively computed results (by providing the subcomponent with two copies of y). Finally, we rewrite the expected type by the functor composition law:
P (in (fmap proj1 (fmap (repackIndS y) xs))) 
which is the type of the given expression a (fmap (repackIndS ·P y) xs).
Having fromPrfAlg, defining the induction principle induction is straightforward: given some a: PrfAlg ·P, cast the given s: S to the type I and provide it with two copies of fromPrfAlg a. The recursion principle rec is even simpler, a non-dependent usage of induction.
6.2.1 Properties of S
Our generically derived inductive Scott-encoded data enjoys the same properties we showed for Parigot-encoded data (Section 5.2.3, wherein they are further elaborated upon): call-by-name normalization for closed terms (norm), the cancellation laws (now given for the standard formulation of the case scheme as well as for the recursion scheme), the reflection law, and Lambek’s lemma. The second-to-last proof unique shows uniqueness for the universal mapping property of the derived recursive algebra, from which it is an easy consequence that it is an initial algebra. Each of these has been proven within Cedille (Figure 26).
We conclude the discussion of Scott-encoded data by observing that particular datatypes can be defined using this generic derivation in almost exactly the same way as with the generic Parigot encoding. For instance, the definition of Scott-encoded lists proceeds as described in Figure 22 of Section 5.2.4, modulo module imports and name qualifications.
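As an illustration of the datatype-generic claim, the same doubling trick instantiated at the list signature can be sketched in the untyped model (again our own Python rendering; the names nilR, consR, recList, and from_py are hypothetical, not the Cedille definitions):

```python
# Expanded Scott lists: two copies of the nil case (n1, n2), two of
# the cons case (c1, c2), and a copy of the list itself.
nilR = lambda n1: lambda c1: lambda n2: lambda c2: lambda l: n1(n2)(c2)(l)
consR = lambda h: lambda t: (
    lambda n1: lambda c1: lambda n2: lambda c2: lambda l:
        c1(h)(t)(n2)(c2)(l))

# Constant-time tail by case analysis.
tl = lambda l: l(nilR)(lambda h: lambda t: t)

recListBase = lambda x: lambda n: lambda c: lambda l: x
recListStep = lambda f: (
    lambda h: lambda t: lambda n: lambda c: lambda l:
        f(h)(t(n)(c)(n)(c)(tl(l))))

def recList(x, f, l):
    b, c = recListBase(x), recListStep(f)
    return l(b)(c)(b)(c)(l)

def from_py(xs):
    # Build an expanded Scott list from a Python list.
    l = nilR
    for h in reversed(xs):
        l = consR(h)(l)
    return l
```

recList computes a right fold with access to the head and the recursively computed result at each step, exactly the recursion scheme the generic derivation provides.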
7 Related Work
Monotone inductive types.
In [matthes02], Matthes employs Tarski’s fixpoint theorem to motivate the construction of a typed λ-calculus with monotone recursive types. The gap between this order-theoretic result and the type theory is bridged by way of category theory, with the evidence that a type scheme is monotonic corresponding to the morphism-mapping component of a functor. Matthes shows that as long as the reduction rule eliminating an unroll of a roll incorporates the monotonicity witness in a certain way, then strong normalization of System F is preserved by extension with monotone isorecursive types. Otherwise, he shows a counterexample to normalization.
In contrast, our approach can be characterized as an embedding of preorder theory within a type theory, with evidence of monotonicity corresponding to the mapping of a zero-cost cast over a type scheme. As mentioned in the introduction, deriving monotone recursive types within the type theory of Cedille has the benefit of guaranteeing that they enjoy precisely the same metatheoretic properties as Cedille itself – no additional work is required.
Recursive algebras.
Our use of casts in deriving recursive types guarantees that the rolling and unrolling operations take constant time, permitting the definition of efficient data accessors for inductive datatypes defined with them. However, what is usually sought after is an efficient recursion scheme for such data, and the derivation in Section 3.3 does not on its own provide this. Independently, [Me91_PredicativeUniversesPrimitiveRecursion, Ge92_InductiveCoinductiveTypesIterationRecursion] developed recursive algebras to give a category-theoretic semantics of the recursion scheme for inductive data, and [Ge92_InductiveCoinductiveTypesIterationRecursion, matthes02] use this notion in extending a typed λ-calculus with typing and reduction rules for an efficient datatype recursor. In our generic derivation of Parigot-encoded data, our weaker notion of recursive types (lacking as it is either a recursion or iteration scheme) is sufficient for defining datatypes directly in terms of recursive algebras, guaranteeing an efficient recursor.
Recursor for Scottencoded data.
The type definition used for the (non-dependent) strongly normalizing recursor for Scott-encoded naturals in Section 6.1 is due to [lepigre+19]. The type system in which they carry out this construction has built-in notions of least and greatest type fixpoints and a sophisticated form of subtyping that utilizes ordinals and “well-founded circular proofs” in a typing derivation. Roughly, the correspondence between their type system and that of Cedille is as follows: both theories are Curry-style, enabling a rich subtyping relation, which in Cedille is internalized as Cast; and in defining the recursor for Scott naturals, we replace the circular subtyping derivation with an inductive proof within Cedille itself that the subtyping relation holds. Section 6.2 generalizes their construction of an appropriate supertype for Scott-encoded data by making it generic (in an arbitrary functor F) and dependent.
We leave as future work the task of providing a more semantic (e.g. category-theoretic) account of the derivation of a recursor for Scott-encoded data.
Lambda encodings in Cedille.
Work prior to ours describes the generic derivation of induction for lambda-encoded data in Cedille. This was first accomplished by [firsov18] for the Church and Mendler encodings, which do not require recursive types as derived in this paper. In [firsov18b], the approach for the Mendler encoding was refined to enable efficient data accessors, resulting in the first-ever example of a lambda encoding in type theory with derivable induction, a constant-time destructor, and a representation requiring only linear space. To the best of our knowledge, this paper establishes that the Scott encoding is the second-ever example of a lambda encoding enjoying these same properties.
Conclusion
We have shown in this paper how monotone recursive types with constant-time roll and unroll operations can be derived within the type theory of Cedille by applying Tarski’s fixpoint theorem to a preorder on types constructed from zero-cost type coercions. As applications, we use the derived monotone recursive types to derive two recursive representations of data, the Parigot-style and Scott-style lambda encodings, generically in a signature functor F. These recursive representations enjoy constant-time data accessors, making them of practical significance. Furthermore, we gave for each encoding an induction principle and a proof of a collection of properties arising from the categorical semantics of datatypes as initial algebras. That this can be achieved for the Scott encoding is itself rather remarkable, and the derivation uses an inductive proof that a zero-cost type coercion holds between the type of Scott-encoded data and a suitable supertype, described by [lepigre+19].
Financial Aid
We gratefully acknowledge NSF support under award 1524519, and DoD support under award FA95501610082 (MURI program).