Course-of-Value Induction in Cedille

11/29/2018 ∙ by Denis Firsov, et al. ∙ The University of Iowa

In the categorical setting, histomorphisms model a course-of-value recursion scheme that allows functions to be defined using arbitrary previously computed values. In this paper, we use the Calculus of Dependent Lambda Eliminations (CDLE) to derive a lambda-encoding of inductive datatypes that admits course-of-value induction. Similar to course-of-value recursion, course-of-value induction gives access to inductive hypotheses at arbitrary depth of the inductive arguments of a function. We show that the derived course-of-value datatypes are well-behaved by proving Lambek's lemma and characterizing the computational behavior of the induction principle. Our work is formalized in the Cedille programming language and also includes several examples of course-of-value functions.

1 Introduction

Dependently typed programming languages with built-in infrastructure for defining inductive datatypes allow programmers to write functions with complex recursion patterns. For example, in Agda [1] we can implement the natural definition of Fibonacci numbers:

fib : Nat → Nat
fib zero = zero
fib (suc zero) = 1
fib (suc (suc n)) = fib (suc n) + fib n
This definition is accepted by Agda because its built-in termination checker sees that all recursive calls are done on structurally smaller arguments.

In contrast, in pure polymorphic lambda calculi (e.g., System F), inductive datatypes can be encoded by means of impredicative quantification (without requiring additional infrastructure). For example, if we assume that F is a well-behaved positive scheme (e.g., a functor), then we can express its least fixed point as an initial Mendler-style F-algebra. A Mendler-style algebra differs from a traditional F-algebra (F X → X) in that it takes an additional argument (of type R → X), which corresponds to a function for making recursive calls. Mendler-style algebras introduce a polymorphic type R for recursive scheme arguments, allowing recursive function calls to be restricted to structurally smaller arguments. At the same time, the polymorphic type prevents any kind of further inspection of those arguments.

AlgM ◂ (★ ➔ ★) ➔ ★ ➔ ★ = λ F. λ X. ∀ R : ★. (R ➔ X) ➔ F · R ➔ X.
FixM ◂ (★ ➔ ★) ➔ ★ = λ F. ∀ X : ★. AlgM F X ➔ X.
foldM ◂ ∀ F : ★ ➔ ★. ∀ X : ★. AlgM F X ➔ FixM F ➔ X
 = Λ F. Λ X. λ alg. λ v. v alg.
The simple recursion pattern provided by foldM (also known as catamorphism) can be tricky to work with. Let us define natural numbers as the least fixed point of the functor NF X = 1 + X. Hutton [2] explained that it is possible to use the universality law of initial F-algebras to show that there is no algebra g : AlgM NF Nat such that fib = foldM g. The reason for this is that in the third equation (of the natural definition of fib), the recursive call is made not only on the direct predecessor of the argument (suc n), but also on the predecessor of the predecessor (n).

The usual workaround involves “tupling”. More specifically, we define an algebra fibAlg of type AlgM NF (Nat × Nat), where the second Nat denotes the previous Fibonacci number. Then, we fold the input with fibAlg, and finally return the first projection of the tuple (here πᵢ denotes the i-th projection from the tuple):

fibAlg ◂ AlgM NF (Nat × Nat) = Λ R. λ rec. λ fr.
 case fr (λ _.  pair (suc zero) zero)      % zero case
         (λ r. let p = rec r in            % suc case
               pair (add (π₁ p) (π₂ p)) (π₁ p)).
fibTup ◂ Nat ➔ Nat = λ n. π₁ (foldM fibAlg n).
In this example, the rec function allows recursive calls to be made explicitly on elements of type R (which is Nat in disguise). This approach requires error-prone bookkeeping. Additionally, observe that the defining equation of the Fibonacci numbers (fibTup (suc (suc n)) = fibTup n + fibTup (suc n)) is true propositionally, but not definitionally (i.e., it does not follow by β-reduction).
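The tupling workaround can be tried out in an untyped setting. The following is a minimal Python sketch, not part of the Cedille development: naturals are modelled as tagged tuples, results are plain Python integers, and the names `fold_m`, `fib_alg`, `fib_tup`, and `from_int` are hypothetical stand-ins for foldM, fibAlg, and fibTup. Python cannot enforce the polymorphic type R, so the restriction that the subterm is only passed to `rec` holds here by convention only.

```python
# Naturals as tagged tuples: ("zero",) or ("suc", n).
ZERO = ("zero",)

def suc(n):
    return ("suc", n)

def from_int(k):
    # Build the k-th numeral by applying suc k times.
    n = ZERO
    for _ in range(k):
        n = suc(n)
    return n

def fold_m(alg, n):
    # Mendler-style fold: the algebra receives the recursive call `rec`
    # and one layer of structure; it should only hand the subterm to `rec`.
    return alg(lambda r: fold_m(alg, r), n)

def fib_alg(rec, layer):
    # Mirrors fibAlg: the carrier is a pair of numbers.
    if layer[0] == "zero":
        return (1, 0)              # pair (suc zero) zero
    cur, prev = rec(layer[1])      # the single recursive call on the predecessor
    return (cur + prev, cur)       # pair (add (pi1 p) (pi2 p)) (pi1 p)

def fib_tup(n):
    return fold_m(fib_alg, n)[0]   # first projection of the pair

print([fib_tup(from_int(k)) for k in range(7)])  # [1, 1, 2, 3, 5, 8, 13]
```

With the base pair taken from fibAlg as written, the sketch returns 1 at zero, and each later value is the sum of the two preceding ones.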

The alternative solution to tupling is course-of-value recursion (also known as histomorphism), which makes it possible to express nested recursive calls directly. The central concept of this approach is course-of-value F-algebras, which are similar to usual Mendler-style F-algebras, except that they take an abstract destructor (of type R ➔ F R) as yet another additional argument. The abstract destructor is a fixed-point unrolling (or, abstract inverse of the initial algebra), and intuitively allows for “pattern-matching” on constructors for the scheme F.

AlgCV ◂ (★ ➔ ★) ➔ ★ ➔ ★ = λ F. λ X.
  ∀ R : ★. (R ➔ F · R) ➔ (R ➔ X) ➔ F · R ➔ X.
For illustration purposes, assume that F is a functor and that FixCV F is its least fixed point. Also, assume that inCV and outCV are mutual inverses and represent a collection of constructors and destructors, respectively.
inCV  ◂ F (FixCV F) ➔ FixCV F = <..>
outCV ◂ FixCV F ➔ F (FixCV F) = <..>
Then, course-of-value recursion is characterized by the function foldCV, and its reduction behaviour is characterized by the cancellation property (cancel):
foldCV ◂ ∀ X : ★. AlgCV F X ➔ FixCV F ➔ X = <..>
cancel ◂ ∀ X : ★. ∀ alg : AlgCV F X. ∀ fx : F (FixCV F).
  foldCV alg (inCV fx) ≃ alg outCV (foldCV alg) fx = β.
Notice that after unfolding, the first argument of alg is instantiated with outCV (the destructor), and the second argument is instantiated with a partially applied foldCV (the recursive call).

To illustrate the nested recursive calls, we define course-of-value naturals (NatCV) as the course-of-value least fixed point of the functor NF (that is, FixCV NF). Then, the Fibonacci function can be implemented very close to the conventional “pattern-matching” style:

fibCV ◂ NatCV ➔ NatCV = foldCV (Λ R. λ out. λ rec. λ nf.
 case nf (λ _. zero)                       % zero case
  (λ r. case (out r) (λ _. suc zero)       % suc zero case
             (λ r’. rec r + rec r’))).     % suc (suc n) case
Here, out provides an additional layer of pattern-matching on the arguments of the function. Finally, it is important to observe that given that cancel is true by β-reduction, then fibCV (suc (suc n)) ≃ fibCV (suc n) + fibCV n is also true by β-reduction.
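The shape of fibCV can also be explored outside Cedille. The following Python sketch is an illustration under stated assumptions: naturals are tagged tuples, the destructor `out_cv` is the identity on that representation, results are Python integers rather than NatCV values, and `fold_cv` is defined directly by the cancellation property using Python's general recursion (which is exactly what the Cedille derivation avoids). All function names are hypothetical.

```python
# Naturals as tagged tuples: ("zero",) or ("suc", n).
ZERO = ("zero",)

def suc(n):
    return ("suc", n)

def from_int(k):
    n = ZERO
    for _ in range(k):
        n = suc(n)
    return n

def out_cv(n):
    # Destructor: exposes one layer. In this representation a value
    # already is its own outermost layer.
    return n

def fold_cv(alg, n):
    # Follows the cancellation property:
    #   foldCV alg (inCV fx) = alg outCV (foldCV alg) fx
    return alg(out_cv, lambda r: fold_cv(alg, r), out_cv(n))

def fib_alg_cv(out, rec, layer):
    if layer[0] == "zero":
        return 0                   # zero case
    r = layer[1]
    inner = out(r)                 # second layer of pattern matching
    if inner[0] == "zero":
        return 1                   # suc zero case
    r2 = inner[1]
    return rec(r) + rec(r2)        # suc (suc n) case

def fib_cv(n):
    return fold_cv(fib_alg_cv, n)

print([fib_cv(from_int(k)) for k in range(8)])  # [0, 1, 1, 2, 3, 5, 8, 13]
```

The algebra calls `rec` on both the predecessor and the predecessor's predecessor, which a plain Mendler-style algebra cannot do without tupling.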

The remaining question is how to define the least fixed point FixCV F for any positive scheme F, and how to derive the corresponding introduction and elimination principles. We might try a usual construction in terms of universal quantification and AlgCV:

FixCV ◂ (★ ➔ ★) ➔ ★ = λ F. ∀ X : ★. AlgCV F X ➔ X.
This definition fails since AlgCV F X is isomorphic to AlgM (Enr’ F) X where
Enr’ ◂ (★ ➔ ★) ➔ ★ ➔ ★ = λ F. λ X. F X × (X ➔ F X).
Enr’ F is a negative scheme and (in general) the least fixed points of negative schemes are undefined in a consistent type theory. As a result, it is common to implement foldCV in terms of general recursion from the host language or to add it as a primitive construction (see related work in Section 6).

The main contribution of this paper is the derivation of course-of-value datatypes in the Calculus of Dependent Lambda Eliminations (CDLE). The key inspiration for our work comes from the categorical construction known as restricted existentials (Section 3). We prove that the least fixed point of the restricted existential of scheme Enr’ F exists, and that it contains course-of-value datatypes as a subset (Section 4.1). Next, we employ heterogeneous equality from CDLE to define the datatype FixCV as this subset (Section 4.2). We also give (to the best of our knowledge) the first generic formulation and derivation of a course-of-value induction principle in a pure type theory. Finally, we show examples of functions defined over course-of-value natural numbers (Section 5).

The CDLE type theory is implemented in Cedille, which we use to type-check the formalized development of this paper. The Cedille formalization accompanying this paper is available at:
http://firsov.ee/cov-induction

2 Background

2.1 The CDLE Type Theory

CDLE [3, 4] is an extrinsically typed (or, Curry-style) version of the Calculus of Constructions (CC), extended with a heterogeneous equality type (t₁ ≃ t₂), Kopylov’s [6] dependent intersection type (ι x : T. T’), and Miquel’s [7] implicit product type (∀ x : T. T’). (The most recent version of CDLE [5] has been extended with a more expressive equality type, but this work does not make use of it.) To make type checking algorithmic, Cedille terms have typing annotations, and definitional equality of terms is modulo erasure of these annotations. The target of erasure in Cedille is the pure untyped lambda calculus with no additional constructs. Due to space constraints, we omit a more detailed summary of CDLE. However, this work is a direct continuation of our previous work [8], which includes a detailed explanation of all of the constructs of our language.

2.2 Identity Functions and Identity Mappings

Stump showed how to derive induction for natural numbers in CDLE [9]. As a next step, we generically derived induction for Church-style and Mendler-style lambda-encodings of datatypes that arise as a least fixed point of a functor [10]. As usual, a scheme F : ★ ➔ ★ is a functor if it comes equipped with a function lifting (fmap) that satisfies the identity and composition laws:

Functor ◂ (★ ➔ ★) ➔ ★ = λ F.
  Σ fmap : ∀ X Y : ★. (X ➔ Y) ➔ F X ➔ F Y.
  IdentityLaw fmap × CompositionLaw fmap.
Later, we noticed that our development only used fmap on functions which extensionally behave like identity functions. This observation suggested the introduction of a type of “identity functions”, and we generalized the construction of inductive datatypes from least fixed points of functors to a larger family of positive schemes (which we call “identity mappings”). While we omit the implementations below (indicated by <..>), a detailed description of these constructs can be found in [8].

Identity Functions

We define the type Id X Y as the collection of all functions from X to Y that erase to the term (λ x. x):

id ◂ ∀ X : ★. X ➔ X = Λ X. λ x. x.
Id ◂ ★ ➔ ★ ➔ ★ = λ X. λ Y. Σ f : X ➔ Y. f ≃ id.
Because Cedille is extrinsically typed, the domain (X) and codomain (Y) of an identity function need not be the same.

Introduction

Values of Id X Y are introduced by exhibiting that a function (e.g., f : X ➔ Y) behaves extensionally like an identity (i.e., x ≃ f x for any x):

intrId ◂ ∀ X Y: ★. Π f: X ➔ Y. (Π x: X. x ≃ f x) ➔ Id X Y = <..>

Elimination

An identity function Id X Y allows us to “cast” values of type X to values of type Y without changing the values themselves:

elimId ◂ ∀ X Y : ★. ∀ c : Id · X · Y. X ➔ Y = <..>
elimId-prop ◂ ∀ X Y : ★. ∀ c : Id · X · Y.
 ∀ x : X. elimId -c x ≃ x = β.
Therefore, Id X Y says that any value of type X can also be typed as Y. Again, this is possible because Cedille is an extrinsically typed language. Argument c of elimId is quantified using ∀ (rather than Π), indicating that it is an implicit (or, erased) argument. In elimId-prop, elimId is applied to -c, where the dash syntactically indicates that this is an implicit (or, erased) application. (Types, as opposed to values, are always erased in terms. Hence, using (∀ X : ★) in the classifier of a term is sensible but using (Π X : ★) is not. Additionally, we omit type applications in terms because they are inferred by Cedille.)

Identity Mappings

We say that scheme F is an identity mapping if it is equipped with a function that lifts identity functions:

IdMapping ◂ (★ ➔ ★) ➔ ★ = λ F. ∀ X Y: ★.Id X Y ➔ Id (F X) (F Y).
Intuitively, IdMapping F is similar to a functor’s fmap, but it only needs to be defined on identity functions, and no additional laws are required.

Every functor induces an identity mapping (by the application of intrId to fmap and its identity law), but not vice versa [8].

fm2im ◂ ∀ F : ★ ➔ ★. Functor · F ➔ IdMapping · F = <..>

2.3 Inductive Datatypes in Cedille

In our previous work, we showed how to define inductive datatypes as least fixed points of identity mappings. In this section, we review our main results. More comments and implementation details can be found in our previous work and the associated development [8].

To start with, we specify a scheme F and its identity mapping as module-level parameters.

module (F : ★ ➔ ★){imap : IdMapping · F}.
Curly braces around the imap variable indicate that it is quantified implicitly (or, as an erased parameter). Another way of saying this is that none of the definitions should depend on the computational behaviour of imap. Next, we list the basic definitions and results:

  • The type FixIndM represents the carrier of initial Mendler-style algebras, i.e. the least fixed point of F:

    FixIndM ◂ ★ = <..>
    foldM ◂ ∀ X : ★. AlgM · X ➔ FixIndM ➔ X = <..>
    
    To be able to prove induction, FixIndM is defined as an intersection of FixM and a proof of its inductivity (see [8] for details).

  • The functions inFixIndM and outFixIndM are mutual inverses and represent a collection of constructors and destructors of FixIndM.

    inFixIndM  ◂ F · FixIndM ➔ FixIndM = <..>
    outFixIndM ◂ FixIndM ➔ F · FixIndM = <..>
    

Next, we describe the formulation of the induction principle for FixIndM. We define a “dependent” counterpart of Mendler-style algebras that we call Q-proof-F-algebras. A value of type PrfAlgM Q should be understood as an inductive proof that predicate Q holds for every FixIndM built by constructors inFixIndM.

PrfAlgM ◂ (FixIndM ➔ ★) ➔ ★ = λ Q. ∀ R : ★. ∀ c : Id · R · FixIndM.
 (Π r : R. Q (elimId -c r)) ➔
 Π fr : F · R. Q (inFixIndM (elimId -(imap c) fr)).
Proof algebras allow the inductive hypotheses to be explicitly stated for every abstract R by providing an implicit identity function c : Id R FixIndM, and a dependent function of type Π r : R. Q (elimId -c r). Recall that we can use the identity function c to convert an abstract R to a concrete FixIndM (via the elimination elimId -c r, which reduces to r). Given an inductive hypothesis for every R, the proof algebra must conclude that the predicate Q holds for every FixIndM that is produced by constructors inFixIndM from any given abstract F R that has been “cast” to a concrete F FixIndM.

Finally, the induction principle says that given a Q-proof-algebra, we can conclude that Q holds for all FixIndM.

induction ◂ ∀ Q : FixIndM ➔ ★. PrfAlgM Q
 ➔ Π e : FixIndM. Q e = <..>

Just like F-algebras, Q-proof-algebras allow users to invoke inductive hypotheses only on direct subdata of a given argument. The rest of the paper is devoted to the formulation and derivation of a generic course-of-value induction principle that allows users to invoke inductive hypotheses on subdata at arbitrary depths.

3 Restricted Existentials

Uustalu and Vene defined a construction called the restricted existential to demonstrate an isomorphism between Church-style and Mendler-style initial algebras [11]. The importance of this is that for any difunctor (or, mixed variant functorial scheme) F, the restricted existential of F is an isomorphic covariant functor.

In this section, we define a variation that we call an identity restricted existential. We also derive its dependent elimination principle, and prove that the identity restricted existential of any scheme F (including negative and non-functorial ones) is an identity mapping. Later in the paper, the restricted existential will be the main tool for deriving course-of-value datatypes.

3.1 Restricted Coends

In the categorical setting, the restricted existential arises as a restricted coend. Our subsequent development requires existentials where the quantifier ranges over types. This can be provided by a restricted coend (RCoend H F), which is isomorphic to the existential type ∃ R. H R × F R (where H is what we are restricting by). Our development defines RCoend by taking advantage of the isomorphism between the universal type ∀ R. H R ➔ F R ➔ Q and the existential type (∃ R. H R × F R) ➔ Q (for any Q) that we have in mind. Now, let us formalize the notion of restricted coend.

Let H be an endodifunctor and F be a difunctor. An H-restricted F-coend is an initial object in the category of H-restricted F-cowedges. An H-restricted F-cowedge is a pair consisting of a carrier, which is an object of the underlying category, and a family of functions between sets that forms a dinatural transformation.

We translate this definition to Cedille, where an H-restricted F-cowedge corresponds to a type (C) and a polymorphic function (RCowedge H F C):

RCowedge ◂ (★ ➔ ★) ➔ (★ ➔ ★) ➔ ★ ➔ ★
 = λ H. λ F. λ C. ∀ R : ★. H · R ➾ F · R ➔ C.
To simplify the subsequent development, we render difunctors as schemes with a single parameter, and the restriction H R is made implicit (denoted by ➾, which is a non-dependent version of ∀).

The carrier of the initial cowedge can be implemented in terms of universal quantification:

RCoend ◂ (★ ➔ ★) ➔ (★ ➔ ★) ➔ ★ = λ H. λ F.
 ∀ C : ★. RCowedge · H · F · C ➔ C.
The second component of initial cowedges is a polymorphic function, (intrRCoend), which plays the role of the constructor of its carrier (RCoend H F), and is implemented as follows:
intrRCoend ◂ ∀ H F : ★ ➔ ★. RCowedge · H · F (RCoend · H · F)
  = Λ H. Λ F. Λ R. Λ ac. λ ga. (Λ Y. λ q. q · R -ac ga).
The (weak) initiality can be proved by showing that for any cowedge RCowedge H F C, there is a homomorphism from RCoend H F to C:
elimRCoend ◂ ∀ H F: ★ ➔ ★. ∀ C: ★. RCowedge · H · F · C ➔ RCoend · H · F ➔ C
 = Λ H. Λ F. Λ C. λ phi. λ e. e phi.
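Because Cedille terms erase to the pure untyped lambda calculus, the computational content of the constructor and eliminator is small. The following Python sketch shows the erased terms only, under the assumption that type abstractions and the implicit restriction argument disappear entirely; the names `intr_rcoend` and `elim_rcoend` are hypothetical transliterations of intrRCoend and elimRCoend.

```python
def intr_rcoend(ga):
    # Erasure of intrRCoend: the type arguments and the implicit
    # restriction vanish, leaving a package that hands its payload
    # to whatever continuation q it is given.
    return lambda q: q(ga)

def elim_rcoend(phi):
    # Erasure of elimRCoend: apply the package to the cowedge phi.
    return lambda e: e(phi)

# Packing a payload and eliminating it with a cowedge applies the
# cowedge to the payload.
packed = intr_rcoend([1, 2, 3])
print(elim_rcoend(len)(packed))  # 3
```

The equation `elim_rcoend(phi)(intr_rcoend(x)) == phi(x)` is the untyped shadow of the homomorphism from RCoend H F to C.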

3.2 Dependent Elimination for Restricted Coends

In this section, we utilize the intersection type (denoted by ι in Cedille) to define a restricted coend type for which the induction principle is provable. To do this, we follow the original recipe described by Stump to derive natural-number induction in Cedille. First, we define a predicate expressing that an H-restricted F-coend is inductive.

RCoendInductive ◂ Π H F : ★ ➔ ★. RCoend · H · F ➔ ★
  = λ H. λ F. λ e. ∀ Q : RCoend · H · F  ➔ ★.
  (∀ R : ★. ∀ hr : H · R. Π fr : F · R. Q (intrRCoend -hr fr)) ➔ Q e.
Second, we define the “true” inductive restricted coend as an intersection of the previously defined RCoend and the predicate RCoendInductive. In essence, this says that RCoendInd is the subset of RCoend carved out by the RCoendInductive predicate.
RCoendInd ◂ (★ ➔ ★) ➔ (★ ➔ ★) ➔ ★
 = λ H. λ F. ι x : RCoend · H · F. RCoendInductive · H · F x.
This definition builds on an observation by Leivant that under the Curry-Howard isomorphism, proofs in second-order logic that data satisfy their type laws can be seen as isomorphic to the Church-encodings of those data [12].

Next, we define the constructor for the inductive coend:

intrRCoendInd ◂ ∀ H F : ★ ➔ ★. RCowedge · H · F (RCoendInd · H · F)
 = Λ H. Λ F. Λ R. Λ hr. λ fr.
 [ intrRCoend -hr fr , Λ Q. λ q. q · R -hr fr ].
In Cedille, the term [ t , t’ ] introduces the intersection type ι x : T. T’ x, where t has type T and t’ has type [t/x]T’. Definitionally, values of intersection types reduce (via erasure) to their first components (i.e., [ t , t’ ] is definitionally equal to t). See [8] for more information on intersection types in Cedille. The induction principle is now derivable and has the following type:
indRCoend ◂ ∀ H F : ★ ➔ ★. ∀ Q : RCoendInd · H · F ➔ ★.
  (∀ R : ★. ∀ hr : H · R. Π fr : F · R. Q (intrRCoendInd  -hr fr))
   ➔ Π e : RCoendInd · H · F. Q e = <..>

3.3 Identity Restricted Existentials

We define the identity restricted existential of F and the object C as an F-coend restricted by a family of identity functions λ X. Id · X · C:

RExt ◂ (★ ➔ ★) ➔ ★ ➔ ★ = λ F. λ X. RCoendInd · (λ R : ★. Id R X) · F.
Next, we prove that the restricted existential of any F is an identity mapping:
imapRExt ◂ ∀ F : ★ ➔ ★. IdMapping · (RExtInd · F)
 = Λ F. Λ A. Λ B. Λ f. λ c. indRCoend c
   (Λ R. Λ i. λ gr. pair (intrRExtInd -(compose i f) gr) β).
Intuitively, RExtInd F X corresponds to the type ∃ R. Id R X × F R. Notice that RExtInd F X is positive because X occurs positively in Id, and that positivity does not depend on F. With the definition of identity restricted existentials in place, we can now move on towards using them to derive course-of-value induction.

4 Course-of-Value Datatypes

The development in this section is parameterized by an identity mapping:

module (F : ★ ➔ ★){imap : IdMapping · F}.

4.1 Precursor

In [11], Uustalu and Vene showed that it is possible to use restricted existentials to derive a superset of course-of-value natural numbers. We start by generalizing their construction to arbitrary inductive types, in terms of least fixed points of identity mappings.

The main idea is to define a combinator that pairs the value F X with the destructor function (of type X ➔ F X):

Enr’ ◂ ★ ➔ ★ = λ X. F X × (X ➔ F X).
Intuitively, we wish to construct a least fixed point of F and its destructor simultaneously. The resulting scheme Enr’ F is not positive, and therefore it can be neither a functor nor an identity mapping. This implies that we cannot take a least fixed point of it directly. Instead, we define CVF’ · F as a restricted existential of Enr’ F. Hence, the scheme CVF’ F is an identity mapping by the property of restricted existentials:
CVF’ ◂ ★ ➔ ★ = RExt · (Enr’ · F).
imCVF’ ◂ IdMapping (CVF’ F) = imapRExt · (Enr’ · F).
It is natural to ask what the relationship between the least fixed point of F and least fixed point of CVF’ F is.
FixCV’ ◂ ★ = FixIndM · (CVF’ · F) -(imCVF’ · F).
It turns out that FixCV’ is not a least fixed point of F, because a value of type F FixCV’ could be paired with any function of type FixCV’ ➔ F FixCV’. We will provide more intuition by describing the destructor and constructor functions of FixCV’.

Destructor

The generic development from Section 2.3 allows us to unroll FixCV’ into a value of CVF’ FixCV’ (which it was made from). Because CVF’ F is a restricted existential, we can use its dependent elimination to “project out” the value F FixCV’:

outCV’ ◂ FixCV’ ➔ F · FixCV’ = λ x. indRExt (outFixIndM -imapRExt x)
   (Λ R. Λ c. λ v. elimId -(imap -c) (π₁ v)).
In the definition above, the variable v has type F R × (R ➔ F R). Because F is an identity mapping, we can cast the first projection of v to F FixCV’ and return it. On the other hand, the function R ➔ F R cannot be cast to type FixCV’ ➔ F FixCV’, because the abstract type R appears both positively and negatively.

Constructor

Similarly, the generic development gives us the function inFixIndM, which constructs a FixCV’ value from the given CVF’ FixCV’. The latter must be built from a pair of F FixCV’ and a function of type FixCV’ ➔ F FixCV’. This observation gives rise to the following specialized constructor of FixCV’:

inCV’ ◂ (FixCV’ ➔ F FixCV’) ➔ F FixCV’ ➔ FixCV’ = <..>
This constructor indicates that FixCV’ represents a superset of course-of-value datatypes, because the function FixCV’ ➔ F FixCV’ is not restricted to the destructor outCV’, and the inductive value might contain a different function of that type at every constructor. We address this issue in the next section.

4.2 Course-of-Value Datatypes with Induction

In our previous work [8], we developed a generic unrolling function for least fixed points of identity mappings:

outFixIndM ◂ ∀ imap : IdMapping F. FixIndM F ➔ F (FixIndM F) = <..>
Observe that the only identity-mapping-specific variable is quantified implicitly. In other words, outFixIndM does not perform any F-specific computations. The same is true for the elimination principle of restricted existentials (indRCoend). Since outCV’ is implemented in terms of these functions, this observation suggests that we can refer to outCV’ as we define the subset of type FixCV’. In particular, we define the scheme Enr by pairing the value F X with the function f : X ➔ F X and the proof that this function is equal to the previously defined outCV’:
Enr ◂ ★ ➔ ★ = λ X. F · X × Σ f : X ➔ F · X. f ≃ outCV’.
This constraint between terms of different types is possible due to heterogeneous equality. Just like in the previous section, we define the restricted existential of Enr F and its least fixed point:
CVF ◂ ★ ➔ ★ = λ X. RExt · (Enr · F) X.
FixCV ◂ ★ = FixIndM · CVF (imapRExt · Enr).

Destructor

The destructor of FixCV is represented by exactly the same lambda-term as the destructor (outCV’) of FixCV’:

outCV ◂ FixCV ➔ F · FixCV = λ v. indRExt
 (outFixIndM  -imapRExt v) (Λ R. Λ c. λ v. elimId -(imap -c) (π₁ v)).
outCVEq ◂ outCV’ ≃ outCV = β.
Because the only difference between outCV’ and outCV is their typing annotations (which are inferred by the typechecker), they are definitionally equal in Cedille (as witnessed by β, the introduction rule of Cedille’s equality type).

Constructor

Armed with the destructor outCV and the proof outCVEq, we can now define the constructor of FixCV:

inCV ◂ F · FixCV ➔ FixCV = Λ F. Λ imap. λ fcv. inFixIndM -imapRExt
 (intrRExtInd -trivIdExt) (pair fcv (pair (outCV -imap) outCVEq)).

Lambek’s Lemma

As expected, inCV and outCV are mutual inverses, which establishes that FixCV is a fixed point of F.

lambekCV1 ◂ ∀ x : F FixCV. outCV (inCV x) ≃ x = β.
lambekCV2 ◂ ∀ x : FixCV.   inCV (outCV x) ≃ x = <..>

Induction

Recall that the induction principle for the least fixed point FixIndM is stated in terms of proof-algebras. Now, let us define proof-algebras for course-of-value datatypes:

PrfAlgCV ◂ (FixCV ➔ ★) ➔ ★ = λ Q. ∀ R : ★. ∀ c : Id · R · FixCV.
 Π out : R ➔ F · R. out ≃ outCV ➾
 (Π r : R. Q (elimId -c r)) ➔
 Π fr : F · R.  Q (inCV (elimId -(imap -c) fr)).
Notice that PrfAlgCV has an extra argument out (of type R ➔ F R), which represents an abstract unrolling function for abstract type R. Also, we have a proof that the out function is equal to the previously discussed destructor outCV. This evidence is needed when the construction of a particular proof-algebra depends on the exact definition of the unrolling function.

Course-of-value induction is expressible in terms of course-of-value proof-algebras and is proved by combining the induction principle of FixIndM with the dependent elimination principle of restricted existentials.

inductionCV ◂ ∀ Q : FixCV ➔ ★. PrfAlgCV Q ➔ Π x : FixCV. Q x = <..>
It is important to establish the computational behaviour of this proof-principle:
indCancel ◂ ∀ Q : FixCV ➔ ★. ∀ palg : PrfAlgCV Q. ∀ x : F FixCV.
 inductionCV palg (inCV x) ≃ palg outCV (inductionCV palg) x = β.
Above, notice how the abstract unrolling function out : R ➔ F R is being instantiated with the actual unrolling function outCV.

Finally, implementing course-of-value recursion (foldCV) from Section 1 in terms of course-of-value induction (inductionCV) is straightforward.

5 Examples

We now demonstrate the utility of our results with example functions and proofs on natural numbers that require course-of-value recursion. Note that the fibCV example from the introduction (Section 1) works as described, because foldCV is derivable from inductionCV. Recall that natural numbers may be defined as the least fixed point of a functor (NF ◂ ★ ➔ ★ = λ X. Unit + X.). As remarked in Section 2.2, because NF is a functor, it is also an identity mapping (nfimap ◂ IdMapping NF = <..>). We begin by defining the type of natural numbers (NatCV), supporting a constant-time predecessor function, as well as course-of-value induction:

NatCV ◂ ★ = FixCV NF nfimap.
zero ◂ NatCV = inCV -nfimap (in1 unit).
suc ◂ NatCV ➔ NatCV = λ n. inCV -nfimap (in2 n).
predCV ◂ NatCV ➔ NatCV = λ n.
  case (outCV -nfimap n) (λ u. n) (λ n’. n’).

5.1 Division

Consider an intuitive definition of division as iterated subtraction:

div : Nat ➔ Nat ➔ Nat
div 0 m = 0
div n m = if (n < m) then 0 else (suc (div (n - m) m))
Such a definition is rejected by Agda (and many languages like it), because Agda requires that recursive calls are made on arguments its termination checker can guarantee are structurally smaller, which it cannot do for an arbitrary expression (like n - m). With our development, the problematic recursive call (on n - m) is an instance of course-of-value recursion because we can define subtraction by iterating the predecessor function, and we have access to recursive results for every predecessor.

For convenience, we define the conventional foldNat as a specialized version of our generic development. Then, minus n m is definable as the m-fold predecessor of n.

foldNat ◂ ∀ R : ★. (R ➔ R) ➔ R ➔ NatCV ➔ R
 = Λ R. λ rstep. λ rbase. foldCV (Λ R’. λ out. λ rec. λ nf.
 case nf (λ _. rbase) (λ r’. rstep (rec r’))).

minus’ ◂ ∀ R : ★. (R ➔ NF R) ➔ R ➔ NatCV ➔ R
  = Λ R. λ pr. foldNat (λ r. case (pr r) (λ _. r) (λ r’. r’))
minus ◂ NatCV ➔ NatCV ➔ NatCV = minus’ (outCV -nfimap).

Above, we first define an abstract operation minus’ n m, where the type of n is polymorphic and where that type comes with an abstract predecessor pr. Then, the usual concrete minus n m is recovered by using NatCV for the polymorphic type and the destructor outCV -nfimap for the predecessor.

Now we can use minus’ to define division naturally, returning zero in the base case, and iterating subtraction in the step case. The definition below is accepted purely through type-checking and without any machinery for termination-checking.

div ◂ NatCV ➔ NatCV ➔ NatCV
  = λ n. λ m. inductionCV (Λ R. Λ c. λ pr. Λ preq. λ ih. λ nf.
  case nf
    (λ x. zero)                           % div 0 m
    (λ r. if (suc (elimId -c r) < m)      % div (suc n) m
      then zero
      else (suc (ih (minus’ pr r (pred m)))))) n

Notice that in the conditional statement, we use elimId -c to convert the abstract predecessor r to a concrete natural number, allowing us to apply suc to check if suc n is less than m. In the intuitive definition of div, we match on 0 in the first case, and on any wildcard pattern n in the second case. In contrast, when using inductionCV and case in our example, we must explicitly handle the zero and suc r cases. Consequently, while the intuitive definition recurses on div (n - m) m, we recurse on the predecessors ih (minus’ pr r (pred m)). This is equivalent because minus (suc n) (suc m) is equal to minus n m, for all numbers n and m. Moreover, with our development we can use course-of-value induction to prove this equivalence (minSucSuc below). By direct consequence, we can also prove that the defining equation (divSucSuc below, for the successor case) of the intuitive definition of division holds (the proof is not simply by β because we must rewrite by the equivalence minSucSuc):

minSucSuc ◂ Π n m : NatCV. minus (suc n) (suc m) ≃ minus n m = <..>
divSucSuc ◂ Π n m : NatCV. (suc n < suc m) ≃ ff ➔
  div (suc n) (suc m) ≃ suc (div (minus (suc n) (suc m)) (suc m)) = <..>
While the propositions are stated in terms of concrete minus, the div function is defined in terms of abstract minus’. Nonetheless, the propositions are provable due to the computational behavior of inductionCV, which instantiates the bound pr with outCV, allowing us to identify minus’ pr and minus.
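The computational behaviour of div can be traced in an untyped model. The following Python sketch is an illustration under stated assumptions, not the Cedille code: naturals are tagged tuples, `induction_cv` is defined directly from the indCancel equation using Python's general recursion, the erased arguments c and preq are dropped (so elimId -c r is just r), and the helper names `minus_abs`, `less`, `pred`, `from_int`, and `to_int` are hypothetical.

```python
# Naturals as tagged tuples: ("zero",) or ("suc", n).
ZERO = ("zero",)

def suc(n):
    return ("suc", n)

def from_int(k):
    n = ZERO
    for _ in range(k):
        n = suc(n)
    return n

def to_int(n):
    k = 0
    while n[0] == "suc":
        n, k = n[1], k + 1
    return k

def out_cv(n):
    # Destructor: a value is its own outermost layer in this model.
    return n

def pred(n):
    return n[1] if n[0] == "suc" else n

def less(a, b):
    # Strict order on numerals: strip successors in lockstep.
    while a[0] == "suc" and b[0] == "suc":
        a, b = a[1], b[1]
    return a[0] == "zero" and b[0] == "suc"

def induction_cv(palg, n):
    # Erased form of indCancel:
    #   inductionCV palg (inCV x) = palg outCV (inductionCV palg) x
    return palg(out_cv, lambda r: induction_cv(palg, r), out_cv(n))

def minus_abs(pr, r, m):
    # minus' pr r m: apply the abstract predecessor pr to r once per
    # successor in m; the predecessor of zero stays zero.
    while m[0] == "suc":
        layer = pr(r)
        r = layer[1] if layer[0] == "suc" else r
        m = m[1]
    return r

def div(n, m):
    def palg(pr, ih, layer):
        if layer[0] == "zero":
            return ZERO                               # div 0 m
        r = layer[1]                                  # predecessor of the argument
        if less(suc(r), m):                           # div (suc n) m
            return ZERO
        return suc(ih(minus_abs(pr, r, pred(m))))     # recurse on (suc n) - m
    return induction_cv(palg, n)

print(to_int(div(from_int(7), from_int(2))))  # 3
```

The inductive hypothesis `ih` is invoked on a value obtained by repeatedly applying the abstract predecessor, which is the pattern that plain structural recursion does not accept.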

Our accompanying code also includes a proof about division that takes full advantage of course-of-value induction, because it must invoke the inductive hypothesis on minus’ pr r (pred m) in the step case:

divLE ◂ Π n m : Nat. div n m ≤ n ≃ tt = <..>

5.2 Catalan Numbers

Many solutions to counting problems in combinatorics can be given in terms of Catalan numbers. The Catalan numbers are definable as the solution to the recurrence $C_0 = 1$ and $C_{n+1} = \sum_{i=0}^{n} C_i \, C_{n-i}$. This translates to an intuitive functional definition of the Catalan numbers:

cat : Nat → Nat
cat 0 = 1
cat (suc n) = sum (λ i → cat i * cat (n - i)) n
The sum function has type (Nat ➔ Nat) ➔ Nat ➔ Nat, where the lower bound of the sum is always zero (i=0), the second argument is the upper bound of the sum (n), and the first argument is the body of the sum (parameterized by i). Once again, this is not a structurally terminating function recognizable by Agda. While fib and div have a static number of course-of-value recursions (two and one, respectively), the number of recursions made by cat is determined by its input. Nonetheless, we are able to define cat using our development.
catCV ◂ NatCV ➔ NatCV
  = inductionCV (Λ R. Λ c. λ pf. Λ pfeq. λ ih . λ nf.
  case nf
    (λ _. suc zero)  % cat 0
    (λ r. sum        % cat (suc n)
      (λ i. mult
        (ih (minus’ pf r (minus (elimId -c r) i)))
        (ih (minus’ pf r i)))
      (elimId -c r))).
As with div, above r has abstract type R, so we convert it to a NatCV where necessary by applying elimId -c. The intuitive right factor cat (n - i) is directly encoded as ih (minus’ pf r i). In contrast, we cannot directly encode the intuitive left factor cat i, because i is a natural number and we only have inductive hypotheses for values of abstract type R. Fortunately, i is equivalent to n - (n - i) for all i where i ≤ n. We use the abstract minus’ function for the outer subtraction, whose first numeric argument is an abstract r but whose second numeric argument expects a concrete NatCV. Hence, the inner subtraction is a concrete minus, whose first argument is the concrete version of r (converted via the identity function c) and whose second argument is the concrete i (of type NatCV). Because the outer subtraction (minus’) returns an abstract R, we can get an inductive hypothesis for an expression equivalent to i. Finally, course-of-value induction allows us to prove the aforementioned equivalence for minus, and as a consequence the defining equation (for the successor case) of the intuitive definition of Catalan numbers:
minusId ◂ Π n i : NatCV. (i ≤ n) ≃ tt ➔ minus n (minus n i) ≃ i = <..>
catSuc ◂ Π n : NatCV.
  cat (suc n) ≃ sum (λ i. mult (cat i) (cat (minus n i))) n = <..>
Once again, the discrepancy between abstract minus’ and concrete minus is resolved in the proofs thanks to the computational behavior of inductionCV, which instantiates the bound pf (named pr in div) with outCV.
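The rewriting of the left factor can be illustrated on ordinary Python integers. The following is a minimal sketch under stated assumptions, not a rendering of the Cedille code: minus is truncated subtraction, sum_to is a hypothetical stand-in for sum, cat follows the intuitive definition, and cat_cv (also a hypothetical name) mirrors the shape of catCV by reaching both factors through subtractions from the predecessor r.

```python
def minus(n: int, m: int) -> int:
    """Truncated subtraction on naturals: n - m, or 0 when m > n."""
    return max(n - m, 0)


def sum_to(body, n: int) -> int:
    """Sum of body(i) for i from 0 up to and including the upper bound n."""
    return sum(body(i) for i in range(n + 1))


def cat(n: int) -> int:
    """Intuitive definition of the Catalan numbers."""
    if n == 0:
        return 1
    p = n - 1
    return sum_to(lambda i: cat(i) * cat(minus(p, i)), p)


def cat_cv(n: int) -> int:
    """Same function, with every recursive argument of the form r - something."""
    if n == 0:
        return 1
    r = n - 1  # predecessor of the input
    return sum_to(
        lambda i: cat_cv(minus(r, minus(r, i)))   # left factor: r - (r - i) = i
        * cat_cv(minus(r, i)),                    # right factor: r - i
        r,
    )


def minus_id_holds(bound: int = 20) -> bool:
    """Check minus n (minus n i) = i for all i <= n below the bound."""
    return all(
        minus(n, minus(n, i)) == i
        for n in range(bound)
        for i in range(n + 1)
    )
```

Both definitions are exponential in n as written, so they are only suitable for small inputs. The point of the sketch is that every recursive argument in cat_cv is obtained by subtracting from r, which is what makes the corresponding inductive hypotheses available in the abstract setting.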

6 Conclusions and Related Work

Ahn et al. [13] describe a hierarchy of Mendler-style recursion combinators. They implement generic course-of-value recursion in terms of Haskell’s general recursion. Then, they prove that course-of-value recursion for arbitrary “negative” inductive datatypes implies non-termination.

Miranda-Perea [14] describes extensions of System F with primitive course-of-value iteration schemes. He explains that the resulting systems lose strong normalization if they are combined with negative datatypes.

Uustalu et al. [15] show that natural-deduction proof systems for intuitionistic logics can be safely extended with a course-of-value induction operator in a proof-theoretically defensible way.

In contrast to the work described above, we have now shown that course-of-value induction can be derived within type theory (specifically, within CDLE). We conclude by comparing our approach to alternative approaches to handling complex termination arguments within type theory. (For comparison, our accompanying code includes Agda formalizations of fib, div, and cat in the “Below” style of Section 6.1 and the sized types style of Section 6.2.)

6.1 The Below Way

Goguen et al. [16] define the induction principle recNat (generalizing to all inductive types), which they use to elaborate dependent pattern matching to eliminators. In the step case, recNat receives BelowNat P n, which is a large tuple consisting of the motive P for every predecessor of n. Simple functions performing nested pattern matching (e.g., fib) can be written using recNat and nested case-analysis, by projecting out induction hypotheses from BelowNat P n. However, functions with more complex termination arguments (e.g., div and cat) require proving extra lemmas (e.g., recMinus in our accompanying code) to dynamically extract inductive hypotheses from BelowNat P n evidence. In our approach, such lemmas are unnecessary.
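The shape of this style of recursion can be sketched in an untyped setting. The Python fragment below is a loose illustration under stated assumptions: rec_nat, fib_below, and div_below are hypothetical names, and the tuple below plays the role of BelowNat P n by holding the results for all predecessors, most recent first. In Python the dynamic lookup in div_below is just an index computation; in the dependently typed setting, justifying that lookup is where an extra lemma is needed.

```python
def rec_nat(step, n: int):
    """Call step(k, below), where below = (f(k-1), f(k-2), ..., f(0))."""
    below = ()
    for k in range(n):
        # Extend the tuple with the result for k, keeping the newest first.
        below = (step(k, below),) + below
    return step(n, below)


def fib_below(n: int) -> int:
    """fib via static projections: only the first two components are used."""
    def step(k, below):
        if k == 0:
            return 0
        if k == 1:
            return 1
        return below[0] + below[1]    # fib (k-1) + fib (k-2)
    return rec_nat(step, n)


def div_below(n: int, m: int) -> int:
    """div via a dynamic projection; assumes m > 0."""
    def step(k, below):
        if k < m:
            return 0
        # below[j] holds the result for k - 1 - j, so k - m sits at index m - 1.
        return 1 + below[m - 1]
    return rec_nat(step, n)
```

For fib, the positions of the needed hypotheses are fixed, whereas for div the position depends on the divisor m, which mirrors the distinction drawn above between nested pattern matching and more complex termination arguments.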

6.2 Sized Types

Abel [17] extends type theory with a notion of sized types, which allows intuitive function definitions to be accepted by termination checking. Course-of-value induction (CoVI) and sized types (ST) have trade-offs. ST requires defining size-indexed versions of the datatypes, which necessitates altering conventional type signatures of functions to include size information. While CoVI is derivable within CDLE, ST extends the underlying type theory. On the other hand, CoVI is restricted to functions that recurse strictly on previous values. Hence, a function like merge sort can be written using ST but not with CoVI. As future work, we would like to investigate datatype encodings with a restricted version of abstract constructors (in addition to abstract destructors) for defining functions like merge sort.
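To see informally why merge sort falls outside recursion on previous values, consider the Python sketch below. It is an illustration under stated assumptions: lists are modeled as nested pairs, tails (a hypothetical helper) enumerates the values reachable from the input with destructors alone, and split deals the elements into two freshly constructed lists. For the sample input, neither half is among the tails, so the recursive calls of merge_sort are not calls on predecessors of its argument.

```python
def from_list(xs):
    """Build a cons-list: () is empty, (head, tail) is a cons cell."""
    out = ()
    for x in reversed(xs):
        out = (x, out)
    return out


def to_list(cell):
    """Convert a cons-list back to a Python list."""
    out = []
    while cell != ():
        head, cell = cell
        out.append(head)
    return out


def tails(cell):
    """The input and every value reachable from it using only destructors."""
    out = [cell]
    while cell != ():
        _, cell = cell
        out.append(cell)
    return out


def split(cell):
    """Deal elements alternately into two newly constructed lists."""
    if cell == ():
        return (), ()
    head, rest = cell
    left, right = split(rest)
    return (head, right), left


def merge(a, b):
    """Merge two sorted cons-lists into one sorted cons-list."""
    if a == ():
        return b
    if b == ():
        return a
    if a[0] <= b[0]:
        return (a[0], merge(a[1], b))
    return (b[0], merge(a, b[1]))


def merge_sort(cell):
    """Sort by recursing on the two halves produced by split."""
    if cell == () or cell[1] == ():
        return cell
    left, right = split(cell)
    return merge(merge_sort(left), merge_sort(right))
```

For the input [3, 1, 2], split yields [3, 2] and [1], while the tails are [3, 1, 2], [1, 2], [2], and the empty list. A half can coincide with a tail for particular inputs, but in general the halves are built with constructors rather than obtained by destructing the argument.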

References

  • [1] Norell, U.: Towards a practical programming language based on dependent type theory. PhD thesis, Chalmers University of Technology (2007)
  • [2] Hutton, G.: A tutorial on the universality and expressiveness of fold. J. Funct. Program. 9(4) (July 1999) 355–372
  • [3] Stump, A.: The Calculus of Dependent Lambda Eliminations. Journal of Functional Programming 27 (2017) e14
  • [4] Stump, A.: From Realizability to Induction via Dependent Intersection. Ann. Pure Appl. Logic (2018) to appear.
  • [5] Stump, A.: Syntax and Semantics of Cedille. (2018)
  • [6] Kopylov, A.: Dependent intersection: A new way of defining records in type theory. In: 18th IEEE Symposium on Logic in Computer Science (LICS). (2003) 86–95
  • [7] Miquel, A.: The Implicit Calculus of Constructions Extending Pure Type Systems with an Intersection Type Binder and Subtyping. In Abramsky, S., ed.: Typed Lambda Calculi and Applications (TLCA). (2001) 344–359
  • [8] Firsov, D., Blair, R., Stump, A.: Efficient Mendler-style lambda-encodings in Cedille. In: Interactive Theorem Proving - 9th International Conference, ITP 2018, Held as Part of the Federated Logic Conference, FLoC 2018, Oxford, UK, July 9-12, 2018, Proceedings. (2018) 235–252
  • [9] Stump, A.: From Realizability to Induction via Dependent Intersection (2017) Under consideration for Annals of Pure and Applied Logic.
  • [10] Firsov, D., Stump, A.: Generic derivation of induction for impredicative encodings in Cedille. In: Proceedings of the 7th ACM SIGPLAN International Conference on Certified Programs and Proofs. CPP 2018, New York, NY, USA, ACM (2018) 215–227
  • [11] Uustalu, T., Vene, V.: Mendler-style inductive types, categorically. Nordic J. of Computing 6(3) (September 1999) 343–361
  • [12] Leivant, D.: Reasoning about functional programs and complexity classes associated with type disciplines. In: 24th Annual Symposium on Foundations of Computer Science (FOCS), IEEE Computer Society (1983) 460–469
  • [13] Ahn, K.Y., Sheard, T.: A hierarchy of Mendler style recursion combinators: Taming inductive datatypes with negative occurrences. In: Proceedings of the 16th ACM SIGPLAN International Conference on Functional Programming. ICFP ’11, New York, NY, USA, ACM (2011) 234–246
  • [14] Miranda-Perea, F.E.: Some remarks on type systems for course-of-value recursion. Electronic Notes in Theoretical Computer Science 247 (2009) 103–121 Proceedings of the Third Workshop on Logical and Semantic Frameworks with Applications (LSFA 2008).
  • [15] Uustalu, T., Vene, V.: Least and greatest fixed points in intuitionistic natural deduction. Theoretical Computer Science 272(1) (2002) 315–339 Theories of Types and Proofs 1997.
  • [16] Goguen, H., McBride, C., McKinna, J.: Eliminating dependent pattern matching. In: Algebra, Meaning, and Computation. Springer (2006) 521–540
  • [17] Abel, A.: MiniAgda: Integrating sized and dependent types. In: Workshop on Partiality And Recursion in Interactive Theorem Provers (PAR). (July 2010)