Index-Stratified Types (Extended Version)

05/01/2018 ∙ by Rohan Jacob-Rao, et al. ∙ Digital Asset, McGill University

We present Tores, a core language for encoding metatheoretic proofs. The novel features we introduce are well-founded Mendler-style (co)recursion over indexed data types and a form of recursion over objects in the index language to build new types. The latter, which we call index-stratified types, are analogous to the concept of large elimination in dependently typed languages. These features combined allow us to encode sophisticated case studies such as normalization for lambda calculi and normalization by evaluation. We prove the soundness of Tores as a programming and proof language via the key theorems of subject reduction and termination.

1 Introduction

Recursion is a fundamental tool for writing useful programs in functional languages. When viewed from a logical perspective via the Curry-Howard correspondence, well-founded recursion corresponds to inductive reasoning. Dually, well-founded corecursion corresponds to coinductive reasoning. However, concentrating only on well-founded (co)recursive definitions is not sufficient to support the encoding of meta-theoretic proofs. There are two missing ingredients: 1) To express fine-grained properties we often rely on first-order logic, which is analogous to indexed types in programming languages. 2) Many common notions cannot be directly characterized by well-founded (co)recursive definitions. An example is Girard’s notion of reducibility for functions: a term $M$ is reducible at type $A \to B$ if, for all terms $N$ that are reducible at type $A$, we have that $M\,N$ is reducible at type $B$. This definition is well-founded because it is by structural recursion on the type indices ($A$ and $B$), so we want to admit such definitions.

Our contribution in this paper is a core language called Tores that features indexed types and (co)inductive reasoning via well-founded (co)recursion. The primary forms of types are indexed (co)recursive types, over which we support reasoning via Mendler-style (co)recursion. Additionally, Tores features index-stratified types, which allow further definitions of types via well-founded recursion over indices. The main difference between the two forms is that (co)recursive types are more flexible, allowing (co)induction, while stratified types only support unfolding based on their indices. The combination of the two features is especially powerful for formalizing metatheory involving logical relations. This is partly because type definitions in Tores do not require positivity, a condition used in other systems to ensure termination and in turn logical consistency. Despite this, we are able to prove termination of Tores programs using a semantic interpretation of types.

Justifying definitions that recurse on a given index, in addition to well-founded (co)recursive definitions, has been explored in proof theory (see for example Tiu [2012], Baelde and Nadathur [2012]). While this line of work is more general, it is also more complex and further from standard programming practice. In dependent type theories, large eliminations serve the same purpose. Our approach, grounded in the Curry-Howard isomorphism, provides a complementary perspective on this problem where we balance expressiveness and ease of programming with a compact metatheory. We believe this may be an advantage when considering more sophisticated index languages and reasoning techniques.

The combination of indexed (co)recursive types and stratified types is already used in the programming and proof environment Beluga, where the index language is an extension of the logical framework LF together with first-class contexts and substitutions [Nanevski et al., 2008, Pientka, 2008, Cave and Pientka, 2012]. This allows elegant implementations of proofs using logical relations [Cave and Pientka, 2013, 2015] and normalization by evaluation [Cave and Pientka, 2012]. Tores can be seen as a small kernel into which we elaborate total Beluga programs, thereby providing post-hoc justification for viewing Beluga programs as (co)inductive proofs.

2 Index Language for Tores

The design of Tores is parametric over an index language. Following Thibodeau et al. [2016] we stay as abstract as possible and state the general conditions the index language must satisfy. Whenever we require inspection of the particular index language, namely the structure of stratified types and induction terms, we will draw attention to it.

To illustrate the required structure for a concrete index language, we use natural numbers. In practice, however, we can consider other index languages such as those of strings, types [Cheney and Hinze, 2003, Xi et al., 2003], or (contextual) LF [Pientka, 2008, Cave and Pientka, 2012]. It is important to note that, for most of our design, we accommodate a general index language up to the complexity of Contextual LF. Thus we treat index types and Tores kinds as dependently typed, although we use natural numbers in stratified types and induction terms.

The abstract requirements of our index language are listed throughout this section. To summarize them here, they are: decidable type checking, decidable equality, standard substitution principles, decidable unification as well as sound and complete matching. Implicitly, we also require that each index type intended for use in stratified types and induction terms should have a well-founded recursion scheme, i.e. an induction principle. For an index language of Contextual LF, for example, the recursion scheme can be generated using a covering set of index terms for each index type [Pientka and Abel, 2015]. This inductive structure is necessary to show decidability of type checking (Section 3.3) and termination (Section 4.4) of Tores.

2.1 General Structure

We refer to a term in the index language as an index term $M$, which may have an index type $U$. In the case of natural numbers, there is a single index type $\mathsf{nat}$, and index terms are built from $0$, $\mathsf{suc}$, and variables $u$ which must be declared in an index context $\Delta$.

Tores relies on typing for index terms which we give for natural numbers in Fig. 1. The equality judgment for natural numbers is given simply by reflexivity (syntactic equality). We also give typing for index substitutions, which supply an index term for each index variable in the domain and describe well-formed contexts. These definitions are generic.

Figure 1: Index language structure

We require that both typing and equality of index terms be decidable in order for type checking of Tores programs to be decidable.

Requirement 1.

Index type checking is decidable.

Requirement 2.

Index equality is decidable.

We can lift the kinding, typing, equality and matching rules to spines of index terms and types generically. We write and for the empty spines of terms and types respectively. If is an index term and is a spine, then is a spine. Similarly if is an index type assignment and is a type spine, then is a type spine. Spines are convenient for setting up the types and terms of Tores. Unlike index substitutions which are built from right to left, spines are built from left to right.

Below we define well-kinded spines of index types and well-typed spines of index terms, which are generic to the particular index language.

Type checking of index spines is decidable.

Proof.

Simply rely on decidable type checking of single index terms (Req. 1). ∎

2.2 Substitutions

Throughout our development we use both a single index substitution operation and a simultaneous substitution operation . For composition of simultaneous substitutions we write . [Composition of index substitutions] Suppose and . Then where

We rely on standard properties of single and simultaneous substitutions which we summarize below. These say that substitutions preserve typing (3.1 and 3.2) and equality (3.3) and are associative (3.4).

Requirement 3 (Index substitution principles).


  1. If and then .

  2. If and then .

  3. If and then .

  4. If and and then .

2.3 Unification and Matching

Type checking of Tores relies on a unification procedure to generate a most general unifier (MGU). A unifier for index terms and in a context is a substitution which transforms and into syntactically equal terms in another context . That is, and . is “most general” if it does not make more commitments to variables than absolutely necessary. A unifying substitution only makes sense together with its range , so we usually write them as a pair . In general, there may be more than one MGU for a particular unification problem, or none at all. However, we require here that each problem has at most one MGU up to -equivalence. We write the generation of an MGU using the judgment , where is either the MGU if it exists or representing that unification failed. To illustrate, we show the unification rules for natural numbers. We write for the identity substitution that maps index variables from to themselves.

The unification procedure is required for type checking the equality elimination forms and in Tores, which we explain in Section 3.2. In each form, the term is a witness of an index equality . In order to use this equality (or determine that it is spurious), we perform unification and check that the result matches the source term. For the term , we check that is an -equivalent unifier to the provided one . For the failure term we check that is , yielding a contradiction. Hence our type checking algorithm for Tores relies on a sound and complete unification procedure. We summarize our requirements for unification below.

Requirement 4 (Decidable unification).

Given index terms and in a context , the judgment is decidable. Either is , the unique MGU up to -equivalence, or is and there is no unifier.
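To make the unification requirement concrete for the natural-number index language, the following is a minimal sketch in Python. It is an illustration under stated assumptions, not the paper's rules: the tuple encoding of index terms, the function names `apply_subst`, `occurs` and `unify`, and the use of `None` for failure are all hypothetical, and the sketch does not track the range context of the unifier.

```python
# Index terms over natural numbers, in an encoding assumed for this sketch:
#   ("zero",)      the constant 0
#   ("suc", t)     the successor of index term t
#   ("var", "u")   an index variable named u


def apply_subst(subst, term):
    """Apply a substitution (dict from variable names to terms) to a term."""
    tag = term[0]
    if tag == "var":
        name = term[1]
        # Follow bindings so that the result mentions only unbound variables.
        return apply_subst(subst, subst[name]) if name in subst else term
    if tag == "suc":
        return ("suc", apply_subst(subst, term[1]))
    return term  # ("zero",)


def occurs(name, term):
    """Check whether the variable `name` occurs in `term`."""
    if term[0] == "var":
        return term[1] == name
    if term[0] == "suc":
        return occurs(name, term[1])
    return False


def unify(left, right, subst=None):
    """Return a most general unifier as a dict, or None if unification fails."""
    subst = dict(subst or {})
    left = apply_subst(subst, left)
    right = apply_subst(subst, right)
    if left == right:
        return subst
    if left[0] == "var":
        if occurs(left[1], right):
            return None  # e.g. u against suc u has no finite solution
        subst[left[1]] = right
        return subst
    if right[0] == "var":
        return unify(right, left, subst)
    if left[0] == "suc" and right[0] == "suc":
        return unify(left[1], right[1], subst)
    return None  # constructor clash: zero against suc


zero = ("zero",)
u = ("var", "u")
print(unify(("suc", u), ("suc", zero)))
print(unify(zero, ("suc", u)))
```

In this sketch a successful result plays the role of the MGU used when checking the equality elimination form, and `None` plays the role of the failure outcome used when checking the contradiction form.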

Finally, our operational semantics relies on index matching. This is an asymmetric form of unification: given terms in and in , matching identifies a substitution such that . We describe it using the judgment .

Matching is used during evaluation of the equality elimination to extend the substitution to a full index environment (grounding substitution) . To achieve this, we must lift the notion of matching to the level of index substitutions. This can be done generically given an algorithm for matching index terms. The judgment says that matching discovered a substitution such that .

To illustrate an algorithm for index matching, we provide the rules for our natural number domain in Fig. 2. We also show the generic lifting of the algorithm to match index substitutions.

Figure 2: Index matching for natural numbers and generic substitutions
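Index matching and its lifting to substitutions can be sketched in Python as follows. This is a hedged illustration rather than a transcription of Fig. 2: the tuple encoding of terms and the names `match`, `match_subst` and `instantiate` are assumptions made for the example, and substitutions are modelled as dictionaries from variable names to terms.

```python
# Index terms, in an encoding assumed for this sketch:
#   ("zero",), ("suc", t), ("var", "u")


def match(pattern, closed, env=None):
    """Match a pattern (which may contain variables) against a closed term.

    Returns a dict instantiating the pattern's variables, or None on failure.
    """
    env = dict(env or {})
    tag = pattern[0]
    if tag == "var":
        name = pattern[1]
        if name in env:
            # A repeated variable must be instantiated consistently.
            return env if env[name] == closed else None
        env[name] = closed
        return env
    if tag == "zero":
        return env if closed == ("zero",) else None
    if tag == "suc":
        if closed[0] != "suc":
            return None
        return match(pattern[1], closed[1], env)
    raise ValueError("unknown index term: %r" % (pattern,))


def match_subst(pattern_subst, closed_subst):
    """Lift matching pointwise to substitutions with the same domain."""
    env = {}
    for name, pattern in pattern_subst.items():
        if name not in closed_subst:
            return None
        env = match(pattern, closed_subst[name], env)
        if env is None:
            return None
    return env


def instantiate(env, pattern):
    """Replace the variables of a pattern by the closed terms found in env."""
    if pattern[0] == "var":
        return env[pattern[1]]
    if pattern[0] == "suc":
        return ("suc", instantiate(env, pattern[1]))
    return pattern


zero = ("zero",)
theta = {"n": ("suc", ("var", "u"))}   # a unifier-like substitution
rho = {"n": ("suc", ("suc", zero))}    # a grounding substitution
rho_prime = match_subst(theta, rho)
print(rho_prime)
```

The soundness property corresponds to the check that instantiating each pattern in `theta` with `rho_prime` gives back the matching entry of `rho`.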

We require that index matching, together with its lifting to index substitutions, is both sound and complete. We omit the full specifications here and instead state the required properties in our final requirements below.

Requirement 5 (Soundness of index matching).


  1. If and then and .

  2. If and then and .

Requirement 6 (Completeness of index matching).

Suppose and and . Then .

3 Specification of Tores

We now describe Tores, a programming language designed to express (co)inductive proofs and programs using Mendler-style (co)recursion. It also features index-stratified types, which allow definitions of types via well-founded recursion over indices.

3.1 Types and Kinds

Besides unit, products and sums, Tores includes a nonstandard function type , which combines a dependent function type and a simple function type. It binds a number of index variables which may appear in both and . If the spine of type declarations is empty then degenerates to the simple function space. We can also quantify existentially over an index using the type , and have a type for index equality . These two types are useful for expressing equality constraints on indices. We model (co)recursive and stratified types as type constructors of kind . These introduce type variables , which we track in the type variable context . There is no positivity condition on recursive types, as the typing rules for Mendler-recursion enforce termination without it.

A stratified type is defined by primitive recursion on an index term. For the index type , the two branches correspond to the two constructors and . Intuitively, will behave like and will behave like . For richer index languages such as Contextual LF we can generate an appropriate recursion scheme following Pientka and Abel [2015].

We illustrate indexed recursive types and stratified types using vectors, i.e. lists indexed by their length, with elements of type . Vectors are of kind . We omit the kind annotation for better readability in the subsequent type definitions. One way to define vectors is with an indexed recursive type, an explicit equality and an existential type: .

Alternatively, they can be defined as a stratified type: . In this case equality reasoning is implicit. While we have a choice how to define vectors, some types are only possible to encode using one form or the other.

A type that must be stratified is the encoding of reducibility for simply typed lambda terms. This example is explored in detail by Cave and Pientka [2013]; our work gives it theoretical justification.

Here the index objects are the simple types, unit and of index type tp, as well as lambda terms (), and of index type tm. We can define reducibility as a stratified type of kind . This relies on an indexed recursive type (omitted here) that describes when a term steps to a value.

To illustrate a corecursive type, we define an indexed stream of bits following Thibodeau et al. [2016]. The index here guarantees that we are reading exactly bits. Once , we read a new message consisting of the length of the message together with a stream indexed by . In contrast to the recursive type definition for vectors, here the equality constraints guard the observations we can make about a stream.

3.2 Terms

Tores contains many common constructs found in functional programming languages, such as unit, pairs and case expressions. We focus on the less standard constructs: indexed functions, equality witnesses, well-founded recursion and index induction.

Since we combine the dependent and simple function types in , we similarly combine abstraction over index variables and a term variable in our function term . The corresponding application form is . The term of function type receives first a spine of index objects followed by a term . Each equality type has at most one inhabitant refl witnessing the equality. There are two elimination forms for equality: the term uses an equality proof for together with a unifier to refine the body in a new index context . It may also be the case that the equality witness is false, in which case we have reached a contradiction and abort using the term . Both forms are necessary to make use of equality constraints that arise from indexed type definitions and to show that some cases are impossible.

Recursive types are introduced by the “fold” syntax , and stratified types are introduced by terms. Here ranges over constructors in the index language such as and . The important difference is how we eliminate recursive and stratified types. We can analyze data defined by a recursive type using Mendler-style recursion . This gives a powerful means of recursion while still ensuring termination. Stratified types can only be unfolded using according to the index. To take full advantage of stratified types, we also allow programmers to use well-founded recursion over index objects, writing . Intuitively, if the index object is , then we pick the first branch and execute ; if the index object is then we pick the second branch instantiating with and allowing recursive calls inside . While this induction principle is specific to natural numbers, it can also be derived for other index domains, in particular contextual LF (see Pientka and Abel [2015]).

Recall that vectors can be defined using the indexed recursive type or the stratified type . Which definition we choose impacts how we write programs that analyze vectors. We show the difference using a recursive function that copies a vector.

To analyze the recursively defined vector, we use recursion and case analysis of the input vector to reconstruct the output vector. If we receive a non-empty list, we take it apart and expose the equality proofs, before reassembling the list. The recursion is valid according to the Mendler typing rule since the recursive call to is made on the tail of the input vector. The program is fairly verbose due to the need to unpack the -type and to split pairs. We also need to inject values into the recursive type using the tag. In general, we may also need to reason explicitly with equality constraints.

To contrast we show the program using induction on natural numbers and unfolding the stratified type definition of . Note that the first argument is the natural number index paired with a unit term argument, since index abstraction is always combined with term abstraction. The program analyzes and in the case unfolds the input vector before reconstructing it using the result of the recursive call. In this version of the equality constraints are handled silently by the type checker.
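The stratified reading of vectors and the copy function by index induction can be approximated in Python. This is only a sketch under assumptions: Python has no indexed types, so the stratified definition is modelled by a runtime predicate `is_vec`, and the names `ind`, `is_vec` and `copy` are hypothetical. The sketch also takes the index alone as first argument, rather than pairing it with a unit term.

```python
def ind(zero_case, suc_case):
    """Induction over a natural-number index.

    zero_case is the result at index 0; suc_case receives the predecessor m
    and the result already computed for m, and returns the result for m + 1.
    """
    def run(n):
        if n < 0:
            raise ValueError("index must be a natural number")
        if n == 0:
            return zero_case
        return suc_case(n - 1, run(n - 1))
    return run


def is_vec(n, value):
    """Stratified definition by recursion on the index n:
    Vec 0 is the unit type, Vec (m + 1) is a pair of an element and a Vec m.
    """
    if n == 0:
        return value == ()
    return (
        isinstance(value, tuple)
        and len(value) == 2
        and is_vec(n - 1, value[1])
    )


# copy n : Vec n -> Vec n, defined by induction on the index n.
copy = ind(
    lambda v: (),                                       # index 0: rebuild unit
    lambda m, copy_m: (lambda v: (v[0], copy_m(v[1]))),  # index m + 1
)

vec3 = ("a", ("b", ("c", ())))
print(is_vec(3, vec3), copy(3)(vec3))
```

No equality witnesses appear in the program: which shape the vector has is determined entirely by the index, which mirrors the remark that the equality constraints are handled silently in the stratified version.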

Let us now build streams using Mendler-style corecursion. Streams of natural numbers are defined as . We can define a stream of natural numbers starting from as:

Here, the corecursion constructor takes a function whose argument is the seed used to build the stream and produces the pair of the head and the tail of our stream. In this case, the seed is simply the natural number corresponding to the current head of the stream. As we move to the tail, we simply use the corecursive call made available by on the successor of the seed, as the next element of the stream will be this new number.

Let us now consider a second example: the stream of Fibonacci numbers.

In order to build the Fibonacci stream, we use as a seed a pair of two numbers corresponding to the last two Fibonacci numbers that have been computed so far. From there, the head of the stream is simply the first number of the pair, while the new seed is the second number together with the sum of the two, representing the second and third number, respectively. Hence, to obtain the whole Fibonacci stream, we simply write .
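The two stream constructions can be sketched in Python with a lazily observed stream. This is an illustration only: the class `Stream`, the helpers `corec` and `take`, and the initial Fibonacci seed `(0, 1)` are assumptions of the example, and Python does not enforce the typing discipline that makes the corecursive call well-founded in Tores.

```python
class Stream:
    """A stream given by a step function and a seed; nothing is computed
    until the stream is observed."""

    def __init__(self, step, seed):
        self._step = step
        self._seed = seed

    def observe(self):
        """Return the pair (head, tail), unfolding exactly one layer."""
        def corecursive_call(new_seed):
            return Stream(self._step, new_seed)
        return self._step(corecursive_call, self._seed)


def corec(step):
    """Build a stream generator from step(corecursive_call, seed)."""
    return lambda seed: Stream(step, seed)


# Natural numbers starting from the seed n: head is n, tail continues at n + 1.
nats_from = corec(lambda rec, n: (n, rec(n + 1)))

# Fibonacci: the seed is a pair of consecutive Fibonacci numbers.
fib_from = corec(
    lambda rec, pair: (pair[0], rec((pair[1], pair[0] + pair[1])))
)


def take(count, stream):
    """Observe the first `count` elements of a stream."""
    elements = []
    for _ in range(count):
        head, stream = stream.observe()
        elements.append(head)
    return elements


print(take(5, nats_from(0)))
print(take(7, fib_from((0, 1))))
```

The step function never inspects the tail it builds; it only passes a new seed to the corecursive call, which is the pattern described in the two examples above.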

Note that Tores does not have an explicit notion of falsehood. This is because it is definable using existing constructs: we can define the empty type as a recursive type , and a contradiction term , for any type . Our termination result with the logical relation in Section 4.3 shows that the type contains no values and hence no closed terms, which implies logical consistency of Tores (not all propositions can be proven).

3.3 Typing Rules

We define a bidirectional type system in Fig. 3 with two mutually defined judgments: checking a term against a type and synthesizing a type for a term. We can move from checking to synthesis via the conversion rule and from synthesis to checking using a type annotation. The typing rules for unit, products, sums and existentials are standard. We focus here on equality, recursive and stratified types.

Figure 3: Typing rules for Tores

The introduction for an index equality type is simply refl, which is checked via equality in the index domain. Both equality elimination forms rely on unification in the index domain (see Section 2.3). Specifically, the term checks against any type because the unification must fail, establishing a contradiction. For the term , unification must result in the MGU which by Req. 4 is -equivalent to the supplied unifier . We then check the body using the new index context and applied to the contexts and and the goal type .

This treatment of equality elimination is similar to the use of refinement substitutions for dependent pattern matching [Pientka and Dunfield, 2008, Cave and Pientka, 2012], and is inspired by equality elimination in proof theory [Tiu and Momigliano, 2012, McDowell and Miller, 2002, Schroeder-Heister, 1993]. In the latter line of work, type checking involves trying all unifiers from a complete set of unifiers (which may be infinite!), instead of a single most general unifier. We believe our requirement for a unique MGU is a practical choice for type checking.

Indexed recursive and stratified types are both introduced by injections ( and ), though their elimination forms are different. Stratified types are eliminated (unfolded) in reverse to the corresponding fold rules. For recursive types on the other hand, the naive unfold rules lead to nontermination, so we use a Mendler-style recursion form , generalizing the original formulation [Mendler, 1988] to an indexed type system. The idea is to constrain the type of the function variable so that it can only be applied to structurally smaller data. This is achieved by declaring of type in the premise of the rule. Here represents types exactly one constructor smaller than the recursive type, so the use of is guaranteed to be well-founded.
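The shape of Mendler-style recursion can be illustrated with an untyped Python sketch. The names `In`, `mendler_rec`, `nil`, `cons` and `length` are hypothetical, and the list encoding is chosen for the example. Python cannot express the typing constraint on the function variable, so the restriction that the recursive call is applied only to sub-data of the current layer is a convention here, not something the language checks.

```python
class In:
    """Injection into a recursive type: wraps exactly one layer of data."""

    def __init__(self, layer):
        self.layer = layer


def mendler_rec(step):
    """Mendler-style recursion.

    step(recursive_call, layer) computes the result for one layer. In a typed
    setting the layer's sub-data would have an abstract type, so that
    recursive_call is the only operation applicable to it.
    """
    def run(value):
        return step(run, value.layer)
    return run


# Lists as a recursive type: a layer is ("nil",) or ("cons", head, tail).
nil = In(("nil",))


def cons(head, tail):
    return In(("cons", head, tail))


# The recursive call is made only on the tail exposed by the current layer.
length = mendler_rec(
    lambda rec, layer: 0 if layer[0] == "nil" else 1 + rec(layer[2])
)

print(length(cons("a", cons("b", nil))))
```

Because `step` receives only one unfolded layer, every recursive call it can make is on data one constructor smaller, which is the intuition behind the well-foundedness argument above.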

Type checking of terms is decidable.

Proof.

Since the typing rules are syntax directed, it is straightforward to extract a type checking algorithm. Note that the algorithm relies on decidability of judgments in the index language, namely index type checking (Req. 1), equality (Req. 2) and unification (Req. 4). ∎

3.4 Operational Semantics

We define a big-step operational semantics using environments, which provide closed values for the free variables that may occur in a term.

Values consist of unit, pairs, injections, reflexivity, and closures. The typing rules for values and environments, which are used to state the subject reduction theorem, are given in Fig. 6 in the appendix.

The main evaluation judgment, , describes the evaluation of a term under environments to a value . Here, stands for a term in an index context and term variable context . The index environment provides closed index objects for all the index variables in , while provides closed values for all the variables declared in , i.e. and . For convenience, we factor out the application of a closure to values and resulting in a value , using a second judgment written . This allows us to treat application of functions (lambdas, recursion and induction) uniformly. Similarly, we factor out the application of to a closure in an additional judgment written . This simplifies the type interpretation used to prove termination.

Figure 4: Big-step evaluation rules

We only explain the evaluation rule for equality elimination . We first evaluate the equality witness under environments to the value refl. This ensures that respects the index equality witnessed by . From type checking we know that : the key is how we extend at run-time to produce a new index environment that is consistent with . This relies on sound and complete index substitution matching (see Section 2.3) to generate such that and . We can then evaluate the body under the new index environment and the same term environment to produce a value .

Notably absent is an evaluation rule for . This term is used in a branch of a case split that we know statically to be impossible. Such branches are never reached at run time, so there is no need for an evaluation rule. For example, consider a type-safe “head” function, which receives a nonempty vector as input. As we write each branch of a case split explicitly, the empty list case must use , but is never executed. We now state subject reduction for Tores.

[Subject Reduction]

  1. If where or , and and , then .

  2. If where and and and and , then .

  3. If where then .

4 Termination Proof

We now describe our main technical result: termination of evaluation. Our proof uses the logical predicate technique of Tait [1967] and Girard [1972]. We interpret each language construct (index types, kinds, types, etc.) into a semantic model of sets and functions.

4.1 Interpretation of Index Language

We start with the interpretations for index types and spines. In general, our index language may be dependently typed, as it is if we choose Contextual LF. Hence our interpretation for index types must take into account an environment containing instantiations for index variables . Such an index environment is simply a grounding substitution .

[Interpretation of index types and index spines ]

The interpretation of an index type under environment is the set of closed terms of type . The interpretation lifts to index spines . With these definitions, the following lemma follows from the substitution principles of index terms (Req. 3).

[Interpretation of index substitution]

  1. If and then .

  2. If and then .

4.2 Lattice Interpretation of Kinds

We now describe the lattice structure that underlies the interpretation of kinds in our language. The idea is that types are interpreted as sets of term-level values and type constructors as functions taking indices to sets of values. We call the set of all term-level values and write its power set as . The interpretation is defined inductively on the structure of kinds.

[Interpretation of kinds ]

A key observation in our metatheory is that each forms a complete lattice. In the base case, is a complete lattice under the subset ordering, with meet and join given by intersection and union respectively. For a kind , we induce a lattice structure on by lifting the lattice operations pointwise. Precisely, we define

The meet and join operations can similarly be lifted pointwise.

This structure is important because it allows us to define pre-fixed points for operators on the lattice, which is central to our interpretation of recursive types. Here we rely on the existence of arbitrary meets, as we take the meet over an impredicatively defined subset of .

[Mendler-style pre-fixed and post-fixed points] Suppose is a complete lattice and . Define by

and by

We will mostly omit the subscript denoting the underlying lattice of the order and pre-fixed and post-fixed points, and .

Note that the usual treatment of recursive types would define the least pre-fixed point of a monotone operator as and the greatest post-fixed point of a monotone operator as , using the Knaster-Tarski theorem. However, our unconventional definition (following Jacob-Rao et al. [2016]) more closely models Mendler-style (co)recursion and does not require to be monotone (thereby avoiding a positivity restriction on recursive types).
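For reference, the contrast can be written out as follows. The symbols $\mathcal{L}$ (the complete lattice) and $\mathcal{F}$ (the operator on it) are names assumed for this sketch. The first two equations are the standard Knaster-Tarski characterizations for monotone $\mathcal{F}$. The last two are one Mendler-style formulation consistent with the description above, given here as an assumption and not as a quotation of the omitted definition.

```latex
% Knaster-Tarski, for a monotone operator F on a complete lattice L:
\mathrm{lfp}(\mathcal{F}) = \bigwedge \{\, X \in \mathcal{L} \mid \mathcal{F}(X) \le X \,\}
\qquad
\mathrm{gfp}(\mathcal{F}) = \bigvee \{\, X \in \mathcal{L} \mid X \le \mathcal{F}(X) \,\}

% A Mendler-style variant (assumed form), with no monotonicity requirement:
\mu\mathcal{F} = \bigwedge \{\, X \in \mathcal{L} \mid
    \forall Y \in \mathcal{L}.\; Y \le X \Rightarrow \mathcal{F}(Y) \le X \,\}
\qquad
\nu\mathcal{F} = \bigvee \{\, X \in \mathcal{L} \mid
    \forall Y \in \mathcal{L}.\; X \le Y \Rightarrow X \le \mathcal{F}(Y) \,\}
```

When $\mathcal{F}$ is monotone the two pairs coincide: $Y \le X$ gives $\mathcal{F}(Y) \le \mathcal{F}(X)$, so the Mendler-style side condition reduces to $\mathcal{F}(X) \le X$, and dually for the post-fixed point.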

4.3 Interpretation of Types

In order to interpret the types of our language, it is helpful to define semantic versions of some syntactic constructs. We first define a semantic form of our indexed function type , which helps us formulate the interaction of function types with fixed points and recursion.

[Semantic function space] For a spine interpretation and functions , define

It will also be convenient to lift term-level tags to the level of sets and functions in the lattice . We define the lifted tags inductively on . If then . If then for all . Essentially, the function attaches a tag to every element in the set produced after the index arguments are received.

Dually we define . If then . If then for all .

Finally, we define the interpretation of type variable contexts . These describe semantic environments mapping each type variable to an object in its respective kind interpretation. Such environments are necessary to interpret type expressions with free type variables.

[Interpretation of type variable contexts ]

We are now able to define the interpretation of types under environments and . This is done inductively on the structure of .

[Interpretation of types and constructors]


  where

The interpretation of the indexed function type contains closures which, when applied to values in the appropriate input sets, evaluate to values in the appropriate output set. The interpretation of the equality type is the set if and the empty set otherwise. The interpretation of a recursive type is the pre-fixed point of the function obtained from the underlying type expression. Finally, the interpretation of a stratified type built from relies on an analogous semantic operator . It is defined by primitive recursion on the index argument, returning the first argument in the base case and calling itself recursively in the step case. Note that the definition of is specific to the index type it recurses over. We only use the index language of natural numbers here, so the appropriate set of index values is .

Last, we give the interpretation for typing contexts , describing well-formed term-level environments .

[Interpretation of typing contexts]

4.4 Proof

We now sketch our proof using some key lemmas. The following two lemmas concern the fixed point operators and , and are key for reasoning about (co)recursive types and Mendler-style (co)recursion. These lemmas generalize those of Jacob-Rao et al. [2016] from the simply typed setting.

[Soundness of pre-fixed point] Suppose is a complete lattice, and is as in Def. 4.2. Then .
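The soundness property can be checked by brute force on a small finite lattice. The Python sketch below rests on assumptions: it takes the lattice to be the power set of a three-element set, and it takes the Mendler-style pre-fixed point to be the meet of all X such that F(Y) ≤ X for every Y ≤ X, which is our reading of the omitted definition. The names `powerset`, `mendler_mu` and `operator` are hypothetical, and the operator is deliberately non-monotone.

```python
from itertools import combinations


def powerset(universe):
    """All subsets of a finite set, as frozensets (the lattice elements)."""
    items = sorted(universe)
    return [
        frozenset(subset)
        for size in range(len(items) + 1)
        for subset in combinations(items, size)
    ]


def mendler_mu(operator, universe):
    """Intersection of every X such that operator(Y) <= X for all Y <= X."""
    lattice = powerset(universe)
    result = frozenset(universe)  # the top element always qualifies
    for candidate in lattice:
        if all(operator(sub) <= candidate
               for sub in lattice if sub <= candidate):
            result &= candidate
    return result


def operator(subset):
    """A non-monotone operator on subsets of {0, 1, 2}."""
    if 2 in subset:
        return frozenset({0})
    return frozenset({0} | {y + 1 for y in subset if y + 1 <= 1})


universe = {0, 1, 2}
mu = mendler_mu(operator, universe)
print(sorted(mu), operator(mu) <= mu)
```

The operator is not monotone, since {0} is below {0, 2} while its image {0, 1} is not below the image {0} of {0, 2}. The computed meet is nevertheless closed under the operator, as the lemma predicts.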

[Function space from pre-fixed and post-fixed points] Let and and .

  1. If , then .

  2. If , then .

Another key result we rely on is that type-level substitutions associate with our semantic interpretations. Note that single index (and spine) substitutions on types are handled as special cases of the result for simultaneous index substitutions. We omit the definitions of type substitutions for brevity.

[Type-level substitution associates with interpretation]
Suppose or , and and .

  1. If and then and .

  2. If and and or , then .

Proof.

By induction on the structure of . ∎

The next two lemmas concern recursive types and terms respectively.

[Recursive type contains unfolding]
Let where and , and and and . Then .

[Backward closure]
Let be a term, and environments, and .

  1. If , then .

  2. If , then .

Our final lemma concerns the semantic equivalence of an applied stratified type with its unfolding. Note that here we only state and prove the lemma for an index language of natural numbers. For a different index language, one would need to reverify this lemma for the corresponding stratified type. This should be straightforward once the semantic operator is chosen to reflect the inductive structure of the index language.

[Stratified types equivalent to unfolding]
Let where and , and and <