Manifest contract systems [11, 30, 24, 23, 25, 1, 10, 12, 9, 14, 18], which are typed functional calculi, are one discipline for handling software contracts. The distinguishing feature of manifest contract systems is that they integrate contracts into a type system and guarantee, as type soundness, some form of contract satisfaction in a program. Specifically, a contract is embedded into a type by means of refinement types of the form $\{x{:}\tau \mid M\}$, which represents the subset of the underlying type $\tau$ whose values satisfy the predicate $M$, which can be an arbitrary Boolean expression in the programming language. Using refinement types, for example, we can express the contract of a division function, which would say “… the divisor shall not be zero …”, by the type $\mathsf{int} \to \{x{:}\mathsf{int} \mid x \neq 0\} \to \mathsf{int}$. In addition to refinement types, manifest contract systems are often equipped with dependent function types in order to express more detailed contracts. A dependent function type, written $(x{:}\sigma) \to \tau$ in this paper, is the type of a function that takes one argument of type $\sigma$ and returns a value of type $\tau$; the difference from ordinary function types is that $\tau$ can refer to the given argument, represented by $x$. Hence, for example, the type of a division function can be made more specific, as in $(x{:}\mathsf{int}) \to (y{:}\{y'{:}\mathsf{int} \mid y' \neq 0\}) \to \{z{:}\mathsf{int} \mid x = z \times y\}$. (Here, for simplicity, we ignore the case where division involves a remainder, though it can be taken into account by writing a more sophisticated predicate.)
A manifest contract system checks contracts dynamically to achieve its goal: as many correct programs as possible can be compiled and run. In contrast, some studies [22, 27, 15, 26, 31, 29], which also use refinement type systems, check contract satisfaction statically, but with false positives and/or restrictions on predicates. The checks take the form of explicit casts $(M : \sigma \Rightarrow \tau)$, where $M$ is a subject, $\sigma$ is a source type (namely the type of $M$), and $\tau$ is a target type. (Many manifest contract systems put a unique label on each cast to distinguish which cast fails, but we omit them for simplicity.) A cast checks whether the value of $M$ can have the type $\tau$. If the check fails, the cast throws an uncatchable exception called blame, which stands for a contract violation. So, the system does not statically guarantee the absence of contract violations, but it guarantees that the result of a successful execution satisfies the predicate of a refinement type in the program’s type. This property follows from subject reduction and a property called value inversion: if a value $V$ has a type $\{x{:}\tau \mid M\}$, then the expression obtained by substituting $V$ for $x$ in $M$ always evaluates to $\mathsf{true}$.
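As a rough illustration of a cast into a refinement type, the following Python sketch checks a predicate at run time and signals blame on failure. The names `Blame`, `cast_to_refinement`, `nonzero`, and `safe_div` are hypothetical and not part of the calculus; a Python exception can be caught, so it only approximates the uncatchable blame described above.

```python
class Blame(Exception):
    """Stands in for blame: raised when a dynamic contract check fails."""


def cast_to_refinement(value, predicate):
    """Model a cast into a refinement type whose predicate is `predicate`.

    The value is returned unchanged when the predicate holds;
    otherwise the cast fails with Blame.
    """
    if not predicate(value):
        raise Blame(f"{value!r} violates the contract")
    return value


def nonzero(x):
    """Predicate of the divisor's refinement type: x is not zero."""
    return x != 0


def safe_div(x, y):
    """Integer division whose divisor is cast to the non-zero refinement."""
    checked = cast_to_refinement(y, nonzero)
    return x // checked
```

Calling `safe_div(6, 3)` passes the check, while `safe_div(1, 0)` fails at the cast rather than inside the division.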
Considering parities (even/odd) of integers, for example, we can state a contract of addition in a conjunctive form; that is,
“An even integer is returned if both given arguments are even integers; and an odd integer is returned if the first given argument is an even integer and the second given argument is an odd integer; and …”
Using intersection types, we can write the contract as the following type.
In fact, a semantically equivalent contract could be expressed by using dependent function types found in existing systems as follows, where and .
Thus, one might think it is just a matter of taste in how contracts are represented. However, intersection types are more expressive, that is, there are contracts that are hard to express in existing manifest contract systems. Consider the following (a bit contrived) contract for a higher-order function.
The result type depends on input as the parity contract does. This time, however, it cannot be written with a dependent function type; there is no obvious way to write a predicate corresponding to (or ). Such a predicate must check that a given function returns non-zero for all integers, but this is simply not computable.
1.2 Our Work
We develop a formal calculus PCFvΔH, a manifest contract system with intersection types. The goal of this paper is to prove its desirable properties: preservation, progress, value inversion, and a property guaranteeing that the presence of dynamic checking does not change the “essence” of computation.
There are several tasks in constructing a manifest contract system, but a specific challenge for PCFvΔH arises from the fact that manifest contract systems are intended as an intermediate language for hybrid type checking. Suppose we annotate a parity contract for the successor function in a surface language as follows.
Then, this program is compiled into the intermediate language, namely PCFvΔH, where casts are inserted for contract checking at run time.
The problem is that we need to insert different casts into code according to how the code is typed; and one piece of code might be typed in several essentially different ways in an intersection type system since it is a polymorphic type system. For instance, in the example above, is obtained by cast insertion if the function is typed as ; while is obtained when the body is typed as . However, the function must have both types to have the intersection type. It may seem sufficient to just cast the body itself, that is, . However, this just shelves the problem: Intuitively, to check if the subject has the target intersection type, we need to check if the subject has both types in the conjunction. This brings us back to the same original question.
Our contributions are summarized as follows:
we formalize the calculus PCFvΔH; and
we state and prove type soundness, value inversion, and dynamic soundness.
The whole system including proofs is mechanized with Coq. (The mechanized definitions and proof scripts are attached as supplementary material.) We use locally nameless representation and cofinite quantification for the mechanization.
To concentrate on the PCFvΔH-specific problems, we impose the following restrictions on PCFvΔH in this paper, compared to a system one would imagine from the phrase “a manifest contract system with intersection types”.
PCFvΔH does not support dependent function types. As we will see, PCFvΔH uses nondeterminism for dynamic checking. The combination of dependent function types and nondeterminism poses a considerable challenge.
We use refinement intersection types rather than general ones. Roughly speaking, $\sigma \wedge \tau$ is a refinement intersection type if both $\sigma$ and $\tau$ refine the same type. So, for example, is a refinement intersection type since types of both sides refine the same type , while is not.
2 Overview of Our Language: PCFvΔH
Our language PCFvΔH is a call-by-value dialect of PCF, extended with intersection types (derived from the Δ-calculus) and manifest contracts (derived from [9, 11]). So, the baseline is that any valid PCF program is also a valid PCFvΔH program, and a PCFvΔH program should behave in the same way as (call-by-value) PCF. In other words, PCFvΔH is a conservative extension of call-by-value PCF.
2.1 The Δ-calculus
To address the challenge discussed in Section 1, PCFvΔH is strongly influenced by the Δ-calculus of Liquori and Stolze, an intersection type system à la Church. Their novel idea is a new form called a strong pair, written $\langle M, N \rangle$. It is a kind of pair and is used as a constructor for expressions of intersection types. So, using the strong pair, for example, we can write an identity function having type as follows.
Unlike product types, however, $M$ and $N$ in a strong pair cannot be arbitrarily chosen. A strong pair requires that the essence of both expressions in a pair be the same. The essence of a typed expression $M$ is the untyped skeleton of $M$. For instance, . So, the requirement justifies strong pairs as the introduction of intersection types: that is, the computation represented by the two expressions is the same, and so the system still follows a Curry-style intersection type system. Strong pairs just give a way to annotate expressions with a different type in a different context.
We adapt their idea to PCFvΔH by letting an essence represent the contract-irrelevant part of an expression, rather than an untyped skeleton. For instance, the essence of is (the erased contract-relevant parts are casts and predicates of refinement types). Now, we can express in Section 1 as follows.
This strong pair satisfies the condition, that is, both expressions have the same essence.
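The essence condition on strong pairs can be sketched in Python over a small tuple-encoded syntax. The encoding, the function names, and the use of strings for types are assumptions made for illustration only; type annotations on binders are omitted, so only cast erasure and the choice of one pair component are modeled.

```python
def essence(term):
    """Erase the contract-relevant parts of a tuple-encoded expression.

    Assumed encoding (hypothetical):
      ("var", x), ("num", n), ("lam", x, body), ("app", f, a), ("succ", e),
      ("cast", subject, source_type, target_type), ("pair", left, right)
    """
    tag = term[0]
    if tag == "cast":
        # A cast contributes nothing to the essence; keep only its subject.
        return essence(term[1])
    if tag == "pair":
        # Both components of a well-formed strong pair share one essence,
        # so taking the left one is an arbitrary but harmless choice.
        return essence(term[1])
    if tag in ("var", "num"):
        return term
    if tag == "lam":
        return ("lam", term[1], essence(term[2]))
    # Remaining forms carry only sub-expressions after the tag.
    return (tag,) + tuple(essence(sub) for sub in term[1:])


def is_strong_pair(left, right):
    """Side condition on strong pairs: both components have the same essence."""
    return essence(left) == essence(right)


# Two casted versions of the successor function (types written as strings).
succ_x = ("succ", ("var", "x"))
as_even_to_odd = ("lam", "x", ("cast", succ_x, "nat", "odd"))
as_odd_to_even = ("lam", "x", ("cast", succ_x, "nat", "even"))
identity = ("lam", "x", ("var", "x"))
```

Here `is_strong_pair(as_even_to_odd, as_odd_to_even)` holds because both components erase to the plain successor function, whereas pairing either of them with `identity` violates the condition.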
2.2 Cast Semantics for Intersection Types
Having introduced intersection types, we have to extend the semantics of casts so that they handle contracts written with intersection types. Following Keil and Thiemann, who studied intersection (and union) contract checking in the “latent” style for an untyped language, we give the semantics of a cast to an intersection type by the following rule:
The reduction rule should not be surprising: has to have both and and a strong pair introduces an intersection type from and . For the original cast to succeed, both of the split casts have to succeed.
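The splitting of a cast into an intersection type can be sketched as follows. This is a minimal Python model under assumed names (`Blame`, `refine`, `cast_to_intersection`), restricted to first-order values, with a Python tuple standing in for the strong pair.

```python
class Blame(Exception):
    """Stands in for blame: raised when a dynamic contract check fails."""


def refine(predicate):
    """Build a cast into a refinement type with the given predicate."""
    def cast(value):
        if not predicate(value):
            raise Blame(f"{value!r} violates the contract")
        return value
    return cast


def cast_to_intersection(value, cast_left, cast_right):
    """Split a cast into an intersection type into two casts.

    The result pairs the outcomes of both casts, so the original cast
    succeeds only if both split casts succeed.
    """
    return (cast_left(value), cast_right(value))


# Two refinements of the same underlying type (hypothetical predicates).
to_positive = refine(lambda n: n > 0)
to_even = refine(lambda n: n % 2 == 0)
```

With these definitions, `cast_to_intersection(4, to_positive, to_even)` yields a pair, while the same cast applied to `3` fails in the right component.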
A basic strategy of a cast from an intersection type is expressed by the following two rules.
The cast tests whether a nondeterministically chosen element in a (possibly nested) strong pair can be cast to .
One problem, however, arises when a function type is involved. Consider the following expression.
can be used as both and . This means can handle arbitrary natural numbers. Thus, this cast should be valid and evaluation of the expression above should not fail. However, with the reduction rules presented above, evaluation results in blame in both branches: the choice is made before calling , the function being assigned into only can handle either or , leading to failure at either or , respectively.
To solve the problem, we delay a cast into a function type even when the source type is an intersection type. In fact, reduces to a wrapped value below
similarly to higher-order casts . Then, the delayed cast fires when an actual argument is given:
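The difference between choosing a component eagerly and delaying the cast can be sketched in Python. Everything here is a hypothetical model: the source type is assumed to be an intersection of a function on even numbers and a function on odd numbers, the target is a function on all naturals, and nondeterminism is modeled by enumerating every possible choice rather than by the reduction rules of the calculus.

```python
from itertools import product


class Blame(Exception):
    """Stands in for blame: raised when a dynamic contract check fails."""


def to_even(n):
    """Cast into the refinement of even numbers."""
    if n % 2 != 0:
        raise Blame(f"{n} is not even")
    return n


def to_odd(n):
    """Cast into the refinement of odd numbers."""
    if n % 2 != 1:
        raise Blame(f"{n} is not odd")
    return n


# A strong pair: both components compute the same thing (same essence).
PAIR = (lambda n: n // 2, lambda n: n // 2)
DOMAIN_CASTS = (to_even, to_odd)  # argument cast required by each component


def eager_cast(pair, choice):
    """Pick one component when the cast itself is evaluated."""
    func, dom = pair[choice], DOMAIN_CASTS[choice]
    return lambda n: func(dom(n))


def delayed_cast(pair):
    """Keep the whole pair; a component is picked at each application."""
    def apply(n, choice):
        return pair[choice](DOMAIN_CASTS[choice](n))
    return apply


def attempt(thunk):
    """Run a computation and report either its result or the string 'blame'."""
    try:
        return thunk()
    except Blame:
        return "blame"


def eager_outcomes():
    """All outcomes of calling the cast function on 2 and 3, choosing once."""
    outcomes = []
    for choice in (0, 1):
        g = eager_cast(PAIR, choice)
        outcomes.append(attempt(lambda: g(2) + g(3)))
    return outcomes


def delayed_outcomes():
    """All outcomes when a fresh choice is made at each of the two calls."""
    g = delayed_cast(PAIR)
    return [attempt(lambda: g(2, c1) + g(3, c2))
            for c1, c2 in product((0, 1), repeat=2)]
```

In this model every eager choice ends in blame, because a single component must serve both an even and an odd argument, while the delayed version has a run in which each call picks the matching component.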
3 Formal Systems
In this section, we formally define two languages, PCFv and PCFvΔH, the latter being an extension of PCFv as sketched in the last section. PCFv is a call-by-value PCF. We give only its operational semantics and omit its type system and a type soundness proof, because we are only interested in how its behavior is related to PCFvΔH, the main language of this paper.
The syntax of PCFv is shown in Figure 1. Metavariables , , , , and range over term variables ( and are intended for ones bound to functions); and range over types; , , and range over expressions; ranges over values; and ranges over evaluation frames. The definition is fairly standard, except for one point: instead of introducing a constant for the general fix-point operator, we introduce a form for recursive functions.
Definition 1 (Bound and free variables)
An occurrence of in of and in of is called bound. The set of free variables of is the set of variables that have a free occurrence in . We denote the free variables by .
We define α-equivalence in a standard manner and identify α-equivalent expressions.
Definition 2 (Substitution)
Substitution of for a free variable in , written , is defined in a standard capture-avoiding manner.
Definition 3 (Context application)
Given an evaluation frame and an expression , denotes the expression obtained by just replacing the hole in with .
A small-step operational semantics of PCFv is inductively defined by the rules in Figure 2. Those rules consist of standard (call-by-value) PCF axiom schemes and one rule scheme PCF-Ctx, which expresses the call-by-value evaluation strategy using the evaluation frames.
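A small-step, call-by-value evaluator in the spirit of these rules can be sketched in Python. The tuple encoding and all function names are assumptions; recursive functions are left out, `pred` of zero is assumed to yield zero, and substitution is only safe because call-by-value substitutes closed values in closed programs.

```python
def is_value(t):
    """Values are numerals, Booleans, and lambda abstractions."""
    return t[0] in ("num", "bool", "lam")


def subst(t, x, v):
    """Replace free occurrences of variable x in t by the closed value v."""
    tag = t[0]
    if tag == "var":
        return v if t[1] == x else t
    if tag in ("num", "bool"):
        return t
    if tag == "lam":
        # A binder with the same name shadows x; v is closed, so no capture.
        return t if t[1] == x else ("lam", t[1], subst(t[2], x, v))
    return (tag,) + tuple(subst(s, x, v) for s in t[1:])


def step(t):
    """Perform one evaluation step, or return None if t cannot step."""
    tag = t[0]
    if is_value(t) or tag == "var":
        return None
    # Congruence: reduce the leftmost non-value sub-term in evaluation
    # position. For "if", only the condition is such a position.
    positions = [1] if tag == "if" else range(1, len(t))
    for i in positions:
        if not is_value(t[i]):
            reduced = step(t[i])
            return None if reduced is None else t[:i] + (reduced,) + t[i + 1:]
    # Axioms: every evaluation position now holds a value.
    a = t[1]
    if tag == "succ" and a[0] == "num":
        return ("num", a[1] + 1)
    if tag == "pred" and a[0] == "num":
        return ("num", max(a[1] - 1, 0))  # assumption: pred 0 = 0
    if tag == "iszero" and a[0] == "num":
        return ("bool", a[1] == 0)
    if tag == "if" and a[0] == "bool":
        return t[2] if a[1] else t[3]
    if tag == "app" and a[0] == "lam":
        return subst(a[2], a[1], t[2])
    return None


def evaluate(t):
    """Iterate step until no rule applies (may loop on divergent terms)."""
    while True:
        nxt = step(t)
        if nxt is None:
            return t
        t = nxt


# (\x. succ x) (pred 3): the argument is reduced before the application.
example = ("app", ("lam", "x", ("succ", ("var", "x"))), ("pred", ("num", 3)))
```

The single congruence case plays the role that the context rule plays in the figure: it decides where the next axiom is applied, and the axioms themselves never look inside unevaluated arguments.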
PCFvΔH is an extension of PCFv. Through abuse of syntax, we use the metavariables of PCFv for PCFvΔH, though we are dealing with two different languages.
The syntax of PCFvΔH is shown in Figure 3. We introduce some more metavariables: ranges over interface types, a subset of types; ranges over recursion bodies, a subset of expressions; ranges over commands; and ranges over typing contexts. Shaded parts show differences (extensions and modifications) from PCFv. Types are extended with intersection types and refinement types; the restriction that a well-formed intersection type is a refinement intersection type is enforced by the type system. An interface type, which is a single function type or a (possibly nested) intersection over function types, is used for the type annotation of a recursive function. Expressions are extended with ones for: strong pairs (namely, pair construction, left projection, and right projection); casts; and run-time expressions of the form that can occur at run time for dynamic checking and not in source code. Recursion bodies are (possibly nested strong pairs of) λ-abstractions.
Run-time expressions deserve detailed explanation. A delayed check denotes a delayed cast into a function type, which is used in cases such as those discussed in Section 1 for instance. A waiting check denotes a state waiting for the check against until is evaluated into a value. An active check is a state running test to see if satisfies .
We do not include blame in expressions, although existing manifest contract systems usually include it among expressions. As a consequence, the evaluation relation for PCFvΔH is defined between commands. This distinction will turn out to be convenient in stating the correspondence between the semantics of PCFvΔH and that of PCFv, which does not have blame.
We assume the index variable ranges over to save space.
Definition 4 (Terms)
We call the union of the sets of types and expressions terms.
denotes that is a sub-expression of .
We define α-equivalence in a standard manner and identify α-equivalent terms.
We often omit the empty environment. We abuse a comma for the concatenation of environments like . We denote a singleton environment, an environment that contains only one variable binding, by .
Definition 5 (Free variables and substitution)
Free variables and substitution are defined similarly to PCFv, and we use the same notations. Note that since the types and expressions of PCFvΔH are mutually recursively defined, the metaoperations are inductively defined for terms.
Definition 6 (Domain of typing context)
The domain of , written , is defined by: and . We abbreviate to .
The essence of a PCFvΔH term is defined in Figure 4, which is mostly straightforward. The choice of which part we take as the essence of a strong pair is arbitrary because for a well-typed expression both parts have the same essence. Note that the essence of an active check is rather than . This is because is the subject of the expression.
3.3 Operational Semantics of PCFvΔH
The operational semantics of PCFv consists of four relations , , , and . Bearing in mind the inclusion relation among syntactic categories, these relations can be regarded as binary relations between commands. The first two are basic reduction relations, and the other two are contextual evaluation relations (relations for whole programs). Furthermore, the relations subscripted by p correspond to PCFv evaluation, that is, essential evaluation; and ones subscripted by c correspond to dynamic contract checking. Dynamic checking is nondeterministic because of RC-WedgeL/R, EC-PairL, and EC-PairR.
3.3.1 Essential Evaluation .
The essential evaluation, defined in Figure 5, defines the evaluation of the essential part of a program; and thus, it is similar to . There are just three differences, that is: there are two relations; there is no reduction rule for ; and there is a distinguished contextual evaluation rule EP-PairS, which synchronizes essential reductions of the elements in a strong pair. The synchronization in EP-PairS is important since a strong pair requires the essences of both elements to be the same. The lack of predecessor evaluation for is intentional: Our type system and run-time checking guarantee that cannot occur as an argument to .
3.3.2 Dynamic Checking .
Dynamic checking is more complicated. Firstly, we focus on reduction rules in Figure 6. The side-conditions on some rules are set so that an evaluation is less nondeterministic (for example, without the side conditions, both RC-Forget and RC-Delay could be applied to one expression).
The rules irrelevant to intersection types (RC-Nat, RC-Bool, RC-Forget, RC-Delay, RC-Arrow, RC-Waiting, RC-Activate, RC-Succeed, and RC-Fail) are adopted from Sekiyama et al. , but there is one difference about RC-Delay and RC-Arrow. In the original definition delayed checking is done by using lambda abstractions, that is,
The reason we adopt a different approach is simply that it makes the technical development easier. Additionally, the approach we adopt is not new: it is used in the original work on higher-order contract calculi.
The other rules are new ones we propose for dynamic checking of intersection types. As we have discussed in Section 2, a cast into an intersection type is reduced into a pair of casts by RC-WedgeI. A cast from an intersection type is done by RC-Delay and RC-WedgeL/R if the target type is a function type. Otherwise, if the target type is a first-order type, RC-WedgeN and RC-WedgeB are used, where we arbitrarily choose the left side of the intersection type and the corresponding part of the value, since the source type is not used for dynamic checking of first-order values.
The contextual evaluation rules, defined in Figure 7, are rather straightforward. Be aware of the use of metavariables, for instance, the use of in EC-Ctx; it implicitly means that has not been evaluated into (so the rule does not overlap with EB-Ctx). The first rule lifts the reduction relation to the evaluation relation. The next six rules express the case where a sub-expression is successfully evaluated. The rules EC-ActiveP and EC-ActiveC mean that evaluation inside an active check is always considered dynamic checking, even when it involves essential evaluation. The rules EC-PairL and EC-PairR mean that dynamic checking does not synchronize, because the elements in a strong pair may have different casts. The other rules express the case where dynamic checking has failed. An expression evaluates to blame immediately, in one step, when a sub-expression evaluates to blame. Here is an example of an execution in which dynamic checking fails.
Definition 7 (Evaluation)
The one-step evaluation relation of PCFvΔH, denoted by , is defined as . The multi-step evaluation relation of PCFvΔH, denoted by , is the reflexive and transitive closure of .
3.4 Type System of PCFvΔH
The rules for well-formed types check that an intersection type is restricted to a refinement intersection type by the side condition in W-Wedge and that the predicate in a refinement type is a Boolean expression by W-Refine. Note that, since PCFvΔH has no dependent function type, all types are closed and the predicate of a refinement type only depends on the parameter itself.
The typing rules, the rules for the third judgment, consist of two sub-categories: compile-time rules and run-time rules. Compile-time rules are for checking a program that a programmer writes. Run-time rules are for run-time expressions and are used to prove type soundness. This distinction, which follows Belo et al., makes compile-time type checking decidable.
A large part of the typing rules is adapted from Sekiyama et al. and PCF. Here, we explain a few interesting rules. The rule T-Pred demands that the argument of predecessor not be zero. The rule T-Pair checks that a strong pair is composed of essentially the same expressions by requiring that their essences coincide. The premise of the rule T-Cast for casts requires the essences of the source and target types to agree. It amounts to checking that the source and target types are compatible.
In addition to the rules adapted from Sekiyama et al., there is one new run-time typing rule, T-Delayed. The rule T-Delayed is for a delayed check for function types; it restricts the source type so that it respects the evaluation relation (there is no evaluation rule for a delayed check whose source type is a refinement type) and inherits the condition on the source and target types from T-Cast.
We start from properties of evaluation relations. As we have mentioned, is essential evaluation, and thus, it should simulate ; and is dynamic checking, and therefore, it should not change the essence of the expression. We formally state and show these properties here. Note that most properties require that the expression before evaluation is well typed. This is because the condition of strong pairs is imposed by the type system.
If and , then .
The proof is routine by induction on one of the given derivations. ∎
If and , then .
The proof is by induction on the given evaluation derivation. ∎
The following corollary is required to prove the preservation property.
If , , , , and ; then .
If and , then .
The proof is by induction on the given evaluation derivation. ∎
Now we obtain the following theorem as a corollary of 2 and 3. It guarantees that the essential computation in PCFvΔH is the same as the PCFv computation as long as the computation does not fail. In other words, run-time checking may introduce blame but otherwise does not affect the essential computation.
If and , then .
4.1 Type Soundness
We conclude this section with type soundness. Firstly, we show a substitution property; and using it, we show the preservation property.
If and , then .
The proof is by induction on the derivation for . ∎
Theorem 4.2 (Preservation)
If and , then .
We prove preservation properties for each and and combine them. Both proofs are done by induction on the given typing derivation. For the case in which substitution happens, we use 4 as usual. For the context evaluation for strong pairs, we use 1 and 3 to guarantee the side-condition of strong pairs. ∎
Next we show the value inversion property, which guarantees that a value of a refinement type satisfies its predicate. For PCFvΔH, this property can be shown quite easily since PCFvΔH does not have dependent function types, while previous manifest contract systems need quite complicated reasoning [25, 23, 18]. The property itself is proven by using the following two, which are for strengthening an induction hypothesis.
We define a relation between values and types, written , by the following rules.
If , then .
The proof is by induction on the given derivation. ∎
Theorem 4.3 (Value inversion)
If , then .