1.1. Software contracts
Software contracts [Meyer_1988_book] are a promising program verification tool for developing robust, dependable software. Contracts are agreements between a supplier and a client of software components. On one hand, contracts are what the supplier guarantees. On the other hand, they are what the client requires. Following Eiffel [Meyer_1988_book], a pioneer of software contracts, contracts in this work are described as executable Boolean expressions written in the same language as the program. For example, the specification that two numbers x and y are either both positive or both negative is described by the Boolean expression “x * y > 0”.
Contracts can be verified by two complementary approaches: static and dynamic verification. Dynamic verification is possible because contracts are executable: the run-time system can confirm that a contract holds by evaluating it. Since Eiffel advocated “Design by Contract” [Meyer_1988_book], there has been extensive work on dynamic contract verification [Rosenblum_1995_IEEETSE, Kramer_1998_TOOLS, Findler/Felleisen_2002_ICFP, Findler/Guo/Rogers_2007_IFL, Wadler/Findler_2009_ESOP, Strickland/Felleisen_2010_DLS, Disney/Flanagan/McCarthy_2011_ICFP, Chitil_2012_ICFP, Dimoulas/Tobin-Hochstadt/Felleisen_2012_ESOP, Takikawa/Strickland/Tobin-Hochstadt_2013_ESOP]. Dynamic verification is easy to use, but it can incur significant run-time overhead [Findler/Guo/Rogers_2007_IFL] and, perhaps worse, it cannot check all possible execution paths, so critical errors may be missed. Static verification [Rondon/Kawaguchi/Jhala_2008_PLDI, Xu/PeytonJones/Claessen_2009_POPL, Bierman/Gordon/Hrictcu/Langworthy_ICFP_2010, Vazou/Rondon/Jhala_2013_ESOP, Nguyen/Tobin-Hochstadt/Horn_2014_ICFP, Vekris/Cosman/Jhala_2016_PLDI] is another, complementary approach to program verification with contracts. It causes no run-time overhead and guarantees that contracts are always satisfied at run time, but it is difficult to use: it often requires heavy annotations in programs, gives complicated error messages, and restricts the expressive power of contracts.
1.2. Manifest contracts
To take the best of both, hybrid contract verification, where contracts are verified statically if possible and dynamically otherwise, was proposed by Flanagan [Flanagan_2006_POPL], and calculi of manifest contracts [Flanagan_2006_POPL, Greenberg/Pierce/Weirich_2010_POPL, Knowles/Flanagan_2010_TOPLAS, Belo/Greenberg/Igarashi/Pierce_2011_ESOP, Sekiyama/Nishida/Igarashi_2015_POPL, Sekiyama/Igarashi/Greenberg_2016_TOPLAS, Sekiyama/Igarashi_2017_POPL] have been studied as its theoretical foundation. Manifest contracts refer to contract systems where contract information occurs as part of types. In particular, contracts are embedded into types by refinement types {x:T | e}, which denote the set of values v of T such that v satisfies the Boolean expression e (which is called a contract or a refinement), that is, e [v/x] evaluates to true. (Footnote 1: Although in the context of static verification the underlying type of a refinement type is usually restricted to be a base type, this work allows it to be arbitrary; this extension is useful to describe contracts for abstract data types [Belo/Greenberg/Igarashi/Pierce_2011_ESOP, Sekiyama/Igarashi/Greenberg_2016_TOPLAS].) For example, using refinement types, a type of positive numbers is represented by {x:Int | x > 0}.
Dynamic verification in manifest contracts is performed by dynamic type conversion, called casts. A cast checks that, when applied to value of source type , can behave as target type . In particular, if is a refinement type, the cast checks that satisfies the contract of . If the contract check succeeds, the cast returns ; otherwise, if it fails, an uncatchable exception, called blame, will be raised. For example, let us consider cast , where is a Boolean function that decides if a given integer is a prime number. If this cast is applied to a prime number other than , the check succeeds and the cast application returns the number itself. Otherwise, if it is applied to , it fails and blame is raised. The superscript (called blame label) on a cast is used to indicate which cast has failed.
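As a rough illustration of casts and blame, the following Python sketch models a cast into a refinement type as a function that evaluates the contract on its argument. The names `Blame`, `cast`, `is_prime`, and the label `"l1"` are assumptions made for this sketch and are not part of the calculus; note also that a Python exception can be caught, whereas blame cannot.

```python
class Blame(Exception):
    """Models blame; carries the label of the cast that failed."""

    def __init__(self, label):
        super().__init__(f"blame {label}")
        self.label = label


def is_prime(n):
    """Boolean function deciding whether an integer is prime."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))


def cast(contract, label):
    """Model a cast whose target is a refinement type with `contract`."""

    def apply(value):
        if contract(value):
            return value        # check succeeded: return the value itself
        raise Blame(label)      # check failed: blame this cast

    return apply


to_prime = cast(is_prime, "l1")
result = to_prime(7)            # 7 is prime, so the cast returns 7

try:
    to_prime(8)                 # 8 is not prime, so the cast is blamed
    failed_label = None
except Blame as blame:
    failed_label = blame.label
```

The label attached to the exception plays the role of the blame label: it identifies which cast failed.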
Static contract verification is formalized as subtyping, which statically checks that any value of a subtype behaves as a supertype. In particular, a refinement type is a subtype of another if any value of satisfying behaves as and satisfies . For example, is a subtype of because all prime numbers should be positive.
Hybrid contract verification integrates these two verification mechanisms of contracts. In the hybrid approach, for every program point where a type is required to be a subtype of , a type checker first tries to solve the instance of the subtyping problem statically. Unfortunately, since contracts are arbitrary Boolean expressions in a Turing-complete language, the subtyping problem is undecidable in general. Thus, the type checker may not be able to solve the problem instance positively or negatively. In such a case, it inserts a cast from to into the program point in order to dynamically ensure that run-time values of behave as . For example, let us consider function application where and are given types and , respectively. Given this expression, the type checker tries to see if is a subtype of . If the checker is strong enough, it will find out that values of are only three, five, and seven and that the subtyping relation holds and accept ; otherwise, cast is inserted to check satisfies contract at run time and the resulting expression will be evaluated.
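The hybrid strategy can be sketched in Python as follows. This is only a toy model under stated assumptions: `toy_prover` is a hypothetical prover that can decide a subtyping instance only when it is handed a finite candidate set covering the source type, and `elaborate` returns either an identity function (statically verified) or a run-time cast (unknown). Rejecting the program when the instance is disproved is also an assumption of this sketch.

```python
class Blame(Exception):
    def __init__(self, label):
        super().__init__(f"blame {label}")
        self.label = label


def is_prime(n):
    return n >= 2 and all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))


def toy_prover(source, target, candidates):
    """Return True (proved), False (disproved), or None (unknown)."""
    if candidates is None:
        return None                       # nothing to enumerate: give up
    inhabitants = [v for v in candidates if source(v)]
    return all(target(v) for v in inhabitants)


def elaborate(source, target, label, candidates=None):
    """Return (coercion, cast_inserted) for one subtyping instance."""
    verdict = toy_prover(source, target, candidates)
    if verdict is True:
        return (lambda v: v), False       # verified statically: no check
    if verdict is False:
        raise TypeError("subtyping instance disproved statically")

    def checked(v):                       # unknown: check at run time
        if target(v):
            return v
        raise Blame(label)

    return checked, True


# Source contract whose only inhabitants are 3, 5, and 7.
small_odd = lambda x: 2 < x < 8 and x % 2 == 1

proved, inserted_with_prover = elaborate(
    small_odd, is_prime, "l", candidates=range(10))
fallback, inserted_without_prover = elaborate(small_odd, is_prime, "l")
```

With the candidate set, the toy prover sees that 3, 5, and 7 are all prime and no cast is inserted; without it, the same program point receives a run-time cast.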
1.3. Our work
In this article, we study program reasoning in manifest contracts. The first goal of the reasoning is to justify hybrid contract verification. As described in Section 1.2, a cast is inserted if an instance of the subtyping problem is not solved statically. Unfortunately, due to undecidability of the subtyping problem, it is possible that casts from a type to its supertype, which we call upcasts, are inserted even though they are actually unnecessary. How many upcasts are inserted depends on the prover used in static verification: the more powerful the prover is, the fewer upcasts are inserted. In other words, the behavior of programs could depend on the prover through the insertion of upcasts, which is undesirable because such a dependency would make it difficult to predict how programs behave when the prover is modified. We show that this is not the case, that is, the presence of upcasts has no influence on the behavior of programs; this property is called the upcast elimination.
In fact, the upcast elimination has been studied in the prior work on manifest contracts [Flanagan_2006_POPL, Knowles/Flanagan_2010_TOPLAS, Belo/Greenberg/Igarashi/Pierce_2011_ESOP], but the results are not satisfactory. Flanagan [Flanagan_2006_POPL] and Belo et al. [Belo/Greenberg/Igarashi/Pierce_2011_ESOP] studied the upcast elimination for a simply typed manifest contract calculus and a polymorphic one, respectively, but it turned out that their calculi are flawed [Knowles/Flanagan_2010_TOPLAS, Sekiyama/Igarashi/Greenberg_2016_TOPLAS]. Knowles and Flanagan [Knowles/Flanagan_2010_TOPLAS] resolved the issue in Flanagan's calculus, but their upcast elimination deals with only closed upcasts. Sekiyama et al. [Sekiyama/Igarashi/Greenberg_2016_TOPLAS] fixed the flaw in Belo et al., but they did not address the upcast elimination. We discuss this in more detail in Section LABEL:sec:relwork. As far as we know, this work is the first to show the upcast elimination for open upcasts.
We introduce a subsumption-free polymorphic manifest contract calculus and show the upcast elimination for it. The calculus is subsumption-free in the sense that it lacks a typing rule of subsumption, that is, a rule to promote the type of an expression to a supertype (in fact, subtyping is not even part of the calculus), and casts are necessary everywhere a required type is not syntactically equivalent to the type of an expression. In this style, static verification is performed “post facto”, that is, upcasts are eliminated after typechecking. A subsumption-free manifest contract calculus was first developed by Belo et al. [Belo/Greenberg/Igarashi/Pierce_2011_ESOP] to avoid the circularity issue of manifest contract calculi with subsumption [Knowles/Flanagan_2010_TOPLAS, Belo/Greenberg/Igarashi/Pierce_2011_ESOP]. However, their metatheory turned out to rest on a wrong conjecture [Sekiyama/Igarashi/Greenberg_2016_TOPLAS]. Sekiyama et al. [Sekiyama/Igarashi/Greenberg_2016_TOPLAS] revised Belo et al.’s work and resolved its issues by introducing a polymorphic manifest contract calculus equipped with delayed substitution, which suspends substitution for variables in casts until their refinements are checked. While delayed substitution ensures type soundness and parametricity, it makes the metatheory complicated. In this work, we adopt usual substitution to keep the metatheory simple. To ensure type soundness under usual substitution, we, inspired by Sekiyama et al. [Sekiyama/Nishida/Igarashi_2015_POPL], modify the semantics of casts so that all refinements in the target type of a cast are checked even if they have been ensured by the source type, whereas checks of refinements that have already been ensured are skipped in the semantics of Belo et al. [Belo/Greenberg/Igarashi/Pierce_2011_ESOP] and Sekiyama et al. [Sekiyama/Igarashi/Greenberg_2016_TOPLAS].
For example, given , our “fussy” semantics checks both and , while Belo et al.’s “sloppy” semantics checks only because is ensured by the source type. Our fussy semantics resolves the issue of type soundness in Belo et al. and is arguably simpler than Sekiyama et al.
In addition to the upcast elimination, we study reasoning about casts to make static contract verification more effective. In particular, this work studies two additional reasoning techniques. The first is selfification [Ou/Tan/Mandelbaum/Walker_2004_TCS], which embeds information of expressions into their types. For example, it gives expression of integer type a more informative refinement type (where is a Boolean equality operator on integers). The selfification is easily extensible to higher-order types, and it is especially useful when given type information is not sufficient to solve subtyping instances; see Section LABEL:sec:reasoning-self for an example. We formalize the selfification by casts: given of , we show that is equivalent to a cast application , where is the resulting type of embedding into . In other words, behaves as an expression of . The second is static cast decomposition, which leads to elimination of more upcasts obtained by reducing nonredundant casts.
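For a base-type expression, selfification can be pictured with a small Python sketch. It is only an assumption-laden illustration: `selfify` is a hypothetical helper that builds the contract "x equals the value of the expression", and `cast` is the same kind of run-time check used informally for refinement casts; higher-order types are not covered.

```python
class Blame(Exception):
    def __init__(self, label):
        super().__init__(f"blame {label}")
        self.label = label


def cast(contract, label):
    """Run-time check of a refinement contract."""

    def apply(value):
        if contract(value):
            return value
        raise Blame(label)

    return apply


def selfify(value):
    """Contract of the selfified base type: 'x is equal to `value`'."""
    return lambda x: x == value


e = 2 + 3                               # an expression of integer type
self_contract = selfify(e)              # the more informative refinement
same = cast(self_contract, "l")(e)      # casting e to its selfified type
```

The cast into the selfified type always succeeds on the expression it was built from and returns the same value, which matches the informal reading that the expression already behaves as a member of the more informative type.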
We show correctness of three reasoning techniques about casts—the upcast elimination, the selfification, and the cast decomposition—based on contextual equivalence: we prove that (1) an upcast is contextually equivalent to an identity function, (2) a cast application is to , and (3) a cast is to its static decomposition. We have to note that contextual equivalence that relates only terms of the same type (except for the case of type variables) is useless in this work because we want to show contextual equivalence between terms of different types. For example, an upcast and an identity function may not be given the same type in our calculus for the lack of subsumption: a possible type of an upcast is only , whereas types of identity functions take the form , which is syntactically different from for any if . Instead of such usual contextual equivalence—which we call typed contextual equivalence—we introduce semityped contextual equivalence, where a well-typed term and a possibly ill-typed term can be related, and show correctness of cast reasoning based on it.
Since, as is well known, it is difficult to prove contextual equivalence of programs directly, we apply a proof technique based on logical relations [Plotkin_1980_TODO, Reynolds_1983_IFIP]. We develop a logical relation for manifest contracts and show its soundness with respect to semityped contextual equivalence. We also show completeness of our logical relation with respect to well-typed terms in semityped contextual equivalence, via semityped CIU-equivalence [Mason/Talcott_1991_JFP]. The completeness implies transitivity of semityped contextual equivalence, which is nontrivial in manifest contracts. (Footnote 2: As we will discuss later, showing transitivity of typed contextual equivalence is not trivial, either.)
1.4. Organization and proofs
The rest of this paper is organized as follows. We define our polymorphic manifest contract calculus equipped with fussy cast semantics in Section 2. Section LABEL:sec:ctxeq introduces semityped contextual equivalence and Section LABEL:sec:logical_relation develops a logical relation for the calculus. We show that the logical relation is sound with respect to semityped contextual equivalence and complete for well-typed terms in Section LABEL:sec:proving. Using the logical relation, we show the upcast elimination, the selfification, and the cast decomposition in Section LABEL:sec:reasoning. After discussing related work in Section LABEL:sec:relwork, we conclude in Section LABEL:sec:conclusion.
Most of our proofs are written in the pencil-and-paper style, but the proof of cotermination, which is a key, but often flawed, property of manifest contracts, is given by Coq proof script coterm.v at https://skymountain.github.io/work/papers/fh/coterm.zip.
2. Polymorphic Manifest Contract Calculus
This section formalizes a polymorphic manifest contract calculus and proves its type soundness. As described in Section 1.3, our run-time system checks even refinements which have been ensured already, which enables us to prove cotermination, a key property to show type soundness and parametricity without delayed substitution. We compare our fussy cast semantics with the sloppy cast semantics provided by Belo et al. [Belo/Greenberg/Igarashi/Pierce_2011_ESOP] in Section 2.2. Greenberg [Greenberg_2013_PhD] provides a few motivating examples of polymorphic manifest contracts such as abstract datatypes for natural numbers and string transducers; see Section 3.1 in the dissertation for details.
Figure 1 shows the syntax of the calculus, which is based on Belo et al. [Belo/Greenberg/Igarashi/Pierce_2011_ESOP]. Types, ranged over by , are those of the standard polymorphic lambda calculus except for dependent function types and refinement types. Base types, denoted by , are parameterized, but we suppose that they include Boolean type for refinements. We also assume that, for each , there is a set of constants of ; in particular, . Refinement types , where variable of type is bound in Boolean expression , denote the set of values of such that evaluates to . As in the prior work [Belo/Greenberg/Igarashi/Pierce_2011_ESOP, Sekiyama/Nishida/Igarashi_2015_POPL, Sekiyama/Igarashi/Greenberg_2016_TOPLAS], our refinement types are general in the sense that any type can be refined, while some work [Ou/Tan/Mandelbaum/Walker_2004_TCS, Flanagan_2006_POPL] allows only base types to be refined. Dependent function types bind variable of domain type in codomain type , and universal types bind type variable in . A typing context is a sequence of type variables and bindings of the form , and we suppose that term and type variables bound in a typing context are distinct.
Values, ranged over by , consist of casts and the usual constructs from the call-by-value polymorphic lambda calculus: constants (denoted by ), term abstractions, and type abstractions. Term abstractions and type abstractions bind and in the body , respectively. Casts from source type to target type check that arguments of can behave as at run time. Label indicates an abstract location of the cast in source code and is used to identify failed casts; in a typical implementation, it would be a pair of the file name and the line number where the cast is given. We note that casts in are not equipped with delayed substitution, unlike Sekiyama et al. [Sekiyama/Igarashi/Greenberg_2016_TOPLAS]. We discuss how this change affects the design of the logical relation in Section LABEL:sec:relwork.
The terms on the first line, ranged over by , are standard: values, variables (denoted by , , , etc.), primitive operations (denoted by ), term applications, and type applications. We assume that each base type is equipped with an equality operator to distinguish different constants.
The second line presents terms which appear at run time for contract checking. Waiting checks , introduced for fussy cast semantics by Sekiyama et al. [Sekiyama/Nishida/Igarashi_2015_POPL], check that the value of satisfies the contract by turning themselves to active checks. An active check denotes an intermediate state of the check that of satisfies contract ; is an intermediate term during the evaluation of . If evaluates to , the active check returns ; otherwise, if evaluates to , the check fails and uncatchable exception , called blame [Findler/Felleisen_2002_ICFP], is raised.
We introduce usual notation. We write and for the sets of free term variables and free type variables that occur in , respectively. Term is closed if . and denote terms obtained by substituting and for variables and in in a capture-avoiding manner, respectively. These notations are also applied to types, typing contexts, and evaluation contexts (introduced in Section 2.2). We write for the set of term and type variables bound in . We also write for if does not occur free in , for , and for .
2.2. Operational Semantics
The calculus has a call-by-value operational semantics in the small-step style, which is given by reduction and evaluation over closed terms. We write and for the reflexive transitive closures of and , respectively. Reduction and evaluation rules are shown in Figure 2.
(R_Op) says that reduction of primitive operations depends on function , which gives a denotation to each primitive operation and maps tuples of constants to constants; for example, denotes . We will describe requirements on in Section 2.3. Term and type applications evaluate by the standard β-reduction ((R_Beta) and (R_TBeta)).
Cast applications evaluate by a combination of cast reduction rules, which are from Sekiyama et al. [Sekiyama/Nishida/Igarashi_2015_POPL] except (R_Forall). Casts between the same base type behave as an identity function (R_Base). Casts for function types produce a function wrapper involving casts which are contravariant on the domain types and covariant on the codomain types (R_Fun). On taking an argument, the wrapper converts the argument with the contravariant cast so that the wrapped function can accept it; if the contravariant cast succeeds, the wrapper invokes with the conversion result and applies the covariant cast to the value produced by . (R_Fun) renames in the codomain type of the source function type to because expects to be replaced with arguments to but they are actually denoted by in the wrapper. Casts for universal types behave as in the previous work [Belo/Greenberg/Igarashi/Pierce_2011_ESOP, Sekiyama/Igarashi/Greenberg_2016_TOPLAS]; such a cast produces a wrapper which, applied to a type, invokes the wrapped type abstraction and converts the result (R_Forall). Casts for refinement types first peel off all refinements in the source type (R_Forget) and then check refinements in the target type with waiting checks (R_PreCheck). After checks of inner refinements finish, the outermost refinement is checked by an active check (R_Check). If the check succeeds, the checked value is returned (R_OK); otherwise, the cast is blamed (R_Fail).
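The wrapper produced for function casts can be sketched in Python. This is a simplified model under assumptions: casts are represented as plain functions from values to values, `refine_cast` and `fun_cast` are hypothetical helper names, and the dependency of the codomain on the argument is modelled by a function from the argument to a codomain cast.

```python
class Blame(Exception):
    def __init__(self, label):
        super().__init__(f"blame {label}")
        self.label = label


def refine_cast(contract, label):
    """Cast into a refinement type: check the contract, blame on failure."""

    def apply(v):
        if contract(v):
            return v
        raise Blame(label)

    return apply


def fun_cast(dom_cast, cod_cast_for):
    """Cast between function types, in the spirit of (R_Fun).

    dom_cast:      cast from the target domain to the source domain
                   (contravariant).
    cod_cast_for:  given the argument, the cast from the source codomain
                   to the target codomain (covariant, possibly dependent).
    """

    def wrap(f):
        def wrapper(x):
            y = dom_cast(x)                 # convert the argument first
            return cod_cast_for(x)(f(y))    # then convert the result
        return wrapper

    return wrap


identity = lambda v: v                      # casting to plain integers

# Target type, informally: a function whose result is smaller than its
# argument.
to_smaller = fun_cast(identity,
                      lambda x: refine_cast(lambda r: r < x, "l"))

decrement = to_smaller(lambda n: n - 1)     # satisfies the contract
increment = to_smaller(lambda n: n + 1)     # violates it when called
```

Calling `decrement(5)` passes the codomain check, while calling `increment(5)` is blamed with label `"l"` only when the wrapper is applied, since function contracts are checked lazily at each call.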
Evaluation uses evaluation contexts [Felleisen/Hieb_1992_TCS], given as follows, to reduce subterms (E_Red) and lift up blame (E_Blame).
This definition indicates that the semantics is call-by-value and arguments evaluate from left to right.
Fussy versus sloppy
Our cast semantics is fussy in that, when is applied, all refinements in target type are checked even if they have been ensured by source type . For example, let us consider reflexive cast . When applied to , the cast application forgets the refinements in the source type of the cast (R_Forget):
and then refinements in the target type are checked from the innermost through the outermost by using waiting checks (R_PreCheck):
even though would be typed at and satisfy the refinements.
In contrast, Belo et al.’s semantics [Belo/Greenberg/Igarashi/Pierce_2011_ESOP] is sloppy in that checks of refinements that have been ensured are skipped, which is represented by two cast reduction rules:
where is the reduction relation in the sloppy semantics. The first rule processes reflexive casts as if they are identity functions and the second checks only the outermost refinement because others have been ensured by the source type. Under the sloppy semantics, reduces to in one step. The sloppy semantics allows a logical relation to take arbitrary binary relations on terms for interpretation of type variables [Belo/Greenberg/Igarashi/Pierce_2011_ESOP].
It has been found, however, that the naive sloppy semantics does not satisfy the so-called cotermination (:reffh-coterm-true), a key property to show type soundness and parametricity in manifest contracts; Sekiyama et al. investigated this problem in detail [Sekiyama/Igarashi/Greenberg_2016_TOPLAS]. Briefly speaking, the cotermination requires that reduction of subterms preserve evaluation results, but the sloppy semantics does not satisfy it. For example, let where is a negation function on Booleans. Since reflexive cast behaves as an identity function in the sloppy semantics, evaluates to for any value . Since , the cotermination requires that also evaluate to because reduction of subterm to must not change the evaluation result. However, checks refinement in , which gives rise to blame; thus, the cotermination is invalidated.
The problem above does not happen in the fussy semantics. Under the fussy semantics, since all refinements in a cast are checked, both casts and check refinement and raise blame.
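The difference between the two semantics on casts between refinement types can be sketched in Python. The model is deliberately small and rests on assumptions: a refinement type over integers is represented as a tuple of named contracts listed from innermost to outermost, only the two sloppy rules described above are modelled, and the `log` argument is a hypothetical device for recording which refinements are actually checked.

```python
class Blame(Exception):
    def __init__(self, label):
        super().__init__(f"blame {label}")
        self.label = label


def is_prime(n):
    return n >= 2 and all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))


def fussy_cast(source, target, label, log):
    def apply(v):
        # Source refinements are forgotten without being consulted;
        # every target refinement is checked, innermost first.
        for name, contract in target:
            log.append(name)
            if not contract(v):
                raise Blame(label)
        return v
    return apply


def sloppy_cast(source, target, label, log):
    def apply(v):
        if source == target:                  # reflexive cast: identity
            return v
        if target and source == target[:-1]:  # only the outermost check
            name, contract = target[-1]
            log.append(name)
            if not contract(v):
                raise Blame(label)
            return v
        raise NotImplementedError("other sloppy rules are not modelled")
    return apply


positive = ("x > 0", lambda x: x > 0)
prime = ("prime? x", is_prime)
positive_prime = (positive, prime)            # innermost to outermost

fussy_log, sloppy_log = [], []
fussy_cast(positive_prime, positive_prime, "l", fussy_log)(7)
sloppy_cast(positive_prime, positive_prime, "l", sloppy_log)(7)
```

After running the sketch, `fussy_log` lists both refinements while `sloppy_log` stays empty, which mirrors the reflexive-cast example: the fussy cast rechecks everything, and the sloppy cast does nothing.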
2.3. Type System
The type system consists of three judgments: typing context well-formedness , type well-formedness , and term typing . They are derived by rules in Figure 3. The well-formedness rules are standard or easy to understand, and the typing rules are based on previous work [Belo/Greenberg/Igarashi/Pierce_2011_ESOP, Sekiyama/Igarashi/Greenberg_2016_TOPLAS]. We suppose that types of constants and primitive operations are provided by function . Requirements on their types will be described at the end of this section. Casts are well typed when their source and target types are compatible (T_Cast). Types are compatible if they are the same modulo refinements. This is formalized by type compatibility , which is derived by the rules shown at the bottom of Figure 3. The type of a term application is required to be well formed (T_App). As we will see in the proof, this condition is introduced for showing the parametricity (:reffh-lr-param). The typing rule (T_WCheck) of waiting checks requires to have because it is checked at run time that the evaluation result of satisfies which refers to of . Although waiting checks are run-time terms, (T_WCheck) does not require and to be closed, unlike other run-time typing rules such as (T_ACheck). This relaxation allows type-preserving static decomposition of into a smaller cast and a waiting check for refinement (:reffh-cc-precheck in Section LABEL:sec:reasoning-cast-decomp). Active checks are well typed if is an actual intermediate state of evaluation of (T_ACheck). (T_Forget) and (T_Exact) are run-time typing rules: the former forgets a refinement and the latter adds a refinement that holds.
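The informal reading of compatibility, "the same modulo refinements", can be sketched in Python by erasing all refinements and comparing the results. This is an approximation under assumptions, not the rules of Figure 3: the type representation (`Base`, `TVar`, `Refine`, `Fun`, `Forall`) is hypothetical, contracts are kept as opaque strings, and bound type variables are compared by name rather than up to renaming.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Base:
    name: str


@dataclass(frozen=True)
class TVar:
    name: str


@dataclass(frozen=True)
class Refine:
    var: str
    under: object       # the underlying type being refined
    contract: str       # contract kept abstract as text


@dataclass(frozen=True)
class Fun:
    var: str
    dom: object
    cod: object


@dataclass(frozen=True)
class Forall:
    tvar: str
    body: object


def erase(t):
    """Remove every refinement and every term-variable binder name."""
    if isinstance(t, Refine):
        return erase(t.under)
    if isinstance(t, Fun):
        return Fun("_", erase(t.dom), erase(t.cod))
    if isinstance(t, Forall):
        return Forall(t.tvar, erase(t.body))
    return t


def compatible(t1, t2):
    """Two types are compatible if they agree once refinements are erased."""
    return erase(t1) == erase(t2)


INT = Base("Int")
POS = Refine("x", INT, "x > 0")
```

For instance, `compatible(POS, INT)` holds, and a function type from `POS` to `INT` is compatible with one from `INT` to a refined integer type, while `Int` and `Bool` are not compatible.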
(T_Conv) is a run-time typing rule to show subject reduction. To motivate it, let us consider application where and are typed at and , respectively. This application would be typed at by (T_App). If reduces to , would be typed at , which is syntactically different from in general. Since subject reduction requires evaluation of well-typed terms to be type-preserving, we need a device that allows to be typed at . To this end, Belo et al. [Belo/Greenberg/Igarashi/Pierce_2011_ESOP] introduced a type conversion relation which relates and and added a typing rule that allows terms to be retyped at convertible types. Their type conversion turned out to be flawed, but it was fixed in the succeeding work [Greenberg_2013_PhD, Sekiyama/Nishida/Igarashi_2015_POPL]. Our type conversion follows the fixed version.
Definition (Type Conversion). The binary relation over types is defined as follows: if there exist some , , , and such that and and . The type conversion is the symmetric transitive closure of .
Finally, we formalize requirements on constants and primitive operations. We first define auxiliary function , which strips off refinements that are not under other type constructors:
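The behavior described for this auxiliary function can be sketched in Python as follows. The sketch rests on assumptions: the type representation (`Base`, `Refine`, `Fun`) and the name `unref` are hypothetical, and contracts are kept as opaque strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Base:
    name: str


@dataclass(frozen=True)
class Refine:
    var: str
    under: object       # the underlying type being refined
    contract: str


@dataclass(frozen=True)
class Fun:
    var: str
    dom: object
    cod: object


def unref(t):
    """Strip refinements at the top of a type.

    Refinements nested under another type constructor, such as the
    domain or codomain of a function type, are left untouched.
    """
    while isinstance(t, Refine):
        t = t.under
    return t


INT = Base("Int")
POS = Refine("x", INT, "x > 0")
POS_EVEN = Refine("y", POS, "even y")
POS_TO_POS = Fun("x", POS, POS)
```

Here `unref(POS_EVEN)` yields `INT`, while `unref(POS_TO_POS)` returns the function type unchanged because its refinements sit under the function type constructor.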
Requirements on constants and primitive operations are as follows:
For each constant , (1) , (2) is derivable, and (3) satisfies all refinements in , that is, .
For each primitive operation , is a monomorphic dependent function type of the form where, for any , there exists some such that . Furthermore, we require that return a value satisfying the refinements in the return type when taking constants satisfying the refinements in the argument types, that is:
In contrast, we assume that is undefined if some does not satisfy refinements in , that is, cannot be derived.
This section proves type soundness via progress and subject reduction [Wright/Felleisen_1994_IC]. Type soundness can be shown as in the previous work [Sekiyama/Nishida/Igarashi_2015_POPL, Sekiyama/Igarashi/Greenberg_2016_TOPLAS], and so we omit most of its proof.
We start with showing the cotermination (:reffh-coterm-true), a key property for proving not only type soundness but also parametricity and soundness of our logical relation with respect to contextual equivalence. It states that, if , then and behave equivalently, which means that convertible types have the same denotation. Following Sekiyama et al. [Sekiyama/Igarashi/Greenberg_2016_TOPLAS], our proof of the cotermination is based on the observation that is a weak bisimulation. We also refer to the names of the lemmas in the proof script coterm.v.
Lemma fh-red-decomp (Unique Decomposition [lemm_red_ectx_decomp in coterm.v]). If and and and , then and .
Proof. By induction on .
Lemma fh-eval-determinism (Determinism [lemm_eval_deterministic in coterm.v]). If and , then .