Recalling a Witness: Foundations and Applications of Monotonic State

07/08/2017 ∙ by Danel Ahman, et al.

We provide a way to ease the verification of programs whose state evolves monotonically. The main idea is that a property witnessed in a prior state can be soundly recalled in the current state, provided (1) state evolves according to a given preorder, and (2) the property is preserved by this preorder. In many scenarios, such monotonic reasoning yields concise modular proofs, saving the need for explicit program invariants. We distill our approach into the monotonic-state monad, a general yet compact interface for Hoare-style reasoning about monotonic state in a dependently typed language. We prove the soundness of the monotonic-state monad and use it as a unified foundation for reasoning about monotonic state in the F* verification system. Based on this foundation, we build libraries for various mutable data structures like monotonic references and apply these libraries at scale to the verification of several distributed applications.


1. Introduction

Functional programs are easier to reason about than stateful programs, inasmuch as properties proven on pure terms are preserved by evaluation. In contrast, properties of imperative programs generally depend on their evolving state, e.g., a counter initialized to zero may later contain a strictly positive number, requiring properties that depend on this counter to be revised.

To rein in the complexity of reasoning about ever-changing state, verification can be structured using stateful invariants, i.e., predicates capturing properties that hold in every program state. Defining invariants and proving their preservation are the bread and butter of verification, and many techniques and tools have been devised to aid these tasks. A prominent example is separation logic (Reynolds, 2002; Ishtiaq and O’Hearn, 2001), which offers a way to compose invariants according to the shape of mutable data.

Whereas separation logic is concerned primarily with spatial properties [1], program verification also makes use of temporal properties that control how the state may evolve. For instance, consider a program with a counter that can only be increased. Reading n as its current value should allow one to conclude that its value will remain at least n, irrespective of state updates. In turn, this may be used to reason about the generation of unique identifiers from fresh counter values. One may hope to recover some of the ease of reasoning about pure programs in this setting, at least for properties that are preserved by counter increments. Capturing this intuition formally and using it to simplify the verification of stateful programs are the main goals of this paper.

[1] Recent variants of separation logic consider resources more abstract than just heap locations. In such settings, in addition to describing spatial properties of resources, one can encode certain kinds of temporal properties. We compare our work to these modern variants of separation logic in §7, focusing here on more familiar program logics for reasoning about heaps.

1.1. Stateful Invariants vs. Monotonic State and Stable Predicates

Consider a program written against a library that operates on a mutable set. The library provides abstract operations to read the current value of the set and test whether an element is in the set. Importantly, its only operation to mutate the state is ‘insert’, which adds an element into the set. From this signature alone, one should be able to conclude that an element observed in the set will remain in it. However, proving this fact is not always easy in existing program logics. To illustrate our point, consider trying to prove the assertion at the end of the following program:

insert v; complex_procedure(); assert (v $\in$ get())

In a Floyd-Hoare logic, one may prove that our code maintains a stateful invariant on the set, i.e., prove the Hoare triple {inv} complex_procedure() {inv}, where inv s is defined as  v $\in$ s. One may rely on separation logic to conduct this proof, for instance by framing inv across complex_procedure provided it does not operate on the set. However, when complex_procedure also inserts elements, one must carry inv through its proof and reason about the effect of these insertions to confirm that they preserve inv. This quickly becomes tedious, e.g., showing that the set also retains some other element w requires another adjustment to the proof of complex_procedure. Besides, although this Hoare triple suffices to prove our assertion, it does not by itself ensure that v remains in the mutable set throughout execution. For example, v could have been temporarily removed and then reinserted. While detailed stateful invariants are often unavoidable, they are needlessly expensive here: knowing that elements can only be inserted, we would like to conclude that v $\in$ s is stable.

Our Solution: Monotonic State and Stable Predicates.

Let $\leq$ be a preorder (that is, a reflexive-transitive relation) on program states, or a fragment thereof. We say that a program is monotonic when its state evolves according to the preorder (that is, we have $s_0 \leq s_1$ for any two successive states $s_0$ and $s_1$), and that a predicate p on its state is stable when it is preserved by the preorder (that is, when for all states $s_0$ and $s_1$, we have p $s_0$ /\ $s_0 \leq s_1$ ==> p $s_1$). We outline an extension of Hoare-style program logics in this setting: we restrict any state-changing operations so that they conform with $\leq$, and we extend the logic and the underlying language with the following new constructs:

  1. A logical capability, witnessed, that turns a predicate p on states into a state-independent proposition witnessed p, expressing both that i) p was observed in some past state and ii) it will hold in every future state, including the program’s current state.

  2. A weakening principle, (forall s. p s ==> q s) ==> (witnessed p ==> witnessed q).

  3. Two actions: witness p, to establish witnessed p, given that p is stable and holds in the current state; and recall p, to (re)establish p in the current state, given that witnessed p holds.

Continuing with our example, one may pick set inclusion as the preorder and check that the only mutating operation, insert, respects it. Then, preserving (i.e., framing) stable properties across state updates is provided by the logic whenever explicitly requested using witness and recall. For instance, we can revise our example program as shown below to prove the assertion:

insert v; witness inv; complex_procedure(); recall inv; assert (v $\in$ get())

Crucially, witness inv yields the postcondition witnessed inv, which, being a pure, state-independent proposition, is trivially maintained across complex_procedure without the need to analyze its definition. With recall inv, we recover inv of the current state without having to prove that it is related to the state in which inv was witnessed, since this follows from the stability of inv with respect to $\leq$. We could also prepend insert w to this code, and still would not need to revisit the proof of complex_procedure to establish that the final state contains w; we only need to insert the corresponding witness (fun s -> w $\in$ s) and recall (fun s -> w $\in$ s) operations into our program.

1.2. Technical Overview and Contributions

A point of departure for our work is Swamy et al.’s (2013b) proposal to reason about monotonic state using a variant of the witness and recall primitives. Following their work, the F* programming language (Swamy et al., 2016) has embraced monotonicity in the design of many of its verification libraries. Although these libraries have been founded on ad hoc axioms and informal meta-arguments, they are used extensively in several large-scale projects, including verified efficient cryptographic libraries (Protzenko et al., 2017; Zinzindohoué et al., 2017) and a partially-verified implementation of TLS (Bhargavan et al., 2017a, b). The theory of monotonic state developed here provides a new, unifying foundation accounting for all prior, ad hoc uses of monotonicity in F*, while also serving as a general basis for the verification of monotonic properties of imperative programs in other systems. Our contributions include the following main points.

The Monotonic-State Monad (§ 2.2).

We propose MST, the monotonic-state monad, and its encoding within a sequential, dependently typed language like F*. Its interface is simpler and more general than Swamy et al.’s (2013b) with just four actions for programming and reasoning about global, monotonic state (get, put, witness, and recall), together with a new logical connective (witnessed) for turning predicates on states into state-independent propositions.

Formal Foundations of the Monotonic-State Monad (§ 5, § 6).

We investigate the foundations of the monotonic-state monad by designing a sequent calculus for a first-order logic with the witnessed connective, proving it consistent via cut admissibility. We use this to prove the soundness of a Hoare-style logic for a core dependently typed lambda calculus augmented with the MST interface. We present this in two stages. First, we focus on an abstract variant of MST and prove our Hoare logic sound in the sense of total correctness (§ 5). Next, we show how to soundly reveal the representation of MST computations as pure state-passing functions, while carefully ensuring that the preorder enforced by MST is not violated. This pure representation can be used to conduct relational proofs (e.g., showing noninterference) for programs with monotonic state, and to enable clients to safely extend interfaces based on MST with new, preorder-respecting actions (§ 6).

Typed Heaps and Three Flavors of References (§ 2.3 and § 2.4).

Using MST, we encode more convenient forms of mutable state, including heaps with dynamic allocation and deallocation. Our model supports both untyped references uref with strong (non-type-preserving) updates, as well as typed references (ref t) with only weak (type-preserving) updates. Going further, we use MST to program monotonic typed references (mref t rel) that allow us to select a preorder for each reference separately, with ref t just a special case of monotonic references with a trivial preorder. As such, programmers can opt-in to monotonicity whenever they allocate a reference, and retain the generality of non-monotonic state and stateful invariants wherever needed.

Secure File-Transfer, a First Complete Example Application (§ 3).

Based on monotonic references, we verify a secure file-transfer application, illustrating several practical uses of monotonicity: ensuring safe memory initialization; modeling ghost distributed state as an append-only log of messages; and the interplay between stateful invariants, refinement types, and stable predicates. This application provides a standalone illustration of the essential use of monotonicity in a larger-scale F* verification project targeting the main streaming authenticated encryption construction used in TLS 1.3 (Bhargavan et al., 2017b).

Ariadne, Another Application to State Continuity (§ 4).

We also develop a new case study based on a recent protocol (Strackx and Piessens, 2016) that ensures the state continuity of hardware-protected sub-systems (such as TPMs and SGX enclaves) in the presence of multiple crashes and restarts. Our proof consists of programming a “ghost state machine” to keep track of the progress of the protocol. It illustrates the combination of ghost state and monotonicity as an effective style for verifying distributed algorithms, following the intuition that in a well-designed protocol, the logical properties carried by every message must be stable. While we do not address concurrency in this paper, this case study also illustrates how monotonic state may be usefully composed with other computational effects, such as failures and exceptions.

Example          LOC
Memory models   2152
Array library    544
File transfer    630
Ariadne          232

In § 7 we discuss related work and in § 8 we outline future work and conclude. Additional materials associated with this paper are available from https://fstar-lang.org/papers/monotonicity. These include full definitions and proofs for the formal results in § 5 and § 6, as well as the code for all examples in the paper. The table above shows the number of lines of F* code for these examples. Compared to their code, the listings in the paper are edited for clarity and sometimes omit uninteresting details.

2. The Monotonic-State Monad, By Example

Verifying stateful programs within a dependently typed programming language and proof assistant is attractive for the expressive power provided by dependent types, and for the foundational manner in which a program’s semantics can be modeled. Concretely, we develop our approach in F* but our ideas should transfer to other settings, e.g., Hoare Type Theory (Nanevski et al., 2008), which F* resembles, and also to other Hoare-style program logics. Since dependent type theory provides little by way of stateful programming primitives, we will have to build support for stateful programming from scratch, starting from a language of pure functions. We focus primarily on modeling stateful programming faithfully, rather than efficiently implementing imperative state. For instance, we will represent mutable heaps as functions from natural numbers to values at some type, although an efficient implementation should use the mutable memory available primitively in hardware.

We start by defining a simple, canonical state monad, the basic method by which stateful programming is introduced in F* (§ 2.1). We then present the monotonic-state monad, MST, in its simplest, yet most general form: a global state monad parameterized by a preorder which constrains state evolution (§ 2.2). We instantiate this generic version of MST, first, to model mutable heaps with allocation, deallocation, untyped and typed references (§ 2.3), and then generalize our heaps further to include per-reference preorders (§ 2.4). Throughout, we use small examples to illustrate the pervasive applicability of monotonic state in many verification scenarios.

2.1. F*: A Brief Review and a Basic State Monad

We start with a short primer on F* and its syntax, showing how to extend it with a state effect based on a simple state monad. In particular, F* is a programming language with full dependent types, refinement types, and a user-defined effect system. Its effect system includes an inference algorithm based on an adaptation of Dijkstra’s weakest preconditions to higher-order programs. Programmers specify precise pre- and post-conditions for their programs using a notation similar to Hoare Type Theory and F* checks that the inferred specification is subsumed by the programmer’s annotations. This subsumption check is reduced to a logical validity problem that F* discharges through various means, including a combination of SMT solving and user-provided lemmas.

Basic Syntax.

F* syntax is roughly modeled on OCaml (val, let, match etc.) although there are many differences to account for the additional typing features. Binding occurrences b of variables take the form x:t, declaring a variable x at type t; or #x:t indicating that the binding is for an implicit argument. The syntax fun (b$_1$) ... (b$_n$) -> t introduces a lambda abstraction, whereas b$_1$ -> ... -> b$_n$ -> c is the shape of a curried function type; we emphasize the lack of enclosing parentheses on the b$_i$. Refinement types are written b{t}, e.g., x:int{x>=0} is the type of non-negative integers (nat). As usual, a bound variable is in scope to the right of its binding; we omit the type in a binding when it can be inferred; and for non-dependent function types, we omit the variable name. For example, the type of the pure append function on vectors is written

#a:Type -> #m:nat -> #n:nat -> vec a m -> vec a n -> vec a (m + n), with the two explicit arguments and the return type depending on the three implicit arguments marked with #.

Basic State Monad and Computation Types.

Programmers can extend the core F* language of pure, total functions to effectful programs, by providing monadic representations for the effects concerned. For example, the programmer can extend F* for stateful programming by defining the standard state monad (with the type st a = state -> a * state) together with two actions, get and put. Given this monad, using a construction described by Ahman et al. (2017), F* derives the corresponding computation type ST t (requires pre) (ensures post), describing computations that, when run in an initial state s$_0$ satisfying pre s$_0$, produce a result r:t and a final state s$_1$ satisfying post s$_0$ r s$_1$. On top of that, F* also derives a get and a put action for ST, with the following types:

val get : unit -> ST state (requires (fun _ -> True)) (ensures (fun s$_0$ s s$_1$ -> s$_0$ == s /\ s == s$_1$))
val put : s:state -> ST unit (requires (fun _ -> True)) (ensures (fun _ _ s$_1$ -> s$_1$ == s))

and, when taking state = nat, the double function below has a trivial precondition (which we often omit) and a postcondition stating that the final state is twice the initial state.

val double : unit -> ST unit (requires (fun _ -> True)) (ensures (fun s$_0$ _ s$_1$ -> s$_1$ == 2 * s$_0$))
let double () = let x = get () in put (x + x)
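As a rough illustration of the state-passing representation st a = state -> a * state, the following Python sketch models get, put, and double as plain functions from a state to a (result, state) pair. Python is used here only as an executable stand-in; the helper bind and the dynamic check at the end are assumptions of this sketch, and nothing is statically verified as it would be in F*.

```python
# A computation of type "st a" is modeled as a function: state -> (result, state).

def get():
    # Return the current state as the result, leaving the state unchanged.
    return lambda s: (s, s)

def put(new_state):
    # Discard the old state and install new_state; the result is unit (None).
    return lambda s: (None, new_state)

def bind(m, f):
    # Run m, then feed its result to f and run the resulting computation.
    def run(s0):
        x, s1 = m(s0)
        return f(x)(s1)
    return run

def double():
    # let x = get () in put (x + x)
    return bind(get(), lambda x: put(x + x))

# With state = nat, the postcondition s1 == 2 * s0 can be checked dynamically.
result, final_state = double()(21)
assert result is None and final_state == 2 * 21
```

Running double() from the initial state 21 yields the final state 42, matching the postcondition stated in the type of double.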

2.2. MST: The Monotonic-State Monad

effect MST (a:Type) (requires (pre:(state -> Type))) (ensures (post:(state -> a -> state -> Type)))
(* [get ()]: A standard action to retrieve the current state *)
val get : unit -> MST state (requires (fun _ -> True)) (ensures (fun s$_0$ x s$_1$ -> s$_0$ == x /\ s$_0$ == s$_1$))
(* [put s]: An action to evolve the state to s, when s is related to the current state by rel *)
val put : s:state -> MST unit (requires (fun s$_0$ -> s$_0$ rel s)) (ensures (fun s$_0$ _ s$_1$ -> s$_1$ == s))
(* [stable rel p]: p is stable if it is invariant w.r.t rel *)
let stable (#a:Type) (rel:preorder a) (p: (a -> Type)) = forall x y. p x /\ x rel y ==> p y
(* [witnessed p]: p was true in some prior program state and will remain true *)
val witnessed : (state -> Type) -> Type
(* [witness p]: A logical action; if p is true now, and is stable w.r.t rel, then it remains true *)
val witness : p:(state -> Type) -> MST unit (requires (fun s$_0$ -> p s$_0$ /\ stable rel p))
                                       (ensures (fun s$_0$ _ s$_1$ -> s$_0$ == s$_1$ /\ witnessed p))
(* [recall p]: A logical action; if p was witnessed in a prior state, recall that it holds now *)
val recall  : p:(state -> Type) -> MST unit (requires (fun _ -> witnessed p))
                                     (ensures (fun s$_0$ _ s$_1$ -> s$_0$ == s$_1$ /\ p s$_1$))
(* [witnessed_weaken p q]: witnessed is functorial *)
val witnessed_weaken : p:_ -> q:_ -> Lemma ((forall s. p s ==> q s) ==> witnessed p ==> witnessed q)
Figure 1. MST: The monotonic-state monad (for rel:preorder state)

In a nutshell, we use an abstract variant of st parameterized by a preorder that restricts how the state may be updated—abstraction is key here, since it allows us to enforce this update condition. A preorder is simply a reflexive and transitive relation:

let preorder a = rel: (a -> a -> Type) {(forall x . x rel x) /\ (forall x y z. x rel y /\ y rel z ==> x rel z)}

where x rel y is F* infix notation for rel x y. (Preorders are convenient, inasmuch as we usually do not wish to track the actual sequence of state updates.) Figure 1 gives the signature of the abstract monotonic-state monad MST, parameterized by an implicit relation rel of type preorder state. Analogously to ST, the MST computation type is also indexed by a result type a, by a precondition on the initial state, and by a postcondition relating the initial state, the result, and the final state. The bind and return for MST are the same as for ST. The actions of MST are its main points of interest, although the get action (see Figure 1) is still unsurprising: it simply returns the current state.
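To make the definitions of preorder and stable concrete, the sketch below checks them by brute force over a small finite domain. The function names is_preorder and is_stable, and the sample domain, are assumptions of this sketch; an exhaustive check over six natural numbers is of course no substitute for the universally quantified proofs that F* requires.

```python
from itertools import product

def is_preorder(rel, domain):
    # Reflexive: x rel x for every x.
    reflexive = all(rel(x, x) for x in domain)
    # Transitive: x rel y and y rel z imply x rel z.
    transitive = all(
        rel(x, z)
        for x, y, z in product(domain, repeat=3)
        if rel(x, y) and rel(y, z)
    )
    return reflexive and transitive

def is_stable(rel, pred, domain):
    # Stable: p x and x rel y imply p y.
    return all(
        pred(y)
        for x, y in product(domain, repeat=2)
        if pred(x) and rel(x, y)
    )

domain = range(6)
leq = lambda x, y: x <= y
lt = lambda x, y: x < y

assert is_preorder(leq, domain)
assert not is_preorder(lt, domain)                   # < is not reflexive
assert is_stable(leq, lambda s: s >= 3, domain)      # lower bounds are stable
assert not is_stable(leq, lambda s: s <= 3, domain)  # upper bounds are not
```

The last two checks show why only some predicates may be witnessed: a lower bound on an increasing counter survives every permitted update, whereas an upper bound can be invalidated by the next increment.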

Enforcing Monotonicity by Restricting put.

The put action requires that the new state be related to the old state by our preorder (see the precondition of put in Figure 1). This is the main condition to enable monotonic reasoning.

Making Use of Monotonicity with witness and recall.

The two remaining actions in MST have no operational significance: they are erased after verification. Logically, they capture the intuition that any reachable state evolves according to rel. The first action, witness p (see Figure 1), turns a stable predicate p valid in the current state (i.e., p s$_0$ holds) into a state-independent, logical capability witnessed p. Conversely, the second action, recall p, turns such a capability into a property that holds in the current state (i.e., p s$_1$ holds). In other words, a stable property, once witnessed to be valid, can be freely assumed to remain valid (via recall) irrespective of any intermediate state updates, since each of these updates (via the precondition of put) must respect the preorder.

A Tiny Example: Increasing Counters.

Taking state = nat and rel = <=, we can use MST to model an increasing counter. The F* code below (adapted from §1) shows how to use monotonicity to prove the final assertion, regardless of the stateful complex_procedure.

let x = get () in witness (fun s -> x <= s); complex_procedure (); recall (fun s -> x <= s); assert (x <= get())

The key point here is that the proposition witnessed (fun s -> x <= s) obtained by the witness action is state-independent and can thus be freely transported across large pieces of code, or even unknown ones like complex_procedure. In F*, any state-independent proposition can be freely transported using either the typing of bind or the rule for subtyping, which plays the same role as the rule of consequence in classical Hoare logic. The following valid Hoare logic rule illustrates this intuition: from {p} c {q}, one may derive {p /\ witnessed r} c {q /\ witnessed r} for any predicate r on states.

As such, MST provides a modular reasoning principle, which is key to scaling verification in the face of state updates; the success of approaches like separation logic (Reynolds, 2002) and dynamic framing (Kassios, 2006) speaks to the importance of modular reasoning. Monotonicity provides another useful, modular principle, one that is quite orthogonal to physical separation: in the example above, complex_procedure could very well mutate the state of its context, yet knowing that it only does so in a monotonic manner allows x <= get() to be preserved across it.
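The counter example can be mimicked at run time with a small Python class that checks the preorder on every put and keeps a list of witnessed predicates. This is only a dynamic analogue under stated assumptions: the class name MST, the list-based bookkeeping, and the body of complex_procedure are inventions of this sketch, stability of the witnessed predicate is the caller's obligation rather than a checked precondition, and in F* witness and recall are erased logical actions rather than executable code.

```python
class MST:
    """Global state whose updates must respect a preorder `rel` (runtime sketch)."""

    def __init__(self, initial, rel):
        self._state = initial
        self._rel = rel
        self._witnessed = []  # predicates observed to hold in some earlier state

    def get(self):
        return self._state

    def put(self, new_state):
        # The only way to change the state: the preorder is checked on every update.
        if not self._rel(self._state, new_state):
            raise ValueError("update does not respect the preorder")
        self._state = new_state

    def witness(self, pred):
        # Stability of pred w.r.t. rel is assumed here, not checked.
        if not pred(self._state):
            raise ValueError("predicate does not hold in the current state")
        self._witnessed.append(pred)

    def recall(self, pred):
        if pred not in self._witnessed:
            raise ValueError("predicate was never witnessed")
        # For a stable predicate, this check cannot fail.
        assert pred(self._state)


counter = MST(0, lambda old, new: old <= new)

def complex_procedure():
    # Hypothetical client code: it may update the counter, but only upwards.
    counter.put(counter.get() + 5)

x = counter.get()
at_least_x = lambda s: x <= s
counter.witness(at_least_x)
complex_procedure()
counter.recall(at_least_x)
assert x <= counter.get()

# A decreasing update is rejected by put.
try:
    counter.put(counter.get() - 1)
    rejected = False
except ValueError:
    rejected = True
assert rejected
```

Because put is the only mutator and it enforces the preorder, the check inside recall never needs to know what complex_procedure did.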

Discussion: Temporarily Escaping the Preorder.

In some programs, a state update may temporarily break our intended monotonicity discipline. For example, consider a mutable 2D point whose coordinates can only be updated one at a time. If the given preorder were to require the point to follow some particular trajectory (e.g., x=y), it would prevent any update to x or y.

One way to accommodate such examples is to define a more sophisticated preorder to track states where the original preorder can be temporarily broken. For instance, if we want the state s to respect a base rel$_\textsf{s}$: preorder s, while temporarily tolerating violations of this preorder, we could instantiate MST with a richer state type, t = Ok: s -> t | Tmp: s -> s -> t, where ‘Tmp snapshot actual’ represents a program state with ‘actual’ as its current value and ‘snapshot’ as its last value obtained by updates that followed rel$_\textsf{s}$. We lift rel$_\textsf{s}$ to rel$_\textsf{t}$ so that once in Tmp, the actual state can evolve regardless of the preorder rel$_\textsf{s}$ (see all the Tmp cases below):

let rel$_\textsf{t}$ t$_0$ t$_1$ = match t$_0$, t$_1$ with
    | Ok s$_0$, Ok s$_1$ | Ok s$_0$, Tmp s$_1$ _ | Tmp s$_0$ _, Ok s$_1$ | Tmp s$_0$ _, Tmp s$_1$ _ -> s$_0$ rel$_\textsf{s}$ s$_1$

To temporarily break and then restore the base preorder rel$_\textsf{s}$, we define the two actions below, checking that the current state is related to the last snapshot by rel$_\textsf{s}$ before restoring monotonicity.

val break : unit -> MST unit (requires (fun t$_0$ -> Ok? t$_0$)) (ensures (fun t$_0$ _ t$_1$ -> let Ok s = t$_0$ in t$_1$ == Tmp s s))
val restore : unit -> MST unit (requires (fun t$_0$ -> match t$_0$ with Ok _ -> False | Tmp s$_0$ s$_1$ -> s$_0$ rel$_\textsf{s}$ s$_1$))
                           (ensures (fun t$_0$ _ t$_1$ -> let Tmp _ s$_1$ = t$_0$ in t$_1$ == Ok s$_1$))
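The Ok/Tmp construction can be sketched in Python as follows, using the 2D point from the discussion above. The names base, lift, break_, and restore, and the particular diagonal preorder on_diagonal_up, are assumptions made for this illustration; preconditions are modeled as runtime checks that raise ValueError.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Ok:
    value: object

@dataclass(frozen=True)
class Tmp:
    snapshot: object  # last value reached by updates that followed rel_s
    actual: object    # current value, free to evolve arbitrarily

def base(t):
    # The component of a state that the lifted preorder constrains.
    return t.value if isinstance(t, Ok) else t.snapshot

def lift(rel_s):
    # rel_t compares only Ok values and Tmp snapshots, as in all four match cases.
    return lambda t0, t1: rel_s(base(t0), base(t1))

def break_(t):
    if not isinstance(t, Ok):
        raise ValueError("break requires an Ok state")
    return Tmp(t.value, t.value)

def restore(t, rel_s):
    if not isinstance(t, Tmp) or not rel_s(t.snapshot, t.actual):
        raise ValueError("actual state is not related to the snapshot")
    return Ok(t.actual)

# Hypothetical base preorder: points move up along the diagonal x = y.
def on_diagonal_up(p, q):
    return p == q or (q[0] == q[1] and p[0] <= q[0])

rel_t = lift(on_diagonal_up)

t0 = Ok((1, 1))
t1 = break_(t0)                # Tmp((1, 1), (1, 1))
t2 = Tmp(t1.snapshot, (2, 1))  # update x alone: off the diagonal
assert rel_t(t1, t2)           # allowed, since the snapshot is unchanged
t3 = Tmp(t2.snapshot, (2, 2))  # update y: back on the diagonal
t4 = restore(t3, on_diagonal_up)
assert t4 == Ok((2, 2)) and rel_t(t0, t4)
```

Attempting restore on t2 would raise ValueError, since (2, 1) is not related to the snapshot (1, 1) by the base preorder.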

On the Importance of Abstraction.

Suppose one were to treat an MST a (fun _ -> True) (fun _ _ _ -> True) computation as a pure state-passing function of type mst a = s$_0$:state -> (x:a * s$_1$:state{s$_0$ rel s$_1$}). It might seem natural to work with this monadic representation of state, but it can quickly lead to unsoundness. To see why, notice that this representation allows the context to pick an initial state that is not necessarily the consequence of state updates adhering to the given preorder. For example, in the code snippet below, we first observe that the program state is strictly positive, and then define a closure f that relies on monotonicity to recall that its state, when called, is also positive. The precondition of f then requires the state-independent proposition witnessed (fun s -> s > 0) to be valid, which is provable because of the call to witness (fun s -> s > 0) before the let-binding. More importantly, however, this state-independent precondition of f does not put any restrictions on the actual state values with which one can call f (), a pure mst state-passing function. As a result, we can call f () with whichever state we please, e.g., 0, causing a division by zero error at runtime.

put(get() + 1); witness (fun s -> s > 0); let f () = recall (fun s -> s > 0); 1 / get() in f () 0 (* <-- BROKEN! *)
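The failure can be reproduced in a few lines of Python, where state-passing functions are ordinary closures and nothing prevents the caller from supplying an arbitrary initial state. The function name program and its structure are assumptions of this sketch; witness and recall appear only as comments because they have no runtime content.

```python
def program(s0):
    # put (get () + 1): the state is now strictly positive.
    s1 = s0 + 1
    # witness (fun s -> s > 0) would be established here.

    def f():
        # f () is itself a state-passing function: state -> (result, state).
        # recall (fun s -> s > 0) is only a logical claim about the state it is
        # given; the representation does not enforce it.
        return lambda s: (1 / s, s)

    return f, s1

f, s1 = program(0)

# Running f () from the state the program actually reached is fine.
assert f()(s1) == (1.0, 1)

# Running f () from a state chosen by the context breaks the recalled property.
try:
    f()(0)
    crashed = False
except ZeroDivisionError:
    crashed = True
assert crashed
```

Keeping MST abstract rules out the second call, because clients can then never pick the state in which a computation runs.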

For the next several sections, we treat MST abstractly. We return to this issue in § 6 and carefully introduce two coercions, reify and reflect, to turn MST computations into mst functions and back, while not compromising monotonicity. Using reify, we show how to prove relational properties of MST computations, e.g., noninterference; and using reflect, we show how to add new actions that respect the preorder (while potentially temporarily violating it in an unobservable way).

2.3. Heaps and References, Both Untyped and Typed

The global monotonic state provided by MST is a useful primitive, but for practical stateful programming we need richer forms of state. In this section we show how to instantiate the state and preorder of MST to model references to mutable, heap-allocated data. Our references come in two varieties. Untyped references (type uref) represent transient locations in the heap whose type may change as the program evolves, until they are explicitly deallocated. Typed references (type ref t) are always live (at least observably so, given garbage collection) and contain a t-typed value.

Prior works on abstractly encoding mutable heaps within dependent types, e.g., HTT (Nanevski et al., 2008) and CFML (Charguéraud, 2011), only include untyped references; while primitive and general, they are also harder to use safely, since one must maintain explicit liveness invariants stating that an untyped reference contains a value of a given type in the current state. Swierstra’s (2009) shape-indexed references are more sophisticated, but still require a proof to accompany each use of the reference to establish that the current heap contains it. Using our monotonic-state monad MST, we show how to directly account for both typed and untyped references, where typed references are just as easy to use as in more mainstream, ML-like languages, without the need for liveness invariants or explicit proofs accompanying each use: a value r:ref t is a proof of its own well-typed membership in the current heap.

We model heap (see code below) as a map from abstract identifiers to either Unused or Used cells, together with a counter (ctr) tracking the next Unused identifier. As in the CompCert memory model (Leroy and Blazy, 2008), identifiers are internally represented by natural numbers (type nat) but these numbers are not connected to the real addresses used by an efficient, low-level implementation. New cell identifiers are freshly generated by bumping the counter and are never reused.

type tag = Typed : tag | Untyped : bool -> tag
type cell = Unused | Used : a:Type -> v:a -> tag -> cell
type heap = H : h:(nat -> cell) -> ctr:nat{forall (n:nat{ctr <= n}). h n == Unused} -> heap

A ‘Used a v tag’ cell is a triple, where a is a type and v is a value of type a. As such, heaps are heterogeneous maps (potentially) storing values of different types for each identifier. The tag is either Typed, marking a cell referred to by a typed reference; Untyped true, for a live, allocated untyped cell; or Untyped false, for a cell that was once live but has since been deallocated. (We distinguish Untyped false from Unused to simplify our model of freshness—a client of our library can treat a newly allocated reference as being distinct from all previously allocated ones.)

Using monotonicity, we now show how to define our two kinds of references. In particular, a typed reference ref t is an identifier id for which the heap has been witnessed to contain a Used, Typed cell of type t. An untyped reference uref, on the other hand, has a much weaker invariant: it was witnessed to contain a Used, Untyped cell that could have since been deallocated.

let has_a_t (id:nat) (t:Type) (H h _) = match h id with Used a _ Typed -> a == t | _ -> False
abstract type ref t = id:nat{witnessed (has_a_t id t)}
let has (id:nat) (H h _) = match h id with Used _ _ (Untyped _) -> True | _ -> False
abstract type uref  = id:nat{witnessed (has id)}

To enforce these invariants on state-manipulating operations, we define a preorder rel on heap that constrains the heap evolution. It states that every Used identifier remains Used; every Typed reference has a stable type; and that no Untyped reference may be reused after deallocation.

let rel (H h$_0$ _) (H h$_1$ _) = forall id. match h$_0$ id, h$_1$ id with
                            | Unused, _ -> True (* an Unused cell is unconstrained; needed for reflexivity *)
                            | Used a _ Typed, Used b _ Typed -> a == b
                            | Used _ _ (Untyped live$_0$), Used _ _ (Untyped live$_1$) -> not live$_0$ ==> not live$_1$
                            | _ -> False

Instantiating MST with state=heap and the preorder above, we can implement the expected operations for allocating, reading, and writing typed references. The alloc action below allocates a new, typed reference ref t by generating a fresh identifier id; extending the heap at id with a new typed cell; and witnessing the new state, ensuring that id will contain a t-typed cell forever. Reading and writing a reference are similar: they both recall that the reference exists in the heap at its expected type.

let alloc #t (v:t) : MST (ref t) (ensures (fun h$_0$ id h$_1$ -> fresh id h$_0$ h$_1$ /\ modifies {} h$_0$ h$_1$ /\ h$_1$.[id] == v)) =
    let H h id = get () in put (H (upd h id (Used t v Typed)) (id + 1)); witness (has_a_t id t); id
let (!) #t (r:ref t) : MST t (ensures (fun h$_0$ v h$_1$ -> h$_0$ == h$_1$ /\ has_ref r h$_1$ /\  h$_1$.[r] == v)) =
    recall (has_ref r); let h = get () in h.[r]
let (:=) #t (r:ref t) (v:t) : MST unit (ensures (fun h$_0$ _ h$_1$ -> modifies {r} h$_0$ h$_1$ /\ has_ref r h$_1$ /\ h$_1$.[r] == v)) =
    recall (has_ref r); let H h ctr = get () in put (H (upd h r (Used t v Typed)) ctr)

These functions make use of a few straightforward auxiliary definitions for freshness of identifiers (fresh) and for the write-footprint of a computation (modifies). The one subtlety is in the definition of h.[r], a total function to select a reference r from a heap h. Unlike the stateful (!) operator, h.[r] has a precondition requiring that h actually contain r: even though the type of r indicates that it has been witnessed in some prior heap of the program, this does not suffice to recall that it is actually present in an arbitrary heap h. In other words, pure functions may not use recall. On the other hand, the stateful lookup (!) is free to recall the membership of r in the current heap in order to meet the precondition of h.[r].

Footnote 2: In our revision to the libraries of F*, we use a more sophisticated representation of ref t that enables a variant of h.[r] without the has_ref r h precondition. This variant is convenient to use in specifications, since its well-typedness is easier to establish. We omit it for simplicity, but the curious reader may consult the FStar.Heap library for the full story.

let has_ref #t (r:ref t) h = has_a_t r t h
let fresh #t (r:ref t) (H h$_0$ _) (H h$_1$ _) = h$_0$ r == Unused /\ has_ref r h$_1$
let modifies (ids:set nat) (H h$_0$ _) (H h$_1$ _) = forall id. id $\not\in$ ids /\ Used? (h$_0$ id) ==> h$_0$ id == h$_1$ id
let _.[_] #t h (r:ref t{has_ref r h}) : t = let H h _ = h in match h r with Used _ v _ -> v

The operations of untyped references are essentially simpler counterparts of alloc, (!) and (:=) with weaker types. The free operation is easily defined by replacing an Untyped cell with a version marking it as deallocated. The precondition of free prevents double-frees, and is necessary to show that the preorder is preserved as we mark the cell deallocated.

let live (r:uref) (H h _) = match h r with Used _ _ (Untyped live) -> live | _ -> false
let free (r:uref) : MST unit (requires (live r)) (ensures (fun h$_0$ _ h$_1$ -> modifies {r} h$_0$ h$_1$)) =
   let H h ctr = get () in put (H (upd h r (Used unit () (Untyped false))) ctr)

2.4. Monotonic References

Typed references ref t use a fixed global preorder saying that the type of each ref-cell is invariant. However, we would like a more flexible form, allowing the programmer to associate a preorder of their choosing with each typed reference. In this section, we present a library for providing a type ‘mref a ra’, a typed reference to a value of type ‘a’ whose contents are constrained to evolve according to a preorder ‘ra’ on ‘a’. Using mrefs, one can for instance encode a form of typestate programming (Strom and Yemini, 1986) by attaching to an mref a preorder that corresponds to the reachability relation of a state machine.

As above, our global, monotonic-state monad MST can be instantiated with a suitable heap type for the global state (defined below) and a preorder on the global state that is intuitively the pointwise composition of the preorders associated with each mref that the state contains. In this setting, the type ref t can be reconstructed as a derived form, i.e., ref t = mref t (fun _ _ -> True)

An Interface for Monotonic References.

When allocating a monotonic reference one picks both the initial value and the preorder constraining its evolution. An mref a ra can be dereferenced unconditionally, whereas assigning to an mref a ra requires maintaining the preorder, analogous to the precondition on put for the global state.

type mref : a:Type -> ra:preorder a -> Type
val (:=) : #a:Type -> #ra:preorder a -> r:mref a ra -> v:a -> MST unit
    (requires (fun h -> h.[r] `ra` v))
    (ensures  (fun h$_0$ _ h$_1$ -> modifies {r} h$_0$ h$_1$ /\ h$_1$.[r] == v))

The local state analog of witness on the global state allows observing a predicate p on the global heap as long as the predicate is stable with respect to arbitrary heap updates that respect the preorder only on a given reference. Using recall to restore a previously witnessed property remains unchanged.

let stable #a #ra (r:mref a ra) (p:(heap -> Type)) = forall h$_0$ h$_1$. p h$_0$ /\ h$_0$.[r] `ra` h$_1$.[r] ==> p h$_1$
val witness : #a:Type -> #ra:preorder a -> r:mref a ra -> p:(heap -> Type){stable r p} -> MST unit
    (requires (fun h -> p h))
    (ensures  (fun h$_0$ v h$_1$ -> h$_0$==h$_1$ /\ witnessed p))
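The interface above can be approximated by a small run-time model. The sketch below is Python, not F*, and simplifies in two labelled ways: predicates range over the value of a single reference rather than over the whole heap, and stability, which is a static proof obligation in F*, is only re-checked dynamically on each write. The class MRef and its method names are hypothetical.

```python
from typing import Callable, Generic, List, TypeVar

A = TypeVar("A")

class MRef(Generic[A]):
    """A reference whose contents may only evolve along a preorder."""

    def __init__(self, value: A, preorder: Callable[[A, A], bool]) -> None:
        self._value = value
        self._preorder = preorder
        self._witnessed: List[Callable[[A], bool]] = []

    def read(self) -> A:
        """Dereferencing is unconditional."""
        return self._value

    def write(self, new: A) -> None:
        """Assignment must respect the preorder (the `requires` of :=)."""
        if not self._preorder(self._value, new):
            raise ValueError("update does not respect the preorder")
        # Assumption of this model: stability is re-checked at run time.
        for p in self._witnessed:
            if not p(new):
                raise ValueError("a witnessed predicate is not stable")
        self._value = new

    def witness(self, p: Callable[[A], bool]) -> None:
        """Record p, which must hold of the current contents."""
        if not p(self._value):
            raise ValueError("predicate does not hold in the current state")
        self._witnessed.append(p)

    def recall(self, p: Callable[[A], bool]) -> None:
        """Succeed only if this very predicate was witnessed earlier."""
        if p not in self._witnessed:
            raise ValueError("predicate was never witnessed")
```

For example, with MRef(0, lambda old, new: old <= new), a client can write 1, witness "the value is at least 1", write 5, and later recall that fact without re-inspecting the contents; an attempt to write 3 afterwards is rejected by the preorder.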

Implementing Monotonic References.

To implement mref we choose the following, revised representation of heap and its global preorder. We enrich the tags from §2.3 to additionally record a preorder with every typed cell. Correspondingly, the global preorder on heaps is, as mentioned earlier, the pointwise composition of preorders on typed cells (the cell and heap types are unchanged).

type tag a = Typed : preorder a -> tag a | Untyped : bool -> tag a
type cell = Unused | Used : a:Type -> a -> tag a -> cell
type heap = H : h:(nat -> cell) -> ctr:nat{forall (n:nat{ctr <= n}). h n == Unused} -> heap
let rel (H h$_0$ _) (H h$_1$ _) = forall id. match h$_0$ id, h$_1$ id with
   | Used a$_0$ v$_0$ (Typed ra$_0$), Used a$_1$ v$_1$ (Typed ra$_1$) -> a$_0$ == a$_1$ /\ ra$_0$ == ra$_1$ /\ v$_0$ `ra$_0$` v$_1$
   | Used _ _ (Untyped live$_0$), Used _ _ (Untyped live$_1$) -> not live$_0$ ==> not live$_1$
   | Unused, _ -> True
   | _ -> False
let mref a ra = id:nat{witnessed (fun (H h _) -> match h id with Used b _ (Typed rb) -> a == b /\ ra == rb | _ -> False)}

Monotonic references are our main building block for defining more sophisticated abstractions. While we have shown them here in the context of a flat heap, our libraries provide monotonic references within hyper-heaps (Swamy et al., 2016) and hyper-stacks (Protzenko et al., 2017), more sophisticated, region-based memory models used in F* to keep track of object-lifetimes and to encode a weak form of separation.

3. Monotonic state in action: A secure file-transfer application

Consider transferring a file from a sender application to a receiver application. To ensure that the file is transferred securely, the applications rely on a protocol designed to ensure that the receiver gets exactly the file that the sender sent (or detects a transmission error), while no one else learns anything about the file. For instance, the protocol could be based on TLS and use cryptography and networking to achieve its goals, but its details are unimportant. Our example addresses the following concerns:

  1. Low-level buffer manipulation: For efficiency reasons, the sender and the receiver prepare byte buffers shared with the protocol. For instance, the receiver allocates an uninitialized buffer and requests the protocol to fill the buffer with the bytes it receives. Monotonicity ensures that once memory is initialized it remains so.

  2. Modeling distributed state: To state and prove the correctness of a distributed system, even one as simple as our 2-party file-transfer scenario, it is common to describe the state of the system in terms of some global ghost (i.e., purely specificational) state. We use monotonicity to structure this ghost state as an append-only log of messages sent so far, ensuring that an observation of the state of the protocol remains consistent even while the state evolves.

  3. Fragmentation and authenticity: The protocol dictates a maximum size of messages that can be sent at once. As such, the sender has to fragment the file into several chunks and the receiver must reconstruct the file. We use monotonicity to show that the receiver always reads a prefix of the stream that the sender has sent so far, i.e., the receiver gets authentic file fragments from the sender in the correct order.

  4. Secrecy: Finally, we consider possible implementations of the protocol itself and prove that under certain, standard cryptographic hypotheses, the protocol leaks no information about the file, other than some information about its length.

3.1. Safely Using Uninitialized and Frozen Memory

Reasoning about safety in the presence of uninitialized memory is a well-known problem, with many bespoke program analyses targeting it, e.g., Qi and Myers’s (2009) masked types. The essence of the problem involves reasoning about monotonic state—memory is unreadable until it is initialized, and remains readable thereafter. A dual problem is deeming an object no longer writable. For instance, after validating its contents, one may want to freeze an object in a high-integrity state.

Using monotonic references, we designed and implemented a verified library for modeling the safe use of uninitialized arrays that may eventually be frozen, including support for a limited form of pointer arithmetic to refer to prefixes or suffixes of the array. We sketch a fragment of this library here, showing only the parts relevant to the treatment of uninitialized memory.

The main type provided by our library is an abstract type ‘array a n’ for a possibly uninitialized array indexed by its contents type a and length n. An array is implemented under the hood by using a monotonic reference containing a seq (option a) and constrained by the preorder remains_init.

abstract type array (a:Type) (n:nat) = mref (repr a n) remains_init
where repr a n = s:seq (option a){len s == n}
and   remains_init #a #n (s$_0$:repr a n) (s$_1$:repr a n) = forall (i:nat{i < n}). Some? s$_0$.(i) ==> Some? s$_1$.(i)

Notice the interplay between refinement types and monotonic references. The refinement type s:seq (option a){len s == n} passed to mref constrains the stored sequence to be of the appropriate length in every state. Though concise and powerful, refinements of this form can only enforce invariants of each reachable program state taken in isolation. In order to constrain how the array contents can evolve, the preorder remains_init states that the sequence underlying an array can evolve from s$_0$ to s$_1$ only if every initialized index in s$_0$ remains initialized in s$_1$.

Given this representation of array, the rest of the code is mostly determined. For instance, its create function takes a length n but no initial value for the array contents.

abstract let create (a:Type) (n:nat) : ST (array a n) (ensures (fun h$_0$ x h$_1$ -> fresh x h$_0$ h$_1$ /\ modifies { } h$_0$ h$_1$)) =
  alloc (Seq.create n None) remains_init

The pure function as_seq h x enables reasoning about an array x in state h as a sequence of optional values. Below we use it to define which parts of an array are initialized, i.e., those indices at which the array contains Some value.

let index #a #n (x:array a n) = i:nat{i < n}
let initialized #a #n (x:array a n) (i:index x) (h:heap) = Some? (as_seq h x).(i)

The main use of monotonicity in our library is to observe that initialized is stable with respect to remains_init, the preorder associated with an array. As such, we can define a state-independent proposition (x `init_at` i) using the logical witnessed capabilities.

let init_at #a #n (x:array a n) (i:index x) = witnessed (initialized x i)

We can now prove that writing to an array at index i ensures that it becomes initialized at i, which is a necessary precondition to read from i. Notice the use of witness when writing, to record the fact that the index is initialized and will remain so; and the use of recall when reading, to recover that the array is initialized at i and so the underlying sequence contains Some v at i.

abstract let write (#a:Type) (#n:nat) (x:array a n) (i:index x) (v:a) : ST unit
 (ensures  (fun h$_0$ _ h$_1$ -> modifies {x} h$_0$ h$_1$ /\ (as_seq h$_1$ x).(i) == Some v /\ x `init_at` i)) =
 x := Seq.upd !x i (Some v); witness (initialized x i)
abstract let read (#a:Type) (#n:nat) (x:array a n) (i:index x{x `init_at` i}) : ST a
 (ensures (fun h$_0$ r h$_1$ -> modifies { } h$_0$ h$_1$ /\ Some r == (as_seq h$_0$ x).(i))) =
 recall (initialized x i); match (!x).(i) with Some v -> v
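The write and read discipline can be mimicked at run time. The following Python sketch is only an analogy: the set of witnessed indices plays the role of the init_at tokens, and the checks that F* discharges statically become exceptions. The class name UninitArray and the encoding of Some v as a one-element tuple are assumptions of this example.

```python
from typing import List, Optional, Set, Tuple

class UninitArray:
    """Fixed-length array whose slots start out uninitialized."""

    def __init__(self, n: int) -> None:
        # Models Seq.create n None: None is "uninitialized", (v,) is "Some v".
        self._slots: List[Optional[Tuple[object]]] = [None] * n
        # Indices for which `initialized x i` has been witnessed.
        self._init_at: Set[int] = set()

    def _check_index(self, i: int) -> None:
        if not 0 <= i < len(self._slots):
            raise IndexError("index out of bounds")

    def write(self, i: int, v: object) -> None:
        self._check_index(i)
        self._slots[i] = (v,)      # the slot becomes Some v
        self._init_at.add(i)       # witness (initialized x i)

    def read(self, i: int) -> object:
        self._check_index(i)
        if i not in self._init_at:             # precondition: x `init_at` i
            raise ValueError("slot was never initialized")
        slot = self._slots[i]                  # recall: the slot is Some v
        assert slot is not None
        return slot[0]
```

Because no operation ever resets a slot to None, every update trivially respects remains_init, which is what makes the witnessed set safe to consult later.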

It is worth noting that in the absence of monotonicity, the init_at predicate defined above would need to be state-dependent, and thus carried through the subsequent stateful functions as a stateful invariant, causing unnecessary additional proof obligations and bloated specifications.

Freezing and Sub-Arrays.

Our full array library supports freezing an array once it is fully initialized. We add a stateful predicate is_mutable x h indicating that the array x is mutable in state h. By design, is_mutable is not a stable predicate: freezing an array explicitly revokes mutability. While is_mutable is a precondition of write, freezing an initialized array provides a stable predicate frozen_with x s, where s:seq a represents the stable snapshot of the array contents; the clients can later recall that x’s contents still correspond to s. The library also supports always mutable arrays in the same framework, for which is_mutable is indeed a stable predicate. Consequently, the write operation on always mutable arrays has no stateful precondition, since its argument’s mutability can just be recalled. Internally, these temporal properties of an array are managed by a preorder capturing the state-transition system shown alongside. Aside from these core operations, our library provides functions to create aliases to a prefix or a suffix of an array, while propagating information about the fragments of the array that are initialized or frozen to the aliases.

3.2. Modeling the Protocol’s Distributed State

Beyond simple safety, to prove our file-transfer application correct and secure, we first need to model the protocol. Conceptually, we model each instance of the protocol using a ghost state shared between the protocol participants (in this case, just the sender and the receiver). The shared state contains a log of message fragments already sent by the sender; the main invariant of the protocol is that successful receive operations return fragments from a prefix of the stream of fragments sent so far. This style of modeling distributed systems has a long tradition (Chandy and Lamport, 1985), and several recent program logics and verification systems have incorporated special support for such shared ghost state (Bhargavan et al., 2010; Swamy et al., 2013a; Sergey et al., 2018). We simply use our monotonic references library for this.

Connection State.

The state of a protocol instance is the abstract type connection. Its interface (shown below) provides a function log c h, representing the messages sent so far on c; and a sequence counter ctr c h, representing the current position in the log. Whereas the sender and receiver each maintain their own counters, the log is shared, specification-only state between the two participants.

Footnote 3: The log is only needed for specification and modeling purposes. So, in practice, we maintain the log in computationally irrelevant state, which F* supports. However, we elide this level of detail from our example here, since it is orthogonal.

type connection
type message = s:seq byte{len s <= fragment_size}
val log: connection -> heap -> seq message
val ctr: connection -> heap -> nat
val is_receiver: connection -> bool
let receiver = c:connection{is_receiver c}
let sender = c:connection{not (is_receiver c)}

Monotonic Properties of the Protocol.

Monotonicity comes into play with the snap operation. A snapshot of the distributed protocol state remains stable as the state evolves. Obtaining such stable snapshots is a basic component of designing and verifying distributed protocols, and our libraries make this easy to express. In particular, the snapshotted log remains a prefix of the log and the counter never decreases. The clients can also recall that in any state, the counter value is bounded by the length of the log, i.e., the valid transitions of the counter are dependent on the log, as depicted in the figure above.

let snapshot c h$_0$ h = log c h$_0$ `is_a_prefix_of` log c h /\ ctr c h$_0$ <= ctr c h /\ ctr c h$_0$ <= len (log c h)
val snap: c:connection -> MST unit (ensures (fun h$_0$ _ h$_1$ -> h$_0$ == h$_1$ /\ witnessed (snapshot c h$_0$)))
val recall_counter: c:connection -> MST unit (ensures (fun h$_0$ _ h$_1$ -> h$_0$ == h$_1$ /\ ctr c h$_0$ <= len (log c h$_0$)))
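As a rough illustration of why a snapshot stays valid, the Python sketch below models a connection with an append-only log and a counter bounded by the log length, and re-evaluates the snapshot predicate after the state has moved on. It is a run-time analogy under stated assumptions: a single Connection object stands in for both participants, and the names append, advance and snapshot_holds are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Snapshot = Tuple[Tuple[bytes, ...], int]   # (log contents, counter value)

@dataclass
class Connection:
    log: List[bytes] = field(default_factory=list)
    ctr: int = 0

    def append(self, msg: bytes) -> None:
        """The only log update: extend it (is_a_prefix_of is preserved)."""
        self.log.append(msg)

    def advance(self) -> None:
        """The only counter update: increase it, never past the log."""
        if self.ctr >= len(self.log):
            raise ValueError("counter would exceed the log length")
        self.ctr += 1

def snap(c: Connection) -> Snapshot:
    """Capture the current log and counter as an immutable value."""
    return (tuple(c.log), c.ctr)

def snapshot_holds(s: Snapshot, c: Connection) -> bool:
    """The `snapshot` predicate, evaluated against the current state."""
    old_log, old_ctr = s
    return (tuple(c.log[:len(old_log)]) == old_log   # old log is a prefix
            and old_ctr <= c.ctr                     # counter never decreases
            and old_ctr <= len(c.log))               # counter bounded by log
```

Since append and advance are the only mutators and both respect the respective preorders, any snapshot taken earlier keeps satisfying snapshot_holds.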

Receiving a Message.

Receiving a message using receive buf c requires buf to contain enough space for a message, and that buf and c use disjoint state. It ensures that only the connection and the buffer are modified; that at most fragment_size bytes are received into buf; and the bytes correspond to the message recorded in the log at the current counter position, which is then incremented.

val receive: #n:nat{fragment_size <= n} -> buf:array byte n -> c:receiver{disjoint c buf} -> MST (option nat)
 (ensures  (fun h$_0$ ropt h$_1$ -> match ropt with None -> h$_0$ == h$_1$
                        | Some r -> modifies {c, buf} h$_0$ h$_1$ /\ r <= fragment_size /\ modifies_array buf 0 r /\
                                  all_init (prefix buf r) /\ ctr c h$_1$ == ctr c h$_0$ + 1 /\ log c h$_1$ == log c h$_0$ /\
                                  sub_sequence (as_seq buf h$_1$) 0 r == (log c h$_0$).(ctr c h$_0$)))

Sending a Message.

Sending a message using send buf c requires buf to be an array of initialized bytes. The postcondition ensures that only the connection state is modified, at most fragment_size bytes are sent, the ctr is incremented by one, and the log is extended with the sent message.

Footnote 4: Of course, send and receive actually send or receive messages on the network, so they have more than just MST effect. However, for simplicity here, we model IO in terms of state, as detailed in §3.3.

val send : #n:nat -> buf:array byte n{all_init buf} -> c:sender{disjoint c buf} -> MST nat
  (ensures (fun h$_0$ sent h$_1$ -> modifies {c} h$_0$ h$_1$ /\ sent <= min n fragment_size /\ ctr c h$_1$ == ctr c h$_0$ + 1 /\
                         log c h$_1$ == snoc (log c h$_0$) (sub_sequence (as_seq buf h$_0$) 0 sent)))

Look Ma, No Stateful Invariants!

To reiterate the importance of monotonicity, note that a salient feature of our protocol interface is its lack of stateful preconditions and invariants. This means, for instance, that we may create and interleave the use of several connections without needing to worry about interference among the instances. In contrast, were our interface to use stateful invariants (e.g., that the counter is always bounded by the length of the log, among other properties), one would need to prove that the invariant is maintained through, or framed across, all state updates. Of course, one should not expect to always eliminate stateful invariants; however, monotonic state, when applicable, can greatly reduce the overhead of maintaining stateful invariants.

3.3. Implementing the Protocol Securely

There are many possible ways to securely implement our protocol interface. We chose perhaps the simplest option, assuming the sender and the receiver share a private source of randomness, and use one-time pads for perfectly secure encryption with a message authentication code (MAC) authenticating the cipher and a sequence number. However, more complex and broadly deployable alternatives have been proven secure using monotonic state in F*, including the main “authenticated encryption with additional data” (AEAD) constructions used in TLS 1.3 (Bhargavan et al., 2017b).

Connection.

A connection comes in two varieties: a pair (S rand entries) for the sender, or a triple (R rand entries ctr) for the receiver. Both sides share the same source of randomness (a stream of random fragments) and a log of entries; the receiver additionally has a monotonic counter to keep track of where in the stream it is currently reading from. Each entry in the log is itself a 4-tuple of: an index i into the randomness; the plain text m; the cipher text c, with an invariant claiming it to be computed from m via the one-time pad; and the message authentication code, mac. We supplement the log with an invariant that the i-th entry in the log has index i; this will be needed for proving the receiver correct.

type network_message = s:seq byte{len s == fragment_size}
type randomness = nat -> network_message
type entry (rand:randomness) =
  | Entry : i:nat -> m:message -> c:network_message{pad m $\oplus$ rand i == c} -> mac:tag i c -> entry rand
type entries rand = s:seq (entry rand){forall (i:nat{i < len s}). let Entry j _ _ _ = s.(i) in j == i}
type connection =
  | S : rand:randomness -> entries:mref (entries rand) is_a_prefix_of -> connection
  | R : rand:randomness -> entries:mref (entries rand) is_a_prefix_of
      -> ctr:mref (n:nat{witnessed (fun h -> n <= len h.[entries])}) increasing -> connection

Logs, Counters, and Snapshots.

The types above show several forms of interplay between monotonicity, stable predicates, and refinement types. The preorders is_a_prefix_of and increasing restrict how the log and the counter evolve. Within an Entry, the refinement on the cipher enforces an invariant on the data stored in the log. Perhaps most interestingly, the type of the counter mixes a refinement and a stable predicate: it states that the monotonic reference holding the counter contains n:nat that is guaranteed to be bounded by the number of entries in the current state. This combination encodes a form of monotonic dependence among multiple mutable locations, allowing the preorder of the counter to evolve as the log evolves, i.e., a log update allows the counter to be advanced further along its increasing preorder.

With our type definitions in place, we can implement the signature from the previous subsection.

let entries_of (S _ es | R _ es _) = es
let log c h = Seq.map (fun (Entry _ msg _ _) -> msg) h.[entries_of c] (* the plain texts sent so far *)
let ctr c h = match c with S _ es -> len h.[es] | R _ _ ctr -> h.[ctr] (* write at end; read from a prefix *)
let recall_counter c = match c with S _ _ -> () | R _ es ctr -> let n = !ctr in recall (fun h -> n <= len h.[es])
let snap c = let h$_0$ = ST.get () in recall_counter c; witness (snapshot c h$_0$)

Encrypting and Sending a Message Fragment.

To implement send buf c, we take a message of size at most ‘min n fragment_size’ from buf, pad it if needed, encrypt it using the current key from rand, and compute the MAC using the cipher and the current counter. We then add a new entry to the log, preparing to send the cipher and the MAC, and then call network_send c cipher mac, which expects the cipher and mac to be the same as in the last log entry.

Receiving, Authenticating and Decrypting a Message.

The receiver’s counter indicates the sequence number of the next expected fragment. To receive it, we parse the bytes received from the network into a pair of a cipher and a MAC. We then authenticate that this is indeed the i-th message sent using mac_verify, which, based on a cryptographic model of the MAC, guarantees that the log contains an appropriate entry for the cipher, sequence number, and MAC. We then decipher the message using the one-time pad, and the invariant on the entries (together with properties about pad, unpad, and $\oplus$) guarantees that the received message msg is indeed recorded in the log of entries. Before returning the number of bytes received, we must increment the counter. Manipulating the counter’s combination of refinement and stable predicate requires a bit of care: we have to first witness that the new value of the counter will remain bounded by the length of the log.

let receive #n buf c = let R rand entries ctr = c in let i = !ctr in
  match parse (network_receive c) with
  | None -> None
  | Some (cipher, mac) ->
    if mac_verify cipher mac i entries then (* guarantees that (Entry i _ cipher mac) is in entries *)
      let msg = unpad (cipher $\oplus$ rand i) in Array.fill buf msg;
      witness (fun h -> i + 1 <= len h.[entries]);  ctr := i + 1; Some (len msg)
    else None
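The send and receive paths can be exercised end to end with a toy model. The Python sketch below is not the verified implementation: it uses HMAC-SHA256 from the standard library as a stand-in MAC, generates each one-time pad lazily on send instead of sharing a function from indices to pads, lets one Channel object play both participants, and pads with a leading length byte (so network blocks are FRAGMENT_SIZE + 1 bytes). All of these choices, and the names Channel, pad, unpad and xor, are assumptions of the example.

```python
import hashlib
import hmac
import secrets
from typing import List, Optional, Tuple

FRAGMENT_SIZE = 16
BLOCK = FRAGMENT_SIZE + 1          # one length byte plus the padded payload

def pad(msg: bytes) -> bytes:
    if len(msg) > FRAGMENT_SIZE:
        raise ValueError("message exceeds the fragment size")
    return bytes([len(msg)]) + msg + bytes(FRAGMENT_SIZE - len(msg))

def unpad(block: bytes) -> bytes:
    return block[1:1 + block[0]]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

class Channel:
    def __init__(self) -> None:
        self.mac_key = secrets.token_bytes(32)
        self.rand: List[bytes] = []        # shared one-time pads, by index
        # Ghost log of entries: (index, plain text, cipher, tag).
        self.entries: List[Tuple[int, bytes, bytes, bytes]] = []
        self.recv_ctr = 0                  # receiver's monotonic counter

    def _tag(self, i: int, cipher: bytes) -> bytes:
        # The MAC covers the sequence number and the cipher.
        data = i.to_bytes(8, "big") + cipher
        return hmac.new(self.mac_key, data, hashlib.sha256).digest()

    def send(self, msg: bytes) -> Tuple[bytes, bytes]:
        i = len(self.entries)              # the sender writes at the end
        self.rand.append(secrets.token_bytes(BLOCK))
        cipher = xor(pad(msg), self.rand[i])
        tag = self._tag(i, cipher)
        self.entries.append((i, msg, cipher, tag))
        return cipher, tag

    def receive(self, cipher: bytes, tag: bytes) -> Optional[bytes]:
        i = self.recv_ctr
        if i >= len(self.rand):
            return None                    # nothing has been sent at index i
        if not hmac.compare_digest(tag, self._tag(i, cipher)):
            return None                    # wrong index, or tampered data
        msg = unpad(xor(cipher, self.rand[i]))
        self.recv_ctr = i + 1              # advance only after authentication
        return msg
```

Because the tag binds the cipher to the sequence number, delivering packets out of order or replaying an old packet makes receive return None and leaves the counter unchanged.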

3.4. Correctness, Authenticity, and Secrecy of File Transfer

Using this protocol, we program and verify the top-level applications for sending and receiving entire files. Our goal is to show that the sender application successfully uses the protocol to fragment and send a file and that the receiver reconstructs exactly that file, i.e., file transfer is correct and authentic. Additionally, we prove that file transfer is confidential, i.e., under a suitable cryptographic model, a network adversary gains no information about the transferred file.

We first specify what it means to send a file correctly in terms of the abstractions provided by the underlying protocol. We use ‘sent_bytes f c from to’ to indicate that the protocol’s log contains exactly the contents of the file f starting at position from until to. The stable property sent f c then expresses that the file f was sent on connection c at some point in the past.

let sent_bytes (file:seq byte) (c:connection) (from:nat) (to:nat{from <= to}) (h:heap) =
  let log = log c h in to <= len log /\ file == flatten (sub_sequence log from to)
let sent (file:seq byte) (c:connection) = $\exists$ from to. witnessed (sent_bytes file c from to)

The sender’s top-level function calls into the protocol repeatedly sending message-sized chunks of the file, until no more bytes remain. Its specification ensures the file was indeed sent.

let send_file (#n:nat) (file:array byte n{all_init file}) (c:sender{disjoint c file}) : MST unit
  (ensures  (fun h$_0$ _ h$_1$ -> modifies {c} h$_0$ h$_1$ /\ sent (as_seq file h$_0$) c))
  = let rec aux (from:nat) (pos:nat{pos <= n}) : MST unit
     (requires (fun h$_0$ -> from <= ctr c h$_0$ /\ sent_bytes (sub_sequence (as_seq file h$_0$) 0 pos) c from (ctr c h$_0$) h$_0$))
     (ensures  (fun h$_0$ _ h$_1$ -> modifies {c} h$_0$ h$_1$ /\ from <= ctr c h$_1$ /\
                           sent_bytes (as_seq file h$_0$) c from (ctr c h$_1$) h$_1$))
      = if pos <> n then let sub_file = suffix file pos in let sent = send sub_file c in aux from (pos + sent) in
    let h$_0$ = ST.get () in aux (ctr c h$_0$) 0;
    let h$_1$ = ST.get () in witness (sent_bytes (as_seq file h$_0$) c (ctr c h$_0$) (ctr c h$_1$))
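The relationship between fragmentation and sent_bytes can be checked on concrete data. The Python sketch below is an illustration only: the log is a plain list of byte strings, and fragment always emits full-size chunks, whereas the send operation above may transmit fewer bytes than requested. The function names mirror the text, but their signatures are assumptions.

```python
from typing import List

def fragment(data: bytes, fragment_size: int) -> List[bytes]:
    """Split data into chunks of at most fragment_size bytes."""
    if fragment_size <= 0:
        raise ValueError("fragment_size must be positive")
    return [data[p:p + fragment_size]
            for p in range(0, len(data), fragment_size)]

def flatten(msgs: List[bytes]) -> bytes:
    """Concatenate message fragments back into one byte string."""
    return b"".join(msgs)

def sent_bytes(file: bytes, log: List[bytes], start: int, end: int) -> bool:
    """Does log[start:end] contain exactly the contents of file?"""
    return start <= end <= len(log) and file == flatten(log[start:end])
```

Once sent_bytes holds for some positions start and end, appending further messages to the log cannot invalidate it, which is why the property can be witnessed.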

The receiver’s top-level application is dual and similar to the sender. In the receiver’s function ‘receive_file file c’, file starts off as a potentially uninitialized buffer, but the postcondition of receive_file guarantees that on a successful run, the file is partially filled with messages from a file that was previously sent on the same connection, i.e., file transfer is authentic.

let received (#n:nat) (file:array byte n) (c:receiver) (h:heap) = file `initialized_in` h /\ sent (as_seq file h) c
val receive_file (#n:nat) (file:array byte n) (c:receiver{disjoint c file}) : MST (option nat)
   (ensures (fun h$_0$ ropt h$_1$ -> modifies {file, c} h$_0$ h$_1$ /\ (match ropt with None -> True
                                                  | Some r -> r <= n /\ received (prefix file r) c h$_1$)))

Confidentiality.

To prove that the file transfer is confidential, we relate the ciphers and tags sent on the network by two runs of the sender application and prove that for arbitrary files that contain the same number of messages, the network traffic is (probabilistically) indistinguishable. As such, our file-transfer application is only partially length hiding—the adversary learns lower and upper bounds on the size of the file based on the number of messages it contains. Our proof relies on a form of probabilistic coupling (Lindvall, 2002), relating the randomness used in two runs of the sender by a bijection chosen to mask the differences between the two files. The proof technique used is independent of monotonicity and, as such, is orthogonal to this paper.

4. Ariadne: State continuity vs. hardware crashes

We present a second case study of monotonic state at work, this time verifying a protocol whose very purpose is to ensure a form of monotonicity. We may improve the resilience of systems confronted by unexpected failures (e.g., power loss) by having them persist their state periodically, and by restoring their state upon recovery. In this context, state continuity ensures that the state is restored from a recent backup, consistent with the observable effects of the system, rather than from a stale or fake backup that an attacker could attempt to replay or to forge.

For concreteness, we consider state continuity for Intel SGX enclaves. Such enclaves rely on a special CPU mode to protect some well-identified piece of code from the rest of the platform, notably from its host operating system. As long as the CPU is powered, the hardware automatically encrypts and authenticates every memory access, which ensures state confidentiality and integrity for the protected computation. For secure databases and many other applications, these guarantees must extend across platform crashes so as to ensure, for instance, that any commit that has been reported as complete (e.g., by witnessing its result) remains committed once the system resumes.

Several recent papers address this problem (Parno et al., 2011; Matetic et al., 2017; Strackx and Piessens, 2016). In this section, we present a first mechanized proof of Strackx and Piessens’s Ariadne protocol using F*, relying on our libraries for monotonic state. The proof essentially consists of supplementing a hardware counter with a ghost state machine to track the recovery process (naturally expressed as a preorder) and then typechecking the recovery code. This example also illustrates the fairly natural combination of monotonic state with other effects like exceptions.

The Ariadne Protocol.

The protocol relies on a single, hardware-based, non-volatile monotonic counter. Incrementing the counter is a privileged but unreliable operation. (It is also costly, hence the need to minimize increments.) If the operation returns, then the counter is incremented; if it crashes, the counter may or may not have been incremented. We model this behavior using MSTExn, a combination of the MST monad and exceptions. The result of an MSTExn computation is either a normal result V v or an exception E e. In practice, our protocol code never throws or catches exceptions—these operations are available to the context only to model malicious hosts.

val incr: c:counter -> MSTExn unit
 (ensures fun h$_0$ r h$_1$ -> if r=V() then ctr c h$_1$ == ctr c h$_0$ + 1 else (ctr c h$_1$ == ctr c h$_0$ \/ ctr c h$_1$ == ctr c h$_0$ + 1))

The protocol also relies on a persistent but untrusted backup, modelled as a host function save. Before saving the backup, the current state is encrypted and authenticated together with a sequence number corresponding to the anticipated next value of the counter. The implementation of this authenticated encryption construction (using the auth_encrypt function below) is similar to what is presented in §3.3; it makes use of a hardware-based key available only to the protected code. Conversely, the host presents an encrypted backup for recovery. Recovery from the last backup always succeeds, although it may be delayed by further crashes. Recovery from any other backup may fail, but must not break state-continuity; we focus on verifying this safety property of Ariadne.

Below we give the code for creating an Ariadne-protected reference cell and the two main operations for using it: store and recover. Their pre- and postconditions are given later, but at a high level, store requires a good state and recover does not. (There is no separate load from a good state, since the enclave can keep the state in volatile memory once it completes recovery.) For simplicity, we model hardware protection using a private datatype constructor: ‘Protect c k’ has type protected (defined shortly) and packages the enclave capabilities to use the monotonic counter c and backup key k. The recover function also takes as argument a (purportedly) last_saved backup, which is first authenticated and decrypted: if the decryption yields (m:nat,w:state) and m matches the current counter !c, then recovery continues with state w; otherwise it returns None.

let create (w:state) = let c = ref 0 in let k = keygen c in save (auth_encrypt k 0 w); Protect c k
let store (Protect c k) (w:state) = save (auth_encrypt k (!c+1) w); (*1*) incr c
let recover (Protect c k) last_saved =
  match auth_decrypt k !c last_saved with | None -> None | Some (w:state) ->
  save (auth_encrypt k (!c+1) w); (*2*) incr c; (*3*) save (auth_encrypt k (!c+1) w); (*4*) incr c; Some w

In this code, any particular call to save or incr may fail. To update the state, ‘store’ first encrypts and saves a backup of its new state (associated with the anticipated next value of the counter) and then increments the counter. This ordering of save and incr is necessary to enable recovery if a crash occurs at point (*1*) between these two operations, but may provide the host with several valid backups for recovery. For instance, a malicious host may obtain encryptions of both v and w at both !c and !c+1 by causing failure at (*1*); recovering from the older backup; causing failure at (*4*); then recovering and causing failure at (*2*). This is fine as long as the recovery process eventually commits to either v or w before returning it. To this end, recovery actually performs two successive counter increments to clear up any ambiguity, and to ensure there is a unique backup at the current counter and no backup at the next counter. (On the other hand, completing recovery after its first increment would break continuity: with backups at the current counter for both v and w, the host has “forked” the enclave and can indefinitely get updated backups for both computations.) To capture this argument on the intermediate steps of the protocol, we supplement the real state of the counter (modeled here as an integer) with one out of 4 ghost cases, listed below.
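The forking scenario described above can be replayed in a small executable model. The following Python sketch is illustrative only and is not part of the verified F* development: the names Enclave, Crash, attempt, fork_attempt, and the crash_at parameter are hypothetical, and authenticated encryption is modeled (as an assumption) by plain (seqn, state) pairs that the host can replay but never forge.

```python
class Crash(Exception):
    """A failure injected at one of the points (*1*)..(*4*)."""


class Enclave:
    def __init__(self, w):
        self.c = 0               # monotonic counter; it only ever increases
        self.host = [(0, w)]     # every backup the untrusted host has been handed

    def _save(self, seqn, w):
        # Stands in for save (auth_encrypt k seqn w): the host may replay
        # any recorded pair but is assumed unable to forge a new one.
        self.host.append((seqn, w))

    def _point(self, label, crash_at):
        if label == crash_at:
            raise Crash(label)

    def store(self, w, crash_at=None):
        self._save(self.c + 1, w)
        self._point(1, crash_at)
        self.c += 1

    def recover(self, last_saved, crash_at=None):
        seqn, w = last_saved
        if seqn != self.c:       # stands in for a failed auth_decrypt
            return None
        self._save(self.c + 1, w)
        self._point(2, crash_at)
        self.c += 1
        self._point(3, crash_at)
        self._save(self.c + 1, w)
        self._point(4, crash_at)
        self.c += 1
        return w

    def backups_at(self, seqn):
        return {w for (n, w) in self.host if n == seqn}


def attempt(action, *args, **kwargs):
    """Run an enclave operation, swallowing an injected crash."""
    try:
        return action(*args, **kwargs)
    except Crash:
        return None


def fork_attempt():
    e = Enclave("v")
    attempt(e.store, "w", crash_at=1)          # crash between save and incr
    attempt(e.recover, (0, "v"), crash_at=4)   # recover from the older backup
    attempt(e.recover, (1, "w"), crash_at=2)   # recover again, crash early
    ambiguous = (e.backups_at(e.c), e.backups_at(e.c + 1))
    result = e.recover((1, "w"))               # an uninterrupted recovery
    return e, ambiguous, result
```

In this model, the host holds backups of both v and w at the current and the next counter value just before the final call, yet once an uninterrupted recover returns, only one backup remains valid at the current counter and none at the next, which is the disambiguating effect of the two successive increments.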

type case =
  | Ok:        saved:state -> case (* clean state at the end of store or recover *)
  | Recover:    read:state -> other:state -> case (* at step (3) above *)
  | Writing: written:state -> old:state -> case (* at steps (1) and (4) above *)
  | Crash:      read:state -> other:state -> case (* worst case at step (2) of recovery, outlined above *)

Next, we give the specification of the protocol code, with ‘ghost’ selecting the current ghost case of a counter, and g in_between (v,w) stating that the ghost case g holds at most v and w.

val create: w:state -> MSTExn protected (ensures fun h$_0$ r h$_1$ -> V? r ==> ghost h$_1$ (V?.v r) == Ok w)
val store: p:protected -> w:state -> MSTExn unit (requires fun h$_0$ -> Ok? (ghost h$_0$ p))
  (ensures fun h$_0$ r h$_1$ -> match r with
                     | V () -> ghost h$_1$ p == Ok w
                     | E _ -> ghost h$_1$ p in_between (Ok?.saved (ghost h$_0$ p),w))
val recover: p:protected -> backup (Protect?.c p) -> MSTExn (option state)
 (ensures fun h$_0$ r h$_1$ -> exists v w. ghost h$_0$ p in_between (v,w) /\ match r with
                                                       | V (Some u) -> ghost h$_1$ p == Ok w /\ u == w
                                                       | _          -> ghost h$_1$ p in_between (v,w))

To verify this specification, we introduce a ‘saved’ predicate that controls, in any given counter state (n,g), which backups (seqn,u) may have been encrypted and saved, and thus may be presented by the host for recovery. We use this predicate both to define our preorder on counter states, and to refine the type nat*state of general authenticated-encrypted values so as to define the type backup c of authenticated, encrypted, and saved backups associated with a given counter state c.

let saved (n:nat,g:case) (seqn:nat,u:state) = (* overapproximating what may have been encrypted and saved *)
  (seqn < n) \/ (* an old state; authentication of seqn will fail, so nothing to say about it *)
  (seqn == n   /\ (match g with | Ok v -> u == v
                              | Recover w v | Writing w v | Crash w v -> u == w \/ u == v)) \/
  (seqn == n+1 /\ (match g with | Ok _ | Recover _ _ -> False
                              | Writing v _ -> u == v
                              | Crash v w -> u == v \/ u == w))
let preorder (n$_0$,g$_0$) (n$_1$,g$_1$) = forall s. saved (n$_0$,g$_0$) s ==> saved (n$_1$,g$_1$) s
let saved_backup c s h = h contains c /\ saved (h.[c]) s
type backup c = s:(nat*state){witnessed (saved_backup c s)}
(* append-only ghost logs used to model the history of saving backups; we attach one to every backup key k *)
type log c = mref (list (backup c)) (fun l$_0$ l$_1$ -> l$_0$ prefix_of l$_1$)
type protected = Protect: c:mref (nat*case) preorder -> k:key c -> protected

Our method for verifying the protocol is to use monotonicity to capture the history of saving backups, by augmenting the save function with witness (saved_backup c s) for the given authenticated-encrypted backup s, and the recover function with recall (saved_backup c last_saved) just after authenticated decryption of the (purportedly) last_saved backup the host provides for recovery. This allows us to determine at which stage of the protocol last_saved might have been created. We also instrument key points in the code with erasable modeling functions that operate on the ghost counter state. Typechecking enforces that all updates respect the preorder that ties the ghost state to the actual counter value. For instance, incr has two ghost transitions: (n, Crash w v) ⇝ (n+1, Recover w v) for typing call (*2*) above, and (n, Writing w v) ⇝ (n+1, Ok w) for typing calls (*1*) and (*4*). The other transitions update just the ghost case of the counter state. For instance, the only transition that introduces a new w:state is (n, Ok v) ⇝ (n, Writing w v), used in the store function before the call to encrypt the new backup. This yields a concise F* proof of the safety of Ariadne’s state continuity, possibly simpler than the original paper proof of Strackx and Piessens.
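To make the preorder concrete, the saved predicate and the induced preorder can be mirrored as ordinary boolean functions and the ghost transitions mentioned above checked by enumeration. This is a minimal sketch under stated assumptions, not the F* proof: ghost cases are encoded as tagged tuples, the type state is replaced by the two-element stand-in STATES, and the universal quantifier in preorder is approximated by a bounded search up to the hypothetical parameter max_seqn.

```python
from itertools import product

STATES = ("v", "w")     # a two-element stand-in for the type `state` (assumption)


def saved(counter_state, backup):
    """Mirror of the `saved` predicate: may `backup` have been saved?"""
    n, g = counter_state          # g is ("Ok", v) or (tag, a, b)
    seqn, u = backup
    tag, fields = g[0], g[1:]
    if seqn < n:
        return True               # old backup: authentication will fail anyway
    if seqn == n:
        return u in fields        # Ok v: u == v; other cases: u is one of the two
    if seqn == n + 1:
        if tag == "Writing":
            return u == fields[0]
        if tag == "Crash":
            return u in fields
    return False                  # Ok/Recover at n+1, or any seqn beyond n+1


def preorder(s0, s1, max_seqn=8):
    """Bounded check of: forall s. saved s0 s ==> saved s1 s."""
    return all(saved(s1, b)
               for b in product(range(max_seqn), STATES)
               if saved(s0, b))


# The three ghost transitions described in the text, at counter value n = 2.
allowed = [
    ((2, ("Crash", "w", "v")), (3, ("Recover", "w", "v"))),
    ((2, ("Writing", "w", "v")), (3, ("Ok", "w"))),
    ((2, ("Ok", "v")), (2, ("Writing", "w", "v"))),
]
```

Under this encoding, each pair in allowed satisfies preorder, whereas moving from (2, Writing w v) back to (2, Ok v), or decrementing the counter, does not: both would invalidate a backup that may already have been handed to the host.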

5. Meta-theory of the abstract monotonic-state monad

While the applications of monotonic state are many, diverse, and sometimes quite complex, in the previous sections we showed how they can all be reduced to our general monotonic-state monad from § 2.2 (MST in Figure 1 on page 1). In this section and the next, we put reasoning with this monotonic-state monad on solid foundations by presenting its meta-theory in two stages.

We first present a calculus that validates the soundness of witness and recall in a setting in which MST is treated abstractly. In fact, this calculus is a fragment of a larger one, which we study in the next section (§ 6), and which extends it with support for safe, controlled monadic reflection and reification for revealing the representation of MST. Both calculi are designed to capture only the essence of reasoning with the monotonic-state monad; thus they do not include other advanced features of dependently typed languages, e.g., full dependency, refinement types, inductives, and universes. The complete definitions and proofs are available in the supplementary materials at https://fstar-lang.org/papers/monotonicity.

5.1. A Dependently Typed Lambda Calculus with Support for witness and recall

The syntax of the calculus is inspired by Levy’s fine-grain call-by-value language (Levy, 2004) in that it makes a clear distinction between values and computations. This set-up enables us to focus on stateful computations and validating their correctness, in particular for the witness and recall actions.

First, value types and computation types are given by the following grammar:

Here, the type of states is a fixed abstract type, and function types are value-dependent: the computation type of the result may depend on the value of the argument. Similarly to Hoare Type Theory and F* surface syntax, computation types are indexed by pre- and postconditions: the former are given over the initial state, and the latter over the initial state, the return value, and the final state.

Next, value terms and computation terms are given by the following grammar:

Value terms are mostly standard, with denoting a fixed non-empty set of -valued constant symbols. As a convention, we use , , … to stand for value terms of type .

Computation terms include returning values, sequential composition, function applications, and pattern matching for products and sums. In addition, computations include get and put actions for accessing the state and, finally, witness and recall actions parameterized by a first-order logic predicate on states.

Formulas used for the pre- and postconditions, as well as in witness and recall, are drawn from a classical first-order logic extended with (1) a fixed preorder on states, (2) typed equality on value terms, and (3) logical witnessed capabilities stating that a predicate was true in some prior program state and can be recalled in any reachable state. Predicates on states are defined by binding a state-typed variable in a formula.

We take formulas to be in first-order logic (as opposed to Type-valued functions in F*) in order to keep the meta-theory simple and focused on stateful computations. Furthermore, this approach enables us to easily establish proof-theoretic properties of the logical witnessed capabilities via a corresponding sequent calculus presentation (see § 5.5). We use a classical logic to be faithful to F*’s SMT-logic of pre- and postconditions shown in our examples.

5.2. Static Semantics

We define the type system of the calculus using judgments for well-formed contexts, well-formed logical formulas, well-formed value types, well-typed value terms, well-formed computation types, and well-typed computation terms. The rules defining the first five judgments are completely standard, so we omit most of them and only give the formation rule for computation types:

The typing rules for computation terms are more interesting, and we present a selection of them in Figure 2. The rules for the actions correspond directly to the typed interface from Figure 1 on page 1. Similarly to F*, the calculus supports subtyping for value types and computation types. Subtyping computation types (rule SubMST from Figure 2) is covariant in the postconditions, and contravariant in the preconditions. Moreover, logical entailment between the postconditions is proved assuming that the preconditions hold, and that the initial and final states are related by the preorder on states (recall that the state is only allowed to evolve according to this preorder).

Figure 2. Selected typing and subtyping rules for

We define logical entailment for formulas using a natural deduction system whose hypotheses form a finite set of formulas. Most rules are standard for classical first-order logic, so Figure 3 only lists the rules that are specific to our setting. These include weakening for witnessed capabilities, reflexivity and transport for equality, reflexivity and transitivity for the preorder on states, and rules about the equality of pair and sum values. In more expressive logics, the latter rules are commonly derivable.

Figure 3. Natural deduction (selected rules)

5.3. Instrumented Operational Semantics

We equip the calculus with a small-step operational semantics that we instrument with a log of witnessed stable properties, which we use as additional logical assumptions in the correctness theorem we prove in § 5.4, so as to accommodate the witness and recall actions, and their typing. Formally, we define the reduction relation on configurations that consist of the computation term being reduced, the current state value, and a finite set logging the stable predicates witnessed so far. Most reduction rules are standard, so we only list the rules for actions below:

The witness action adds the witnessed stable predicate to the log, while get, put, and recall do not use the log at all. The get action returns the current state; the put action overwrites it.
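The role of the log can be illustrated with a small dynamic model of configurations. The sketch below is hypothetical and deliberately simplified: the class MonotonicState and its method well_formed are illustrative names, and the side conditions that the calculus discharges statically through typing (that put follows the preorder, and that a witnessed predicate holds and is stable) are replaced here by runtime checks or left as an obligation on the caller.

```python
class MonotonicState:
    """A state paired with a log of witnessed predicates."""

    def __init__(self, initial, rel):
        self.state = initial    # the current state
        self.rel = rel          # the preorder on states, as a boolean function
        self.log = []           # predicates witnessed so far

    def get(self):
        return self.state       # does not consult the log

    def put(self, new_state):
        # Statically enforced in the calculus; checked at runtime here.
        if not self.rel(self.state, new_state):
            raise ValueError("put must follow the preorder")
        self.state = new_state  # does not consult the log

    def witness(self, pred):
        # Assumption: the caller only passes predicates that are stable
        # with respect to self.rel; stability is not checked here.
        if not pred(self.state):
            raise ValueError("cannot witness a predicate that does not hold")
        self.log.append(pred)   # the only action that extends the log

    def recall(self, pred):
        # The log stands in for the logical capability required by recall.
        if pred not in self.log:
            raise ValueError("predicate was never witnessed")
        return pred(self.state)

    def well_formed(self):
        """Every logged predicate holds of the current state."""
        return all(p(self.state) for p in self.log)


def is_positive(n):
    return n > 0                # stable under <=: once positive, always positive


counter = MonotonicState(0, lambda a, b: a <= b)
counter.put(1)
counter.witness(is_positive)
counter.put(5)
still_positive = counter.recall(is_positive)
```

Because the state only moves along the preorder and is_positive is stable under it, the recalled predicate still holds after the second put, and the state-log pair remains well formed throughout.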

5.4. Correctness

We now prove the correctness of our instrumented operational semantics in a Hoare-style program logic sense. In particular, we show that if a well-typed computation reduces to a returned value, and its precondition holds of the initial state, then its postcondition holds of the initial state, the returned value, and the final state. We also establish that the initial state is related to the final state, and that the logs can only ever increase. In order to better structure our proofs, we split them into progress and preservation theorems, which when combined give the above-mentioned result in Theorem 5.4.

A key ingredient of the following correctness results is the use of the instrumentation (log of witnessed stable predicates) to provide additional logical assumptions corresponding to the logical witnessed capabilities resulting from earlier witness actions. In detail, these additional logical assumptions take the form . In the results below, we also use well-formed state-log pairs , defined to hold iff and

Theorem 5.1 (Progress).

If then either or .

Preservation crucially uses the well-formedness of to record that all previously witnessed stable predicates are in fact true of the current state, in combination with the assumptions which ensure that, once obtained, logical capabilities remain usable in the future.

Theorem 5.2 (Preservation).

If and such that and , then

  1. and and and

  2. and and and .

Proof.

By induction on the sum of the height of the derivation of and the size of the computation term , and by inverting the judgment for each concrete . Below we comment briefly on the more interesting cases of this proof.

Put: In this case, , , and , and we prove by combining the assumption with that we get by inverting the typing of . We then derive from by using the stability of the witnessed predicates, proving for all .

Witness: In this case, , , , and , and we prove by using the Witnessed-Weaken rule from Figure 3.

Recall: In this case, , , , and , and we are required to prove . We do so by combining the assumption with the proof of , which follows from the assumed precondition ; this is a proof-theoretic property of the logic, see § 5.5 for details. ∎

Finally, we combine the progress and preservation results to prove partial correctness for the calculus.

Proposition 5.3 (Correctness of ).

If and , then and .

Theorem 5.4 (Partial correctness).

If and we have a reduction sequence such that and , then and , and .

We strengthen this to total correctness by also showing that the calculus is strongly normalizing.

Theorem 5.5 (Strong normalization).

If and , then is strongly normalizing in .

Proof.

The proof is based on defining a typing- and reduction-structure-preserving translation of types, terms, and configurations to a corresponding strongly normalizing simply typed calculus (by erasing type dependency, logical formulas, and logs). To simplify the proof, the simply typed calculus includes some computationally irrelevant computation terms. We establish the strong normalization of this simply typed calculus using the standard ⊤⊤-lifting approach (Lindley and Stark, 2005). ∎

5.5. Proof-theoretic Properties of the Logic of Pre- and Postconditions

We conclude our investigation into the meta-theory of by recalling that in the proof of Theorem 5.2, it was crucial (in the Recall case) to construct a derivation of from a derivation of . Intuitively, this proof-theoretic property of the logic means that must be a logical consequence of because could only have been proved using the natural deduction hypothesis and Witnessed-Weaken rules.

It is well known that establishing such properties directly in natural deduction is difficult due to the elimination rule for implication (modus ponens). We follow standard practice and turn to sequent calculus, which we define using a judgment relating two finite sets of formulas. Most rules are standard for classical sequent calculus, so Figure 4 only lists the more relevant ones. Importantly, the rules of this sequent calculus do not include cut; we instead prove that it is admissible.

Theorem 5.6 (Admissibility of cut).

The cut rule is admissible in this sequent calculus.

In order to accommodate the rules concerning equality and the preorder on states from Figure 3, while at the same time ensuring that cut remains admissible, we follow Negri and von Plato (1998) in defining the corresponding rules in the sequent calculus such that they only modify the left-hand side of the entailment judgment (see the equality rules in Figure 4). Also following Negri and von Plato, we restrict the Eq-Transport-SC rule to atomic predicates (the general rule is admissible).