kmclib: Automated Inference and Verification of Session Types

11/23/2021
by   Keigo Imai, et al.

Theories and tools based on multiparty session types offer correctness guarantees for concurrent programs that communicate using message-passing. These guarantees usually come at the cost of an intrinsically top-down approach, which requires the communication behaviour of the entire program to be specified as a global type. This paper introduces kmclib: an OCaml library that supports the development of correct message-passing programs without having to write any types. The library utilises the meta-programming facilities of OCaml to automatically infer the session types of concurrent programs and verify their compatibility (k-MC). Well-typed programs, written with kmclib, do not lead to communication errors and cannot get stuck.

1 Introduction

Multiparty session types (MPST) [HondaYC08] are a popular type-driven technique to ensure the correctness of concurrent programs that communicate using message-passing. The key benefit of MPST is to guarantee statically that the components of a program have compatible behaviours, and thus that no component can get permanently stuck. Many implementations of MPST in different programming languages have been proposed in the last decade [NgCY15, HuY16, KouzapasDPG16, ScalasDHY17, NHYA2018, CastroHJNY19, ImaiNYY19, miuGenerating2020, harveyMultiparty2021, DBLP:journals/pacmpl/00020HNY20]; however, all suffer from a notable shortcoming: they require programmers to adopt a top-down approach that does not fit well with modern development practices. When changes are frequent and continual (e.g., continuous delivery), re-designing the program and its specification at every change is not feasible.

Most MPST theories and tools advocate an intrinsically top-down approach. They require programmers to specify the communication behaviour of their programs (often in the form of a global type) before they can be type-checked. In practice, type-checking programs against session types is very difficult. To circumvent the problem, most implementations of MPST rely on external tooling that generates code from a global type; see, e.g., the works based on the Scribble toolchain [YHNN2013].

In this paper, we present an OCaml library, called kmclib, that supports the development of programs which enjoy all the benefits of MPST while avoiding their main drawbacks. The kmclib library guarantees that threads in well-typed programs will not get stuck. The library also enables bottom-up development: programmers write message-passing programs in a natural way, without having to write session types. Our library is built on top of Multicore OCaml [DBLP:journals/pacmpl/Sivaramakrishnan20], which offers highly scalable and efficient concurrent programming but does not provide any static guarantees with respect to concurrency.

Figure 1 gives an overview of kmclib. Its implementation combines the power of the type-aware macro system of OCaml (Typed PPX) with two recent advances in the session types area: an encoding of MPST in OCaml (channel vector types [ImaiNYY19]) and a session type compatibility checker (k-MC checker [LangeY19]). To our knowledge, this is the first implementation of type inference for MPST and the first integration of compatibility checking in a programming language.

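[Diagram: OCaml code is processed by the Typed PPX and the OCaml compiler to produce a session-typed executable. The Typed PPX infers OCaml channel vector types [ImaiNYY19], which are translated to session types and passed to the k-MC checker [LangeY19]; the checker answers either "ok: bound" or "ko: counterexamples".]
Figure 1: Workflow of the kmclib library.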

The kmclib library offers several advantages compared to earlier MPST implementations. (1) It is flexible: programmers can implement communication patterns (e.g., fire-and-forget patterns [LangeY19]) that are not expressible in the synchrony-oriented syntax of global types. (2) It is lightweight as it piggybacks on OCaml’s type system to check and infer session types, hence lifting the burden of writing session types off the programmers. (3) It is user-friendly thanks to its integration in Visual Studio Code, e.g., compatibility violations are mapped to precise locations in the code. (4) It is well-integrated into the natural edit-compile-run cycle. Although compatibility is checked by an external tool, this step is embedded as a compilation step and thus hidden from the user.

2 Safe Concurrent Programming in Multicore OCaml

We give an overview of the features and usage of kmclib using the program in Figure 2 (top) which calculates Fibonacci numbers. The program consists of three concurrent threads (user, master, and worker) that interact using point-to-point message-passing. Initially, the user thread sends a request to the master to start the calculation, then waits for the master to return a work-in-progress message, or the final result. After receiving the result, the user sends back a stop message. Upon receiving a new request, the master splits the initial computation in two and sends two tasks to a worker. For each task that the worker receives, it replies with a result. The master and worker threads are recursive and terminate only upon receiving a stop message.

Figure 2 (bottom) gives a session type for each thread, i.e., the behaviour of each thread with respect to communication. For clarity, we represent session types as communicating finite state machines (CFSMs [cfsm83]), where ! (resp. ?) denotes sending (resp. receiving). For example, in the machine of the user, m!compute means that the user sends a message compute to the master, while in the machine of the master, u?compute says that the master receives compute from the user. Our library infers these CFSM representations from the OCaml code in Figure 2 (top), and verifies statically that the three threads are compatible, hence no thread can get stuck due to communication errors. If compatibility cannot be guaranteed, the compiler reports the kind of violation (i.e., progress or eventual reception error) and its location in the code. Figure 3 shows how such semantic errors are reported visually in Visual Studio Code.
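
To make the CFSM reading concrete, the following is a minimal sketch of how the user's machine could be written down as plain OCaml data. The type and field names (dir, transition, show) and the state numbering are assumptions made for this illustration; they are not part of kmclib, and the machine is transcribed by hand from the prose description of the user thread.

```ocaml
(* Hypothetical data model for a CFSM; none of these names come from kmclib. *)
type dir = Send | Recv

type transition = {
  src : int;       (* source state *)
  peer : string;   (* the other role involved in the action *)
  dir : dir;       (* Send is written !, Recv is written ? *)
  label : string;  (* message label *)
  dst : int;       (* target state *)
}

(* Render the action of a transition, e.g. "m!compute" or "m?wip". *)
let show t =
  t.peer ^ (match t.dir with Send -> "!" | Recv -> "?") ^ t.label

(* The user thread as described in the text: send compute, loop on wip
   until result arrives, then send stop. State numbers are arbitrary. *)
let user = [
  { src = 0; peer = "m"; dir = Send; label = "compute"; dst = 1 };
  { src = 1; peer = "m"; dir = Recv; label = "wip";     dst = 1 };
  { src = 1; peer = "m"; dir = Recv; label = "result";  dst = 2 };
  { src = 2; peer = "m"; dir = Send; label = "stop";    dst = 3 };
]

let () =
  List.iter
    (fun t -> Printf.printf "%d --%s--> %d\n" t.src (show t) t.dst)
    user
```

The self-loop on state 1 captures the recursive loop function of the user: any number of wip messages may be received before the result.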

Albeit simple, the common communication pattern used in Figure 2 cannot be expressed as a global type, and thus cannot be implemented in previous MPST implementations. Concretely, global types cannot express the intrinsically asynchronous interactions between the master and worker threads (i.e., the master may send a second task message while the worker sends a result).

    (* Create one channel per role; the session types are inferred from usage. *)
    let KMC (uch,mch,wch) = [%kmc.gen (u,m,w)]

    let user () =
      let uch = send uch#m#compute 42 in
      let rec loop uch : unit =
        match receive uch#m with
        | `wip(res, uch) ->
          printf "in progress: %d\n" res;
          loop uch
        | `result(res, uch) ->
          printf "result: %d\n" res;
          send uch#m#stop ()
      in loop uch

    (* fib : int -> int is assumed to be defined elsewhere. *)
    let worker () =
      let rec loop wch : unit =
        match receive wch#m with
        | `task(num, wch) ->
          loop (send wch#m#result (fib num))
        | `stop((), wch) -> wch
      in loop wch

    let master () =
      (* The annotation [%kmc.check u] is optional; it is used for error reporting. *)
      let rec loop (mch : [%kmc.check u]) : unit =
        match receive mch#u with
        | `compute(x, mch) ->
          let mch = send mch#w#task (x - 2) in
          let mch = send mch#w#task (x - 1) in
          let `result(r1, mch) = receive mch#w in
          let mch = send mch#u#wip r1 in
          let `result(r2, mch) = receive mch#w in
          loop (send mch#u#result (r1 + r2))
        | `stop((), mch) ->
          send mch#w#stop ()
      in loop mch

    let () =
      let ut = Thread.create user () in
      let mt = Thread.create master () in
      let wt = Thread.create worker () in
      List.iter Thread.join [ut;mt;wt]

[Inferred CFSMs for roles u, m, and w]

Figure 2: Example of kmclib program (top) and inferred session types (bottom).

Programming with kmclib.  To enable safe message-passing programs, kmclib provides two communication primitives, send and receive, and two primitives for channel creation (KMC and %kmc.gen). We only give a user-oriented description of these primitives here (see Appendix 0.A for an overview of their implementations).

The crux of kmclib is the session channel creation: [%kmc.gen (u,m,w)] on the first line of Figure 2. This primitive takes a tuple of role names as argument (i.e., (u,m,w)) and returns a tuple of communication channels, which are bound to (uch,mch,wch). These channels are used by the threads implementing the roles user, worker, and master (the functions user, worker, and master in Figure 2). By default, channels are implemented using concurrent queues from Multicore OCaml (Domainslib.Chan.t), but other underlying transports can easily be provided.

Threads send and receive messages over these channels using the communication primitives provided by kmclib. The send primitive requires three arguments: a channel, a destination role, and a message. For instance, the user sends a request to the master with send uch#m#compute 42, where uch is the user’s communication channel, m indicates the destination, and compute 42 is the message (consisting of a label and a payload). Observe that a sending operation returns a new channel which is to be used in the continuation of the interaction, e.g., the channel uch re-bound on the first line of user. Receiving messages works in a similar way, e.g., the user waits for a message from the master with receive uch#m. We use OCaml’s pattern matching to match messages against their labels and bind the payload and continuation channel; see, e.g., the two match cases in user, where the user expects either a wip or a result message. The receive primitive returns the payload res and a new communication channel uch.

New thread instances are spawned in the usual way, using Thread.create (see the end of Figure 2). The final call to List.iter Thread.join waits for them to terminate.

Figure 3: Examples of type errors.

Compatibility and error reporting.  While the code in Figure 2 may appear unremarkable, it hides a substantial machinery that guarantees that, if a program type-checks, then its constituent threads are safe, i.e., no thread gets permanently stuck and all messages that are sent are eventually received. This property is ensured by kmclib using OCaml’s type inference and PPX plugins to infer a session type from each thread and then check whether these session types are k-multiparty compatible (k-MC) [LangeY19].

If a system of session types is k-MC, then it is safe [LangeY19, Theorem 1], i.e., it has the progress property (no role gets permanently stuck in a receiving state) and the eventual reception property (all sent messages are eventually received). Checking k-MC notably involves checking that all executions of the system in which each channel contains at most k messages satisfy progress and eventual reception.
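
The idea of exploring all executions in which each channel holds at most k messages can be sketched in a few lines of OCaml. The code below is a deliberately simplified illustration, not the k-MC algorithm of [LangeY19]: it only checks that no reachable k-bounded configuration is stuck unless every machine has terminated and all queues are empty, and it does not distinguish a role blocked by a full queue from a true deadlock. All type and function names are assumptions of this sketch, and the three machines are hand-written transcriptions of the threads in Figure 2 (roles are numbered u = 0, m = 1, w = 2).

```ocaml
(* Simplified sketch of bounded exploration; not the actual k-MC check. *)
type action = Send of int * string | Recv of int * string (* peer, label *)
type machine = (int * action * int) list                  (* src, action, dst *)

(* Local state of each role, plus one FIFO queue per (sender, receiver). *)
type config = { states : int list; queues : ((int * int) * string list) list }

let queue c key = try List.assoc key c.queues with Not_found -> []

(* Keep the queue list canonical: sorted, with empty queues removed. *)
let set_queue c key q =
  let rest = List.remove_assoc key c.queues in
  { c with queues = (if q = [] then rest else List.sort compare ((key, q) :: rest)) }

let set_state c i s =
  { c with states = List.mapi (fun j x -> if j = i then s else x) c.states }

(* All configurations reachable in one step when queues hold at most k messages. *)
let successors k (sys : machine array) c =
  List.concat
    (List.mapi
       (fun i s ->
         List.filter_map
           (fun (src, act, dst) ->
             if src <> s then None
             else
               match act with
               | Send (p, l) ->
                 let q = queue c (i, p) in
                 if List.length q < k
                 then Some (set_queue (set_state c i dst) (i, p) (q @ [l]))
                 else None
               | Recv (p, l) ->
                 (match queue c (p, i) with
                  | l' :: rest when l' = l ->
                    Some (set_queue (set_state c i dst) (p, i) rest)
                  | _ -> None))
           sys.(i))
       c.states)

(* Final: all queues are empty and no machine has an outgoing transition. *)
let is_final (sys : machine array) c =
  c.queues = []
  && List.for_all (fun b -> b)
       (List.mapi
          (fun i s -> List.for_all (fun (src, _, _) -> src <> s) sys.(i))
          c.states)

(* True if every stuck configuration reachable under bound k is final. *)
let safe_up_to k sys =
  let init = { states = Array.to_list (Array.map (fun _ -> 0) sys); queues = [] } in
  let rec explore visited = function
    | [] -> true
    | c :: todo when List.mem c visited -> explore visited todo
    | c :: todo ->
      (match successors k sys c with
       | [] -> is_final sys c && explore (c :: visited) todo
       | next -> explore (c :: visited) (next @ todo))
  in
  explore [] [init]

let user = [ (0, Send (1, "compute"), 1); (1, Recv (1, "wip"), 1);
             (1, Recv (1, "result"), 2); (2, Send (1, "stop"), 3) ]
let worker = [ (0, Recv (1, "task"), 1); (1, Send (1, "result"), 0);
               (0, Recv (1, "stop"), 2) ]
let master_ok = [ (0, Recv (0, "compute"), 1); (1, Send (2, "task"), 2);
                  (2, Send (2, "task"), 3); (3, Recv (2, "result"), 4);
                  (4, Send (0, "wip"), 5); (5, Recv (2, "result"), 6);
                  (6, Send (0, "result"), 0); (0, Recv (0, "stop"), 7);
                  (7, Send (2, "stop"), 8) ]

(* Faulty master: the second task is never sent (state 1 jumps to state 3). *)
let master_bad =
  List.filter_map
    (fun (s, a, d) ->
      if s = 2 then None else if s = 1 then Some (s, a, 3) else Some (s, a, d))
    master_ok

let ok = safe_up_to 1 [| user; master_ok; worker |]
let bad = safe_up_to 2 [| user; master_bad; worker |]
```

In this sketch, the faulty system is rejected because the master ends up waiting for a second result that the worker will never send, which mirrors the progress violation discussed in the text.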

The k-MC-checker [LangeY19] performs a bounded verification to discover the least k for which a system is k-MC, up to a specified upper bound N. In the kmclib API, this bound can optionally be specified with [%kmclib.gen roles ~bound:N], where roles is the tuple of role names. The k-MC-checker emits an error if the bound is insufficient to guarantee safety.

The [%kmc.gen (u,m,w)] primitive also feeds the results of k-MC checking back to the code. If the inferred session types are k-MC, then channels for roles u, m and w can be generated. If k-MC cannot be guaranteed, then this results in a type error. We have modified the k-MC-checker to return counterexample traces when the verification fails. This helps give actionable feedback to the programmer, as counterexample traces are translated to OCaml types and inserted at the hole corresponding to [%kmc.gen]. This has the effect of reporting the precise location of the errors.

To report errors in a function parameter, we provide an optional macro for types: [%kmc.check rolename] (see the annotation on the parameter mch of the master's loop in Figure 2). Figure 3 shows examples of such error reports. The left-hand side shows the error reported when one of the two send mch#w#task lines is commented out, i.e., the master sends one task but expects two result messages; hence progress is violated since the master gets stuck at its second receive mch#w. The right-hand side shows the error reported when one of the master's receive lines is commented out. In this case, the variable mch in the master is highlighted because the master fails to consume a message from channel mch.

3 Inference of Session Types in kmclib

The kmclib API.  The kmclib primitives allow the vanilla OCaml typechecker to infer the session structure of a program, while simultaneously providing a user-friendly communication API for the programmer. To enable inference of session types from concurrent programs, we leverage OCaml’s structural typing and row polymorphism. In particular, we reuse the encoding from [ImaiNYY19] where input and output session types are encoded as polymorphic variants and objects in OCaml. In contrast to [ImaiNYY19] which relies on programmers writing global types prior to type-checking, kmclib infers and verifies local session types automatically, without requiring any additional type or annotation.

Typed PPX Rewriter.  To extract and verify session types from a piece of OCaml code, the kmclib library makes use of OCaml PreProcessor eXtensions (PPX) plugins which provide a powerful meta-programming facility. PPX plugins are invoked during the compilation process to manipulate or translate the abstract syntax tree (AST) of the program. This is often used to insert additional definitions, e.g., pretty-printers, at compile-time.

A key novelty of kmclib is the combination of PPX with a form of type-aware translation, whereas most PPX plugins perform purely syntactic (type-unaware) translations. Figure 4 shows the workflow of the PPX rewriter, overlaid on code snippets from Figure 2. The inference works as follows.

  1. The plugin reads the AST of the program code to replace the [%kmc.gen] primitive with a hole, which can have any type.

  2. The plugin invokes the typechecker to get the typed AST of the program. In this way, the type of the hole is inferred to be a tuple of channel object types whose structure is derived from their usages (i.e., mch#u#compute).

    To enable this propagation, we introduce the idiom “let (KMC ...) = ...”, which forces the type of the hole to be monomorphic. Otherwise, the type would be too general and this would spoil the type propagation (see § 0.B).

  3. The inferred type is translated to a system of (local) session types, which are passed to the k-MC-checker.

  4. If the system is k-MC, then it is safe and the plugin instruments the code to allocate a fresh channel tuple (i.e., concurrent queues) at the hole.

  5. If the system is unsafe, the k-MC-checker returns a violation trace which is translated back to an OCaml type and inserted at the hole, to report a more precise error location.
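
Step 3 of the workflow above can be illustrated with a toy translation. The sketch below assumes a hypothetical, heavily simplified representation of inferred channel types (the real plugin works on the typed AST produced by the OCaml typechecker, and handles recursion, which this sketch omits). The constructors Out, Inp, and End, the function to_session, and the use of "end" for a finished session are all assumptions of this example.

```ocaml
(* Hypothetical, simplified view of an inferred channel type. *)
type chan_ty =
  | Out of string * string * string * chan_ty
      (* role, label, payload type, continuation:
         stands for <role: <label: (payload, cont) out> > *)
  | Inp of string * (string * string * chan_ty) list
      (* role and branches:
         stands for <role: [`label of payload * cont | ...] inp> *)
  | End (* no further communication *)

(* Print the corresponding local session type in the r!l<t>;S / r?l<t>;S style. *)
let rec to_session = function
  | End -> "end"
  | Out (r, l, p, k) -> Printf.sprintf "%s!%s<%s>;%s" r l p (to_session k)
  | Inp (r, branches) ->
    String.concat " or "
      (List.map
         (fun (l, p, k) -> Printf.sprintf "{%s?%s<%s>;%s}" r l p (to_session k))
         branches)

(* A non-recursive fragment of the user's behaviour: send compute,
   receive result, send stop. *)
let user_fragment =
  Out ("m", "compute", "int",
       Inp ("m", [ ("result", "int", Out ("m", "stop", "unit", End)) ]))

let () = print_endline (to_session user_fragment)
```

Objects map to outputs and polymorphic variants map to inputs, which is why the translation is a straightforward structural recursion.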

The translation is confined to the [%kmc.gen] expression, retaining a clear correspondence between the original and translated code. It can be understood as a form of ad hoc polymorphism reminiscent of type classes in Haskell. Just as the Haskell typechecker verifies whether a type belongs to a class, kmclib verifies whether the set of session types belongs to the class of k-MC systems.

Figure 4: Inferring session types from OCaml code.

4 Conclusion

We have developed a practical library for safe message-passing programming. The library enables developers to program and verify arbitrary communication patterns without the need for type annotations or user-operated external tools. Our automated verification approach can be applied to other general-purpose programming languages. Indeed it mainly relies on two ingredients: structural typing and metaprogramming facilities. Both are available, with a varying degree of support, in, e.g., Scala, Haskell, TypeScript, and F#.

Our work is reminiscent of automated software model checking which has a long history (see [10.1145/1592434.1592438] for a survey). There are few works on inference and verification of behavioural types, i.e., [PereraLG16, LangeNTY17, LNTY2018, ASE21]. However, Perera et al. [PereraLG16] only present a prototype research language, while Lange et al. [LangeNTY17, LNTY2018, ASE21] propose verification procedures for Go programs that rely on external tools which are not integrated with the language nor its type system. To our knowledge, ours is the first implementation of type inference for MPST and the first integration of session types compatibility checking within a programming language.

References

Appendix 0.A Technical Details on the kmclib API

We explain the main communication primitives of kmclib and their translation to session types. In particular, we reuse the encoding [ImaiNYY19] where input and output session types are encoded as polymorphic variants and objects, while loops are naturally handled using equi-recursive types in OCaml.

Objects and variants in OCaml are structurally typed, which enables the creation of ad-hoc types. This allows the channel structure to be derived from the usage of channels in send and receive primitives.

Output types.  Sending a message, e.g., send uch#m#compute 42 in the user of Figure 2, is parsed as send (uch#m#compute) 42, where the two chained method calls yield a port for sending the compute label to role m, which in turn is passed to send (together with a payload). This corresponds to an internal choice where the user specifies a destination for its message and chooses a label from those offered by the receiver.

The inferred type of channel object uch is a nested object type of the form <m: <compute: (int, ch) out> >, where m is a method that returns an object that itself provides a method compute (which returns a port for sending an int payload and yielding a continuation channel ch). Note that the implementation of these methods is provided explicitly neither by the API nor by the programmer; instead, they are constructed on demand when invoking uch#m#compute, i.e., objects are generated automatically according to the methods that are invoked on them (# denotes method invocation). Such object types correspond to session types of the form m!compute<int>;ch (the translation is trivial).
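
The following sketch mimics this output encoding with hand-written objects, to show how send uch#m#compute 42 type-checks in plain OCaml. It is an illustration under stated assumptions: the out type, make_out, and send below are simplified stand-ins written for this example (kmclib generates the objects automatically and uses Domainslib channels rather than the standard Queue module), and the continuation is simply unit.

```ocaml
(* Simplified stand-in for an output port: a push function and a continuation. *)
type ('v, 'c) out = Out of ('v -> unit) * 'c

(* Hypothetical helper in the spirit of Internal.make_out: it takes a raw
   queue, a function tagging the payload with its label, and a continuation. *)
let make_out q tag cont = Out ((fun v -> Queue.add (tag v) q), cont)

(* Sending pushes the payload and returns the continuation channel. *)
let send (Out (push, cont)) v = push v; cont

(* Raw queue from user to master, carrying labelled messages. *)
let um : [ `compute of int ] Queue.t = Queue.create ()

(* Hand-written channel object of type <m: <compute: (int, unit) out> >. *)
let uch =
  object
    method m =
      object
        method compute = make_out um (fun v -> `compute v) ()
      end
  end

(* Parsed as send (uch#m#compute) 42, as described in the text. *)
let () = send uch#m#compute 42
```

Because object types are structural, the type of uch is determined entirely by the methods it provides; no class or type declaration for the channel is needed.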

Input types.  To receive messages, as in Lines 2-2 of Figure 2, we use uch#m to return a channel object which effectively corresponds to a port originating from role m. This channel is then passed to the receive primitive, which returns a polymorphic variant on which one needs to pattern match for expected messages.

The inferred type of mch, which specifies the expected messages and their respective continuations, is <u: [`compute of int * ch' | `stop of unit * ch''] inp>. This type corresponds to an external choice, i.e., a session type of the form {u?compute<int>;ch'} or {u?stop<unit>;ch''}.
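
The input encoding can be mimicked in the same hand-written style. In the sketch below, the inp type and receive are simplified stand-ins assumed for this example, and the standard Queue module replaces the concurrent queues used by kmclib. The recursive channel object relies on OCaml accepting recursive types that pass through an object type.

```ocaml
(* Simplified stand-in for an input port: a thunk producing a variant. *)
type 'var inp = Inp of (unit -> 'var)

let receive (Inp f) = f ()

(* Raw queue from user to master, carrying labelled messages. *)
let mu : [ `compute of int | `stop of unit ] Queue.t = Queue.create ()

(* Hand-written channel object whose type is, roughly,
   <u: [`compute of int * 'self | `stop of unit * unit] inp> as 'self.
   Each received message is paired with its continuation channel. *)
let rec mch () =
  object
    method u =
      Inp (fun () ->
          match Queue.pop mu with
          | `compute x -> `compute (x, mch ())
          | `stop () -> `stop ((), ()))
  end

(* The receiver pattern-matches on the label and re-binds the channel. *)
let rec loop ch acc =
  match receive ch#u with
  | `compute (x, ch) -> loop ch (x :: acc)
  | `stop ((), ()) -> List.rev acc

let got =
  Queue.add (`compute 1) mu;
  Queue.add (`compute 2) mu;
  Queue.add (`stop ()) mu;
  loop (mch ()) []
```

The set of labels handled by the match expression is what fixes the branches of the external choice in the inferred type.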

Linearity.  MPST require channels to be used linearly, i.e., each channel must be used exactly once. If a channel is not used, this leads to a multiparty compatibility issue (a message will not be sent/received), and hence our implementation detects such issues statically via k-MC.

The idiomatic shadowing with the same variable names (e.g., re-binding of uch in Line 2) in OCaml mitigates the risk of using a channel more than once. If the program deviates from this best practice and a channel is used non-linearly, an exception is raised at runtime.
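
A dynamic linearity check of this kind can be sketched with a one-shot flag. The source does not describe how kmclib implements the check, so the exception name Linearity_violation and the lin wrapper below are hypothetical and serve only to illustrate the general technique.

```ocaml
(* Hypothetical exception; kmclib's actual exception is not named in the text. *)
exception Linearity_violation

(* A one-shot wrapper: the wrapped value can be extracted exactly once. *)
type 'a lin = { mutable used : bool; value : 'a }

let fresh v = { used = false; value = v }

let use l =
  if l.used then raise Linearity_violation;
  l.used <- true;
  l.value

let ch = fresh "channel"

(* The first use succeeds... *)
let first = use ch

(* ...and a second use of the same channel is rejected at runtime. *)
let second = try Some (use ch) with Linearity_violation -> None
```

Shadowing the channel variable after each operation, as in Figure 2, makes the stale value unreachable by name, so the runtime check is only triggered when that idiom is not followed.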

Alternatively, kmclib provides an event-based API (similar to that of [DBLP:journals/pacmpl/00020HNY20]), which eliminates the explicit need for linear channels, at the cost of losing a direct-style API (see https://github.com/keigoi/kmclib/blob/tooldemo/test/paper/test_handler.ml for an example). We remark that there are other known ways to check linearity statically [ImaiNYY19], which can easily be adapted to our library.

    let KMC (uch,mch,wch) =
      let um, mu, mw, wm = Chan.create_unbounded (), Chan.create_unbounded (), ... in
      let uch = <m = <compute = Internal.make_out um (fun v -> `compute v) > > in
      let mch = <u = Internal.make_inp mu (fun v -> `compute v) > in
      let wch = <m = Internal.make_inp mw (fun v -> `stop v) > in
      make_tuple (uch, mch, wch)
Figure 5: Code from Figure 2, instrumented at [%kmc.gen]

Appendix 0.B Instrumented Code for Figure 2

Figure 5 shows the instrumented code for Figure 2. Its second line allocates raw channels using Chan.create_unbounded from Multicore OCaml, and the following three lines create objects inhabiting the inferred type. We use the shorthand <...> for in-place objects object...end and abbreviate the continuations with an ellipsis.

The Internal functions make a channel from raw channels and a continuation. In particular, make_out takes an extra function (fun v -> label v), allocating a variant tag representing the message label. Also, it uses type casts from Obj module in OCaml, which is a common technique to implement session types in OCaml (cf. [DBLP:journals/jfp/Padovani17]).

The last line (make_tuple) wraps the resulting tuple with the KMC constructor. As mentioned in § 3, this makes the inferred hole type monomorphic. Normally, for top-level declarations, OCaml generalises the type, and the hole type would be inferred as fully polymorphic (i.e., it can be instantiated with any type at any use site) if it occurs in a covariant position, which spoils the propagation (cf. the relaxed value restriction [DBLP:conf/flops/Garrigue04]). We avoid this by wrapping the pattern with a constructor KMC : 'a -> 'a tuple whose type is declared explicitly as non-covariant.
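
The effect of variance on generalisation can be observed in plain OCaml. The sketch below does not reproduce kmclib's KMC constructor; it uses a hypothetical module Inv with an abstract type (which OCaml treats as invariant when no variance annotation is given) to show the same phenomenon.

```ocaml
(* Hypothetical wrapper with an abstract, hence invariant, type parameter. *)
module Inv : sig
  type 'a t
  val wrap : 'a -> 'a t
  val unwrap : 'a t -> 'a
end = struct
  type 'a t = 'a
  let wrap x = x
  let unwrap x = x
end

let empty () = []

(* list is covariant, so the relaxed value restriction generalises xs to
   'a list even though the right-hand side is a function application. *)
let xs = empty ()
let ints = 1 :: xs
let strs = "a" :: xs (* xs is used at two different types *)

(* Wrapped in an invariant type, the result is not generalised: ys has a
   weak (monomorphic) type that is fixed by its first use. *)
let ys = Inv.wrap (empty ())
let only_ints = 1 :: Inv.unwrap ys
```

With the monomorphic version, every use of the bound value contributes constraints to one shared type, which is the behaviour needed for the usages of the channels to propagate back to the hole.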





Appendix 0.C Error Reporting with Type Ascription

It is vital to show the location of the error to the programmer when an error is found. To achieve this, the PPX plugin of kmclib instruments an extra ascription of an incompatible type at the erroneous usage of a channel. For example, see the error on the left of Figure 3, where the PPX plugin assigns the variable mch the type [`progress_violation] (a variant type with a single constructor named progress_violation), as the k-MC-checker detects that the input blocks forever at that point. Since the variable is used at type <u: [`result of int * ...] inp>, denoting an input of result with an int from the user, the OCaml typechecker reports a type error.

For usability purposes, kmclib detects another kind of error, which we refer to as a format error. These errors happen when the inferred type of a channel is not even in the form of an output or input session type (channel misuse). For example, if one drops the role name (#m) and writes send uch#compute 42, the variable uch has the inferred type <compute: (int, ch) out>, which is not a session type anymore. Figure 6 shows such an error. The highlighted part is assigned the type [`shoud_be_inp_or_out_object], saying that the expression needs another method call (or that the expression should be used as an input). We are planning to improve error messages to be more descriptive, e.g., as [`role_or_label_not_given].

Note that these format checks are all done within the [%kmc.gen] (or [%kmc.check]) primitive. These errors could also be regarded as a “no instance” error in the type class analogy, as such ill-formatted types are not in the class of k-MC systems (they are not even in the class of session type syntax).

Figure 6: Example of a format error.