Choreographic Programming [M13:phd] is a paradigm for developing concurrent software, where an “Alice and Bob” notation is used to prevent mismatched I/O actions syntactically. An EndPoint Projection (EPP) can then be used to synthesise correct-by-construction process implementations [CM13, CM17:facs, QZCY07]. Choreographies are used in different settings, including standards [BPMN, wscdl], languages [chor:website, HMBCY11, pi4soa, savara:website], specification models [CHY12, CM13, LGMZ08], and design tools [BPMN, pi4soa, savara:website, wscdl].
The key to preventing mismatched I/O actions in choreographies is that interactions between two (or more) processes are specified atomically, using terms such as – read “process sends the evaluation of expression to process , and then we proceed as the choreography ”. Giving a semantics to such terms is relatively easy if we assume that communications are synchronous: we can just reduce to in a single step (and update ’s state with the received value, but this is orthogonal to this discussion). For this reason, most research on choreographic programming has focused on systems with a synchronous communication semantics.
However, many real-world systems use asynchronous communications. This motivated the introduction of an ad-hoc reduction rule for modelling asynchrony in choreographies (Rule in [CM13]). As an example, consider the choreography (where and are just constants). The special rule would allow the second communication to be consumed immediately, thus reducing the choreography to . In general, roughly, this rule allows a choreography of the form to execute an action in if this action involves but not (sends are non-blocking, receives are blocking). This approach was later adopted in other works (see Section LABEL:sec:related). Unfortunately, it also comes with a serious problem: it yields an unintuitive semantics, since a choreography can now reduce to a state that would normally not be reachable in the real world. In our example, specifically, in the real world would have to send its first message to before it could proceed to send its other message to . This information is lost in the choreography reduction, where it appears that can just send its messages in any order. In [CM13], this also translates to a misalignment between the structures of choreographies and their process implementations generated by EPP, since the latter use a standard asynchronous semantics with message buffers; see Section LABEL:sec:related for a detailed discussion of this aspect. Previous work [DY13] uses intermediate runtime terms in choreographies to represent asynchronous messages in transit, in an attempt to overcome this problem. However, the adequacy of this approach has never been formally demonstrated.
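The ordering constraint in this example can be illustrated with a small buffer-based sketch. The process names `p`, `q`, `r` and the payloads `1` and `2` are placeholders chosen here for illustration, and the dictionary of FIFO queues is only one possible way to model asynchronous channels; it is not the formal model developed in this paper.

```python
from collections import deque

# Assumption: one FIFO buffer per ordered pair of (hypothetical) processes.
channels = {("p", "q"): deque(), ("p", "r"): deque()}
trace = []

def send(src, dst, value):
    """Non-blocking send: enqueue the value and continue immediately."""
    channels[(src, dst)].append(value)
    trace.append(("send", src, dst, value))

def receive(src, dst):
    """Blocking receive, modelled here as an error when the buffer is empty."""
    if not channels[(src, dst)]:
        raise RuntimeError("receive would block")
    value = channels[(src, dst)].popleft()
    trace.append(("recv", src, dst, value))
    return value

# Program order at p: first send to q, then send to r.
send("p", "q", 1)
send("p", "r", 2)

# r may receive before q does, but p's first send has already happened
# and its message is still waiting in the buffer towards q.
received_by_r = receive("p", "r")
```

In this sketch the second message can be delivered first, yet the trace still records that the first send took place before it, which is exactly the information that the ad-hoc rule discards.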
In this paper, we are interested in studying asynchrony for choreographies in a systematic way. Thus, we first analyse the properties that an asynchronous choreography semantics should have (assuming standard FIFO duplex channels between each pair of processes) – messages can be sent without the intended receiver being ready, and all sent messages are eventually received – and afterwards formulate them precisely in a representative choreography language. Our study leads naturally to the construction of a new choreography model that supports asynchronous communications, by capitalising on the characteristic feature of out-of-order execution found in choreographic programming. We formally establish the adequacy of our asynchronous model, by proving that it respects the formal definitions of our properties. Then, we define an EPP from our new model to an asynchronous process calculus. Thanks to the accurate asynchronous semantics of our choreography model, we prove that the code generated by our EPP and the originating choreography are lockstep operationally equivalent. As a corollary, our generated processes are deadlock-free by construction. Our development also has the pleasant property that programmers do not need to reason about asynchrony: they can just program thinking in the usual terms of synchronous communications, and assume that adopting asynchronous communications will not lead to errors. We conclude by discussing how our construction can be systematically extended to more complex choreography models.
The contribution of this article is threefold. First, we give an abstract characterisation of asynchronous semantics for choreography languages, which is formalised for a minimal choreography calculus. Second, we propose an asynchronous semantics for this minimal language, show that it is an instance of our characterisation, and discuss how it can be applied to other choreography calculi. Finally, we prove a lockstep operational correspondence between choreographies and their process implementations, when asynchronous semantics for both systems are considered.
We present the representative choreography language in which we develop our work, together with its associated process calculus and EPP, in Section 2. In Section 3, we motivate and introduce the properties we would expect of an asynchronous choreography semantics, and introduce a semantics that satisfies these properties. We show that we can define an asynchronous variant of the target process calculus in Section LABEL:sec:asp, and extend the definition of EPP towards it, preserving the precise operational correspondence from the synchronous case. We relate our development to other approaches for asynchrony in choreographies in Section LABEL:sec:related, before concluding in Section LABEL:sec:concl with a discussion on the implications of our work and possible future directions.
2 Minimal Choreographies and Stateful Processes
We review the choreography model of Minimal Choreographies (MC) and its target calculus of Stateful Processes (SP), originally introduced in [CM17:facs].
2.1 Minimal Choreographies
The language of MC is defined inductively in Figure 1.
Processes, ranged over by , are assumed to run concurrently. Processes interact through value communications , which we also denote as . Here, process evaluates expression and sends the result to . The precise syntax of expressions is immaterial for our presentation; in particular, expressions can access values stored in ’s memory.
The remaining choreography terms denote conditionals, recursion, and termination. In the conditional , process evaluates expression to decide whether the choreography should proceed as or as . In , we define variable to be the choreography term , which then can be called (as ) inside both and . Term is the terminated choreography, which we sometimes omit. For a more detailed discussion of these primitives, we refer the reader to [CM17:facs].¹

¹ We relaxed the syntax of MC slightly with respect to [CM17:facs] by leaving the syntax of expressions unspecified, which allows for the simpler conditional in line with typical choreography languages. This minor change simplifies our presentation.
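As a reading aid, the term constructors described above can be sketched as a small abstract syntax tree. All class and field names below are hypothetical, and expressions are represented as opaque strings, mirroring the fact that their syntax is left unspecified.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class End:
    """The terminated choreography."""

@dataclass(frozen=True)
class Com:
    """Value communication followed by a continuation."""
    sender: str
    expr: str
    receiver: str
    cont: "Chor"

@dataclass(frozen=True)
class Cond:
    """Conditional: proc evaluates expr to choose between two branches."""
    proc: str
    expr: str
    then: "Chor"
    orelse: "Chor"

@dataclass(frozen=True)
class Def:
    """Recursive definition: var stands for body and may be called in body and cont."""
    var: str
    body: "Chor"
    cont: "Chor"

@dataclass(frozen=True)
class Call:
    """Invocation of a previously defined variable."""
    var: str

Chor = Union[End, Com, Cond, Def, Call]

# Hypothetical example: p sends to q, then q decides whether to repeat.
example = Def(
    "X",
    Com("p", "e", "q", Cond("q", "b", Call("X"), End())),
    Call("X"),
)
```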
The (synchronous) semantics of MC is a reduction semantics that uses a total state function to represent the memory state at each process . Since our development is orthogonal to the details of the memory implementation, we say that is a representation of the memory state of (left unspecified) and write to denote the (uniquely defined) updated memory state of after receiving a value . Term denotes that locally evaluating expression at , with memory state , evaluates to . We assume that expression evaluation is deterministic and always terminates. (This formulation captures the essence of previous memory models for choreographies, cf. [CM13, CM16a].)
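The effect of a synchronous communication on the state can be sketched as follows. This is only an illustration under explicit assumptions: memory is modelled as the list of values received so far, expressions are Python functions over that memory, the state function is a finite dictionary, and a choreography is restricted to a list of communications. None of these choices is prescribed by the text above, which leaves the memory model unspecified.

```python
from typing import Callable, Dict, List, Tuple

Memory = List[int]                 # assumption: memory is a list of received values
Expr = Callable[[Memory], int]     # assumption: expressions are total, deterministic functions
State = Dict[str, Memory]          # finite stand-in for the total state function
Com = Tuple[str, Expr, str]        # (sender, expression, receiver)

def update(memory: Memory, value: int) -> Memory:
    """Hypothetical memory update: append the received value."""
    return memory + [value]

def step(chor: List[Com], state: State) -> Tuple[List[Com], State]:
    """Execute the first communication of a straight-line choreography."""
    (sender, expr, receiver), rest = chor[0], chor[1:]
    value = expr(state[sender])            # evaluate locally at the sender
    new_state = dict(state)                # leave the old state untouched
    new_state[receiver] = update(state[receiver], value)
    return rest, new_state

initial = {"p": [3], "q": []}
chor = [("p", lambda m: m[-1] + 1, "q")]
remaining, final = step(chor, initial)
```

Only the receiver's memory changes in a step, and the whole communication is consumed at once, which is what makes the synchronous case simple.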
Transitions are defined over pairs , given by the rules in Figure 2.1. As usual, we omit the angle brackets in transitions.
These rules are mostly standard, and we summarise their intuition. In , the state of is updated with the value received from (which results from the evaluation of expression at that process). Rules and are as expected, while rule allows reductions under recursive definitions. Finally, rule uses a structural precongruence , defined in Figure 3, which essentially allows (i) independent communications to be swapped (rule ), (ii) recursive definitions to be unfolded (rule ), and (iii) garbage collection (rule ).
The remaining rules allow communications to swap with other constructs, as required for achieving (i). These rules, taken together, endow the semantics of MC with out-of-order execution: interactions not at the top level may be brought to the top and executed if they do not interfere with other interactions that precede them in the choreography’s abstract syntax tree. We write for and , and denote the set of process names in a choreography by .
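For choreographies that consist only of a sequence of communications, out-of-order execution can be sketched as a simple check. The sketch assumes that two communications are independent exactly when they share no process names, and that a communication can be brought to the top when it is independent of every communication preceding it; the function name and the tuple representation are hypothetical, and conditionals and recursion are not covered.

```python
from typing import List, Tuple

Com = Tuple[str, str, str]  # (sender, expression, receiver); expressions kept opaque

def enabled(chor: List[Com]) -> List[int]:
    """Indices of communications that can be swapped to the top level."""
    result = []
    for i, (sender, _, receiver) in enumerate(chor):
        # Collect the process names of all communications before position i.
        earlier = set()
        for (s, _, r) in chor[:i]:
            earlier |= {s, r}
        # Swapping past each earlier communication needs disjoint process names.
        if sender not in earlier and receiver not in earlier:
            result.append(i)
    return result

chor = [("p", "e1", "q"), ("r", "e2", "s"), ("q", "e3", "r")]
ready = enabled(chor)
```

Here the first two communications involve disjoint processes and may run in either order, while the third shares processes with both and must wait.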
Unsurprisingly, choreographies in MC are always deadlock-free. We use this property later on, to prove that the process code generated from choreographies is also deadlock-free.
Theorem 1 (Deadlock-freedom). Given a choreography , either (termination) or, for every , there exist and such that .
2.2 Stateful Processes and EndPoint Projection
Minimal Choreographies are meant to be implemented in a minimalistic process calculus, also introduced in [CM17:facs], called Stateful Processes (SP). We summarise this calculus, adopting the same conventions and changes regarding expressions, labels, and states as above.
The syntax of SP is reported in Figure 4.
A term is a process with name , memory state and behaviour . Networks, ranged over by , are parallel compositions of processes, with being the inactive network.
Behaviours correspond to the local views of choreography actions. The process executing a send term evaluates expression and sends the result to process , proceeding as . The dual receiving behaviour expects a value from process , stores it in its memory and proceeds as . The other terms are as in MC.
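The idea that behaviours are local views of choreography actions can be sketched for straight-line choreographies. The formal definition of EPP is not reproduced in the text above, so the following assumes the usual reading: a communication becomes a send at the sender, a receive at the receiver, and nothing at any other process. The tuple encodings and the function name are hypothetical, and sender and receiver are assumed to be distinct.

```python
from typing import List, Tuple

Com = Tuple[str, str, str]  # (sender, expression, receiver)

def project(chor: List[Com], proc: str) -> List[tuple]:
    """Local behaviour of proc for a sequence of communications."""
    behaviour = []
    for (sender, expr, receiver) in chor:
        if proc == sender:
            behaviour.append(("send", receiver, expr))   # evaluate expr, send to receiver
        elif proc == receiver:
            behaviour.append(("recv", sender))           # wait for a value from sender
        # Processes not involved in the communication do nothing for it.
    return behaviour

chor = [("p", "e1", "q"), ("q", "e2", "r")]
network = {name: project(chor, name) for name in ("p", "q", "r")}
```

Each send in one behaviour has a matching receive in another, which is the syntactic guarantee that the choreography notation is meant to provide.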
These intuitions are formalised in the synchronous semantics of SP, which is defined by the rules in Figure 2.2.