Generating Interactive WebSocket Applications in TypeScript

04/03/2020
by   Anson Miu, et al.

Advancements in mobile device computing power have made interactive web applications possible, allowing the web browser to render content dynamically and support low-latency communication with the server. This comes at a cost to the developer, who now needs to reason more about correctness of communication patterns in their application as web applications support more complex communication patterns. Multiparty session types (MPST) provide a framework for verifying conformance of implementations to their prescribed communication protocol. Existing proposals for applying the MPST framework in application development either neglect the event-driven nature of web applications, or lack compatibility with industry tools and practices, which discourages mainstream adoption by web developers. In this paper, we present an implementation of the MPST framework for developing interactive web applications using familiar industry tools, namely TypeScript and the React.js framework. The developer can use the Scribble protocol language to specify the protocol and use the Scribble toolchain to validate and obtain the local protocol for each role. The local protocol describes the interactions of the global communication protocol observed by the role. We encode the local protocol into TypeScript types, catering for server-side and client-side targets separately. We show that our encoding guarantees that only implementations which conform to the protocol can type-check. We demonstrate the effectiveness of our approach through a web-based implementation of the classic Noughts and Crosses game from an MPST formalism of the game logic.


1 Introduction

Modern interactive web applications aim to provide a highly responsive user experience by minimising the communication latency between clients and servers. Whilst the HTTP request-response model is sufficient for retrieving static assets, applying the same stateless communication approach for interactive use cases (such as real-time multiplayer games) introduces undesirable performance overhead. Developers have since adopted other communication transport abstractions over HTTP connections such as the WebSockets protocol [8] to enjoy low-latency full-duplex client-server communication in their applications over a single persistent connection. Enabling more complex communication patterns caters for more interactive use cases, but introduces additional correctness concerns to the developer.

Consider a classic turn-based board game of Noughts and Crosses between two players. Both players, identified by either noughts (O’s) or crosses (X’s) respectively, take turns to place a mark on an unoccupied cell of a 3-by-3 grid until one player wins (when their markers form one straight line on the board) or a stalemate is reached (when all cells are occupied and no one wins). A web-based implementation may involve players connected to a game server via WebSocket connections. The players interact with the game from their web browser, which shows a single-page application (SPA) of the game client written in a popular framework like React.js [21]. SPAs feature a single HTML page and dynamically render content via JavaScript in the browser. Players take turns to make a move on the game board, which sends a message to the server. The server implements the game logic to progress the game forward until a result (either a win/loss or draw) can be declared, where either the move of the other player or the game result is sent to players.

Whilst WebSockets make this web-based implementation possible, they introduce the developer to a new family of communication errors, even for this simple game. In addition to the usual testing for game logic correctness, the developer needs to test against deadlocks (e.g. both players waiting for each other to make a move at the same time) and communication mismatches (e.g. player 1 sending a boolean to the game server instead of the board coordinates). The complexity of these errors correlates to the complexity of the required tests and scales with the complexity of communication patterns involved.

Multiparty Session Types (MPST) [6] provide a framework for formally specifying a structured communication pattern between concurrent processes and verifying implementations for correctness with respect to the communications aspect. By specifying the client-server interactions of our game as an MPST protocol and verifying the implementations against the protocol for conformance, MPST theory guarantees well-formed implementations to be free from communication errors.

We see the application of the MPST methodology to generating interactive TypeScript web applications to be an interesting design space — to what extent can the MPST methodology be applied to deliver a workflow where developers use the generated TypeScript APIs in their application to guarantee protocol conformance by construction? Such a workflow would ultimately decrease the overhead for incorporating MPST into mainstream web development, which reduces development time by programmatically verifying communication correctness of the implementation.

Contributions

This paper presents a workflow for developing type-safe interactive SPAs motivated by the MPST framework: (1) An endpoint API code generation workflow targeting TypeScript-based web applications for multiparty sessions; (2) An encoding of session types in server-side TypeScript that enforces static linearity; and (3) An encoding of session types in browser-side TypeScript using the React framework that guarantees affine usage of communication channels.

2 The Scribble Framework

Development begins with specifying the expected communications between participants as a global protocol in Scribble [24], an MPST-based protocol specification language and code generation toolchain. We specify the Noughts and Crosses game as a Scribble protocol in Listing 1 below. In the protocol, the role Svr stands for the Server, and the roles P1 and P2 stand for the two Players respectively.

module NoughtsAndCrosses;
type <typescript> "Coordinate" from "./Types" as Point; // Position on board

global protocol Game(role Svr, role P1, role P2) {
  Pos(Point) from P1 to Svr;
  choice at Svr {
    Lose(Point) from Svr to P2; Win(Point) from Svr to P1;
  } or {
    Draw(Point) from Svr to P2; Draw(Point) from Svr to P1;
  } or {
    Update(Point) from Svr to P2; Update(Point) from Svr to P1;
    do Game(Svr, P2, P1); // Continue the game with player roles swapped
  }
}

We leverage the Scribble toolchain to check for protocol well-formedness. This directly corresponds to multiparty session type theory [FeatherweightScribble]: a Scribble protocol maps to some global type, and the Scribble toolchain implements the algorithmic projection defined in [6] to derive valid local type projections for all participants. We obtain a set of endpoint protocols (corresponding to local types), one for each role, from a well-formed global protocol. An endpoint protocol only preserves the interactions defined by the global protocol in which the target role is involved, and corresponds to an equivalent Endpoint Finite State Machine (EFSM) [7]. The EFSM holds information about the permitted IO actions for the role. We use the EFSMs as a basis for API generation and adopt the formalisms in [12].
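
To make the role of the EFSM more concrete, the following is a minimal sketch of how part of the Svr endpoint protocol could be written down as a transition table. The `Transition` shape, the `step` helper and the `End` and `S20` state names are assumptions made for illustration; this is not the output format of the Scribble toolchain.

```typescript
type Role = "Svr" | "P1" | "P2";
type Action = "send" | "receive";

interface Transition {
  action: Action; // IO action performed by Svr
  peer: Role;     // the other role involved in the action
  label: string;  // message label
  next: string;   // identifier of the successor state
}

// Hypothetical table for the first half of a round (P1 moves).
// The states from S20 onwards mirror S15 to S18 with the player roles
// swapped and are omitted here.
const svrEfsm: Record<string, Transition[]> = {
  S13: [{ action: "receive", peer: "P1", label: "Pos", next: "S15" }],
  S15: [
    { action: "send", peer: "P2", label: "Lose", next: "S16" },
    { action: "send", peer: "P2", label: "Draw", next: "S17" },
    { action: "send", peer: "P2", label: "Update", next: "S18" },
  ],
  S16: [{ action: "send", peer: "P1", label: "Win", next: "End" }],
  S17: [{ action: "send", peer: "P1", label: "Draw", next: "End" }],
  S18: [{ action: "send", peer: "P1", label: "Update", next: "S19" }],
  S19: [{ action: "receive", peer: "P2", label: "Pos", next: "S20" }],
  End: [],
};

// Returns the successor state, or throws if the action is not permitted.
function step(state: string, action: Action, label: string): string {
  const match = (svrEfsm[state] ?? []).find(
    (t) => t.action === action && t.label === label,
  );
  if (match === undefined) {
    throw new Error(`${action} ${label} is not permitted in state ${state}`);
  }
  return match.next;
}
```

Looking up `step("S13", "receive", "Pos")` yields `"S15"`, whereas asking for a send in `S13` throws, which is the sense in which the EFSM records the permitted IO actions per state.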

3 Encoding Session Types in TypeScript

Developers can implement their application using APIs generated from the EFSM to guarantee correctness by construction. Our approach integrates the EFSM into the development workflow by encoding session types as TypeScript types. Communication over the WebSocket protocol introduces additional constraints: communication is always initiated in the front-end and driven by user interactions, whilst back-end roles can only accept connections. This motivates our design of encoding the session types differently for server (Section 3.1) and client (Section 3.3) targets.

3.1 Server-Side API Generation

Figure 1: EFSM for Svr.

We refer to the Svr EFSM (Figure 1) as a running example in this section. For server-side targets, we encode EFSM states into TypeScript types and consider branching (receiving) and selection (sending) states separately. We assign TypeScript encodings of states to their state identifiers in the EFSM, providing syntactic sugar when referring to the successor state when encoding the current state. For any state S in the EFSM, we refer to the TypeScript type alias of its encoding as ⟦S⟧. We outline the encoding below using examples from the Noughts and Crosses game (Listing 2 below).

Branching State

We consider a receiving state as a unary branching state for conciseness. A branching state is encoded as an object literal [19] (a record type), with each branch i ∈ I (I denoting the set of all branches) corresponding to a member field. A branch expecting to receive a message labelled label_i carrying a payload of type T_i with successor state S_i is encoded as a member field named label_i of function type (payload: T_i) => ⟦S_i⟧, where ⟦S_i⟧ is the type alias of the successor state's encoding. The developer implements a branching operation by passing callbacks for each branch, parameterised by the expected message payload type for that branch.

Selection State

We consider a sending state as a unary selection state for conciseness. A selection state is encoded as a union type [19] of internal choice encodings: each internal choice i ∈ I (I denoting the set of all choices), sending a message labelled label_i carrying a payload of type T_i with successor state S_i, is encoded as a tuple type [Labels.label_i, T_i, ⟦S_i⟧], where ⟦S_i⟧ is the type alias of the successor state's encoding. The developer implements a selection operation by passing the selected label and the payload to send in the message. We generate a string enum (named Labels) wrapping the labels in the protocol.

export type S13 = { Pos: (payload: Point) => S15 };
export type S15 = [ Labels.Lose, Point, S16 ]
                | [ Labels.Draw, Point, S17 ]
                | [ Labels.Update, Point, S18 ];

In the case of Listing 2 (above), the developer is expected to implement S13, which handles the Pos message sent by P1, and the code in S13 returns a value of type S15, which corresponds to a selection of messages to send to P2. Listing 4 (Section 4) illustrates how the developer may implement these types.
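
As a smaller, self-contained illustration of the two encodings, the sketch below defines a fragment of the Svr types and one value inhabiting them. The `Point` shape, the `End` terminal marker, the definitions of `S16` and `S17`, and the placeholder game rule are assumptions of this sketch (the Update branch is left out for brevity); they are not the generated API.

```typescript
// Assumed payload shape; the protocol imports it as "Coordinate" from "./Types".
type Point = { x: number; y: number };

// String enum wrapping the protocol labels used in this fragment.
enum Labels { Pos = "Pos", Lose = "Lose", Win = "Win", Draw = "Draw" }

// Hypothetical marker for the terminal state.
type End = { readonly done: true };
const end: End = { done: true };

// Selection states: a union of [label, payload, successor] tuples.
type S16 = [Labels.Win, Point, End];
type S17 = [Labels.Draw, Point, End];
type S15 = [Labels.Lose, Point, S16] | [Labels.Draw, Point, S17];

// Branching state: one callback per branch, returning the successor's encoding.
type S13 = { Pos: (payload: Point) => S15 };

const onMove: S13 = {
  // Placeholder rule: a move on the diagonal wins, anything else draws.
  Pos: (p) =>
    p.x === p.y
      ? [Labels.Lose, p, [Labels.Win, p, end]]
      : [Labels.Draw, p, [Labels.Draw, p, end]],
  // Returning [Labels.Win, p, end] directly would be rejected by the compiler,
  // because Win is not one of the labels permitted by S15.
};
```

The value never mentions a channel: the callback only decides which label and payload come next, and the nesting of the tuples fixes the order of the sends.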

We make a key design decision not to expose communication channels in the TypeScript session type encodings, in order to provide static linearity guarantees (Section 3.1.2). Our encoding exposes sufficient seams for the developer to inject their program logic, whilst the generated session API (Section 3.1.1) handles the sending and receiving of messages.

3.1.1 Session Runtime

The generated code for our session runtime performs communication in a protocol-conformant manner, but does not expose these IO actions to the developer, delegating the aforementioned responsibilities to an inner class. The runtime executes the EFSM by keeping track of the current state (similar to the generated code in [11]) and only permitting the specified IO actions at the current state. The runtime listens to message (receiving) events on the communication channel, invokes the corresponding callback to obtain the value to send next, and performs the sending. The developer instantiates a session by constructing an instance of the session runtime class, providing the WebSocket endpoint URL (to open the connection) and the initial state (to execute the EFSM).
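
The behaviour described above can be sketched as a small loop over the state encodings. This is a simplified illustration under stated assumptions, not the generated runtime: the `SessionRuntime` class below takes a plain `send` callback in place of a WebSocket connection, uses `null` as a terminal marker, and ignores the routing of messages to different roles.

```typescript
type Message = { label: string; payload: unknown };

// Runtime view of the encodings: a branching state is a record of callbacks,
// a selection is a [label, payload, successor] tuple, and null marks
// termination (the null marker is an assumption of this sketch).
type Branch = { [label: string]: (payload: any) => State };
type Select = [string, unknown, State];
type State = Branch | Select | null;

class SessionRuntime {
  private current: State;

  // `send` stands in for the connection owned by the runtime.
  constructor(private send: (msg: Message) => void, initial: State) {
    this.current = initial;
    this.performSends();
  }

  get finished(): boolean {
    return this.current === null;
  }

  // Invoked by the transport on a message event.
  receive(msg: Message): void {
    const state = this.current;
    if (state === null || Array.isArray(state)) {
      throw new Error("no receive is permitted in the current state");
    }
    const handler = state[msg.label];
    if (handler === undefined) {
      throw new Error(`label ${msg.label} is not permitted in the current state`);
    }
    this.current = handler(msg.payload); // run the developer's callback
    this.performSends();
  }

  // Performs each pending send exactly once, then moves to its successor.
  private performSends(): void {
    while (Array.isArray(this.current)) {
      const [label, payload, next] = this.current;
      this.send({ label, payload });
      this.current = next;
    }
  }
}

// Usage: a toy protocol in which the payload 4 ends the session.
const awaitMove: State = {
  Pos: (cell: number): State =>
    cell === 4 ? ["Win", cell, null] : ["Update", cell, awaitMove],
};
const sent: Message[] = [];
const runtime = new SessionRuntime((msg) => { sent.push(msg); }, awaitMove);
runtime.receive({ label: "Pos", payload: 0 }); // sends Update, waits again
runtime.receive({ label: "Pos", payload: 4 }); // sends Win, terminates
```

Because the developer's callback only returns a value, the send itself happens in one place inside the runtime, which is where the exactly-once property comes from in this sketch.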

3.1.2 Linear Channel Usage

Developers writing their implementation using the generated APIs enjoy channel linearity by construction. Our library design prevents the two conditions detailed below.

Repeated Usage

We do not expose channels to the developer, which makes reusing channels impossible. For example, to send a message, the generated API only requires the payload that needs to be sent, and the session runtime performs the send internally, guaranteeing that this action is done exactly once by construction.

Unused Channels

The initial state must be supplied to the session runtime constructor in order to instantiate a session; this initial state is defined in terms of its successor states, which in turn have references to their successors, and so forth. The developer's implementation will cover the terminal state (if it exists), and the session runtime guarantees that this terminal state will be reached by construction.

3.2 The React Framework

Our session type encodings for browser-side targets build upon the React.js framework, developed by Facebook [21] for the Model-View-Controller (MVC) architecture. React is widely used in industry to create scalable single-page TypeScript applications, and we intend for our proposed workflow to be beneficial in an industrial context. We introduce the key features of the framework.

Components

A component is a reusable UI element which contains its own markup and logic. Components implement a render() function which returns a React element, the smallest building block of a React application, analogous to the view function in the MVU architecture. Components can keep state, and the render() function is invoked upon a change of state.

For example, a simple counter can be implemented as a component, with its count stored as state. When rendered, it displays a button which increments count when clicked and a div that renders the current count. If the button is clicked, the count is incremented, which triggers a re-rendering (since the state has changed), and the updated count is displayed.

Components can also render other components, which gives rise to a parent/child relationship between components. Parents can pass data to children as props (short for properties). Going back to the aforementioned example, the counter component could render a child component <StyledDiv count={this.state.count} /> in its render() function, propagating the count from its state to the child. This enables reusability, and for our use case, gives control to the parent on what data to pass to its children (e.g. pass the payload of a received message to a child to render).
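
The counter example can be sketched as follows. To keep the sketch runnable without React or a DOM, it uses a tiny hypothetical `MiniComponent` base class that mimics only the cycle of a state change triggering a render, and renders to a string; it is a stand-in for illustration and is not React's API.

```typescript
// Hypothetical stand-in for a React-style component: NOT React's API.
abstract class MiniComponent<P extends object, S extends object> {
  output = ""; // markup produced by the most recent render

  constructor(readonly props: P, protected state: S) {}

  abstract render(): string;

  mount(): void {
    this.output = this.render();
  }

  // A change of state triggers a re-render.
  protected setState(patch: Partial<S>): void {
    this.state = { ...this.state, ...patch };
    this.output = this.render();
  }
}

// Child component: receives the count from its parent as a prop.
class StyledDiv extends MiniComponent<{ count: number }, object> {
  render(): string {
    return `<div>${this.props.count}</div>`;
  }
}

// Parent component: keeps `count` as state and passes it down as a prop.
class Counter extends MiniComponent<object, { count: number }> {
  constructor() {
    super({}, { count: 0 });
  }

  // Stands in for the button's click handler.
  click(): void {
    this.setState({ count: this.state.count + 1 });
  }

  render(): string {
    const child = new StyledDiv({ count: this.state.count }, {});
    return `<button>+</button>${child.render()}`;
  }
}
```

Calling `click()` on a mounted `Counter` updates its state, which re-runs `render()` and passes the new count to the child, mirroring the parent-to-child flow of props described above.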

3.3 Browser-Side API Generation

Figure 2: EFSM for P1.

We refer to the P1 EFSM (Figure 2) as a running example in this section. Preserving behavioural typing and channel linearity is challenging for browser-side applications due to EFSM transitions being triggered by user events: in the case of Noughts and Crosses, once the user makes a move by clicking on a cell on the game board, this click event must be deactivated until the user's next turn, otherwise the user can click again and violate channel linearity. Our design goal is to enforce this statically through the generated APIs.

For browser-side targets, we extend the approach presented in [10] on multiple model types motivated by the Model-View-Update (MVU) architecture. An MVU application features a model encapsulating application state, a view function rendering the state on the Document Object Model (DOM), and an update function handling messages produced by the rendered model to produce a new model. The concept of model types expresses type dependencies between these components: a model type uniquely defines a view function, a set of messages and an update function; rather than producing a new model, the update function defines valid transitions to other model types. We leverage the correspondence between model types and states in the EFSM: each state in the EFSM is a model type, the set of messages represents the possible (IO) actions available at that state, and the update function defines which successor state to transition to, given the supported IO actions at this state.
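
The correspondence between model types and EFSM states can be sketched in TypeScript as below. The state names, the message shapes and the two update functions are hypothetical and cover only the first two states of P1; view functions are omitted.

```typescript
type Point = { x: number; y: number };

// One model type per EFSM state of P1 (names are hypothetical).
type MakeMove = { tag: "MakeMove" };           // P1 may send Pos
type AwaitResult = { tag: "AwaitResult" };     // P1 may receive Win, Draw or Update
type AwaitOpponent = { tag: "AwaitOpponent" }; // P1 waits for P2's move
type Finished = { tag: "Finished"; outcome: "won" | "draw" };

// Each model type has its own set of messages...
type MakeMoveMsg = { label: "Pos"; payload: Point };
type AwaitResultMsg = { label: "Win" | "Draw" | "Update"; payload: Point };

// ...and its own update function, whose return type lists the permitted
// successor model types.
function updateMakeMove(_model: MakeMove, _msg: MakeMoveMsg): AwaitResult {
  return { tag: "AwaitResult" };
}

function updateAwaitResult(
  _model: AwaitResult,
  msg: AwaitResultMsg,
): Finished | AwaitOpponent {
  switch (msg.label) {
    case "Win":
      return { tag: "Finished", outcome: "won" };
    case "Draw":
      return { tag: "Finished", outcome: "draw" };
    case "Update":
      return { tag: "AwaitOpponent" };
  }
}
```

Since `updateMakeMove` only accepts a `Pos` message and can only return `AwaitResult`, an attempt to handle, say, a `Win` message while in `MakeMove` does not type-check.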

3.3.1 Model Types in React

State

An EFSM state is encoded as an abstract React component. This is an abstract class to require the developer to provide their own view function, which translates conveniently to the render() function of React components. Our session runtime (Section 3.3.2) "executes" the EFSM and renders the current state. Upon transitioning to a successor state, the successor's view function will be invoked, as per the semantics expressed in [10].

Model Transitions

Transitions are encoded as React component props onto the encoded states by the session runtime (Section 3.3.2). We motivate the design choice of not exposing channel resources to provide guarantees on channel usage. React components in TypeScript are generic [19], parameterised by the permitted types of prop and state. The parameters allow us to leverage the TypeScript compiler to verify that the props for model transitions stay local to the state they are defined for. The model transitions for EFSMs are message send and receive.

Sending

We make the assumption that message sending is triggered by some user-driven UI event (e.g. clicking a button, pressing a key on the keyboard) which interacts with some DOM element. We could pass a send() function as a prop to the sending state, but the developer would be free to call the function multiple times, which makes channel reuse possible. Instead, we pass a factory function as a prop, which will, given an HTML event and an event handler function, return a fresh React component that binds the sending action on construction. So once the bound event is triggered, our session runtime executes the event handler function to obtain the payload to send, performs the send exactly once and transitions to (which, in practice, means renders) the successor state.
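
The idea behind the factory function can be sketched without React or a DOM. In the sketch below, `makeSendFactory`, its `send` and `transition` parameters and the `used` flag are hypothetical names; the function returned by the factory stands in for the fresh React component, and the flag models the fact that the sending state is no longer rendered after the transition.

```typescript
type Point = { x: number; y: number };
type UiEvent = { type: string };

// Builds the factory that a runtime could pass to a sending state as a prop.
// `send` and `transition` stay private to the runtime; the developer only
// ever sees the factory.
function makeSendFactory<T>(
  send: (payload: T) => void,
  transition: () => void,
) {
  let used = false; // shared by every element bound in this state

  return (eventName: string, handler: (event: UiEvent) => T) => {
    // Stand-in for the fresh component: reacts to the bound event and
    // performs the send at most once across all bound elements.
    return (event: UiEvent): void => {
      if (used || event.type !== eventName) return;
      used = true;
      send(handler(event)); // the handler supplies the payload
      transition();         // render the successor state
    };
  };
}

// Usage: bind the Pos send to two cells of the board.
const sentMoves: Point[] = [];
let transitions = 0;
const Pos = makeSendFactory<Point>(
  (p) => { sentMoves.push(p); },
  () => { transitions += 1; },
);
const cellA = Pos("click", () => ({ x: 0, y: 0 }));
const cellB = Pos("click", () => ({ x: 1, y: 2 }));

cellA({ type: "keydown" }); // ignored: not the bound event
cellB({ type: "click" });   // sends { x: 1, y: 2 } and transitions
cellA({ type: "click" });   // ignored: the send has already happened
```

The developer can bind the action to as many elements as they like, yet never holds a function that sends directly, so repeated clicks cannot produce a second message.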

// Inside some render() function...
{board.map((row, x) => (
    row.map((col, y) => {
        const SelectPoint = this.props.Pos('click', (event: UIEvent) => {
            event.preventDefault();
            return { x: x, y: y };
        });
        return <SelectPoint><td>.</td></SelectPoint>;
    })
))}

We demonstrate the semantics using the Noughts and Crosses example in Listing 3 (above). The session runtime passes the factory function this.props.Pos as a prop. For each x-y coordinate on the game board, we create a SelectPoint React component from the factory function (which reads "build a React component that sends the Pos message with x-y coordinates as payload when the user clicks on it") and we wrap a table cell (the game board is rendered as an HTML table) inside the SelectPoint component to bind the click event on the table cell.

Receiving

The React component for a receiving state is required to define a handler for each supported branch. Upon a message receive event, the session runtime invokes the handler of the corresponding branch with the message payload and renders the successor state upon completion.

3.3.2 Session Runtime

The session runtime can be interpreted as an abstraction on top of the React virtual DOM (VDOM) that implements the EFSM by construction. The session runtime itself is a React component too, named after the endpoint role identifier: it opens the WebSocket connection to the server, keeps track of the current EFSM state as part of its React component state, and most importantly, renders the React component encoding of the active EFSM state. Channel communications are managed by the runtime, which allows it to render the successor of a receive state upon receiving a message from the channel. Similarly, the session runtime is responsible for passing the required props for model transitions to EFSM state React components. The session runtime component is rendered by the developer and requires, as props, the endpoint URL (so it can open the connection) and a list of concrete state components.

The developer writes their own implementation of each state (mainly to customise how the state is rendered and inject business logic into state transitions) by extending the abstract React class components. The session runtime requires references to these concrete components in order to render the user implementation accordingly.

3.3.3 Affine Channel Usage

A limitation of our browser-side session type encoding is only being able to guarantee that channel resources are used at most once as opposed to exactly once.

Communication channels are not exposed to the developer, so multiple sends are impossible. This does not restrict the developer to binding the send action to exactly one UI event: for Noughts and Crosses, we bind the Pos(Point) send action to each unoccupied cell on the game board, but the generated runtime ensures that, once a cell is clicked, the send is only performed once and the successor state is rendered on the DOM, so the channel resource used to send becomes unavailable.

However, our approach does not statically detect whether all transitions in a certain state are bound to some UI event. This means that it is possible for an implementation to not handle transitions to a terminal state but still type-check, so we cannot prevent unused states. Equally, our approach does not prevent a client closing the browser, which would drop the connection.

4 Case Study

const handleP1Move: S13 = (move: Point) => {
    board.P1(move);         // User logic
    if (board.won()) {
        return [Labels.Lose, move, [Labels.Win, move]];
    } else if (board.draw()) {
        return [Labels.Draw, move, [Labels.Draw, move]];
    } else {
        return [Labels.Update, move, [Labels.Update, move, handleP2Move]];
    }
}

// Instantiate session - `handleP2Move` defined similarly as S19
new NoughtsAndCrosses.Svr(webSocketServer, handleP1Move);

We apply our framework to build a web-based implementation of the Noughts and Crosses running example in TypeScript; the interested reader can find the full implementation in [NoughtsAndCrosses]. In addition to MPST-safety, we show that our library design welcomes idiomatic JavaScript practices in the user implementation and is interoperable with common front- and back-end frameworks.

Game Server

We set up the WebSocket server as an Express.js [9] application on top of a Node.js [NodeJS] runtime. We define our own game logic in a Board class to keep track of the game state and expose methods to query the result. This custom logic is integrated into our handleP1Move and handleP2Move handlers (Listing 4, above), so the session runtime can handle Pos(Point) messages from players and transition to the permitted successor states (Listing 1) according to the injected game logic: if P1 played a winning move (the first branch of the handler), Svr sends a Lose message to P2 with the winning move, and also sends a Win message to P1; if P1's move resulted in a draw (the second branch), Svr sends Draw messages to both P2 and P1; otherwise, the game continues (the final branch), so Svr updates both P2 and P1 with the latest move and proceeds to handle P2's turn.

Note that, by TypeScript's structural typing [19], replacing handleP2Move in the final branch with a recursive occurrence of handleP1Move would be type-correct. This allows for better code reuse, as opposed to defining additional abstractions to work around the limitations of nominal typing in [12]. There is also full type erasure when transpiling to JavaScript to run the server code, so the types defined in TypeScript will not appear in the JavaScript after type-checking. This means state space explosion is not a runtime consideration.
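
The structural typing remark can be illustrated with a reduced version of the encoding. The sketch below is an assumption-laden simplification: it keeps only the Update branch, collapses the two consecutive sends into one tuple, and writes the branching state as an object literal, so `S13` and `S19` here are not the generated types.

```typescript
type Point = { x: number; y: number };
enum Labels { Pos = "Pos", Update = "Update" }

// Two mutually recursive state encodings with different names but the same
// structure: S13 awaits P1's move, S19 awaits P2's move.
type S13 = { Pos: (payload: Point) => [Labels.Update, Point, S19] };
type S19 = { Pos: (payload: Point) => [Labels.Update, Point, S13] };

let handledMoves = 0;

// A single handler serves both turns. It is declared as S13, yet it is
// accepted in the position where S19 is expected, because TypeScript
// compares the two types by structure rather than by name.
const handleMove: S13 = {
  Pos: (move) => {
    handledMoves += 1; // placeholder for the game logic
    return [Labels.Update, move, handleMove];
  },
};
```

Under a nominal type system the two states would be distinct classes, and reusing one handler for both would need an extra abstraction.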

Game Clients

We implement the game client for P1 and P2 by extending from the generated abstract React (EFSM state) components and registering those to the session runtime component.

For the sake of code reuse, [NoughtsAndCrosses] uses higher-order components (HOC) to build the correct state implementations depending on which player the user chooses to be. We leverage the Redux [2] state management library to keep track of the game state, thus showing the flexibility of our library design in being interoperable with other libraries and idiomatic JavaScript practices. Our approach encourages the separation of concerns between the communication logic and program logic: the generated session runtime keeps track of the state of the EFSM to ensure protocol conformance by construction, whilst Redux solely manages our game state.

5 Related Work

The two main approaches for incorporating our MPST workflow into application development are native language support for first-class linear channel resources [23] and code generation. The latter closely relates to our proposal; we highlight two areas of existing work under this approach that motivate our design choice.

Endpoint API Generation

Neykova and Yoshida targeted Python applications and the generation of runtime monitors [16] to dynamically verify communication patterns. Whilst the same approach could be applied to JavaScript, we can provide more static guarantees with TypeScript’s gradual typing system. Scribble-Java [12] proposed to encode the EFSM states and transitions as classes and instance methods respectively, with behavioural typing achieved statically by the type system and channel linearity guarantees achieved dynamically since channels are exposed and aliasing is not monitored. Scribble-Java can generate callback-style APIs similar to the approach we present, but this approach is arguably less idiomatic for Java developers.

Session Types in Web Development

King et al. [14] targeted web development in PureScript using the Concur UI framework and proposed a type-level encoding of EFSMs as multi-parameter type classes. However, it presents a trade-off between achieving static linearity guarantees from the type-level EFSM encoding under the expressive type system and providing an intuitive development experience to developers, especially given the prevalence of JavaScript and TypeScript applications in industry. Fowler [10] focused on applying binary session types in front-end web development and presented approaches that tackle the challenge of guaranteeing linearity in the event-driven environment, whereas our work is applicable to multiparty scenarios.

Our work applies the aforementioned approaches in a multiparty context using industrial tools and practices to ultimately encourage MPST-safe web application development workflows in industry.

6 Conclusion and Future Work

We have presented an MPST-based framework for developing full-stack interactive TypeScript applications with WebSocket communications. The implementation conforms to a specified protocol, statically providing linear channel usage guarantees and affine channel usage guarantees for back-end and front-end targets respectively.

Future work includes incorporating explicit connection actions introduced in [13] in our API generation to better model real-world communication protocols that may feature in interactive web applications. Server-side implementations may perform asynchronous operations on the received messages, so supporting asynchronous values (such as JavaScript Promises [4]) in IO actions would be a welcome addition. Whilst our approach supports multiparty sessions, the nature of WebSockets requires some server-based role in the communication protocol and clients to interact via the server. Extending support to WebRTC [22] would cater for peer-to-peer communication between browsers, which further opens up possibilities for communication protocols supported by our approach.

Acknowledgements

We thank the anonymous reviewers for their feedback.
This work was supported in part by EPSRC projects EP/K011715/1, EP/K034413/1, EP/L00058X/1, EP/N027833/1, EP/N028201/1, and EP/T006544/1.

References

  • [1]
  • [2] Dan Abramov (2015): Redux - A predictable state container for JavaScript apps. Available at https://redux.js.org/.
  • [3] Gavin Bierman, Martín Abadi & Mads Torgersen (2014): Understanding TypeScript. In Richard Jones, editor: ECOOP 2014 – Object-Oriented Programming, Lecture Notes in Computer Science, Springer, pp. 257–281, doi:10.1007/978-3-662-44202-9_11.
  • [4] MDN contributors (2020): Promise. Available at https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise.
  • [5] Ezra Cooper, Sam Lindley, Philip Wadler & Jeremy Yallop (2007): Links: Web Programming Without Tiers, pp. 266–296. LNCS 4709, Springer Berlin Heidelberg, doi:10.1007/978-3-540-74792-5_12.
  • [6] Mario Coppo, Mariangiola Dezani-Ciancaglini, Luca Padovani & Nobuko Yoshida (2015): A Gentle Introduction to Multiparty Asynchronous Session Types. In: 15th International School on Formal Methods for the Design of Computer, Communication and Software Systems: Multicore Programming, LNCS 9104, Springer, pp. 146–178, doi:10.1007/978-3-319-18941-3_4.
  • [7] Pierre-Malo Deniélou & Nobuko Yoshida (2013): Multiparty Compatibility in Communicating Automata: Characterisation and Synthesis of Global Session Types. In: 40th International Colloquium on Automata, Languages and Programming, LNCS 7966, Springer, Berlin, Heidelberg, pp. 174–186, doi:10.1007/978-3-642-39212-2_18.
  • [8] I. Fette & A. Melnikov (2011): The WebSocket Protocol. RFC 6455, RFC Editor. Available at https://www.rfc-editor.org/rfc/rfc6455.txt.
  • [9] Node.js Foundation: Express - Node.js web application framework. Available at https://expressjs.com/.
  • [10] Simon Fowler (2019): Model-View-Update-Communicate: Session Types meet the Elm Architecture. arXiv:1910.11108 [cs]. Available at http://arxiv.org/abs/1910.11108.
  • [11] Rosita Gerbo & Luca Padovani (2019): Concurrent Typestate-Oriented Programming in Java. Electronic Proceedings in Theoretical Computer Science 291, pp. 24–34, doi:10.4204/EPTCS.291.3.
  • [12] Raymond Hu & Nobuko Yoshida (2016): Hybrid Session Verification through Endpoint API Generation. In: 19th International Conference on Fundamental Approaches to Software Engineering, LNCS 9633, Springer, pp. 401–418, doi:10.1007/978-3-662-49665-7_24.
  • [13] Raymond Hu & Nobuko Yoshida (2017): Explicit Connection Actions in Multiparty Session Types. LNCS 10202, Springer Berlin Heidelberg, pp. 116–133, doi:10.1007/978-3-662-54494-5_7.
  • [14] Jonathan King, Nicholas Ng & Nobuko Yoshida (2019): Multiparty Session Type-safe Web Development with Static Linearity. Electronic Proceedings in Theoretical Computer Science 291, pp. 35–46, doi:10.4204/EPTCS.291.4.
  • [15] Anson Miu (2020): ansonmiu0214/scribble-noughts-and-crosses. Available at https://github.com/ansonmiu0214/scribble-noughts-and-crosses.
  • [16] Rumyana Neykova & Nobuko Yoshida (2017): How to Verify Your Python Conversations. Behavioural Types: from Theory to Tools, pp. 77–98, doi:10.13052/rp-9788793519817.
  • [17] Rumyana Neykova & Nobuko Yoshida (2019): Featherweight Scribble. LNCS 11665, Springer International Publishing, pp. 236–259, doi:10.1007/978-3-030-21485-2_14.
  • [18] Node.js: Node.js. Available at https://nodejs.org/en/.
  • [19] Microsoft Research: TypeScript Language Specification. Available at https://github.com/microsoft/TypeScript.
  • [20] Facebook Open Source: Introducing JSX – React. Available at https://reactjs.org/docs/introducing-jsx.html.
  • [21] Facebook Open Source: React – A JavaScript library for building user interfaces. Available at https://reactjs.org/.
  • [22] Justin Uberti & Peter Thatcher (2011): WebRTC. Available at https://webrtc.org/.
  • [23] Hongwei Xi (2017): Applied Type System: An Approach to Practical Programming with Theorem-Proving. arXiv:1703.08683 [cs]. Available at http://arxiv.org/abs/1703.08683.
  • [24] Nobuko Yoshida, Raymond Hu, Rumyana Neykova & Nicholas Ng (2013): The Scribble Protocol Language. In: 8th International Symposium on Trustworthy Global Computing, LNCS 8358, Springer, pp. 22–41, doi:10.1007/978-3-319-05119-2_3.

3 Encoding Session Types in TypeScript

Developers can implement their application using APIs generated from the EFSM to guarantee correctness by construction. Our approach integrates the EFSM into the development workflow by encoding session types as TypeScript types. Communication over the WebSocket protocol introduces additional constraints: communication is always initiated in the front-end and driven by user interactions, whilst back-end roles can only accept connections. This motivates our design of encoding the session types differently for server (Section 3.1) and client (Section 3.3) targets.

3.1 Server-Side API Generation

Figure 1: EFSM for Svr.

We refer to the Svr EFSM (Figure 1) as a running example in this section. For server-side targets, we encode EFSM states into TypeScript types and consider branching (receiving) and selection (sending) states separately. We assign TypeScript encodings of states to their state identifiers in the EFSM, providing syntactic sugar for referring to the successor state when encoding the current state. For any state $S$ in the EFSM, we refer to the TypeScript type alias of its encoding as $\llbracket S \rrbracket$. We outline the encoding below using examples from the Noughts and Crosses game, shown in the listing that follows.

Branching State

We consider a receiving state as a unary branching state for conciseness. A branching state is encoded as an object literal type [19] (a record type), with each branch $i \in I$ ($I$ denoting the set of all branches) corresponding to a member field. A branch expecting to receive a message labelled $\mathit{label}_i$ carrying a payload of type $T_i$ with successor state $S_i$ is encoded as a member field named $\mathit{label}_i$ of function type (payload: $T_i$) => $\llbracket S_i \rrbracket$. The developer implements a branching operation by passing callbacks for each branch, parameterised by the expected message payload type for that branch.

Selection State

We consider a sending state as a unary selection state for conciseness. A selection state is encoded as a union type [19] of internal choice encodings: each internal choice $i \in I$ ($I$ denoting the set of all choices), sending a message labelled $\mathit{label}_i$ carrying a payload of type $T_i$ with successor state $S_i$, is encoded as a tuple type [Labels.$\mathit{label}_i$, $T_i$, $\llbracket S_i \rrbracket$]. The developer implements a selection operation by passing the selected label and the payload to send in the message. We generate a string enum (named Labels) wrapping the labels in the protocol.

export type S13 = { Pos: (payload: Point) => S15 };
export type S15 = [ Labels.Lose, Point, S16 ]
                | [ Labels.Draw, Point, S17 ]
                | [ Labels.Update, Point, S18 ];
\end{figure}

% REVIEW:
% Also, when describing Listing 2, it would be helpful to say a bit more explicitly that the client is supposed to implement S13, and the code in S13 returns an object of type S15. Then you can point to Listing 4 if they are coordinated.
In the case of \cref{lst:svr}, the developer is expected to implement
\texttt{S13}, which handles the \texttt{Pos} message sent by \texttt{P1};
the handler in \texttt{S13} returns a value of type \texttt{S15}, which
corresponds to a selection of messages to send to \texttt{P2}. \Cref{lst:svrprotocol}
illustrates how the developer may implement these types.

We make a key design decision \textit{not} to expose communication channels in
the TypeScript session type encodings, in order to provide \textit{static} linearity
guarantees (\cref{section:serverlinear}).
Our encoding exposes sufficient seams for the developer to inject their
program logic, whilst the generated session API
(\cref{section:serversessionapi}) handles the sending and receiving of
messages.
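The encoding of \texttt{S13} and \texttt{S15} can be exercised with a minimal sketch such as the one below. Only \texttt{S13}, \texttt{S15}, \texttt{Labels} and \texttt{Point} are taken from the listing; the shape of \texttt{Point}, the placeholder successor states, the name \texttt{handleP1Move} and the toy game logic are assumptions made for illustration, and this is not the implementation referenced by \cref{lst:svrprotocol}.

```typescript
// Assumed payload shape; the generated API defines the real one.
type Point = { x: number; y: number };

// String enum wrapping the protocol labels, as described in the text.
enum Labels {
  Lose = "Lose",
  Draw = "Draw",
  Update = "Update",
}

// Placeholder successor states (hypothetical); the generated API would
// derive S16, S17 and S18 from the EFSM.
type S16 = { readonly state: "S16" };
type S17 = { readonly state: "S17" };
type S18 = { readonly state: "S18" };

// Encodings from the listing: a branching state and a selection state.
type S13 = { Pos: (payload: Point) => S15 };
type S15 =
  | [Labels.Lose, Point, S16]
  | [Labels.Draw, Point, S17]
  | [Labels.Update, Point, S18];

// Toy game logic (assumption): the game is drawn once all nine cells are
// taken. Win detection is omitted to keep the sketch short.
const occupied = new Set<string>();

// The developer supplies one callback per branch; each callback returns
// the selection to perform next. No channel appears anywhere.
const handleP1Move: S13 = {
  Pos: (move: Point): S15 => {
    occupied.add(`${move.x},${move.y}`);
    if (occupied.size === 9) {
      return [Labels.Draw, move, { state: "S17" }];
    }
    return [Labels.Update, move, { state: "S18" }];
  },
};
```

Returning a label that is not part of \texttt{S15}, or pairing a label with the wrong successor state, is rejected by the TypeScript compiler, which is the property the encoding relies on.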
\subsubsection{Session Runtime}
\label{section:serversessionapi}

The generated code for our session runtime performs communication in a protocol-conformant manner, but
does not expose these IO actions to the developer: it delegates them
to an inner class.
% REVIEW: it is not clear what is
% generated on the server-side and what actually is the session runtime
The runtime executes the EFSM by keeping track of
the current state (similar to the generated code in \cite{javatypestate})
and only permitting the specified IO actions at the current state.
The runtime listens to message (receiving) events on the communication channel,
invokes the corresponding callback to obtain the value to send next, and
performs the sending.
The developer instantiates a session by constructing an instance of the
session runtime class, providing the WebSocket endpoint URL (to open the
connection) and the initial state (to execute the EFSM).
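The behaviour described above can be sketched as follows. This is an illustrative model rather than the generated runtime: the tagged \texttt{State} type, the \texttt{Channel} interface (an in-memory stand-in for the WebSocket connection, used here in place of an endpoint URL) and the class and method names are all assumptions.

```typescript
// Hypothetical runtime-level view of EFSM states.
type State =
  | { kind: "recv"; handlers: { [label: string]: (payload: unknown) => State } }
  | { kind: "send"; label: string; payload: unknown; next: State }
  | { kind: "end" };

// Stand-in for the WebSocket connection (assumption).
interface Channel {
  send(data: string): void;
}

class SessionRuntime {
  private current: State;
  private readonly channel: Channel;

  constructor(channel: Channel, initial: State) {
    this.channel = channel;
    this.current = initial;
    this.flushSends();
  }

  // Called by the transport whenever a message arrives.
  onMessage(data: string): void {
    const state = this.current;
    if (state.kind !== "recv") {
      throw new Error("protocol violation: not in a receiving state");
    }
    const message = JSON.parse(data) as { label: string; payload: unknown };
    if (!Object.prototype.hasOwnProperty.call(state.handlers, message.label)) {
      throw new Error(`protocol violation: unexpected label ${message.label}`);
    }
    // The developer's callback yields the next state; the runtime, not the
    // developer, performs any send that this state prescribes.
    this.current = state.handlers[message.label](message.payload);
    this.flushSends();
  }

  isFinished(): boolean {
    return this.current.kind === "end";
  }

  // Performs each pending send exactly once, then waits at the next
  // receiving or terminal state.
  private flushSends(): void {
    let state = this.current;
    while (state.kind === "send") {
      this.channel.send(JSON.stringify({ label: state.label, payload: state.payload }));
      state = state.next;
    }
    this.current = state;
  }
}
```

Because the channel is a private field, developer code can only influence communication through the states it returns, which is the design choice the linearity argument builds on.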
\subsubsection{Linear Channel Usage}
\label{section:serverlinear}
Developers writing their implementation using the generated APIs
enjoy channel linearity by construction.
Our library design prevents the two violations detailed below:

\paragraph{Repeated Usage}
We do not expose channels to the developer, which makes \textit{reusing
  channels} impossible.
For example, to send a message, the generated API only requires the payload
that needs to be sent, and the session runtime performs the send internally,
guaranteeing this action is done \textit{exactly once} by construction.

\paragraph{Unused Channels}
The initial state must be supplied to the session runtime
constructor in order to instantiate a session;
this initial state is defined
in terms of its successor states, which in turn have references to their
successors, and so forth.
The developer's implementation will cover the terminal state
(if it exists), and the
session runtime guarantees this terminal state will be reached
by construction.
\subsection{The React Framework}
Our session type encodings for browser-side targets build upon the
\emph{React.js} framework, developed by Facebook \cite{React} for the
\textit{Model-View-Controller} (MVC) architecture.
React is widely used in industry to create scalable single-page TypeScript
applications, and we intend for our proposed workflow to be beneficial in an
industrial context.
We introduce the key features of the framework.

\paragraph{Components}
A component is a reusable UI element which
contains its own markup and logic.
Components implement a \texttt{render()} function which returns a React
element, the smallest building block of a React application; this function is analogous to the
view function in the MVU architecture.
Components can keep \textit{state}, and the \texttt{render()} function is
invoked upon a change of state.

For example, a simple counter can be implemented as a component,
with its \texttt{count} stored as state.
When rendered, it displays a button which increments \texttt{count}
when clicked and a \texttt{div} that renders the current
\texttt{count}.
If the button is clicked, the \texttt{count} is incremented, which triggers a
re-rendering (since the state has changed), and the updated \texttt{count} is
displayed.

Components can also render other components, which gives rise
to a parent/child relationship between components.
Parents can pass data to children as \textit{props} (short for properties).
Going back to the aforementioned example, the counter component could
render a child component \texttt{<StyledDiv count=\{this.state.count\} />} in
its \texttt{render()} function, propagating the \texttt{count} from its state
to the child.
This enables reusability and, for our use case, gives the parent control
over what data to pass to its children (e.g. passing the payload of a received
message to a child to render).
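The state, re-rendering and props mechanics of the counter example can be modelled without any framework, as in the sketch below. It deliberately does not use React's API: \texttt{MiniComponent} is a hypothetical base class that mimics a stateful component by re-invoking \texttt{render()} on every state change, and \texttt{StyledDiv} is modelled as a plain function of its props that returns markup as a string.

```typescript
// Hypothetical stand-in for a stateful component; not React's API.
abstract class MiniComponent<S> {
  protected state: S;
  output = "";   // markup produced by the latest render
  renders = 0;   // number of times render() has been invoked

  constructor(initial: S) {
    this.state = initial;
  }

  abstract render(): string;

  // Initial render.
  mount(): void {
    this.update();
  }

  // A change of state triggers a re-render.
  protected setState(next: S): void {
    this.state = next;
    this.update();
  }

  private update(): void {
    this.renders += 1;
    this.output = this.render();
  }
}

// Child "component": receives count from its parent as a prop.
function StyledDiv(props: { count: number }): string {
  return `<div class="styled">${props.count}</div>`;
}

class Counter extends MiniComponent<{ count: number }> {
  constructor() {
    super({ count: 0 });
  }

  // Stands in for the button's click handler.
  click(): void {
    this.setState({ count: this.state.count + 1 });
  }

  // The parent decides which part of its state the child gets to see.
  render(): string {
    return `<button>+1</button>${StyledDiv({ count: this.state.count })}`;
  }
}
```

Here the child never sees the parent's state directly; it only receives what the parent chooses to pass, which mirrors how a received message payload could be handed to a child for rendering.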
\subsection{Browser-Side API Generation}

\label{section:browser}

\begin{wrapfigure}{R}{0.5\textwidth}
  \begin{center}
    \includegraphics[width=0.5\textwidth]{figures/efsm_p1.png}
  \end{center}

  \captionof{figure}{EFSM for \texttt{P1}.}
  \label{fig:efsmp1}
\end{wrapfigure}

We refer to the \texttt{P1} EFSM (\cref{fig:efsmp1}) as a running example in
this section.
Preserving behavioural typing and channel linearity is challenging
for browser-side applications because EFSM transitions are triggered by user
events:
in the case of \textit{Noughts and Crosses}, once the user makes a move by
clicking on a cell on the game board, this click event must be deactivated
until the user's next turn, otherwise the user can click again and violate
channel linearity.
Our design goal is to enforce this statically through the generated APIs.

For browser-side targets, we extend the approach presented in \cite{MVU2019} on
\textit{multiple model types} motivated by the \textit{Model-View-Update} (MVU)
architecture.
% FEEDBACK: introduce MVU and model type
An MVU application features a \textit{model} encapsulating application
state, a \textit{view function} rendering the state on the Document Object Model (DOM), and an
\textit{update function} handling \textit{messages} produced by the
rendered model to produce a new model.
The concept of model types expresses type dependencies between these
components: a \emph{model type} uniquely defines a \textit{view function}, a
set of \textit{messages} and an \textit{update function}; rather than
producing a new model, the update function defines valid transitions to
other model types.
We leverage the correspondence between model types and states in the EFSM:
each state in the EFSM is a model type, the set of messages represents
the possible (IO) actions available at that state,
and the update function defines which successor state to transition to,
given the supported IO actions at this state.
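The correspondence between model types and EFSM states can be illustrated with a minimal TypeScript sketch. The state names (\texttt{MyTurn}, \texttt{Waiting}, \texttt{Done}), the message labels and the shape of \texttt{Point} are invented for illustration and are not taken from the \texttt{P1} EFSM; the sketch only shows how each model type fixes its own message set and how the return type of its update function enumerates the permitted successors.

```typescript
// Assumed payload shape.
type Point = { x: number; y: number };

// Hypothetical model types, one per EFSM state.
type MyTurn = { state: "MyTurn" };
type Waiting = { state: "Waiting"; lastMove: Point };
type Done = { state: "Done"; outcome: "Win" | "Draw" };

// Each model type fixes its own set of messages (the IO actions
// available at that state).
type MyTurnMsg = { label: "Pos"; payload: Point };
type WaitingMsg =
  | { label: "Win" }
  | { label: "Draw" }
  | { label: "Update"; payload: Point };

// View functions: one per model type.
function viewMyTurn(_model: MyTurn): string {
  return "Your move";
}

function viewWaiting(model: Waiting): string {
  return `Waiting after (${model.lastMove.x}, ${model.lastMove.y})`;
}

// Update functions: the return type lists the successor model types,
// so a transition outside the EFSM does not type-check.
function updateMyTurn(_model: MyTurn, msg: MyTurnMsg): Waiting {
  return { state: "Waiting", lastMove: msg.payload };
}

function updateWaiting(_model: Waiting, msg: WaitingMsg): MyTurn | Done {
  switch (msg.label) {
    case "Win":
    case "Draw":
      return { state: "Done", outcome: msg.label };
    case "Update":
      return { state: "MyTurn" };
  }
}
```

Since \texttt{updateMyTurn} only accepts a \texttt{MyTurn} model, a second \texttt{Pos} message cannot be processed while the application is in \texttt{Waiting}, which is the kind of repeated click the text aims to rule out statically.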
\subsubsection{Model Types in React}

\paragraph{State}
An EFSM state is encoded as an \textit{abstract} React
component.
This is an abstract class, which requires the developer to provide their
own view function; this translates conveniently to the \texttt{render()}
function of React components.
Our session runtime (\cref{section:clientruntime}) ``executes'' the EFSM and
renders the current state.
Upon transitioning to a successor state, the successor's view function will be
invoked, as per the semantics expressed in \cite{MVU2019}.

\paragraph{Model Transitions}
Transitions are encoded as React component props, passed onto the encoded states by the
session runtime (\cref{section:clientruntime}).
As on the server side, we choose not to expose channel resources, in order to provide
guarantees on channel usage.
React components in TypeScript are
\textit{generic} \cite{TypeScriptSpec}, parameterised by the permitted
types of props and state.
The parameters allow us to leverage the TypeScript compiler to
verify that the props for model transitions stay local to the state they are
defined for.
The model transitions for EFSMs are message send and receive.

\subparagraph{Sending}
We make the assumption that message sending is