Specification and Runtime Checking of Derecho, A Protocol for Fast Replication for Cloud Services

05/19/2023
by Kumar Shivam, et al.

Reliable distributed systems require replication and consensus among distributed processes to tolerate process and communication failures. Understanding and assuring the correctness of protocols for replication and consensus have been a significant challenge. This paper describes the precise specification and runtime checking of Derecho, a more recent, sophisticated protocol for fast replication and consensus for cloud services. A precise specification must fill in missing details and resolve ambiguities in English and pseudocode algorithm descriptions while also faithfully following the descriptions. To help check the correctness of the protocol, we also performed careful manual analysis and increasingly systematic runtime checking. We obtain a complete specification that is directly executable, and we discover and fix a number of issues in the pseudocode. These results were facilitated by the already detailed pseudocode of Derecho and made possible by using DistAlgo, a language that allows distributed algorithms to be easily and clearly expressed and directly executed.


