A Domain Specific Language for Testing Consensus Implementations

03/10/2023
by Cezara Dragoi, et al.

Large-scale, fault-tolerant distributed systems are the backbone of many critical software services. Because they must execute correctly in a possibly adversarial environment with arbitrary communication delays and failures, the underlying algorithms are intricate. In particular, achieving consistency and data retention relies on complex consensus (state machine replication) protocols. Ensuring the reliability of implementations of such protocols remains a significant challenge because of the enormous number of exceptional conditions that may arise in production. We propose a methodology and a tool called Netrix for testing such implementations. Netrix aims to exploit programmers' knowledge to improve coverage, enables robust bug reproduction, and can be used for regression testing across different versions of an implementation. For evaluation, we apply our tool to three systems: Tendermint, a popular proof-of-stake blockchain protocol that relies on a Byzantine consensus algorithm; Raft, a benign consensus algorithm; and BFT-Smart. We identified 4 deviations of the Tendermint implementation from the protocol specification and checked their absence in an updated implementation. Additionally, we reproduced 4 previously known bugs in Raft.


