From Reversible Computation to Checkpoint-Based Rollback Recovery for Message-Passing Concurrent Programs

09/09/2023
by Germán Vidal, et al.

The reliability of concurrent and distributed systems often depends on well-known fault-tolerance techniques. One such technique is checkpointing and rollback recovery. Checkpointing requires processes to take regular snapshots of their current states, so that a rollback recovery strategy can bring the system back to a previous consistent state whenever a failure occurs. In this paper, we consider a message-passing concurrent programming language and propose a novel rollback recovery strategy based on explicit checkpointing primitives and a (partially) reversible semantics for rolling back the system.
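
To make the general idea of checkpointing and rollback concrete, here is a minimal Python sketch for a single toy process. It is not the paper's language or semantics: the `Process` class and its `checkpoint` and `rollback` methods are hypothetical names chosen for illustration, and the sketch assumes that a snapshot is just a deep copy of local state and mailbox.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Process:
    """A toy process: local state plus a mailbox of pending messages.

    Hypothetical illustration only; not the primitives proposed in the paper.
    """
    name: str
    state: dict = field(default_factory=dict)
    mailbox: list = field(default_factory=list)
    _snapshots: list = field(default_factory=list, repr=False)

    def checkpoint(self) -> int:
        """Save a deep copy of state and mailbox; return the checkpoint id."""
        snapshot = (copy.deepcopy(self.state), copy.deepcopy(self.mailbox))
        self._snapshots.append(snapshot)
        return len(self._snapshots) - 1

    def rollback(self, checkpoint_id: int) -> None:
        """Restore the given snapshot and discard any later checkpoints."""
        state, mailbox = self._snapshots[checkpoint_id]
        self.state = copy.deepcopy(state)
        self.mailbox = copy.deepcopy(mailbox)
        del self._snapshots[checkpoint_id + 1:]


def demo():
    """Take a checkpoint, mutate the process, then roll back."""
    p = Process("worker", state={"count": 0})
    cp = p.checkpoint()
    p.state["count"] += 1        # work done after the checkpoint
    p.mailbox.append("job-1")    # message received after the checkpoint
    p.rollback(cp)               # simulate recovery after a failure
    return p.state, p.mailbox
```

This sketch restores one process in isolation. In a message-passing system, a rollback generally also has to account for messages exchanged after the checkpoint, so that the recovered global state stays consistent; the sketch does not attempt that.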

research
12/23/2021

A Lightweight Approach to Computing Message Races with an Application to Causal-Consistent Reversible Debugging

This paper presents a lightweight formalism (a trace) to model message-p...
research
12/14/2018

Mastering Concurrent Computing Through Sequential Thinking: A Half-century Evolution

Concurrency, the art of doing many things at the same time is slowly bec...
research
08/24/2017

Reliability and Fault-Tolerance by Choreographic Design

Distributed programs are hard to get right because they are required to ...
research
10/11/2017

A Semantics Comparison Workbench for a Concurrent, Asynchronous, Distributed Programming Language

A number of high-level languages and libraries have been proposed that o...
research
12/28/2017

Inferring Formal Properties of Production Key-Value Stores

Production distributed systems are challenging to formally verify, in pa...
research
10/27/2016

Fencing off Go: Liveness and Safety for Channel-based Programming (extended version)

Go is a production-level statically typed programming language whose des...
research
07/10/2018

Two-Phase Dynamic Analysis of Message-Passing Go Programs based on Vector Clocks

Understanding the run-time behavior of concurrent programs is a challeng...
