A Fault-Tolerance Shim for Serverless Computing

03/12/2020
by   Vikram Sreekanti, et al.
Serverless computing has grown in popularity in recent years, with an increasing number of applications being built on Functions-as-a-Service (FaaS) platforms. By default, FaaS platforms support retry-based fault tolerance, but this is insufficient for programs that modify shared state, as they can unwittingly persist partial sets of updates when failures occur. To address this challenge, we would like atomic visibility of the updates made by a FaaS application. In this paper, we present AFT, an atomic fault tolerance shim for serverless applications. AFT interposes between a commodity FaaS platform and storage engine and ensures atomic visibility of updates by enforcing the read atomic isolation guarantee. AFT supports new protocols to guarantee read atomic isolation in the serverless setting. We demonstrate that AFT introduces minimal overhead relative to existing storage engines and scales smoothly to thousands of requests per second, while preventing a significant number of consistency anomalies.
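
The partial-update problem described above can be illustrated with a toy example. The sketch below is not AFT's protocol or API; `ToyAtomicShim`, `run`, and `make_transfer` are hypothetical names, and an in-memory dict stands in for the storage engine. It only shows the general idea of buffering a function's writes and making them visible together, so that a failed attempt leaves nothing behind and a retry starts from clean state.

```python
class ToyAtomicShim:
    """Hypothetical illustration of atomic visibility; not the AFT implementation."""

    def __init__(self, initial):
        # Stands in for the storage engine; holds committed data only.
        self.store = dict(initial)

    def run(self, func):
        buffer = {}  # writes made by this invocation, not yet visible

        def read(key):
            # Read this invocation's own writes first, then committed state.
            if key in buffer:
                return buffer[key]
            return self.store.get(key)

        def write(key, value):
            buffer[key] = value

        func(read, write)          # an exception here discards the buffer
        self.store.update(buffer)  # commit: buffered writes become visible together


def make_transfer(fail_midway):
    """Build a function that moves 50 units and optionally fails between writes."""
    def transfer(read, write):
        write("alice", read("alice") - 50)
        if fail_midway:
            raise RuntimeError("simulated function failure")
        write("bob", read("bob") + 50)
    return transfer


shim = ToyAtomicShim({"alice": 100, "bob": 100})

try:
    shim.run(make_transfer(fail_midway=True))
except RuntimeError:
    pass
after_failure = dict(shim.store)  # snapshot: the partial update was never persisted

shim.run(make_transfer(fail_midway=False))  # the retry commits both writes together
after_retry = dict(shim.store)
```

In this single-process toy, the commit is one dictionary update. A real shim sitting in front of a distributed storage engine needs a protocol to make a multi-key commit appear atomic to concurrent readers, which is the problem the paper addresses.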
