Resilient Cloud-based Replication with Low Latency

09/21/2020
by Michael Eischer, et al.

Existing approaches to tolerating Byzantine faults in geo-replicated environments require systems to execute complex agreement protocols over wide-area links and are consequently often associated with high response times. In this paper, we address this problem with Spider, a resilient replication architecture for geo-distributed systems that leverages the availability characteristics of today's public-cloud infrastructures to minimize complexity and reduce latency. Spider models a system as a collection of loosely coupled replica groups whose members are hosted in different cloud-provided fault domains (i.e., availability zones) of the same geographic region. This structural organization makes it possible to achieve low response times by placing replica groups in close proximity to clients while still enabling the replicas of a group to interact over short-distance links. To handle the inter-group communication necessary for strong consistency, Spider uses a reliable group-to-group message channel with first-in-first-out semantics and built-in flow control, which significantly simplifies system design.
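
To make the channel abstraction concrete, the following is a minimal single-process sketch of a first-in-first-out channel with window-based flow control. It is not Spider's channel, which connects replica groups and has to tolerate Byzantine faults. The class name `FifoChannel`, the `window` parameter, and the methods `try_send` and `receive` are hypothetical names chosen for illustration only.

```python
from collections import deque


class FifoChannel:
    """Toy in-process FIFO channel with window-based flow control.

    Illustrative assumption only: no networking, no fault tolerance.
    """

    def __init__(self, window: int) -> None:
        if window < 1:
            raise ValueError("window must be at least 1")
        self.window = window      # max undelivered messages (assumed parameter)
        self.next_seq = 0         # sequence number assigned to the next send
        self.delivered = 0        # number of messages consumed by the receiver
        self.in_flight = deque()  # (seq, payload) pairs awaiting delivery

    def try_send(self, payload) -> bool:
        """Enqueue payload; return False when the window is full."""
        if self.next_seq - self.delivered >= self.window:
            return False  # back-pressure: the sender must retry later
        self.in_flight.append((self.next_seq, payload))
        self.next_seq += 1
        return True

    def receive(self):
        """Deliver the oldest undelivered message, or None if empty."""
        if not self.in_flight:
            return None
        seq, payload = self.in_flight.popleft()
        # FIFO invariant: messages are delivered in send order without gaps.
        assert seq == self.delivered
        self.delivered += 1
        return payload
```

In this sketch, a sender that fills the window is refused until the receiver consumes a message. The point is that a slow receiving group throttles the sending group at the channel, so the protocol logic on top of it needs no separate overload handling.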

research
01/27/2020

Hermes: a Fast, Fault-Tolerant and Linearizable Replication Protocol

Today's datacenter applications are underpinned by datastores that are r...
research
12/04/2021

Invalidation-Based Protocols for Replicated Datastores

Distributed in-memory datastores underpin cloud applications that run wi...
research
09/14/2021

Egalitarian Byzantine Fault Tolerance

Minimizing end-to-end latency in geo-replicated systems usually makes it...
research
02/08/2019

Consistency models in distributed systems: A survey on definitions, disciplines, challenges and applications

The replication mechanism resolves some challenges with big data such as...
research
09/20/2021

ApproxIFER: A Model-Agnostic Approach to Resilient and Robust Prediction Serving Systems

Due to the surge of cloud-assisted AI services, the problem of designing...
research
11/03/2020

AWARE: Adaptive Wide-Area Replication for Fast and Resilient Byzantine Consensus

With upcoming blockchain infrastructures, world-spanning Byzantine conse...
research
05/31/2022

Dropbear: Machine Learning Marketplaces made Trustworthy with Byzantine Model Agreement

Marketplaces for machine learning (ML) models are emerging as a way for ...
