PCRAFT: Capacity Planning for Dependable Stateless Services

06/15/2022
by Rasha Faqeh, et al.

Fault-tolerance techniques depend on replication to enhance availability, at the price of higher infrastructure costs. This results in a fundamental trade-off: fault-tolerant services must satisfy given availability and performance constraints while minimising the number of replicated resources. These constraints pose a capacity planning challenge for service operators, who must minimise replication costs without negatively impacting availability. To this end, we present PCRAFT, a system to enable capacity planning of dependable services. PCRAFT's capacity planning is based on a hybrid approach that combines empirical performance measurements with probabilistic modelling of availability based on fault injection. In particular, we integrate traditional service-level availability mechanisms (active route anywhere and passive failover) and deployment schemes (cloud and on-premises) to quantify the number of nodes needed to satisfy the given availability and performance constraints. Our evaluation, based on real-world applications, shows that cloud deployments require fewer nodes than on-premises deployments. Additionally, for on-premises deployments, we show that passive failover requires fewer nodes than active route anywhere. Furthermore, our evaluation quantifies the quality enhancement provided by additional integrity mechanisms and how it affects the number of nodes needed.
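To make the trade-off concrete, the following is a minimal sketch of one way to turn a performance constraint and an availability constraint into a node count. It is not PCRAFT's model: it assumes a simple k-of-n binomial model with independent node failures, and the names `availability_k_of_n`, `nodes_needed`, `peak_load`, `node_throughput`, and `p_up` are hypothetical. In a hybrid approach such as the one described above, the per-node throughput would come from empirical measurements and the per-node availability from fault-injection experiments.

```python
from math import ceil, comb


def availability_k_of_n(n: int, k: int, p_up: float) -> float:
    """Probability that at least k of n nodes are up.

    Assumption: nodes fail independently, each being up with probability p_up.
    """
    return sum(
        comb(n, i) * p_up**i * (1 - p_up) ** (n - i)
        for i in range(k, n + 1)
    )


def nodes_needed(
    peak_load: float,
    node_throughput: float,
    p_up: float,
    target_availability: float,
    max_nodes: int = 1000,
) -> int:
    """Smallest n such that enough nodes to serve peak_load are up
    with probability >= target_availability.

    peak_load and node_throughput share a unit (e.g. requests/second).
    """
    # Performance constraint: k nodes must be up to carry the peak load.
    k = ceil(peak_load / node_throughput)
    # Availability constraint: add spare nodes until the target is met.
    for n in range(k, max_nodes + 1):
        if availability_k_of_n(n, k, p_up) >= target_availability:
            return n
    raise ValueError("target availability not reachable within max_nodes")


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the paper.
    print(nodes_needed(peak_load=1000, node_throughput=250,
                       p_up=0.99, target_availability=0.999))
```

With these illustrative inputs, four nodes cover the load and one spare is enough to reach the target, so the sketch returns 5. A real planner would also need to account for correlated failures, failover time, and the chosen availability mechanism, none of which this simplified model captures.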


