Availability Analysis of Redundant and Replicated Cloud Services with Bayesian Networks

06/23/2023
by Otto Bibartiu, et al.

Due to the growing complexity of modern data centers, failures are no longer uncommon. Fault tolerance mechanisms therefore play a vital role in fulfilling availability requirements. Multiple availability models have been proposed to assess compute systems, among which Bayesian network models have gained popularity in industry and research due to their powerful modeling formalism. In particular, this work focuses on assessing the availability of redundant and replicated cloud computing services with Bayesian networks. So far, research on availability has focused on modeling either infrastructure or communication failures in Bayesian networks, but has not considered both simultaneously. This work addresses practical modeling challenges of assessing the availability of large-scale redundant and replicated services with Bayesian networks, including cascading and common-cause failures from the surrounding infrastructure and communication network. To ease the modeling task, this paper introduces a high-level modeling formalism to build such a Bayesian network automatically. Performance evaluations demonstrate the feasibility of the presented Bayesian network approach for assessing the availability of large-scale redundant and replicated services. The model is not only applicable in the domain of cloud computing; it can also be applied to general cases of local and geo-distributed systems.
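
To make the idea of a common-cause failure concrete, the following is a minimal sketch and not the paper's formalism or tooling. It assumes a hypothetical two-layer Bayesian network in which each rack is a root node, each replica is a child of the rack it is placed on (a replica can only be up while its rack is up), and the service counts as available when at least `k` replicas are up. The function name `service_availability`, the `placement` encoding, and the probabilities 0.99 and 0.9 are all illustrative assumptions. Availability is computed by exact enumeration, which is only practical for very small networks.

```python
from itertools import product


def service_availability(p_rack, p_replica, placement, k):
    """Exact P(at least k replicas are up) by enumerating a tiny Bayesian network.

    placement[i] is the index of the rack hosting replica i (hypothetical encoding).
    p_rack is the availability of each rack (root nodes).
    p_replica is P(replica up | its rack is up); a replica is down if its rack is down.
    """
    n_racks = max(placement) + 1
    total = 0.0
    for rack_states in product((True, False), repeat=n_racks):
        # Joint probability of this combination of rack states.
        p_racks = 1.0
        for up in rack_states:
            p_racks *= p_rack if up else 1.0 - p_rack
        for replica_states in product((True, False), repeat=len(placement)):
            p = p_racks
            for rack, up in zip(placement, replica_states):
                # Conditional probability table of a replica given its parent rack.
                p_up = p_replica if rack_states[rack] else 0.0
                p *= p_up if up else 1.0 - p_up
            if sum(replica_states) >= k:
                total += p
    return total


# Three replicas, 2-out-of-3 redundancy, with assumed example probabilities.
shared = service_availability(0.99, 0.9, [0, 0, 0], k=2)  # all on one rack
spread = service_availability(0.99, 0.9, [0, 1, 2], k=2)  # one rack each
print(f"shared rack: {shared:.6f}, separate racks: {spread:.6f}")
```

Under these assumed numbers, placing all replicas on one rack yields a lower service availability than spreading them across racks, because a single rack failure takes down every replica at once. Enumeration grows exponentially with the number of nodes, so large-scale models of the kind the abstract describes would need a proper inference method.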
