Enhancing Failure Propagation Analysis in Cloud Computing Systems

08/30/2019
by Domenico Cotroneo, et al.
University of Naples Federico II

To plan for failure recovery, the designers of cloud systems need to understand how their systems can fail. Unfortunately, analyzing the failure behavior of such systems can be very difficult and time-consuming, due to the large volume of events, non-determinism, and the reuse of third-party components. To address these issues, we propose a novel approach that combines fault injection with anomaly detection to identify the symptoms of failures. We evaluated the proposed approach in the context of the OpenStack cloud computing platform. We show that our model can significantly improve the accuracy of failure analysis, in terms of both false positives and false negatives, at a low computational cost.

09/30/2020

Fault Injection Analytics: A Novel Approach to Discover Failure Modes in Cloud-Computing Systems

Cloud computing systems fail in complex and unexpected ways due to unexp...
06/29/2021

Enhancing the Analysis of Software Failures in Cloud Computing Systems with Deep Learning

Identifying the failure modes of cloud computing systems is a difficult ...
07/09/2019

How Bad Can a Bug Get? An Empirical Analysis of Software Failures in the OpenStack Cloud Computing Platform

Cloud management systems provide abstractions and APIs for programmatica...
01/27/2019

Anomaly detecting and ranking of the cloud computing platform by multi-view learning

Anomaly detecting as an important technical in cloud computing is applie...
01/18/2023

Run-time Failure Detection via Non-intrusive Event Analysis in a Large-Scale Cloud Computing Platform

Cloud computing systems fail in complex and unforeseen ways due to unexp...
11/16/2021

Online Self-Evolving Anomaly Detection in Cloud Computing Environments

Modern cloud computing systems contain hundreds to thousands of computin...
02/02/2023

MLOps with enhanced performance control and observability

The explosion of data and its ever increasing complexity in the last few...