
How Bad Can a Bug Get? An Empirical Analysis of Software Failures in the OpenStack Cloud Computing Platform

by Domenico Cotroneo, et al.
University of Naples Federico II

Cloud management systems provide abstractions and APIs for programmatically configuring cloud infrastructures. Unfortunately, residual software bugs in these systems can lead to high-severity failures, such as prolonged outages and data losses. In this paper, we investigate the impact of failures in the widespread OpenStack cloud management system by performing fault injection and analyzing the resulting failures in terms of fail-stop behavior, failure detection through logging, and failure propagation across components. The analysis shows that most failures are not detected and reported in a timely manner; moreover, many of them can silently propagate over time and across components of the cloud management system, which calls for more thorough run-time checks and fault containment.

