
Towards Data-Driven Autonomics in Data Centers

by Alina Sîrbu, et al.
University of Bologna

Continued reliance on human operators for managing data centers is a major impediment to their ever reaching extreme dimensions. Large computer systems in general, and data centers in particular, will ultimately be managed using predictive computational and executable models obtained through data-science tools, and at that point, human intervention will be limited to setting high-level goals and policies rather than performing low-level operations. Data-driven autonomics, where management and control are based on holistic predictive models that are built and updated using generated data, opens one possible path towards limiting the role of operators in data centers. In this paper, we present a data-science study of a public Google dataset collected in a 12K-node cluster with the goal of building and evaluating a predictive model for node failures. We use BigQuery, the big data SQL platform from the Google Cloud suite, to process massive amounts of data and generate a rich feature set characterizing machine state over time. We describe how an ensemble classifier can be built out of many Random Forest classifiers, each trained on these features, to predict whether machines will fail in a future 24-hour window. Our evaluation reveals that if we limit false positive rates to 5%, [...] precision varying between 50% [...] our predictive model as the central component of a data-driven autonomic manager and operating it on-line with live data streams (rather than off-line on data logs). All of the scripts used for BigQuery and classification analyses are publicly available from the authors' website.
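To make the evaluation criterion in the abstract concrete, the following is a minimal Python sketch of how an ensemble's failure scores might be combined and then evaluated at a capped false positive rate. It is not taken from the authors' scripts. The abstract does not say how the individual Random Forest outputs are combined, so the unweighted averaging here is an assumption, and the function names `average_scores` and `metrics_at_fpr` as well as the toy scores are hypothetical.

```python
from typing import List, Sequence, Tuple


def average_scores(per_model_scores: Sequence[Sequence[float]]) -> List[float]:
    """Combine per-classifier failure scores by unweighted averaging.

    Assumption: the source does not specify the combination rule.
    """
    n_models = len(per_model_scores)
    return [sum(column) / n_models for column in zip(*per_model_scores)]


def metrics_at_fpr(
    scores: Sequence[float], labels: Sequence[int], max_fpr: float = 0.05
) -> Tuple[float, float, float]:
    """Return (threshold, true positive rate, precision) for the lowest
    score threshold whose false positive rate stays within max_fpr.

    labels: 1 if the machine failed in the following 24-hour window, else 0.
    Assumes at least one positive and one negative label.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best = (float("inf"), 0.0, 0.0)  # nothing flagged as "will fail"
    # Lowering the threshold can only add predictions, so the false
    # positive rate never decreases and we can stop at the first violation.
    for threshold in sorted(set(scores), reverse=True):
        flagged = [score >= threshold for score in scores]
        tp = sum(1 for f, y in zip(flagged, labels) if f and y == 1)
        fp = sum(1 for f, y in zip(flagged, labels) if f and y == 0)
        if fp / negatives > max_fpr:
            break
        best = (threshold, tp / positives, tp / (tp + fp))
    return best


# Toy scores from two hypothetical classifiers for six machines.
model_a = [1.0, 0.5, 0.75, 0.25, 0.0, 0.25]
model_b = [0.5, 0.5, 0.25, 0.25, 0.5, 0.0]
labels = [1, 1, 0, 0, 0, 0]

ensemble = average_scores([model_a, model_b])
threshold, tpr, precision = metrics_at_fpr(ensemble, labels, max_fpr=0.05)
print(threshold, tpr, precision)
```

Capping the false positive rate first and then reporting the true positive rate and precision at that operating point mirrors how the abstract states its results. With only four negative examples in the toy data, a 5% cap permits no false positives at all; a realistic evaluation would use far more machines.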

