Towards Data-Driven Autonomics in Data Centers

by Alina Sîrbu, et al.

Continued reliance on human operators for managing data centers is a major impediment to their ever reaching extreme dimensions. Large computer systems in general, and data centers in particular, will ultimately be managed using predictive computational and executable models obtained through data-science tools, and at that point, the intervention of humans will be limited to setting high-level goals and policies rather than performing low-level operations. Data-driven autonomics, where management and control are based on holistic predictive models that are built and updated using generated data, opens one possible path towards limiting the role of operators in data centers. In this paper, we present a data-science study of a public Google dataset collected in a 12K-node cluster, with the goal of building and evaluating a predictive model for node failures. We use BigQuery, the big data SQL platform from the Google Cloud suite, to process massive amounts of data and generate a rich feature set characterizing machine state over time. We describe how an ensemble classifier can be built out of many Random Forest classifiers, each trained on these features, to predict whether machines will fail in a future 24-hour window. Our evaluation reveals that if we limit false positive rates to 5%, [...] precision varying between 50% [...] our predictive model as the central component of a data-driven autonomic manager and operating it on-line with live data streams (rather than off-line on data logs). All of the scripts used for BigQuery and classification analyses are publicly available from the authors' website.
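
The evaluation above reports classifier quality at a fixed limit on the false positive rate. As a minimal sketch of how such figures can be derived from classifier scores, the following standard-library Python picks the lowest score threshold whose false positive rate stays within the limit and reports the resulting true positive rate and precision. It is not taken from the authors' scripts; the function name `metrics_at_fpr_limit` and the toy scores and labels are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def metrics_at_fpr_limit(
    scores: List[float], labels: List[int], max_fpr: float = 0.05
) -> Tuple[float, float, float, float]:
    """Return (threshold, tpr, fpr, precision) for the lowest threshold
    whose false positive rate does not exceed max_fpr.

    scores: predicted failure scores, higher means "more likely to fail".
    labels: 1 if the machine actually failed in the window, else 0.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")

    best = None
    # Try each distinct score as a threshold, from strictest to loosest.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        fpr = fp / negatives
        if fpr > max_fpr:
            break  # loosening the threshold further cannot lower the FPR
        tpr = tp / positives
        precision = tp / (tp + fp)  # tp + fp >= 1: some score equals threshold
        best = (threshold, tpr, fpr, precision)

    if best is None:
        raise ValueError("no threshold satisfies the false positive limit")
    return best


# Hypothetical toy data: 20 healthy machines and 4 machines that failed.
scores = [0.1] * 19 + [0.9] + [0.95, 0.92, 0.6, 0.05]
labels = [0] * 20 + [1, 1, 1, 0 + 1]

result = metrics_at_fpr_limit(scores, labels, max_fpr=0.05)
print("threshold=%.2f tpr=%.2f fpr=%.2f precision=%.2f" % result)
```

The sketch rescans the data for each candidate threshold, which is quadratic in the number of samples; that keeps the logic easy to follow but would be replaced by a single sorted pass on data at the scale described in the abstract.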


