Analytics of Longitudinal System Monitoring Data for Performance Prediction

07/07/2020
by Ian J. Costello, et al.

In recent years, several HPC facilities have started continuously monitoring their systems and jobs to collect performance-related data for understanding performance and operational efficiency. Such data can be used to optimize the performance of individual jobs and of the overall system by building data-driven models that predict the performance of pending jobs. In this paper, we model the performance of representative control jobs using longitudinal system-wide monitoring data to explore the causes of performance variability. Using machine learning, we are able to predict the performance of unseen jobs before they are executed, based on the current system state. We analyze these prediction models in detail to identify the features that are dominant predictors of performance. We demonstrate that such models can be application-agnostic and can be used to predict the performance of applications that are not included in the training data.
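The workflow described above (fit a model on system-state features, predict the performance of unseen jobs, then ask which features dominate) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the feature names (`io_load`, `network_traffic`, `idle_nodes`) are hypothetical, the data is synthetic, and ordinary least squares with permutation importance stands in for whatever models and analysis the paper actually uses, which this abstract does not specify.

```python
import random

# Hypothetical system-state features; the abstract does not list the real feature set.
FEATURES = ["io_load", "network_traffic", "idle_nodes"]

def make_jobs(n, rng):
    """Generate synthetic (system_state, runtime) pairs for a control job."""
    rows, runtimes = [], []
    for _ in range(n):
        x = [rng.random() for _ in FEATURES]
        # Assumed ground truth: I/O load dominates, idle_nodes has no effect.
        y = 100.0 + 30.0 * x[0] + 10.0 * x[1] + rng.gauss(0.0, 1.0)
        rows.append(x)
        runtimes.append(y)
    return rows, runtimes

def fit_linear(rows, ys):
    """Ordinary least squares via the normal equations (Gauss-Jordan elimination)."""
    X = [[1.0] + r for r in rows]  # prepend an intercept column
    k = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * y for x, y in zip(X, ys))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict(w, x):
    """Predicted runtime for one system-state vector."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

def mse(w, rows, ys):
    """Mean squared prediction error."""
    return sum((predict(w, x) - y) ** 2 for x, y in zip(rows, ys)) / len(ys)

def permutation_importance(w, rows, ys, rng):
    """Increase in held-out MSE when a single feature column is shuffled."""
    base = mse(w, rows, ys)
    scores = {}
    for j, name in enumerate(FEATURES):
        col = [r[j] for r in rows]
        rng.shuffle(col)
        shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, col)]
        scores[name] = mse(w, shuffled, ys) - base
    return scores

rng = random.Random(0)
train_x, train_y = make_jobs(400, rng)
test_x, test_y = make_jobs(200, rng)  # stands in for "unseen" jobs
weights = fit_linear(train_x, train_y)
importance = permutation_importance(weights, test_x, test_y, rng)
ranking = sorted(importance, key=importance.get, reverse=True)
print("held-out MSE:", round(mse(weights, test_x, test_y), 3))
print("features by importance:", ranking)
```

Permutation importance is used here because it treats the model as a black box, so the same ranking step would apply unchanged if the linear fit were swapped for a different learner.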

