Design Strategies and Approximation Methods for High-Performance Computing Variability Management

01/24/2022
by   Yueyao Wang, et al.
Performance variability management is an active research area in high-performance computing (HPC). We focus on input/output (I/O) variability. To study performance variability, computer scientists often use grid-based designs (GBDs) to collect I/O variability data and mathematical approximation methods to build a prediction model. Mathematical approximation models can be biased, particularly when extrapolation is needed. Space-filling designs (SFDs) and surrogate models such as the Gaussian process (GP) are popular for data collection and for building predictive models, but their applicability to HPC variability needs investigation. We investigate it in terms of design efficiency, prediction accuracy, and scalability. We first customize existing SFDs so that they can be applied in the HPC setting. We then conduct a comprehensive investigation of design strategies and of the prediction ability of approximation methods, using both synthetic data simulated from three test functions and real data from the HPC setting. In both the synthetic and the real data analyses, GP with SFDs outperforms the alternatives in most scenarios. With respect to approximation models, GP is recommended if the data are collected by SFDs; if the data are collected using GBDs, both GP and Delaunay can be considered. With the best choice of approximation method, the relative performance of SFDs and GBDs depends on the properties of the underlying surface. In the cases where SFDs perform better, SFDs need about half as many design points as the GBD, or fewer, to achieve the same prediction accuracy. SFDs that can be tailored to high-dimensional and non-smooth surfaces are recommended, especially when a large number of input factors must be considered in the model.

research
12/14/2020

Prediction of High-Performance Computing Input/Output Variability and Its Application to Optimization for System Configurations

Performance variability is an important measure for a reliable high perf...
research
05/19/2022

Prediction for Distributional Outcomes in High-Performance Computing I/O Variability

Although high-performance computing (HPC) systems have been scaled to me...
research
11/03/2019

Gaussian process metamodeling for experiments with manipulating factors

This paper presents a new Gaussian process (GP) metamodeling approach fo...
research
01/31/2018

Composite Gaussian Processes: Scalable Computation and Performance Analysis

Gaussian process (GP) models provide a powerful tool for prediction but ...
research
12/06/2018

Distance-distributed design for Gaussian process surrogates

A common challenge in computer experiments and related fields is to effi...
research
01/06/2021

Sequential Design of Computer Experiments with Quantitative and Qualitative Factors in Applications to HPC Performance Optimization

Computer experiments with both qualitative and quantitative factors are ...
research
08/15/2018

Monitoring through many eyes: Integrating scientific and crowd-sourced datasets to improve monitoring of the Great Barrier Reef

Data in the Great Barrier Reef (GBR) are collected by numerous organisat...
