Cello: Efficient Computer Systems Optimization with Predictive Early Termination and Censored Regression

04/11/2022
by Yi Ding, et al.

Sample-efficient machine learning (SEML) has been widely applied to find optimal latency and power tradeoffs for configurable computer systems. Instead of randomly sampling from the configuration space, SEML lowers the search cost by dramatically reducing the number of configurations that must be sampled to optimize system goals (e.g., low latency or energy). Nevertheless, SEML reduces only one component of cost, the total number of samples collected, and does not decrease the cost of collecting each sample. Critically, not all samples are equal; some take much longer to collect because they correspond to slow system configurations. This paper presents Cello, a computer systems optimization framework that reduces sample collection costs, especially those that come from the slowest configurations. The key insight is to predict ahead of time whether samples will have poor system behavior (e.g., long latency or high energy) and to terminate these samples early, before their measured system behavior surpasses the termination threshold. We call this technique predictive early termination. To predict future system behavior accurately before it manifests as high runtime or energy, Cello uses censored regression to produce accurate predictions for running samples. We evaluate Cello by optimizing latency and energy for Apache Spark workloads. We give Cello a fixed amount of time to search a combined space of hardware and software configuration parameters. Our evaluation shows that, compared to the state-of-the-art SEML approach in computer systems optimization, Cello improves latency by 1.19X when minimizing latency under a power constraint, and improves energy by 1.18X when minimizing energy under a latency constraint.
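To make the role of censoring concrete, here is a minimal sketch of a right-censored Gaussian likelihood, the building block of Tobit-style censored regression. It is not Cello's implementation. The function names (`log_likelihood`, `fit_mean`), the fixed `sigma`, the intercept-only model, the grid search, and the latency values are all assumptions made for illustration. A sample that was terminated early contributes only the information that its true latency exceeds the time at which it was stopped.

```python
import math

def log_likelihood(mu, sigma, samples):
    """Gaussian log-likelihood with right-censored observations.

    samples: list of (value, censored). If censored is True, the sample was
    terminated early and value is only a lower bound on the true latency.
    """
    total = 0.0
    for value, censored in samples:
        z = (value - mu) / sigma
        if censored:
            # P(true latency > value) = 1 - Phi(z), via the complementary error function
            total += math.log(0.5 * math.erfc(z / math.sqrt(2.0)))
        else:
            # Log of the normal density at a fully observed value
            total += -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))
    return total

def fit_mean(samples, sigma=2.0, lo=5.0, hi=30.0, step=0.01):
    """Grid-search the mean that maximizes the censored likelihood (sigma held fixed)."""
    n = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n + 1)]
    return max(grid, key=lambda mu: log_likelihood(mu, sigma, samples))

# Hypothetical latencies in seconds: three completed runs, two stopped at 15 s.
samples = [(10.0, False), (12.0, False), (11.0, False), (15.0, True), (15.0, True)]
naive_mean = sum(v for v, _ in samples) / len(samples)  # treats cutoff times as exact values
censored_mean = fit_mean(samples)
print(naive_mean, censored_mean)
```

Treating the two terminated runs as if they had finished at 15 s underestimates the typical latency; the censored fit shifts the estimate upward because it only assumes those runs would have taken longer than 15 s. A full censored regression would replace the constant mean with a function of the configuration parameters.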

research
03/04/2018

Scout: An Experienced Guide to Find the Best Cloud Configuration

Finding the right cloud configuration for workloads is an essential step...
research
06/09/2022

Predictive Exit: Prediction of Fine-Grained Early Exits for Computation- and Energy-Efficient Inference

By adding exiting layers to the deep learning networks, early exit can t...
research
04/22/2022

SCOPE: Safe Exploration for Dynamic Computer Systems Optimization

Modern computer systems need to execute under strict safety constraints ...
research
06/28/2020

Fast and Low-cost Search for Efficient Cloud Configurations for HPC Workloads

The use of cloud computational resources has become increasingly importa...
research
05/01/2023

Software Runtime Monitoring with Adaptive Sampling Rate to Collect Representative Samples of Execution Traces

Monitoring software systems at runtime is key for understanding workload...
research
02/03/2021

Llama: A Heterogeneous Serverless Framework for Auto-Tuning Video Analytics Pipelines

The proliferation of camera-enabled devices and large video repositories...
research
05/16/2021

Zero Aware Configurable Data Encoding by Skipping Transfer for Error Resilient Applications

In this paper, we propose Zero Aware Configurable Data Encoding by Skipp...
