HyperSched: Dynamic Resource Reallocation for Model Development on a Deadline

01/08/2020
by Richard Liaw, et al.

Prior research in resource scheduling for machine learning training workloads has largely focused on minimizing job completion times. Commonly, these model training workloads collectively search over a large number of parameter values that control the learning process, in what is known as a hyperparameter search. It is preferable to identify and maximally provision the best-performing hyperparameter configuration (trial) to achieve the highest accuracy result as soon as possible. To optimally trade off evaluating multiple configurations against training the most promising ones by a fixed deadline, we design and build HyperSched, a dynamic application-level resource scheduler that tracks, identifies, and preferentially allocates resources to the best-performing trials to maximize accuracy by the deadline. HyperSched leverages three properties of a hyperparameter search workload overlooked in prior work (trial disposability, progressively identifiable rankings among different configurations, and space-time constraints) to outperform standard hyperparameter search algorithms across a variety of benchmarks.
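
The idea of preferentially allocating resources to the best-ranked trials as a deadline approaches can be illustrated with a toy policy. The sketch below is not HyperSched's actual algorithm, which the abstract does not specify; the `Trial` class, the `reallocate` function, and the linear "fraction of time remaining" rule are all hypothetical choices made for illustration. It assumes every live trial holds at least one GPU, that trials can be discarded freely (trial disposability), and that current scores give a usable ranking.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """Hypothetical record for one hyperparameter configuration."""
    name: str
    score: float = 0.0  # latest observed accuracy (assumed metric)
    gpus: int = 1       # GPUs currently assigned to this trial
    alive: bool = True


def reallocate(trials, time_left, total_time):
    """Toy policy (an assumption, not the paper's method): keep fewer
    trials as the deadline nears and give freed GPUs to the leader."""
    # Rank the live trials by their current score, best first.
    live = sorted((t for t in trials if t.alive),
                  key=lambda t: t.score, reverse=True)
    if not live:
        return
    total_gpus = sum(t.gpus for t in live)

    # The fraction of the time budget remaining decides how many survive.
    keep = max(1, round(len(live) * time_left / total_time))

    # Discard the lowest-ranked trials and release their GPUs.
    for t in live[keep:]:
        t.alive = False
        t.gpus = 0

    # Each survivor keeps one GPU; the top-ranked trial absorbs the rest.
    survivors = live[:keep]
    for t in survivors:
        t.gpus = 1
    survivors[0].gpus += total_gpus - len(survivors)
```

Calling `reallocate` periodically with a shrinking `time_left` concentrates the fixed GPU pool on fewer trials, so the search moves from exploring many configurations to training the current leader. A real scheduler would also need to account for how well each trial scales across additional GPUs, which this sketch ignores.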

research
08/08/2021

Online Evolutionary Batch Size Orchestration for Scheduling Deep Learning Workloads in GPU Clusters

Efficient GPU resource scheduling is essential to maximize resource util...
research
12/02/2019

ExperienceThinking: Hyperparameter Optimization with Budget Constraints

The problem of hyperparameter optimization exists widely in the real lif...
research
09/29/2022

Dynamic Surrogate Switching: Sample-Efficient Search for Factorization Machine Configurations in Online Recommendations

Hyperparameter optimization is the process of identifying the appropriat...
research
09/07/2019

Transferable Neural Processes for Hyperparameter Optimization

Automated machine learning aims to automate the whole process of machine...
research
05/21/2021

Polyjuice: High-Performance Transactions via Learned Concurrency Control

Concurrency control algorithms are key determinants of the performance o...
research
12/23/2021

Using Sequential Statistical Tests to Improve the Performance of Random Search in hyperparameter Tuning

Hyperparameter tuning is one of the most time-consuming parts in mach...
research
08/29/2021

Leveraging Transprecision Computing for Machine Vision Applications at the Edge

Machine vision tasks present challenges for resource constrained edge de...
