Stage-based Hyper-parameter Optimization for Deep Learning

11/24/2019
by   Ahnjae Shin, et al.
0

As deep learning techniques advance more than ever, hyper-parameter optimization is the new major workload in deep learning clusters. Although hyper-parameter optimization is crucial in training deep learning models for high model performance, effectively executing such a computation-heavy workload still remains a challenge. We observe that numerous trials issued from existing hyper-parameter optimization algorithms share common hyper-parameter sequence prefixes, which implies that there are redundant computations from training the same hyper-parameter sequence multiple times. We propose a stage-based execution strategy for efficient execution of hyper-parameter optimization algorithms. Our strategy removes redundancy in the training process by splitting the hyper-parameter sequences of trials into homogeneous stages, and generating a tree of stages by merging the common prefixes. Our preliminary experiment results show that applying stage-based execution to hyper-parameter optimization algorithms outperforms the original trial-based method, saving required GPU-hours and end-to-end training time by up to 6.60 times and 4.13 times, respectively.

READ FULL TEXT
research
06/22/2020

Hippo: Taming Hyper-parameter Optimization of Deep Learning with Stage Trees

Hyper-parameter optimization is crucial for pushing the accuracy of a de...
research
07/30/2020

On Hyperparameter Optimization of Machine Learning Algorithms: Theory and Practice

Machine learning algorithms have been used widely in various application...
research
03/12/2020

Hyper-Parameter Optimization: A Review of Algorithms and Applications

Since deep neural networks were developed, they have made huge contribut...
research
08/02/2020

Bayesian Optimization for Selecting Efficient Machine Learning Models

The performance of many machine learning models depends on their hyper-p...
research
06/16/2022

Optimization-Derived Learning with Essential Convergence Analysis of Training and Hyper-training

Recently, Optimization-Derived Learning (ODL) has attracted attention fr...
research
12/08/2017

Characterizing the hyper-parameter space of LSTM language models for mixed context applications

Applying state of the art deep learning models to novel real world datas...
research
11/07/2021

Varuna: Scalable, Low-cost Training of Massive Deep Learning Models

Systems for training massive deep learning models (billions of parameter...

Please sign up or login with your details

Forgot password? Click here to reset