
Stage-based Hyper-parameter Optimization for Deep Learning

11/24/2019
by Ahnjae Shin, et al.
Seoul National University

As deep learning techniques continue to advance, hyper-parameter optimization has become a major workload in deep learning clusters. Although hyper-parameter optimization is crucial for training deep learning models to high model performance, executing such a computation-heavy workload efficiently remains a challenge. We observe that many of the trials issued by existing hyper-parameter optimization algorithms share common hyper-parameter sequence prefixes, which implies redundant computation: the same hyper-parameter sequence is trained multiple times. We propose a stage-based execution strategy for efficient execution of hyper-parameter optimization algorithms. Our strategy removes this redundancy by splitting the hyper-parameter sequences of trials into homogeneous stages and merging their common prefixes into a tree of stages. Our preliminary experimental results show that applying stage-based execution to hyper-parameter optimization algorithms outperforms the original trial-based method, reducing required GPU-hours and end-to-end training time by up to 6.60 times and 4.13 times, respectively.
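The prefix-merging idea can be illustrated with a small sketch. This is not the authors' implementation; it assumes each trial is already split into aligned stages, that a stage is a hypothetical (learning rate, epochs) pair, and that a checkpoint is available at every stage boundary so a shared prefix needs to be trained only once. The names `StageNode`, `build_stage_tree`, `tree_epochs`, and `trial_epochs` are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Assumption: a stage is (learning_rate, epochs); a trial is a list of stages.
Stage = Tuple[float, int]


@dataclass
class StageNode:
    """One node of the stage tree; the root carries no stage."""
    stage: Optional[Stage] = None
    children: Dict[Stage, "StageNode"] = field(default_factory=dict)


def build_stage_tree(trials: List[List[Stage]]) -> StageNode:
    """Merge trials that share a stage prefix into a single tree (a trie)."""
    root = StageNode()
    for trial in trials:
        node = root
        for stage in trial:
            # Reuse the existing child if this prefix was already seen.
            if stage not in node.children:
                node.children[stage] = StageNode(stage)
            node = node.children[stage]
    return root


def tree_epochs(node: StageNode) -> int:
    """Epochs trained when every stage in the tree runs exactly once."""
    own = node.stage[1] if node.stage is not None else 0
    return own + sum(tree_epochs(child) for child in node.children.values())


def trial_epochs(trials: List[List[Stage]]) -> int:
    """Epochs trained when every trial runs independently from scratch."""
    return sum(epochs for trial in trials for _, epochs in trial)


# Four made-up trials; the first three share the prefix (0.1, 10).
trials = [
    [(0.1, 10), (0.01, 10)],
    [(0.1, 10), (0.001, 10)],
    [(0.1, 10), (0.01, 10), (0.001, 5)],
    [(0.05, 10), (0.01, 10)],
]

tree = build_stage_tree(trials)
print(trial_epochs(trials), tree_epochs(tree))  # 85 55
```

In this toy input, trial-based execution trains 85 epochs while the merged tree trains 55, because the shared stages are executed once and their checkpoints reused. The numbers are illustrative only and say nothing about the speedups reported in the paper.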
