Automatic prior selection for meta Bayesian optimization with a case study on tuning deep neural network optimizers

09/16/2021
by   Zi Wang, et al.
9

The performance of deep neural networks can be highly sensitive to the choice of a variety of meta-parameters, such as optimizer parameters and model hyperparameters. Tuning these well, however, often requires extensive and costly experimentation. Bayesian optimization (BO) is a principled approach to solve such expensive hyperparameter tuning problems efficiently. Key to the performance of BO is specifying and refining a distribution over functions, which is used to reason about the optima of the underlying function being optimized. In this work, we consider the scenario where we have data from similar functions that allows us to specify a tighter distribution a priori. Specifically, we focus on the common but potentially costly task of tuning optimizer parameters for training neural networks. Building on the meta BO method from Wang et al. (2018), we develop practical improvements that (a) boost its performance by leveraging tuning results on multiple tasks without requiring observations for the same meta-parameter points across all tasks, and (b) retain its regret bound for a special case of our method. As a result, we provide a coherent BO solution for iterative optimization of continuous optimizer parameters. To verify our approach in realistic model training setups, we collected a large multi-task hyperparameter tuning dataset by training tens of thousands of configurations of near-state-of-the-art models on popular image and text datasets, as well as a protein sequence dataset. Our results show that on average, our method is able to locate good hyperparameters at least 3 times more efficiently than the best competing methods.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
07/07/2022

Pre-training helps Bayesian optimization too

Bayesian optimization (BO) has become a popular strategy for global opti...
research
03/24/2022

Rich Feature Construction for the Optimization-Generalization Dilemma

There often is a dilemma between ease of optimization and robust out-of-...
research
03/12/2019

Practical Multi-fidelity Bayesian Optimization for Hyperparameter Tuning

Bayesian optimization is popular for optimizing time-consuming black-box...
research
02/25/2021

Hyperparameter Transfer Learning with Adaptive Complexity

Bayesian optimization (BO) is a sample efficient approach to automatical...
research
08/14/2015

Unbounded Bayesian Optimization via Regularization

Bayesian optimization has recently emerged as a popular and efficient to...
research
05/26/2022

Towards Learning Universal Hyperparameter Optimizers with Transformers

Meta-learning hyperparameter optimization (HPO) algorithms from prior ex...
research
09/16/2019

Weighted Sampling for Combined Model Selection and Hyperparameter Tuning

The combined algorithm selection and hyperparameter tuning (CASH) proble...

Please sign up or login with your details

Forgot password? Click here to reset