Practical and sample efficient zero-shot HPO

07/27/2020
by Fela Winkelmolen et al.

Zero-shot hyperparameter optimization (HPO) is a simple yet effective use of transfer learning for constructing a small list of hyperparameter (HP) configurations that complement each other, so that for any given dataset at least one of them is expected to perform well. Current techniques for obtaining this list are computationally expensive, as they rely on running training jobs on a diverse collection of datasets and a large collection of randomly drawn HPs. This cost is especially problematic in environments where the space of HPs changes regularly due to new algorithm versions or changing deep-network architectures. We provide an overview of available approaches and introduce two novel techniques to handle the problem. The first is based on a surrogate model and adaptively chooses (dataset, configuration) pairs to query. The second, for settings where finding, tuning, and testing a surrogate model is problematic, is a multi-fidelity technique that combines HyperBand with submodular optimization. We benchmark our methods experimentally on five tasks (XGBoost, LightGBM, CatBoost, MLP, and AutoML) and show significant improvement in accuracy compared to standard zero-shot HPO with the same training budget. In addition to contributing new algorithms, we provide an extensive study of the zero-shot HPO technique, resulting in (1) default hyperparameters for popular algorithms that will benefit the community using them, and (2) massive lookup tables to further research on hyperparameter tuning.
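To make the core idea concrete, here is a minimal sketch of how a zero-shot portfolio can be built greedily from a precomputed loss table (datasets by configurations), the submodular-style selection the abstract alludes to. All names here (greedy_zero_shot_portfolio, loss_table, portfolio_size) are illustrative assumptions, not the authors' code.

```python
# Hedged sketch: greedy zero-shot portfolio construction from a precomputed
# loss lookup table. Names and API are illustrative, not the paper's code.
import numpy as np

def greedy_zero_shot_portfolio(loss_table: np.ndarray, portfolio_size: int) -> list[int]:
    """Pick `portfolio_size` configurations (columns of `loss_table`) that
    complement each other: at each step, add the configuration that most
    reduces the average best-so-far loss across datasets (rows)."""
    n_datasets, _ = loss_table.shape
    best_so_far = np.full(n_datasets, np.inf)  # best loss per dataset so far
    portfolio: list[int] = []
    for _ in range(portfolio_size):
        # Per-dataset loss if each candidate configuration were added.
        candidate_best = np.minimum(best_so_far[:, None], loss_table)
        scores = candidate_best.mean(axis=0)   # aggregate loss per candidate
        scores[portfolio] = np.inf             # never re-pick a configuration
        choice = int(scores.argmin())
        portfolio.append(choice)
        best_so_far = candidate_best[:, choice]
    return portfolio

# Toy usage: validation losses of 4 candidate configurations on 3 datasets.
rng = np.random.default_rng(0)
table = rng.uniform(size=(3, 4))
print(greedy_zero_shot_portfolio(table, portfolio_size=2))
```

With lookup tables like those the paper releases, loss_table would hold validation losses of candidate configurations across benchmark datasets. The objective being improved (reduction in aggregated minimum loss) is a monotone submodular facility-location-style function, which is what makes this simple greedy rule a reasonable choice.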



Related research

Automatic Tuning of Loss Trade-offs without Hyper-parameter Search in End-to-End Zero-Shot Speech Synthesis (05/26/2023)
Recently, zero-shot TTS and VC methods have gained attention due to thei...

Practical Aspects of Zero-Shot Learning (03/29/2022)
One of the important areas of machine learning research is zero-shot learnin...

Zero-Shot Cost Models for Out-of-the-box Learned Cost Prediction (01/03/2022)
In this paper, we introduce zero-shot cost models which enable learned c...

Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer (03/07/2022)
Hyperparameter (HP) tuning in deep learning is an expensive process, pro...

Meta-Learning for Symbolic Hyperparameter Defaults (06/10/2021)
Hyperparameter optimization in machine learning (ML) deals with the prob...

Performance Variability in Zero-Shot Classification (03/01/2021)
Zero-shot classification (ZSC) is the task of learning predictors for cl...

Mining Robust Default Configurations for Resource-constrained AutoML (02/20/2022)
Automatic machine learning (AutoML) is a key enabler of the mass deploym...
