- Is One Hyperparameter Optimizer Enough?
Hyperparameter tuning is the black art of automatically finding a good c...
- Descending through a Crowded Valley – Benchmarking Deep Learning Optimizers
Choosing the optimizer is among the most crucial decisions of deep learn...
- Squirrel: A Switching Hyperparameter Optimizer
In this short note, we describe our submission to the NeurIPS 2020 BBO c...
- Domain-independent Dominance of Adaptive Methods
From a simplified analysis of adaptive methods, we derive AvaGrad, a new...
- Improving Deep Learning for Defect Prediction (using the GHOST Hyperparameter Optimizer)
There has been much recent interest in the application of deep learning ...
- Mixing ADAM and SGD: a Combined Optimization Method
Optimization methods (optimizers) get special attention for the efficien...
- On Empirical Comparisons of Optimizers for Deep Learning
Selecting an optimizer is a central step in the contemporary deep learni...
On the Tunability of Optimizers in Deep Learning
There is no consensus yet on whether adaptive gradient methods like Adam are easier to use than non-adaptive optimization methods like SGD. In this work, we make precise the important yet ambiguous concept of 'ease of use' by defining an optimizer's tunability: how easy is it to find good hyperparameter configurations using automatic random hyperparameter search? We propose a practical and universal quantitative measure for optimizer tunability that can form the basis for a fair optimizer benchmark. Evaluating a variety of optimizers on an extensive set of standard datasets and architectures, we find that Adam is the most tunable for the majority of problems, especially with a low budget for hyperparameter tuning.
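The tunability measure described above can be approximated with a small random-search experiment. The following is a minimal sketch, not the authors' implementation: it assumes log-uniform search ranges, a hypothetical `train_and_evaluate` stand-in for a real training run, and "expected best validation score after a fixed budget of trials" as the summary statistic; the paper's exact measure may differ.

```python
# Sketch: estimate optimizer "tunability" as the expected best score
# reachable with a fixed budget of random hyperparameter-search trials.
import math
import random

def sample_config(optimizer):
    """Sample one hyperparameter configuration (log-uniform learning rate).
    Ranges are illustrative assumptions, not the paper's search spaces."""
    cfg = {"lr": 10 ** random.uniform(-5, -1)}
    if optimizer == "adam":
        cfg["beta1"] = random.uniform(0.8, 0.999)
        cfg["beta2"] = random.uniform(0.9, 0.9999)
        cfg["eps"] = 10 ** random.uniform(-10, -6)
    else:  # sgd
        cfg["momentum"] = random.uniform(0.0, 0.99)
    return cfg

def train_and_evaluate(optimizer, cfg):
    """Hypothetical placeholder for a real training run returning a
    validation score; here a noisy synthetic curve peaked at lr ~ 1e-3."""
    return max(0.0, 1.0 - abs(math.log10(cfg["lr"]) + 3) / 4) + random.gauss(0, 0.02)

def tunability(optimizer, budget=20, repeats=50):
    """Average, over repeated searches, of the best score found
    within `budget` random-search trials."""
    best_scores = []
    for _ in range(repeats):
        scores = [train_and_evaluate(optimizer, sample_config(optimizer))
                  for _ in range(budget)]
        best_scores.append(max(scores))
    return sum(best_scores) / repeats

for opt in ("adam", "sgd"):
    print(opt, round(tunability(opt), 3))
```

In a real study, `train_and_evaluate` would train the target architecture on the target dataset, and the comparison would be repeated across tuning budgets, since the abstract notes that Adam's advantage is most pronounced when the budget is small.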