A Generalizable Approach to Learning Optimizers

06/02/2021
by Diogo Almeida et al.

A core issue with learning to optimize neural networks has been the lack of generalization to real-world problems. To address this, we describe a system designed from a generalization-first perspective, learning to update optimizer hyperparameters instead of model parameters directly, using novel features, actions, and a reward function. This system outperforms Adam at all neural network tasks, including on modalities not seen during training. We achieve 2x speedups on ImageNet, and a 2.5x speedup on a language modeling task using over 5 orders of magnitude more compute than the training tasks.
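To make the central idea concrete, the sketch below shows what "updating optimizer hyperparameters instead of model parameters directly" can look like: a controller reads scalar training features and emits a multiplicative action on an inner Adam optimizer's learning rate, while Adam itself still produces the parameter update. This is a minimal, hypothetical illustration, not the paper's implementation; the `features`, `controller`, and `adam_step` helpers, the toy quadratic task, and the learning-rate-only action space are all assumptions and do not reproduce the paper's actual feature set, actions, or reward function.

```python
import numpy as np

def features(loss_hist, grad):
    """Toy feature vector (assumed): recent loss trend and log gradient norm."""
    trend = loss_hist[-1] - loss_hist[-2] if len(loss_hist) > 1 else 0.0
    return np.array([trend, np.log1p(np.linalg.norm(grad))])

def controller(feats):
    """Stand-in for a learned policy: maps features to a multiplicative
    learning-rate action. A trained system would compute this with a model."""
    return 0.5 if feats[0] > 0 else 1.05  # shrink the step if loss rose, else grow slightly

def adam_step(theta, grad, state, lr, beta1=0.9, beta2=0.999, eps=1e-8):
    """One standard Adam update; only `lr` is set externally by the controller."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad ** 2
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps)

# Toy quadratic "training task" standing in for a real training run.
A = np.diag([1.0, 10.0])
loss = lambda th: 0.5 * th @ A @ th
grad_fn = lambda th: A @ th

theta = np.array([3.0, -2.0])
state = {"t": 0, "m": np.zeros(2), "v": np.zeros(2)}
lr, loss_hist = 1e-1, []

for step in range(200):
    g = grad_fn(theta)
    loss_hist.append(loss(theta))
    # The learned component acts on the hyperparameter, not on theta itself.
    lr = np.clip(lr * controller(features(loss_hist, g)), 1e-5, 1.0)
    theta = adam_step(theta, g, state, lr)

print(f"final loss {loss(theta):.3e}, final lr {lr:.3e}")
```

The point of the structure, as the abstract describes it, is that the learned component only steers a conventional inner optimizer's hyperparameters; the familiar update rule stays in the loop, which is the generalization-first design choice rather than having a network emit parameter updates directly.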
