Neural Optimizer Search with Reinforcement Learning

09/21/2017
by   Irwan Bello, et al.

We present an approach to automate the process of discovering optimization methods, with a focus on deep learning architectures. We train a Recurrent Neural Network controller to generate a string in a domain-specific language that describes a mathematical update equation built from a list of primitive functions, such as the gradient and the running average of the gradient. The controller is trained with Reinforcement Learning to maximize the performance of a model after a few epochs. On CIFAR-10 with a ConvNet model, our method discovers several update rules that outperform many commonly used optimizers, such as Adam, RMSProp, and SGD with and without Momentum. We introduce two new optimizers, named PowerSign and AddSign, which we show transfer well and improve training on a variety of different tasks and architectures, including ImageNet classification and Google's neural machine translation system.
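The abstract names PowerSign and AddSign but does not state their update equations. As a rough illustration only, the sketch below follows the forms given in the full paper: PowerSign scales the gradient by base^(sign(g) * sign(m)) and AddSign scales it by (1 + sign(g) * sign(m)), where m is a running average of the gradient. The function names, the default hyperparameters, the choice to update m before using it, and the omission of the paper's optional decay schedules are all assumptions made for this example.

```python
import math


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def _running_average(m, g, beta):
    # Exponential moving average of the gradient (assumed form).
    return [beta * mi + (1 - beta) * gi for mi, gi in zip(m, g)]


def powersign_step(w, g, m, lr=0.1, beta=0.9, base=math.e):
    """One PowerSign-style step on plain lists of floats (hypothetical helper)."""
    new_m = _running_average(m, g, beta)
    # Step is scaled up when g and m agree in sign, scaled down when they disagree.
    new_w = [wi - lr * base ** (sign(gi) * sign(mi)) * gi
             for wi, gi, mi in zip(w, g, new_m)]
    return new_w, new_m


def addsign_step(w, g, m, lr=0.1, beta=0.9):
    """One AddSign-style step on plain lists of floats (hypothetical helper)."""
    new_m = _running_average(m, g, beta)
    # Step is doubled when g and m agree in sign and vanishes when they disagree.
    new_w = [wi - lr * (1 + sign(gi) * sign(mi)) * gi
             for wi, gi, mi in zip(w, g, new_m)]
    return new_w, new_m
```

Both rules share one idea: the sign agreement between the current gradient and its running average decides how far to step. A real implementation would operate on tensors inside a deep learning framework rather than on Python lists.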

research
08/08/2018

Backprop Evolution

The back-propagation algorithm is the cornerstone of deep learning. Desp...
research
07/19/2019

Lookahead Optimizer: k steps forward, 1 step back

The vast majority of successful deep neural networks are trained using v...
research
10/13/2021

Improving the sample-efficiency of neural architecture search with reinforcement learning

Designing complex architectures has been an essential cogwheel in the re...
research
09/06/2017

Towards Neural Machine Translation with Latent Tree Attention

Building models that take advantage of the hierarchical structure of lan...
research
02/28/2020

Do optimization methods in deep learning applications matter?

With advances in deep learning, exponential data growth and increasing m...
research
05/25/2020

Gradient Monitored Reinforcement Learning

This paper presents a novel neural network training approach for faster ...
research
12/20/2017

A Flexible Approach to Automated RNN Architecture Generation

The process of designing neural architectures requires expert knowledge ...
