Quasi-Newton Methods for Deep Learning: Forget the Past, Just Sample

01/28/2019
by Albert S. Berahas, et al.

We present two sampled quasi-Newton methods for deep learning: sampled LBFGS (S-LBFGS) and sampled LSR1 (S-LSR1). In contrast to the classical variants of these methods, which sequentially build Hessian or inverse-Hessian approximations as the optimization progresses, our proposed methods sample points randomly around the current iterate at every iteration to produce these approximations. As a result, the constructed approximations use more reliable (recent and local) information and do not depend on past iterate information that may be significantly stale. Our proposed algorithms are efficient in terms of accessed data points (epochs) and have enough concurrency to take advantage of parallel/distributed computing environments. We provide convergence guarantees for our proposed methods. Numerical tests on a toy classification problem, as well as on popular benchmark neural network training tasks, show that the methods outperform their classical variants and are competitive with state-of-the-art first-order methods such as ADAM.
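To make the sampling idea concrete, below is a minimal NumPy sketch of how curvature pairs (s_i, y_i) might be built around the current iterate rather than accumulated from past steps. The function name sample_curvature_pairs, the fixed sampling radius, and the gradient-difference construction y_i = ∇F(w + s_i) − ∇F(w) are illustrative assumptions, not the paper's exact implementation (a Hessian-vector product could be used to form y_i instead).

```python
import numpy as np

def sample_curvature_pairs(w, grad_fn, m=10, radius=1e-2, rng=None):
    """Build m curvature pairs (S, Y) by sampling around the current
    iterate w, instead of reusing pairs from past iterations.

    Note: function name, radius, and the gradient-difference form of y_i
    are illustrative assumptions, not the authors' exact construction.

    w       : current iterate (NumPy array).
    grad_fn : callable returning the (stochastic) gradient at a point.
    radius  : sampling radius around w (a tunable hyperparameter).
    """
    rng = np.random.default_rng() if rng is None else rng
    g = grad_fn(w)                       # gradient at the current iterate
    S, Y = [], []
    for _ in range(m):
        d = rng.standard_normal(w.shape)
        d *= radius / np.linalg.norm(d)  # random direction of length `radius`
        S.append(d)                      # displacement s_i
        Y.append(grad_fn(w + d) - g)     # curvature y_i ≈ ∇²F(w) s_i
    return np.array(S), np.array(Y)
```

The resulting pairs can then be fed to a standard L-BFGS two-loop recursion or a compact SR1 update. Because each sampled gradient is independent of the others, the m evaluations can run in parallel, which is the kind of concurrency the abstract refers to.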
