A Progressive Batching L-BFGS Method for Machine Learning

02/15/2018
by Raghu Bollapragada et al.

The standard L-BFGS method relies on gradient approximations that are not dominated by noise, so that search directions are descent directions, the line search is reliable, and quasi-Newton updating yields useful quadratic models of the objective function. All of this appears to call for a full-batch approach, but since small batch sizes give rise to faster algorithms with better generalization properties, L-BFGS is currently not considered an algorithm of choice for large-scale machine learning applications. One need not, however, choose between the two extremes represented by the full-batch and highly stochastic regimes; one may instead follow a progressive batching approach in which the sample size increases during the course of the optimization. In this paper, we present a new version of the L-BFGS algorithm that combines three basic components (progressive batching, a stochastic line search, and stable quasi-Newton updating) and that performs well on training logistic regression models and deep neural networks. We provide supporting convergence theory for the method.
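To make the three ingredients concrete, the following is a minimal Python sketch, not the authors' implementation: `two_loop_recursion`, `progressive_lbfgs`, and `logistic_loss_grad` are hypothetical names, the fixed geometric batch-growth schedule is a stand-in for the adaptive sample-size test used in the paper, and plain Armijo backtracking on the sampled objective stands in for the paper's stochastic line search.

```python
import numpy as np

def two_loop_recursion(grad, s_list, y_list):
    """Standard L-BFGS two-loop recursion: returns an approximation
    of H_k @ grad built from the stored curvature pairs (s_i, y_i)."""
    q = grad.copy()
    if not s_list:
        return q  # no curvature information yet: gradient direction
    rhos = [1.0 / (y @ s) for s, y in zip(s_list, y_list)]
    alphas = []
    for s, y, rho in zip(reversed(s_list), reversed(y_list), reversed(rhos)):
        a = rho * (s @ q)
        alphas.append(a)
        q = q - a * y
    # Scale by gamma_k = s^T y / y^T y (standard initial-Hessian choice).
    s, y = s_list[-1], y_list[-1]
    q = q * ((s @ y) / (y @ y))
    for s, y, rho, a in zip(s_list, y_list, rhos, reversed(alphas)):
        q = q + (a - rho * (y @ q)) * s
    return q

def progressive_lbfgs(loss_grad, w, data, m=10, batch0=64, growth=2.0,
                      max_iters=30, c1=1e-4, eps=1e-8, seed=0):
    """Sketch of a progressive-batching L-BFGS loop: the sample grows over
    the run, the Armijo line search is evaluated on the current sample, and
    curvature pairs with small s^T y are skipped for stable updating."""
    rng = np.random.default_rng(seed)
    n, batch = len(data), float(batch0)
    s_list, y_list = [], []
    for _ in range(max_iters):
        idx = rng.choice(n, size=min(int(batch), n), replace=False)
        f, g = loss_grad(w, data[idx])
        d = -two_loop_recursion(g, s_list, y_list)
        # Backtracking (Armijo) line search on the sampled objective.
        t, gd = 1.0, g @ d
        while loss_grad(w + t * d, data[idx])[0] > f + c1 * t * gd and t > 1e-8:
            t *= 0.5
        w_new = w + t * d
        # Use the SAME sample for both gradients so the pair (s, y)
        # reflects curvature rather than sampling noise.
        _, g_new = loss_grad(w_new, data[idx])
        s, y = w_new - w, g_new - g
        if s @ y > eps * (s @ s):  # curvature (skip) test
            s_list.append(s); y_list.append(y)
            if len(s_list) > m:
                s_list.pop(0); y_list.pop(0)
        w = w_new
        batch = min(batch * growth, float(n))  # progressive batching
    return w

def logistic_loss_grad(w, batch):
    """Binary logistic regression loss and gradient on one sample batch;
    the last column of `batch` holds labels in {0, 1}."""
    X, labels = batch[:, :-1], batch[:, -1]
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    loss = -np.mean(labels * np.log(p + 1e-12)
                    + (1 - labels) * np.log(1 - p + 1e-12))
    grad = X.T @ (p - labels) / len(labels)
    return loss, grad

# Toy usage: recover a separating direction on synthetic data.
rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 5))
w_true = rng.normal(size=5)
labels = (X @ w_true > 0).astype(float)
data = np.hstack([X, labels[:, None]])
w_hat = progressive_lbfgs(logistic_loss_grad, np.zeros(5), data)
```

Note that in the paper the sample size is increased adaptively, via a test on the variance of the sampled gradients, rather than on the fixed geometric schedule used in this sketch.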

Related research

A Multi-Batch L-BFGS Method for Machine Learning (05/19/2016)
The question of how to parallelize the stochastic gradient descent (SGD)...

A Robust Multi-Batch L-BFGS Method for Machine Learning (07/26/2017)
This paper describes an implementation of the L-BFGS method designed to ...

Stochastic Trust Region Inexact Newton Method for Large-scale Machine Learning (12/26/2018)
Nowadays stochastic approximation methods are one of the major research ...

On the efficiency of Stochastic Quasi-Newton Methods for Deep Learning (05/18/2022)
While first-order methods are popular for solving optimization problems ...

A Quasi-Newton Approach to Nonsmooth Convex Optimization Problems in Machine Learning (04/24/2008)
We extend the well-known BFGS quasi-Newton method and its memory-limited...

A Stochastic Quasi-Newton Method for Large-Scale Nonconvex Optimization with Applications (12/10/2019)
This paper proposes a novel stochastic version of damped and regularized...

Adaptive Sampling Strategies for Stochastic Optimization (10/30/2017)
In this paper, we propose a stochastic optimization method that adaptive...
