A Robust Multi-Batch L-BFGS Method for Machine Learning

07/26/2017
by Albert S. Berahas et al.

This paper describes an implementation of the L-BFGS method designed to cope with two adversarial situations. The first arises in distributed computing environments where some of the computational nodes devoted to evaluating the function and gradient are unable to return results on time. A similar challenge arises in a multi-batch approach, in which the data points used to compute the function and gradient are deliberately changed at each iteration to accelerate learning. Difficulties occur because L-BFGS employs gradient differences to update its Hessian approximations, and when these gradients are computed from different data points the updating process can become unstable. This paper shows how to perform stable quasi-Newton updating in the multi-batch setting, studies the convergence properties of the method for both convex and nonconvex functions, and illustrates the behavior of the algorithm on a distributed computing platform on binary-classification logistic regression and neural network training problems that arise in machine learning.
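The instability the abstract mentions comes from forming the curvature pair (s, y) with gradients computed on different samples. A standard remedy in this line of work is to evaluate both gradients in the difference y on the overlap of consecutive batches. The sketch below illustrates that idea only; the sample sets, fixed step size, two-loop recursion details, and safeguards are assumptions for the demo, not the paper's exact algorithm.

```python
import numpy as np

def lbfgs_direction(grad, s_list, y_list):
    """Two-loop recursion: apply the inverse-Hessian approximation to -grad."""
    q = grad.copy()
    alphas = []
    for s, y in zip(reversed(s_list), reversed(y_list)):
        a = (s @ q) / (y @ s)
        alphas.append(a)
        q = q - a * y
    if s_list:  # standard initial scaling gamma = (s'y) / (y'y)
        s, y = s_list[-1], y_list[-1]
        q = q * ((s @ y) / (y @ y))
    for (s, y), a in zip(zip(s_list, y_list), reversed(alphas)):
        b = (y @ q) / (y @ s)
        q = q + (a - b) * s
    return -q

def multi_batch_lbfgs(grad_fn, x0, batches, overlaps, steps=100, lr=1.0, m=10):
    """Multi-batch L-BFGS sketch: the curvature pair (s, y) is formed from
    gradients evaluated on the SAME overlap samples at x and x_new, so y is a
    true gradient difference even though the search direction uses a changing
    batch each iteration."""
    x = x0.copy()
    s_list, y_list = [], []
    for k in range(steps):
        g = grad_fn(x, batches[k % len(batches)])
        x_new = x + lr * lbfgs_direction(g, s_list, y_list)
        ov = overlaps[k % len(overlaps)]
        s = x_new - x
        y = grad_fn(x_new, ov) - grad_fn(x, ov)  # same data on both sides
        if s @ y > 1e-10 and np.linalg.norm(s) > 1e-8:  # curvature safeguard
            s_list.append(s)
            y_list.append(y)
            if len(s_list) > m:
                s_list.pop(0)
                y_list.pop(0)
        x = x_new
    return x

# Demo (illustrative setup): least-squares on overlapping halves of a
# synthetic dataset; both batches share the samples 40..59.
rng = np.random.default_rng(0)
A = rng.normal(size=(100, 5))
x_true = rng.normal(size=5)
b = A @ x_true

def grad_fn(x, idx):
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ x - bi) / len(idx)

batches = [np.arange(0, 60), np.arange(40, 100)]  # alternating sample sets
overlaps = [np.arange(40, 60)] * 2                # their shared samples
x_hat = multi_batch_lbfgs(grad_fn, np.zeros(5), batches, overlaps)
```

Because both sides of the gradient difference use the overlap samples, y reflects genuine curvature of one fixed function, which is what keeps the quasi-Newton update stable while the batch itself changes.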


Related research

05/19/2016  A Multi-Batch L-BFGS Method for Machine Learning
The question of how to parallelize the stochastic gradient descent (SGD)...

03/28/2023  Convergence of Momentum-Based Heavy Ball Method with Batch Updating and/or Approximate Gradients
In this paper, we study the well-known "Heavy Ball" method for convex an...

02/15/2018  A Progressive Batching L-BFGS Method for Machine Learning
The standard L-BFGS method relies on gradient approximations that are no...

11/16/2020  Avoiding Communication in Logistic Regression
Stochastic gradient descent (SGD) is one of the most widely used optimiz...

01/28/2019  Quasi-Newton Methods for Deep Learning: Forget the Past, Just Sample
We present two sampled quasi-Newton methods for deep learning: sampled L...

05/30/2019  Scaling Up Quasi-Newton Algorithms: Communication Efficient Distributed SR1
In this paper, we present a scalable distributed implementation of the s...
