Avoiding Communication in Logistic Regression

11/16/2020
by Aditya Devarakonda, et al.

Stochastic gradient descent (SGD) is one of the most widely used optimization methods in machine learning. SGD solves an optimization problem by iteratively sampling a few data points from the input data, computing gradients for the selected data points, and updating the solution. In a parallel setting, however, SGD requires interprocess communication at every iteration. We introduce a new communication-avoiding technique for solving the logistic regression problem using SGD. This technique reorganizes the SGD computations into a form that communicates every s iterations instead of every iteration, where s is a tuning parameter. We prove theoretical upper bounds on flops, bandwidth, and latency for SGD and its new communication-avoiding variant. Furthermore, we present experimental results showing that the new Communication-Avoiding SGD (CA-SGD) method can achieve speedups of up to 4.97× on a high-performance InfiniBand cluster without altering the convergence behavior or accuracy.
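To make the baseline concrete, the following is a minimal single-process sketch of mini-batch SGD for logistic regression, following the sample, compute gradients, update loop described in the abstract. It is not the authors' implementation and does not show the CA-SGD reformulation. The function names (`sigmoid`, `sgd_logistic`), the 0/1 label encoding, and the default batch size, step size, and iteration count are assumptions chosen for illustration. The comment inside the loop marks where a parallel run would need to combine partial gradients across processes at every iteration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sgd_logistic(X, y, batch_size=2, lr=0.5, iters=500, seed=0):
    """Mini-batch SGD for logistic regression (illustrative sketch).

    X is a list of feature rows, y a list of 0/1 labels.
    All parameter names and defaults are hypothetical.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iters):
        # 1. Sample a few data points.
        batch = rng.sample(range(n), batch_size)
        # 2. Compute the gradient of the logistic loss on the sample.
        grad = [0.0] * d
        for i in batch:
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, X[i])))
            for j in range(d):
                grad[j] += (p - y[i]) * X[i][j]
        # In a parallel setting, partial gradients would be summed across
        # processes here, which is the per-iteration communication that
        # the abstract refers to.
        # 3. Update the solution.
        w = [wj - lr * gj / batch_size for wj, gj in zip(w, grad)]
    return w


# Tiny usage example: first column is a constant bias feature.
X = [[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]]
y = [0, 0, 1, 1]
w = sgd_logistic(X, y)
```

In this sketch the cost of the marked step is paid once per iteration; the technique described in the abstract restructures the computation so that the corresponding communication happens once every s iterations.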

research
01/10/2019

Quantized Epoch-SGD for Communication-Efficient Distributed Learning

Due to its efficiency and ease to implement, stochastic gradient descent...
research
01/27/2019


It is well known that many optimization methods, including SGD, SAGA, an...
research
05/21/2021

Escaping Saddle Points with Compressed SGD

Stochastic gradient descent (SGD) is a prevalent optimization technique ...
research
06/11/2015

Variance Reduced Stochastic Gradient Descent with Neighbors

Stochastic Gradient Descent (SGD) is a workhorse in machine learning, ye...
research
07/26/2017

A Robust Multi-Batch L-BFGS Method for Machine Learning

This paper describes an implementation of the L-BFGS method designed to ...
research
05/29/2019

Accelerated Sparsified SGD with Error Feedback

We study a stochastic gradient method for synchronous distributed optimi...
research
10/24/2017

Avoiding Communication in Proximal Methods for Convex Optimization Problems

The fast iterative soft thresholding algorithm (FISTA) is used to solve ...
