Variance Reduction for Distributed Stochastic Gradient Descent

12/05/2015
by Soham De, et al.

Variance reduction (VR) methods boost the performance of stochastic gradient descent (SGD) by enabling the use of larger, constant stepsizes while preserving linear convergence rates. However, current variance-reduced SGD methods require either high memory usage or an exact gradient computation (over the entire dataset) at the end of each epoch. This limits the use of VR methods in practical distributed settings. In this paper, we propose a variance reduction method, called VR-lite, that requires neither full gradient computations nor extra storage. We explore distributed synchronous and asynchronous variants that are scalable and remain stable even with low communication frequency. We empirically compare both the sequential and distributed algorithms to state-of-the-art stochastic optimization methods, and find that our proposed algorithms compare favorably with other stochastic methods.
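For context, here is a minimal sketch of the SVRG-style variance-reduced update that the abstract contrasts against, written in Python/NumPy for a least-squares objective. The function name, stepsize, and data are illustrative assumptions, not the paper's VR-lite method; the point is the full pass over the dataset at each epoch (the snapshot gradient), which is exactly the per-epoch cost that VR-lite is designed to avoid.

import numpy as np

def svrg(X, y, stepsize=0.02, epochs=30, seed=0):
    """SVRG-style variance-reduced SGD for least squares:
    min_w (1/2n) * ||X w - y||^2.
    Hypothetical illustration; not the paper's VR-lite algorithm."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        w_snap = w.copy()
        # Full gradient at the snapshot: one exact pass over all n samples.
        # This is the per-epoch cost the abstract identifies as impractical
        # in distributed settings.
        full_grad = X.T @ (X @ w_snap - y) / n
        for _ in range(n):  # inner loop, commonly O(n) stochastic steps
            i = rng.integers(n)
            g_i = X[i] * (X[i] @ w - y[i])          # sample gradient at w
            g_snap = X[i] * (X[i] @ w_snap - y[i])  # same sample at snapshot
            # Unbiased, variance-reduced direction: its variance vanishes as
            # w approaches the optimum, which is what permits a larger,
            # constant stepsize than plain SGD.
            w -= stepsize * (g_i - g_snap + full_grad)
    return w

# Toy usage on synthetic data: the iterate approaches the planted solution.
X = np.random.default_rng(1).normal(size=(200, 5))
w_true = np.arange(5, dtype=float)
y = X @ w_true
print(np.linalg.norm(svrg(X, y) - w_true))  # small residual error

Storing per-sample gradients instead of recomputing the snapshot pass (as in SAGA-type methods) trades this full gradient for O(n) extra memory; the abstract's point is that VR-lite avoids both costs.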


Related research:

04/17/2017 · Larger is Better: The Effect of Learning Rates Enjoyed by Stochastic Optimization with Progressive Variance Reduction
In this paper, we propose a simple variant of the original stochastic va...

12/09/2015 · Efficient Distributed SGD with Variance Reduction
Stochastic Gradient Descent (SGD) has become one of the most popular opt...

06/11/2015 · Variance Reduced Stochastic Gradient Descent with Neighbors
Stochastic Gradient Descent (SGD) is a workhorse in machine learning, ye...

11/15/2018 · Asynchronous Stochastic Composition Optimization with Variance Reduction
Composition optimization has drawn a lot of attention in a wide variety ...

10/02/2020 · Variance-Reduced Methods for Machine Learning
Stochastic optimization lies at the heart of machine learning, and its c...

02/05/2016 · Reducing Runtime by Recycling Samples
Contrary to the situation with stochastic gradient descent, we argue tha...

03/20/2017 · Guaranteed Sufficient Decrease for Variance Reduced Stochastic Gradient Descent
In this paper, we propose a novel sufficient decrease technique for vari...
