On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants

06/23/2015
by Sashank J. Reddi, et al.

We study optimization algorithms based on variance reduction for stochastic gradient descent (SGD). Remarkable recent progress has been made in this direction through the development of algorithms such as SAG, SVRG, and SAGA. These algorithms have been shown to outperform SGD both theoretically and empirically. However, asynchronous versions of these algorithms, which are crucial for modern large-scale applications, have not been studied. We bridge this gap by presenting a unifying framework for many variance reduction techniques. Building on this framework, we propose an asynchronous algorithm and prove its fast convergence. An important consequence of our general approach is that it yields asynchronous versions of variance reduction algorithms such as SVRG and SAGA as a byproduct. Our method achieves near-linear speedup in the sparse settings common to machine learning. We demonstrate the empirical performance of our method through a concrete realization of asynchronous SVRG.
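The central object behind the abstract is the variance-reduced gradient estimator used by methods such as SVRG. As a reference point, the following is a minimal serial sketch of SVRG on a least-squares objective; it is not the paper's asynchronous algorithm, and the function name, step size, and loop counts are illustrative assumptions rather than values from the paper.

```python
import numpy as np

def svrg(X, y, eta=0.1, outer_iters=20, inner_iters=None, seed=0):
    """Minimal serial SVRG sketch for f(w) = 1/(2n) * ||Xw - y||^2."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    m = inner_iters or n          # common choice: roughly one pass per epoch
    w = np.zeros(d)
    for _ in range(outer_iters):
        w_snap = w.copy()
        # Full gradient at the snapshot: the "anchor" that reduces variance.
        full_grad = X.T @ (X @ w_snap - y) / n
        for _ in range(m):
            i = rng.integers(n)
            # Per-example gradients at the current iterate and at the snapshot.
            g_i = X[i] * (X[i] @ w - y[i])
            g_i_snap = X[i] * (X[i] @ w_snap - y[i])
            # Variance-reduced stochastic gradient: unbiased, and its variance
            # shrinks as w and w_snap both approach the optimum.
            w = w - eta * (g_i - g_i_snap + full_grad)
    return w
```

Compared with plain SGD, the corrected direction `g_i - g_i_snap + full_grad` allows a constant step size, which is what underlies the faster convergence results for SVRG-type methods; the asynchronous variants studied in the paper apply such updates from multiple workers without locking.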

Related research

Fast Asynchronous Parallel Stochastic Gradient Descent (08/24/2015)
Stochastic gradient descent (SGD) and its variants have become more and ...

CYCLADES: Conflict-free Asynchronous Machine Learning (05/31/2016)
We present CYCLADES, a general framework for parallelizing stochastic op...

Asynchronous Stochastic Composition Optimization with Variance Reduction (11/15/2018)
Composition optimization has drawn a lot of attention in a wide variety ...

(Near) Optimal Parallelism Bound for Fully Asynchronous Coordinate Descent with Linear Speedup (11/08/2018)
When solving massive optimization problems in areas such as machine lear...

Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Optimization (07/20/2017)
Due to their simplicity and excellent performance, parallel asynchronous...

Consistent Lock-free Parallel Stochastic Gradient Descent for Fast and Stable Convergence (02/17/2021)
Stochastic gradient descent (SGD) is an essential element in Machine Lea...

ASYNC: Asynchronous Machine Learning on Distributed Systems (07/19/2019)
ASYNC is a framework that supports the implementation of asynchronous ma...
