Trading-off variance and complexity in stochastic gradient descent

03/22/2016
by   Vatsal Shah, et al.
0

Stochastic gradient descent is the method of choice for large-scale machine learning problems, by virtue of its light complexity per iteration. However, it lags behind its non-stochastic counterparts with respect to the convergence rate, due to high variance introduced by the stochastic updates. The popular Stochastic Variance-Reduced Gradient (SVRG) method mitigates this shortcoming, introducing a new update rule which requires infrequent passes over the entire input dataset to compute the full-gradient. In this work, we propose CheapSVRG, a stochastic variance-reduction optimization scheme. Our algorithm is similar to SVRG but instead of the full gradient, it uses a surrogate which can be efficiently computed on a small subset of the input data. It achieves a linear convergence rate ---up to some error level, depending on the nature of the optimization problem---and features a trade-off between the computational complexity and the convergence rate. Empirical evaluation shows that CheapSVRG performs at least competitively compared to the state of the art.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
10/21/2017

A Novel Stochastic Stratified Average Gradient Method: Convergence Rate and Its Complexity

SGD (Stochastic Gradient Descent) is a popular algorithm for large scale...
research
03/10/2020

Communication-efficient Variance-reduced Stochastic Gradient Descent

We consider the problem of communication efficient distributed optimizat...
research
09/29/2022

Computational Complexity of Sub-linear Convergent Algorithms

Optimizing machine learning algorithms that are used to solve the object...
research
11/20/2018

Variance Suppression: Balanced Training Process in Deep Learning

Stochastic gradient descent updates parameters with summation gradient c...
research
02/21/2020

Debiasing Stochastic Gradient Descent to handle missing values

A major caveat of large scale data is their incom-pleteness. We propose ...
research
11/05/2022

Stochastic Variance Reduced Gradient for affine rank minimization problem

We develop an efficient stochastic variance reduced gradient descent alg...
research
04/21/2016

Stabilized Sparse Online Learning for Sparse Data

Stochastic gradient descent (SGD) is commonly used for optimization in l...

Please sign up or login with your details

Forgot password? Click here to reset