Semi-Stochastic Gradient Descent Methods

12/05/2013
by Jakub Konečný et al.

In this paper we study the problem of minimizing the average of a large number (n) of smooth convex loss functions. We propose a new method, S2GD (Semi-Stochastic Gradient Descent), which runs for one or several epochs, in each of which a single full gradient and a random number of stochastic gradients are computed, following a geometric law. The total work needed for the method to output an ε-accurate solution in expectation, measured in the number of passes over data, or, equivalently, in units equivalent to the computation of a single gradient of the loss, is O((κ/n) log(1/ε)), where κ is the condition number. This is achieved by running the method for O(log(1/ε)) epochs, with a single full gradient evaluation and O(κ) stochastic gradient evaluations in each. The SVRG method of Johnson and Zhang arises as a special case. If our method is limited to a single epoch only, it needs to evaluate at most O((κ/ε) log(1/ε)) stochastic gradients. In contrast, SVRG requires O(κ/ε^2) stochastic gradients. To illustrate our theoretical results: S2GD needs a workload equivalent to only about 2.1 full gradient evaluations to find a 10^-6-accurate solution to a problem with n = 10^9 and κ = 10^3.
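
The epoch structure described above is simple enough to sketch in code. Below is a minimal, illustrative Python sketch of the S2GD loop: each epoch computes one full gradient and then performs a random, geometrically distributed number of variance-reduced stochastic steps. The function name s2gd, the grad_i(x, i) callback, and the default parameter values are hypothetical choices made for this illustration, not the authors' reference implementation; in practice the step size h and the strong convexity lower bound nu should be set according to the paper's theory.

```python
import numpy as np

def s2gd(grad_i, n, x0, epochs=10, m=None, h=0.1, nu=0.0, rng=None):
    """Illustrative sketch of Semi-Stochastic Gradient Descent (S2GD).

    grad_i(x, i) -- gradient of the i-th loss at x (user-supplied callback)
    n            -- number of loss functions
    x0           -- starting point
    epochs       -- number of outer epochs
    m            -- cap on the number of inner stochastic steps per epoch
    h            -- constant step size (assumed; must satisfy the theory)
    nu           -- lower bound on the strong convexity parameter
                    (nu = 0 makes the law below uniform over {1, ..., m})
    """
    rng = np.random.default_rng() if rng is None else rng
    m = n if m is None else m
    x = np.asarray(x0, dtype=float)

    # Geometric law over the number of inner steps:
    # P(T = t) proportional to (1 - nu*h)^(m - t), t = 1, ..., m.
    t_vals = np.arange(1, m + 1)
    probs = (1.0 - nu * h) ** (m - t_vals)
    probs /= probs.sum()

    for _ in range(epochs):
        # One full gradient per epoch (the expensive, exact part).
        mu = np.mean([grad_i(x, i) for i in range(n)], axis=0)

        # Random number of cheap stochastic steps, drawn from the geometric law.
        T = rng.choice(t_vals, p=probs)
        y = x.copy()
        for _ in range(T):
            i = rng.integers(n)
            # Variance-reduced stochastic step (SVRG-style correction).
            y -= h * (grad_i(y, i) - grad_i(x, i) + mu)
        x = y
    return x
```

As a concrete (assumed) usage, for ridge regression one could pass grad_i = lambda x, i: A[i] * (A[i] @ x - b[i]) + lam * x, where A, b, and lam are the design matrix, targets, and regularization parameter.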


research, 10/17/2014
mS2GD: Mini-Batch Semi-Stochastic Gradient Descent in the Proximal Setting
We propose a mini-batching scheme for improving the theoretical complexi...

research, 04/16/2015
Mini-Batch Semi-Stochastic Gradient Descent in the Proximal Setting
We propose mS2GD: a method incorporating a mini-batching scheme for impr...

research, 05/26/2022
Active Labeling: Streaming Stochastic Gradients
The workhorse of machine learning is stochastic gradient descent. To acc...

research, 06/24/2020
Befriending The Byzantines Through Reputation Scores
We propose two novel stochastic gradient descent algorithms, ByGARS and ...

research, 01/09/2019
The Lingering of Gradients: How to Reuse Gradients over Time
Classically, the time complexity of a first-order method is estimated by...

research, 02/11/2019
Topology Optimization under Uncertainty using a Stochastic Gradient-based Approach
Topology optimization under uncertainty (TOuU) often defines objectives ...

research, 04/14/2020
Strategic Investment in Energy Markets: A Multiparametric Programming Approach
An investor has to carefully select the location and size of new generat...
