Mini-Batch Semi-Stochastic Gradient Descent in the Proximal Setting

04/16/2015
by   Jakub Konečný, et al.
0

We propose mS2GD: a method incorporating a mini-batching scheme for improving the theoretical complexity and practical performance of semi-stochastic gradient descent (S2GD). We consider the problem of minimizing a strongly convex function represented as the sum of an average of a large number of smooth convex functions, and a simple nonsmooth convex regularizer. Our method first performs a deterministic step (computation of the gradient of the objective function at the starting point), followed by a large number of stochastic steps. The process is repeated a few times with the last iterate becoming the new starting point. The novelty of our method is in introduction of mini-batching into the computation of stochastic steps. In each step, instead of choosing a single function, we sample b functions, compute their gradients, and compute the direction based on this. We analyze the complexity of the method and show that it benefits from two speedup effects. First, we prove that as long as b is below a certain threshold, we can reach any predefined accuracy with less overall work than without mini-batching. Second, our mini-batching scheme admits a simple parallel implementation, and hence is suitable for further acceleration by parallelization.

READ FULL TEXT

page 9

page 10

research
10/17/2014

mS2GD: Mini-Batch Semi-Stochastic Gradient Descent in the Proximal Setting

We propose a mini-batching scheme for improving the theoretical complexi...
research
03/19/2014

A Proximal Stochastic Gradient Method with Progressive Variance Reduction

We consider the problem of minimizing the sum of two convex functions: o...
research
12/05/2013

Semi-Stochastic Gradient Descent Methods

In this paper we study the problem of minimizing the average of a large ...
research
04/14/2020

Strategic Investment in Energy Markets: A Multiparametric Programming Approach

An investor has to carefully select the location and size of new generat...
research
10/02/2020

A variable metric mini-batch proximal stochastic recursive gradient algorithm with diagonal Barzilai-Borwein stepsize

Variable metric proximal gradient methods with different metric selectio...
research
02/07/2023

Two Losses Are Better Than One: Faster Optimization Using a Cheaper Proxy

We present an algorithm for minimizing an objective with hard-to-compute...
research
05/03/2023

Select without Fear: Almost All Mini-Batch Schedules Generalize Optimally

We establish matching upper and lower generalization error bounds for mi...

Please sign up or login with your details

Forgot password? Click here to reset