1 Introduction
The problem we are interested in is to minimize a sum of two convex functions,
(1)  \min_{x \in \mathbb{R}^d} \{ P(x) := F(x) + R(x) \},
where F is the average of a large number of smooth convex functions f_i, i.e.,
(2)  F(x) = \frac{1}{n} \sum_{i=1}^{n} f_i(x).
We further make the following assumptions:
Assumption 1.
The regularizer R : \mathbb{R}^d \to \mathbb{R} \cup \{+\infty\} is convex and closed. The functions f_i : \mathbb{R}^d \to \mathbb{R} are differentiable and have Lipschitz continuous gradients with constant L > 0. That is,
\| \nabla f_i(x) - \nabla f_i(y) \| \le L \| x - y \|
for all x, y \in \mathbb{R}^d, where \| \cdot \| is the L2 norm.
Hence, the gradient of F is also Lipschitz continuous with the same constant L.
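The latter fact follows directly from the triangle inequality applied to the average defining F in (2):

```latex
\left\| \nabla F(x) - \nabla F(y) \right\|
  = \Bigl\| \tfrac{1}{n} \sum_{i=1}^{n} \bigl( \nabla f_i(x) - \nabla f_i(y) \bigr) \Bigr\|
  \le \tfrac{1}{n} \sum_{i=1}^{n} \left\| \nabla f_i(x) - \nabla f_i(y) \right\|
  \le L \, \| x - y \| .
```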
Assumption 2.
P is strongly convex with parameter \mu > 0, i.e., for all x, y \in \operatorname{dom}(R) and all \xi \in \partial P(x),
(3)  P(y) \ge P(x) + \xi^T (y - x) + \frac{\mu}{2} \| y - x \|^2,
where \partial P(x) is the subdifferential of P at x.
Let \mu_F and \mu_R be the strong convexity parameters of F and R, respectively (we allow both of them to be equal to 0, so this is not an additional assumption). We assume that lower bounds on \mu_F and \mu_R are available.
Related work
There has been intensive interest and activity in solving problems of the form (1) in the past years. An algorithm that made its way into many applied areas is FISTA [1]. However, this method is impractical in the large-scale setting (big n), as it needs to process all n functions in each iteration. Two classes of methods address this issue: randomized coordinate descent methods [13, 15, 16, 11, 5, 19, 9, 10, 14, 4] and stochastic gradient methods [22, 12, 6, 20]. This brief paper is closely related to the works on stochastic gradient methods that use a technique of explicit variance reduction of the stochastic approximation of the gradient. In particular, our method is a minibatch variant of S2GD [8]; the proximal setting was motivated by SVRG [7, 21].
A typical stochastic gradient descent (SGD) method will randomly sample a function f_i and then update the variable x using \nabla f_i(x), an estimate of \nabla F(x). An important limitation of SGD is that it is inherently sequential, and thus difficult to parallelize. In order to enable parallelism, minibatching (sampling multiple examples per iteration) is often employed [17, 3, 2, 23, 20, 18].
Our Contributions.
In this work, we combine the variance reduction ideas for stochastic gradient methods with minibatching. In particular, we develop and analyze mS2GD (Algorithm 1), a minibatch proximal variant of S2GD [8]. To the best of our knowledge, this is the first minibatch stochastic gradient method with reduced variance for problem (1). We show that the method enjoys a twofold benefit compared to previous methods. Apart from admitting a parallel implementation (and hence speedup in clock time in an HPC environment), our results show that in order to attain a specified accuracy, our minibatching scheme can get by with fewer gradient evaluations. This is formalized in Theorem 2, which predicts more than linear speedup up to a certain threshold minibatch size. Another advantage, compared to [21], is that we do not need to average the points in each loop; instead, we simply continue from the last one (this is the approach employed in S2GD [8]).
2 Proximal Algorithms
A popular proximal gradient approach to solving (1) is to form a sequence \{x_k\} via
x_{k+1} = \arg\min_{y \in \mathbb{R}^d} \left\{ F(x_k) + \nabla F(x_k)^T (y - x_k) + \frac{1}{2h} \| y - x_k \|^2 + R(y) \right\}.
Note that the minimized expression is an upper bound on P(y) if h > 0 is a stepsize parameter satisfying h \le 1/L. This procedure can be equivalently written using the proximal operator as follows:
x_{k+1} = \operatorname{prox}_{hR}\left( x_k - h \nabla F(x_k) \right), \quad \text{where} \quad \operatorname{prox}_{hR}(z) := \arg\min_{y} \left\{ \tfrac{1}{2} \| y - z \|^2 + h R(y) \right\}.
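For several common regularizers the proximal operator has a closed form; for instance, for R(x) = \lambda \|x\|_1 it is the soft-thresholding operator. A minimal NumPy sketch (the choice R = \lambda \|\cdot\|_1 is ours, for illustration only; it is not the regularizer used in our experiments):

```python
import numpy as np

def prox_l1(z, h, lam):
    """Proximal operator of R(x) = lam * ||x||_1 with stepsize h:
    argmin_y 0.5 * ||y - z||^2 + h * lam * ||y||_1  (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - h * lam, 0.0)

def prox_grad_step(x, grad_F, h, lam):
    """One deterministic proximal gradient step: x+ = prox_{hR}(x - h * grad F(x))."""
    return prox_l1(x - h * grad_F(x), h, lam)
```

For example, with h = lam = 1 the point (2, -0.5) is mapped to (1, 0): each coordinate shrinks toward zero by h * lam and small coordinates vanish.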
In the large-scale setting it is more efficient to instead consider the stochastic proximal gradient approach, in which the proximal operator is applied to a stochastic gradient step:
(4)  x_{k+1} = \operatorname{prox}_{hR}\left( x_k - h G_k \right),
where G_k is a stochastic estimate of the gradient \nabla F(x_k). Of particular relevance to our work are the SVRG [7], S2GD [8] and Prox-SVRG [21] methods, where the stochastic estimate of \nabla F(x_k) is of the form
(5)  G_k = \nabla f_{i_k}(x_k) - \nabla f_{i_k}(\tilde{x}) + \nabla F(\tilde{x}),
where \tilde{x} is an “old” reference point for which the gradient \nabla F(\tilde{x}) was already computed in the past, and i_k is a random index equal to i \in \{1, \ldots, n\} with probability 1/n. Notice that G_k is an unbiased estimate of the gradient:
\mathbb{E}\left[ G_k \mid x_k \right] = \nabla F(x_k).
Methods such as SVRG [7], S2GD [8] and Prox-SVRG [21] update the points x_k in an inner loop, and the reference point \tilde{x} in an outer loop. This ensures that G_k has low variance, which ultimately leads to extremely fast convergence.
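In code, the estimate (5) can be sketched as follows (naming is ours: grad_f(i, x) returns \nabla f_i(x), and full_grad is \nabla F(\tilde{x}) precomputed at the reference point):

```python
import numpy as np

def vr_estimate(grad_f, x, x_ref, full_grad, i):
    """Variance-reduced gradient estimate (5):
    G = grad f_i(x) - grad f_i(x_ref) + grad F(x_ref).
    Averaging over i = 1..n recovers grad F(x), so the estimate is unbiased;
    as x approaches x_ref, its variance shrinks."""
    return grad_f(i, x) - grad_f(i, x_ref) + full_grad
```

Unbiasedness can be checked numerically: averaging vr_estimate over all indices i equals the exact gradient of the average.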
3 Minibatch S2GD
We now describe the mS2GD method (Algorithm 1).
The main step of our method (Step 8) is given by the update (4), but with the stochastic estimate of the gradient G_k instead formed using a minibatch of examples of size b:
G_k = \nabla F(\tilde{x}) + \frac{1}{b} \sum_{i \in A_k} \left( \nabla f_i(x_k) - \nabla f_i(\tilde{x}) \right),
where A_k \subseteq \{1, \ldots, n\}, with |A_k| = b, is chosen uniformly at random. We run the inner loop for t iterations, where t \in \{1, 2, \ldots, m\} with probability q_t given by
(6)  q_t = \frac{(1 - \nu h)^{m - t}}{\beta}, \qquad \beta := \sum_{t=1}^{m} (1 - \nu h)^{m - t},
where \nu is the available lower bound on the strong convexity parameter.
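One outer iteration of mS2GD can be sketched as follows. This is our own minimal NumPy rendering of Algorithm 1, not a reference implementation: grad_f(i, x) returns \nabla f_i(x), prox(z, h) computes \operatorname{prox}_{hR}(z), and nu is the lower bound used in (6).

```python
import numpy as np

def ms2gd_epoch(grad_f, n, prox, x_ref, h, m, b, nu, rng):
    """One outer iteration of mS2GD (sketch)."""
    # Full gradient at the reference point: grad F(x_ref).
    full_grad = sum(grad_f(i, x_ref) for i in range(n)) / n
    # Draw the inner-loop length t in {1, ..., m} with probability
    # proportional to (1 - nu*h)^(m - t), as in (6).
    weights = (1.0 - nu * h) ** (m - np.arange(1, m + 1))
    t = rng.choice(np.arange(1, m + 1), p=weights / weights.sum())
    x = x_ref.copy()
    for _ in range(t):
        # Minibatch variance-reduced estimate of grad F(x).
        batch = rng.choice(n, size=b, replace=False)
        G = full_grad + sum(grad_f(i, x) - grad_f(i, x_ref) for i in batch) / b
        # Proximal step (4).
        x = prox(x - h * G, h)
    return x  # becomes the next reference point x_ref
```

On a toy smooth problem (f_i(x) = 0.5 * ||x - c_i||^2, R = 0, so prox is the identity), repeated epochs converge to the mean of the points c_i, the minimizer of F.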
4 Complexity Result
In this section, we state our main complexity result and comment on how to optimally choose the parameters of the method.
Theorem 1.
Remark 1.
4.1 Minibatch speedup
In order to see the speedup we can gain from the minibatch strategy, and due to the many parameters in the complexity result (Theorem 1), we need to fix some of the parameters. For simplicity, we will use lower bounds equal to the true strong convexity parameters, so we can analyse (8) instead of (7). Let us consider the case when we also fix the number of outer iterations. Once the minibatch size b is fixed, and in order to reach a given accuracy, we obtain the value of m which will guarantee the result.
Let us now fix a target decrease in a single epoch, \rho. For any minibatch size b, define (h^*(b), m_b) to be the optimal pair of stepsize and inner-loop size such that the per-epoch decrease equals \rho. This pair is optimal in the sense that m_b is the smallest possible, because we are interested in minimizing the computational effort, and hence minimizing m_b. If we set b = 1, we recover the optimal choice of parameters without minibatching. If m_b \le m_1 / b, then we can reach the same accuracy with fewer evaluations of the gradient of a function f_i. The following theorem states the formula for h^*(b) and m_b. Equation (9) shows that as long as the minibatch size stays below a certain threshold, m_b is decreasing at a rate faster than 1/b. Hence, we can attain the same accuracy with less work, compared to the case b = 1.
Theorem 2.
Fix a target decrease \rho, where the per-epoch decrease is given by (8). Then, if we consider the minibatch size b to be fixed, the choice of stepsize h^*(b) and size of the inner loop m_b that minimizes the work done (the number of gradients evaluated) while achieving the target decrease is given by the following formulas:
If then and
(9) 
Otherwise and
5 Experiments
In this section we present a preliminary experiment and an insight into the possible speedup achievable by parallelism. Figure 2 shows experiments on L2-regularized logistic regression on the RCV1 dataset.^1 We compare S2GD (blue, squares) and mS2GD (green, circles) with a fixed minibatch size, without any parallelism. The figure demonstrates that one can achieve the same accuracy with less work. The green dashed line is the ideal (most likely practically unachievable) result with parallelism (we divide the number of passes through the data by the minibatch size). For comparison, we also include SGD with a constant stepsize (purple, stars), chosen in hindsight to optimize performance. Figure 2 also shows the possible speedup in terms of work done, formalized in Theorem 2. Notice that up to a certain threshold, we do not need any more work to achieve the same accuracy (the red straight line is the ideal speedup; the blue curved line is what mS2GD achieves).
^1 Available at http://www.csie.ntu.edu.tw/cjlin/libsvmtools/datasets/.
References
 [1] Amir Beck and Marc Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sciences, 2(1):183–202, 2009.
 [2] Andrew Cotter, Ohad Shamir, Nati Srebro, and Karthik Sridharan. Better mini-batch algorithms via accelerated gradient methods. In NIPS, pages 1647–1655, 2011.
 [3] Ofer Dekel, Ran Gilad-Bachrach, Ohad Shamir, and Lin Xiao. Optimal distributed online prediction using mini-batches. JMLR, 13(1):165–202, 2012.

 [4] Olivier Fercoq, Zheng Qu, Peter Richtárik, and Martin Takáč. Fast distributed coordinate descent for non-strongly convex losses. In IEEE Workshop on Machine Learning for Signal Processing, 2014.
 [5] Olivier Fercoq and Peter Richtárik. Accelerated, parallel and proximal coordinate descent. arXiv:1312.5799, 2013.
 [6] Martin Jaggi, Virginia Smith, Martin Takáč, Jonathan Terhorst, Thomas Hofmann, and Michael I. Jordan. Communication-efficient distributed dual coordinate ascent. NIPS, 2014.
 [7] Rie Johnson and Tong Zhang. Accelerating stochastic gradient descent using predictive variance reduction. NIPS, pages 315–323, 2013.
 [8] Jakub Konečný and Peter Richtárik. Semi-stochastic gradient descent methods. arXiv:1312.1666, 2013.
 [9] Jakub Mareček, Peter Richtárik, and Martin Takáč. Distributed block coordinate descent for minimizing partially separable functions. arXiv:1406.0238, 2014.
 [10] Ion Necoara and Dragos Clipici. Distributed coordinate descent methods for composite minimization. arXiv:1312.5302, 2013.
 [11] Ion Necoara and Andrei Patrascu. A random coordinate descent algorithm for optimization problems with composite objective function and linear coupled constraints. Comp. Optimization and Applications, 57(2):307–337, 2014.
 [12] Arkadi Nemirovski, Anatoli Juditsky, Guanghui Lan, and Alexander Shapiro. Robust stochastic approximation approach to stochastic programming. SIAM J. Optimization, 19(4):1574–1609, 2009.
 [13] Yurii Nesterov. Efficiency of coordinate descent methods on huge-scale optimization problems. SIAM J. Optimization, 22:341–362, 2012.
 [14] Peter Richtárik and Martin Takáč. Distributed coordinate descent method for learning with big data. arXiv:1310.2059, 2013.
 [15] Peter Richtárik and Martin Takáč. Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function. Mathematical Programming, 144(1-2):1–38, 2014.
 [16] Peter Richtárik and Martin Takáč. Parallel coordinate descent methods for big data optimization. arXiv:1212.0873, 2012.
 [17] Shai Shalev-Shwartz, Yoram Singer, Nathan Srebro, and Andrew Cotter. Pegasos: primal estimated sub-gradient solver for SVM. Mathematical Programming (Series A and B, Special Issue on Optimization and Machine Learning), pages 3–30, 2011.
 [18] Shai Shalev-Shwartz and Tong Zhang. Accelerated mini-batch stochastic dual coordinate ascent. In NIPS, pages 378–385, 2013.
 [19] Shai Shalev-Shwartz and Tong Zhang. Stochastic dual coordinate ascent methods for regularized loss minimization. JMLR, 14(1):567–599, 2013.
 [20] Martin Takáč, Avleen Singh Bijral, Peter Richtárik, and Nathan Srebro. Mini-batch primal and dual methods for SVMs. ICML, 2013.
 [21] Lin Xiao and Tong Zhang. A proximal stochastic gradient method with progressive variance reduction. arXiv:1403.4699, 2014.
 [22] Tong Zhang. Solving large scale linear prediction using stochastic gradient descent algorithms. In ICML, 2004.
 [23] Peilin Zhao and Tong Zhang. Accelerating minibatch stochastic gradient descent using stratified sampling. arXiv:1405.3080, 2014.