Proximal and Federated Random Reshuffling

02/12/2021
by Konstantin Mishchenko, et al.

Random Reshuffling (RR), also known as Stochastic Gradient Descent (SGD) without replacement, is a popular and theoretically grounded method for finite-sum minimization. We propose two new algorithms: Proximal and Federated Random Reshuffling (ProxRR and FedRR). The first algorithm, ProxRR, solves composite convex finite-sum minimization problems in which the objective is the sum of a (potentially non-smooth) convex regularizer and an average of n smooth objectives. We obtain the second algorithm, FedRR, as a special case of ProxRR applied to a reformulation of distributed problems with either homogeneous or heterogeneous data. We study the algorithms' convergence properties with constant and decreasing stepsizes, and show that they have considerable advantages over Proximal and Local SGD. In particular, our methods have superior complexities, and ProxRR evaluates the proximal operator only once per epoch. When the proximal operator is expensive to compute, this small difference makes ProxRR up to n times faster than algorithms that evaluate the proximal operator in every iteration. We give examples of practical optimization tasks where the proximal operator is difficult to compute and ProxRR has a clear advantage. Finally, we corroborate our results with experiments on real datasets.
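
To make the per-epoch structure concrete, the following is a minimal, illustrative Python sketch of a ProxRR-style loop under stated assumptions: each epoch runs plain SGD steps over a fresh random permutation of the n smooth components and applies the proximal operator of the regularizer only once, at the end of the epoch. The names (`prox_rr`, `grads`, `prox`, `soft_threshold`) and the choice of prox stepsize `gamma * n` are assumptions made for illustration, not details quoted from the paper.

```python
import numpy as np

def prox_rr(x0, grads, prox, n, gamma, epochs, rng=None):
    """Illustrative ProxRR-style loop (a sketch, not the authors' reference code).

    Target problem (as described in the abstract):
        min_x  psi(x) + (1/n) * sum_i f_i(x),
    where psi is a (possibly non-smooth) convex regularizer.

    Assumed interface: grads[i](x) returns the gradient of f_i at x,
    and prox(v, step) evaluates the proximal operator of psi with stepsize `step`.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = x0.copy()
    for _ in range(epochs):
        # One epoch of without-replacement SGD steps over a fresh permutation.
        for i in rng.permutation(n):
            x = x - gamma * grads[i](x)
        # The proximal operator is evaluated only once per epoch; the aggregated
        # stepsize gamma * n is an assumption used here for illustration.
        x = prox(x, gamma * n)
    return x


if __name__ == "__main__":
    # Hypothetical usage: a small lasso-style least-squares problem whose
    # regularizer's proximal operator is soft-thresholding.
    rng = np.random.default_rng(0)
    A = rng.normal(size=(100, 20))
    x_true = np.zeros(20); x_true[:5] = 1.0
    b = A @ x_true
    lam = 0.1

    grads = [lambda x, a=A[i], bi=b[i]: (a @ x - bi) * a for i in range(100)]

    def soft_threshold(v, step):
        return np.sign(v) * np.maximum(np.abs(v) - lam * step, 0.0)

    x_hat = prox_rr(np.zeros(20), grads, soft_threshold,
                    n=100, gamma=1e-3, epochs=50)
    print(x_hat.round(2))
```

The example in the `__main__` block only shows how the pieces fit together on a toy problem; the abstract's point is that when the proximal operator is expensive, calling it once per epoch rather than at every step is what yields the up-to-n-times speedup.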

