Provably Doubly Accelerated Federated Learning: The First Theoretically Successful Combination of Local Training and Compressed Communication

10/24/2022
by Laurent Condat, et al.

In the modern paradigm of federated learning, a large number of users collaborate on a global learning task. They alternate local computations with two-way communication with a distant orchestrating server. Communication, which can be slow and costly, is the main bottleneck in this setting. To reduce the communication load and therefore accelerate distributed gradient descent, two strategies are popular: 1) communicate less frequently, that is, perform several iterations of local computations between communication rounds; and 2) communicate compressed information instead of full-dimensional vectors. In this paper, we propose the first algorithm for distributed optimization and federated learning that harnesses these two strategies jointly and converges linearly to an exact solution, with a doubly accelerated rate: our algorithm benefits from the two acceleration mechanisms provided by local training and compression, namely a better dependency on the condition number of the functions and on the dimension of the model, respectively.
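To make the two communication-reduction strategies concrete, the following Python sketch simulates a generic federated loop that combines them: each client performs several local gradient steps between communication rounds and sends a rand-k sparsified model update back to the server. This is a plain illustrative baseline, not the algorithm proposed in the paper (which adds the mechanisms needed for linear convergence to the exact solution); the losses, function names, and constants below are assumptions chosen for the example.

```python
# Minimal sketch (illustrative only): local training + compressed uplink messages.
import numpy as np

rng = np.random.default_rng(0)

def rand_k_compress(v, k):
    """Unbiased rand-k sparsifier: keep k random coordinates, rescale by d/k."""
    d = v.size
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros_like(v)
    out[idx] = v[idx] * (d / k)
    return out

def local_grad(x, A, b):
    """Gradient of the (assumed) local quadratic loss f_i(x) = 0.5 * ||A x - b||^2."""
    return A.T @ (A @ x - b)

def federated_loop(As, bs, d, rounds=50, local_steps=5, lr=0.01, k=3):
    """Server loop combining the two strategies from the abstract."""
    x = np.zeros(d)                          # global model held by the server
    for _ in range(rounds):
        updates = []
        for A, b in zip(As, bs):             # every client starts from the global model
            x_local = x.copy()
            for _ in range(local_steps):     # strategy 1: several local gradient steps
                x_local -= lr * local_grad(x_local, A, b)
            delta = x_local - x              # model update to communicate
            updates.append(rand_k_compress(delta, k))  # strategy 2: compress the message
        x += np.mean(updates, axis=0)        # server averages the compressed updates
    return x

if __name__ == "__main__":
    d, n_clients = 10, 4
    As = [rng.standard_normal((20, d)) for _ in range(n_clients)]
    bs = [A @ np.ones(d) for A in As]        # shared ground-truth model of all ones
    x_hat = federated_loop(As, bs, d)
    print("distance to ground truth:", np.linalg.norm(x_hat - np.ones(d)))
```

In this toy setting the clients share a common minimizer, so the naive combination above behaves well; in general, naively stacking local steps and compression introduces bias and variance, which is precisely the gap the paper's doubly accelerated algorithm addresses.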

Related research

02/20/2023 - TAMUNA: Accelerated Federated Learning with Local Training and Partial Participation
In federated learning, a large number of users are involved in a global ...

07/20/2021 - CANITA: Faster Rates for Distributed Convex Optimization with Communication Compression
Due to the high communication cost in distributed and federated learning...

02/26/2020 - Acceleration for Compressed Gradient Descent in Distributed and Federated Optimization
Due to the high communication cost in distributed and federated learning...

04/03/2020 - From Local SGD to Local Fixed Point Methods for Federated Learning
Most algorithms for solving optimization problems or finding saddle poin...

02/18/2022 - ProxSkip: Yes! Local Gradient Steps Provably Lead to Communication Acceleration! Finally!
We introduce ProxSkip – a surprisingly simple and provably efficient met...

09/11/2020 - Federated Generalized Bayesian Learning via Distributed Stein Variational Gradient Descent
This paper introduces Distributed Stein Variational Gradient Descent (DS...

02/20/2020 - Uncertainty Principle for Communication Compression in Distributed and Federated Learning and the Search for an Optimal Compressor
In order to mitigate the high communication cost in distributed and fede...
