Reducing the Communication Cost of Federated Learning through Multistage Optimization

08/16/2021
by   Charlie Hou, et al.

A central question in federated learning (FL) is how to design optimization algorithms that minimize the communication cost of training a model over heterogeneous data distributed across many clients. A popular technique for reducing communication is the use of local steps, where clients take multiple optimization steps over local data before communicating with the server (e.g., FedAvg, SCAFFOLD). This contrasts with centralized methods, where clients take one optimization step per communication round (e.g., Minibatch SGD). A recent lower bound on the communication complexity of first-order methods shows that centralized methods are optimal over highly heterogeneous data, whereas local methods are optimal over purely homogeneous data [Woodworth et al., 2020]. For intermediate heterogeneity levels, no algorithm is known to match the lower bound. In this paper, we propose a multistage optimization scheme that nearly matches the lower bound across all heterogeneity levels. The idea is to first run a local method until it reaches a heterogeneity-induced error floor, and then switch to a centralized method for the remaining steps. Our analysis may help explain empirically successful stepsize-decay methods in FL [Charles et al., 2020; Reddi et al., 2020]. We demonstrate the scheme's practical utility in image classification tasks.
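The sketch below illustrates the two-stage idea on a toy problem: a local (FedAvg-style) phase that makes fast progress until heterogeneity stalls it, followed by a centralized (Minibatch-SGD-style) phase. It is a minimal illustration, not the paper's algorithm; the quadratic client objectives, stepsizes, and the hard-coded switch round are all assumptions introduced here for demonstration.

```python
# Minimal sketch of the multistage idea, assuming synthetic quadratic
# client objectives and a fixed stage-switch round (both are illustrative
# choices, not taken from the paper).
import numpy as np

rng = np.random.default_rng(0)
NUM_CLIENTS, DIM = 10, 5

# Client i holds f_i(w) = 0.5 * ||A_i w - b_i||^2; heterogeneity comes from
# the per-client perturbation of the shared target w_star.
A = [rng.normal(size=(20, DIM)) for _ in range(NUM_CLIENTS)]
w_star = rng.normal(size=DIM)
b = [A_i @ (w_star + 0.5 * rng.normal(size=DIM)) for A_i in A]


def grad(i, w):
    """Full gradient of client i's quadratic loss at w."""
    return A[i].T @ (A[i] @ w - b[i])


def local_method_round(w, lr=0.01, local_steps=10):
    """One FedAvg-style round: each client takes several local gradient
    steps, then the server averages the resulting iterates."""
    updated = []
    for i in range(NUM_CLIENTS):
        w_i = w.copy()
        for _ in range(local_steps):
            w_i -= lr * grad(i, w_i)
        updated.append(w_i)
    return np.mean(updated, axis=0)


def centralized_round(w, lr=0.01):
    """One Minibatch-SGD-style round: each client sends one gradient at the
    current iterate, and the server applies the averaged gradient."""
    g = np.mean([grad(i, w) for i in range(NUM_CLIENTS)], axis=0)
    return w - lr * g


def global_loss(w):
    return 0.5 * np.mean([np.sum((A[i] @ w - b[i]) ** 2) for i in range(NUM_CLIENTS)])


# Stage 1 (local method) runs until an assumed switch round, standing in for
# "run until the heterogeneity-induced error floor"; stage 2 (centralized
# method) handles the remaining rounds.
w = np.zeros(DIM)
SWITCH_ROUND, TOTAL_ROUNDS = 30, 100
for r in range(TOTAL_ROUNDS):
    w = local_method_round(w) if r < SWITCH_ROUND else centralized_round(w)
    if r % 20 == 0 or r == TOTAL_ROUNDS - 1:
        print(f"round {r:3d}  global loss {global_loss(w):.4f}")
```

In a practical run, the switch point would be governed by when the local phase's progress flattens out (the error floor induced by client heterogeneity), rather than by a fixed round count as in this toy example.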
