Federated Learning with Compression: Unified Analysis and Sharp Guarantees
In federated learning, communication cost is often a critical bottleneck to scaling up distributed optimization algorithms to collaboratively learn a model from millions of devices with potentially unreliable or limited communication and heterogeneous data distributions. Two notable trends for dealing with the communication overhead of federated algorithms are gradient compression and local computation with periodic communication. Despite many attempts, characterizing the relationship between these two approaches has proven elusive. We address this by proposing a set of algorithms with periodic compressed (quantized or sparsified) communication and analyzing their convergence properties in both homogeneous and heterogeneous local data distribution settings. For the homogeneous setting, our analysis improves existing bounds by providing tighter convergence rates for both strongly convex and nonconvex objective functions. To mitigate data heterogeneity, we introduce a local gradient tracking scheme and obtain sharp convergence rates that match the best-known communication complexities without compression for convex, strongly convex, and nonconvex settings. We complement our theoretical results with several experiments on real-world datasets that demonstrate the effectiveness of the proposed methods.
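
As a rough illustration of the homogeneous setting described above, the sketch below implements plain local SGD with periodic communication, where each client sends a top-k sparsified version of its accumulated model update and the server averages the compressed updates. The quadratic objective, noise model, top_k compressor, and all hyperparameters are illustrative assumptions, not the paper's algorithm or pseudocode; in particular, the local gradient tracking correction used for heterogeneous data is omitted here.

# Illustrative sketch only (not the paper's exact algorithm): local SGD with
# periodic communication of top-k sparsified model updates, on a synthetic
# homogeneous quadratic objective with noisy gradients.
import numpy as np

rng = np.random.default_rng(0)
DIM, NUM_CLIENTS = 20, 8
LOCAL_STEPS, ROUNDS, LR, TOP_K = 10, 100, 0.05, 5

# Homogeneous setting: every client minimizes f(x) = 0.5 * ||x - target||^2
# but only observes stochastic gradients.
target = rng.normal(scale=3.0, size=DIM)

def stochastic_grad(x):
    return (x - target) + rng.normal(scale=0.1, size=DIM)

def top_k(v, k):
    """Keep the k largest-magnitude coordinates, zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

x_global = np.zeros(DIM)
for rnd in range(ROUNDS):
    compressed_updates = []
    for _ in range(NUM_CLIENTS):
        x = x_global.copy()
        for _ in range(LOCAL_STEPS):        # local computation phase
            x -= LR * stochastic_grad(x)
        # Periodic communication: send only a compressed (sparsified) version
        # of the accumulated local update x_global - x to the server.
        compressed_updates.append(top_k(x_global - x, TOP_K))
    # Server averages the compressed updates and applies them to the model.
    x_global -= np.mean(compressed_updates, axis=0)

print("distance to optimum:", np.linalg.norm(x_global - target))

In this sketch each round costs only TOP_K coordinates per client instead of DIM, at the price of a compression error that the analysis in the paper accounts for; with heterogeneous client objectives, a drift-correction (gradient tracking) term would additionally be maintained per client.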