Stochastic-Sign SGD for Federated Learning with Theoretical Guarantees

by Richeng Jin, et al.

Federated learning (FL) has emerged as a prominent distributed learning paradigm. FL creates a pressing need for novel parameter estimation approaches that come with theoretical convergence guarantees and are also communication efficient, differentially private, and Byzantine resilient under heterogeneous data distributions. Quantization-based SGD solvers have been widely adopted in FL, and the recently proposed SIGNSGD with majority vote shows a promising direction. However, no existing method enjoys all of the aforementioned properties. In this paper, we propose an intuitively simple yet theoretically sound method based on SIGNSGD to bridge the gap. We present Stochastic-Sign SGD, which utilizes novel stochastic-sign-based gradient compressors that enable the aforementioned properties in a unified framework. We also present an error-feedback variant of the proposed Stochastic-Sign SGD, which further improves the learning performance in FL. We test the proposed method with extensive experiments using deep neural networks on the MNIST dataset. The experimental results corroborate the effectiveness of the proposed method.
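To make the idea of a stochastic-sign compressor combined with majority vote more concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the compressor's definition, so the rule used here (send +1 with probability (b + g) / (2b), where the bound `b` is assumed to satisfy b >= max |g|) is one plausible choice, and the names `stochastic_sign`, `majority_vote`, `sgd_step`, `b`, and `lr` are hypothetical. Under this rule each 1-bit message has expectation g / b, so it is an unbiased estimate of the scaled gradient.

```python
import random


def stochastic_sign(grad, b, rng):
    """Compress each coordinate g to +1 or -1 (1 bit per coordinate).

    Assumed rule (not taken from the source): +1 is sent with
    probability (b + g) / (2b), which requires b >= max |g|.
    """
    out = []
    for g in grad:
        p_plus = (b + g) / (2.0 * b)
        out.append(1 if rng.random() < p_plus else -1)
    return out


def majority_vote(sign_vectors):
    """Coordinate-wise sign of the summed worker messages (0 on a tie)."""
    votes = []
    for coords in zip(*sign_vectors):
        s = sum(coords)
        votes.append((s > 0) - (s < 0))
    return votes


def sgd_step(weights, worker_grads, b, lr, rng):
    """One server round: compress on each worker, vote, then update."""
    signs = [stochastic_sign(g, b, rng) for g in worker_grads]
    vote = majority_vote(signs)
    return [w - lr * v for w, v in zip(weights, vote)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Three hypothetical workers with heterogeneous local gradients.
    grads = [[0.8, -0.2], [0.5, -0.9], [-0.1, -0.4]]
    print(sgd_step([0.0, 0.0], grads, b=1.0, lr=0.1, rng=rng))
```

Each worker uploads only one bit per coordinate, and the server broadcasts only the voted sign, which is where the communication saving comes from. The randomness in the compressor is what distinguishes this sketch from a deterministic sign: small-magnitude coordinates flip more often, so their votes carry less weight on average.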




DP-SIGNSGD: When Efficiency Meets Privacy and Robustness

Federated learning (FL) has emerged as a promising collaboration paradig...

FL-NTK: A Neural Tangent Kernel-based Framework for Federated Learning Convergence Analysis

Federated Learning (FL) is an emerging learning scheme that allows diffe...

Local SGD: Unified Theory and New Efficient Methods

We present a unified framework for analyzing local SGD methods in the co...

FedADC: Accelerated Federated Learning with Drift Control

Federated learning (FL) has become the de facto framework for collaborative ...

CoFED: Cross-silo Heterogeneous Federated Multi-task Learning via Co-training

Federated Learning (FL) is a machine learning technique that enables par...

Federated Learning over Noisy Channels: Convergence Analysis and Design Examples

Does Federated Learning (FL) work when both uplink and downlink communic...