Distributed Momentum for Byzantine-resilient Learning

02/28/2020
by El Mahdi El Mhamdi, et al.

Momentum is a variant of gradient descent proposed for its convergence benefits. In a distributed setting, momentum can be implemented either at the server or at the worker side. When the aggregation rule used by the server is linear, commutativity with addition makes both deployments equivalent. Robustness and privacy are, however, among the motivations to abandon linear aggregation rules. In this work, we demonstrate the robustness benefits of computing momentum at the worker side. We first prove that worker-side momentum reduces the variance-norm ratio of the gradient estimate at the server, which strengthens Byzantine-resilient aggregation rules. We then provide an extensive experimental demonstration of the robustness effect of worker-side momentum on distributed SGD.
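The contrast the abstract draws can be sketched in a few lines: with a non-linear, Byzantine-resilient aggregation rule, it matters whether momentum is applied before or after aggregation. The sketch below is illustrative only; the function name, hyperparameters, and the choice of coordinate-wise median as the robust rule are assumptions, not the paper's exact setup.

```python
import numpy as np

def worker_side_momentum_step(params, grads, momenta, beta=0.9, lr=0.1):
    """One step of distributed SGD with momentum computed at the workers.

    Each worker i maintains its own momentum m_i = beta * m_i + g_i and
    sends m_i (instead of the raw gradient g_i) to the server. The server
    then aggregates with coordinate-wise median, a simple non-linear,
    Byzantine-resilient rule. Note that with a *linear* rule (plain
    averaging), applying momentum before or after aggregation would be
    equivalent; with the median it is not.
    """
    for i, g in enumerate(grads):
        momenta[i] = beta * momenta[i] + g        # momentum at the worker
    agg = np.median(np.stack(momenta), axis=0)    # robust aggregation at the server
    return params - lr * agg, momenta
```

For example, with three workers, zero initial momenta, and one outlier gradient (a crude stand-in for a Byzantine worker), the median discards the outlier coordinate-wise, so the update is driven by the honest workers.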

