Byzantine Machine Learning Made Easy by Resilient Averaging of Momentums

by Sadegh Farhadkhani, et al.

Byzantine resilience emerged as a prominent topic within the distributed machine learning community. Essentially, the goal is to enhance distributed optimization algorithms, such as distributed SGD, in a way that guarantees convergence despite the presence of some misbehaving (a.k.a., Byzantine) workers. Although a myriad of techniques addressing the problem have been proposed, the field arguably rests on fragile foundations. These techniques are hard to prove correct and rely on assumptions that are (a) quite unrealistic, i.e., often violated in practice, and (b) heterogeneous, i.e., making it difficult to compare approaches. We present RESAM (RESilient Averaging of Momentums), a unified framework that makes it simple to establish optimal Byzantine resilience, relying only on standard machine learning assumptions. Our framework is mainly composed of two operators: resilient averaging at the server and distributed momentum at the workers. We prove a general theorem stating the convergence of distributed SGD under RESAM. Interestingly, demonstrating and comparing the convergence of many existing techniques become direct corollaries of our theorem, without resorting to stringent assumptions. We also present an empirical evaluation of the practical relevance of RESAM.
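The two operators named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes the exponential-moving-average momentum form m = beta * m + (1 - beta) * g at each worker, and it uses the coordinate-wise trimmed mean as one possible stand-in for the resilient averaging rule at the server. The function names (`trimmed_mean`, `momentum_update`, `run_demo`), the quadratic loss, and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def trimmed_mean(vectors, f):
    """Coordinate-wise trimmed mean (an assumed example of a robust rule):
    per coordinate, drop the f smallest and f largest values, average the rest."""
    n = len(vectors)
    if not 0 <= 2 * f < n:
        raise ValueError("need 0 <= 2f < n")
    out = []
    for coord in zip(*vectors):
        kept = sorted(coord)[f:n - f]
        out.append(sum(kept) / len(kept))
    return out


def momentum_update(prev, grad, beta):
    """Worker-side momentum (assumed form): m <- beta * m + (1 - beta) * g."""
    return [beta * m + (1.0 - beta) * g for m, g in zip(prev, grad)]


def run_demo(steps=300, n=10, f=2, beta=0.9, lr=0.1, seed=0):
    """Minimise 0.5 * ||theta - target||^2 with n workers, f of them Byzantine."""
    rng = random.Random(seed)
    target = [1.0, -2.0, 3.0]
    theta = [0.0, 0.0, 0.0]
    # Only the n - f honest workers keep a real momentum vector.
    momentums = [[0.0] * 3 for _ in range(n - f)]
    for _ in range(steps):
        for i in range(n - f):
            # Noisy gradient of the toy quadratic loss at the current model.
            grad = [t - c + rng.gauss(0.0, 0.1) for t, c in zip(theta, target)]
            momentums[i] = momentum_update(momentums[i], grad, beta)
        # Byzantine workers send an arbitrary (here: very large) vector.
        byzantine = [[1e3] * 3 for _ in range(f)]
        # Server: robustly aggregate the received momentums, then take a step.
        aggregate = trimmed_mean(momentums + byzantine, f)
        theta = [t - lr * a for t, a in zip(theta, aggregate)]
    return theta, target


if __name__ == "__main__":
    theta, target = run_demo()
    print("final model:", theta)
    print("target:     ", target)
```

In this toy setting, replacing `trimmed_mean` with a plain average would let the two Byzantine vectors dominate every update, which is the failure mode the robust aggregation step is meant to prevent.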



Fast and Secure Distributed Learning in High Dimension

Modern machine learning is distributed and the work of several machines ...

Resilient Distributed Averaging

In this paper, a fully distributed averaging algorithm in the presence o...

Combining Differential Privacy and Byzantine Resilience in Distributed SGD

Privacy and Byzantine resilience (BR) are two crucial requirements of mo...

Asynchronous Byzantine Machine Learning

Asynchronous distributed machine learning solutions have proven very eff...

Distributed Momentum for Byzantine-resilient Learning

Momentum is a variant of gradient descent that has been proposed for its...

PIRATE: A Blockchain-based Secure Framework of Distributed Machine Learning in 5G Networks

In the fifth-generation (5G) networks and the beyond, communication late...

Differential Privacy and Byzantine Resilience in SGD: Do They Add Up?

This paper addresses the problem of combining Byzantine resilience with ...