Byzantine-Resilient Non-Convex Stochastic Gradient Descent

12/28/2020
by Zeyuan Allen-Zhu, et al.

We study adversary-resilient stochastic distributed optimization, in which m machines can independently compute stochastic gradients and cooperate to jointly optimize over their local objective functions. However, an α-fraction of the machines are Byzantine, in that they may behave in arbitrary, adversarial ways. We consider this setting in the challenging non-convex case. Our main result is a new algorithm, SafeguardSGD, which can provably escape saddle points and find approximate local minima of the non-convex objective. The algorithm is based on a new concentration filtering technique, and its sample and time complexity bounds match the best known theoretical bounds in the stochastic, distributed setting when no Byzantine machines are present. Our algorithm is practical: it improves upon the performance of prior methods when training deep neural networks, it is relatively lightweight, and it is the first method to withstand two recently proposed Byzantine attacks.
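To make the concentration-filtering idea concrete, below is a minimal sketch of a Byzantine-resilient SGD aggregation step. This is not the paper's actual SafeguardSGD: the specific filtering rule (distance of each machine's accumulated reported gradients from the median of trusted machines' accumulations), the threshold schedule, and all names are illustrative assumptions made for this example.

import numpy as np

# Illustrative sketch only, NOT the paper's SafeguardSGD. The filtering
# rule (drift of each machine's accumulated gradient sum from the median
# of trusted machines' sums) and the threshold schedule are assumptions.

def safeguard_step(grads, sums, good, threshold):
    """One server-side aggregation step.

    grads:     (m, d) stochastic gradients reported by the m machines
    sums:      (m, d) each machine's accumulated reported gradients
    good:      (m,) boolean mask of machines not yet filtered out
    threshold: concentration radius honest accumulations should respect
    """
    sums += grads
    center = np.median(sums[good], axis=0)         # robust center of trusted sums
    drift = np.linalg.norm(sums - center, axis=1)  # per-machine drift from center
    good &= drift <= threshold                     # permanently evict outliers
    return grads[good].mean(axis=0), sums, good

# Toy run: minimize f(x) = ||x||^2 / 2 with 10 machines, one Byzantine.
m, d, lr = 10, 5, 0.1
rng = np.random.default_rng(0)
x = np.ones(d)
sums = np.zeros((m, d))
good = np.ones(m, dtype=bool)
for t in range(200):
    grads = x + rng.normal(scale=0.1, size=(m, d))  # honest gradient f'(x) = x, plus noise
    grads[0] = 50.0                                 # machine 0 reports garbage
    g, sums, good = safeguard_step(grads, sums, good,
                                   threshold=3.0 * np.sqrt(t + 1))
    x -= lr * g
print("surviving machines:", good.sum(), "| final ||x|| =", np.linalg.norm(x))

The point of such a safeguard is that honest machines' stochastic noise concentrates around the common gradient sum, so a Byzantine machine must either stay close to the honest median (and therefore do little damage) or drift past the threshold and be removed for the rest of training.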


Related research

03/23/2018 · Byzantine Stochastic Gradient Descent
This paper studies the problem of distributed stochastic optimization in...

12/10/2019 · Byzantine Resilient Non-Convex SVRG with Distributed Batch Gradient Computations
In this work, we consider the distributed stochastic optimization proble...

06/14/2018 · Defending Against Saddle Point Attack in Byzantine-Robust Distributed Learning
In this paper, we study robust large-scale distributed learning in the p...

03/17/2021 · Escaping Saddle Points in Distributed Newton's Method with Communication Efficiency and Byzantine Resilience
We study the problem of optimizing a non-convex loss function (with sadd...

05/16/2020 · Byzantine-Resilient SGD in High Dimensions on Heterogeneous Data
We study distributed stochastic gradient descent (SGD) in the master-wor...

06/22/2020 · Byzantine-Resilient High-Dimensional SGD with Local Iterations on Heterogeneous Data
We study stochastic gradient descent (SGD) with local iterations in the ...

09/10/2019 · Byzantine-Resilient Stochastic Gradient Descent for Distributed Learning: A Lipschitz-Inspired Coordinate-wise Median Approach
In this work, we consider the resilience of distributed algorithms based...
