Byzantine-Resilient SGD in High Dimensions on Heterogeneous Data

05/16/2020
by Deepesh Data, et al.

We study distributed stochastic gradient descent (SGD) in the master-worker architecture under Byzantine attacks. We consider the heterogeneous data model, where different workers may have different local datasets, and we do not make any probabilistic assumptions on data generation. At the core of our algorithm, we use the polynomial-time outlier-filtering procedure for robust mean estimation proposed by Steinhardt et al. (ITCS 2018) to filter out corrupt gradients. To apply their filtering procedure in our heterogeneous data setting, where workers compute stochastic gradients, we derive a new matrix concentration result, which may be of independent interest. We provide convergence analyses for smooth strongly-convex and non-convex objectives. We derive our results under a bounded-variance assumption on local stochastic gradients and a deterministic condition on datasets, namely gradient dissimilarity; for both of these quantities, we provide concrete bounds in the statistical heterogeneous data model. We give a trade-off between the mini-batch size for stochastic gradients and the approximation error. Our algorithm can tolerate up to a 1/4 fraction of Byzantine workers. It finds approximately optimal parameters in the strongly-convex setting exponentially fast and reaches an approximate stationary point in the non-convex setting at a linear rate, thus matching the convergence rates of vanilla SGD in the Byzantine-free setting. We also propose and analyze a Byzantine-resilient SGD algorithm with gradient compression, where workers send k random coordinates of their gradients. Under mild conditions, we show a d/k-factor saving in communication bits as well as in decoding complexity over our compression-free algorithm, without affecting its convergence rate (order-wise) or the approximation error.
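The sketch below is only meant to convey the overall shape of such a scheme, under assumptions that are not taken from the paper: the master collects (possibly corrupted) stochastic gradients, runs a simplified spectral outlier filter in the spirit of the Steinhardt et al. robust mean estimation procedure, and takes a descent step; a rand-k sparsifier stands in for the compression variant. The `Worker` interface, the thresholds, and the specific re-weighting rule are hypothetical placeholders, not the paper's algorithm.

```python
import numpy as np

def spectral_filter_mean(grads, sigma_thresh=10.0, max_iter=20):
    """Simplified iterative outlier filter (a sketch, not the paper's exact
    procedure): repeatedly find the top singular direction of the weighted,
    centered gradient matrix, down-weight workers whose gradients project far
    along it, and return the weighted mean of what survives."""
    G = np.asarray(grads, dtype=float)            # shape (m, d)
    w = np.ones(G.shape[0])
    for _ in range(max_iter):
        mu = np.average(G, axis=0, weights=w)
        centered = (G - mu) * np.sqrt(w)[:, None]
        _, s, vt = np.linalg.svd(centered, full_matrices=False)
        if s[0] ** 2 <= sigma_thresh * w.sum():   # spread along top direction is small
            break
        proj2 = ((G - mu) @ vt[0]) ** 2           # squared projections onto top direction
        w *= 1.0 - proj2 / proj2.max()            # zeroes out the most extreme point
    return np.average(G, axis=0, weights=w)

def rand_k(grad, k, rng):
    """Rand-k sparsifier: keep k uniformly chosen coordinates, scaled by d/k so
    the compressed gradient stays unbiased (an assumed, standard choice)."""
    d = grad.size
    out = np.zeros_like(grad)
    idx = rng.choice(d, size=k, replace=False)
    out[idx] = grad[idx] * (d / k)
    return out

def byzantine_resilient_sgd(workers, theta0, lr=0.1, steps=1000, k=None, seed=0):
    """Master loop: gather stochastic gradients (some workers may be Byzantine),
    optionally compress them with rand-k, robustly aggregate, and step."""
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    for _ in range(steps):
        grads = [w.stochastic_gradient(theta) for w in workers]  # hypothetical Worker API
        if k is not None:
            grads = [rand_k(g, k, rng) for g in grads]
        theta -= lr * spectral_filter_mean(grads)
    return theta
```

In the paper itself, the filtering threshold is tied to the variance and gradient-dissimilarity bounds and the master decodes the rand-k coordinates before aggregating; the sketch only conveys the control flow, not those guarantees.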

Related research

- Byzantine-Resilient High-Dimensional SGD with Local Iterations on Heterogeneous Data (06/22/2020): We study stochastic gradient descent (SGD) with local iterations in the ...
- Distributed Training with Heterogeneous Data: Bridging Median and Mean Based Algorithms (06/04/2019): Recently, there is a growing interest in the study of median-based algor...
- Securing Distributed Machine Learning in High Dimensions (04/26/2018): We consider securing a distributed machine learning system wherein the d...
- Byzantine Resilient Non-Convex SVRG with Distributed Batch Gradient Computations (12/10/2019): In this work, we consider the distributed stochastic optimization proble...
- Byzantine-Resilient Non-Convex Stochastic Gradient Descent (12/28/2020): We study adversary-resilient stochastic distributed optimization, in whi...
- Communication-Efficient and Byzantine-Robust Distributed Learning (11/21/2019): We develop a communication-efficient distributed learning algorithm that...
- Robust Distributed Optimization With Randomly Corrupted Gradients (06/28/2021): In this paper, we propose a first-order distributed optimization algorit...
