Escaping Saddle Points in Distributed Newton's Method with Communication Efficiency and Byzantine Resilience

03/17/2021
by Avishek Ghosh, et al.

We study the problem of optimizing a non-convex loss function (with saddle points) in a distributed framework in the presence of Byzantine machines. We consider the standard distributed setting in which a central machine (parameter server) communicates with many worker machines. Our proposed algorithm is a variant of the celebrated cubic-regularized Newton method of Nesterov and Polyak <cit.>, which avoids saddle points efficiently and converges to local minima. Furthermore, our algorithm resists the presence of Byzantine machines, which may create fake local minima near the saddle points of the loss function, a strategy known as a saddle-point attack. We robustify the cubic-regularized Newton algorithm so that it efficiently avoids both the saddle points and the fake local minima. Moreover, being a second-order algorithm, it has a much lower iteration complexity than its first-order counterparts, and hence communicates far less with the parameter server. We obtain theoretical guarantees for our proposed scheme under several settings, including approximate (sub-sampled) gradients and Hessians. Finally, we validate our theoretical findings with experiments on standard datasets and several types of Byzantine attacks.
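For intuition, here is a minimal Python sketch of one server-side iteration of a Byzantine-robust cubic-regularized Newton step. The helper names (trimmed_mean, cubic_subproblem, robust_cubic_newton_step), the coordinate-wise trimmed-mean aggregator, and the gradient-descent subproblem solver are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def trimmed_mean(values, beta):
    # Coordinate-wise trimmed mean: per coordinate, drop the beta fraction
    # of smallest and largest worker values, then average the rest.
    # A standard robust aggregator; the paper's exact rule may differ.
    arr = np.sort(np.stack(values), axis=0)
    m = len(values)
    k = int(beta * m)
    return arr[k:m - k].mean(axis=0)

def cubic_subproblem(g, H, M, iters=200, lr=0.01):
    # Approximately minimize the Nesterov-Polyak cubic model
    #   m(s) = g.s + 0.5 * s'Hs + (M/6) * ||s||^3
    # by gradient descent (a simple stand-in for the solvers analyzed in
    # the literature). Gradient of the model: g + Hs + (M/2)||s|| s.
    s = np.zeros_like(g)
    for _ in range(iters):
        grad = g + H @ s + 0.5 * M * np.linalg.norm(s) * s
        s -= lr * grad
    return s

def robust_cubic_newton_step(x, worker_grads, worker_hessians, M, beta=0.1):
    # One iteration at the parameter server: robustly aggregate the
    # gradients and Hessians reported by the workers (some of which may
    # be Byzantine), then move along the cubic-regularized Newton step.
    g = trimmed_mean(worker_grads, beta)
    H_flat = trimmed_mean([h.ravel() for h in worker_hessians], beta)
    H = H_flat.reshape(worker_hessians[0].shape)
    return x + cubic_subproblem(g, H, M)
```

The cubic penalty discourages overly large steps while still exploiting negative curvature, which is what lets the method escape saddle points; the robust aggregation bounds how much Byzantine workers can bias the aggregated gradient and Hessian, preventing them from steering the iterates into fake local minima.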


Related research

06/15/2020
Distributed Newton Can Communicate Less and Resist Byzantine Workers
We develop a distributed second order optimization algorithm that is com...

06/14/2018
Defending Against Saddle Point Attack in Byzantine-Robust Distributed Learning
In this paper, we study robust large-scale distributed learning in the p...

12/28/2020
Byzantine-Resilient Non-Convex Stochastic Gradient Descent
We study adversary-resilient stochastic distributed optimization, in whi...

08/14/2020
Complexity aspects of local minima and related notions
We consider the notions of (i) critical points, (ii) second-order points...

05/16/2017
Distributed Statistical Machine Learning in Adversarial Settings: Byzantine Gradient Descent
We consider the problem of distributed statistical machine learning in a...

06/01/2022
Byzantine-Robust Online and Offline Distributed Reinforcement Learning
We consider a distributed reinforcement learning setting where multiple ...

07/15/2023
Byzantine-robust distributed one-step estimation
This paper proposes a Robust One-Step Estimator (ROSE) to solve the Byzan...
