Escaping Saddle Points with Bias-Variance Reduced Local Perturbed SGD for Communication Efficient Nonconvex Distributed Learning

02/12/2022
by Tomoya Murata, et al.

In centralized nonconvex distributed learning and federated learning, local methods are among the most promising approaches for reducing communication cost. However, existing work on local methods has mainly focused on first-order optimality guarantees, whereas algorithms with second-order optimality guarantees have been studied extensively in the non-distributed optimization literature. In this paper, we study a new local algorithm, Bias-Variance Reduced Local Perturbed SGD (BVR-L-PSGD), which combines an existing bias-variance reduced gradient estimator with parameter perturbation to find second-order optimal points in centralized nonconvex distributed optimization. BVR-L-PSGD attains second-order optimality with nearly the same communication complexity as the best known complexity of BVR-L-SGD for finding first-order optimal points. In particular, its communication complexity is better than that of non-local methods when the heterogeneity of the local datasets is smaller than the smoothness of the local loss. In the extreme case, the communication complexity approaches Θ(1) as the heterogeneity of the local datasets goes to zero.
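The abstract only describes the method in words; the sketch below is an illustrative, simplified rendering of the idea (not the authors' exact BVR-L-PSGD, whose estimator and schedule are specified in the paper). It shows one communication round in which workers share an averaged anchor gradient, a small isotropic perturbation is injected when the iterate is nearly first-order stationary (the mechanism that helps escape saddle points), and each worker then runs local steps with an SVRG-style variance-reduced gradient. All names, losses, and hyperparameters here are assumptions chosen for the toy example.

```python
import numpy as np

# Minimal illustrative sketch of a perturbed, variance-reduced local SGD round.
# This is NOT the paper's exact algorithm; it only mirrors its three ingredients:
# (1) a shared anchor gradient computed at communication time,
# (2) a random perturbation near first-order stationary points,
# (3) local steps with a bias/variance-reduced gradient estimator.

rng = np.random.default_rng(0)

def local_grad(w, batch):
    """Stochastic gradient of a toy nonconvex loss on one worker's data."""
    X, y = batch
    z = np.tanh(X @ w)                      # toy model: 0.5 * ||tanh(Xw) - y||^2
    return X.T @ ((z - y) * (1.0 - z ** 2)) / len(y)

def perturbed_vr_round(w, worker_batches, eta=0.1, K=5, grad_tol=1e-3, radius=1e-2):
    """One communication round of the sketched method."""
    # 1) communication: workers send full local gradients, server averages them
    anchor_w = w.copy()
    anchor_g = np.mean([local_grad(anchor_w, b) for b in worker_batches], axis=0)

    # 2) perturb the shared iterate when it is nearly first-order stationary,
    #    so subsequent local steps can escape strict saddle points
    if np.linalg.norm(anchor_g) <= grad_tol:
        w = w + radius * rng.standard_normal(w.shape)

    # 3) each worker runs K local steps with an SVRG-style variance-reduced
    #    gradient anchored at the last communicated point, then results are averaged
    new_ws = []
    for b in worker_batches:
        wk = w.copy()
        for _ in range(K):
            v = local_grad(wk, b) - local_grad(anchor_w, b) + anchor_g
            wk = wk - eta * v
        new_ws.append(wk)
    return np.mean(new_ws, axis=0)

# Toy usage: data split across 4 workers (purely illustrative).
d = 5
X = rng.standard_normal((40, d))
y = np.sign(X @ rng.standard_normal(d))
worker_batches = [(X[i::4], y[i::4]) for i in range(4)]

w = np.zeros(d)
for _ in range(30):
    w = perturbed_vr_round(w, worker_batches)
print("final averaged gradient norm:",
      np.linalg.norm(np.mean([local_grad(w, b) for b in worker_batches], axis=0)))
```

Note that in this sketch each communication round costs a single averaging step regardless of K, which is the sense in which local steps trade computation for communication; the paper's analysis quantifies how small local-data heterogeneity lets K grow without hurting convergence.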


Related research

02/05/2021: Bias-Variance Reduced Local SGD for Less Heterogeneous Federated Learning
Federated learning is one of the important learning scenarios in distrib...

09/01/2022: Versatile Single-Loop Method for Gradient Estimator: First and Second Order Optimality, and its Application to Federated Learning
While variance reduction methods have shown great success in solving lar...

12/02/2020: Second-Order Guarantees in Federated Learning
Federated learning is a useful framework for centralized learning from d...

12/12/2019: Parallel Restarted SPIDER – Communication Efficient Distributed Nonconvex Optimization with Optimal Computation Complexity
In this paper, we propose a distributed algorithm for stochastic smooth,...

10/05/2020: Lower Bounds and Optimal Algorithms for Personalized Federated Learning
In this work, we consider the optimization formulation of personalized f...

03/31/2020: Second-Order Guarantees in Centralized, Federated and Decentralized Nonconvex Optimization
Rapid advances in data collection and processing capabilities have allow...

02/14/2021: Smoothness Matrices Beat Smoothness Constants: Better Communication Compression Techniques for Distributed Optimization
Large scale distributed optimization has become the default tool for the...
