Optimal Complexity in Non-Convex Decentralized Learning over Time-Varying Networks

11/01/2022
by Xinmeng Huang, et al.

Decentralized optimization over time-varying networks is an emerging paradigm in machine learning. It substantially reduces communication overhead in large-scale deep training and is more robust in wireless scenarios, especially when nodes are moving. Federated learning can also be regarded as decentralized optimization with a time-varying communication pattern that alternates between global averaging and local updates. While numerous studies have sought to clarify its theoretical limits and develop efficient algorithms, it remains unclear what the optimal complexity is for non-convex decentralized stochastic optimization over time-varying networks. The main difficulties lie in how to gauge the effectiveness of transmitting messages between two nodes via time-varying communications, and how to establish the lower bound when the network size is fixed (which is a prerequisite in stochastic optimization). This paper resolves these challenges and establishes the first complexity lower bound. We also develop a new decentralized algorithm that nearly attains the lower bound, showing the tightness of the lower bound and the optimality of our algorithm.
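To make the federated-learning remark concrete, the sketch below runs a plain decentralized gradient method in which the mixing matrix changes over time: global averaging every few steps and the identity (purely local updates) otherwise. It is an illustration under stated assumptions, not the algorithm developed in the paper. The quadratic local objectives, the function names, and the parameters `period`, `lr`, and `noise` are all hypothetical choices made for this example.

```python
import random


def mixing_matrix(t, n, period):
    """Hypothetical time-varying pattern: full averaging when t is a
    multiple of `period`, identity (no communication) otherwise."""
    if t % period == 0:
        return [[1.0 / n] * n for _ in range(n)]
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def decentralized_sgd(targets, steps, lr=0.1, period=5, noise=0.0, seed=0):
    """Each node i holds a scalar x_i and the assumed local objective
    f_i(x) = 0.5 * (x - targets[i]) ** 2. Every step takes a local
    (optionally noisy) gradient step, then mixes with W_t."""
    rng = random.Random(seed)
    n = len(targets)
    x = [0.0] * n
    for t in range(1, steps + 1):
        # Local stochastic gradient step on each node.
        half = [
            x[i] - lr * (x[i] - targets[i] + noise * rng.gauss(0.0, 1.0))
            for i in range(n)
        ]
        # Communication step: x <- W_t @ half.
        w = mixing_matrix(t, n, period)
        x = [sum(w[i][j] * half[j] for j in range(n)) for i in range(n)]
    return x


if __name__ == "__main__":
    # With noise=0 the network average approaches the mean of the targets.
    print(decentralized_sgd([1.0, 2.0, 3.0, 6.0], steps=200))
```

Because both matrices are doubly stochastic, the average of the node values follows an ordinary gradient iteration on the average objective, while the identity steps let the individual nodes drift apart until the next averaging round.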


