DeepAI AI Chat
Log In Sign Up

Graph-Dependent Implicit Regularisation for Distributed Stochastic Subgradient Descent

by   Dominic Richards, et al.
University of Oxford

We propose graph-dependent implicit regularisation strategies for distributed stochastic subgradient descent (Distributed SGD) for convex problems in multi-agent learning. Under the standard assumptions of convexity, Lipschitz continuity, and smoothness, we establish statistical learning rates that retain, up to logarithmic terms, centralised statistical guarantees through implicit regularisation (step size tuning and early stopping) with appropriate dependence on the graph topology. Our approach avoids the need for explicit regularisation in decentralised learning problems, such as adding constraints to the empirical risk minimisation rule. Particularly for distributed methods, the use of implicit regularisation allows the algorithm to remain simple, without projections or dual methods. To prove our results, we establish graph-independent generalisation bounds for Distributed SGD that match the centralised setting (using algorithmic stability), and we establish graph-dependent optimisation bounds that are of independent interest. We present numerical experiments to show that the qualitative nature of the upper bounds we derive can be representative of real behaviours.


page 1

page 2

page 3

page 4


Fine-Grained Analysis of Stability and Generalization for Stochastic Gradient Descent

Recently there are a considerable amount of work devoted to the study of...

Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression

Stochastic gradient descent (SGD) has been demonstrated to generalize we...

An Analysis of Constant Step Size SGD in the Non-convex Regime: Asymptotic Normality and Bias

Structured non-convex learning problems, for which critical points have ...

Making SGD Parameter-Free

We develop an algorithm for parameter-free stochastic convex optimizatio...

Time-independent Generalization Bounds for SGLD in Non-convex Settings

We establish generalization error bounds for stochastic gradient Langevi...

Decentralised Learning with Random Features and Distributed Gradient Descent

We investigate the generalisation performance of Distributed Gradient De...

Learning with SGD and Random Features

Sketching and stochastic gradient methods are arguably the most common t...