The Minimax Complexity of Distributed Optimization

09/01/2021
by Blake Woodworth, et al.

In this thesis, I study the minimax oracle complexity of distributed stochastic optimization. First, I present the "graph oracle model", an extension of the classic oracle complexity framework that can be applied to study distributed optimization algorithms. Next, I describe a general approach to proving optimization lower bounds for arbitrary randomized algorithms (as opposed to more restricted classes of algorithms, e.g., deterministic or "zero-respecting" algorithms), which is used extensively throughout the thesis. For the remainder of the thesis, I focus on the "intermittent communication setting", where multiple computing devices work in parallel with limited communication amongst themselves. In this setting, I analyze the theoretical properties of the popular Local Stochastic Gradient Descent (SGD) algorithm in the convex setting, for both homogeneous and heterogeneous objectives. I provide the first guarantees for Local SGD that improve over simple baseline methods, but show that Local SGD is not optimal in general. In pursuit of optimal methods, I then show matching upper and lower bounds in the intermittent communication setting for homogeneous convex, heterogeneous convex, and homogeneous non-convex objectives. These upper bounds are attained by simple variants of SGD, which are therefore optimal. Finally, I discuss several additional assumptions about the objective, as well as more powerful oracles, that might be exploited to develop intermittent communication algorithms with better guarantees than our lower bounds allow.
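For readers unfamiliar with Local SGD, the following is a minimal sketch of the algorithm's structure in the homogeneous setting: M machines each take K local stochastic gradient steps between R communication rounds, averaging their iterates at every round. The quadratic objective, noise model, and all parameter values here are illustrative assumptions, not details taken from the thesis.

```python
import numpy as np

def local_sgd(grad_oracle, x0, M=4, R=10, K=5, lr=0.1, seed=0):
    """Local SGD sketch: M machines, R communication rounds,
    K local SGD steps per round; iterates averaged at each round."""
    rng = np.random.default_rng(seed)
    x = np.copy(x0)
    for _ in range(R):
        local_iterates = []
        for _ in range(M):
            x_m = np.copy(x)  # each machine starts from the shared average
            for _ in range(K):
                g = grad_oracle(x_m, rng)  # stochastic gradient oracle call
                x_m -= lr * g
            local_iterates.append(x_m)
        x = np.mean(local_iterates, axis=0)  # communication: average iterates
    return x

# Illustrative homogeneous objective: f(x) = 0.5 * ||x||^2,
# with an unbiased stochastic gradient given by additive Gaussian noise.
def noisy_quadratic_grad(x, rng, noise_std=0.1):
    return x + noise_std * rng.standard_normal(x.shape)

x_final = local_sgd(noisy_quadratic_grad, x0=np.ones(10))
print(np.linalg.norm(x_final))
```

Setting K = 1 recovers a Minibatch-SGD-style baseline that communicates after every step; the thesis's question is when taking K > 1 local steps between rounds helps or hurts relative to such baselines.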


