Distributed Learning with Sublinear Communication

02/28/2019
by   Jayadev Acharya, et al.

In distributed statistical learning, N samples are split across m machines, and a learner wishes to use minimal communication to learn as well as if all the examples were on a single machine. This model has received substantial interest in machine learning due to its scalability and potential for parallel speedup. However, in high-dimensional settings, where the number of examples is smaller than the number of features ("dimension"), the speedup afforded by distributed learning may be overshadowed by the cost of communicating even a single example. This paper investigates the following question: When is it possible to learn a d-dimensional model in the distributed setting with total communication sublinear in d? Starting with a negative result, we show that for learning ℓ_1-bounded or sparse linear models, no algorithm can obtain optimal error unless communication is linear in dimension. Our main result is that by slightly relaxing the standard boundedness assumptions for linear models, we can obtain distributed algorithms that enjoy optimal error with communication logarithmic in dimension. This result is based on a family of algorithms that combine mirror descent with randomized sparsification/quantization of iterates, and extends to the general stochastic convex optimization model.
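To illustrate the kind of randomized sparsification the abstract alludes to, below is a minimal sketch (not the paper's actual algorithm) of an unbiased sparsifier for an ℓ_1-bounded iterate: coordinates are sampled with probability proportional to their magnitude and reweighted so the sparse vector equals the original in expectation. A machine could transmit only the k sampled (index, value) pairs, giving per-round communication that scales with k rather than d. The function name `sparsify` and the choice of k are illustrative assumptions.

```python
import numpy as np

def sparsify(w, k, rng):
    """Unbiased randomized sparsification of a vector w.

    Samples k coordinates (with replacement) with probability
    proportional to |w_i|, and reweights each sampled entry so that
    E[sparsify(w)] = w while the output has at most k nonzeros.
    This is a generic importance-sampling sketch, not the paper's
    exact quantization scheme.
    """
    p = np.abs(w) / np.abs(w).sum()          # sampling distribution
    idx = rng.choice(len(w), size=k, p=p)    # k sampled coordinates
    out = np.zeros_like(w, dtype=float)
    for i in idx:
        out[i] += w[i] / (k * p[i])          # reweight for unbiasedness
    return out
```

In a distributed mirror-descent loop, each machine would apply such a sparsifier to its iterate (or gradient) before communicating, so that the expected update is unchanged while the message size is O(k log d) bits.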
