Parallel Stochastic Gradient Descent with Sound Combiners

05/22/2017
by Saeed Maleki, et al.

Stochastic gradient descent (SGD) is a well-known method for regression and classification tasks. However, it is an inherently sequential algorithm: at each step, the processing of the current example depends on the parameters learned from the previous examples. Prior approaches to parallelizing linear learners using SGD, such as HOGWILD! and ALLREDUCE, do not honor these dependencies across threads and thus can suffer from poor convergence rates, poor scalability, or both. This paper proposes SYMSGD, a parallel SGD algorithm that, to a first-order approximation, retains the sequential semantics of SGD. Each thread learns a local model in addition to a model combiner, which allows the local models to be combined to produce the same result that a sequential SGD would have produced. This paper evaluates SYMSGD's accuracy and performance on 6 datasets on a shared-memory machine and shows up to 11x speedup over our heavily optimized sequential baseline on 16 cores; on average, SYMSGD is 2.2x faster than HOGWILD!.
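The idea of a local model plus a model combiner can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a linear model with squared loss, where each SGD step is affine in the weights, so the combiner is a plain matrix and no approximation is needed in exact arithmetic. The function names (`sgd_pass`, `sgd_pass_with_combiner`, `symbolic_parallel_sgd`) and the chunking scheme are hypothetical, and the chunks are processed in an ordinary loop rather than on real threads.

```python
import numpy as np

def sgd_pass(w, X, y, lr):
    """Plain sequential SGD for squared loss, one example at a time."""
    w = w.copy()
    for x_i, y_i in zip(X, y):
        w -= lr * (x_i @ w - y_i) * x_i
    return w

def sgd_pass_with_combiner(w, X, y, lr):
    """Same pass, but also track M = d(w_out)/d(w_in) for this chunk.

    For squared loss each step is w <- (I - lr * x x^T) w + lr * y * x,
    so M is the product of the (I - lr * x x^T) factors.
    """
    w = w.copy()
    M = np.eye(w.size)
    for x_i, y_i in zip(X, y):
        w -= lr * (x_i @ w - y_i) * x_i
        M -= lr * np.outer(x_i, x_i @ M)  # M <- (I - lr * x x^T) M
    return w, M

def symbolic_parallel_sgd(w0, X, y, lr, n_chunks):
    """Every chunk starts from w0 (as independent workers would),
    then the results are stitched together in chunk order."""
    chunks = zip(np.array_split(X, n_chunks), np.array_split(y, n_chunks))
    # Each call below is independent and could run on its own thread.
    results = [sgd_pass_with_combiner(w0, Xc, yc, lr) for Xc, yc in chunks]

    w = results[0][0]  # the first chunk really did start from w0
    for w_local, M in results[1:]:
        # Correct the local model for the start point it should have had.
        w = w_local + M @ (w - w0)
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=200)
    w0 = np.zeros(5)
    w_seq = sgd_pass(w0, X, y, lr=0.01)
    w_par = symbolic_parallel_sgd(w0, X, y, lr=0.01, n_chunks=4)
    print(np.allclose(w_seq, w_par))
```

The sketch stores a full d x d combiner per chunk, which is only practical for small feature counts. It is meant to show why combining local models can reproduce the sequential result, not how to do so efficiently.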

