Performance Optimization on Model Synchronization in Parallel Stochastic Gradient Descent Based SVM

05/03/2019
by Vibhatha Abeykoon, et al.

Understanding the bottlenecks in implementing a stochastic gradient descent (SGD)-based distributed support vector machine (SVM) algorithm is important for training on larger data sets. The communication time required to synchronize the model across parallel processes is the main bottleneck causing inefficiency in the training process. Model synchronization is directly affected by the mini-batch size of data processed before each global synchronization. To produce an efficient distributed model, the communication time spent on model synchronization during training must be kept as low as possible while retaining high testing accuracy. The effect of the model synchronization frequency on the convergence of the algorithm and the accuracy of the generated model must be well understood in order to design an efficient distributed model. In this research, we identify the bottlenecks in model synchronization in a parallel stochastic gradient descent (PSGD)-based SVM algorithm with respect to the training model synchronization frequency (MSF). Our research shows that by optimizing the MSF on the data sets we used, communication time can be reduced by 98% (a 16x to 24x speedup) relative to high-frequency model synchronization. The training model optimization discussed in this paper guarantees higher accuracy than the sequential algorithm along with faster convergence.
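To make the role of the MSF concrete, the following is a minimal sketch (not the authors' implementation) of a PSGD-based linear SVM in which each parallel worker runs local mini-batch SGD on the hinge loss and the model is averaged across workers only every msf mini-batches. Names such as psgd_svm, msf, local_X, and local_y are illustrative assumptions, and mpi4py and NumPy are assumed for communication and numerics.

    # Sketch: MSF-controlled model synchronization in parallel SGD-based SVM training.
    # Assumes mpi4py and NumPy; function and parameter names are illustrative only.
    import numpy as np
    from mpi4py import MPI

    def psgd_svm(local_X, local_y, epochs=10, lr=0.01, lam=0.001, msf=1, batch_size=32):
        """Each rank trains on its own data shard; models are averaged every `msf` mini-batches."""
        comm = MPI.COMM_WORLD
        world = comm.Get_size()
        n, d = local_X.shape
        w = np.zeros(d)
        batches_since_sync = 0
        for _ in range(epochs):
            idx = np.random.permutation(n)
            for start in range(0, n, batch_size):
                b = idx[start:start + batch_size]
                Xb, yb = local_X[b], local_y[b]
                margins = yb * (Xb @ w)
                mask = margins < 1  # samples violating the SVM margin contribute to the hinge gradient
                grad = lam * w - (yb[mask][:, None] * Xb[mask]).sum(axis=0) / len(b)
                w -= lr * grad
                batches_since_sync += 1
                if batches_since_sync == msf:
                    # Global model synchronization: sum local models and average across ranks.
                    w_global = np.empty_like(w)
                    comm.Allreduce(w, w_global, op=MPI.SUM)
                    w = w_global / world
                    batches_since_sync = 0
        return w

With msf=1 the model is synchronized after every mini-batch (high-frequency synchronization); larger msf values reduce the number of Allreduce calls per epoch, which is the communication cost whose reduction the paper reports.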

