GoSGD: Distributed Optimization for Deep Learning with Gossip Exchange

04/04/2018
by Michael Blot, et al.

We address the issue of speeding up the training of convolutional neural networks by studying a distributed method adapted to stochastic gradient descent. Our parallel optimization setup uses several threads, each applying individual gradient descent steps to its own local variable. We propose a new way of sharing information between threads based on gossip algorithms, which exhibit good consensus convergence properties. Our method, called GoSGD, has the advantage of being fully asynchronous and decentralized.
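The sketch below illustrates the gossip-exchange idea described above: several threads run local SGD on their own copy of the variable and occasionally push that copy to a randomly chosen peer, which mixes it into its own local variable. This is a minimal illustration, not the paper's exact GoSGD update rule; the toy quadratic loss, the equal-weight mixing, the exchange probability P_EXCHANGE, and all names are assumptions made for the example.

```python
# Minimal sketch of gossip-based asynchronous SGD (illustrative only,
# not the paper's exact GoSGD update). Assumptions: toy quadratic loss,
# 0.5/0.5 mixing on receipt, fixed exchange probability P_EXCHANGE.
import random
import threading
import queue

import numpy as np

DIM = 10
N_WORKERS = 4
STEPS = 500
LR = 0.05
P_EXCHANGE = 0.2           # probability of gossiping after a local step (assumed)

target = np.ones(DIM)      # toy problem: minimize ||x - target||^2


def grad(x):
    """Stochastic gradient of the toy quadratic loss with Gaussian noise."""
    return 2.0 * (x - target) + 0.1 * np.random.randn(DIM)


# One inbox per worker; peers push their local variables here asynchronously.
inboxes = [queue.Queue() for _ in range(N_WORKERS)]
results = [None] * N_WORKERS


def worker(rank):
    x = np.random.randn(DIM)                  # local variable of this thread
    for _ in range(STEPS):
        x -= LR * grad(x)                     # local SGD step
        # Drain the inbox: mix any received variables into the local one.
        while not inboxes[rank].empty():
            x = 0.5 * (x + inboxes[rank].get())
        # With probability P_EXCHANGE, push the local variable to a random peer.
        if random.random() < P_EXCHANGE:
            peer = random.choice([r for r in range(N_WORKERS) if r != rank])
            inboxes[peer].put(x.copy())
    results[rank] = x


threads = [threading.Thread(target=worker, args=(r,)) for r in range(N_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

for r, x in enumerate(results):
    print(f"worker {r}: distance to optimum = {np.linalg.norm(x - target):.4f}")
```

Because exchanges are pairwise and triggered independently by each thread, no global synchronization barrier or central parameter server is needed, which is the decentralized, asynchronous property highlighted in the abstract.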

Related research:

- Gossip training for deep learning (11/29/2016)
- Asynchronous Decentralized Parallel Stochastic Gradient Descent (10/18/2017)
- Stochastic modified equations for the asynchronous stochastic gradient descent (05/21/2018)
- Learning Without a Global Clock: Asynchronous Learning in a Physics-Driven Learning Network (01/10/2022)
- Online Asynchronous Distributed Regression (07/16/2014)
- DADAM: A Consensus-based Distributed Adaptive Gradient Method for Online Optimization (01/25/2019)
- Gear Training: A new way to implement high-performance model-parallel training (06/11/2018)