Analyzing the benefits of communication channels between deep learning models

04/19/2019
by Philippe Lacaille, et al.

As artificial intelligence systems spread to larger and more diverse tasks across many domains, machine learning algorithms, and in particular deep learning models and the datasets required to train them, are themselves growing. Some algorithms can scale large computations by leveraging data parallelism, but they often require exchanging large amounts of data to keep the shared knowledge consistent across compute nodes. This work studies how different levels of communication between deep learning models affect performance. The first approach studied decentralizes the many computations performed in parallel in training procedures such as synchronous and asynchronous stochastic gradient descent; in this setting, a simplified communication scheme in which compute nodes exchange only low-bandwidth outputs can be beneficial. The following chapter slightly modifies the communication protocol to also carry training instructions: in a simplified setup, a pre-trained model, analogous to a teacher, customizes the training procedure of a randomly initialized model to accelerate its learning. Finally, a communication channel in which two deep learning models exchange a purposefully crafted language is explored, along with different ways of optimizing that language.
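
As an illustration of the first idea, here is a minimal sketch (in PyTorch, with all names, sizes, and hyperparameters invented for the example, not taken from the thesis) of two data-parallel workers that communicate only low-bandwidth outputs: instead of averaging gradients every step, each worker periodically distills from its peer's predictions on a small shared batch.

    # Hypothetical sketch: workers train on private data and only
    # occasionally exchange logits on a shared batch (not gradients
    # or parameters), adding a consensus term to stay in agreement.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def make_model():
        return nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))

    workers = [make_model() for _ in range(2)]
    opts = [torch.optim.SGD(w.parameters(), lr=0.1) for w in workers]
    shared_x = torch.randn(16, 10)  # public batch both workers can see

    for step in range(100):
        # Local phase: each worker trains on its own (private) shard.
        for w, opt in zip(workers, opts):
            x = torch.randn(32, 10)
            y = (x.sum(dim=1) > 0).long()
            loss = F.cross_entropy(w(x), y)
            opt.zero_grad(); loss.backward(); opt.step()

        # Communication phase: only logits on the shared batch cross the wire.
        if step % 10 == 0:
            with torch.no_grad():
                peer_logits = [w(shared_x) for w in workers]
            for i, (w, opt) in enumerate(zip(workers, opts)):
                target = peer_logits[1 - i].softmax(dim=1)
                consensus = F.kl_div(w(shared_x).log_softmax(dim=1),
                                     target, reduction="batchmean")
                opt.zero_grad(); consensus.backward(); opt.step()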
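The teacher-student channel can be sketched in the same hedged spirit: here the pre-trained teacher's "training instructions" are reduced to a confidence-based curriculum, i.e., the teacher picks which examples the randomly initialized student sees next. This scoring rule is an assumption made for illustration, not the thesis's actual protocol.

    # Hypothetical sketch: the teacher ranks a candidate pool by its own
    # confidence and forwards only the easiest examples to the student.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    teacher = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))  # assumed pre-trained
    student = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
    opt = torch.optim.SGD(student.parameters(), lr=0.1)

    pool_x = torch.randn(512, 10)
    pool_y = (pool_x.sum(dim=1) > 0).long()

    for step in range(100):
        idx = torch.randperm(len(pool_x))[:64]  # random candidate pool
        with torch.no_grad():
            conf = teacher(pool_x[idx]).softmax(dim=1).max(dim=1).values
        picked = idx[conf.topk(16).indices]     # teacher's "instruction"
        loss = F.cross_entropy(student(pool_x[picked]), pool_y[picked])
        opt.zero_grad(); loss.backward(); opt.step()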
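Finally, one common way to make a learned discrete language optimizable end to end is a straight-through Gumbel-softmax relaxation; the toy sender-receiver game below is a sketch under that assumption and is not claimed to match the particular channel or optimization methods studied in the thesis.

    # Hypothetical sketch: a sender encodes its input into a short discrete
    # message; a receiver must predict the label from the message alone.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    VOCAB, MSG_LEN = 8, 3
    sender = nn.Linear(10, MSG_LEN * VOCAB)
    receiver = nn.Sequential(nn.Linear(MSG_LEN * VOCAB, 32), nn.ReLU(), nn.Linear(32, 2))
    opt = torch.optim.Adam(list(sender.parameters()) + list(receiver.parameters()), lr=1e-2)

    for step in range(200):
        x = torch.randn(32, 10)
        y = (x.sum(dim=1) > 0).long()
        logits = sender(x).view(-1, MSG_LEN, VOCAB)
        # Discrete symbols in the forward pass; gradients flow through
        # the soft relaxation (straight-through estimator).
        message = F.gumbel_softmax(logits, tau=1.0, hard=True)
        loss = F.cross_entropy(receiver(message.view(-1, MSG_LEN * VOCAB)), y)
        opt.zero_grad(); loss.backward(); opt.step()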

Related research

12/07/2020 · Parallel Training of Deep Networks with Local Updates
Deep learning models trained on large data sets have been widely success...

02/28/2019 · A block-random algorithm for learning on distributed, heterogeneous data
Most deep learning models are based on deep neural networks with multipl...

03/15/2018 · GossipGraD: Scalable Deep Learning using Gossip Communication based Asynchronous Gradient Descent
In this paper, we present GossipGraD - a gossip communication protocol b...

03/06/2023 · Towards provably efficient quantum algorithms for large-scale machine-learning models
Large machine learning models are revolutionary technologies of artifici...

10/28/2018 · A Hitchhiker's Guide On Distributed Training of Deep Neural Networks
Deep learning has led to tremendous advancements in the field of Artific...

05/19/2022 · Nebula-I: A General Framework for Collaboratively Training Deep Learning Models on Low-Bandwidth Cloud Clusters
The ever-growing model size and scale of compute have attracted increasi...

11/29/2020 · Scaling down Deep Learning
Though deep learning models have taken on commercial and political relev...
