Decoupled Neural Interfaces using Synthetic Gradients

08/18/2016
by Max Jaderberg, et al.

Training directed neural networks typically requires forward-propagating data through a computation graph, followed by backpropagating an error signal, to produce weight updates. All layers, or more generally, modules, of the network are therefore locked, in the sense that they must wait for the remainder of the network to execute forwards and propagate error backwards before they can be updated. In this work we break this constraint by decoupling modules: we introduce a model of the future computation of the network graph. These models predict what the modelled subgraph will produce, using only local information. In particular we focus on modelling error gradients: by using the modelled synthetic gradient in place of true backpropagated error gradients we decouple subgraphs and can update them independently and asynchronously, i.e. we realise decoupled neural interfaces. We show results for feed-forward models, where every layer is trained asynchronously; recurrent neural networks (RNNs), where predicting one's future gradient extends the time over which the RNN can effectively model; and a hierarchical RNN system with modules ticking at different timescales. Finally, we demonstrate that, in addition to predicting gradients, the same framework can be used to predict inputs, resulting in models which are decoupled in both the forward and backwards pass, amounting to independent networks which co-learn such that they can be composed into a single functioning corporation.

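As a concrete illustration of the mechanism described in the abstract, here is a minimal sketch of a decoupled neural interface, assuming PyTorch; the layer sizes, module names (layer1, layer2, sg_model), learning rates, and the single update step are illustrative assumptions, not the paper's configuration.

```python
# Minimal sketch of synthetic-gradient decoupling, assuming PyTorch.
# All sizes, names and hyperparameters are illustrative, not the paper's setup.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

layer1 = nn.Linear(10, 20)    # lower module: updated from a synthetic gradient
layer2 = nn.Linear(20, 1)     # upper module: produces the true loss
sg_model = nn.Linear(20, 20)  # synthetic gradient model: predicts dL/dh1 from h1 alone

opt1 = torch.optim.SGD(layer1.parameters(), lr=0.01)
opt2 = torch.optim.SGD(layer2.parameters(), lr=0.01)
opt_sg = torch.optim.SGD(sg_model.parameters(), lr=0.01)

x = torch.randn(8, 10)
y = torch.randn(8, 1)

# Forward through the lower module only.
h1 = layer1(x)

# Update layer1 immediately with the predicted (synthetic) gradient:
# no waiting for the rest of the network to run forwards and backwards.
synthetic_grad = sg_model(h1.detach())
opt1.zero_grad()
h1.backward(gradient=synthetic_grad.detach())
opt1.step()

# Later (possibly asynchronously): run the upper module and obtain the true
# gradient of the loss with respect to h1.
h1_input = h1.detach().requires_grad_(True)
loss = F.mse_loss(layer2(h1_input), y)
opt2.zero_grad()
loss.backward()
opt2.step()

# Fit the synthetic gradient model to the true gradient (L2 regression).
true_grad = h1_input.grad.detach()
sg_loss = F.mse_loss(sg_model(h1.detach()), true_grad)
opt_sg.zero_grad()
sg_loss.backward()
opt_sg.step()
```

In the paper, the synthetic gradient model can also be conditioned on additional local information such as labels, and trained against bootstrapped targets built from downstream synthetic gradients; the plain L2 fit to the true gradient shown here is the simplest variant.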

Related Research

03/01/2017

Understanding Synthetic Gradients and Decoupled Neural Interfaces

When training neural networks, the use of Synthetic Gradients (SG) allow...
09/21/2020

Feed-Forward On-Edge Fine-tuning Using Static Synthetic Gradient Modules

Training deep learning models on embedded devices is typically avoided s...
12/22/2017

Benchmarking Decoupled Neural Interfaces with Synthetic Gradients

Artificial Neural Networks are a particular class of learning system model...
04/11/2022

Time-Adaptive Recurrent Neural Networks

Data are often sampled irregularly in time. Dealing with this using Recu...
01/29/2019

Sample Complexity Bounds for Recurrent Neural Networks with Application to Combinatorial Graph Problems

Learning to predict solutions to real-valued combinatorial graph problem...
06/18/2017

Learning Hierarchical Information Flow with Recurrent Neural Modules

We propose ThalNet, a deep learning model inspired by neocortical commun...
01/14/2017

Long Timescale Credit Assignment in Neural Networks with External Memory

Credit assignment in traditional recurrent neural networks usually invol...

Code Repositories

tensorflow-synthetic_gradient

synthetic-gradient

Reference implementation of decoupled training with synthetic gradients.

tenet

Implements Decoupled Neural Interfaces using Synthetic Gradients: https://arxiv.org/abs/1608.05343

nnabla-dni

Decoupled Neural Interfaces with NNabla.