Deep Control - a simple automatic gain control for memory efficient and high performance training of deep convolutional neural networks

06/13/2017
by Brendan Ruff, et al.

Training a deep convolutional neural network typically starts from a random initialisation of all filters in all layers, which severely attenuates the forward signal and the back-propagated error and leads to slow, sub-optimal training. Techniques that counter this focus on either increasing the signal or adaptively increasing the gradients, but the model behaves very differently at the beginning of training than it does later, once stable pathways through the net have been established. Compounding this problem, the effective minibatch size varies greatly between layers at different depths and between individual filters, since activation sparsity typically increases with depth. This reduces the effective learning rate, because gradients may superpose rather than add, and it further aggravates covariate shift since deeper neurons are less able to adapt to upstream shift. Proposed here is a method of automatic gain control of the signal, built into each convolutional neuron, that achieves performance equivalent or superior to batch normalisation and is compatible with single-sample or minibatch gradient descent. The same model is used for both training and inference. The technique comprises a scaled, per-sample map-mean subtraction from the raw convolutional filter output, followed by scaling of the difference.
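The last sentence describes a per-channel operation that can be sketched in a few lines. Below is a minimal, hypothetical PyTorch illustration of such a per-sample map-mean gain control: the framework choice, the module and parameter names, and the decision to make the mean-subtraction scale (alpha) and output gain (gamma) learnable per channel are all assumptions for illustration, not the authors' released implementation.

    import torch
    import torch.nn as nn

    class PerMapGainControl(nn.Module):
        """Sketch of the gain control described in the abstract: subtract a
        scaled per-sample, per-map spatial mean from the raw convolution
        output, then scale the difference. No minibatch statistics are used,
        so it works identically with a batch size of one."""

        def __init__(self, num_channels: int):
            super().__init__()
            # Assumed learnable per-channel parameters (one per feature map).
            self.alpha = nn.Parameter(torch.ones(1, num_channels, 1, 1))
            self.gamma = nn.Parameter(torch.ones(1, num_channels, 1, 1))

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # Mean over the spatial dimensions only, per sample and per map.
            map_mean = x.mean(dim=(2, 3), keepdim=True)
            return self.gamma * (x - self.alpha * map_mean)

    class ConvWithGainControl(nn.Module):
        """Convolution followed by the gain-control stage and a nonlinearity."""

        def __init__(self, in_channels: int, out_channels: int, kernel_size: int = 3):
            super().__init__()
            self.conv = nn.Conv2d(in_channels, out_channels, kernel_size,
                                  padding=kernel_size // 2, bias=False)
            self.agc = PerMapGainControl(out_channels)
            self.act = nn.ReLU(inplace=True)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.act(self.agc(self.conv(x)))

    if __name__ == "__main__":
        layer = ConvWithGainControl(3, 16)
        out = layer(torch.randn(2, 3, 32, 32))  # any batch size, including 1
        print(out.shape)  # torch.Size([2, 16, 32, 32])

Because the statistics are computed per sample, the same module is used unchanged at training and inference time, consistent with the abstract's claim; unlike batch normalisation, no running averages or batch-level moments need to be stored.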


Related research

11/29/2019  Mean Shift Rejection: Training Deep Neural Networks Without Minibatch Statistics or Normalization
Deep convolutional neural networks are known to be unstable during train...

04/09/2022  FoundationLayerNorm: Scaling BERT and GPT to 1,000 Layers
The mainstream BERT/GPT model contains only 10 to 20 layers, and there i...

10/15/2015  Layer-Specific Adaptive Learning Rates for Deep Networks
The increasing complexity of deep learning architectures is resulting in...

03/30/2016  Deep Networks with Stochastic Depth
Very deep convolutional networks with hundreds of layers have led to sig...

10/28/2018  Distilling Critical Paths in Convolutional Neural Networks
Neural network compression and acceleration are widely demanded currentl...

05/20/2023  A Framework for Provably Stable and Consistent Training of Deep Feedforward Networks
We present a novel algorithm for training deep neural networks in superv...
