Solving internal covariate shift in deep learning with linked neurons

This work proposes a novel solution to the problems of internal covariate shift and dying neurons based on the concept of linked neurons. We define the neuron linkage in terms of two constraints: first, all neuron activations in the linkage must have the same operating point, that is, they share their input weights; second, a set of neurons is linked if and only if, for any input to the activation function, at least one member of the linkage has a non-zero gradient with respect to that input, i.e., operates in a non-flat, non-zero region. This simple change has profound implications for the learning dynamics of the network. In this article we explore the consequences of this proposal and show that using these units implicitly solves internal covariate shift. As a result, linked neurons make it possible to train arbitrarily large networks without any architectural or algorithmic trick, removing the need for re-normalization schemes such as Batch Normalization and thereby halving the required training time. They also remove the need for standardized input data. Results show that units using the linkage not only solve the aforementioned problems effectively, but are also a competitive alternative to the state of the art, with very promising results.
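
As a concrete illustration of the two linkage constraints, consider a pair of activations fed by the same pre-activation z, one applying ReLU to z and one to -z: they share the operating point, and for any non-zero z at least one of the two has a non-zero gradient. The NumPy sketch below assumes this particular pair and the hypothetical helper name linked_relu_pair; it is only one way to satisfy the constraints stated in the abstract, not necessarily the exact unit used in the paper.

```python
import numpy as np

def linked_relu_pair(x, W, b):
    """A sketch of one possible linked pair of neurons.

    Constraint 1: both activations share the same operating point,
    i.e. the same pre-activation z = W @ x + b (shared input weights).
    Constraint 2: for any z != 0, at least one member has a non-zero
    gradient: ReLU(z) is flat for z < 0, where ReLU(-z) is not, and
    vice versa, so the pair is never simultaneously saturated.
    """
    z = W @ x + b                                  # shared pre-activation
    return np.maximum(z, 0.0), np.maximum(-z, 0.0)  # linked activations

# Toy usage: a single linked pair on a 3-dimensional input.
rng = np.random.default_rng(0)
x = rng.normal(size=3)
W = rng.normal(size=(1, 3))
b = np.zeros(1)
a_pos, a_neg = linked_relu_pair(x, W, b)
print(a_pos, a_neg)  # exactly one of the two is active when z != 0
```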
