Why ReLU Units Sometimes Die: Analysis of Single-Unit Error Backpropagation in Neural Networks

12/14/2018
by Scott C. Douglas, et al.

Modern neural networks in machine learning commonly use rectified linear units (ReLUs) in their early processing layers to improve performance. Training these structures sometimes produces "dying ReLU units" whose outputs remain near zero. We first explore this condition via simulation on the CIFAR-10 dataset using variants of two popular convolutional neural network architectures. These explorations show that the output activation probability Pr[y > 0] is generally less than 0.5 at convergence for layers that do not employ skip connections, and that this activation probability tends to decrease as one progresses from the input layer toward the output layer. Employing a simplified model of a single ReLU unit trained by a variant of error backpropagation, we then perform a statistical convergence analysis of the model's evolutionary behavior. Our analysis characterizes the potentially slower convergence of dying ReLU units and shows that this issue can occur regardless of how the weights are initialized.
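For intuition only, the sketch below is not the paper's experiment or analysis: it trains a single ReLU unit y = max(0, w·x + b) with plain error backpropagation (squared loss, SGD) on Gaussian inputs and estimates the empirical activation probability Pr[y > 0]. The teacher unit, input distribution, and all hyperparameters are illustrative assumptions. It shows the basic failure mode discussed in the abstract: once the pre-activation is negative for essentially every input, the gradient is zero, the weights stop updating, and the unit stays dead.

```python
# Minimal sketch (illustrative assumptions throughout): a single ReLU unit
#   y = max(0, w.x + b)
# trained by plain error backpropagation on random Gaussian inputs, with
# targets produced by a fixed "true" ReLU unit. Once w.x + b < 0 for
# (almost) all inputs, the gradient vanishes and Pr[y > 0] collapses to 0.

import numpy as np

rng = np.random.default_rng(0)
d, lr, steps = 8, 0.05, 5000

# Illustrative teacher unit generating the targets.
w_true = rng.normal(size=d)
b_true = 0.5

def run(b_init):
    w = rng.normal(scale=0.1, size=d)
    b = b_init
    active = 0
    for _ in range(steps):
        x = rng.normal(size=d)
        pre = w @ x + b
        y = max(pre, 0.0)
        target = max(w_true @ x + b_true, 0.0)
        err = y - target
        if pre > 0:              # ReLU gradient is nonzero only when the unit is active
            w -= lr * err * x
            b -= lr * err
            active += 1
    return active / steps        # empirical estimate of Pr[y > 0]

print("healthy init (b = 0):  Pr[y>0] ~", run(0.0))
print("bad init (b = -10):    Pr[y>0] ~", run(-10.0))
```

With the zero-bias initialization the unit is active on roughly half the inputs and keeps learning; with the strongly negative bias it never activates, receives no gradient, and remains dead for the entire run.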
