Numerical influence of ReLU'(0) on backpropagation

06/23/2021
by David Bertoin, et al.

In theory, the choice of ReLU'(0) in [0, 1] for a neural network has a negligible influence on both backpropagation and training. Yet, in practice, the default 32-bit precision combined with the size of deep learning problems turns ReLU'(0) into a hyperparameter of training methods. We investigate the importance of the value of ReLU'(0) for several precision levels (16, 32, 64 bits), on various networks (fully connected, VGG, ResNet) and datasets (MNIST, CIFAR10, SVHN). We observe considerable variations of the backpropagation output, occurring around half of the time at 32-bit precision. The effect disappears in double precision, while it is systematic at 16 bits. For vanilla SGD training, the choice ReLU'(0) = 0 seems to be the most efficient. We also show that reconditioning approaches such as batch normalization or Adam tend to buffer the influence of the value of ReLU'(0). Overall, the message we want to convey is that algorithmic differentiation of nonsmooth problems potentially hides parameters that could be tuned advantageously.
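To make the mechanism concrete, here is a minimal sketch (not the authors' code) of how ReLU'(0) can be exposed as a tunable value in PyTorch through a custom autograd function; the class name ReLUAlpha and the toy input below are illustrative assumptions.

import torch

class ReLUAlpha(torch.autograd.Function):
    # ReLU whose derivative at exactly 0 is a user-chosen value alpha in [0, 1],
    # instead of the convention hard-coded in the framework.

    @staticmethod
    def forward(ctx, x, alpha):
        ctx.save_for_backward(x)
        ctx.alpha = alpha
        return x.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        d = (x > 0).to(grad_output.dtype)                          # 1 where x > 0, 0 where x < 0
        d = torch.where(x == 0, torch.full_like(d, ctx.alpha), d)  # alpha where x == 0
        return grad_output * d, None                               # no gradient w.r.t. alpha

# The choice of alpha only changes the gradient at entries that are *exactly* zero.
# The abstract's point is that, at 32-bit precision, pre-activations hit exactly 0
# often enough that the two conventions below can produce different training runs,
# whereas exact zeros are essentially absent in double precision.
x = torch.tensor([[-1.0, 0.0, 2.0]], requires_grad=True)
for alpha in (0.0, 1.0):
    x.grad = None
    ReLUAlpha.apply(x, alpha).sum().backward()
    print(alpha, x.grad)   # gradients differ only in the middle (zero) entry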


