On the Impact of the Activation Function on Deep Neural Networks Training

02/19/2019
by Soufiane Hayou, et al.

The weight initialization and the activation function of deep neural networks have a crucial impact on the performance of the training procedure. An inappropriate selection can lead to the loss of input information during forward propagation and to exponentially vanishing/exploding gradients during back-propagation. Understanding the theoretical properties of untrained random networks is key to identifying which deep networks may be trained successfully, as recently demonstrated by Samuel et al. (2017), who showed that for deep feedforward neural networks only a specific choice of hyperparameters, known as the 'Edge of Chaos', can lead to good performance. While the work by Samuel et al. (2017) discusses trainability issues, we focus here on training acceleration and overall performance. We give a comprehensive theoretical analysis of the Edge of Chaos and show that one can indeed tune the initialization parameters and the activation function to accelerate training and improve performance.
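To make the 'Edge of Chaos' concrete: it is the boundary, in the plane of initialization variances (sigma_b^2, sigma_w^2), between an ordered phase where gradients vanish and a chaotic phase where they explode. The sketch below is not code from the paper; it is a minimal NumPy illustration of the standard mean-field criterion, which iterates the length map q -> sigma_w^2 * E[phi(sqrt(q) Z)^2] + sigma_b^2 to its fixed point q* and locates the boundary where chi = sigma_w^2 * E[phi'(sqrt(q*) Z)^2] = 1. The helper names and the Monte Carlo estimator are illustrative assumptions.

```python
import numpy as np

def gauss_expect(f, q, n=100_000, seed=0):
    """Monte Carlo estimate of E[f(sqrt(q) * Z)] with Z ~ N(0, 1)."""
    z = np.random.default_rng(seed).standard_normal(n)
    return np.mean(f(np.sqrt(q) * z))

def variance_fixed_point(sigma_w2, sigma_b2, phi, q0=1.0, iters=100):
    """Iterate the mean-field length map q -> sigma_w2 * E[phi(sqrt(q) Z)^2] + sigma_b2."""
    q = q0
    for _ in range(iters):
        q = sigma_w2 * gauss_expect(lambda x: phi(x) ** 2, q) + sigma_b2
    return q

def chi(sigma_w2, sigma_b2, phi, dphi):
    """Gradient multiplier chi = sigma_w2 * E[phi'(sqrt(q*) Z)^2].
    chi < 1: ordered phase (vanishing gradients); chi > 1: chaotic phase
    (exploding gradients); chi = 1 defines the Edge of Chaos."""
    q_star = variance_fixed_point(sigma_w2, sigma_b2, phi)
    return sigma_w2 * gauss_expect(lambda x: dphi(x) ** 2, q_star)

phi = np.tanh
dphi = lambda x: 1.0 - np.tanh(x) ** 2

# Scan sigma_w^2 at a fixed small bias variance and look for chi ~= 1.
sigma_b2 = 0.05
for sigma_w2 in np.linspace(1.0, 2.0, 11):
    print(f"sigma_w^2 = {sigma_w2:.2f}  chi = {chi(sigma_w2, sigma_b2, phi, dphi):.3f}")
```

For tanh with sigma_b^2 = 0, the crossing occurs exactly at sigma_w^2 = 1, and it moves slightly above 1 as sigma_b^2 grows, recovering the familiar critical initialization for tanh networks.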

