Gradient Acceleration in Activation Functions

06/26/2018
by Sangchul Hahn, et al.

Dropout has been one of the standard approaches to training deep neural networks, and it is known to regularize large models and prevent overfitting. Its effect is commonly explained as avoiding co-adaptation among units. In this paper, however, we propose a new explanation of why dropout works and, based on it, a new technique for designing better activation functions. First, we show that dropout acts as an optimization technique that pushes the input of a nonlinear activation function toward its saturation area by letting gradient information keep flowing through that area during backpropagation. Based on this explanation, we propose gradient acceleration in activation functions (GAAF), a technique that accelerates gradient flow even in the saturation area. The input to the activation function can then climb into the saturation area, which makes the network more robust because the model converges on a flat region. Experimental results support our explanation of dropout and confirm that the proposed GAAF technique improves performance with the expected properties.
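To make the idea concrete, below is a minimal PyTorch sketch of what a gradient-accelerated activation could look like. This is an illustration under our own assumptions, not the authors' exact formulation: the acceleration term is modeled as a zero-mean sawtooth of amplitude 1/(2k), whose forward value is negligible for large k but whose almost-everywhere derivative is 1, so gradient information keeps flowing where the sigmoid has saturated. The class name GradientAcceleratedSigmoid and the parameter k are hypothetical.

```python
# Hedged sketch of the gradient-acceleration idea (not the paper's exact GAAF
# formulation). Assumption: a small sawtooth term is added to the activation.
import torch
import torch.nn as nn


class GradientAcceleratedSigmoid(nn.Module):
    """Sigmoid plus a small sawtooth term that keeps gradients non-zero."""

    def __init__(self, k: float = 100.0):
        super().__init__()
        self.k = k  # larger k -> smaller perturbation of the forward value

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        base = torch.sigmoid(x)
        # Sawtooth: value bounded by 1/(2k), but derivative 1 almost everywhere
        # (torch.floor contributes zero gradient), so gradient information
        # keeps flowing even where the sigmoid has saturated.
        saw = (self.k * x - torch.floor(self.k * x) - 0.5) / self.k
        return base + saw


# Usage: drop-in replacement for nn.Sigmoid in a small network.
layer = nn.Sequential(nn.Linear(8, 4), GradientAcceleratedSigmoid())
out = layer(torch.randn(2, 8))
out.sum().backward()  # gradients reach the linear layer even under saturation
```

The design choice here is that the added term perturbs the forward pass by at most 1/(2k), so the network's predictions are essentially unchanged while the backward pass always receives a non-vanishing gradient contribution.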

Related research

07/31/2023 - STL: A Signed and Truncated Logarithm Activation Function for Neural Networks
Activation functions play an essential role in neural networks. They pro...

05/15/2020 - A New Activation Function for Training Deep Neural Networks to Avoid Local Minimum
Activation functions have a major role to play and hence are very import...

02/27/2023 - Moderate Adaptive Linear Units (MoLU)
We propose a new high-performance activation function, Moderate Adaptive...

11/14/2018 - Drop-Activation: Implicit Parameter Reduction and Harmonic Regularization
Overfitting frequently occurs in deep learning. In this paper, we propos...

05/17/2021 - Activation function design for deep networks: linearity and effective initialisation
The activation function deployed in a deep neural network has great infl...

12/21/2013 - An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks
Catastrophic forgetting is a problem faced by many machine learning mode...
