Overcoming Overfitting and Large Weight Update Problem in Linear Rectifiers: Thresholded Exponential Rectified Linear Units

06/04/2020
by Vijay Pandey, et al.

In the past few years, rectified linear unit activation functions have shown their significance in neural networks, surpassing the performance of sigmoid activations. ReLU (Nair & Hinton, 2010), ELU (Clevert et al., 2015), PReLU (He et al., 2015), LReLU (Maas et al., 2013), SReLU (Jin et al., 2016), and ThresholdedReLU each have their own advantages over the others in some respect. Most of these activation functions suffer from a bias shift problem due to their non-zero output mean, and from a large weight update problem in deep, complex networks due to their unit gradient; these lead to slower training and high variance in model prediction, respectively. In this paper, we propose the "Thresholded Exponential Rectified Linear Unit" (TERELU) activation function, which is better at alleviating overfitting caused by the large weight update problem. Along with alleviating overfitting, TERELU also provides a good amount of non-linearity compared to other linear rectifiers. We show better performance on various datasets using neural networks with TERELU compared to other activations.
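The exact formulation of TERELU is given in the full paper. As a rough, hedged illustration of the idea the abstract describes, the sketch below combines a ReLU-style linear region with an exponential saturation above a threshold, so the gradient is no longer identically one for large pre-activations and very large activations no longer drive very large weight updates. The function name terelu and the hyperparameters theta and alpha are assumptions for illustration only, not the paper's definition or values.

    import numpy as np

    def terelu(x, theta=2.0, alpha=1.0):
        """Illustrative thresholded exponential rectified linear unit (sketch).

        Behaves like ReLU (zero for negative inputs, linear otherwise) up to the
        threshold `theta`; above `theta` the output saturates exponentially, so
        the derivative falls below one. `theta` and `alpha` are illustrative
        hyperparameters, not taken from the paper.
        """
        x = np.asarray(x, dtype=float)
        pos = np.maximum(x, 0.0)                                              # ReLU part
        capped = theta + alpha * (1.0 - np.exp(-(pos - theta) / alpha))       # exponential saturation above theta
        return np.where(pos <= theta, pos, capped)

The saturating branch is continuous and has slope one at the threshold, then decays toward zero slope, which is one plausible way to bound the gradient that causes the large weight update problem mentioned above.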


