Parametric Flatten-T Swish: An Adaptive Non-linear Activation Function For Deep Learning

11/06/2020
by   Hock Hung Chieng, et al.

The activation function is a key component of deep learning that performs non-linear mappings between inputs and outputs. The Rectified Linear Unit (ReLU) has been the most popular activation function across the deep learning community. However, ReLU has several shortcomings that can lead to inefficient training of deep neural networks: 1) its negative-cancellation property treats negative inputs as unimportant information for learning, resulting in degraded performance; 2) its fixed, predefined form does not offer additional flexibility, expressivity, or robustness to the network; 3) its mean activation is highly positive, producing a bias-shift effect in network layers; and 4) its multilinear structure restricts the non-linear approximation power of the network. To address these shortcomings, this paper introduces Parametric Flatten-T Swish (PFTS) as an alternative to ReLU. With ReLU as the baseline, experiments showed that PFTS improved classification accuracy on the SVHN dataset by 0.31% and 0.97% for DNN-6 and DNN-7, respectively. PFTS also achieved the highest mean rank among the compared methods. The proposed PFTS exhibited higher non-linear approximation power during training and thereby improved the predictive performance of the networks.
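The abstract does not give the functional form of PFTS, so the sketch below is only an illustration under an assumption: it takes the earlier Flatten-T Swish definition, f(x) = x·sigmoid(x) + T for x ≥ 0 and T otherwise, and makes the threshold T a trainable parameter (the "parametric" part). The class name, the initial value of T, and the single-threshold-per-layer choice are assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn


class PFTS(nn.Module):
    """Sketch of a Parametric Flatten-T Swish-style activation.

    Assumes the Flatten-T Swish form
        f(x) = x * sigmoid(x) + T   if x >= 0
        f(x) = T                    otherwise,
    with the threshold T promoted to a learnable parameter.
    The initial value of T is an assumption, not a value from the paper.
    """

    def __init__(self, init_t: float = -0.20):
        super().__init__()
        # One learnable threshold shared across the layer (assumption).
        self.t = nn.Parameter(torch.tensor(init_t))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Swish-like branch for non-negative inputs, flat threshold T elsewhere.
        return torch.where(x >= 0, x * torch.sigmoid(x) + self.t, self.t)


# Usage: drop-in replacement for nn.ReLU() in a small fully connected classifier.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(32 * 32 * 3, 256),
    PFTS(),
    nn.Linear(256, 10),
)
```

Because T is registered as an nn.Parameter, it is updated by the optimizer along with the weights, which is one plausible way an activation of this kind could adapt its shape during training.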


Related research

12/15/2018
Flatten-T Swish: a thresholded ReLU-Swish-like activation function for deep learning
Activation functions are essential for deep learning methods to learn an...

10/17/2022
Nish: A Novel Negative Stimulated Hybrid Activation Function
An activation function has a significant impact on the efficiency and ro...

09/10/2020
Activate or Not: Learning Customized Activation
Modern activation layers use non-linear functions to activate the neuron...

04/03/2016
Multi-Bias Non-linear Activation in Deep Neural Networks
As a widely used non-linear activation, Rectified Linear Unit (ReLU) sep...

02/07/2023
Efficient Parametric Approximations of Neural Network Function Space Distance
It is often useful to compactly summarize important properties of model ...

04/08/2016
Norm-preserving Orthogonal Permutation Linear Unit Activation Functions (OPLU)
We propose a novel activation function that implements piece-wise orthog...

11/27/2021
Why KDAC? A general activation function for knowledge discovery
Named entity recognition based on deep learning (DNER) can effectively m...
