Learning Activation Functions: A new paradigm of understanding Neural Networks

06/23/2019
by Mohit Goyal, et al.

There has been limited research on activation functions, most of it focused on making neural networks (NNs) easier to optimize. However, to develop a deeper understanding of deep learning, it is important to look more carefully at the non-linear component of NNs. In this paper, we aim to provide a generic form of activation function along with appropriate mathematical grounding, so as to allow for future insights into the working of NNs. We propose "Self-Learnable Activation Functions" (SLAF), which are learned during training and are capable of approximating most existing activation functions. An SLAF is given as a weighted sum of pre-defined basis elements that can serve as a good approximation of the optimal activation function. The coefficients of these basis elements allow a search over the entire space of continuous functions (including all the conventional activations). We propose various training routines that can be used to achieve good performance with SLAF-equipped neural networks (SLNNs). We prove that SLNNs can approximate any neural network with Lipschitz-continuous activations to arbitrary error, highlighting their capacity and possible equivalence with standard NNs. Moreover, SLNNs can be completely represented as a collection of finite-degree polynomials up to the very last layer, obviating several hyperparameters such as width and depth. Since the optimization of SLNNs is still a challenge, we show that using SLAF alongside standard activations (such as ReLU) can provide performance improvements with only a small increase in the number of parameters.
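To make the idea concrete, the sketch below shows one possible SLAF layer as a learnable weighted sum of basis elements in PyTorch. The polynomial (power) basis, the number of basis elements, and the sharing of coefficients across all units are illustrative assumptions; the abstract leaves the choice of basis generic.

```python
import torch
import torch.nn as nn

class SLAF(nn.Module):
    """Self-learnable activation: a weighted sum of pre-defined basis elements.

    Minimal sketch assuming a polynomial (power) basis {1, x, x^2, ...};
    the basis choice and coefficient sharing are assumptions, not the
    paper's prescribed setup.
    """

    def __init__(self, num_basis: int = 4):
        super().__init__()
        # Learnable coefficients a_0 ... a_{k-1}, shared across all units.
        self.coeffs = nn.Parameter(0.1 * torch.randn(num_basis))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = torch.zeros_like(x)
        power = torch.ones_like(x)
        for a in self.coeffs:          # out = sum_i a_i * x**i
            out = out + a * power
            power = power * x
        return out

# Usage: drop an SLAF where a fixed activation such as ReLU would normally go.
net = nn.Sequential(nn.Linear(16, 32), SLAF(num_basis=4), nn.Linear(32, 1))
y = net(torch.randn(8, 16))
```

Because the coefficients are ordinary parameters, they are trained jointly with the weights by backpropagation, which is what lets the activation shape adapt to the data rather than being fixed in advance.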

Related research

07/13/2023  Deep Network Approximation: Beyond ReLU to Diverse Activation Functions
This paper explores the expressive power of deep neural networks for a d...

08/25/2023  Linear Oscillation: The Aesthetics of Confusion for Vision Transformer
Activation functions are the linchpins of deep learning, profoundly infl...

01/03/2022  Deep neural networks for smooth approximation of physics with higher order and continuity B-spline base functions
This paper deals with the following important research question. Traditi...

03/28/2023  Function Approximation with Randomly Initialized Neural Networks for Approximate Model Reference Adaptive Control
Classical results in neural network approximation theory show how arbitr...

11/21/2019  DeepLABNet: End-to-end Learning of Deep Radial Basis Networks with Fully Learnable Basis Functions
From fully connected neural networks to convolutional neural networks, t...

06/11/2020  On the asymptotics of wide networks with polynomial activations
We consider an existing conjecture addressing the asymptotic behavior of...

12/30/2021  A Unified and Constructive Framework for the Universality of Neural Networks
One of the reasons why many neural networks are capable of replicating c...
