On the Activation Function Dependence of the Spectral Bias of Neural Networks

08/09/2022
by   Qingguo Hong, et al.

Neural networks are universal function approximators that are known to generalize well despite being dramatically overparameterized. We study this phenomenon from the point of view of the spectral bias of neural networks. Our contributions are two-fold. First, we provide a theoretical explanation for the spectral bias of ReLU neural networks by leveraging connections with the theory of finite element methods. Second, based on this theory, we predict that switching the activation function to a piecewise linear B-spline, namely the Hat function, will remove this spectral bias, which we verify empirically in a variety of settings. Our empirical studies also show that neural networks with the Hat activation function train significantly faster under stochastic gradient descent and Adam. Combined with previous work showing that the Hat activation function also improves generalization accuracy on image classification tasks, this indicates that the Hat activation provides significant advantages over ReLU on certain problems.
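For concreteness, the Hat function referenced in the abstract is the standard piecewise linear B-spline ("hat") basis function from finite element methods, which can be written as a fixed combination of three ReLUs. Below is a minimal sketch of one common parameterization (supported on [0, 2], peaking at 1) as a drop-in activation module; the exact scaling and support used in the paper may differ, and the small MLP shown is purely illustrative.

```python
import torch
import torch.nn as nn

class Hat(nn.Module):
    """Piecewise linear B-spline 'hat' activation (one common parameterization).

    hat(x) = x      for 0 <= x <= 1,
             2 - x  for 1 <= x <= 2,
             0      otherwise,
    which equals relu(x) - 2*relu(x - 1) + relu(x - 2).
    The paper's exact scaling/support may differ from this sketch.
    """

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(x) - 2 * torch.relu(x - 1) + torch.relu(x - 2)

# Hypothetical usage: swap ReLU for Hat in a small MLP.
mlp = nn.Sequential(
    nn.Linear(1, 64),
    Hat(),
    nn.Linear(64, 64),
    Hat(),
    nn.Linear(64, 1),
)
```

Because the hat function has compact support, a practical implementation would presumably scale inputs or initialization so that pre-activations land inside that support; this is an assumption of the sketch, not a detail stated in the abstract.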

