Gaussian Process Neurons Learn Stochastic Activation Functions

11/29/2017
by Sebastian Urban, et al.

We propose stochastic, non-parametric activation functions that are fully learnable and individual to each neuron. Complexity and the risk of overfitting are controlled by placing a Gaussian process prior over these functions. The result is the Gaussian process neuron, a probabilistic unit that can be used as the basic building block for probabilistic graphical models that resemble the structure of neural networks. The proposed model can intrinsically handle uncertainties in its inputs and self-estimate the confidence of its predictions. Using variational Bayesian inference and the central limit theorem, a fully deterministic loss function is derived, allowing the model to be trained as efficiently as a conventional neural network using mini-batch gradient descent. The posterior distribution of activation functions is inferred from the training data alongside the weights of the network. The proposed model compares favorably to deep Gaussian processes, both in model complexity and efficiency of inference. It can be directly applied to recurrent or convolutional network structures, allowing its use in audio and image processing tasks. As a preliminary empirical evaluation we present experiments on regression and classification tasks, in which our model achieves performance comparable to or better than a Dropout-regularized neural network with a fixed activation function. Experiments are ongoing and results will be added as they become available.
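To make the core idea concrete, below is a minimal sketch of a layer whose per-neuron activation function is the posterior mean of a GP conditioned on a small set of learnable inducing points under a squared-exponential kernel. All names here (GPActivation, Z, u, n_inducing) are our own illustrative choices, not the authors' implementation, and the paper's full variational treatment, including the CLT-based propagation of means and variances that yields the deterministic loss, is deliberately omitted. The sketch only shows how a learnable, non-parametric activation can be trained with ordinary mini-batch gradient descent:

import torch
import torch.nn as nn

class GPActivation(nn.Module):
    """Per-neuron learnable activation via a GP posterior mean (illustrative sketch).

    Each neuron's activation is the noiseless GP conditional mean
    f(x) = k(x, Z) K_ZZ^{-1} u, with inducing inputs Z, inducing values u,
    and per-neuron lengthscales all learned by gradient descent.
    """
    def __init__(self, n_neurons, n_inducing=8, lengthscale=1.0, jitter=1e-4):
        super().__init__()
        # Inducing inputs Z and values u, one set per neuron (both learnable).
        self.Z = nn.Parameter(torch.linspace(-2.0, 2.0, n_inducing).repeat(n_neurons, 1))
        self.u = nn.Parameter(torch.tanh(self.Z.detach()))  # initialize near tanh
        self.log_ls = nn.Parameter(torch.full((n_neurons, 1), float(lengthscale)).log())
        self.jitter = jitter

    def _k(self, a, b):
        # Squared-exponential kernel per neuron: a (N, A), b (N, B) -> (N, A, B).
        ls = self.log_ls.exp().unsqueeze(-1)
        d = a.unsqueeze(-1) - b.unsqueeze(-2)
        return torch.exp(-0.5 * (d / ls) ** 2)

    def forward(self, x):
        # x: (batch, n_neurons) pre-activations; returns the GP posterior mean.
        Kzz = self._k(self.Z, self.Z)
        Kzz = Kzz + self.jitter * torch.eye(Kzz.shape[-1])  # numerical stability
        alpha = torch.linalg.solve(Kzz, self.u.unsqueeze(-1))  # (N, M, 1)
        Kxz = self._k(x.T, self.Z)                              # (N, batch, M)
        return torch.bmm(Kxz, alpha).squeeze(-1).T              # (batch, n_neurons)

# Hypothetical usage: a dense layer followed by per-neuron GP activations.
x = torch.randn(32, 4)
h = nn.Linear(4, 16)(x)
y = GPActivation(n_neurons=16)(h)  # shape (32, 16)

Because Z, u, and the lengthscales are ordinary parameters, the shape of each activation function is inferred from data jointly with the network weights, mirroring the paper's setup, though without the Bayesian treatment of uncertainty described in the abstract.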

Related research

01/13/2023
Neural network with optimal neuron activation functions based on additive Gaussian process regression
Feed-forward neural networks (NN) are a staple machine learning method w...

05/02/2020
A survey on modern trainable activation functions
In the literature, there is a strong interest to identify and define act...

06/30/2022
Consensus Function from an L_p^q-norm Regularization Term for its Use as Adaptive Activation Functions in Neural Networks
The design of a neural network is usually carried out by defining the nu...

02/18/2021
Recurrent Rational Networks
Latest insights from biology show that intelligence does not only emerge...

11/21/2020
Central and Non-central Limit Theorems arising from the Scattering Transform and its Neural Activation Generalization
Motivated by analyzing complicated and non-stationary time series, we st...

10/24/2019
A Bayesian Approach to Recurrence in Neural Networks
We begin by reiterating that common neural network activation functions ...

03/28/2019
On the Stability and Generalization of Learning with Kernel Activation Functions
In this brief we investigate the generalization properties of a recently...
