Ensemble Neural Networks (ENN): A gradient-free stochastic method

08/03/2019
by Yuntian Chen, et al.

In this study, an efficient stochastic gradient-free method, the ensemble neural network (ENN), is developed. In the ENN, the optimization process relies on covariance matrices rather than derivatives. The covariance matrices are calculated by the ensemble randomized maximum likelihood algorithm (EnRML), an inverse modeling method. Because it is built under the Bayesian framework, the ENN simultaneously provides estimates and quantifies their uncertainty. The ENN is also robust to small training datasets because the ensemble of stochastic realizations essentially enlarges the training data, a desirable characteristic for real-world engineering applications. In addition, the ENN does not require the calculation of gradients, which enables the use of complicated neuron models and loss functions. We experimentally demonstrate the benefits of the proposed model, in particular showing that the ENN performs much better than traditional Bayesian neural networks (BNN). The EnRML in the ENN is a substitute for gradient-based optimization algorithms, which means that it can be combined directly with the feed-forward process of other existing (deep) neural networks, such as convolutional neural networks (CNN) and recurrent neural networks (RNN), broadening the future applications of the ENN.
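As a rough illustration of the idea, the sketch below trains a tiny network with an ensemble-Kalman-style update in the spirit of EnRML: weight updates are driven entirely by covariance matrices estimated from an ensemble of weight realizations, with no backpropagation. The network architecture, ensemble size, iteration count, and the simplified update rule (which omits EnRML's prior regularization term and Gauss-Newton step-length control) are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of gradient-free, ensemble-based training in the spirit of
# EnRML. Simplified: a plain iterative ensemble-Kalman update with perturbed
# observations stands in for the full EnRML iteration.
import numpy as np

rng = np.random.default_rng(0)

def forward(theta, x):
    """Tiny 1-hidden-layer network; theta packs W1 (1x8), b1 (8), w2 (8), b2."""
    W1, b1 = theta[:8].reshape(1, 8), theta[8:16]
    w2, b2 = theta[16:24], theta[24]
    h = np.tanh(x @ W1 + b1)                 # (n_obs, 8)
    return h @ w2 + b2                       # (n_obs,)

# Toy regression data (illustrative).
x = np.linspace(-2, 2, 40).reshape(-1, 1)
y = np.sin(2 * x[:, 0]) + 0.05 * rng.standard_normal(40)

n_ens, n_par, sigma = 200, 25, 0.05
Theta = 0.5 * rng.standard_normal((n_ens, n_par))  # prior ensemble of weights

for _ in range(30):
    D = np.stack([forward(t, x) for t in Theta])   # (n_ens, n_obs) predictions
    # Covariances estimated from the ensemble -- no derivatives anywhere.
    Tc = Theta - Theta.mean(axis=0)
    Dc = D - D.mean(axis=0)
    C_td = Tc.T @ Dc / (n_ens - 1)                 # weight/output cross-cov
    C_dd = Dc.T @ Dc / (n_ens - 1)                 # output auto-cov
    K = C_td @ np.linalg.inv(C_dd + sigma**2 * np.eye(len(y)))
    # Each realization is nudged toward its own perturbed copy of the data.
    Y_pert = y + sigma * rng.standard_normal((n_ens, len(y)))
    Theta = Theta + (Y_pert - D) @ K.T

print("ensemble-mean RMSE:",
      np.sqrt(np.mean((forward(Theta.mean(axis=0), x) - y) ** 2)))
```

Because each realization is matched against its own perturbed copy of the data, the spread of the final weight ensemble also yields a crude uncertainty estimate alongside the point prediction, mirroring the Bayesian character of the ENN described above.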

