Optimal Training of Mean Variance Estimation Neural Networks

02/17/2023
by   Laurens Sluijterman, et al.

This paper focuses on the optimal implementation of a Mean Variance Estimation network (MVE network) (Nix and Weigend, 1994). This type of network is often used as a building block for uncertainty estimation methods in a regression setting, for instance Concrete dropout (Gal et al., 2017) and Deep Ensembles (Lakshminarayanan et al., 2017). Specifically, an MVE network assumes that the data is produced from a normal distribution with a mean function and a variance function. The MVE network outputs a mean and a variance estimate and optimizes the network parameters by minimizing the negative log-likelihood. In this paper, we discuss two points. First, the convergence difficulties reported in recent work can be prevented relatively easily by following the original authors' recommendation to use a warm-up period. During this period, only the mean is optimized, assuming a fixed variance. This recommendation is often not followed in practice. We experimentally demonstrate how essential this step is. We also examine whether keeping the mean estimate fixed after the warm-up leads to different results than estimating both the mean and the variance simultaneously after the warm-up. We do not observe a substantial difference. Second, we propose a novel improvement of the MVE network: separate regularization of the mean and the variance estimate. We demonstrate, both on toy examples and on a number of benchmark UCI regression data sets, that following the original recommendations and the novel separate regularization can lead to significant improvements.
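The training recipe described in the abstract (negative log-likelihood loss, a warm-up in which only the mean is fitted under a fixed variance, then joint training with separate regularization) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a toy model with a linear mean and a linear log-variance in place of a neural network, plain full-batch gradient descent, and L2 penalties on the two slopes; the names `train`, `reg_mean`, `reg_var`, and all numeric settings are hypothetical choices for illustration.

```python
import math
import random


def gaussian_nll(y, mean, log_var):
    """Negative log-likelihood of y under N(mean, exp(log_var))."""
    return 0.5 * (math.log(2 * math.pi) + log_var
                  + (y - mean) ** 2 / math.exp(log_var))


def average_nll(params, xs, ys):
    """Average NLL of the toy model: mean = a*x + b, log-variance = c*x + d."""
    a, b, c, d = params
    return sum(gaussian_nll(y, a * x + b, c * x + d)
               for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, steps=1000, warmup_steps=200, lr=0.05,
          reg_mean=1e-4, reg_var=1e-3):
    """Gradient descent with a warm-up and separate L2 penalties (assumed form)."""
    a = b = c = d = 0.0  # c = d = 0 means the variance starts fixed at 1
    n = len(xs)
    for step in range(steps):
        ga = gb = gc = gd = 0.0
        for x, y in zip(xs, ys):
            var = math.exp(c * x + d)
            resid = y - (a * x + b)
            g_mean = -resid / var                         # dNLL / dmean
            g_log_var = 0.5 * (1.0 - resid ** 2 / var)    # dNLL / dlog_var
            ga += g_mean * x / n
            gb += g_mean / n
            gc += g_log_var * x / n
            gd += g_log_var / n
        # The mean parameters are updated in every step.
        a -= lr * (ga + 2 * reg_mean * a)
        b -= lr * gb
        # The variance parameters stay untouched during the warm-up.
        if step >= warmup_steps:
            c -= lr * (gc + 2 * reg_var * c)
            d -= lr * gd
    return a, b, c, d


# Synthetic heteroscedastic data (assumed): y = 2x + noise with log-variance x.
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(200)]
ys = [2.0 * x + rng.gauss(0.0, math.exp(0.5 * x)) for x in xs]

params = train(xs, ys)
print("NLL before training:", average_nll((0.0, 0.0, 0.0, 0.0), xs, ys))
print("NLL after training: ", average_nll(params, xs, ys))
```

Predicting the log-variance rather than the variance itself is one common way to keep the variance positive; it is a choice made for this sketch, not something stated in the abstract. Setting `warmup_steps` equal to `steps` reduces the procedure to fitting the mean alone under unit variance.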

