Batch Inverse-Variance Weighting: Deep Heteroscedastic Regression

07/09/2021
by Vincent Mai, et al.

Heteroscedastic regression is a supervised learning task in which each label is subject to noise from a different distribution. This noise can be caused by the labelling process, and it degrades the performance of the learning algorithm because it violates the i.i.d. assumption. In many situations, however, the labelling process can estimate the variance of the noise distribution for each label, which can be used as additional information to mitigate this impact. Building on the Gauss-Markov theorem, we adapt an inverse-variance weighted mean squared error for parameter optimization in neural networks. We introduce Batch Inverse-Variance (BIV), a loss function that is robust to near-ground-truth samples and allows control of the effective learning rate. Our experimental results show that BIV significantly improves the performance of the networks on two noisy datasets compared to the L2 loss, inverse-variance weighting, and a filtering-based baseline.
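The loss described here is, at heart, a mini-batch version of inverse-variance weighting applied to the squared error. As a concrete illustration, below is a minimal PyTorch sketch written from the abstract's description alone: each sample is weighted by 1/(σ² + ε) and the weights are normalized over the mini-batch. The function name biv_loss and the value of eps are our own choices for this sketch, not the authors'; ε is the knob that bounds the weight of near-ground-truth samples (σ² ≈ 0) and keeps the effective learning rate under control.

```python
import torch

def biv_loss(pred: torch.Tensor, target: torch.Tensor,
             noise_var: torch.Tensor, eps: float = 0.1) -> torch.Tensor:
    """Batch Inverse-Variance loss (illustrative sketch).

    pred, target : shape (batch,), network outputs and noisy labels
    noise_var    : shape (batch,), label-noise variance reported by the
                   labelling process for each sample
    eps          : variance floor; bounds the weight of near-ground-truth
                   samples (noise_var ~ 0), hence the effective learning rate
    """
    weights = 1.0 / (noise_var + eps)   # inverse-variance weights
    weights = weights / weights.sum()   # normalize over the mini-batch
    return torch.sum(weights * (pred - target) ** 2)

# Toy usage with made-up numbers: the third label is near ground truth,
# so without eps its weight would diverge and swamp the batch.
pred = torch.tensor([1.2, 0.4, 2.0])
target = torch.tensor([1.0, 0.5, 1.0])
noise_var = torch.tensor([0.01, 4.0, 0.0])
print(biv_loss(pred, target, noise_var))
```

Because the weights are normalized to sum to one, the loss stays on the same scale as an ordinary mean squared error regardless of the batch's overall noise level; an unnormalized 1/σ² weighting would instead scale the gradient, and hence the effective learning rate, with the noise in each batch.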

Related research

04/07/2021 · Harmless label noise and informative soft-labels in supervised classification
Manual labelling of training examples is common practice in supervised l...

01/19/2021 · Variance Based Samples Weighting for Supervised Deep Learning
In the context of supervised learning of a function by a Neural Network ...

01/05/2022 · Sample Efficient Deep Reinforcement Learning via Uncertainty Estimation
In model-free deep reinforcement learning (RL) algorithms, using noisy v...

04/01/2022 · Unimodal-Concentrated Loss: Fully Adaptive Label Distribution Learning for Ordinal Regression
Learning from a label distribution has achieved promising results on ord...

02/07/2022 · Noise Regularizes Over-parameterized Rank One Matrix Recovery, Provably
We investigate the role of noise in optimization algorithms for learning...

03/26/2022 · A Robust Optimization Method for Label Noisy Datasets Based on Adaptive Threshold: Adaptive-k
SGD does not produce robust results on datasets with label noise. Becaus...

05/10/2023 · Supervised learning with probabilistic morphisms and kernel mean embeddings
In this paper I propose a concept of a correct loss function in a genera...
