Detecting Label Noise via Leave-One-Out Cross-Validation

03/21/2021
by Yu-Hang Tang, et al.

We present a simple algorithm for identifying and correcting real-valued noisy labels in a mixture of clean and corrupted sample points using Gaussian process regression (GPR). We employ a heteroscedastic noise model in which each observed label carries its own additive Gaussian noise term with an independent variance. Optimizing the noise model by maximum likelihood estimation keeps the GPR model's predictive error in leave-one-out cross-validation within the posterior standard deviation. We propose a multiplicative update scheme for solving the maximum likelihood estimation problem under non-negativity constraints. Although we prove convergence only for certain special cases, the multiplicative scheme has converged monotonically in virtually all of our numerical experiments. We show that the method can pinpoint corrupted sample points and yields better regression models on synthetic and real-world scientific data sets.
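The leave-one-out quantities mentioned in the abstract have a well-known closed form for a GP with per-label noise variances, so no refitting is needed. The following is a minimal sketch of that computation only; it is not the authors' multiplicative update scheme, which the abstract does not spell out. The RBF kernel, its length scale, the fixed noise variances, the synthetic data, and the names `rbf_kernel` and `loo_residuals` are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(x1, x2, length_scale=1.0):
    """Squared-exponential kernel on 1-D inputs (an assumed kernel choice)."""
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)


def loo_residuals(K, y, noise_var):
    """Closed-form leave-one-out residuals and predictive standard deviations.

    With K_y = K + diag(noise_var), the LOO residual of point i is
    [K_y^{-1} y]_i / [K_y^{-1}]_ii and the LOO predictive variance of the
    noisy label y_i is 1 / [K_y^{-1}]_ii.
    """
    Ky_inv = np.linalg.inv(K + np.diag(noise_var))
    diag = np.diag(Ky_inv)
    resid = (Ky_inv @ y) / diag      # y_i minus the LOO predictive mean
    std = np.sqrt(1.0 / diag)        # LOO predictive standard deviation
    return resid, std


# Synthetic example: a smooth signal with a few corrupted labels.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 40)
y = np.sin(x) + 0.05 * rng.standard_normal(x.size)
corrupted = np.array([7, 22, 33])            # hypothetical corrupted indices
y[corrupted] += np.array([2.0, -2.5, 3.0])

K = rbf_kernel(x, x, length_scale=1.0)
noise_var = np.full(x.size, 0.05 ** 2)       # fixed, homoscedastic starting point

resid, std = loo_residuals(K, y, noise_var)
z = np.abs(resid) / std                      # standardized LOO residuals
ranked = np.argsort(z)[::-1]                 # most suspicious points first
print("most suspicious indices:", ranked[:5])
```

Points whose standardized residual is large are candidates for corrupted labels. Under a single shared noise variance, the neighbors of a corrupted point can also show inflated residuals, because the corrupted label distorts their leave-one-out predictions; allowing a separate noise variance per label, as the abstract describes, is one way to absorb the corruption at the offending point instead.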


research
06/09/2021

Robust Prediction Interval estimation for Gaussian Processes by Cross-Validation method

Probabilistic regression models typically use the Maximum Likelihood Est...
research
01/08/2021

Fast calculation of Gaussian Process multiple-fold cross-validation residuals and their covariances

We generalize fast Gaussian process leave-one-out formulae to multiple-f...
research
01/26/2022

Improved Maximum Likelihood Estimation of ARMA Models

In this paper we propose a new optimization model for maximum likelihood...
research
10/19/2022

Autoregressive Generative Modeling with Noise Conditional Maximum Likelihood Estimation

We introduce a simple modification to the standard maximum likelihood es...
research
02/27/2017

Scalable and Distributed Clustering via Lightweight Coresets

Coresets are compact representations of data sets such that models train...
research
09/15/2020

Interpolating the Trace of the Inverse of Matrix 𝐀 + t 𝐁

We develop heuristic interpolation methods for the function t ↦trace( (𝐀...
research
03/01/2023

Understanding the Diffusion Objective as a Weighted Integral of ELBOs

Diffusion models in the literature are optimized with various objectives...
