Robust covariance estimation with missing values and cell-wise contamination

06/01/2023
by Karim Lounici, et al.

Large datasets are often affected by cell-wise outliers in the form of missing or erroneous data. However, discarding every sample that contains an outlier may leave a dataset too small to estimate the covariance matrix accurately. Moreover, most robust procedures designed to address this problem are not effective on high-dimensional data, as they rely crucially on invertibility of the covariance operator. In this paper, we propose an unbiased estimator for the covariance in the presence of missing values that does not require any imputation step and still achieves minimax statistical accuracy in the operator norm. We also advocate for its use in combination with cell-wise outlier detection methods to tackle cell-wise contamination in a high-dimensional and low-rank setting, where state-of-the-art methods may suffer from numerical instability and long computation times. To complement our theoretical findings, we conducted an experimental study which demonstrates the superiority of our approach over the state of the art in both low- and high-dimensional settings.
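
To make the idea of an imputation-free, unbiased covariance estimate concrete, here is a minimal sketch of the classical debiasing correction for zero-filled data. It is not necessarily the estimator proposed in the paper. It assumes centered data, entries missing completely at random, and a single known observation probability; the names `debiased_covariance` and `delta` are illustrative assumptions. Under these assumptions the zero-filled empirical covariance has expectation delta times the true value on the diagonal and delta squared times the true value off the diagonal, and the correction rescales each part accordingly.

```python
import numpy as np


def debiased_covariance(Y, mask, delta):
    """Debias the empirical covariance of zero-filled, centered data.

    Y     : (n, p) array; missing cells may hold any value (including NaN).
    mask  : (n, p) array, truthy where the cell is observed.
    delta : assumed common probability that a cell is observed (0 < delta <= 1).
    """
    mask = np.asarray(mask, dtype=bool)
    # Zero-fill unobserved cells; no imputation is performed.
    Z = np.where(mask, np.asarray(Y, dtype=float), 0.0)
    n = Z.shape[0]
    # Empirical covariance of the zero-filled data (data assumed centered).
    S = Z.T @ Z / n
    D = np.diag(np.diag(S))
    # E[S] = delta * diag(Sigma) + delta^2 * offdiag(Sigma), so undo both factors.
    return (1.0 / delta - 1.0 / delta**2) * D + S / delta**2


if __name__ == "__main__":
    # Small simulation under the stated assumptions (hypothetical settings).
    rng = np.random.default_rng(0)
    n, p, delta = 5000, 4, 0.7
    A = rng.standard_normal((p, p))
    Sigma = A @ A.T
    X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
    mask = rng.random((n, p)) < delta
    Sigma_hat = debiased_covariance(X, mask, delta)
    # Operator-norm error of the debiased estimate.
    print(np.linalg.norm(Sigma_hat - Sigma, ord=2))
```

With `delta = 1` and a fully observed mask, the function reduces to the ordinary empirical covariance of centered data. In practice `delta` is usually unknown and would have to be estimated, for example from the fraction of observed cells.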

research
04/10/2011

Slicing: Nonsingular Estimation of High Dimensional Covariance Matrices Using Multiway Kronecker Delta Covariance Structures

Nonsingular estimation of high dimensional covariance matrices is an imp...
research
07/22/2021

Robust low-rank covariance matrix estimation with a general pattern of missing values

This paper tackles the problem of robust covariance matrix estimation wh...
research
12/14/2017

Fast robust correlation for high dimensional data

The product moment covariance is a cornerstone of multivariate data anal...
research
08/20/2019

Optimal estimation of functionals of high-dimensional mean and covariance matrix

Motivated by portfolio allocation and linear discriminant analysis, we c...
research
07/25/2023

Minimum regularized covariance trace estimator and outlier detection for functional data

In this paper, we propose the Minimum Regularized Covariance Trace (MRCT...
research
01/17/2018

Robust Modifications of U-statistics and Applications to Covariance Estimation Problems

Let Y be a d-dimensional random vector with unknown mean μ and covarianc...
research
06/01/2014

l_1-regularized Outlier Isolation and Regression

This paper proposed a new regression model called l_1-regularized outlie...
