Outlier detection in non-elliptical data by kernel MRCD

08/07/2020
by Peter J. Rousseeuw, et al.

The minimum regularized covariance determinant method (MRCD) is a robust estimator for multivariate location and scatter, which detects outliers by fitting a robust covariance matrix to the data. Its regularization ensures that the covariance matrix is well-conditioned in any dimension. The MRCD assumes that the non-outlying observations are roughly elliptically distributed, but many datasets are not of that form. Moreover, the computation time of MRCD increases substantially as the number of variables grows, and datasets with many variables are now common. The proposed Kernel Minimum Regularized Covariance Determinant (KMRCD) estimator addresses both issues. It is not restricted to elliptical data because it implicitly computes the MRCD estimates in a kernel-induced feature space. A fast algorithm is constructed that starts from kernel-based initial estimates and exploits the kernel trick to speed up the subsequent computations. Based on the KMRCD estimates, a rule is proposed to flag outliers. The KMRCD algorithm performs well in simulations, and is illustrated on real-life data.
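To make the "kernel trick" mentioned in the abstract concrete, here is a minimal sketch in Python. It is not the KMRCD algorithm and does not come from the paper; it only illustrates the general idea that a feature-space quantity (here, the squared distance of each point to the feature-space mean) can be computed from the kernel matrix alone, without forming the feature map. The function names `rbf_kernel` and `feature_space_sq_dist_to_mean` and the bandwidth `sigma` are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X, sigma=1.0):
    """Gaussian (RBF) kernel matrix; sigma is an assumed bandwidth."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def feature_space_sq_dist_to_mean(K):
    """Squared distance of each point to the feature-space mean,
    using only kernel evaluations:
        ||phi(x_i) - m||^2 = K_ii - (2/n) sum_j K_ij + (1/n^2) sum_jk K_jk
    """
    return np.diag(K) - 2.0 * K.mean(axis=1) + K.mean()

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))

# Sanity check with the linear kernel, where the feature space is the
# input space, so the result can be compared with a direct computation.
K_lin = X @ X.T
d_kernel = feature_space_sq_dist_to_mean(K_lin)
d_explicit = np.sum((X - X.mean(axis=0))**2, axis=1)

# With an RBF kernel the feature map is never formed explicitly.
d_rbf = feature_space_sq_dist_to_mean(rbf_kernel(X, sigma=1.5))
```

With the linear kernel the kernel-based distances agree with the direct computation, which is the property that lets kernel methods work in an implicit feature space. A robust estimator such as the one described above would additionally restrict such computations to a subset of the observations and regularize the scatter estimate; those steps are not shown here.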

research
07/19/2017

Regularization of the Kernel Matrix via Covariance Matrix Shrinkage Estimation

The kernel trick concept, formulated as an inner product in a feature sp...
research
10/12/2019

Real-time outlier detection for large datasets by RT-DetMCD

Modern industrial machines can generate gigabytes of data in seconds, fr...
research
07/25/2023

Minimum regularized covariance trace estimator and outlier detection for functional data

In this paper, we propose the Minimum Regularized Covariance Trace (MRCT...
research
02/04/2020

Robust Generative Restricted Kernel Machines using Weighted Conjugate Feature Duality

In the past decade, interest in generative models has grown tremendously...
research
12/28/2019

Flagging and handling cellwise outliers by robust estimation of a covariance matrix

We propose a method for detecting cellwise outliers. Given a robust cova...
research
12/14/2017

Fast robust correlation for high dimensional data

The product moment covariance is a cornerstone of multivariate data anal...
research
08/18/2021

Structure Parameter Optimized Kernel Based Online Prediction with a Generalized Optimization Strategy for Nonstationary Time Series

In this paper, sparsification techniques aided online prediction algorit...
