High-dimensional outlier detection using random projections

by P. Navarro-Esteban, et al.

Many methods exist in the literature for detecting outliers in multivariate data, but most of them require estimating the covariance matrix. The higher the dimension, the harder this estimation becomes, and in high dimensions it is impossible. To avoid estimating this matrix, we propose a novel random-projection-based procedure to detect outliers in Gaussian multivariate data. It consists of projecting the data onto several one-dimensional subspaces, in each of which an appropriate univariate outlier detection method is applied; this method is similar to Tukey's but uses a threshold that depends on the initial dimension and the sample size. The required number of projections is determined using sequential analysis. Simulated and real datasets illustrate the performance of the proposed method.
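The general idea of projecting onto random directions and applying a Tukey-style fence can be sketched as follows. This is only an illustrative sketch, not the procedure of the paper: the function name `flag_outliers`, the fixed fence constant `c`, and the fixed number of projections `n_projections` are assumptions made here. The abstract states that the threshold depends on the dimension and the sample size and that the number of projections is chosen by sequential analysis, but it does not give either rule, so both are replaced by plain constants.

```python
import math
import random
import statistics


def flag_outliers(X, n_projections=50, c=3.0, seed=0):
    """Flag rows of X that fall outside a Tukey-style fence in any projection.

    X is a list of equal-length lists of floats. The constants c and
    n_projections are hypothetical placeholders, not the paper's values.
    """
    rng = random.Random(seed)
    d = len(X[0])
    flagged = [False] * len(X)
    for _ in range(n_projections):
        # Draw a random direction and normalise it to unit length.
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(a * a for a in v))
        v = [a / norm for a in v]
        # Project every observation onto the one-dimensional subspace.
        y = [sum(a * b for a, b in zip(row, v)) for row in X]
        # Tukey-style fences built from the quartiles of the projections.
        q1, _median, q3 = statistics.quantiles(y, n=4)
        iqr = q3 - q1
        low, high = q1 - c * iqr, q3 + c * iqr
        for i, value in enumerate(y):
            if value < low or value > high:
                flagged[i] = True
    return flagged


# Usage: 200 standard Gaussian points in dimension 20 plus one shifted point.
data_rng = random.Random(1)
X = [[data_rng.gauss(0.0, 1.0) for _ in range(20)] for _ in range(200)]
X.append([10.0] * 20)
flags = flag_outliers(X)
print("shifted point flagged:", flags[-1])
print("other points flagged:", sum(flags[:-1]))
```

A point is flagged here as soon as it falls outside the fence in at least one projection, which is one simple way to combine the univariate decisions. No covariance matrix is estimated at any step, which is the property the abstract emphasises.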




Covariance matrix testing in high dimension using random projections

Estimation and hypothesis tests for the covariance matrix in high dimens...

Random Subspace Learning Approach to High-Dimensional Outliers Detection

We introduce and develop a novel approach to outlier detection based on ...

Scalable Multiple Changepoint Detection for Functional Data Sequences

We propose the Multiple Changepoint Isolation (MCI) method for detecting...

The use of fourth order cumulant tensors to detect outlier features modelled by a t-Student copula

In this paper we use multivariate cumulant of order 4 to distinguish bet...

Matrix optimization based Euclidean embedding with outliers

Euclidean embedding from noisy observations containing outlier errors is...

Outlier detection in non-elliptical data by kernel MRCD

The minimum regularized covariance determinant method (MRCD) is a robust...

Real-time outlier detection for large datasets by RT-DetMCD

Modern industrial machines can generate gigabytes of data in seconds, fr...