Robust selection of predictors and conditional outlier detection in a perturbed large-dimensional regression context

04/25/2021
by Matteo Farnè, et al.

This paper presents ROBOUT, a fast methodology for identifying outliers in a response variable conditional on a set of linearly related predictors retrieved from a large granular dataset. ROBOUT is shown to be effective and particularly versatile compared to existing methods in the presence of a number of idiosyncratic data features. It can identify observations with outlying conditional variance when the dataset contains element-wise sparse variables and the set of predictors contains multivariate outliers. Existing integrated methodologies such as SPARSE-LTS and RLARS are systematically sub-optimal under those conditions. ROBOUT comprises three stages: a robust selection of the statistically relevant predictors (using a Huber or a quantile loss), the estimation of a robust regression model on the selected predictors (by LTS, GS or MM), and a criterion that identifies conditional outliers from a robust measure of the residuals' dispersion. We conduct a comprehensive simulation study in which the different variants of the proposed algorithm are tested under an exhaustive set of perturbation scenarios. The methodology is also applied to a granular supervisory banking dataset collected by the European Central Bank.
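
To make the notion of a conditional outlier concrete, the following is a minimal sketch of the last stage only: flagging observations whose residuals are large relative to a robust dispersion measure. It is not the authors' implementation. The abstract does not state which dispersion measure or cutoff ROBOUT uses, so the scaled median absolute deviation (MAD) and the cutoff of 3 are assumptions, and a single-predictor Theil-Sen fit stands in for the LTS, GS or MM regression. The function names `theil_sen` and `flag_conditional_outliers` are hypothetical.

```python
from statistics import median


def theil_sen(x, y):
    """Robust line fit: median of pairwise slopes, then median intercept.

    Stand-in for the robust regression stage (assumption: the paper uses
    LTS, GS or MM, not Theil-Sen).
    """
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i in range(len(x))
        for j in range(i + 1, len(x))
        if x[j] != x[i]
    ]
    slope = median(slopes)
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return intercept, slope


def flag_conditional_outliers(x, y, cutoff=3.0):
    """Return indices whose residual is extreme relative to the scaled MAD.

    The MAD and the cutoff value are illustrative assumptions.
    """
    intercept, slope = theil_sen(x, y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    center = median(residuals)
    mad = median(abs(r - center) for r in residuals)
    scale = 1.4826 * mad  # makes the MAD consistent with the std. dev. under normality
    if scale == 0:
        raise ValueError("residual dispersion is zero; cannot standardise")
    return [i for i, r in enumerate(residuals) if abs(r - center) / scale > cutoff]


# Toy data: y is roughly 1 + 2x, except at index 7.
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [1.1, 2.8, 5.05, 7.15, 8.9, 11.2, 12.95, 9.0, 16.85, 19.1]
print(flag_conditional_outliers(x, y))  # [7]
```

The value 9.0 at index 7 lies well inside the marginal range of `y`, so a univariate check on the response would not flag it. It is outlying only given `x = 7`, where the fitted line predicts a value near 15, which is the kind of observation the abstract refers to as a conditional outlier.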

research
02/19/2014

Sparse Quantile Huber Regression for Efficient and Robust Estimation

We consider new formulations and methods for sparse quantile regression ...
research
09/08/2020

Conditional Uncorrelation and Efficient Non-approximate Subset Selection in Sparse Regression

Given m d-dimensional responsors and n d-dimensional predictors, sparse ...
research
04/22/2021

Conditional Selective Inference for Robust Regression and Outlier Detection using Piecewise-Linear Homotopy Continuation

In practical data analysis under noisy environment, it is common to firs...
research
11/01/2021

A robust partial least squares approach for function-on-function regression

The function-on-function linear regression model in which the response a...
research
09/28/2011

Robust Parametric Classification and Variable Selection by a Minimum Distance Criterion

We investigate a robust penalized logistic regression algorithm based on...
research
12/07/2019

Cellwise Robust M Regression

The cellwise robust M regression estimator is introduced as the first es...
research
03/08/2022

Detection and treatment of outliers for multivariate robust loss reserving

Traditional techniques for calculating outstanding claim liabilities suc...
