Robust penalized least squares of depth trimmed residuals regression for high-dimensional data

09/04/2023
by Yijun Zuo, et al.

Data in the big-data era pose two challenges: (i) the dimension p is often larger than the sample size n, and (ii) outliers or contaminated points are frequently hidden and more difficult to detect. Challenge (i) renders most conventional methods inapplicable and has therefore attracted tremendous attention from the statistics, computer science, and biomedical communities. Numerous penalized regression methods have been introduced as modern methods for analyzing high-dimensional data. Challenge (ii), by contrast, has received disproportionately little attention. Penalized regression methods do their job very well and are often expected to handle challenge (ii) at the same time. Most of them, however, can be broken down by a single outlier (or a single adversarially contaminated point), as revealed in this article, which systematically examines leading penalized regression methods in the literature in terms of their robustness and provides a quantitative assessment. Consequently, a novel robust penalized regression method based on the least sum of squares of depth-trimmed residuals is proposed and studied carefully. Experiments with simulated and real data reveal that the newly proposed method can outperform some leading competitors in estimation and prediction accuracy in the cases considered.
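The breakdown claim can be illustrated with a minimal sketch. It is not taken from the article: it uses a one-predictor ridge regression without intercept as a stand-in for a penalized least squares method, and the toy data, the penalty value `lam`, and the helper name `ridge_slope` are all assumptions made for illustration. Replacing a single response with ever more extreme values moves the estimate arbitrarily far from the value obtained on clean data.

```python
def ridge_slope(xs, ys, lam):
    """Closed-form ridge estimate for one predictor, no intercept.

    Minimizes sum((y - b*x)^2) + lam * b^2, whose solution is
    b = sum(x*y) / (sum(x^2) + lam).
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


# Hypothetical toy data generated exactly from y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * x for x in xs]
lam = 1.0  # assumed penalty level, chosen only for illustration

clean = ridge_slope(xs, ys, lam)

# Contaminate a single point: replace the last response with
# increasingly extreme values and re-estimate each time.
contaminated = []
for bad_y in (1e2, 1e4, 1e6):
    ys_bad = ys[:-1] + [bad_y]
    contaminated.append(ridge_slope(xs, ys_bad, lam))

print("clean estimate:", clean)
print("estimates with one outlier:", contaminated)
```

Because the estimate is linear in the responses, it grows without bound as the single contaminated response grows. A fixed penalty does not prevent this, which is the kind of sensitivity a trimmed-residual approach is meant to address.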

