High-dimensional Measurement Error Models for Lipschitz Loss

10/26/2022
by Xin Ma, et al.

Recently emerging large-scale biomedical data pose exciting opportunities for scientific discoveries. However, the ultrahigh dimensionality and non-negligible measurement errors in the data may create difficulties in estimation. Existing methods for high-dimensional covariates with measurement error are limited: they usually require knowledge of the noise distribution and focus on linear or generalized linear models. In this work, we develop high-dimensional measurement error models for a class of Lipschitz loss functions that encompasses logistic regression, hinge loss and quantile regression, among others. Our estimator is designed to minimize the L_1 norm among all estimators belonging to suitable feasible sets, without requiring any knowledge of the noise distribution. Subsequently, we generalize these estimators to a Lasso analog version that is computationally scalable to higher dimensions. We derive theoretical guarantees in terms of finite-sample statistical error bounds and sign consistency, even when the dimensionality increases exponentially with the sample size. Extensive simulation studies demonstrate superior performance compared to existing methods in classification and quantile regression problems. An application to a gender classification task based on brain functional connectivity in the Human Connectome Project data illustrates improved accuracy under our approach, as well as the ability to reliably identify significant brain connections that drive gender differences.
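
To make the phrase "Lipschitz loss" concrete, the sketch below writes out the three losses named in the abstract using their standard textbook definitions and checks numerically that their slopes are bounded. It is only an illustration under those assumptions: the function names (`logistic_loss`, `hinge_loss`, `check_loss`, `max_slope`) are hypothetical, and the code does not implement the paper's estimator or its feasible sets.

```python
import numpy as np

def logistic_loss(margin):
    # log(1 + exp(-margin)), evaluated in a numerically stable way
    return np.logaddexp(0.0, -margin)

def hinge_loss(margin):
    # max(0, 1 - margin), the loss used by support vector machines
    return np.maximum(0.0, 1.0 - margin)

def check_loss(residual, tau=0.5):
    # Quantile-regression check loss: u * (tau - 1{u < 0})
    return residual * (tau - (residual < 0))

def max_slope(loss, grid):
    # Largest absolute finite-difference slope on the grid; for a
    # Lipschitz loss this cannot exceed the Lipschitz constant.
    values = loss(grid)
    return np.max(np.abs(np.diff(values) / np.diff(grid)))

grid = np.linspace(-10.0, 10.0, 2001)
slopes = {
    "logistic": max_slope(logistic_loss, grid),
    "hinge": max_slope(hinge_loss, grid),
    "quantile (tau=0.9)": max_slope(lambda u: check_loss(u, tau=0.9), grid),
}

for name, slope in slopes.items():
    print(f"{name}: largest observed slope = {slope:.4f}")
```

The logistic and hinge losses are 1-Lipschitz in the margin, and the check loss is Lipschitz with constant max(tau, 1 - tau) in the residual, so the observed slopes should stay at or below those values.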

research
08/30/2018

Simulation-Selection-Extrapolation: Estimation in High-Dimensional Errors-in-Variables Models

This paper considers errors-in-variables models in a high-dimensional se...
research
04/09/2012

Non-asymptotic Oracle Inequalities for the High-Dimensional Cox Regression via Lasso

We consider the finite sample properties of the regularized high-dimensi...
research
10/21/2019

High-dimensional robust approximated M-estimators for mean regression with asymmetric data

Asymmetry along with heteroscedasticity or contamination often occurs wi...
research
12/30/2014

A General Framework for Robust Testing and Confidence Regions in High-Dimensional Quantile Regression

We propose a robust inferential procedure for assessing uncertainties of...
research
08/20/2021

Robust adaptive Lasso in high-dimensional logistic regression with an application to genomic classification of cancer patients

Penalized logistic regression is extremely useful for binary classificat...
research
10/10/2018

On the Properties of Simulation-based Estimators in High Dimensions

Considering the increasing size of available data, the need for statisti...
research
11/09/2017

Oracle inequalities for sign constrained generalized linear models

High-dimensional data have recently been analyzed because of data collec...