Outlier-robust sparse/low-rank least-squares regression and robust matrix completion

12/12/2020
by Philip Thompson, et al.

We consider high-dimensional least-squares regression when a fraction ϵ of the labels are contaminated by an arbitrary adversary. We analyze this problem in the statistical learning framework with a subgaussian distribution and a linear hypothesis class on the space of d_1× d_2 matrices; in particular, the noise is allowed to be heterogeneous. This framework includes sparse linear regression and low-rank trace regression. For a p-dimensional s-sparse parameter, we show that a convex regularized M-estimator using a sorted Huber-type loss achieves the near-optimal subgaussian rate √(slog(ep/s)/n)+√(log(1/δ)/n)+ϵlog(1/ϵ), with probability at least 1-δ. For a (d_1× d_2)-dimensional parameter with rank r, a nuclear-norm regularized M-estimator using the same sorted Huber-type loss achieves the subgaussian rate √(rd_1/n)+√(rd_2/n)+√(log(1/δ)/n)+ϵlog(1/ϵ), again optimal up to a log factor.

In the second part, we study the trace-regression problem when the parameter is the sum of a rank-r matrix and an s-sparse matrix, under a "low-spikeness" condition. Unlike the multivariate regression setting studied in previous work, the design in trace regression lacks positive-definiteness in high dimensions. Still, we show that a regularized least-squares estimator achieves the subgaussian rate √(rd_1/n)+√(rd_2/n)+√(slog(d_1d_2)/n)+√(log(1/δ)/n).

Lastly, we consider noisy matrix completion with non-uniform sampling when a fraction ϵ of the sampled low-rank matrix is corrupted by outliers. If only the low-rank matrix is of interest, we show that a nuclear-norm regularized Huber-type estimator achieves, up to log factors, the optimal rate adaptively to the corruption level. None of the above rates requires prior knowledge of (s, r, ϵ).
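To make the sparse setting concrete, the sketch below fits an ℓ1-regularized M-estimator with a plain (unsorted) Huber loss by proximal gradient descent. This is a simplified stand-in, not the paper's sorted Huber-type loss, and the function names (`robust_lasso`, `huber_grad`) are hypothetical; it only illustrates why clipping residuals tames a small fraction of adversarially corrupted labels.

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of the l1 norm: shrink each coordinate toward 0 by t.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def huber_grad(r, delta):
    # Derivative of the Huber loss w.r.t. the residual r:
    # identity for |r| <= delta, clipped to +/-delta beyond.
    return np.clip(r, -delta, delta)

def robust_lasso(X, y, lam, delta=1.0, n_iter=500):
    # Proximal-gradient (ISTA) minimization of
    #   (1/n) sum_i huber(y_i - x_i' beta; delta) + lam * ||beta||_1
    n, p = X.shape
    # Step size 1/L, with L = ||X||_2^2 / n bounding the loss smoothness.
    step = n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(p)
    for _ in range(n_iter):
        r = y - X @ beta
        g = -X.T @ huber_grad(r, delta) / n   # gradient of the smooth part
        beta = soft_threshold(beta - step * g, step * lam)
    return beta

# Demo: s-sparse signal, subgaussian design, a fraction of corrupted labels.
rng = np.random.default_rng(0)
n, p, s = 200, 50, 3
beta_true = np.zeros(p)
beta_true[:s] = 5.0
X = rng.standard_normal((n, p))
y = X @ beta_true + 0.1 * rng.standard_normal(n)
y[:10] += 50.0  # eps = 5% of labels shifted by an "adversary"
beta_hat = robust_lasso(X, y, lam=0.2)
```

Because the Huber gradient clips large residuals, the ten corrupted labels contribute a bounded pull on the estimate, and the soft-thresholding step keeps the off-support coordinates near zero; an unpenalized least-squares fit on the same data would be dragged far from beta_true.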


