Taming the heavy-tailed features by shrinkage and clipping

10/24/2017
by Ziwei Zhu, et al.

In this paper, we consider generalized linear models (GLMs) with heavy-tailed features and corruptions. Besides clipping the response, we propose to shrink the feature vector by its ℓ_4-norm in the low-dimensional regime and to clip each entry of the feature vector in the high-dimensional regime. Under bounded fourth moment assumptions, we show that the MLE based on the shrunk or clipped data achieves a nearly minimax optimal rate with an exponential deviation bound. Simulations demonstrate significant improvement in statistical performance from feature shrinkage and clipping in linear regression with heavy-tailed noise and in logistic regression with noisy labels. We also apply shrinkage to deep features of MNIST images and find that classifiers trained on shrunk deep features are fairly robust to noisy labels: they achieve 0.9% testing error in the presence of 40% mislabeled data.
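The preprocessing steps named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas or thresholds, so everything specific here is an assumption: the function names (`shrink_l4`, `clip_entries`, `clip_response`) are hypothetical, shrinkage is taken to mean rescaling a feature vector so that its ℓ_4-norm does not exceed a threshold `tau`, clipping is taken to mean truncating values to `[-tau, tau]`, and the threshold values are arbitrary rather than tuned as the paper's theory would prescribe.

```python
import numpy as np

def shrink_l4(X, tau):
    """Rescale each row of X so that its l4-norm is at most tau (assumed form)."""
    norms = np.sum(np.abs(X) ** 4, axis=1) ** 0.25
    # Rows already inside the l4-ball of radius tau are left unchanged.
    scale = np.minimum(1.0, tau / np.maximum(norms, 1e-12))
    return X * scale[:, None]

def clip_entries(X, tau):
    """Truncate every entry of X to the interval [-tau, tau]."""
    return np.clip(X, -tau, tau)

def clip_response(y, tau):
    """Truncate every response to the interval [-tau, tau]."""
    return np.clip(y, -tau, tau)

# Synthetic heavy-tailed linear regression data (illustrative only).
rng = np.random.default_rng(0)
n, d = 500, 5
X = rng.standard_t(df=4.5, size=(n, d))        # finite fourth moment, heavy tails
beta = np.ones(d)
y = X @ beta + rng.standard_t(df=4.5, size=n)  # heavy-tailed noise

# Low-dimensional regime: shrink features, clip responses, then fit.
# For the Gaussian linear model, the MLE is ordinary least squares.
X_shrunk = shrink_l4(X, tau=3.0)               # tau values are arbitrary here
y_clipped = clip_response(y, tau=10.0)
beta_hat = np.linalg.lstsq(X_shrunk, y_clipped, rcond=None)[0]

# High-dimensional regime: clip entries instead of shrinking whole rows.
X_clipped = clip_entries(X, tau=3.0)
```

Shrinking rescales the whole vector and so preserves its direction, whereas entrywise clipping treats each coordinate separately; the abstract assigns the former to the low-dimensional regime and the latter to the high-dimensional regime.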


research
05/02/2018

ℓ_1-regression with Heavy-tailed Distributions

In this paper, we consider the problem of linear regression with heavy-t...
research
07/23/2021

Robust Estimation of High-Dimensional Vector Autoregressive Models

High-dimensional time series data appear in many scientific areas in the...
research
05/23/2023

Two Results on Low-Rank Heavy-Tailed Multiresponse Regressions

This paper gives two theoretical results on estimating low-rank paramete...
research
09/16/2022

Truthful Generalized Linear Models

In this paper we study estimating Generalized Linear Models (GLMs) in th...
research
11/29/2022

Residual Permutation Test for High-Dimensional Regression Coefficient Testing

We consider the problem of testing whether a single coefficient is equal...
research
01/10/2022

Differentially Private ℓ_1-norm Linear Regression with Heavy-tailed Data

We study the problem of Differentially Private Stochastic Convex Optimiz...
research
04/04/2019

Multivariate outlier detection based on a robust Mahalanobis distance with shrinkage estimators

A collection of robust Mahalanobis distances for multivariate outlier de...
