Query Complexity of Least Absolute Deviation Regression via Robust Uniform Convergence

02/03/2021
by   Xue Chen, et al.

Consider a regression problem where the learner is given a large collection of d-dimensional data points, but can only query a small subset of the real-valued labels. How many queries are needed to obtain a 1+ϵ relative error approximation of the optimum? While this problem has been extensively studied for least squares regression, little is known for other losses. An important example is least absolute deviation regression (ℓ_1 regression) which enjoys superior robustness to outliers compared to least squares. We develop a new framework for analyzing importance sampling methods in regression problems, which enables us to show that the query complexity of least absolute deviation regression is Θ(d/ϵ^2) up to logarithmic factors. We further extend our techniques to show the first bounds on the query complexity for any ℓ_p loss with p∈(1,2). As a key novelty in our analysis, we introduce the notion of robust uniform convergence, which is a new approximation guarantee for the empirical loss. While it is inspired by uniform convergence in statistical learning, our approach additionally incorporates a correction term to avoid unnecessary variance due to outliers. This can be viewed as a new connection between statistical learning theory and variance reduction techniques in stochastic optimization, which should be of independent interest.
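
The setting described above can be written out as follows. This is a sketch in notation the abstract does not fix: the data matrix A with rows a_i, the hidden label vector b, the query budget m, and the sampling probabilities p_i are all assumed names, and the reweighted loss shown is the generic importance sampling estimator rather than the paper's specific construction.

```latex
% Assumed notation: A \in \mathbb{R}^{n \times d} has rows a_i^\top (known to the learner);
% b \in \mathbb{R}^n holds the labels, each of which costs one query to reveal.
\[
  L(x) \;=\; \|Ax - b\|_1 \;=\; \sum_{i=1}^{n} \bigl| a_i^\top x - b_i \bigr|,
  \qquad
  x^\star \in \arg\min_{x \in \mathbb{R}^d} L(x).
\]

% Goal: after querying only m \ll n labels, output \tilde{x} with relative error 1+\epsilon.
\[
  L(\tilde{x}) \;\le\; (1+\epsilon)\, L(x^\star).
\]

% Generic importance sampling: draw i_1, \dots, i_m i.i.d. from a distribution p over
% \{1, \dots, n\} with every p_i > 0, chosen from A alone since b is not yet known,
% query b_{i_1}, \dots, b_{i_m}, and minimize the reweighted empirical loss.
\[
  \tilde{L}(x) \;=\; \frac{1}{m} \sum_{j=1}^{m}
    \frac{\bigl| a_{i_j}^\top x - b_{i_j} \bigr|}{p_{i_j}},
  \qquad
  \mathbb{E}\bigl[\tilde{L}(x)\bigr] \;=\; L(x) \quad \text{for each fixed } x.
\]
```

Unbiasedness holds only pointwise, for each fixed x. Controlling the deviation of the empirical loss simultaneously over all x, without paying for the variance that outlier labels contribute, is the role the abstract assigns to robust uniform convergence.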


research
02/09/2018

Robust and Sparse Regression in GLM by Stochastic Optimization

The generalized linear model (GLM) plays a key role in regression analys...
research
02/13/2018

Online Variance Reduction for Stochastic Optimization

Modern stochastic optimization methods often rely on uniform sampling wh...
research
09/27/2020

Robust regression with covariate filtering: Heavy tails and adversarial contamination

We study the problem of linear regression where both covariates and resp...
research
03/12/2018

M-estimation in high-dimensional linear model

We mainly study the M-estimation method for the high-dimensional linear ...
research
07/18/2012

Stochastic optimization and sparse statistical recovery: An optimal algorithm for high dimensions

We develop and analyze stochastic optimization algorithms for problems i...
research
11/20/2016

Dealing with Range Anxiety in Mean Estimation via Statistical Queries

We give algorithms for estimating the expectation of a given real-valued...
research
01/24/2014

The Sampling-and-Learning Framework: A Statistical View of Evolutionary Algorithms

Evolutionary algorithms (EAs), a large class of general purpose optimiza...
