Suboptimality of Constrained Least Squares and Improvements via Non-Linear Predictors

by Tomas Vaškevičius, et al.

We study the problem of predicting as well as the best linear predictor in a bounded Euclidean ball with respect to the squared loss. When only boundedness of the data generating distribution is assumed, we establish that the least squares estimator constrained to a bounded Euclidean ball does not attain the classical O(d/n) excess risk rate, where d is the dimension of the covariates and n is the number of samples. In particular, we construct a bounded distribution such that the constrained least squares estimator incurs an excess risk of order Ω(d^{3/2}/n), hence refuting a recent conjecture of Ohad Shamir [JMLR 2015]. In contrast, we observe that non-linear predictors can achieve the optimal rate O(d/n) with no assumptions on the distribution of the covariates. We discuss additional distributional assumptions sufficient to guarantee an O(d/n) excess risk rate for the least squares estimator. Among them are certain moment equivalence assumptions often used in the robust statistics literature. While such assumptions are central in the analysis of unbounded and heavy-tailed settings, our work indicates that in some cases, they also rule out unfavorable bounded distributions.
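
For readers who want a concrete picture of the estimator discussed above, the following is a minimal sketch of least squares constrained to a Euclidean ball, computed by projected gradient descent. The abstract does not specify a solver, so the choice of projected gradient descent, the function name `constrained_least_squares`, and the parameters `radius` and `n_iter` are all assumptions made for illustration only.

```python
import numpy as np


def constrained_least_squares(X, y, radius=1.0, n_iter=2000):
    """Approximately minimize (1/n) * ||X w - y||^2 subject to ||w||_2 <= radius.

    Illustrative sketch only: projected gradient descent is an assumed
    solver, not a method taken from the paper.
    """
    n, d = X.shape
    w = np.zeros(d)
    # Lipschitz constant of the gradient of the empirical squared loss;
    # np.linalg.norm(X, 2) is the largest singular value of X.
    lipschitz = 2.0 * np.linalg.norm(X, 2) ** 2 / n
    if lipschitz == 0.0:
        return w  # all covariates are zero, so any w in the ball is optimal
    for _ in range(n_iter):
        grad = 2.0 * X.T @ (X @ w - y) / n
        w = w - grad / lipschitz
        # Euclidean projection onto the ball of the given radius.
        norm = np.linalg.norm(w)
        if norm > radius:
            w = w * (radius / norm)
    return w
```

The returned vector always lies in the ball of the given radius. When the unconstrained least squares solution already lies inside the ball, the projection step is inactive at the optimum and the iterates converge to that solution.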



