On the Variance, Admissibility, and Stability of Empirical Risk Minimization

05/29/2023
by Gil Kur, et al.

It is well known that Empirical Risk Minimization (ERM) with squared loss may attain minimax-suboptimal error rates (Birgé and Massart, 1993). The key message of this paper is that, under mild assumptions, the suboptimality of ERM must be due to large bias rather than variance. More precisely, in the bias-variance decomposition of the squared error of the ERM, the variance term necessarily enjoys the minimax rate. In the case of fixed design, we provide an elementary proof of this fact using the probabilistic method. Then, we prove this result for various models in the random design setting. In addition, we provide a simple proof of Chatterjee's admissibility theorem (Chatterjee, 2014, Theorem 1.4), which states that, in the fixed design setting, ERM cannot be ruled out as an optimal method, and we extend this result to the random design setting. We also show that our estimates imply stability of ERM, complementing the main result of Caponnetto and Rakhlin (2006) for non-Donsker classes. Finally, we show that for non-Donsker classes, there are functions close to the ERM, yet far from being almost-minimizers of the empirical loss, highlighting the somewhat irregular nature of the loss landscape.
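
To make the bias-variance decomposition mentioned in the abstract concrete, the following is a minimal simulation sketch, not the paper's construction. It assumes a fixed design with Gaussian noise and takes the class of nondecreasing sequences as an illustrative function class, for which the squared-loss ERM is isotonic regression. The names `isotonic_erm` and `bias_variance`, the choice of class, and all parameter values are hypothetical choices made for this example. The sketch estimates the squared error of the ERM by Monte Carlo and splits it into a squared-bias term and a variance term.

```python
import random


def isotonic_erm(y):
    """Squared-loss ERM over nondecreasing sequences (pool-adjacent-violators).

    Illustrative function class only; the source does not single it out.
    """
    blocks = []  # each block stores [sum of pooled values, number of values]
    for value in y:
        blocks.append([value, 1])
        # Pool the last two blocks while the earlier block's mean exceeds
        # the later one's (compared by cross-multiplication, counts > 0).
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fit = []
    for total, count in blocks:
        fit.extend([total / count] * count)
    return fit


def bias_variance(f_star, sigma=1.0, trials=500, seed=0):
    """Monte Carlo estimate of (squared error, squared bias, variance) of the ERM.

    All three quantities are averaged over the n design points.
    """
    rng = random.Random(seed)
    n = len(f_star)
    fits = []
    for _ in range(trials):
        # Fixed design: only the noise is redrawn in each trial.
        y = [m + rng.gauss(0.0, sigma) for m in f_star]
        fits.append(isotonic_erm(y))
    # Empirical stand-in for the expected ERM at each design point.
    mean_fit = [sum(f[i] for f in fits) / trials for i in range(n)]
    risk = sum((f[i] - f_star[i]) ** 2 for f in fits for i in range(n)) / (trials * n)
    bias_sq = sum((mean_fit[i] - f_star[i]) ** 2 for i in range(n)) / n
    variance = sum((f[i] - mean_fit[i]) ** 2 for f in fits for i in range(n)) / (trials * n)
    return risk, bias_sq, variance


if __name__ == "__main__":
    n = 50
    f_star = [i / n for i in range(n)]  # hypothetical monotone regression function
    risk, bias_sq, variance = bias_variance(f_star)
    print(f"squared error: {risk:.4f}")
    print(f"bias^2 + variance: {bias_sq + variance:.4f}")
```

Because the mean of the ERM is replaced by its average over the simulated trials, the estimated squared error equals the sum of the estimated squared bias and variance up to floating-point rounding. The sketch only illustrates the decomposition itself; it says nothing about the rates established in the paper.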

research
01/18/2022

On the minimax rate of the Gaussian sequence model under bounded convex constraints

We determine the exact minimax rate of a Gaussian sequence model under b...
research
09/07/2016

Chaining Bounds for Empirical Risk Minimization

This paper extends the standard chaining technique to prove excess risk ...
research
02/08/2019

Beyond Least-Squares: Fast Rates for Regularized Empirical Risk Minimization through Self-Concordance

We consider learning methods based on the regularization of a convex emp...
research
01/05/2021

A unifying approach on bias and variance analysis for classification

Standard bias and variance (B&V) terminologies were originally defined...
research
09/01/2022

Testing for the Important Components of Posterior Predictive Variance

We give a decomposition of the posterior predictive variance using the l...
research
02/24/2021

On the Minimal Error of Empirical Risk Minimization

We study the minimal error of the Empirical Risk Minimization (ERM) proc...
research
01/10/2023

A Unified Theory of Diversity in Ensemble Learning

We present a theory of ensemble diversity, explaining the nature and eff...
