The Implicit Regularization of Ordinary Least Squares Ensembles

10/10/2019
by Daniel LeJeune et al.

Ensemble methods that average over a collection of independent predictors, each fit on a random subsample of both the examples and the features of the training data, command a significant presence in machine learning; the ever-popular random forest is the canonical example. Yet the nature of the subsampling effect, particularly of the features, is not well understood. We study the case of an ensemble of linear predictors, where each individual predictor is fit using ordinary least squares on a random submatrix of the data matrix. We show that, under standard Gaussianity assumptions, when the number of features selected for each predictor is optimally tuned, the asymptotic risk of a large ensemble is equal to the asymptotic ridge regression risk, which is known to be optimal among linear predictors in this setting. In addition to eliciting this implicit regularization that results from subsampling, we also connect this ensemble to the dropout technique used in training deep (neural) networks, another strategy that has been shown to have a ridge-like regularizing effect.
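
To make the construction concrete, here is a minimal sketch (not the authors' code) of such an OLS ensemble in Python/NumPy: each member draws a random subset of the rows (examples) and columns (features) of the data matrix, fits ordinary least squares on that submatrix, and the ensemble prediction is the average over members. The function names and the subsampling fractions row_frac and col_frac are illustrative choices, not parameters taken from the paper.

```python
import numpy as np

def fit_ols_ensemble(X, y, n_estimators=500, row_frac=0.8, col_frac=0.4, seed=None):
    """Fit an ensemble of OLS predictors, each on a random submatrix of (X, y)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    n_rows = max(1, int(row_frac * n))   # examples used by each member
    n_cols = max(1, int(col_frac * d))   # features used by each member
    members = []
    for _ in range(n_estimators):
        rows = rng.choice(n, size=n_rows, replace=False)
        cols = rng.choice(d, size=n_cols, replace=False)
        # Ordinary least squares on the selected submatrix; lstsq returns the
        # minimum-norm solution, which also covers the case n_rows < n_cols.
        coef, *_ = np.linalg.lstsq(X[np.ix_(rows, cols)], y[rows], rcond=None)
        members.append((cols, coef))
    return members

def predict_ols_ensemble(members, X):
    """Average the member predictions (unselected features contribute nothing)."""
    return np.mean([X[:, cols] @ coef for cols, coef in members], axis=0)

# Toy usage on synthetic Gaussian data (illustrative only).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
y = X @ rng.standard_normal(50) + rng.standard_normal(200)
y_hat = predict_ols_ensemble(fit_ols_ensemble(X, y, seed=1), X)
```

In terms of the abstract's result, the feature-subsampling rate (col_frac above) plays the role of the ridge penalty: when the number of features per member is optimally tuned, the large-ensemble risk matches that of optimally tuned ridge regression.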


