Gain Confidence, Reduce Disappointment: A New Approach to Cross-Validation for Sparse Regression

06/26/2023
by Ryan Cory-Wright, et al.

Ridge-regularized sparse regression involves selecting a subset of features that explains the relationship between a design matrix and an output vector in an interpretable manner. To select the sparsity and robustness of linear regressors, techniques like leave-one-out cross-validation are commonly used for hyperparameter tuning. However, cross-validation typically increases the cost of sparse regression by several orders of magnitude. Additionally, validation metrics are noisy estimators of the test-set error, with different hyperparameter combinations yielding models with different amounts of noise. Optimizing over these metrics is therefore vulnerable to out-of-sample disappointment, especially in underdetermined settings. To address this, we make two contributions. First, we leverage the generalization theory literature to propose confidence-adjusted variants of leave-one-out that display less propensity to out-of-sample disappointment. Second, we leverage ideas from the mixed-integer optimization literature to obtain computationally tractable relaxations of confidence-adjusted leave-one-out, thereby minimizing it without solving as many mixed-integer optimization problems (MIOs). Our relaxations give rise to an efficient coordinate descent scheme, which allows us to obtain significantly lower leave-one-out errors than via other methods in the literature. We validate our theory by demonstrating that we obtain solutions that are significantly sparser than, and comparably accurate to, those from popular methods like GLMNet, while suffering less out-of-sample disappointment. On synthetic datasets, our confidence-adjustment procedure generates significantly fewer false discoveries and improves out-of-sample performance by 2%-5% compared to cross-validating without confidence adjustment. Across a suite of 13 real datasets, a calibrated version of our procedure improves the test-set error by an average of 4% compared to cross-validating without confidence adjustment.
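As background for the leave-one-out criterion the abstract discusses, the sketch below illustrates the standard closed-form leave-one-out error for plain ridge regression, which avoids refitting the model n times. This is generic textbook machinery, not the paper's confidence-adjusted or mixed-integer method; the grid of penalties and the synthetic data are illustrative assumptions.

```python
import numpy as np

def ridge_loo_error(X, y, lam):
    """Exact leave-one-out MSE for ridge regression via the hat matrix.

    Uses the classical shortcut e_i = (y_i - yhat_i) / (1 - H_ii),
    where H = X (X^T X + lam I)^{-1} X^T, so no model is refit n times.
    Illustrative only; the paper's confidence-adjusted criterion and
    sparsity constraint are not implemented here.
    """
    n, p = X.shape
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)  # hat matrix
    resid = y - H @ y                       # in-sample residuals
    loo_resid = resid / (1.0 - np.diag(H))  # leave-one-out residuals
    return np.mean(loo_resid ** 2)

# Pick the ridge penalty with the lowest leave-one-out error on a small grid.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10))
beta = np.zeros(10)
beta[:3] = [2.0, -1.5, 1.0]                # sparse ground-truth coefficients
y = X @ beta + 0.5 * rng.standard_normal(50)
grid = [0.01, 0.1, 1.0, 10.0]
best_lam = min(grid, key=lambda lam: ridge_loo_error(X, y, lam))
```

Because the leave-one-out error is computed in closed form, sweeping the penalty grid costs one linear solve per candidate rather than n model refits per candidate, which is the efficiency concern the abstract raises.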
