Leave Zero Out: Towards a No-Cross-Validation Approach for Model Selection

by Weikai Li, et al.

As the main workhorse for model selection, Cross Validation (CV) has achieved empirical success thanks to its simplicity and intuitiveness. However, despite its ubiquity, CV often runs into two notorious dilemmas. On the one hand, for small data sets, CV yields a conservatively biased estimate, since part of the limited data has to be held out for validation. On the other hand, for large data sets, CV tends to be extremely cumbersome, e.g., intolerably time-consuming, because of the repeated training procedures. Naturally, a straightforward ambition is to validate models at far lower computational cost while making full use of the entire given data set for training. Thus, instead of holding out part of the given data, this paper strategically derives a cheap and theoretically guaranteed auxiliary/augmented validation. This embarrassingly simple strategy only needs to train models on the entire given data set once, making model selection considerably more efficient. In addition, the proposed validation approach suits a wide range of learning settings, because both the augmentation and the out-of-sample estimation are independent of the learning process. Finally, we demonstrate the accuracy and computational benefits of the proposed method through extensive evaluation on multiple data sets, models, and tasks.
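To make the cost argument concrete, the following is a minimal sketch, not the paper's method. It contrasts k-fold CV, which refits every candidate k times on partial data, with a train-once scheme that fits every candidate a single time on all data and scores it on an auxiliary validation set. The Gaussian-jitter augmentation, the 1-D ridge model, and every function name here (`make_data`, `fit_ridge_1d`, `kfold_select`, `train_once_select`) are illustrative assumptions; the abstract does not specify how the auxiliary validation is actually derived.

```python
import random


def make_data(n=60, true_w=2.0, noise=0.3, seed=0):
    """Synthetic 1-D regression data (hypothetical, for illustration only)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [true_w * x + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


def fit_ridge_1d(xs, ys, lam):
    """Closed-form 1-D ridge regression without intercept."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def mse(w, xs, ys):
    """Mean squared error of the linear predictor w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def kfold_select(xs, ys, lams, k=5):
    """Standard k-fold CV: each candidate is refit k times on partial data."""
    n = len(xs)
    fits = 0
    scores = {}
    for lam in lams:
        total = 0.0
        for fold in range(k):
            val_idx = set(range(fold, n, k))
            tr_x = [x for i, x in enumerate(xs) if i not in val_idx]
            tr_y = [y for i, y in enumerate(ys) if i not in val_idx]
            va_x = [xs[i] for i in sorted(val_idx)]
            va_y = [ys[i] for i in sorted(val_idx)]
            w = fit_ridge_1d(tr_x, tr_y, lam)
            fits += 1
            total += mse(w, va_x, va_y)
        scores[lam] = total / k
    return min(scores, key=scores.get), fits


def train_once_select(xs, ys, lams, jitter=0.1, seed=1):
    """Train-once selection: one fit per candidate on ALL data.

    ASSUMPTION: the auxiliary validation set is built by adding Gaussian
    jitter to the inputs. This is a placeholder, not the paper's derivation.
    """
    rng = random.Random(seed)
    aug_x = [x + rng.gauss(0.0, jitter) for x in xs]
    aug_y = list(ys)
    fits = 0
    scores = {}
    for lam in lams:
        w = fit_ridge_1d(xs, ys, lam)  # uses the entire data set
        fits += 1
        scores[lam] = mse(w, aug_x, aug_y)
    return min(scores, key=scores.get), fits


xs, ys = make_data()
lams = [0.0, 0.1, 1.0, 10.0]
best_cv, fits_cv = kfold_select(xs, ys, lams, k=5)
best_once, fits_once = train_once_select(xs, ys, lams)
print("k-fold CV fits:", fits_cv, "train-once fits:", fits_once)
```

With four candidates and k = 5, the k-fold routine performs 20 fits while the train-once routine performs 4. How well the train-once score tracks out-of-sample error depends entirely on how the auxiliary validation set is constructed, which is the part this sketch does not reproduce.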




