Train on Validation: Squeezing the Data Lemon

02/16/2018
by Guy Tennenholtz, et al.

Model selection on validation data is an essential step in machine learning. While mixing data between training and validation is considered taboo, practitioners often break this rule to increase performance. Here, we offer a simple, practical method for using the validation set for training, which allows for a continuous, controlled trade-off between performance and overfitting of model selection. We define on-average-validation-stable algorithms as those for which using small portions of validation data for training does not overfit the model selection process. We then prove that stable algorithms are also validation stable. Finally, we demonstrate our method on the MNIST and CIFAR-10 datasets using stable algorithms as well as state-of-the-art neural networks. Our results show a significant increase in test performance with a minor trade-off in bias introduced into the model selection process.
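The abstract does not spell out the procedure, so the following is only a rough sketch of the general idea: move a controllable fraction of the validation set into the training set, then run model selection on what remains. The function names (`move_validation_to_train`, `fit_ridge`, `select_model`), the ridge-regression model family, and the choice to score candidates on the held-back remainder are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def move_validation_to_train(X_train, y_train, X_val, y_val, fraction, seed=0):
    """Move a random `fraction` of the validation set into the training set.

    Hypothetical helper: `fraction` plays the role of the trade-off knob.
    fraction = 0 leaves the usual train/validation split untouched.
    """
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(X_val))
    k = int(round(fraction * len(X_val)))
    moved, kept = perm[:k], perm[k:]
    X_tr = np.concatenate([X_train, X_val[moved]])
    y_tr = np.concatenate([y_train, y_val[moved]])
    return X_tr, y_tr, X_val[kept], y_val[kept]


def fit_ridge(X, y, lam):
    """Closed-form ridge regression; stands in for any learning algorithm."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def select_model(X_tr, y_tr, X_va, y_va, lambdas):
    """Return (lambda, validation MSE, weights) with the lowest validation MSE."""
    best = None
    for lam in lambdas:
        w = fit_ridge(X_tr, y_tr, lam)
        err = float(np.mean((X_va @ w - y_va) ** 2))
        if best is None or err < best[1]:
            best = (lam, err, w)
    return best
```

Sweeping `fraction` from 0 upward and recording both the selected model's test error and the size of the remaining validation set gives one way to look at the trade-off the abstract describes: more training data on one side, a smaller and less reliable basis for model selection on the other.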


research
09/12/2009

A Nonconformity Approach to Model Selection for SVMs

We investigate the issue of model selection and the use of the nonconfor...
research
04/30/2015

Model Selection and Overfitting in Genetic Programming: Empirical Study [Extended Version]

Genetic Programming has been very successful in solving a large area of ...
research
05/24/2019

Perturbed Model Validation: A New Framework to Validate Model Relevance

This paper introduces PMV (Perturbed Model Validation), a new technique ...
research
07/01/2023

CMA-ES for Post Hoc Ensembling in AutoML: A Great Success and Salvageable Failure

Many state-of-the-art automated machine learning (AutoML) systems use gr...
research
12/08/2020

Robustness of Accuracy Metric and its Inspirations in Learning with Noisy Labels

For multi-class classification under class-conditional label noise, we p...
research
04/01/2020

Evaluation of Model Selection for Kernel Fragment Recognition in Corn Silage

Model selection when designing deep learning systems for specific use-ca...
research
12/24/2020

Leave Zero Out: Towards a No-Cross-Validation Approach for Model Selection

As the main workhorse for model selection, Cross Validation (CV) has ach...
