Have we been Naive to Select Machine Learning Models? Noisy Data are here to Stay!

07/14/2022
by Felipe Costa Farias, et al.

Model selection is usually treated as a single-criterion decision: we select the model that maximizes a specific metric on a specific set, such as performance on the validation set. We claim this is naive and can lead to poor selections of over-fitted models because of the over-searching phenomenon, which over-estimates performance on that specific set. Furthermore, real-world data contains noise, which the model selection procedure should take into account rather than ignore. We also define four theoretical optimality conditions that can be pursued to select models better, and we analyze them using a multi-criteria decision-making algorithm (TOPSIS) that considers proxies for the optimality conditions in order to select reasonable models.
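
To make the contrast between single-criterion and multi-criteria selection concrete, here is a minimal sketch of the standard TOPSIS ranking applied to candidate models. It is not the authors' implementation: the abstract does not list the four optimality conditions or their proxies, so the three criteria below (validation accuracy, train/validation gap, and score spread under noisy resampling), the equal weights, and all numbers are hypothetical placeholders chosen only for illustration.

```python
import numpy as np


def topsis(scores, weights, benefit):
    """Return TOPSIS closeness coefficients (higher is better).

    scores  : (n_models, n_criteria) decision matrix
    weights : (n_criteria,) relative importance of each criterion
    benefit : (n_criteria,) True if larger is better, False if smaller is better
    """
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    benefit = np.asarray(benefit, dtype=bool)

    # Vector-normalize each criterion column, then apply normalized weights.
    norms = np.linalg.norm(scores, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for all-zero columns
    weighted = scores / norms * (weights / weights.sum())

    # Ideal and anti-ideal points depend on the direction of each criterion.
    ideal = np.where(benefit, weighted.max(axis=0), weighted.min(axis=0))
    anti_ideal = np.where(benefit, weighted.min(axis=0), weighted.max(axis=0))

    # Euclidean distance of every model to both reference points.
    d_pos = np.linalg.norm(weighted - ideal, axis=1)
    d_neg = np.linalg.norm(weighted - anti_ideal, axis=1)

    # Closeness lies in [0, 1]; fall back to 0.5 if all models are identical.
    total = d_pos + d_neg
    return np.divide(d_neg, total, out=np.full_like(d_neg, 0.5), where=total > 0)


# Hypothetical candidates (rows) and hypothetical proxy criteria (columns):
# validation accuracy, train/validation gap, score spread under noisy resampling.
candidates = np.array([
    [0.95, 0.20, 0.05],  # model 0: best validation score, large gap, unstable
    [0.92, 0.03, 0.01],  # model 1: slightly lower score, small gap, stable
    [0.85, 0.02, 0.01],  # model 2: lowest score, small gap, stable
])
benefit = [True, False, False]  # accuracy is a benefit; gap and spread are costs
weights = [1.0, 1.0, 1.0]       # equal weights, an arbitrary illustrative choice

closeness = topsis(candidates, weights, benefit)
naive_choice = int(np.argmax(candidates[:, 0]))  # single-criterion selection
topsis_choice = int(np.argmax(closeness))        # multi-criteria selection
print("single-criterion pick:", naive_choice)
print("TOPSIS pick:", topsis_choice)
```

With these made-up numbers, the single-criterion rule picks the model with the highest validation accuracy even though it shows a large train/validation gap, while the TOPSIS ranking prefers a candidate that trades a little validation accuracy for a smaller gap and a more stable score. The outcome depends on the chosen criteria and weights, which is why they need to be stated explicitly in any real use.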

research
04/20/2021

Constrained Bayesian Hierarchical Models for Gaussian Data: A Model Selection Criterion Approach

Consider the setting where there are B>1 candidate statistical models, a...
research
08/29/2020

Model selection for estimation of causal parameters

A popular technique for selecting and tuning machine learning estimators...
research
11/05/2019

Bias-aware model selection for machine learning of doubly robust functionals

While model selection is a well-studied topic in parametric and nonparam...
research
09/11/2019

Counterfactual Cross-Validation: Effective Causal Model Selection from Observational Data

What is the most effective way to select the best causal model among pot...
research
06/24/2016

The optimality of coarse categories in decision-making and information storage

An agent who lacks preferences and instead makes decisions using criteri...
research
11/27/2022

Asymptotic Optimality of Myopic Ranking and Selection Procedures

Ranking and selection (R&S) is a popular model for studying discrete-e...
research
03/06/2018

Model Selection as a Multiple Testing Procedure: Improving Akaike's Information Criterion

By interpreting the model selection problem as a multiple hypothesis tes...
