Learning the hypotheses space from data through a U-curve algorithm: a statistically consistent complexity regularizer for Model Selection

by Diego Marcondes, et al.

This paper proposes a data-driven, systematic, consistent and non-exhaustive approach to Model Selection that extends the classical agnostic PAC learning model. In this approach, a learning problem is modeled not only by a hypothesis space ℋ, but also by a Learning Space 𝕃(ℋ): a poset of subspaces of ℋ that covers ℋ and satisfies a property regarding the VC dimension of related subspaces, which makes it a suitable algebraic search space for Model Selection algorithms. Our main contributions are a data-driven general learning algorithm that performs regularized Model Selection on 𝕃(ℋ), and a framework under which one can, theoretically, better estimate a target hypothesis with a given sample size by properly modeling 𝕃(ℋ) and employing high computational power. A remarkable consequence of this approach is a set of conditions under which a non-exhaustive search of 𝕃(ℋ) can return an optimal solution. The results of this paper lead to a practical property of Machine Learning: a lack of experimental data may be mitigated by high computational capacity. In a context of continuous popularization of computational power, this property may help explain why Machine Learning has become so important, even where data is expensive and hard to get.
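To make the idea of a non-exhaustive search over 𝕃(ℋ) concrete, the following is a minimal sketch under stated assumptions, not the authors' U-curve algorithm. It assumes that the Learning Space is the Boolean lattice of feature subsets for binary inputs, so each node is the subspace of classifiers that depend only on the chosen features (a subspace with k features has VC dimension 2^k). It also assumes that the out-of-sample error is estimated on a held-out validation sample. The search walks up the lattice greedily and stops at the first local minimum of the estimated error. All names (`fit_erm`, `greedy_u_curve_search`, `make_sample`) and the synthetic target are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter, defaultdict


def fit_erm(sample, features):
    """Empirical risk minimizer among classifiers that only look at `features`."""
    votes = defaultdict(Counter)
    for x, y in sample:
        votes[tuple(x[i] for i in features)][y] += 1
    # Majority label in each cell of the partition induced by `features`.
    table = {cell: count.most_common(1)[0][0] for cell, count in votes.items()}
    # Fallback label for cells never seen in the training sample.
    default = Counter(y for _, y in sample).most_common(1)[0][0]
    return lambda x: table.get(tuple(x[i] for i in features), default)


def error(h, sample):
    """Fraction of points in `sample` that h misclassifies."""
    return sum(h(x) != y for x, y in sample) / len(sample)


def greedy_u_curve_search(train, valid, d):
    """Climb the lattice of feature subsets until validation error stops decreasing.

    Returns (selected subset, its validation error, number of nodes evaluated).
    Simplified illustration: it follows a single greedy path, so it may stop
    at a local minimum that is not the global one.
    """
    current = ()
    best = error(fit_erm(train, current), valid)
    visited = 1
    while True:
        candidates = []
        for i in range(d):
            if i not in current:
                subset = tuple(sorted(current + (i,)))
                candidates.append((error(fit_erm(train, subset), valid), subset))
        visited += len(candidates)
        # Stop when no larger neighbouring subspace strictly improves the estimate.
        if not candidates or min(candidates)[0] >= best:
            return current, best, visited
        best, current = min(candidates)


def make_sample(n, d, rng, noise=0.1):
    """Synthetic data (assumption): label is x[0] AND x[2], flipped with prob. `noise`."""
    sample = []
    for _ in range(n):
        x = tuple(int(rng.random() < 0.7) for _ in range(d))
        y = x[0] & x[2]
        if rng.random() < noise:
            y = 1 - y
        sample.append((x, y))
    return sample


if __name__ == "__main__":
    rng = random.Random(0)
    d = 6
    train, valid = make_sample(300, d, rng), make_sample(300, d, rng)
    subset, err, visited = greedy_u_curve_search(train, valid, d)
    print("selected features:", subset)
    print("validation error:", err)
    print("nodes evaluated:", visited, "out of", 2 ** d)
```

In this sketch the search evaluates at most 1 + d(d+1)/2 nodes rather than all 2^d, which is the sense in which it is non-exhaustive. Whether such a shortcut returns an optimal subspace depends on conditions like those the paper studies; the greedy rule here carries no such guarantee.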




∙ 01/26/2020

Learning the Hypotheses Space from data Part I: Learning Space and U-curve Property

The agnostic PAC learning model consists of: a Hypothesis Space H, a pro...
∙ 01/30/2020

Learning the Hypotheses Space from data Part II: Convergence and Feasibility

In part I we proposed a structure for a general Hypotheses Space H, the ...
∙ 06/23/2020

A Robust Consistent Information Criterion for Model Selection based on Empirical Likelihood

Conventional likelihood-based information criteria for model selection r...
∙ 06/04/2018

Agreement-based Learning

Model selection is a problem that has occupied machine learning research...
∙ 02/04/2014

Sequential Model-Based Ensemble Optimization

One of the most tedious tasks in the application of machine learning is ...
∙ 02/17/2021

Joint Continuous and Discrete Model Selection via Submodularity

In model selection problems for machine learning, the desire for a well-...
∙ 09/26/2020

Small Data, Big Decisions: Model Selection in the Small-Data Regime

Highly overparametrized neural networks can display curiously strong gen...