Learning the hypotheses space from data through a U-curve algorithm: a statistically consistent complexity regularizer for Model Selection

09/08/2021
by   Diego Marcondes, et al.

This paper proposes a data-driven, systematic, consistent and non-exhaustive approach to Model Selection that extends the classical agnostic PAC learning model. In this approach, a learning problem is modeled not only by a hypothesis space ℋ, but also by a Learning Space 𝕃(ℋ): a poset of subspaces of ℋ that covers ℋ and satisfies a property regarding the VC dimension of related subspaces, which makes it a suitable algebraic search space for Model Selection algorithms. Our main contributions are a data-driven general learning algorithm to perform regularized Model Selection on 𝕃(ℋ) and a framework under which one can, theoretically, better estimate a target hypothesis with a given sample size by properly modeling 𝕃(ℋ) and employing high computational power. A remarkable consequence of this approach is a set of conditions under which a non-exhaustive search of 𝕃(ℋ) can return an optimal solution. The results of this paper lead to a practical property of Machine Learning: a lack of experimental data may be mitigated by high computational capacity. As computational power becomes ever more widely available, this property may help explain why Machine Learning has become so important, even where data is expensive and hard to get.
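To make the idea of a non-exhaustive search concrete, the following is a minimal toy sketch, not the algorithm from the paper. It assumes the simplest possible Learning Space, a single chain of nested subspaces M_0 ⊂ M_1 ⊂ ..., and it assumes that the estimated out-of-sample error along the chain is U-shaped, so that the first local minimum is also the global one. The function name `u_curve_chain_search` and the callback `estimated_error` are hypothetical names introduced only for this illustration.

```python
from typing import Callable, List, Tuple


def u_curve_chain_search(
    estimated_error: Callable[[int], float], n_levels: int
) -> Tuple[int, List[float]]:
    """Walk a chain of nested models and stop at the first local minimum.

    estimated_error(k) is a hypothetical callback returning an estimate of
    the out-of-sample error of the best hypothesis in the k-th subspace
    (for example, a validation error). n_levels must be at least 1.

    Returns the index of the selected level and the list of errors that
    were actually evaluated.
    """
    errors = [estimated_error(0)]
    for k in range(1, n_levels):
        errors.append(estimated_error(k))
        # Under the assumed U-shape, the first increase means the minimum
        # was at the previous level, so the remaining levels are skipped.
        if errors[k] > errors[k - 1]:
            return k - 1, errors
    # The error never increased: the most complex level is selected.
    return n_levels - 1, errors


# Toy error profile (made-up numbers, for illustration only).
toy_errors = [0.40, 0.25, 0.18, 0.22, 0.30]
best_level, evaluated = u_curve_chain_search(lambda k: toy_errors[k], len(toy_errors))
print(best_level, len(evaluated))
```

On this toy profile the search selects level 2 after evaluating four of the five levels. If the U-shape assumption fails, the first local minimum need not be the global one, which is why the conditions mentioned in the abstract matter.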


research
01/26/2020

Learning the Hypotheses Space from data Part I: Learning Space and U-curve Property

The agnostic PAC learning model consists of: a Hypothesis Space H, a pro...
research
01/30/2020

Learning the Hypotheses Space from data Part II: Convergence and Feasibility

In part I we proposed a structure for a general Hypotheses Space H, the ...
research
10/16/2021

On Model Selection Consistency of Lasso for High-Dimensional Ising Models on Tree-like Graphs

We consider the problem of high-dimensional Ising model selection using ...
research
09/01/2023

Subjectivity in Unsupervised Machine Learning Model Selection

Model selection is a necessary step in unsupervised machine learning. De...
research
06/04/2018

Agreement-based Learning

Model selection is a problem that has occupied machine learning research...
research
02/17/2021

Joint Continuous and Discrete Model Selection via Submodularity

In model selection problems for machine learning, the desire for a well-...
research
07/21/2023

Bounded P-values in Parametric Programming-based Selective Inference

Selective inference (SI) has been actively studied as a promising framew...
