Learning the Hypotheses Space from data Part I: Learning Space and U-curve Property

01/26/2020
by   Diego Marcondes, et al.

The agnostic PAC learning model consists of: a Hypothesis Space H, a probability distribution P, a sample complexity function m_H(ϵ,δ): [0,1]^2 → Z_+ of precision ϵ and confidence 1 - δ, a finite i.i.d. sample D_N, a cost function ℓ, and a learning algorithm A(H,D_N), which estimates a hypothesis ĥ ∈ H that approximates a target function h^⋆ ∈ H, seeking to minimize the out-of-sample error. In this model, prior information is represented by H and ℓ, while problems are solved by instantiating them in several applied learning models, each with a specific algebraic structure for H and corresponding learning algorithms. However, these applied models rely on two important concepts not covered by classic PAC learning theory: model selection and regularization. This paper presents an extension of the model that covers these concepts. The main principle added is the selection, based solely on data, of a subspace of H with a VC dimension compatible with the available sample. To formalize this principle, the concept of a Learning Space L(H), a poset of subsets of H that covers H and satisfies a property regarding the VC dimension of related subspaces, is presented as the natural search space for model selection algorithms. A remarkable result obtained in this new framework is a set of conditions on L(H) and ℓ under which the estimated out-of-sample error surfaces are true U-curves on the chains of L(H), enabling a more efficient search on L(H). Hence, in this new framework, the U-curve optimization problem becomes a natural component of model selection algorithms.
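To illustrate why a U-curve along a chain can make the search cheaper, the following is a minimal sketch, not the algorithm from the paper. It assumes a chain of nested candidate subspaces ordered by increasing complexity and an estimated-error function whose values along that chain never drop again once they have strictly increased. The names `u_curve_chain_search`, `chain`, and `estimated_error` are hypothetical, and the error values are made up for illustration.

```python
from typing import Callable, Sequence, Tuple, TypeVar

M = TypeVar("M")  # a candidate subspace (model) in the chain; hypothetical type


def u_curve_chain_search(
    chain: Sequence[M],
    estimated_error: Callable[[M], float],
) -> Tuple[int, float, int]:
    """Walk a chain and stop at the first strict increase of the estimated error.

    Assumption (not verified here): the estimated error is U-shaped along
    the chain, so the first strict increase means the minimum was passed.
    Returns (index of best model, its estimated error, number of evaluations).
    """
    if not chain:
        raise ValueError("chain must be non-empty")

    best_index = 0
    best_error = estimated_error(chain[0])
    evaluations = 1

    for i in range(1, len(chain)):
        err = estimated_error(chain[i])
        evaluations += 1
        if err > best_error:
            # Under the U-curve assumption no later model can do better.
            break
        if err < best_error:
            best_index, best_error = i, err

    return best_index, best_error, evaluations


# Illustrative, made-up estimated errors for a chain of six nested subspaces.
errors = [0.40, 0.31, 0.25, 0.27, 0.35, 0.50]
result = u_curve_chain_search(range(len(errors)), lambda i: errors[i])
print(result)
```

In this toy run the search stops after four evaluations instead of six, because the error at the fourth subspace exceeds the best value seen so far. Without the U-curve assumption, stopping early would not be justified and the whole chain would have to be evaluated.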


research
01/30/2020

Learning the Hypotheses Space from data Part II: Convergence and Feasibility

In part I we proposed a structure for a general Hypotheses Space H, the ...
research
09/10/2021

Near Instance Optimal Model Selection for Pure Exploration Linear Bandits

The model selection problem in the pure exploration linear bandit settin...
research
08/31/2022

Fine-Grained Distribution-Dependent Learning Curves

Learning curves plot the expected error of a learning algorithm as a fun...
research
11/09/2020

A Theory of Universal Learning

How quickly can a given class of concepts be learned from examples? It i...
research
07/22/2014

The U-curve optimization problem: improvements on the original algorithm and time complexity analysis

The U-curve optimization problem is characterized by a decomposable in U...
research
07/20/2022

Learning Underspecified Models

This paper examines whether one can learn to play an optimal action whil...
