Use Of Vapnik-Chervonenkis Dimension in Model Selection

08/20/2018
by Merlin Mpoudeu, et al.

In this dissertation, I derive a new method to estimate the Vapnik-Chervonenkis dimension (VCD) for the class of linear functions. This method is inspired by the technique developed by Vapnik et al. (1994). My contribution rests on the approximation of the expected maximum difference between two empirical losses (EMDBTEL). Specifically, I use a cross-validated form of the error to compute the EMDBTEL, and I tighten the bound on the EMDBTEL by minimizing a constant in its upper bound. I also derive two bounds for the true unknown risk, one using the additive Chernoff bound (ERM1) and one using the multiplicative Chernoff bound (ERM2). These bounds depend on the estimated VCD and the empirical risk. They can be used to perform model selection and to declare, with high probability, that the chosen model will perform better, without making strong assumptions about the data generating process (DG). I measure the accuracy of my technique on simulated datasets and on three real datasets. Under reasonable conditions, the model selection provided by VCD was always as good as, if not better than, the other methods.
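
To make the idea of selecting a model by a VC-based risk bound concrete, here is a minimal sketch in Python. It does not reproduce the ERM1 or ERM2 bounds of the dissertation, which are not stated in this abstract. It assumes instead the classical textbook additive bound attributed to Vapnik, R <= R_emp + sqrt((h (ln(2n/h) + 1) - ln(eta/4)) / n), where h is a VC dimension (here supplied directly rather than estimated), n is the sample size, and 1 - eta is the confidence level. The function names and the candidate numbers are hypothetical, chosen only for illustration.

```python
import math


def vc_additive_bound(emp_risk, h, n, eta=0.05):
    """Classical additive VC upper bound on the true risk.

    Assumption: this is the textbook Vapnik-style bound, used as a
    stand-in; it is NOT the ERM1/ERM2 bound derived in the dissertation.
    emp_risk: empirical risk of the fitted model
    h: VC dimension (an estimate could be plugged in here)
    n: number of training observations
    eta: the bound holds with probability at least 1 - eta
    """
    if not (0 < h <= n):
        raise ValueError("require 0 < h <= n")
    if not (0 < eta < 1):
        raise ValueError("require 0 < eta < 1")
    complexity = (h * (math.log(2 * n / h) + 1) - math.log(eta / 4)) / n
    return emp_risk + math.sqrt(complexity)


def select_model(candidates, n, eta=0.05):
    """Return the name of the candidate with the smallest upper bound.

    candidates: iterable of (name, emp_risk, h) tuples (hypothetical format).
    """
    best = min(candidates, key=lambda c: vc_additive_bound(c[1], c[2], n, eta))
    return best[0]
```

For example, calling `select_model([("small", 0.30, 2), ("medium", 0.10, 5), ("large", 0.09, 50)], n=1000)` trades off empirical risk against the complexity term: a model with slightly lower empirical risk but a much larger VC dimension is penalized by the square-root term.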

research
11/15/2011

Estimated VC dimension for risk bounds

Vapnik-Chervonenkis (VC) dimension is a fundamental measure of the gener...
research
09/22/2022

Evaluating undercounts in epidemics: response to Maruotti et al. 2022

Maruotti et al. 2022 used a mark-recapture approach to estimate bounds o...
research
03/15/2023

Distribution-free Deviation Bounds of Learning via Model Selection with Cross-validation Risk Estimation

Cross-validation techniques for risk estimation and model selection are ...
research
04/17/2022

Sharper Bounds on Four Lattice Constants

The Korkine–Zolotareff (KZ) reduction, and its generalisations, are wide...
research
05/09/2023

Robust Model Selection with Application in Single-Cell Multiomics Data

Model selection is critical in the modern statistics and machine learnin...
research
09/19/2017

Estimating model evidence using ensemble-based data assimilation with localization - The model selection problem

In recent years, there has been a growing interest in applying data assi...
research
05/18/2016

The Quality of the Covariance Selection Through Detection Problem and AUC Bounds

We consider the problem of quantifying the quality of a model selection ...
