Concentration inequalities of the cross-validation estimate for stable predictors

11/23/2010
by Matthieu CORNEC, et al.

In this article, we derive concentration inequalities for the cross-validation estimate of the generalization error for stable predictors in the context of risk assessment. The notion of stability was first introduced by DEWA79 and extended by KEA95, BE01 and KUNIY02 to characterize classes of predictors with infinite VC dimension. In particular, this covers k-nearest neighbors rules, Bayesian algorithms (KEA95), and boosting, among others. General loss functions and classes of predictors are considered. We use the formalism introduced by DUD03 to cover a large variety of cross-validation procedures, including leave-one-out cross-validation, k-fold cross-validation, hold-out cross-validation (or split sample), and leave-υ-out cross-validation. In particular, we give a simple rule for choosing the cross-validation procedure depending on the stability of the class of predictors. In the special case of uniform stability, an interesting consequence is that the number of elements in the test set need not grow to infinity for the cross-validation procedure to be consistent. In this special case, the particular interest of leave-one-out cross-validation is emphasized.
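The procedures named in the abstract differ only in which train/test splits they average over. The sketch below illustrates that idea and is not the paper's construction: each procedure is represented by its list of test-index sets, and one estimator averages the per-split test risk. All function names (`kfold_splits`, `leave_v_out_splits`, `hold_out_split`, `knn_predict`, `cv_estimate`), the one-dimensional k-nearest-neighbors rule, the zero-one loss, and the synthetic data are assumptions made for the example.

```python
import itertools
import random


def kfold_splits(n, k):
    """Test sets for k-fold CV: k contiguous blocks covering range(n)."""
    bounds = [i * n // k for i in range(k + 1)]
    return [list(range(bounds[i], bounds[i + 1])) for i in range(k)]


def leave_v_out_splits(n, v):
    """Test sets for leave-v-out CV: every subset of size v (grows fast with n)."""
    return [list(c) for c in itertools.combinations(range(n), v)]


def hold_out_split(n, n_test):
    """Single test set for hold-out CV: the last n_test observations."""
    return [list(range(n - n_test, n))]


def knn_predict(train, x, k=3):
    """Toy 1-D k-nearest-neighbors classifier with majority vote on 0/1 labels."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def cv_estimate(data, test_sets, fit_predict, loss):
    """Average, over all splits, of the mean test loss of a predictor
    trained on the complement of each test set."""
    split_risks = []
    for test_idx in test_sets:
        held_out = set(test_idx)
        train = [data[i] for i in range(len(data)) if i not in held_out]
        losses = [loss(fit_predict(train, data[i][0]), data[i][1]) for i in test_idx]
        split_risks.append(sum(losses) / len(losses))
    return sum(split_risks) / len(split_risks)


def zero_one(y_hat, y):
    """Zero-one loss: 1.0 on a misclassification, 0.0 otherwise."""
    return float(y_hat != y)


# Synthetic data (an assumption for the demo): label is 1 when x > 0.5.
rng = random.Random(0)
data = [(x, int(x > 0.5)) for x in (rng.uniform(0, 1) for _ in range(40))]
n = len(data)

estimates = {
    "leave-one-out": cv_estimate(data, kfold_splits(n, n), knn_predict, zero_one),
    "5-fold": cv_estimate(data, kfold_splits(n, 5), knn_predict, zero_one),
    "hold-out": cv_estimate(data, hold_out_split(n, 10), knn_predict, zero_one),
}
for name, value in estimates.items():
    print(f"{name}: {value:.3f}")
```

Leave-one-out is obtained here as the special case of k-fold with k equal to the sample size, so every test set has a single element, which is the setting the abstract singles out under uniform stability.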


research
10/30/2010

Concentration inequalities of the cross-validation estimator for Empirical Risk Minimiser

In this article, we derive concentration inequalities for the cross-vali...
research
11/23/2010

Estimating Subagging by cross-validation

In this article, we derive concentration inequalities for the cross-vali...
research
06/19/2017

An a Priori Exponential Tail Bound for k-Folds Cross-Validation

We consider a priori generalization bounds developed in terms of cross-v...
research
02/21/2022

Consistent Cross Validation with stable learners

This paper investigates the efficiency of different cross-validation (CV...
research
10/18/2021

Gradient boosting with extreme-value theory for wildfire prediction

This paper details the approach of the team Kohrrelation in the 2021 Ext...
research
09/11/2019

Aggregated Hold-Out

Aggregated hold-out (Agghoo) is a method which averages learning rules s...
research
02/02/2023

Uncertainty in Fairness Assessment: Maintaining Stable Conclusions Despite Fluctuations

Several recent works encourage the use of a Bayesian framework when asse...
