Estimating the Standard Error of Cross-Validation-Based Estimators of Classification Rules Performance

08/01/2019
by Waleed A. Yousef, et al.

First, we analyze the variance of the Cross Validation (CV)-based estimators used to estimate the performance of classification rules. Second, we propose a novel estimator of this variance based on the Influence Function (IF) approach, which had previously been used very successfully to estimate the variance of bootstrap-based estimators. The motivation for this research is that, to the best of our knowledge, the literature lacks a rigorous method for estimating the variance of CV-based estimators. What is available is a set of ad-hoc procedures that have no mathematical foundation, since they ignore the covariance structure among dependent random variables. The conducted experiments show that the proposed IF method has a small RMS error with some bias. However, surprisingly, the ad-hoc methods still work better than the IF-based method. Unfortunately, this is due to a lack of sufficient smoothness compared to the bootstrap estimator. This opens three points for further research: (1) a more comprehensive simulation study to clarify when the IF method wins or loses; (2) more mathematical analysis to figure out why the ad-hoc methods work well; and (3) more mathematical treatment to figure out the connection between the appropriate amount of "smoothness" and decreasing the bias of the IF method.
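To make the criticism of the ad-hoc procedures concrete, the following is a minimal sketch of one commonly seen procedure: compute the error rate on each of the K folds, then report the sample standard deviation of those fold errors divided by sqrt(K) as the standard error. The abstract does not name the specific ad-hoc procedures it refers to, so this choice is an assumption, as are the nearest-centroid classifier, the synthetic two-class Gaussian data, and the function name `kfold_error_with_naive_se`, all of which are hypothetical and used only for illustration. This is not the IF-based estimator proposed in the paper.

```python
import numpy as np

def kfold_error_with_naive_se(X, y, k=5, seed=0):
    """K-fold CV error of a nearest-centroid classifier plus a naive SE.

    The naive SE treats the k fold errors as independent. They are not,
    because the training sets overlap, so their covariance is ignored.
    """
    rng = np.random.default_rng(seed)
    # Shuffle the sample indices and split them into k test folds.
    folds = np.array_split(rng.permutation(len(y)), k)
    fold_errors = []
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        X_tr, y_tr = X[train_mask], y[train_mask]
        # Hypothetical stand-in classifier: assign to the nearest class centroid.
        classes = np.unique(y_tr)
        centroids = np.stack([X_tr[y_tr == c].mean(axis=0) for c in classes])
        diffs = X[test_idx][:, None, :] - centroids[None, :, :]
        dists = np.linalg.norm(diffs, axis=2)
        pred = classes[dists.argmin(axis=1)]
        fold_errors.append(np.mean(pred != y[test_idx]))
    fold_errors = np.array(fold_errors)
    cv_error = fold_errors.mean()
    # Ad-hoc standard error: sample std of the fold errors over sqrt(k).
    naive_se = fold_errors.std(ddof=1) / np.sqrt(k)
    return cv_error, naive_se, fold_errors

# Synthetic two-class data (assumed for illustration only).
rng = np.random.default_rng(1)
n = 100
X = np.vstack([rng.normal(0.0, 1.0, (n // 2, 2)),
               rng.normal(1.5, 1.0, (n // 2, 2))])
y = np.repeat([0, 1], n // 2)

cv_error, naive_se, fold_errors = kfold_error_with_naive_se(X, y, k=5)
print(f"CV error = {cv_error:.3f}, naive SE = {naive_se:.3f}")
```

Because every pair of folds shares most of its training observations, the fold errors are correlated, and the formula above has no term for that correlation. This is the sense in which such procedures ignore the covariance structure among dependent random variables.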

research
02/28/2013

Estimating the Maximum Expected Value: An Analysis of (Nested) Cross Validation and the Maximum Sample Average

We investigate the accuracy of the two most common estimators for the ma...
research
07/31/2019

A Leisurely Look at Versions and Variants of the Cross Validation Estimator

Many versions of cross-validation (CV) exist in the literature; and each...
research
03/09/2017

Cross-validation

This text is a survey on cross-validation. We define all classical cross...
research
04/19/2019

Average Density Estimators: Efficiency and Bootstrap Consistency

This paper highlights a tension between semiparametric efficiency and bo...
research
07/30/2019

AUC: Nonparametric Estimators and Their Smoothness

Nonparametric estimation of a statistic, in general, and of the error ra...
research
06/04/2020

Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks

In networks with binary activations and or binary weights the training b...
research
03/28/2013

Discrete Optimization of Statistical Sample Sizes in Simulation by Using the Hierarchical Bootstrap Method

The Bootstrap method application in simulation supposes that value of ra...
