Testing Cross-Validation Variants in Ranking Environments

05/25/2021
by Balázs R. Sziklai, et al.

This research investigates how to determine whether two rankings could have come from the same distribution. We evaluate three hybrid tests, in which Wilcoxon's, Dietterich's, and Alpaydin's statistical tests are each combined with cross-validation (CV). Each test is run with 5 to 10 folds, giving 18 variants in total. We use the framework of a popular comparative statistical test, the Sum of Ranking Differences, but our results are representative of all ranking environments. To compare the methods, we follow an innovative approach borrowed from economics. We designed eight scenarios for testing type I and type II errors; these represent typical situations (i.e., different data structures) that CV tests routinely face. The optimal CV method depends on the preferences regarding the minimization of type I and type II errors, the size of the input, and the expected patterns in the data. The Wilcoxon method with eight folds proved to be the best under all three investigated input sizes, although there were scenarios and decision aspects where other methods, namely Wilcoxon 10 and Alpaydin 10, performed better.
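To make the idea of a test combined with cross-validation concrete, the following is a minimal sketch of one variant, a Wilcoxon signed-rank test applied to fold-wise Sum of Ranking Differences values. The abstract does not spell out the procedure, so several details here are assumptions rather than the authors' method: the fold construction (each fold is left out in turn and the SRD is computed on the remaining objects), the tie handling (average ranks), the exact two-sided p-value by enumeration, and all function names (`ranks`, `srd`, `fold_srds`, `wilcoxon_exact_p`), which are hypothetical.

```python
import random
from itertools import product


def ranks(values):
    """1-based ranks; tied values share their average rank (assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def srd(scores, reference):
    """Sum of absolute differences between the two rank vectors."""
    return sum(abs(a - b) for a, b in zip(ranks(scores), ranks(reference)))


def fold_srds(scores, reference, k, seed=0):
    """One SRD value per fold: leave the fold out, rank the rest (assumption)."""
    indices = list(range(len(scores)))
    random.Random(seed).shuffle(indices)
    folds = [indices[f::k] for f in range(k)]
    values = []
    for held_out in folds:
        dropped = set(held_out)
        kept = [i for i in indices if i not in dropped]
        values.append(srd([scores[i] for i in kept],
                          [reference[i] for i in kept]))
    return values


def wilcoxon_exact_p(x, y):
    """Exact two-sided signed-rank p-value; zero differences are dropped."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    if not diffs:
        return 1.0
    r = ranks([abs(d) for d in diffs])
    total = sum(r)
    w_plus = sum(ri for ri, d in zip(r, diffs) if d > 0)
    observed = min(w_plus, total - w_plus)
    # Enumerate all 2^n sign patterns; feasible because n is at most 10 folds.
    hits = 0
    for signs in product((0, 1), repeat=len(diffs)):
        w = sum(ri for ri, s in zip(r, signs) if s)
        if min(w, total - w) <= observed + 1e-12:
            hits += 1
    return hits / 2 ** len(diffs)


def compare(method_a, method_b, reference, k):
    """p-value for 'both methods rank the objects equally well' with k folds."""
    # The same seed gives both methods the same folds, so values are paired.
    return wilcoxon_exact_p(fold_srds(method_a, reference, k),
                            fold_srds(method_b, reference, k))
```

Calling `compare(method_a, method_b, reference, k)` for `k` from 5 to 10 would give the six Wilcoxon variants under these assumptions. With so few folds the exact p-value is coarse: for `k = 5` the smallest attainable two-sided value is 2/32 = 0.0625, which is one reason the number of folds can affect type I and type II error rates.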


