Estimation of Predictive Performance in High-Dimensional Data Settings using Learning Curves

06/08/2022
by Jeroen M. Goedhart, et al.

In high-dimensional prediction settings, reliably estimating test performance remains challenging. To address this challenge, a novel performance estimation framework is presented. This framework, called Learn2Evaluate, is based on learning curves: it fits a smooth monotone curve that depicts test performance as a function of the sample size. Learn2Evaluate has several advantages over commonly applied performance estimation methodologies. Firstly, a learning curve offers a graphical overview of a learner. This overview helps to assess the potential benefit of adding training samples, and it provides a more complete comparison between learners than performance estimates at a fixed subsample size. Secondly, a learning curve makes it possible to estimate the performance at the total sample size rather than at a subsample size. Thirdly, Learn2Evaluate allows the computation of a theoretically justified and useful lower confidence bound. Furthermore, this bound may be tightened by performing a bias correction. The benefits of Learn2Evaluate are illustrated by a simulation study and applications to omics data.
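The learning-curve idea in the abstract can be illustrated with a minimal sketch. The abstract does not state which curve family or fitting procedure Learn2Evaluate uses, so this example assumes an inverse power law, score ≈ a - b * n^(-c), as a stand-in for the smooth monotone curve. The function names, the exponent grid, and the synthetic performance values are all hypothetical; this is not the authors' implementation, and it omits the lower confidence bound and the bias correction.

```python
# Illustrative sketch only: an inverse power law stands in for the smooth
# monotone curve. This is an assumption, not the Learn2Evaluate procedure.

def fit_inverse_power_law(sizes, scores, exponents=None):
    """Fit score ~ a - b * n**(-c) with b >= 0 and c > 0.

    With b >= 0 and c > 0 the fitted curve is non-decreasing in n.
    For each candidate exponent c the model is linear in (a, b), so
    ordinary least squares has a closed form; c is chosen by grid search.
    """
    if exponents is None:
        exponents = [k / 100 for k in range(5, 201, 5)]  # c in 0.05 .. 2.00
    m = len(sizes)
    mean_y = sum(scores) / m
    best = None
    for c in exponents:
        # Regress y on x = -(n**(-c)), so that y = a + b * x with slope b >= 0.
        x = [-(n ** (-c)) for n in sizes]
        mean_x = sum(x) / m
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, scores))
        b = max(sxy / sxx, 0.0)  # clip to keep the curve monotone
        a = mean_y - b * mean_x
        sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, scores))
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    _, a, b, c = best
    return a, b, c


def predict(params, n):
    """Evaluate the fitted curve at sample size n."""
    a, b, c = params
    return a - b * n ** (-c)


# Synthetic, noise-free "average test performance" at several subsample
# sizes, generated from a known curve purely for illustration.
subsample_sizes = [20, 40, 60, 80, 100]
mean_scores = [0.85 - 0.9 * n ** (-0.5) for n in subsample_sizes]

params = fit_inverse_power_law(subsample_sizes, mean_scores)

# Extrapolate to a hypothetical total sample size, where no independent
# test samples would remain for a direct estimate.
total_n = 120
estimate_at_total_n = predict(params, total_n)
print("fitted (a, b, c):", params)
print("estimated performance at N =", total_n, ":", estimate_at_total_n)
```

In practice the points on the curve would be averages over repeated train/test splits at each subsample size, and they would be noisy. The monotonicity constraint is what keeps the extrapolation to the total sample size from falling below the fitted performance at smaller sizes.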

research
07/11/2019

Minimizers of the Empirical Risk and Risk Monotonicity

Plotting a learner's average performance against the number of training ...
research
03/16/2023

Sample size determination via learning-type curves

This paper is concerned with sample size determination methodology for p...
research
11/06/2012

Sample Size Planning for Classification Models

In biospectroscopy, suitably annotated and statistically independent sam...
research
10/21/2020

Learning Curves for Analysis of Deep Networks

A learning curve models a classifier's test error as a function of the n...
research
05/21/2008

Kendall's tau in high-dimensional genomic parsimony

High-dimensional data models, often with low sample size, abound in many...
research
11/25/2022

A Survey of Learning Curves with Bad Behavior: or How More Data Need Not Lead to Better Performance

Plotting a learner's generalization performance against the training set...
research
02/08/2022

A Neural Phillips Curve and a Deep Output Gap

Many problems plague the estimation of Phillips curves. Among them is th...
