Expected Validation Performance and Estimation of a Random Variable's Maximum

10/01/2021
by Jesse Dodge, et al.

Research in NLP is often supported by experimental results, and improved reporting of such results can lead to better understanding and more reproducible science. In this paper we analyze three statistical estimators for expected validation performance, a tool used for reporting performance (e.g., accuracy) as a function of computational budget (e.g., number of hyperparameter tuning experiments). Whereas previous work analyzing such estimators focused on bias, we also examine the variance and mean squared error (MSE). In both synthetic and realistic scenarios, we evaluate the three estimators and find that the unbiased estimator has the highest variance and that the estimator with the smallest variance has the largest bias; the estimator with the smallest MSE strikes a balance between bias and variance, displaying a classic bias-variance tradeoff. We use expected validation performance to compare different models, and analyze how frequently each estimator leads to incorrect conclusions about which of two models performs better. We find that the two biased estimators lead to the fewest incorrect conclusions, which hints at the importance of minimizing variance and MSE.
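The abstract does not name the three estimators, so the following is only an illustrative sketch of the underlying quantity: the expected maximum of n validation scores, estimated from N observed scores. It shows two standard estimators of a random variable's expected maximum, a plug-in estimator (maximum of n draws with replacement from the empirical distribution, biased in general) and a U-statistic (average maximum over all size-n subsets, unbiased). The function names are hypothetical, and whether these correspond to the estimators analyzed in the paper is an assumption.

```python
from math import comb


def expected_max_plugin(values, n):
    """Plug-in estimate (assumed illustration, not necessarily the paper's):
    expected max of n draws WITH replacement from the empirical distribution."""
    v = sorted(values)
    N = len(v)
    if N == 0 or n < 1:
        raise ValueError("need at least one value and n >= 1")
    # P(max <= v_(i)) = (i/N)^n, so the i-th order statistic gets the
    # probability mass (i/N)^n - ((i-1)/N)^n.
    return sum(
        x * ((i / N) ** n - ((i - 1) / N) ** n)
        for i, x in enumerate(v, start=1)
    )


def expected_max_unbiased(values, n):
    """U-statistic estimate (assumed illustration): average of the max over
    all size-n subsets, i.e. draws WITHOUT replacement. Requires n <= N."""
    v = sorted(values)
    N = len(v)
    if not 1 <= n <= N:
        raise ValueError("need 1 <= n <= number of values")
    # The i-th order statistic is the max of exactly C(i-1, n-1) subsets.
    weighted = sum(x * comb(i - 1, n - 1) for i, x in enumerate(v, start=1))
    return weighted / comb(N, n)


if __name__ == "__main__":
    # Hypothetical validation accuracies from five tuning runs.
    scores = [0.71, 0.74, 0.69, 0.80, 0.77]
    for budget in range(1, len(scores) + 1):
        print(
            budget,
            round(expected_max_plugin(scores, budget), 4),
            round(expected_max_unbiased(scores, budget), 4),
        )
```

With a budget of one, both functions reduce to the sample mean; as the budget grows toward N, the U-statistic approaches the sample maximum, while the plug-in estimate stays below it because sampling with replacement can miss the top score.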


research
09/12/2019

Synthetic estimation for the complier average causal effect

We propose an improved estimator of the complier average causal effect (...
research
05/21/2022

Bias-robust Integration of Observational and Experimental Estimators

We describe a simple approach for combining an unbiased and a (possibly)...
research
04/28/2020

Showing Your Work Doesn't Always Work

In natural language processing, a recently popular line of work explores...
research
02/22/2021

Dither computing: a hybrid deterministic-stochastic computing framework

Stochastic computing has a long history as an alternative method of perf...
research
02/26/2021

Simultaneous Bandwidths Determination for DK-HAC Estimators and Long-Run Variance Estimation in Nonparametric Settings

We consider the derivation of data-dependent simultaneous bandwidths for...
research
09/06/2019

Show Your Work: Improved Reporting of Experimental Results

Research in natural language processing proceeds, in part, by demonstrat...
research
03/03/2021

Minimax MSE Bounds and Nonlinear VAR Prewhitening for Long-Run Variance Estimation Under Nonstationarity

We establish new mean-squared error (MSE) bounds for long-run variance (...
