Approximate Cross-Validation for Structured Models

06/23/2020
by Soumya Ghosh, et al.

Many modern data analyses benefit from explicitly modeling dependence structure in data – such as measurements across time or space, ordered words in a sentence, or genes in a genome. Cross-validation is the gold standard to evaluate these analyses but can be prohibitively slow due to the need to re-run already-expensive learning algorithms many times. Previous work has shown approximate cross-validation (ACV) methods provide a fast and provably accurate alternative in the setting of empirical risk minimization. But this existing ACV work is restricted to simpler models by the assumptions that (i) data are independent and (ii) an exact initial model fit is available. In structured data analyses, (i) is always untrue, and (ii) is often untrue. In the present work, we address (i) by extending ACV to models with dependence structure. To address (ii), we verify – both theoretically and empirically – that ACV quality deteriorates smoothly with noise in the initial fit. We demonstrate the accuracy and computational benefits of our proposed methods on a diverse set of real-world applications.
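For intuition only, the following is a minimal sketch of the kind of ACV the abstract attributes to previous work: the independent-data, exact-fit setting, not the structured-model extension proposed here. It assumes ridge regression as the empirical risk minimization problem and approximates each leave-one-out fit with a single Newton step from the full-data fit instead of refitting. The function names (`fit_ridge`, `newton_step_loo`, `exact_loo`) and the choice of ridge regression are illustrative assumptions, not taken from the paper. Because the ridge objective is quadratic, the Newton step happens to be exact in this case, which makes the sketch easy to check against brute-force cross-validation.

```python
import numpy as np


def fit_ridge(X, y, lam):
    """Exact minimizer of 0.5*||X theta - y||^2 + 0.5*lam*||theta||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def newton_step_loo(X, y, lam):
    """Approximate every leave-one-out fit with one Newton step.

    Illustrative assumption: i.i.d. data and an exact initial fit,
    i.e. the simple setting, not the structured-model setting.
    """
    n, d = X.shape
    theta = fit_ridge(X, y, lam)           # single full-data fit
    H = X.T @ X + lam * np.eye(d)          # Hessian of the full objective
    resid = X @ theta - y                  # per-point residuals at the full fit
    thetas = np.empty((n, d))
    for i in range(n):
        # Hessian of the objective with point i removed.
        H_minus = H - np.outer(X[i], X[i])
        # Gradient of the leave-one-out objective at theta is -resid[i] * X[i],
        # because the full-data gradient vanishes at theta.
        thetas[i] = theta + np.linalg.solve(H_minus, X[i] * resid[i])
    return thetas


def exact_loo(X, y, lam):
    """Brute-force leave-one-out: refit the model n times."""
    n, d = X.shape
    thetas = np.empty((n, d))
    for i in range(n):
        keep = np.arange(n) != i
        thetas[i] = fit_ridge(X[keep], y[keep], lam)
    return thetas
```

For non-quadratic losses the same step is only an approximation, and this sketch still solves one linear system per held-out point; it says nothing about the dependence structure or the inexact initial fits that the present work addresses.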


research
09/05/2018

Cross validation residuals for generalised least squares and other correlated data models

Cross validation residuals are well known for the ordinary least squares...
research
10/24/2018

Leave-one-out cross-validation for non-factorizable normal models

Cross-validation can be used to measure a model's predictive accuracy fo...
research
06/12/2018

CID Models on Real-world Social Networks and Goodness of Fit Measurements

Assessing the model fit quality of statistical models for network data i...
research
08/24/2020

Approximate Cross-Validation with Low-Rank Data in High Dimensions

Many recent advances in machine learning are driven by a challenging tri...
research
06/01/2018

Return of the Infinitesimal Jackknife

The error or variability of machine learning algorithms is often assesse...
research
06/12/2018

CID Models on Real-world Social Networks and GOF Measurements

Assessing the model fit quality of statistical models for network data i...
research
06/19/2020

Efficient implementations of echo state network cross-validation

Background/introduction: Cross-validation is still uncommon in time seri...
