Iterative Approximate Cross-Validation

03/05/2023
by   Yuetian Luo, et al.

Cross-validation (CV) is one of the most popular tools for assessing and selecting predictive models. However, standard CV suffers from high computational cost when the number of folds is large. Recently, under the empirical risk minimization (ERM) framework, a line of work has proposed efficient methods to approximate CV based on the solution of the ERM problem trained on the full dataset. In large-scale problems, though, it can be hard to obtain the exact solution of the ERM problem, either because computational resources are limited or because early stopping is used to prevent overfitting. In this paper, we propose a new paradigm to efficiently approximate CV when the ERM problem is solved via an iterative first-order algorithm, without running it until convergence. Our new method extends existing guarantees for CV approximation to hold along the whole trajectory of the algorithm, including at convergence, thus generalizing existing CV approximation methods. Finally, we illustrate the accuracy and computational efficiency of our method through a range of empirical studies.
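
The abstract does not spell out the update rule, so the following is only a minimal sketch of one natural way to track leave-one-out (LOO) iterates alongside gradient descent, not necessarily the authors' exact method. It assumes a ridge-regularized least-squares objective, plain gradient descent with a fixed step size, and a first-order Taylor expansion of each leave-one-out gradient around the current full-data iterate. All function names and parameters (`grad`, `hess_vec`, `iterative_approx_loo`, `exact_loo`, `lam`, `step`, `n_steps`) are hypothetical. The loops recompute leave-one-out gradients directly for clarity, so this version shows the idea but not the computational savings.

```python
import random


def grad(theta, data, lam):
    """Gradient of F(theta) = sum_i 0.5*(x_i . theta - y_i)^2 + 0.5*lam*||theta||^2."""
    g = [lam * t for t in theta]
    for x, y in data:
        r = sum(xj * tj for xj, tj in zip(x, theta)) - y
        for j, xj in enumerate(x):
            g[j] += r * xj
    return g


def hess_vec(data, lam, v):
    """Hessian-vector product (sum_i x_i x_i^T + lam*I) v for the same objective."""
    h = [lam * vj for vj in v]
    for x, _ in data:
        s = sum(xj * vj for xj, vj in zip(x, v))
        for j, xj in enumerate(x):
            h[j] += s * xj
    return h


def iterative_approx_loo(data, lam=0.1, step=0.005, n_steps=50):
    """Run gradient descent on the full data while updating one approximate
    leave-one-out iterate per data point (assumed update rule, see lead-in)."""
    n, d = len(data), len(data[0][0])
    theta = [0.0] * d
    approx = [[0.0] * d for _ in range(n)]
    for _ in range(n_steps):
        new_approx = []
        for i in range(n):
            rest = data[:i] + data[i + 1:]
            diff = [a - t for a, t in zip(approx[i], theta)]
            # Taylor expansion of the LOO gradient around the full-data iterate.
            g = grad(theta, rest, lam)
            hv = hess_vec(rest, lam, diff)
            new_approx.append(
                [a - step * (gj + hj) for a, gj, hj in zip(approx[i], g, hv)]
            )
        full_grad = grad(theta, data, lam)
        theta = [t - step * gj for t, gj in zip(theta, full_grad)]
        approx = new_approx
    return theta, approx


def exact_loo(data, lam=0.1, step=0.005, n_steps=50):
    """Reference: run gradient descent separately on each leave-one-out dataset."""
    d = len(data[0][0])
    out = []
    for i in range(len(data)):
        rest = data[:i] + data[i + 1:]
        th = [0.0] * d
        for _ in range(n_steps):
            th = [t - step * gj for t, gj in zip(th, grad(th, rest, lam))]
        out.append(th)
    return out


def make_data(n=20, d=3, seed=0):
    """Synthetic regression data (hypothetical, for illustration only)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        y = sum(x) + rng.gauss(0.0, 0.1)
        data.append((x, y))
    return data
```

For a quadratic loss such as this one, the Taylor expansion is exact, so the tracked iterates coincide with the true leave-one-out gradient descent iterates at every step up to floating-point rounding. For a general smooth loss they would only be approximations. An early-stopped LOO error estimate can then be formed by averaging the squared error of each held-out point under its own tracked iterate.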


research
03/02/2020

Approximate Cross-validation: Guarantees for Model Assessment and Selection

Cross-validation (CV) is a popular approach for assessing and selecting ...
research
04/16/2021

Overfitting in Bayesian Optimization: an empirical study and early-stopping solution

Bayesian Optimization (BO) is a successful methodology to tune the hyper...
research
07/02/2019

Double Cross Validation for the Number of Factors in Approximate Factor Models

Determining the number of factors is essential to factor analysis. In th...
research
11/14/2017

On Optimal Generalizability in Parametric Learning

We consider the parametric learning problem, where the objective of the ...
research
09/25/2022

Algorithms that Approximate Data Removal: New Results and Limitations

We study the problem of deleting user data from machine learning models ...
research
11/04/2014

Simple approximate MAP Inference for Dirichlet processes

The Dirichlet process mixture (DPM) is a ubiquitous, flexible Bayesian n...
research
02/18/2020

Estimating the Penalty Level of ℓ_1-minimization via Two Gaussian Approximation Methods

In this paper, we aim to give a theoretical approximation for the penalt...
