Fast Cross-Validation for Incremental Learning

06/30/2015
by Pooria Joulani, et al.

Cross-validation (CV) is one of the main tools for performance estimation and parameter tuning in machine learning. The standard recipe for computing a CV estimate is to run the learning algorithm separately for each CV fold, a computationally expensive process. In this paper, we propose a new approach that reduces the computational burden of CV-based performance estimation. Unlike previous attempts, each of which is specific to a particular learning model or problem domain, our method applies to a large class of incremental learning algorithms, which are well suited to big-data problems. In particular, it covers a wide range of supervised and unsupervised learning tasks with different performance criteria, as long as the base learning algorithm is incremental. We show that the running time of the algorithm scales logarithmically, rather than linearly, in the number of CV folds. Furthermore, the algorithm has favorable properties for parallel and distributed implementation. Experiments with state-of-the-art incremental learning algorithms confirm the practicality of the proposed method.
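The logarithmic scaling in the number of folds comes from sharing incremental training work between folds instead of training each fold's model from scratch. The sketch below illustrates one way such sharing can work; the `IncrementalMean` toy learner and the recursive fold-splitting scheme are illustrative assumptions, not the paper's exact algorithm.

```python
import copy

class IncrementalMean:
    """Toy incremental learner: a running mean, standing in for any
    model exposing an update(example) method."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0
    def update(self, x):
        self.n += 1
        self.mean += (x - self.mean) / self.n
    def predict(self):
        return self.mean

def tree_cv(folds, model=None):
    """Return one model per fold, each trained on all *other* folds.

    Recursively splits the folds into two halves: the models for the
    left half absorb the right half's data once (and vice versa), then
    recursion continues inside each half. Each example is therefore
    replayed O(log K) times in total, instead of K-1 times as in the
    naive fold-by-fold recipe.
    """
    if model is None:
        model = IncrementalMean()
    if len(folds) == 1:
        # This model has already seen every fold except this one.
        return [model]
    mid = len(folds) // 2
    left, right = folds[:mid], folds[mid:]
    # Models destined for the left half must train on the right half.
    left_model = copy.deepcopy(model)
    for fold in right:
        for x in fold:
            left_model.update(x)
    # And vice versa; the incoming model is reused for the right half.
    right_model = model
    for fold in left:
        for x in fold:
            right_model.update(x)
    return tree_cv(left, left_model) + tree_cv(right, right_model)
```

For example, with `folds = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]`, `tree_cv(folds)[0]` is a model trained on everything except the first fold, so its prediction is the mean of 3..8, i.e. 5.5. Note this toy sharing is exact only because the running mean is order-independent; handling order-sensitive incremental learners is precisely where the paper's machinery goes beyond this sketch.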


