Cross-validation for change-point regression: pitfalls and solutions

by Florian Pein, et al.

Cross-validation is the standard approach for tuning parameter selection in many non-parametric regression problems. However, its use is less common in change-point regression, perhaps because its prediction error-based criterion may appear to permit small spurious changes and hence be less well-suited to estimating the number and location of change-points. We show that the problems of cross-validation with squared error loss are in fact more severe: it can lead to systematic under- or over-estimation of the number of change-points, and to highly suboptimal estimation of the mean function, even in simple settings where changes are easily detectable. We propose two simple approaches to remedy these issues, the first involving the use of absolute error rather than squared error loss, and the second involving modifying the holdout sets used. For the latter, we provide conditions that permit consistent estimation of the number of change-points for a general change-point estimation procedure. We show these conditions are satisfied for optimal partitioning using new results on its performance when supplied with the incorrect number of change-points. Numerical experiments show that the absolute error approach in particular is competitive with common change-point methods using classical tuning parameter choices when error distributions are well-specified, but can substantially outperform these in misspecified models. An implementation of our methodology is available in the R package crossvalidationCP on CRAN.
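To make the setup in the abstract concrete, the following is a minimal Python sketch of 2-fold (odd/even) cross-validation for choosing the number of change-points in a piecewise-constant mean, with the loss passed in so that squared and absolute error can be compared. It is an illustration under stated assumptions, not the authors' implementation and not the API of crossvalidationCP: the function names (`fit_piecewise_constant`, `cv_loss`, `select_k`), the odd/even fold split, and the rule of predicting each holdout point from the segment of its left training neighbour are all choices made here for the example.

```python
import numpy as np

def fit_piecewise_constant(y, k):
    """Least-squares fit with exactly k change-points (dynamic programming).

    Returns the sorted start indices of segments 2..k+1. Requires len(y) > k.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    s1 = np.concatenate(([0.0], np.cumsum(y)))
    s2 = np.concatenate(([0.0], np.cumsum(y ** 2)))

    def cost(i, j):
        # Residual sum of squares of y[i:j] around its own mean.
        return s2[j] - s2[i] - (s1[j] - s1[i]) ** 2 / (j - i)

    # best[m, j]: minimal cost of splitting y[:j] into m + 1 segments.
    best = np.full((k + 1, n + 1), np.inf)
    arg = np.zeros((k + 1, n + 1), dtype=int)
    for j in range(1, n + 1):
        best[0, j] = cost(0, j)
    for m in range(1, k + 1):
        for j in range(m + 1, n + 1):
            cands = [best[m - 1, i] + cost(i, j) for i in range(m, j)]
            idx = int(np.argmin(cands))
            best[m, j] = cands[idx]
            arg[m, j] = idx + m  # start index of the last segment

    # Backtrack from the full sample to recover the change-points.
    cps, j = [], n
    for m in range(k, 0, -1):
        j = int(arg[m, j])
        cps.append(j)
    return sorted(cps)

def cv_loss(y, k, loss):
    """2-fold odd/even CV criterion for k change-points under a given loss."""
    y = np.asarray(y, dtype=float)
    idx = np.arange(len(y))
    total = 0.0
    for fold in (0, 1):
        test = idx[idx % 2 == fold]
        train = idx[idx % 2 != fold]
        y_train = y[train]
        cps = fit_piecewise_constant(y_train, k)
        bounds = [0] + cps + [len(train)]
        means = np.array([y_train[a:b].mean()
                          for a, b in zip(bounds[:-1], bounds[1:])])
        # Assumption: each holdout point inherits the fitted segment of its
        # left neighbour in the training set (the first point uses segment 0).
        pos = np.clip(np.searchsorted(train, test) - 1, 0, len(train) - 1)
        seg = np.searchsorted(cps, pos, side="right")
        total += float(loss(y[test] - means[seg]).sum())
    return total

def select_k(y, kmax, loss):
    """Return the k in 0..kmax minimising the CV criterion (first on ties)."""
    return int(np.argmin([cv_loss(y, k, loss) for k in range(kmax + 1)]))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    signal = np.repeat([0.0, 5.0, 1.0], 30)
    y = signal + rng.standard_normal(signal.size)
    print("squared error loss :", select_k(y, 5, np.square))
    print("absolute error loss:", select_k(y, 5, np.abs))
```

In this sketch the only systematic out-of-sample errors on a noise-free signal come from holdout points adjacent to a change, which are predicted from the wrong side of the jump. Squared error loss weights each such point by the squared jump size, while absolute error loss weights it by the jump size itself, which is one way to see why the choice of loss matters for the criterion. The dynamic programme here is a plain O(k n^2) implementation chosen for readability rather than speed.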





Cross-Validation for Correlated Data

K-fold cross-validation (CV) with squared error loss is widely used for ...

Change-point regression with a smooth additive disturbance

We assume a nonparametric regression model with signals given by the sum...

Objective Bayesian Analysis for Change Point Problems

In this paper we present an objective approach to change point analysis....

Statistical learning and cross-validation for point processes

This paper presents the first general (supervised) statistical learning ...

A consistent clustering-based approach to estimating the number of change-points in highly dependent time-series

The problem of change-point estimation is considered under a general fra...

Tuning Parameter Selection for Penalized Estimation via R2

The tuning parameter selection strategy for penalized estimation is cruc...

Multiscale change point detection via gradual bandwidth adjustment in moving sum processes

A method for the detection of changes in the expectation in univariate s...