Cross-Validation for Correlated Data
K-fold cross-validation (CV) with squared error loss is widely used for evaluating predictive models, especially when strong distributional assumptions about the data cannot be made. However, CV with squared error loss is not free from distributional assumptions, particularly when the data are not i.i.d. This paper analyzes CV for correlated data. We present a criterion for the suitability of CV and introduce a bias-corrected cross-validation prediction error estimator, CV_c, which is suitable in many settings involving correlated data where CV is invalid. Our theoretical results are also demonstrated numerically.
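For orientation, the standard K-fold CV estimate of squared prediction error that the abstract refers to can be sketched as follows. This is a minimal illustration only: the function name `kfold_cv_mse`, the ordinary least squares model, and the clustered random-intercept data are assumptions made for the example and are not taken from the paper. The sketch does not implement the CV_c correction.

```python
import numpy as np

def kfold_cv_mse(X, y, k=5, seed=0):
    """Plain K-fold CV estimate of squared prediction error for OLS.

    Hypothetical helper for illustration; not the paper's CV_c estimator.
    """
    n = len(y)
    rng = np.random.default_rng(seed)
    # Randomly partition the observation indices into k folds.
    folds = np.array_split(rng.permutation(n), k)
    sq_errors = np.empty(n)
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        # Fit OLS on the training folds only.
        beta, *_ = np.linalg.lstsq(X[train_mask], y[train_mask], rcond=None)
        # Squared error on the held-out fold.
        sq_errors[test_idx] = (y[test_idx] - X[test_idx] @ beta) ** 2
    return sq_errors.mean()

# Assumed example of correlated data: observations in the same cluster
# share a random intercept, so they are not independent of each other.
rng = np.random.default_rng(1)
n_clusters, per_cluster = 20, 10
n = n_clusters * per_cluster
cluster = np.repeat(np.arange(n_clusters), per_cluster)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
u = rng.normal(size=n_clusters)[cluster]          # shared cluster effect
y = X @ np.array([1.0, 2.0]) + u + rng.normal(size=n)

cv_estimate = kfold_cv_mse(X, y, k=5)
```

Because the folds here are drawn at random, members of one cluster typically appear in both the training and the held-out folds. That is the kind of dependence between training and test data under which, according to the abstract, plain CV may no longer be a valid estimate of prediction error.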