A new formula for fast computation of segmented cross validation residuals in linear regression modelling – providing efficient regularisation parameter estimation in Ridge R
In the present paper we prove a new theorem, resulting in an exact updating formula for linear regression model residuals to calculate the segmented cross-validation residuals for any choice of cross-validation strategy without model refitting. The required matrix inversions are limited by the cross-validation segment sizes and can be executed with high efficiency in parallel. The well-known formula for leave-one-out cross-validation follows as a special case of our theorem. In situations where the cross-validation segments consist of small groups of repeated measurements, we suggest a heuristic strategy for fast serial approximations of the cross-validated residuals and associated PRESS statistic. We also suggest strategies for quick estimation of the exact minimum PRESS value and full PRESS function over a selected interval of regularisation values. The computational effectiveness of the parameter selection for Ridge-/Tikhonov regression modelling resulting from our theoretical findings and heuristic arguments is demonstrated for several practical applications.
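To make the idea of obtaining segmented cross-validation residuals without refitting concrete, here is a minimal NumPy sketch. It does not reproduce the paper's theorem. It uses the classical block hat-matrix identity, which is assumed here to apply to ridge regression without an intercept and without per-segment centring: the residuals for a held-out segment k are (I - H_kk)^{-1} r_k, where H is the hat matrix and r the residuals of the full fit. The function names `segmented_cv_residuals` and `refit_cv_residuals`, the `segments` label vector and the parameter `lam` are hypothetical, chosen for illustration only.

```python
import numpy as np

def segmented_cv_residuals(X, y, segments, lam):
    """CV residuals from one full fit (ridge, no intercept, no centring)."""
    n, p = X.shape
    # Hat matrix of the full ridge fit: H = X (X'X + lam*I)^{-1} X'
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    r = y - H @ y  # ordinary (fitted) residuals
    r_cv = np.empty(n)
    for k in np.unique(segments):
        idx = np.flatnonzero(segments == k)
        # Only a block of the segment's size has to be solved per segment
        H_kk = H[np.ix_(idx, idx)]
        r_cv[idx] = np.linalg.solve(np.eye(len(idx)) - H_kk, r[idx])
    return r_cv

def refit_cv_residuals(X, y, segments, lam):
    """Reference implementation: refit the model once per held-out segment."""
    n, p = X.shape
    r_cv = np.empty(n)
    for k in np.unique(segments):
        test = segments == k
        Xt, yt = X[~test], y[~test]
        beta = np.linalg.solve(Xt.T @ Xt + lam * np.eye(p), Xt.T @ yt)
        r_cv[test] = y[test] - X[test] @ beta
    return r_cv

def press(r_cv):
    """PRESS statistic: sum of squared cross-validated residuals."""
    return float(np.sum(r_cv ** 2))
```

With one sample per segment, the block solve reduces to dividing each residual by 1 - h_ii, which is the well-known leave-one-out formula. Because the per-segment solves are independent of one another, they could be dispatched in parallel. Evaluating `press` over a grid of `lam` values gives a simple, unoptimised way to trace the PRESS function.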