High-dimensional data segmentation in regression settings permitting heavy tails and temporal dependence
We propose a data segmentation methodology for the high-dimensional linear regression problem where the regression parameters are allowed to undergo multiple changes. The proposed methodology, MOSEG, proceeds in two stages where the data is first scanned for multiple change points using a moving window-based procedure, which is followed by a location refinement stage. MOSEG enjoys computational efficiency thanks to the adoption of a coarse grid in the first stage, as well as achieving theoretical consistency in estimating both the total number and the locations of the change points without requiring independence or sub-Gaussianity. In particular, it nearly matches minimax optimal rates when Gaussianity is assumed. We also propose MOSEG.MS, a multiscale extension of MOSEG which, while comparable to MOSEG in terms of computational complexity, achieves theoretical consistency for a broader parameter space that permits multiscale change points. We demonstrate good performance of the proposed methods in comparative simulation studies and also in applications to climate science and economic datasets.
READ FULL TEXT