Spectral Deconfounding and Perturbed Sparse Linear Models
Standard high-dimensional regression methods assume that the underlying coefficient vector is sparse. This assumption can fail, in particular in the presence of hidden confounding variables. Such hidden confounding can be represented as a high-dimensional linear model in which the sparse coefficient vector is perturbed. We develop and investigate a class of methods for such models. We propose spectral transformations, which change the singular values of the design matrix, as a preprocessing step for the data before it is passed to the Lasso. We show that, under some assumptions, one can achieve the optimal ℓ_1-error rate for estimating the underlying sparse coefficient vector. We also illustrate the performance on simulated data and on a real-world genomic dataset.
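To make the preprocessing idea concrete, here is a minimal sketch of one possible spectral transformation followed by a Lasso fit. The abstract does not specify the transformation, so the choice below (capping the singular values of the design matrix at their median) is an assumption made for illustration. The function names `trim_transform` and `lasso_ista`, the default threshold, and the simple proximal-gradient Lasso solver are likewise hypothetical and stand in for whatever implementation the authors used.

```python
import numpy as np

def trim_transform(X, tau=None):
    """Return a matrix F such that F @ X has its singular values capped at tau.

    Assumption for illustration: tau defaults to the median singular value of X.
    """
    U, d, _ = np.linalg.svd(X, full_matrices=False)
    if tau is None:
        tau = np.median(d)
    # Shrink factor per singular direction: min(d_i, tau) / d_i (guard against d_i = 0).
    scale = np.minimum(d, tau) / np.maximum(d, 1e-12)
    # Act as the identity outside the column space of X and rescale inside it.
    return np.eye(X.shape[0]) + (U * (scale - 1.0)) @ U.T

def lasso_ista(X, y, lam, n_iter=500):
    """Plain proximal-gradient (ISTA) solver for
    min_b  (1 / 2n) * ||y - X b||_2^2 + lam * ||b||_1.
    """
    n, p = X.shape
    step = n / (np.linalg.norm(X, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        z = beta - step * grad
        # Soft-thresholding is the proximal operator of the l1 penalty.
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return beta

def deconfounded_lasso(X, y, lam):
    """Apply the same transformation to X and y, then run the Lasso on the result."""
    F = trim_transform(X)
    return lasso_ista(F @ X, F @ y, lam)
```

A call such as `deconfounded_lasso(X, y, lam=0.1)` returns a coefficient estimate of length `X.shape[1]`. The penalty level `lam` is a placeholder here and would in practice be chosen by cross-validation or a similar procedure.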