Fast Regression for Structured Inputs
We study the ℓ_p regression problem, which requires finding 𝐱 ∈ ℝ^d that minimizes ‖𝐀𝐱 − 𝐛‖_p for a matrix 𝐀 ∈ ℝ^{n×d} and response vector 𝐛 ∈ ℝ^n. There has been recent interest in developing subsampling methods for this problem that can outperform standard techniques when n is very large. However, all known subsampling approaches have a run time that depends exponentially on p, typically d^{𝒪(p)}, which can be prohibitively expensive. We improve on this work by showing that for a large class of common structured matrices, such as combinations of low-rank matrices, sparse matrices, and Vandermonde matrices, there are subsampling-based methods for ℓ_p regression that depend polynomially on p. For example, we give an algorithm for ℓ_p regression on Vandermonde matrices that runs in time 𝒪(n log^3 n + (dp^2)^{0.5+ω}·polylog n), where ω is the exponent of matrix multiplication. The polynomial dependence on p crucially allows our algorithms to extend naturally to efficient algorithms for ℓ_∞ regression, via approximation of ℓ_∞ by ℓ_{𝒪(log n)}. Of practical interest, we also develop a new subsampling algorithm for ℓ_p regression on arbitrary matrices, which is simpler than previous approaches for p ≥ 4.
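For context on the ℓ_∞-to-ℓ_{𝒪(log n)} reduction mentioned above, the following standard norm-equivalence bound (a textbook fact, not a result specific to this paper) shows why the two norms agree up to a constant factor:

\[
  \lVert v \rVert_\infty \;\le\; \lVert v \rVert_p \;\le\; n^{1/p}\,\lVert v \rVert_\infty
  \qquad \text{for every } v \in \mathbb{R}^n \text{ and } p \ge 1,
\]
\[
  \text{so taking } p = \ln n \text{ gives } n^{1/p} = e, \text{ and hence }
  \lVert v \rVert_\infty \;\le\; \lVert v \rVert_{\ln n} \;\le\; e\,\lVert v \rVert_\infty .
\]

Consequently, a constant-factor solution to ℓ_{𝒪(log n)} regression is a constant-factor solution to ℓ_∞ regression, and an algorithm whose cost is polynomial in p pays only polylog(n) overhead at p = 𝒪(log n), whereas a d^{𝒪(p)} dependence would become d^{𝒪(log n)}, i.e., quasipolynomial.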