Faster Least Squares Optimization
We investigate randomized methods for solving overdetermined linear least-squares problems, where the Hessian is approximated using a random projection of the data matrix. We consider a random subspace embedding that is either drawn once at the beginning and then fixed, or refreshed at each iteration. We provide an exact finite-time analysis of the refreshed-embeddings method for a broad class of random matrices, an exact asymptotic analysis of the fixed-embedding method with a Gaussian matrix, and a non-asymptotic analysis of the fixed-embedding method for Gaussian and SRHT matrices, with and without momentum acceleration. Surprisingly, we show that, for Gaussian matrices, the refreshed sketching method with no momentum yields the same asymptotic rate of convergence as the fixed-embedding method accelerated with momentum. Furthermore, we characterize optimal step sizes and prove that, for a broad class of random matrices including the Gaussian ensemble, momentum does not accelerate the refreshed-embeddings method. Hence, among the class of randomized algorithms we consider, a fixed subspace embedding with momentum yields the fastest rate of convergence, along with the lowest computational complexity. Taking the accelerated fixed-embedding method as the algorithm of choice, we then obtain a faster algorithm by optimizing over the sketching dimension. Our choice of sketch size yields an algorithm for solving overdetermined least-squares problems with lower computational complexity than current state-of-the-art iterative least-squares methods based on randomized preconditioners. In particular, given the sketched data matrix, the resulting computational complexity becomes sub-linear in the problem dimensions as the sample size grows. We numerically validate our guarantees on large-sample datasets, for both Gaussian and SRHT embeddings.
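To make the fixed-embedding method concrete, here is a minimal sketch of an iteration of this kind: a Gaussian embedding of the data matrix is drawn once, the sketched Gram matrix serves as the approximate Hessian, and heavy-ball momentum accelerates the preconditioned gradient steps. The function name, step size mu, momentum parameter beta, and sketch size m below are illustrative placeholders, not the optimal values derived in the paper.

```python
import numpy as np

def sketched_least_squares(A, b, m, n_iters=50, mu=1.0, beta=0.0, seed=None):
    """Illustrative least-squares solver with a fixed Gaussian sketch of the Hessian.

    The exact Hessian A^T A is replaced by (S A)^T (S A) for a fixed Gaussian
    embedding S of sketch size m, and heavy-ball momentum (beta > 0) can be
    used to accelerate the fixed-embedding iterations.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    S = rng.standard_normal((m, n)) / np.sqrt(m)   # fixed Gaussian embedding
    SA = S @ A
    H = SA.T @ SA                                  # sketched Hessian approximation
    L = np.linalg.cholesky(H + 1e-12 * np.eye(d))  # factor once, reuse every iteration

    x_prev = np.zeros(d)
    x = np.zeros(d)
    for _ in range(n_iters):
        grad = A.T @ (A @ x - b)                   # exact gradient of 0.5*||Ax - b||^2
        # Solve H * step = grad using the Cholesky factor of the sketched Hessian.
        step = np.linalg.solve(L.T, np.linalg.solve(L, grad))
        x_new = x - mu * step + beta * (x - x_prev)  # heavy-ball momentum update
        x_prev, x = x, x_new
    return x

# Example: overdetermined system with n >> d and a sketch size between d and n.
rng = np.random.default_rng(0)
A = rng.standard_normal((4000, 50))
b = rng.standard_normal(4000)
x_hat = sketched_least_squares(A, b, m=400, n_iters=30, mu=1.0, beta=0.3, seed=1)
print(np.linalg.norm(A.T @ (A @ x_hat - b)))       # residual of the normal equations
```

A refreshed-embeddings variant would redraw S (and refactor the sketched Hessian) at every iteration, trading extra per-iteration cost for the convergence behavior analyzed in the paper.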