Weighted SGD for ℓ_p Regression with Randomized Preconditioning

02/12/2015
by Jiyan Yang, et al.

In recent years, stochastic gradient descent (SGD) methods and randomized linear algebra (RLA) algorithms have been applied to many large-scale problems in machine learning and data analysis. We aim to bridge the gap between these two classes of methods in solving constrained overdetermined linear regression problems (e.g., ℓ_2 and ℓ_1 regression). We propose a hybrid algorithm named pwSGD that uses RLA techniques for preconditioning and for constructing an importance sampling distribution, and then performs an SGD-like iterative process with weighted sampling on the preconditioned system. We prove that pwSGD inherits faster convergence rates that depend only on the lower dimension of the linear system, while maintaining low computational complexity. In particular, when solving an ℓ_1 regression problem of size n by d, pwSGD returns an approximate solution with ϵ relative error in the objective value in O(log n · nnz(A) + poly(d)/ϵ^2) time. This complexity is uniformly better than that of RLA methods in terms of both ϵ and d when the problem is unconstrained. For ℓ_2 regression, pwSGD returns an approximate solution with ϵ relative error in the objective value and in the solution vector measured in the prediction norm in O(log n · nnz(A) + poly(d) log(1/ϵ)/ϵ) time. We also provide lower bounds on the coreset complexity for more general regression problems, indicating that new ideas will be needed to extend similar RLA preconditioning techniques to weighted SGD algorithms for these problems. Finally, the effectiveness of such algorithms is illustrated numerically on both synthetic and real datasets.
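To make the pwSGD recipe concrete, below is a minimal sketch for unconstrained ℓ_2 regression following the three-stage structure described in the abstract: precondition with RLA, build an importance sampling distribution, then run weighted SGD on the preconditioned system. The specific choices here are assumptions for illustration, not the paper's pseudocode: a dense Gaussian sketch (the paper favors sparse embeddings for the nnz(A) running time), exact row norms of the preconditioned matrix in place of fast approximate leverage scores, and a simple Kaczmarz-style decaying step size.

```python
# A minimal sketch of the pwSGD structure for unconstrained l2 regression,
# min_x ||Ax - b||_2. Assumptions (not the paper's exact algorithm): a dense
# Gaussian sketch, exact row norms of the preconditioned matrix as the
# sampling distribution, and a simple decaying step size.
import numpy as np

def pwsgd_l2(A, b, num_iters=20000, sketch_size=None, seed=0):
    rng = np.random.default_rng(seed)
    n, d = A.shape
    s = sketch_size or 4 * d

    # Step 1 (RLA preconditioning): sketch A and QR-factorize the small
    # sketch. U = A R^{-1} is then nearly orthonormal (well-conditioned).
    S = rng.standard_normal((s, n)) / np.sqrt(s)
    _, R = np.linalg.qr(S @ A)
    U = A @ np.linalg.inv(R)                    # n x d preconditioned matrix

    # Step 2 (importance sampling distribution): squared row norms of U,
    # which for a nearly orthonormal U approximate its leverage scores.
    probs = np.sum(U**2, axis=1)
    probs /= probs.sum()

    # Step 3 (weighted SGD on the preconditioned system min_y ||Uy - b||_2):
    # sample row i with probability p_i and reweight by 1/p_i, so each step
    # uses an unbiased estimate of the full gradient U^T (Uy - b).
    y = np.zeros(d)
    rows = rng.choice(n, size=num_iters, p=probs)
    for t, i in enumerate(rows):
        resid = U[i] @ y - b[i]
        y -= (resid / probs[i]) * U[i] / (d + t)  # ~Kaczmarz early, ~1/t later
    return np.linalg.solve(R, y)                # undo preconditioning: x = R^{-1} y

# Usage on a synthetic overdetermined system.
rng = np.random.default_rng(1)
A = rng.standard_normal((2000, 10))
b = A @ rng.standard_normal(10) + 0.01 * rng.standard_normal(2000)
x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
x = pwsgd_l2(A, b)
print("relative error:", np.linalg.norm(x - x_star) / np.linalg.norm(x_star))
```

Note the role of the preconditioner: because U is nearly orthonormal, the stochastic iteration enjoys a condition-number-free convergence rate that depends only on the low dimension d, which is the source of the speedups claimed above. With the assumed row-norm sampling, the first step reduces to a randomized Kaczmarz projection, which is why the step size is written as 1/(d + t).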

Related research:

07/16/2022  Adaptive Sketches for Robust Regression with Importance Sampling
02/09/2018  Large Scale Constrained Linear Regression Revisited: Faster Algorithms via Preconditioning
10/21/2013  Stochastic Gradient Descent, Weighted Sampling, and the Randomized Kaczmarz algorithm
02/01/2017  On SGD's Failure in Practice: Characterizing and Overcoming Stalling
07/11/2013  Fast gradient descent for drifting least squares regression, with application to bandits
06/07/2020  An Efficient Algorithm For Generalized Linear Bandit: Online Stochastic Gradient Descent and Thompson Sampling
02/10/2015  Implementing Randomized Matrix Algorithms in Parallel and Distributed Environments
