Stochastic Orthant-Wise Limited-Memory Quasi-Newton Methods

04/26/2017
by   Jianqiao Wangni, et al.

The ℓ_1-regularized sparse model has been popular in the machine learning community. The orthant-wise limited-memory quasi-Newton (OWL-QN) method is a representative fast algorithm for training such models. However, multiple sources have pointed out that its published convergence proof is incorrect, and to date its convergence has not been established. In this paper, we propose a stochastic OWL-QN method for solving ℓ_1-regularized problems with both convex and non-convex loss functions, addressing technical difficulties that have stood for many years. We propose three alignment steps, generalized from the original OWL-QN algorithm, that encourage the parameter update to be orthant-wise. We also adopt several practical features from recent stochastic variants of L-BFGS, together with variance reduction for subsampled gradients. To the best of our knowledge, this is the first orthant-wise algorithm with a theoretical convergence rate comparable to that of stochastic first-order algorithms. We prove a linear convergence rate for our algorithm under strong convexity, and experimentally demonstrate that it achieves state-of-the-art performance on ℓ_1-regularized logistic regression and convolutional neural networks.
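To make the "orthant-wise" idea concrete, here is a minimal sketch of the two ingredients the abstract refers to: the pseudo-gradient of f(w) = loss(w) + λ‖w‖_1, and an alignment (projection) step that zeroes out coordinates whose update crosses the orthant boundary. This is a hedged illustration of the classic OWL-QN machinery, not the paper's stochastic algorithm; all function names are our own.

```python
import numpy as np

def pseudo_gradient(w, grad_loss, lam):
    """Pseudo-gradient of f(w) = loss(w) + lam * ||w||_1, as in OWL-QN.

    For nonzero coordinates the l1 term is differentiable; at zero we
    take the one-sided derivative of smaller magnitude (0 if the
    subdifferential contains 0).
    """
    pg = np.zeros_like(w)
    for i in range(len(w)):
        if w[i] > 0:
            pg[i] = grad_loss[i] + lam
        elif w[i] < 0:
            pg[i] = grad_loss[i] - lam
        else:
            right = grad_loss[i] + lam   # right-sided derivative at 0
            left = grad_loss[i] - lam    # left-sided derivative at 0
            if right < 0:
                pg[i] = right            # descent possible going positive
            elif left > 0:
                pg[i] = left             # descent possible going negative
            else:
                pg[i] = 0.0              # 0 is a stationary coordinate
    return pg

def orthant_project(w_new, orthant):
    """Alignment step: clip coordinates that left the chosen orthant to 0."""
    return np.where(np.sign(w_new) == orthant, w_new, 0.0)
```

A typical iteration picks the orthant from the current iterate (falling back to the negative pseudo-gradient sign at zero coordinates), takes a quasi-Newton or gradient step, then projects:

```python
w = np.array([1.0, -1.0, 0.0, 0.0])
g = np.array([0.5, 0.5, 2.0, -0.3])      # gradient of the smooth loss
pg = pseudo_gradient(w, g, lam=1.0)
orthant = np.where(w != 0, np.sign(w), -np.sign(pg))
w_next = orthant_project(w - 0.1 * pg, orthant)
```

The projection is what keeps iterates sparse: a coordinate that would change sign within a step is set exactly to zero rather than passed through.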


