Orthant Based Proximal Stochastic Gradient Method for ℓ_1-Regularized Optimization

04/07/2020
by Tianyi Chen, et al.

Sparsity-inducing regularization problems are ubiquitous in machine learning applications, ranging from feature selection to model compression. In this paper, we present a novel stochastic method, the Orthant Based Proximal Stochastic Gradient Method (OBProx-SG), to solve perhaps the most popular instance, the ℓ_1-regularized problem. The OBProx-SG method consists of two steps: (i) a proximal stochastic gradient step to predict a support cover of the solution; and (ii) an orthant step that aggressively enhances the sparsity level via orthant face projection. Compared with state-of-the-art methods such as Prox-SG, RDA, and Prox-SVRG, OBProx-SG not only converges to global optimal solutions (in the convex setting) or stationary points (in the non-convex setting), but also promotes substantially sparser solutions. In particular, on a large set of convex problems, OBProx-SG consistently outperforms the existing methods in both sparsity exploration and objective value. Moreover, experiments on non-convex deep neural networks such as MobileNetV1 and ResNet18 further demonstrate its advantage, yielding solutions of much higher sparsity without sacrificing generalization accuracy.
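The abstract describes a two-phase scheme: soft-thresholded proximal stochastic gradient steps to predict a support, followed by stochastic gradient steps restricted to the orthant face selected by the current signs. Below is a minimal NumPy sketch of that structure, assuming a generic stochastic gradient oracle `grad_fn`, a single fixed step size, and fixed phase lengths; the function names and the exact orthant-phase update are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t * ||x||_1 (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def obprox_sg_sketch(grad_fn, x0, lam, lr, n_prox_steps, n_orthant_steps):
    """Sketch of the two phases described in the abstract.

    grad_fn(x) is assumed to return a stochastic gradient of the smooth loss;
    lam is the l1 regularization weight.
    """
    x = x0.astype(float).copy()

    # Phase (i): proximal stochastic gradient steps to predict a support cover.
    for _ in range(n_prox_steps):
        x = soft_threshold(x - lr * grad_fn(x), lr * lam)

    # Orthant face selected by the signs of the current iterate;
    # coordinates that are zero stay fixed at zero during the orthant phase.
    signs = np.sign(x)

    # Phase (ii): orthant steps. Inside the chosen orthant the l1 term is
    # linear, so its gradient on the support is lam * signs; each update is a
    # stochastic gradient step followed by projection back onto the orthant
    # face (any coordinate that would change sign is set to zero).
    for _ in range(n_orthant_steps):
        g = grad_fn(x) + lam * signs
        x_trial = x - lr * g
        x = np.where(signs * x_trial > 0, x_trial, 0.0)

    return x
```

The projection in the orthant phase zeroes every coordinate that would cross its orthant boundary, which is the mechanism that pushes sparsity beyond what soft-thresholding alone achieves.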


