Accelerated Doubly Stochastic Gradient Algorithm for Large-scale Empirical Risk Minimization

by Zebang Shen, et al.
Zhejiang University

Algorithms with fast convergence, small memory footprints, and low per-iteration complexity are particularly favorable for artificial intelligence applications. In this paper, we propose a doubly stochastic algorithm with a novel accelerating multi-momentum technique to solve large-scale empirical risk minimization problems for learning tasks. While enjoying a provably superior convergence rate, the algorithm accesses only a mini-batch of samples and updates only a small block of variable coordinates in each iteration, which substantially reduces the number of memory references when both a massive sample size and ultra-high dimensionality are involved. Empirical studies on huge-scale datasets illustrate the efficiency of our method in practice.
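To make the "doubly stochastic" access pattern concrete, the following is a minimal sketch of a plain (non-accelerated) step that samples both a mini-batch of rows and a block of coordinates. It is not the algorithm proposed in the paper: the multi-momentum acceleration is omitted because the abstract does not specify it. The least-squares loss, the function names, and all parameter values are assumptions chosen for illustration.

```python
import random

def doubly_stochastic_step(x, A, b, batch_size, block_size, lr, rng):
    """One plain doubly stochastic gradient step for least squares (illustrative)."""
    n, d = len(A), len(x)
    batch = rng.sample(range(n), batch_size)   # random mini-batch of samples
    block = rng.sample(range(d), block_size)   # random block of coordinates
    # Residuals a_i^T x - b_i for the sampled rows only.
    # This naive sketch reads each sampled row in full to form the residual.
    residuals = [sum(A[i][j] * x[j] for j in range(d)) - b[i] for i in batch]
    # Update only the sampled coordinates, using the mini-batch gradient estimate.
    for j in block:
        g_j = sum(r * A[i][j] for r, i in zip(residuals, batch)) / batch_size
        x[j] -= lr * g_j
    return x

def empirical_risk(x, A, b):
    """Average squared error (1 / 2n) * sum_i (a_i^T x - b_i)^2."""
    n = len(A)
    total = 0.0
    for row, b_i in zip(A, b):
        pred = sum(a_ij * x_j for a_ij, x_j in zip(row, x))
        total += (pred - b_i) ** 2
    return total / (2 * n)

def run_demo(seed=0, n=200, d=20, iters=3000):
    """Run the sketch on synthetic noiseless data; returns (initial, final) risk."""
    rng = random.Random(seed)
    x_true = [rng.gauss(0, 1) for _ in range(d)]
    A = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
    b = [sum(a * t for a, t in zip(row, x_true)) for row in A]
    x = [0.0] * d
    start = empirical_risk(x, A, b)
    for _ in range(iters):
        doubly_stochastic_step(x, A, b, batch_size=10, block_size=5,
                               lr=0.05, rng=rng)
    return start, empirical_risk(x, A, b)
```

Calling `run_demo()` returns the empirical risk before and after the iterations. Each step touches only `batch_size` rows and writes only `block_size` coordinates, which is the property the abstract highlights.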



Accelerated Variance Reduced Block Coordinate Descent

Algorithms with fast convergence, small number of data access, and low p...

Doubly Accelerated Stochastic Variance Reduced Dual Averaging Method for Regularized Empirical Risk Minimization

In this paper, we develop a new accelerated stochastic gradient method f...

Learning Large Scale Sparse Models

In this work, we consider learning sparse models in large scale settings...

Stochastic Gradient Made Stable: A Manifold Propagation Approach for Large-Scale Optimization

Stochastic gradient descent (SGD) holds as a classical method to build l...

Decentralized Stochastic Gradient Tracking for Non-convex Empirical Risk Minimization

This paper studies a decentralized stochastic gradient tracking (DSGT) a...

Decentralized Stochastic Gradient Tracking for Empirical Risk Minimization

Recent works have shown superiorities of decentralized SGD to centralize...
