SGD with Variance Reduction beyond Empirical Risk Minimization

10/16/2015
by Massil Achab, et al.

We introduce a doubly stochastic proximal gradient algorithm for optimizing a finite average of smooth convex functions whose gradients depend on numerically expensive expectations. Our main motivation is accelerating the optimization of the regularized Cox partial-likelihood, the core model used in survival analysis, but the algorithm can be used in other settings as well. It is doubly stochastic in the sense that gradient steps are done using stochastic gradient descent (SGD) with variance reduction, while the inner expectations are approximated by a Markov chain Monte Carlo (MCMC) algorithm. We derive conditions on the number of MCMC iterations that guarantee convergence, and we obtain a linear rate of convergence under strong convexity and a sublinear rate without this assumption. We show that our algorithm improves on the state-of-the-art solver for the regularized Cox partial-likelihood on several survival analysis datasets.
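To make the "numerically expensive inner expectation" concrete in the Cox case, recall the standard partial log-likelihood and its gradient; this is textbook survival-analysis background rather than material quoted from the paper:

    \ell(\beta) = \sum_{i \,:\, \delta_i = 1} \Big( x_i^\top \beta - \log \sum_{j \in R(t_i)} e^{x_j^\top \beta} \Big),
    \qquad
    \nabla(-\ell)(\beta) = \sum_{i \,:\, \delta_i = 1} \Big( \mathbb{E}_{j \sim \pi_{i,\beta}}[x_j] - x_i \Big),
    \quad \pi_{i,\beta}(j) \propto e^{x_j^\top \beta} \text{ for } j \in R(t_i),

where R(t_i) is the risk set at the i-th event time and \delta_i the censoring indicator. Each per-event gradient thus contains an expectation under a softmax distribution over a possibly large risk set, and this is the quantity the MCMC inner loop estimates.

Below is a minimal NumPy sketch of one plausible instantiation: prox-SVRG on the outside, a Metropolis-Hastings chain on the inside. The function names, the uniform-proposal chain, the L1 proximal step, and the growing per-epoch MCMC schedule are all illustrative assumptions, not the paper's exact algorithm.

    import numpy as np

    def prox_l1(v, t):
        # Soft-thresholding: proximal operator of t * ||.||_1
        # (the L1 regularizer is an assumption, not taken from the paper).
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def mcmc_risk_mean(X, beta, risk, n_mcmc, rng):
        # Estimate E_pi[x_j] with pi(j) proportional to exp(x_j @ beta) over
        # the risk set `risk`, via Metropolis-Hastings with uniform proposals.
        j = rng.choice(risk)
        total = np.zeros(X.shape[1])
        for _ in range(n_mcmc):
            k = rng.choice(risk)                      # propose uniformly on the risk set
            if np.log(rng.random()) < (X[k] - X[j]) @ beta:
                j = k                                 # accept w.p. min(1, pi(k)/pi(j))
            total += X[j]
        return total / n_mcmc

    def comp_grad(X, beta, i, risk, n_mcmc, rng):
        # MCMC estimate of the i-th component gradient of the negative
        # Cox partial log-likelihood: E_pi[x_j] - x_i.
        return mcmc_risk_mean(X, beta, risk, n_mcmc, rng) - X[i]

    def doubly_stochastic_prox_svrg(X, events, lam, step, epochs, n_mcmc0, seed=0):
        # events: list of (i, risk_indices) pairs, one per observed failure time.
        # SVRG-style variance reduction outside, MCMC approximation inside; the
        # chain length grows each epoch so the inner bias shrinks (schedule guessed).
        rng = np.random.default_rng(seed)
        beta = np.zeros(X.shape[1])
        for epoch in range(epochs):
            n_mcmc = n_mcmc0 * (epoch + 1)            # increasing MCMC precision
            snap = beta.copy()                        # SVRG snapshot point
            full = np.mean([comp_grad(X, snap, i, r, n_mcmc, rng)
                            for i, r in events], axis=0)
            for _ in range(len(events)):              # one inner SVRG pass
                i, r = events[rng.integers(len(events))]
                g = (comp_grad(X, beta, i, r, n_mcmc, rng)
                     - comp_grad(X, snap, i, r, n_mcmc, rng) + full)
                beta = prox_l1(beta - step * g, step * lam)
        return beta

Here `events` would hold one (index, risk-set) pair per uncensored observation. The paper's convergence conditions dictate how fast the chain length must actually grow to retain the linear rate under strong convexity; this sketch only gestures at that with a linearly increasing schedule.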


