Convergence of the mini-batch SIHT algorithm

09/29/2022
by Saeed Damadi, et al.

The Iterative Hard Thresholding (IHT) algorithm has been studied extensively as an effective deterministic algorithm for solving sparse optimization problems. The IHT algorithm benefits from the information of the batch (full) gradient at each point, and this information is crucial for the convergence analysis of the generated sequence. However, this strength becomes a weakness in machine learning and high-dimensional statistical applications, because calculating the batch gradient at each iteration is computationally expensive or impractical. Fortunately, in these applications the objective function has a summation structure that can be exploited to approximate the batch gradient by a stochastic mini-batch gradient. In this paper, we study the mini-batch Stochastic IHT (SIHT) algorithm for solving sparse optimization problems. In contrast to previous works, where an increasing and variable mini-batch size is required for the derivation, we fix the mini-batch size for all iterations according to a lower bound that we derive. To prove stochastic convergence of the objective function values, we first establish a critical sparse stochastic gradient descent property. Using this property, we show that the sequence generated by the stochastic mini-batch SIHT is a supermartingale and converges with probability one. Unlike previous work, we do not assume the objective function to be restricted strongly convex. To the best of our knowledge, in the regime of sparse optimization, this is the first result in the literature showing that the sequence of stochastic function values converges with probability one while the mini-batch size is fixed for all steps.
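
To make the update concrete, the sketch below illustrates one way a fixed-mini-batch SIHT iteration can be implemented: at every step a mini-batch of fixed size is sampled, a stochastic gradient step is taken, and the iterate is hard-thresholded back onto the set of s-sparse vectors. This is only a minimal sketch; the function names (hard_threshold, minibatch_siht, ls_grad), the least-squares example, and all parameter values are illustrative assumptions, not the exact algorithm statement or constants analyzed in the paper.

```python
import numpy as np


def hard_threshold(x, s):
    """Keep the s largest-magnitude entries of x and zero out the rest."""
    out = np.zeros_like(x)
    keep = np.argpartition(np.abs(x), -s)[-s:]
    out[keep] = x[keep]
    return out


def minibatch_siht(grad_fn, x0, s, step_size, batch_size, n_samples, n_iters, seed=0):
    """Illustrative mini-batch SIHT loop with a fixed mini-batch size.

    grad_fn(x, batch) is assumed to return the gradient of the objective
    restricted to the sampled mini-batch of example indices.
    """
    rng = np.random.default_rng(seed)
    x = hard_threshold(np.asarray(x0, dtype=float), s)
    for _ in range(n_iters):
        batch = rng.choice(n_samples, size=batch_size, replace=False)  # fixed-size mini-batch
        g = grad_fn(x, batch)                                          # stochastic gradient estimate
        x = hard_threshold(x - step_size * g, s)                       # gradient step, then s-sparse projection
    return x


# Toy usage (assumed example): sparse least squares, f(x) = (1/2n) * ||A @ x - b||^2.
n, d, s = 1000, 50, 5
rng = np.random.default_rng(42)
A = rng.standard_normal((n, d))
x_true = np.zeros(d)
x_true[:s] = rng.standard_normal(s)
b = A @ x_true


def ls_grad(x, batch):
    A_b, b_b = A[batch], b[batch]
    return A_b.T @ (A_b @ x - b_b) / len(batch)


x_hat = minibatch_siht(ls_grad, np.zeros(d), s=s, step_size=0.2,
                       batch_size=64, n_samples=n, n_iters=500)
```

Keeping batch_size constant across all iterations mirrors the setting studied in the paper, whereas earlier analyses typically let the mini-batch size grow along the iterations; the step size and batch size above are arbitrary illustration values, not the lower bound derived in the paper.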


Related research

06/14/2022
MBGDT: Robust Mini-Batch Gradient Descent
In high dimensions, most machine learning method perform fragile even th...

08/27/2018
Accelerating Asynchronous Stochastic Gradient Descent for Neural Machine Translation
In order to extract the best possible performance from asynchronous stoc...

11/17/2017
A Resizable Mini-batch Gradient Descent based on a Randomized Weighted Majority
Determining the appropriate batch size for mini-batch gradient descent i...

07/21/2021
Differentiable Annealed Importance Sampling and the Perils of Gradient Noise
Annealed importance sampling (AIS) and related algorithms are highly eff...

08/08/2019
Mini-batch Metropolis-Hastings MCMC with Reversible SGLD Proposal
Traditional MCMC algorithms are computationally intensive and do not sca...

10/17/2014
mS2GD: Mini-Batch Semi-Stochastic Gradient Descent in the Proximal Setting
We propose a mini-batching scheme for improving the theoretical complexi...

05/03/2023
Select without Fear: Almost All Mini-Batch Schedules Generalize Optimally
We establish matching upper and lower generalization error bounds for mi...
