G̅_mst: An Unbiased Stratified Statistic and a Fast Gradient Optimization Algorithm Based on It

10/07/2021
by Aixiang Chen, et al.

The fluctuation of gradient expectation and variance caused by parameter updates between consecutive iterations is neglected or confounded by current mainstream gradient optimization algorithms. This paper remedies the issue by introducing a novel unbiased stratified statistic G̅_mst and establishing a sufficient condition for its fast convergence. A novel algorithm named MSSG, designed on the basis of G̅_mst, outperforms other SGD-like algorithms. Theoretical conclusions and experimental evidence strongly suggest employing MSSG when training deep models.
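The abstract does not detail how G̅_mst is constructed, but the general idea of an unbiased stratified gradient statistic can be illustrated with a small sketch. The stratum partition, per-stratum sample size, and the names stratified_mean_gradient and grad_fn below are illustrative assumptions, not the paper's actual method: gradients are averaged within each stratum and reweighted by the stratum's share of the data, which keeps the estimate unbiased for the full-data mean gradient and typically lowers its variance when strata are internally homogeneous.

```python
# Illustrative sketch (not the paper's exact G̅_mst construction): an
# unbiased stratified estimate of the mean gradient over a dataset.
# The stratum partition, sample size, and grad_fn are assumed inputs.
import numpy as np

def stratified_mean_gradient(grad_fn, strata, per_stratum, rng):
    """Estimate the full-data mean gradient by stratified sampling.

    grad_fn(i):  gradient of the loss on example i (a NumPy array).
    strata:      list of index arrays that partition the dataset.
    per_stratum: number of examples drawn from each stratum.
    """
    n_total = sum(len(s) for s in strata)
    estimate = None
    for stratum in strata:
        k = min(per_stratum, len(stratum))
        # Uniform sampling without replacement inside the stratum.
        chosen = rng.choice(stratum, size=k, replace=False)
        stratum_mean = np.mean([grad_fn(i) for i in chosen], axis=0)
        # Reweighting by the stratum's share of the data keeps the
        # combined estimate unbiased for the full-data mean gradient.
        contribution = (len(stratum) / n_total) * stratum_mean
        estimate = contribution if estimate is None else estimate + contribution
    return estimate

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(size=1000)                         # toy scalar "examples"
    strata = [np.arange(0, 500), np.arange(500, 1000)]   # hypothetical partition
    # For a toy loss 0.5 * (w - x_i)^2 evaluated at w = 0, the gradient is -x_i.
    print(stratified_mean_gradient(lambda i: np.array(-data[i]), strata, 32, rng))
```

In an SGD-like loop, such a stratified estimate would simply replace the plain minibatch mean gradient at each step; the convergence condition and the memory mechanism claimed for MSSG are specific to the paper and are not reproduced here.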


Related research

02/21/2022 · MSTGD: A Memory Stochastic sTratified Gradient Descent Method with an Exponential Convergence Rate
The fluctuation effect of gradient expectation and variance caused by pa...

11/20/2017 · Unbiased Simulation for Optimizing Stochastic Function Compositions
In this paper, we introduce an unbiased gradient simulation algorithms f...

02/06/2019 · On the Variance of Unbiased Online Recurrent Optimization
The recently proposed Unbiased Online Recurrent Optimization algorithm (...

03/03/2022 · AdaFamily: A family of Adam-like adaptive gradient methods
We propose AdaFamily, a novel method for training deep neural networks. ...

05/31/2023 · Toward Understanding Why Adam Converges Faster Than SGD for Transformers
While stochastic gradient descent (SGD) is still the most popular optimi...

02/06/2023 · U-Clip: On-Average Unbiased Stochastic Gradient Clipping
U-Clip is a simple amendment to gradient clipping that can be applied to...

11/04/2022 · How Does Adaptive Optimization Impact Local Neural Network Geometry?
Adaptive optimization methods are well known to achieve superior converg...
