An Asymptotic Analysis of Minibatch-Based Momentum Methods for Linear Regression Models

11/02/2021
by Yuan Gao et al.

Momentum methods have been shown, both in practice and in theory, to accelerate the convergence of the standard gradient descent algorithm. In particular, minibatch-based gradient descent methods with momentum (MGDM) are widely used to solve large-scale optimization problems involving massive datasets. Despite their practical success, the theoretical properties of MGDM methods remain underexplored. To this end, we investigate the theoretical properties of MGDM methods in the context of linear regression models. We first study the numerical convergence properties of the MGDM algorithm and provide the theoretically optimal tuning parameter specifications that achieve a faster convergence rate. In addition, we explore the relationship between the statistical properties of the resulting MGDM estimator and the tuning parameters. Based on these theoretical findings, we give conditions under which the resulting estimator attains the optimal statistical efficiency. Finally, extensive numerical experiments are conducted to verify our theoretical results.
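To make the setting concrete, the following is a minimal sketch of an MGDM-style update for least-squares linear regression, assuming a standard heavy-ball momentum form with randomly sampled minibatches; the function name, batch size, learning rate, and momentum weight shown here are illustrative choices, not the paper's specification.

```python
import numpy as np

def mgdm_linear_regression(X, y, batch_size=32, lr=0.01, momentum=0.9,
                           n_epochs=50, seed=0):
    """Minibatch gradient descent with heavy-ball momentum for least squares.

    Minimizes (1/2n) * ||y - X @ beta||^2 using randomly sampled minibatches.
    The learning rate `lr` and momentum weight `momentum` play the role of the
    tuning parameters whose effect on convergence and statistical efficiency
    the paper studies.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta = np.zeros(p)        # current regression coefficient estimate
    velocity = np.zeros(p)    # accumulated momentum (velocity) term

    for _ in range(n_epochs):
        perm = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = perm[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            # Minibatch gradient of the least-squares loss
            grad = Xb.T @ (Xb @ beta - yb) / len(idx)
            # Heavy-ball momentum update
            velocity = momentum * velocity - lr * grad
            beta = beta + velocity
    return beta


# Usage example on simulated data (coefficients 1..5, noise sd 0.5)
rng = np.random.default_rng(1)
n, p = 2000, 5
X = rng.normal(size=(n, p))
beta_true = np.arange(1, p + 1, dtype=float)
y = X @ beta_true + rng.normal(scale=0.5, size=n)
print(mgdm_linear_regression(X, y))
```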


