
Momentum Q-learning with Finite-Sample Convergence Guarantee

07/30/2020
by Bowen Weng, et al.

Existing studies indicate that momentum ideas from conventional optimization can improve the performance of Q-learning algorithms. However, finite-sample analysis of momentum-based Q-learning algorithms is available only for the tabular case, without function approximation. This paper analyzes a class of momentum-based Q-learning algorithms with finite-sample guarantees. Specifically, we propose the MomentumQ algorithm, which integrates Nesterov's and Polyak's momentum schemes and generalizes existing momentum-based Q-learning algorithms. For the infinite state-action space case, we establish a convergence guarantee for MomentumQ with linear function approximation and Markovian sampling. In particular, we characterize a finite-sample convergence rate that is provably faster than that of vanilla Q-learning. This is the first finite-sample analysis of momentum-based Q-learning algorithms with function approximation. For the tabular case under synchronous sampling, we also obtain a finite-sample convergence rate that is slightly better than that of SpeedyQ when choosing a special family of step sizes. Finally, we demonstrate through various experiments that the proposed MomentumQ outperforms other momentum-based Q-learning algorithms.
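To make the tabular, synchronous-sampling setting concrete, the following is a minimal sketch of Q-learning with a Polyak-style (heavy-ball) momentum term. It is not the MomentumQ update from the paper, which the abstract does not spell out. The function name `momentum_q_sketch`, the step-size schedule `alpha0 / (1 + decay * t)`, and the constant momentum weight `beta` are all assumptions made for illustration.

```python
import numpy as np


def momentum_q_sketch(P, R, gamma, num_iters,
                      alpha0=0.8, decay=0.0, beta=0.1, seed=0):
    """Synchronous tabular Q-learning with a heavy-ball momentum term.

    P: (S, A, S) array of transition probabilities.
    R: (S, A) array of rewards.
    Illustrative only: the schedule and the momentum weight are assumptions,
    not the step sizes analyzed in the paper.
    """
    rng = np.random.default_rng(seed)
    num_states, num_actions = R.shape
    q_prev = np.zeros((num_states, num_actions))
    q = np.zeros((num_states, num_actions))
    for t in range(num_iters):
        alpha = alpha0 / (1.0 + decay * t)  # hypothetical step-size schedule
        # Synchronous sampling: draw one next state for every (s, a) pair.
        next_states = np.array([
            [rng.choice(num_states, p=P[s, a]) for a in range(num_actions)]
            for s in range(num_states)
        ])
        # Empirical Bellman optimality target built from the sampled states.
        target = R + gamma * q.max(axis=1)[next_states]
        # Vanilla Q-learning step plus momentum on the previous increment.
        q_next = q + alpha * (target - q) + beta * (q - q_prev)
        q_prev, q = q, q_next
    return q
```

Setting `beta=0` recovers plain synchronous Q-learning, which makes it straightforward to compare the two updates on the same small MDP. With stochastic transitions, a decaying step size (`decay > 0`) would typically be needed for the iterates to settle.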
