AsyncQVI: Asynchronous-Parallel Q-Value Iteration for Reinforcement Learning with Near-Optimal Sample Complexity

12/03/2018
by   Yibo Zeng, et al.

In this paper, we propose AsyncQVI, an asynchronous-parallel Q-value iteration algorithm for solving reinforcement learning (RL) problems. Given an RL problem with |S| states, |A| actions, and a discount factor γ∈(0,1), AsyncQVI returns an ε-optimal policy with probability at least 1-δ using Õ(|S||A| / ((1-γ)^5 ε^2) · log(1/δ)) samples. AsyncQVI is the first asynchronous-parallel RL algorithm with a convergence rate analysis and an explicit sample complexity, and this sample complexity nearly matches the lower bound. Furthermore, AsyncQVI is scalable: it has a low memory footprint of O(|S|) and admits an efficient asynchronous-parallel implementation.
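To give a feel for how the stated bound scales, the following minimal sketch evaluates only its leading term, |S||A| / ((1-γ)^5 ε^2) · log(1/δ). It assumes the δ-dependence is log(1/δ) and drops the constants and logarithmic factors hidden by the Õ notation, so the values are relative magnitudes, not actual sample counts. The function name `leading_sample_term` and all numeric inputs are hypothetical and chosen for illustration only.

```python
import math


def leading_sample_term(num_states, num_actions, gamma, eps, delta):
    """Leading term |S||A| / ((1-gamma)^5 eps^2) * log(1/delta).

    Illustrative only: constants and hidden log factors of the
    O-tilde bound are omitted (an assumption of this sketch).
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie in (0, 1)")
    if eps <= 0.0 or not 0.0 < delta < 1.0:
        raise ValueError("need eps > 0 and delta in (0, 1)")
    horizon_factor = (1.0 - gamma) ** 5
    return num_states * num_actions / (horizon_factor * eps ** 2) * math.log(1.0 / delta)


# Hypothetical problem sizes, used only to compare ratios.
base = leading_sample_term(100, 10, gamma=0.9, eps=0.1, delta=0.05)
halved_eps = leading_sample_term(100, 10, gamma=0.9, eps=0.05, delta=0.05)
longer_horizon = leading_sample_term(100, 10, gamma=0.95, eps=0.1, delta=0.05)

print(halved_eps / base)      # ratio from halving eps
print(longer_horizon / base)  # ratio from halving 1 - gamma
```

Under these assumptions, halving ε multiplies the leading term by 4, while halving 1-γ (here, moving γ from 0.9 to 0.95) multiplies it by 2^5 = 32, which is why the dependence on the discount factor dominates in long-horizon problems.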


