A Finite Sample Complexity Bound for Distributionally Robust Q-learning

02/26/2023
by Shengbo Wang, et al.

We consider a reinforcement learning setting in which the deployment environment differs from the training environment. Applying a robust Markov decision process formulation, we extend the distributionally robust Q-learning framework studied in Liu et al. [2022]. Further, we improve the design and analysis of their multi-level Monte Carlo estimator. Assuming access to a simulator, we prove that the worst-case expected sample complexity of our algorithm to learn the optimal robust Q-function within an ϵ error in the sup norm is upper bounded by Õ(|S||A|(1-γ)^{-5} ϵ^{-2} p_∧^{-6} δ^{-4}), where γ is the discount rate, p_∧ is the minimal non-zero support probability of the transition kernels, and δ is the uncertainty size. This is the first sample complexity result for the model-free robust RL problem. Simulation studies further validate our theoretical results.
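To get a feel for how the stated bound scales, the sketch below evaluates only its leading-order term. It is an illustration under stated assumptions, not the paper's algorithm: the function name `robust_q_sample_scaling` and its parameter names are hypothetical, and the constants and logarithmic factors hidden by the Õ notation are dropped, so the returned number is meaningful only for comparing parameter settings, not as an actual sample count.

```python
def robust_q_sample_scaling(n_states, n_actions, gamma, eps, p_min, delta):
    """Leading-order term |S||A|(1-gamma)^-5 eps^-2 p_min^-6 delta^-4.

    Hypothetical helper: constants and log factors hidden by the
    O-tilde notation are NOT included, so use ratios, not raw values.
    """
    if n_states <= 0 or n_actions <= 0:
        raise ValueError("state and action counts must be positive")
    if not 0.0 < gamma < 1.0:
        raise ValueError("discount rate gamma must lie in (0, 1)")
    if eps <= 0.0 or delta <= 0.0:
        raise ValueError("eps and delta must be positive")
    if not 0.0 < p_min <= 1.0:
        raise ValueError("p_min must lie in (0, 1]")
    return (
        n_states
        * n_actions
        * (1.0 - gamma) ** -5
        * eps ** -2
        * p_min ** -6
        * delta ** -4
    )


if __name__ == "__main__":
    # Arbitrary illustrative parameter values, not taken from the paper.
    base = robust_q_sample_scaling(10, 4, 0.9, 0.1, 0.1, 0.5)
    tighter = robust_q_sample_scaling(10, 4, 0.9, 0.05, 0.1, 0.5)
    # Ratio of the two leading terms when eps is halved.
    print(tighter / base)
```

Comparing ratios rather than raw values makes the dependence visible: under this expression, halving ϵ multiplies the leading term by 4, while halving p_∧ multiplies it by 64.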


research
03/05/2023

Improved Sample Complexity Bounds for Distributionally Robust Reinforcement Learning

We consider the problem of learning a control policy that is robust agai...
research
11/21/2018

Model-Based Reinforcement Learning in Contextual Decision Processes

We study the sample complexity of model-based reinforcement learning in ...
research
08/23/2022

Convergence bounds for nonlinear least squares for tensor recovery

We consider the problem of approximating a function in general nonlinear...
research
12/19/2020

Sample Complexity of Adversarially Robust Linear Classification on Separated Data

We consider the sample complexity of learning with adversarial robustnes...
research
05/09/2021

Non-asymptotic Performances of Robust Markov Decision Processes

In this paper, we study the non-asymptotic performance of optimal policy...
research
05/28/2023

Sample Complexity of Variance-reduced Distributionally Robust Q-learning

Dynamic decision making under distributional shifts is of fundamental in...
research
09/08/2023

Learning Zero-Sum Linear Quadratic Games with Improved Sample Complexity

Zero-sum Linear Quadratic (LQ) games are fundamental in optimal control ...
