Replicability in Reinforcement Learning

05/31/2023
by Amin Karbasi, et al.

We initiate the mathematical study of replicability as an algorithmic property in the context of reinforcement learning (RL). We focus on the fundamental setting of discounted tabular MDPs with access to a generative model. Inspired by Impagliazzo et al. [2022], we say that an RL algorithm is replicable if, with high probability, it outputs the exact same policy in two executions on i.i.d. samples drawn from the generator when its internal randomness is the same. We first provide an efficient ρ-replicable algorithm for (ε, δ)-optimal policy estimation with sample and time complexity O(N^3·log(1/δ) / ((1-γ)^5·ε^2·ρ^2)), where N is the number of state-action pairs. Next, for the subclass of deterministic algorithms, we provide a lower bound of order Ω(N^3 / ((1-γ)^3·ε^2·ρ^2)). Then, we study a relaxed version of replicability proposed by Kalavasis et al. [2023] called TV indistinguishability. We design a computationally efficient TV indistinguishable algorithm for policy estimation whose sample complexity is O(N^2·log(1/δ) / ((1-γ)^5·ε^2·ρ^2)). At the cost of exp(N) running time, we transform these TV indistinguishable algorithms into ρ-replicable ones without increasing their sample complexity. Finally, we introduce the notion of approximate replicability, where we only require that the two output policies are close under an appropriate statistical divergence (e.g., Rényi), and show an improved sample complexity of O(N·log(1/δ) / ((1-γ)^5·ε^2·ρ^2)).
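To make the replicability definition concrete, here is a minimal Python sketch for a much simpler task than policy estimation: estimating a scalar mean. It is not the algorithm from the paper. It uses randomized rounding to a randomly shifted grid, an idea in the spirit of Impagliazzo et al. [2022], and every name in it (`replicable_mean`, `draw_samples`, `agreement_rate`, `grid_width`, `shared_seed`) is a hypothetical choice made for illustration. The shared seed plays the role of the algorithm's internal randomness, while the data seeds model two independent i.i.d. sample draws.

```python
import math
import random


def replicable_mean(samples, grid_width, shared_seed):
    """Round the empirical mean to the centre of a randomly shifted grid cell.

    The grid shift is drawn from `shared_seed`, which stands in for the
    algorithm's internal randomness (an illustrative assumption).
    """
    rng = random.Random(shared_seed)
    offset = rng.uniform(0.0, grid_width)  # same shift in both executions
    mean = sum(samples) / len(samples)
    # Cells are [offset + k*w, offset + (k+1)*w); return the cell centre.
    k = math.floor((mean - offset) / grid_width)
    return offset + (k + 0.5) * grid_width


def draw_samples(n, data_seed):
    """Draw n i.i.d. Uniform[0, 1] samples (a stand-in for the generator)."""
    rng = random.Random(data_seed)
    return [rng.random() for _ in range(n)]


def agreement_rate(trials=200, n=2000, grid_width=0.2):
    """Fraction of trials in which two executions on fresh samples,
    sharing internal randomness, return exactly the same value."""
    agree = 0
    for t in range(trials):
        shared_seed = 10_000 + t
        a = replicable_mean(draw_samples(n, 2 * t), grid_width, shared_seed)
        b = replicable_mean(draw_samples(n, 2 * t + 1), grid_width, shared_seed)
        agree += (a == b)
    return agree / trials


if __name__ == "__main__":
    print("exact agreement rate:", agreement_rate())
```

The two executions disagree only when their empirical means fall on opposite sides of a grid boundary, which becomes less likely as the sample size grows relative to the grid width. The cost is accuracy: the output can differ from the empirical mean by up to half a grid width, a toy version of the trade-off between ε and ρ in the bounds above.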

research
05/18/2023

Optimistic Natural Policy Gradient: a Simple Efficient Policy Optimization Framework for Online RL

While policy optimization algorithms have played an important role in re...
research
02/10/2023

Towards Minimax Optimality of Model-based Robust Reinforcement Learning

We study the sample complexity of obtaining an ϵ-optimal policy in Robus...
research
07/13/2020

A Provably Efficient Sample Collection Strategy for Reinforcement Learning

A common assumption in reinforcement learning (RL) is to have access to ...
research
01/20/2022

Reproducibility in Learning

We introduce the notion of a reproducible algorithm in the context of le...
research
07/01/2020

Sequential Transfer in Reinforcement Learning with a Generative Model

We are interested in how to design reinforcement learning agents that pr...
research
05/27/2022

KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal

In this work, we consider and analyze the sample complexity of model-fre...
research
02/18/2020

Empirical Policy Evaluation with Supergraphs

We devise and analyze algorithms for the empirical policy evaluation pro...
