Distributionally Robust Offline Reinforcement Learning with Linear Function Approximation

09/14/2022
by Xiaoteng Ma, et al.
Two factors critically hinder the application of reinforcement learning (RL) to real-world problems: limited data, and the mismatch between the testing environment (the real environment in which the policy is deployed) and the training environment (e.g., a simulator). This paper attempts to address both issues simultaneously with distributionally robust offline RL, in which we learn a distributionally robust policy from historical data obtained from the source environment by optimizing against a worst-case perturbation of that environment. In particular, we move beyond tabular settings and consider linear function approximation. More specifically, we consider two settings: one where the dataset is well-explored, and one where the dataset has sufficient coverage. We propose two algorithms, one for each setting, that achieve error bounds of Õ(d^{1/2}/N^{1/2}) and Õ(d^{3/2}/N^{1/2}) respectively, where d is the dimension of the linear function approximation and N is the number of trajectories in the dataset. To the best of our knowledge, these are the first non-asymptotic sample-complexity results in this setting. Diverse experiments support our theoretical findings and show that our algorithm outperforms its non-robust counterpart.
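To make the phrase "optimizing against a worst-case perturbation" concrete, here is a minimal sketch of a robust Bellman backup. It is only a toy illustration and not the paper's algorithm: it assumes a tabular MDP and a small finite set of candidate transition models, whereas the paper works with linear function approximation and its own uncertainty set. The function name, the two-state MDP, and all numbers are hypothetical.

```python
# Toy illustration only: tabular MDP with a finite uncertainty set.
# Everything here (names, numbers, the finite set of models) is an
# assumption for illustration and is not taken from the paper.

def robust_value_iteration(rewards, models, gamma=0.9, iters=200):
    """Value iteration where each backup uses the worst model in `models`.

    rewards[s][a]      : immediate reward for action a in state s
    models[k][s][a][t] : probability of moving s -> t under action a, model k
    """
    n_states = len(rewards)
    values = [0.0] * n_states
    for _ in range(iters):
        new_values = []
        for s in range(n_states):
            action_values = []
            for a in range(len(rewards[s])):
                # Worst-case expected next value over the candidate models.
                worst = min(
                    sum(p * v for p, v in zip(model[s][a], values))
                    for model in models
                )
                action_values.append(rewards[s][a] + gamma * worst)
            # Act greedily against the worst case.
            new_values.append(max(action_values))
        values = new_values
    return values


# State 0 offers a "safe" action (index 0) and a "risky" action (index 1);
# state 1 is an absorbing state that pays reward 1 at every step.
rewards = [[0.5, 0.0], [1.0, 1.0]]
nominal = [[[1.0, 0.0], [0.1, 0.9]], [[0.0, 1.0], [0.0, 1.0]]]
perturbed = [[[1.0, 0.0], [0.9, 0.1]], [[0.0, 1.0], [0.0, 1.0]]]

nominal_values = robust_value_iteration(rewards, [nominal])
robust_values = robust_value_iteration(rewards, [nominal, perturbed])
```

In this toy problem the risky action looks best under the nominal model alone, but once the perturbed model is included in the uncertainty set the safe action becomes preferable, so the robust value of state 0 is lower than the nominal one.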

research
11/08/2021

Understanding the Effects of Dataset Characteristics on Offline Reinforcement Learning

In real world, affecting the environment by a weak policy can be expensi...
research
07/17/2023

Natural Actor-Critic for Robust Reinforcement Learning with Function Approximation

We study robust reinforcement learning (RL) with the goal of determining...
research
11/30/2022

Efficient Reinforcement Learning Through Trajectory Generation

A key barrier to using reinforcement learning (RL) in many real-world ap...
research
10/26/2021

The Difficulty of Passive Learning in Deep Reinforcement Learning

Learning to act from observational data without active environmental int...
research
06/11/2021

Corruption-Robust Offline Reinforcement Learning

We study the adversarial robustness in offline reinforcement learning. G...
research
02/20/2023

Reinforcement Learning with Function Approximation: From Linear to Nonlinear

Function approximation has been an indispensable component in modern rei...
research
07/25/2023

The Optimal Approximation Factors in Misspecified Off-Policy Value Function Estimation

Theoretical guarantees in reinforcement learning (RL) are known to suffe...
