The Least Restriction for Offline Reinforcement Learning

07/05/2021
by   Zizhou Su, et al.
0

Many practical applications of reinforcement learning (RL) constrain the agent to learn from a fixed offline dataset of logged interactions, which has already been gathered, without offering further possibility for data collection. However, commonly used off-policy RL algorithms, such as the Deep Q Network and the Deep Deterministic Policy Gradient, are incapable of learning without data correlated to the distribution under the current policy, making them ineffective for this offline setting. As the first step towards useful offline RL algorithms, we analysis the reason of instability in standard off-policy RL algorithms. It is due to the bootstrapping error. The key to avoiding this error, is ensuring that the agent's action space does not go out of the fixed offline dataset. Based on our consideration, a creative offline RL framework, the Least Restriction (LR), is proposed in this paper. The LR regards selecting an action as taking a sample from the probability distribution. It merely set a little limit for action selection, which not only avoid the action being out of the offline dataset but also remove all the unreasonable restrictions in earlier approaches (e.g. Batch-Constrained Deep Q-Learning). In the further, we will demonstrate that the LR, is able to learn robustly from different offline datasets, including random and suboptimal demonstrations, on a range of practical control tasks.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
11/08/2021

Understanding the Effects of Dataset Characteristics on Offline Reinforcement Learning

In real world, affecting the environment by a weak policy can be expensi...
research
06/03/2019

Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction

Off-policy reinforcement learning aims to leverage experience collected ...
research
11/14/2020

PLAS: Latent Action Space for Offline Reinforcement Learning

The goal of offline reinforcement learning is to learn a policy from a f...
research
09/14/2023

Equivariant Data Augmentation for Generalization in Offline Reinforcement Learning

We present a novel approach to address the challenge of generalization i...
research
05/12/2021

Interpretable performance analysis towards offline reinforcement learning: A dataset perspective

Offline reinforcement learning (RL) has increasingly become the focus of...
research
02/07/2021

An Analysis of Frame-skipping in Reinforcement Learning

In the practice of sequential decision making, agents are often designed...
research
03/08/2021

Instabilities of Offline RL with Pre-Trained Neural Representation

In offline reinforcement learning (RL), we seek to utilize offline data ...

Please sign up or login with your details

Forgot password? Click here to reset