A Minimalist Approach to Offline Reinforcement Learning

06/12/2021
by   Scott Fujimoto, et al.
0

Offline reinforcement learning (RL) defines the task of learning from a fixed batch of data. Due to errors in value estimation from out-of-distribution actions, most offline RL algorithms take the approach of constraining or regularizing the policy with the actions contained in the dataset. Built on pre-existing RL algorithms, modifications to make an RL algorithm work offline comes at the cost of additional complexity. Offline RL algorithms introduce new hyperparameters and often leverage secondary components such as generative models, while adjusting the underlying RL algorithm. In this paper we aim to make a deep RL algorithm work while making minimal changes. We find that we can match the performance of state-of-the-art offline RL algorithms by simply adding a behavior cloning term to the policy update of an online RL algorithm and normalizing the data. The resulting algorithm is a simple to implement and tune baseline, while more than halving the overall run time by removing the additional computational overheads of previous methods.

READ FULL TEXT

page 5

page 8

research
06/01/2022

Know Your Boundaries: The Necessity of Explicit Behavioral Cloning in Offline RL

We introduce an offline reinforcement learning (RL) algorithm that expli...
research
11/02/2022

Behavior Prior Representation learning for Offline Reinforcement Learning

Offline reinforcement learning (RL) struggles in environments with rich ...
research
06/15/2021

On Multi-objective Policy Optimization as a Tool for Reinforcement Learning

Many advances that have improved the robustness and efficiency of deep r...
research
01/05/2023

Extreme Q-Learning: MaxEnt RL without Entropy

Modern Deep Reinforcement Learning (RL) algorithms require estimates of ...
research
02/26/2022

Statistically Efficient Advantage Learning for Offline Reinforcement Learning in Infinite Horizons

We consider reinforcement learning (RL) methods in offline domains witho...
research
02/01/2023

Selective Uncertainty Propagation in Offline RL

We study the finite-horizon offline reinforcement learning (RL) problem....
research
05/21/2022

User-Interactive Offline Reinforcement Learning

Offline reinforcement learning algorithms still lack trust in practice d...

Please sign up or login with your details

Forgot password? Click here to reset